Tools & Experiments

NVIDIA garak

A complete framework for red-teaming LLMs to find vulnerabilities before they reach production.

NVIDIA garak gives you proper tooling for attacking your own models. It runs systematic probes against LLMs to find weak spots in their safety systems.

The framework handles everything from basic setup to custom attack patterns. You can scan models on Hugging Face, analyse safety scores, and export vulnerability reports in structured formats. It includes built-in detectors for common failure modes and lets you write custom probes for specific attack vectors.

What makes this interesting is the focus on defensive testing. Most AI safety work happens after models ship. Garak lets teams find problems during development, when fixing them costs hours instead of months.

The tutorial walks through a complete workflow, from dry runs to multi-probe evaluations. You can inspect flagged outputs, understand attack success rates, and build custom detection rules. The AVID export format means you can track vulnerabilities properly in your security pipeline.

We’ve been waiting for proper red-teaming infrastructure. This looks like the real thing.

No interactive tool for this one yet. Browse all tools for more.