AI Safety Frameworks Adapt to Evolving Regulations
CourtGuard introduces a model-agnostic framework for zero-shot policy adaptation in LLM safety, addressing the rigidity of static safety mechanisms. The framework recasts safety evaluation as Evidentiary Debate: an adversarial debate grounded in external policy documents. This