In an interview with TimesTech, Prof. Fred Morstatter, Research Assistant Professor of Computer Science at USC, delves into his latest study on AI safety and legal comprehension. By testing language models against U.S. biosecurity laws, his research exposes how current AI systems often miss legal intent, or mens rea, and underscores the urgent need for more advanced safeguards, especially in high-risk domains like biosecurity.
Read the full interview here:
TimesTech: Your study highlights how large language models can sometimes provide step-by-step instructions for illegal activities when prompts are slightly reworded. Could you walk us through the methodology used to test this, especially how you assessed the models’ understanding of legal intent (mens rea)?
Fred: We designed scenarios involving violations of U.S. Code Title 18 §175 and tested multiple language models to see if they could detect the law, infer intent, and avoid giving unsafe responses. We varied the country involved in each prompt to study geographic bias in compliance. To assess mens rea, we asked whether the actions showed intent or recklessness, revealing how well models understand criminal mental states.
TimesTech: The findings reveal that current AI safety mechanisms fail to detect deeper legal violations. What does this suggest about the limitations of existing alignment strategies and guardrails being used in LLMs today?
Fred: Models often catch obvious legal violations but fail to understand intent. Some still give detailed instructions for illegal activities. This shows current safeguards like refusal heuristics are not enough, especially in high-risk contexts.
TimesTech: The paper explores U.S. biological weapons law. Why did you choose this particular legal domain, and what unique challenges did it pose in evaluating AI behavior and safety?
Fred: We chose this law because misuse poses extreme risks, and the legal framework is well-defined. It tests a model’s ability to understand not just actions, but the intent behind them—something essential for legal reasoning.
TimesTech: How do knowledge graphs and retrieval-augmented generation (RAG) techniques enhance our understanding of an LLM’s legal reasoning? Were there any surprising insights gained through this approach?
Fred: We used knowledge graphs to represent legal relationships and RAG to ground answers in real legal text. This helped models’ reason more accurately, but some still gave unsafe responses, showing that legal knowledge alone isn’t enough for safety.
TimesTech: Given the potential misuse of AI for dangerous applications, what concrete steps can developers and policymakers take to improve legal comprehension and prevent exploitation of LLMs?
Fred: Developers should build models using structured legal inputs and test for intent detection. Policymakers should support evaluation tools that check for both legal understanding and safety in high-risk use cases.
TimesTech: Looking ahead, how do you see your research influencing future AI regulation frameworks, especially in areas where the stakes, like biosecurity, are critically high?
Fred: Our work suggests that AI oversight should include testing for how models reason about legality and intent. In areas like biosecurity, regulation should go beyond surface-level safeguards to ensure responsible use.