Leveraging Large Language Models for Security-Focused Code Reviews

This study investigates the potential application of Large Language Models (LLMs) in enhancing software security through automated vulnerability detection during the code review process.

The research examines how effectively LLMs identify security vulnerabilities that human reviewers, particularly those without extensive security backgrounds, might overlook. By analyzing historically significant Common Vulnerabilities and Exposures (CVEs) in popular open-source projects such as Django and Log4j, it evaluates whether LLMs can detect subtle security flaws within complex codebases. The methodology uses phased prompting, progressing from general code analysis to targeted vulnerability identification, and keeps conditions controlled by isolating the vulnerable code segments. By comparing LLM performance against traditional human code reviews and automated security scanning tools, the study offers insight into the role artificial intelligence could play in augmenting software security practices.
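The phased prompting described above could be sketched roughly as follows. This is a minimal illustration, not the paper's actual harness: the phase names, the prompt wording, and the `run_phased_review` helper are all assumptions, since the abstract does not reproduce the prompts used. The model is passed in as a plain callable (`ask`), so no particular LLM API is implied.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical phase templates (assumed wording, not taken from the paper).
# Each phase narrows the question: general review, then security-focused,
# then a targeted question about one named weakness class.
PHASES = [
    ("general",
     "Review the following code. Describe what it does and any issues "
     "you notice.\n\n{code}"),
    ("security",
     "Review the following code specifically for security "
     "vulnerabilities.\n\n{code}"),
    ("targeted",
     "Does the following code contain a {weakness} vulnerability? "
     "Explain your reasoning.\n\n{code}"),
]


@dataclass
class PhaseResult:
    phase: str
    prompt: str
    response: str


def run_phased_review(code: str, weakness: str,
                      ask: Callable[[str], str]) -> List[PhaseResult]:
    """Send the same isolated code segment through each phase in order.

    `ask` is any function that takes a prompt string and returns the
    model's reply; it stands in for whatever LLM client is used.
    """
    results = []
    for name, template in PHASES:
        # Only the targeted template references {weakness}; str.format
        # ignores the unused keyword in the earlier phases.
        prompt = template.format(code=code, weakness=weakness)
        results.append(PhaseResult(name, prompt, ask(prompt)))
    return results


if __name__ == "__main__":
    # Stub model so the sketch runs without network access.
    def stub_model(prompt: str) -> str:
        return "stub reply"

    snippet = "query = \"SELECT * FROM users WHERE name = '\" + name + \"'\""
    for result in run_phased_review(snippet, "SQL injection", stub_model):
        print(result.phase, "->", result.response)
```

Keeping the code segment fixed across phases mirrors the controlled conditions the abstract mentions: only the specificity of the question changes, so any difference in detection can be attributed to the prompt rather than to the input.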

The findings suggest implications for the evolution of code review methodologies and the integration of AI-assisted security analysis within software development lifecycles.

26 Mar 2025

By Daniel McQuade

All papers are copyrighted

No re-posting of papers is permitted

Related Content

From Alert to Evidence: Evaluating AI Agents for Cyber Forensic Triage

Research Paper

Cyber defense teams are beginning to experiment with large language models in security operations, but their usefulness in digital forensics and incident triage is still uncertain.

11 Jun 2026
Connor Blackard

Leveraging Large Language Models for Cross-Vendor Firewall Configuration Migration: A Comparative Case Study of Claude and ChatGPT

Research Paper

This paper investigates how two current-generation large language models (LLMs) perform on a single, representative firewall migration task.

12 May 2026
Omar Zaman

AI-Human Collaboration in Modern SOCs

Research Paper

Enterprises face upwards of 3,000 security alerts daily, and according to the SANS 2025 SOC Survey, two-thirds of security operations center (SOC) teams cannot keep pace.

17 Mar 2026
Mathias Fuchs

AI-Driven SecOps: Unifying Controls, Automating Response, and Advancing the Modern SOC Using Cortex XSIAM

Research Paper

New research from IDC reveals the tangible business value of rigorous, practitioner-led training from SANS: faster threat detection and response, reduced operational risk, stronger team cohesion, and millions in annual cost savings.

29 Jul 2025
Dave Shackleford

Trust But Verify: Evaluating the Accuracy of LLMs in Normalizing Threat Data Feeds

Research Paper

This paper examines whether Large Language Models (LLMs) can be reliably applied to the normalization of Indicators of Compromise (IOCs) into Structured Threat Information Expression (STIX) format.

16 Jul 2025
Nicholas Peterson

Do AI Coding Assistants Make Bad Coders Worse? A Security Evaluation of GitHub Copilot

Research Paper

This paper examines whether the overall security posture of a project affects the quality of the code produced by Copilot.

11 Jul 2025
Andrew Hannaford

Dropzone AI Can Make Internal SOC Teams More Effective

Research Paper

In this paper, SANS Certified Instructor Mark Jeanmougin examines how Dropzone AI can integrate into existing security stacks and help SOC teams stay focused on high-impact decisions.

17 Jun 2025
Mark Jeanmougin

Beneath the Mask: Can Contribution Data Unveil Malicious Personas in Open-Source Projects?

Research Paper

In February 2024, after building trust over two years with project maintainers by making a significant volume of legitimate contributions, GitHub user "JiaT75" self-merged a version of the XZ Utils project containing a highly sophisticated well-disguised backdoor targeting sshd processes running on systems with the backdoored package installed.

13 May 2025
SANS Institute

AI-Driven Insecurity: Assessing Security Gaps in AI Generated IT Guidance

Research Paper

The increasing reliance on AI-generated technical guidance for IT system configuration introduces significant security risks. This study assesses these risks through a case study: setting up an Apache web server on a Rocky Linux system using instructions from seven AI models.

13 May 2025
Edward Abbott

MITRE ATT&CK Labeling of Cyber Threat Intelligence via LLM

Research Paper

This paper explores the effectiveness of various online and locally hosted LLMs in classifying an arbitrary statement as containing a MITRE ATT&CK Framework (MAF) technique or not and then producing the technique number if it does.

7 Jan 2025
Terence O’Brien

AI Hunting with the Cybereason Platform: A SANS Review

Research Paper

SANS reviewed Cybereason's AI hunting platform, which offers a lightweight, behavior-focused model...

23 Jul 2018
Dave Shackleford

Applying Machine Learning Techniques to Measure Critical Security Controls

Research Paper

Implementing and measuring Critical Security Controls (CSC) requires analyzing all data types...

6 Sep 2016
Balaji Balakrishnan
