How an AI Agent Hacked McKinsey’s AI Platform

On March 9, 2026, CodeWall announced that its artificial intelligence (AI) agent had hacked McKinsey’s internal AI chatbot ‘Lilli’.

CodeWall’s autonomous agent was designed to probe systems for weaknesses and had no insider knowledge of McKinsey’s AI platform. Researchers simply pointed the agent at the system and allowed it to test for vulnerabilities. During the exercise, it identified several serious security flaws, including the ability to modify system prompts.

This exercise, conducted as security research, highlights a broader issue. As organizations rush to deploy AI agents that interact with internal systems, APIs, and sensitive data, these platforms quickly become high-value targets. The findings illustrate the risks organizations face when AI systems are deployed without strong security controls and proper threat modelling.

How CodeWall’s AI Agent Hacked Mckinsey’s ‘Lilli’ Chatbot

CodeWall’s AI agent began by mapping the attack surface of McKinsey’s AI platform. During this process, it discovered that the API documentation was publicly accessible, exposing more than 200 documented endpoints.

Of these endpoints, 22 did not require authentication. Exploiting this gap, the agent executed a simple SQL injection attack and gained system-wide access in under two hours.

Lilli is used by over 70% of McKinsey staff, with the AI chatbot processing over 500,000 prompts each month. Purpose-built for the consulting firm, Lilli supports chat, document analysis, and search capabilities, with access to decades of proprietary research. Once access was obtained, the system exposed a large volume of sensitive data, including:

  • 46.5 million chat messages, revealing strategy discussions, financial information, internal research, and client engagements.
  • 3.68 million RAG document chunks, representing the underlying knowledge base used by Lilli.
  • 728,000 files, including Microsoft Office documents and PDFs.
  • 57,000 user accounts, covering the full workforce using the system.
  • 384,000 AI assistants, exposing how McKinsey uses AI internally.

Worryingly, the system prompts controlling Lilli’s behavior were also found in the same database. CodeWall disclosed the vulnerability to McKinsey.

McKinsey promptly released a statement confirming the issue had been resolved and stating there was no evidence that client data or confidential information had been accessed by CodeWall or any other unauthorized third party. The remediation patched the unauthenticated endpoints, took the development environment offline, and removed public access to the API documentation.

Test your web apps in real time with PtaaS

The risks of unsecured AI systems

The Lilli breach highlights a growing security challenge for enterprise AI deployments. Systems designed to search internal knowledge, analyze documents, and answer employee queries often sit directly in front of large volumes of corporate data.

When authentication controls, APIs, or underlying data stores are misconfigured, these platforms can provide attackers with a direct path to sensitive information. The weaknesses identified in the Lilli system exposed two primary exploitation paths:

1. Confidential data exfiltration: Third-party exposure

A breach of this type could expose data belonging not only to McKinsey but also its clients, including multinational corporations, financial institutions, government agencies, and public sector bodies. Without strong access controls and authentication, a single vulnerable endpoint can provide a path to a much larger pool of confidential information.

2. Prompt manipulation: Output integrity risk

An attacker could modify the behavior of the AI agent, enabling data poisoning or deliberately misleading outputs. For organizations that rely on AI assistants to retrieve or summarize information, manipulated prompts could distort results or surface incorrect data. Users often place significant trust in AI-generated outputs, making these types of attacks particularly dangerous.

How can organizations mitigate AI platform risks?

When AI agents query internal systems, retrieve documents, or interact with APIs, they effectively become another application layer inside the organization. As a result, they require the same level of security as any system handling confidential information.

In the case of McKinsey’s Lilli and similar enterprise AI assistants, the primary risk is not the model itself but the surrounding ecosystem: credentials, APIs, internal document stores, and the permissions that connect them. Security controls and threat modelling for AI agents should be implemented before deployment, not after researchers or attackers discover weaknesses.

Organizations looking to reduce risk around internal AI platforms should focus on several practical areas:

  • Treat AI agents like privileged applications: Applying least-privilege access controls, limiting the scope of what the AI agent can retrieve, and separating access across different data domains can significantly reduce the impact if the system is abused or manipulated.
  • Incorporate AI-specific threat modelling: Application security reviews don’t always account for issues like prompt injection, indirect data exfiltration, or context manipulation. Before launching an internal AI assistant, organizations should map how the system interacts with data stores and APIs, and test how it behaves when exposed to adversarial prompts designed to bypass safeguards.
  • Monitor and log AI activity: Logging AI queries that retrieve internal data, configuring alerts for unusual access patterns, and integrating AI activity into existing security monitoring platforms can help detect misuse or early-stage probing.
  • Include AI platforms in penetration testing: Including AI assistants in regular penetration tests and red team exercises, particularly testing prompt injection and data exfiltration scenarios, helps uncover weaknesses before attackers do.

How Outpost24 helps

Outpost24’s Penetration-Testing-as-a-Service (PTaaS) is a comprehensive penetration testing service that uncovers web application, API and mobile app vulnerabilities and delivers actionable results through a single platform. Combining human-led testing with automated scanning, our PTaaS solution focuses on real-world application risk and enables direct collaboration between developers and testers.

Outpost24 has decades of experience helping organizations manage their attack surface, with strong roots in ethical hacking. If you’re interested in seeing how we can help secure your applications, contact us today.

About the Author

David Ketler is a cybersecurity consultant based in Toronto, Canada with 10+ years of experience in software development and cybersecurity. He writes about password cracking, dark web activity, and password management.