Anthropic Leaks Source Code for AI Coding Tool in Second Major Security Breach

Anthropic has inadvertently disclosed the source code of its widely-used coding tool, Claude Code.

The leak came shortly after Fortune reported that the company had mistakenly made nearly 3,000 files publicly accessible. Those files included a draft blog post describing a powerful upcoming model, known internally as “Mythos” and “Capybara,” which raises significant cybersecurity concerns.

The leaked source code contained approximately 500,000 lines spread over about 1,900 files. In response to inquiries, Anthropic acknowledged that “some internal source code” had been leaked during a “Claude Code release.”

A spokesperson clarified, “No sensitive customer data or credentials were involved or exposed. This was due to a release packaging mistake caused by human error, not a security breach. We are implementing measures to prevent a recurrence.”

The latest leak could pose greater risks for Anthropic than the earlier accidental release of the draft blog post about its forthcoming model. Although it did not expose the weights of the Claude model itself, it gave technically savvy individuals the ability to extract additional internal details from the company’s codebase, according to a cybersecurity expert consulted by Fortune.

Claude Code is one of Anthropic’s most popular products, with substantial uptake among large enterprises. Its capabilities derive not only from the underlying large language model but also from the software “harness” that wraps the model, dictating how it interacts with other software tools and keeping it within key behavioral limits. The leak involves the source code for this harness.
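To make the idea of a harness concrete, here is a minimal sketch in Python. It is illustrative only and is not drawn from the leaked code or from Anthropic's design; the names `Harness` and `ToolCall`, the allowlist, and the call budget are all hypothetical stand-ins for "how the model interacts with tools" and "behavioral limits."

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolCall:
    """A tool request as a model might emit it (hypothetical shape)."""
    name: str
    argument: str


class Harness:
    """Hypothetical harness: routes tool requests and enforces simple limits."""

    def __init__(self, tools: Dict[str, Callable[[str], str]], max_calls: int = 5):
        self.tools = tools          # allowlist of tools the model may use
        self.max_calls = max_calls  # assumed limit: total tool calls per session
        self.calls_made = 0

    def dispatch(self, call: ToolCall) -> str:
        # Limit 1: refuse once the call budget is spent.
        if self.calls_made >= self.max_calls:
            return "refused: call budget exhausted"
        # Limit 2: refuse any tool that is not on the allowlist.
        if call.name not in self.tools:
            return f"refused: tool '{call.name}' is not allowed"
        self.calls_made += 1
        return self.tools[call.name](call.argument)


if __name__ == "__main__":
    harness = Harness({"echo": lambda text: text.upper()}, max_calls=1)
    print(harness.dispatch(ToolCall("delete_files", "/")))  # not on the allowlist
    print(harness.dispatch(ToolCall("echo", "hello")))      # allowed
    print(harness.dispatch(ToolCall("echo", "again")))      # over budget
```

In this sketch the model never runs a tool directly; every request passes through `dispatch`, which is where the limits live. That is the general reason harness code is valuable to competitors: it encodes the rules around the model, not the model itself.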

The disclosure could let competitors reverse-engineer how Claude Code’s harness works and use that knowledge to improve their own products. Some developers may also attempt to build open-source alternatives based on the leaked code.

The leak also surfaced details about a new model internally referred to as “Capybara,” which the company is reportedly preparing to launch. According to Roy Paz, a senior AI security researcher at LayerX Security, the leak indicated the existence of both a “fast” and “slow” version of this new model, likely intended to succeed Opus, Anthropic’s leading model.

Currently, Anthropic markets its models in three distinct sizes: the most capable versions branded as Opus, faster yet less capable variants known as Sonnet, and the smallest, cheapest, and fastest versions called Haiku. The recent draft blog post obtained by Fortune describes “Capybara” as a new tier of model, larger and more advanced than Opus, albeit at a higher cost.

The leak was first highlighted in an X post. It appears to have occurred when Anthropic uploaded the entire original source code of Claude Code to npm, a platform commonly used by developers for sharing and updating software, instead of only the final version meant for operational use. According to Paz, it seems to be a case of human error in which someone bypassed standard release safeguards.

Paz noted, “Typically, large companies have stringent processes and multiple checks prior to code reaching production, similar to a vault needing several keys for access. It appears Anthropic lacks these safeguards, allowing a single misconfiguration or misclick to expose the complete source code.”
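One simple form of the pre-release check Paz describes is an automated gate that inspects the list of files about to be published and blocks the release if anything looks like original source. The sketch below is a generic illustration in Python, not a description of Anthropic's actual pipeline; the forbidden patterns (`src/*`, `*.map`, `.env`) and the function names are assumptions chosen for the example.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Hypothetical patterns for files that should never ship in a release package.
FORBIDDEN_PATTERNS: Sequence[str] = ("src/*", "*.map", ".env")


def find_forbidden(paths: Iterable[str],
                   patterns: Sequence[str] = FORBIDDEN_PATTERNS) -> List[str]:
    """Return the paths that match any forbidden pattern, in input order."""
    return [
        path for path in paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


def release_is_safe(paths: Iterable[str]) -> bool:
    """True only when no file in the package matches a forbidden pattern."""
    return not find_forbidden(paths)


if __name__ == "__main__":
    package_files = ["package.json", "dist/cli.js", "dist/cli.js.map", "src/main.ts"]
    offenders = find_forbidden(package_files)
    if offenders:
        # A real pipeline would fail the release job here.
        print("Release blocked; unexpected files:", offenders)
    else:
        print("Release file list looks clean.")
```

In practice the file list could come from a dry run of the packaging step (for npm, `npm pack --dry-run` prints what would be included), so the check runs before anything is uploaded. A gate like this is only one of the "several keys" in Paz's analogy; it does not replace human review.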

Further concerns were raised regarding how the tool interfaces with Anthropic’s internal systems. Even without encrypted access keys usually needed for such systems, it seems feasible to access internal services that should remain restricted. Paz warned that this leak could present new avenues for malicious actors, including nation-states, to exploit Anthropic’s models for developing more sophisticated cyberattack tools and circumventing established safeguards.

Anthropic’s most powerful model, Claude Opus 4.6, is already regarded as high-risk from a cybersecurity standpoint. The company has indicated that its Opus models can autonomously identify zero-day vulnerabilities in software, a capability intended to help companies find and fix flaws. However, hackers, including state-sponsored actors, could misuse the same capability to discover and exploit vulnerabilities.

This is not the first time Anthropic has inadvertently leaked details about its popular Claude Code tool. In February 2025, a similar episode exposed the original code of an early version of Claude Code, revealing how the tool worked and how it connected to Anthropic’s internal systems. Anthropic subsequently pulled the software and removed the publicly accessible code.

In summary, the accidental leak of Claude Code’s source code highlights the vulnerabilities that can arise from human error in tech companies. As cyber threats continue to evolve, it is crucial for organizations to enhance their protocols to safeguard sensitive information and maintain customer trust. The repercussions of this incident could reshape future cybersecurity practices within the industry.
