The Safety Ratchet: How AI companies abandon their principles on a schedule

Google took 18 years to drop “Don’t Be Evil.” OpenAI took 9 years to remove “safely” from its mission. Anthropic took less than 5 years. The half-life of principled AI development is compressing.

On Tuesday, Anthropic told TIME Magazine that it was dropping its flagship safety pledge. The company had promised, since 2023, never to train an AI system unless it could guarantee in advance that its safety measures were adequate. That commitment is gone now. The new policy replaces hard limits with flexible goals. Jared Kaplan, Anthropic’s Chief Science Officer, explained: “We felt that it wouldn’t actually help anyone for us to stop training AI models.”

The same week, a separate story was still circulating about OpenAI. The company had quietly removed the word “safely” from its mission statement during its restructuring into a for-profit entity. The old mission: “to build general-purpose artificial intelligence that safely benefits humanity, unconstrained by a need to generate financial return.” The new version: “OpenAI’s mission is to ensure that artificial general intelligence benefits all of humanity.” No safety. No disavowal of profit motive. Thirteen words where the passage quoted above ran to seventeen.

Two companies. Same direction. Same week. And the thing that interests me isn’t the individual decisions. It’s the speed.

The compression

Google adopted “Don’t Be Evil” as its corporate motto around 2000. It quietly removed the phrase from the preface of its code of conduct in May 2018. That’s roughly eighteen years from idealism to abandonment.

OpenAI was founded in 2015 with an explicit safety mission. The word “safely” survived in its official filings through 2023. By 2024, it was gone. About nine years.

Anthropic was founded in 2021 specifically because its founders, Dario and Daniela Amodei, believed OpenAI wasn’t taking safety seriously enough. They left to build something more cautious. Their Responsible Scaling Policy, the pledge to halt training if capabilities outpaced safety, lasted until February 2026. Less than five years from founding to abandoning the thing they were founded to do.

Meta created a Responsible AI team in 2019. Disbanded it in November 2023. Four years. Microsoft had an AI Ethics and Society team of thirty people. It was cut to seven in October 2022, then eliminated entirely in March 2023. The whole arc played out in under three years.

Each cycle is faster than the last. Google took almost two decades. Anthropic took less than five years. The half-life of principled AI development is compressing, and nobody has named the mechanism that drives it.
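
For anyone who wants to check the arithmetic, here is a minimal sketch in Python that tabulates the spans from the years quoted above. The names `timelines` and `spans` are illustrative only. It works in whole years, so Anthropic shows as roughly five rather than "less than five," and it leaves out Microsoft because no founding year for that team is given here.

```python
# Start and end years as stated in the article. Months are ignored,
# so each span is a whole-year approximation, not an exact duration.
timelines = {
    "Google (Don't Be Evil)": (2000, 2018),
    "OpenAI ('safely' in mission)": (2015, 2024),
    "Meta (Responsible AI team)": (2019, 2023),
    "Anthropic (original RSP pledge)": (2021, 2026),
}


def spans(data):
    """Return (name, start_year, years_elapsed) tuples ordered by start year."""
    rows = [(name, start, end - start) for name, (start, end) in data.items()]
    return sorted(rows, key=lambda row: row[1])


for name, start, years in spans(timelines):
    print(f"{start}  {name}: ~{years} years")
```

The rows are ordered by start year, so you can judge for yourself how steadily the spans shrink from one entrant to the next.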

I’ve been thinking of it as the Safety Ratchet. Not because the companies become less safe in a linear way, but because the process only turns in one direction. Principles loosen. They never tighten. And each loosening creates the justification for the next one.

How the ratchet turns

The mechanism is always the same, and Anthropic’s own words reveal it clearly.

Their new Responsible Scaling Policy includes this sentence: “If one AI developer paused development to implement safety measures while others moved forward training and deploying AI systems without strong mitigations, that could result in a world that is less safe.”

Read that carefully. It says: we can’t be safe unless everyone is safe, and since everyone isn’t, we won’t be either. The commitment to safety becomes contingent on competitors also being committed. Since competitors are cutting their own safety commitments for the same reason, the floor drops for everyone simultaneously.

This is a collective action problem dressed up as responsible pragmatism. And it’s the exact argument that every company in this sequence eventually arrives at. Google couldn’t be ethical if it meant losing search market share to less scrupulous competitors. Microsoft couldn’t maintain an ethics team if it slowed down OpenAI integration while other companies shipped faster. OpenAI couldn’t keep “safely” in its mission if the word constrained fundraising during an arms race. Each company frames the retreat as the responsible thing to do, given the behavior of everyone else.

The ratchet turns because each defection makes the next one easier to justify. Once Anthropic drops its pledge, the next company with a safety commitment faces more competitive pressure to drop it and less reputational pressure to keep it. The argument is self-reinforcing and only moves in one direction.

The Pentagon factor

What makes February 2026 different from previous cycles is the government’s role.

Defense Secretary Pete Hegseth gave Anthropic CEO Dario Amodei a deadline: drop the AI guardrails for military use or face consequences. The Pentagon has a $200 million contract with Anthropic. Claude is reportedly the only AI model used for the military’s most sensitive work. Hegseth’s threat was specific: invoke the Defense Production Act against Anthropic and label the company a supply chain risk, which would bar any defense contractor from using Anthropic’s products.

Anthropic drew two red lines it said it wouldn’t cross: AI-controlled weapons and mass domestic surveillance of Americans. But the broader guardrails that made Anthropic different from every other AI company are gone.

The company insists the safety policy change and the Pentagon negotiations are separate. Maybe they are. But the timing means they’ll always be read together. And the structural point is the same either way: when a government spending $886 billion a year on defense tells an AI company to choose between principles and contracts, the historical record is pretty clear about what happens next.

Google faced a version of this in 2018. Project Maven, the Pentagon’s AI drone targeting program, provoked a revolt among Google employees. About 4,000 of them signed a petition. A dozen resigned. Google said it wouldn’t renew the contract. That was the same year “Don’t Be Evil” quietly disappeared from the code of conduct.

Eight years later, Google has extensive defense contracts and nobody walks out. The protest reflex, like the motto, compressed and disappeared.

The body count

Here’s what the corporate restructurings actually look like in aggregate.

Meta’s Responsible AI team, formed 2019, was a group of interdisciplinary experts helping ML teams build models ethically. They focused on five areas, including fairness, robustness, and accountability. Mark Zuckerberg’s “year of efficiency” absorbed them into the Generative AI division. The safety mandate got redistributed across the org, which is corporate for “it became nobody’s job.”

Microsoft’s Ethics and Society team went from thirty people to zero in five months. They were the ones ensuring Microsoft’s responsible AI principles actually showed up in product design. Their elimination happened while Microsoft was pouring billions into OpenAI and racing to ship Copilot across every product. Team members believed, according to reporting from The Verge, that pressure from CTO Kevin Scott and CEO Satya Nadella to ship AI products faster than the competition was the real reason they were cut.

OpenAI’s trajectory is the most complete. Founded as a nonprofit in 2015. Restructured into a “capped profit” entity in 2019. Safety board stripped of power after the Sam Altman firing and reinstatement in late 2023. The word “safely” removed from the mission statement in 2024. Full for-profit restructuring underway, with three-quarters of nonprofit control ceded to private investors.

Scholar Alnoor Ebrahim, reviewing the OpenAI filings, warned that the removal of safety language signals “OpenAI is making its profits a higher priority than the safety of its products.”

Every one of these companies ran the same play. They launched with idealism, added guardrails as they grew, then quietly stripped them out once the money got serious. Do it once and it’s a decision. Do it five times and it starts to look like a lifecycle.

The asymmetry

One of the independent reviewers of Anthropic’s new RSP, Chris Painter from the organization METR, described what he saw: the changes show Anthropic “believes it needs to shift into triage mode with its safety plans, because methods to assess and mitigate risk are not keeping up with the pace of capabilities.”

I keep coming back to that quote. The safety company’s own safety reviewer is saying that safety research can’t keep up with what’s being built. And Anthropic’s response wasn’t to slow down. It was to stop promising they would.

This is the asymmetry that makes the Safety Ratchet work. Capabilities scale with money and compute. Both are increasing exponentially. Safety research scales with scientific understanding, which moves at human speed. Every dollar invested in capabilities widens the gap that safety is supposed to close. And when the gap gets wide enough, the safety commitment is the thing that breaks, because it’s the only part of the system that depends on human judgment rather than market incentive.

Anthropic raised $30 billion in February 2026. The company is valued at $380 billion. OpenAI is valued above $850 billion. These are not numbers that tolerate a safety pause. When the valuation of your company depends on shipping the next model faster than your competitor ships theirs, the pledge to stop and check becomes the most expensive sentence in your corporate documents.

So you remove it. And you explain that removing it was actually the safe thing to do.

What the pattern predicts

If the Safety Ratchet is a real mechanism and not just a coincidence, it predicts things you can check.

The remaining safety commitments at major AI companies will follow the same arc. Any pledge that constrains shipping speed will be reworded, softened, or dropped within one to three years of being made. The language will shift from “we will not” to “we aim to” to “we will consider” to silence. Check back on every safety commitment made in 2025 and see what’s left by 2028.

The explanations will always invoke competitors. No company will say “we dropped our safety pledge because it was expensive.” Every company will say “we dropped it because maintaining it unilaterally would make the world less safe.” The argument is unfalsifiable and therefore permanent.

And the time from founding to abandonment will keep compressing. If someone launches a “safety-first” AI company tomorrow, the pattern says it will abandon its founding principles faster than Anthropic did. The competitive environment is more intense, the capital requirements are higher, and the precedent has been set.

I don’t know if this pattern is stoppable. The incentive structure is clear and the historical sample is growing. Every company that joins the AI race with a safety commitment eventually faces the same choice Anthropic faced: keep the promise and lose competitive position, or drop the promise and explain why dropping it was actually the responsible thing to do.

What I do know is that safety commitments made during fundraising do not survive contact with commercial reality. The AI industry has demonstrated this five times now, on an accelerating schedule, with decreasing hesitation each time. The pledges aren’t lies exactly. They’re sincere when they’re made. They just have a shelf life, and that shelf life is getting shorter every year.

Sources:

  • “Exclusive: Anthropic Drops Flagship Safety Pledge,” TIME Magazine (February 2026)
  • “Anthropic weakens its safety pledge in the wake of the Pentagon’s pressure campaign,” Engadget (February 25, 2026)
  • “Pentagon threatens to make Anthropic a pariah if it refuses to drop AI guardrails,” CNN (February 24, 2026)
  • “Anthropic narrows AI safety policy pledge,” The Hill (February 2026)
  • “Responsible Scaling Policy Version 3.0,” Anthropic (February 2026)
  • “OpenAI has changed its mission statement 6 times in 9 years,” Fortune (February 23, 2026)
  • “OpenAI has deleted the word ‘safely’ from its mission,” The Conversation (February 2026)
  • “Facebook-parent Meta breaks up its Responsible AI team,” CNBC (November 2023)
  • “Microsoft lays off an ethical AI team as it doubles down on OpenAI,” TechCrunch (March 2023)
  • “Google Removes ‘Don’t Be Evil’ Clause From Its Code of Conduct,” Gizmodo (May 2018)
  • Anthropic $30B raise and $380B valuation: Engadget (February 2026)
  • OpenAI valuation above $850B: Fortune (February 2026)

Originally published at https://noahaust2.github.io/strategist-dashboard/blog/the-safety-ratchet.html

