The Dead Internet's Birthday
The year the machines became the majority — and what it means for everything trained on what they write
In 2024, bot traffic reached 51% of all web activity, passing the halfway mark. The internet’s population became majority non-human, and nobody threw a party.
This isn’t a conspiracy theory anymore. It’s a measurement. Imperva’s 2025 Bad Bot Report, which covers 2024 traffic, documented the threshold crossing. Ahrefs analyzed 900,000 newly published web pages in April 2025 and found that 74.2% contained AI-generated content. On X, approximately 64% of accounts are likely bots. On LinkedIn, 54% of long-form posts are AI-generated. On Zillow, AI-generated reviews jumped from 3.6% in 2019 to 23.7% in 2025.
The dead internet theory — the idea that most online activity is artificial — is no longer a theory. It’s a census result.
The pollution loop
Here’s the part that matters more than the percentages: the AI industry trained its models on the open web. It scraped billions of pages of human-written text, human-taken photographs, human-composed music. It used that corpus to build systems that generate synthetic versions of the same content. Those synthetic outputs now flood the same web the models were trained on.
The next generation of models will train on a web that is majority synthetic. They will learn from content that was generated by the previous generation of models, which learned from content that was generated by the generation before that.
This is model collapse. Formally described in a 2024 Nature paper by Ilia Shumailov and colleagues, model collapse occurs when AI systems trained on AI-generated data progressively degrade. The tails of the original data distribution disappear. Minority perspectives, unusual phrasings, rare knowledge, edge cases: all of it gets averaged away. Later research suggests that even a contamination rate as low as one synthetic sample per thousand can trigger the degradation, and that larger training sets don’t fix it. The poison scales with the dataset.
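The loss of the tails can be seen in a toy simulation. The sketch below is not the experiment from the Nature paper. It assumes a made-up vocabulary of 1,000 tokens with Zipf-like frequencies, and a "model" that is nothing more than the token counts of the previous generation's output. The name simulate_collapse and every parameter value are illustrative choices, not anything from the research.

```python
import random
from collections import Counter


def simulate_collapse(vocab_size=1000, samples_per_gen=500,
                      generations=10, seed=0):
    """Toy model-collapse loop (illustrative assumption, not a real LLM).

    Generation 0 is the "human" distribution: token k has weight 1/(k+1).
    Each later generation is fitted only to a finite sample drawn from
    the previous generation. Returns the number of distinct tokens that
    still have nonzero probability after each generation.
    """
    rng = random.Random(seed)
    tokens = list(range(vocab_size))
    weights = [1.0 / (k + 1) for k in tokens]  # Zipf-like long tail
    support_sizes = [vocab_size]

    for _ in range(generations):
        # "Generate" content from the current model.
        sample = rng.choices(tokens, weights=weights, k=samples_per_gen)
        # "Train" the next model: its weights are just the observed counts.
        counts = Counter(sample)
        weights = [counts[t] for t in tokens]  # unseen tokens get weight 0
        support_sizes.append(len(counts))

    return support_sizes


if __name__ == "__main__":
    for gen, size in enumerate(simulate_collapse()):
        print(f"generation {gen}: {size} distinct tokens survive")
```

A token that draws zero samples gets zero weight and can never come back, so the count of surviving tokens can only stay flat or shrink from one generation to the next. The rare tokens are the first to go.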
By some estimates, between 30% and 40% of the active web is now synthetic content. By 2030, Timothy Shoup of the Copenhagen Institute for Futures Studies predicts 99% will be. The training data problem isn’t coming. It arrived.
The traffic crash
The consequences are already measurable in the places that depend on human-written web content for revenue.
Global publishers saw a 33% drop in Google traffic between November 2024 and November 2025, according to Press Gazette. U.S. publishers specifically lost 38%. Stereogum, a music publication that has operated since 2002, lost 70% of its ad revenue in 2025. Business Insider saw organic search traffic fall 55% between April 2022 and 2025, leading to a 21% staff reduction. Chegg, the education platform, experienced a 49% year-over-year traffic decline among non-subscribers.
The mechanism is straightforward. Google’s AI Overviews now answer queries directly, eliminating the need to click through to the source. Click-through rates dropped 46.7% on queries where AI Overviews appear. Zero-click searches rose from 56% to 69%. Gartner predicts a 25% decline in total search volume by late 2026.
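The zero-click figures are easier to read from the publisher's side. A rough sketch of the arithmetic follows. It takes the 56% and 69% zero-click shares above at face value, and the second function combines them with Gartner's 25% forecast, which is an assumption made purely for illustration: the numbers come from different sources and may not measure the same population of searches. The function names are hypothetical.

```python
def clicking_share_change(zero_click_before, zero_click_after):
    """Relative change in the share of searches that end in a click."""
    before = 1.0 - zero_click_before
    after = 1.0 - zero_click_after
    return (after - before) / before


def remaining_click_fraction(zero_click_before, zero_click_after,
                             volume_decline):
    """Fraction of the original clicks left if total search volume also
    falls by volume_decline (assumes the two effects simply multiply)."""
    before = 1.0 - zero_click_before
    after = 1.0 - zero_click_after
    return (1.0 - volume_decline) * after / before


if __name__ == "__main__":
    change = clicking_share_change(0.56, 0.69)
    print(f"searches that end in a click: {change:+.1%}")  # about -29.5%

    left = remaining_click_fraction(0.56, 0.69, 0.25)
    print(f"clicks left after a 25% volume drop: {left:.1%}")  # about 52.8%
```

A 13-point rise in zero-click searches sounds modest, but the share of searches that send a visitor anywhere falls from 44% to 31%, a drop of roughly 30% in relative terms.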
The web’s economic model — create content, attract search traffic, sell ads — is being dismantled from both sides. AI generates so much content that human work gets buried. AI search summarizes so effectively that even when human work gets found, nobody clicks through to read it.
Where the humans went
The remaining human activity has largely retreated behind walls. Substack, valued at $1.1 billion, reaches 100 million monthly visits from behind email paywalls and subscription gates. Humans have also moved into Discord communities, private Slack channels, paid newsletters, and intranets. The open web is becoming a synthetic wilderness, and humans are building fences.
This retreat has a name in information economics: the enclosure of the commons. The web was a public good — a shared resource of human knowledge, expression, and connection. AI companies harvested that commons, trained commercial products on it, and deployed those products to flood it with synthetic content. The commons didn’t disappear. It was strip-mined and replaced with landfill.
The training data crisis
Every major AI company faces the same problem: they need human data, and they’re destroying the ecosystem that produces it.
OpenAI, Anthropic, Google, Meta — all depend on fresh human-generated content to train and fine-tune their models. But the economic incentives they’ve created make producing that content increasingly unprofitable. The writers, journalists, photographers, and creators who populated the web with training data are being economically displaced by the products trained on their work.
This isn’t a future prediction. The 21% staff reduction at Business Insider, the 70% ad revenue loss at Stereogum, the 49% traffic decline at Chegg: these are training data producers going offline. Every newsroom that closes, every freelancer who quits, every forum that goes inactive is a reduction in the supply of the one thing AI models actually need: human truth.
The mitigation research is clear on one point: synthetic data is useful for augmentation, but the underlying corpus must remain human. Without it, model performance on real-world tasks flatlines. The AI industry is sawing off the branch it’s sitting on, and the branch is already cracking.
What happens next
The most likely near-term outcome is stratification. Premium AI models will train on licensed, curated human data — the kind that costs money to produce and access. Cheaper models will train on the synthetic web and gradually degrade. The quality gap between AI systems will become a direct function of who still has access to authentic human content.
For the rest of the internet, the trajectory is clear. The open web will become increasingly synthetic, increasingly unreliable, and increasingly irrelevant as a source of truth. Human knowledge production will move behind paywalls, into private communities, and off the web entirely. The search engine, which organized the web’s information for three decades, will become an interface for synthetic content talking to itself.
The dead internet theory predicted this, but it got the mechanism wrong. It imagined a conspiracy — shadowy actors deploying bots to manufacture consensus. The reality is more mundane and more consequential. The internet died of natural causes: commercial incentives that rewarded extraction over sustainability, applied at scale, until the ecosystem collapsed.
The machines became the majority in 2024. We just haven’t figured out what to do about it yet.
Originally published at https://noahaust2.github.io/strategist-dashboard/blog/the-dead-internets-birthday.html