
Samizdat(a) in the New Age

18 min read

Planted: 4 months ago
Last tended to: 3 weeks ago
[Image: a stamp on a document that reads "samizdat" and "uncontrolled."]

Assumed Audience: People who are already following the AI space. You’re curious about the future of technology beyond corporate roadmaps and are interested in the decentralized, grassroots movements building AI with different values and a different philosophy.

Where Silicon Meets Street

Something is happening in the spaces between official announcements and quarterly earnings calls, something that feels both ancient and utterly contemporary: the rise of what might be called Samizdat AI.

The word carries weight from another era: those underground publications in Soviet states, hand-copied and passed from person to person, existing in defiance of centralized control. Self-published, literally, but the concept ran deeper than that. It was about refusing to let voices be silenced, about creating culture that couldn’t be commodified or controlled.

The parallel isn’t perfect, but it’s there. In server farms hidden in suburban basements, in Discord channels that never see daylight, in BitTorrent networks that treat AI models like mixtapes, something similar is taking shape. A quiet revolution that most people won’t notice until it’s already changed everything. It isn’t the future we were promised in those breathless articles about artificial general intelligence or the singularity. It’s messier than that, more distributed, more human. It’s happening not in gleaming corporate research labs but in the margins, in communities that Silicon Valley doesn’t even know exist, among people who’ve decided they’re not going to wait for permission to build the tools they need.

The signs are everywhere if you know where to look. The barriers to frontier AI aren’t primarily technological anymore. They’re institutional. And institutions can be bypassed.

Economics Underground

The numbers tell a story that venture capital presentations prefer to ignore. When DeepSeek demonstrated that frontier AI performance could be achieved with training costs orders of magnitude lower than Western counterparts, it did more than challenge assumptions about Chinese technological capabilities. It proved that the relationship between capital investment and AI capability isn’t nearly as linear as the dominant narratives suggest.

This matters in ways that extend far beyond cost optimization. When building competitive AI requires hundreds of millions in capital, only a handful of organizations can participate. Google, Microsoft, OpenAI, Anthropic…the list of organizations with the resources to train truly large-scale models remains remarkably short. When it requires millions, the number increases by orders of magnitude. Universities, well-funded startups, government research labs, and even some particularly ambitious individuals enter the conversation. But here’s where it gets really interesting: when it requires thousands, a threshold some distributed approaches are nearing, suddenly everyone becomes a potential participant. A community of a few hundred people pooling modest resources can collectively marshal the computing power that would have been accessible only to major corporations just a few years ago.

The economic implications cascade through the entire AI ecosystem. Traditional business models assume scarcity: expensive models running on expensive infrastructure, accessed through subscription services or API calls. But what happens when the models become freely available and the infrastructure becomes distributed?

This new reality creates a prisoner’s dilemma for AI companies: everyone would benefit from maintaining proprietary control, but any individual company that releases openly gains competitive advantages that outweigh the loss of exclusivity. The rational response is to release early and often. Meta’s decision to release the Llama models openly reflects this dynamic. Despite the significant investment required to train these models, Meta recognized that the competitive advantages of community improvement outweighed the potential revenue from licensing. The decision proved prescient: Llama-based models quickly became the foundation for countless community projects.
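The incentive structure described above can be made concrete with a toy payoff matrix. All of the numbers below are invented for illustration; only their ordering matters. The point is that "release openly" is a dominant strategy even though both labs keeping models closed would maximize their joint payoff, which is exactly the prisoner's-dilemma shape the text describes.

```python
# Toy payoff matrix for two AI labs choosing to keep a model
# proprietary ("closed") or release the weights ("open").
# Payoff numbers are invented for illustration; only the ordering matters.
# payoffs[(my_move, rival_move)] = my payoff
payoffs = {
    ("closed", "closed"): 10,  # both keep exclusivity: best joint outcome
    ("open",   "closed"): 12,  # I release: community adoption beats exclusivity
    ("closed", "open"):    4,  # rival releases: my paid model competes with a free one
    ("open",   "open"):    6,  # both release: everyone builds on open weights
}

def best_response(rival_move: str) -> str:
    """Return the move that maximizes my payoff, given the rival's move."""
    return max(["closed", "open"], key=lambda move: payoffs[(move, rival_move)])

# Releasing is the best response whatever the rival does (a dominant
# strategy), even though (closed, closed) has the highest joint payoff.
assert best_response("closed") == "open"
assert best_response("open") == "open"
```

Under any payoffs with this ordering, the rational response is the one the text predicts: release early and often.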

The mainstream technology industry hasn’t figured out how to respond to this challenge yet. The traditional approach would be to use intellectual property law, but AI models are notoriously difficult to protect. Trade secret protection works only as long as the secrets remain secret. The result is an IP environment that favors open development over proprietary control, creating a parallel economy that operates according to entirely different principles.

The Tools of Digital Samizdat

The infrastructure supporting this underground renaissance would have seemed impossible just a few years ago. The same technologies that enable centralized AI development also enable distributed alternatives, often in ways that their original designers never anticipated.

When Meta’s original Llama 1 model leaked via BitTorrent on 4chan in March 2023, it revealed the impossibility of controlling AI model distribution through traditional means. The leak wasn’t technically sophisticated—someone simply posted a link in a public forum—but the implications were profound. Within hours, thousands had downloaded the models. Within days, modified versions appeared that removed safety restrictions. Within weeks, entire ecosystems of tools had emerged based on the leaked models.

This event highlighted the robust toolchain of the Samizdat movement. BitTorrent protocols that once pirated movies now distribute multi-gigabyte language models. Consumer graphics cards for gaming now train neural networks. Open-source frameworks like PyTorch and TensorFlow provide the foundational software, while cloud platforms offer GPU access by the hour.

The platforms hosting this work — GitHub, Hugging Face, and academic preprint servers — have become the infrastructure for this parallel ecosystem. GitHub’s data reveals the scale: over 70,000 new generative AI projects were created in 2024 alone. Hugging Face hosts over 1 million models, becoming a digital samizdat printing press where anyone can publish without seeking approval from corporate gatekeepers.

The technical infrastructure extends beyond just storage and distribution. Projects like Petals enable “BitTorrent-style” distributed inference, turning the internet itself into a massive, decentralized AI system. Container technologies like Docker and orchestration systems like Kubernetes make it simple to deploy these models independently. The entire pipeline, from development to deployment, can now happen completely outside the corporate AI ecosystem.

This technical independence enables the cultural and political independence that characterizes AI samizdat. Communities that want to build AI systems according to their own values don’t need to convince commercial platforms to support their use cases. They don’t need to comply with platform policies that may conflict with their cultural practices. They don’t need to worry about their access being restricted if their approaches conflict with corporate interests.

Resistance as Innovation

With the economic and technical barriers lowered, the more significant question emerges: what should AI be for? The current AI landscape reflects a particular set of assumptions — predominantly Western, English-speaking, and commercial — about how intelligence should be organized. The result is AI that works well for some communities and poorly for others.

The underground AI movements represent something different. Not necessarily better, but different, emerging from other cultural contexts and optimizing for different outcomes. Eastern philosophical approaches emphasize harmony and collective benefit. African approaches prioritize community, perhaps suggesting AI should strengthen social bonds rather than isolate individuals.

These philosophical differences lead to different technical choices. An AI system designed around Ubuntu principles might optimize for community cohesion rather than individual productivity. One built on Confucian frameworks might prioritize social harmony over individual expression. One emerging from Islamic jurisprudence might embed different concepts of fairness and justice than systems built on Western legal traditions.

The mainstream AI industry addresses cultural diversity through localization, but this treats culture as a surface feature. The underground movements go deeper, questioning fundamental architectural assumptions.

What if recommendation systems optimized for community wellness rather than engagement?

What if language models were designed to preserve linguistic diversity rather than converge toward dominant languages?

What if AI systems were built to strengthen traditional knowledge systems rather than replace them?

These questions aren’t purely theoretical. Language preservation efforts in indigenous communities are a prime example. Instead of waiting for major tech companies, they build their own models, trained intensively on their own small but culturally rich datasets. These projects represent a form of technological sovereignty, where communities take control of how AI systems understand and represent their cultures. This is resistance not through opposition, but through the creation of profoundly different tools.

Geopolitics of Distribution

The cumulative effect of these distributed, culturally-grounded AI development efforts is a gradual shift away from a centralized, Western-dominated AI landscape toward a more multipolar ecosystem. This shift challenges the global order in fundamental ways.

Traditional models of technology transfer assume that advanced capabilities are developed in rich countries and then diffuse to poorer ones, creating dependencies. AI samizdat short-circuits this process entirely. A powerful language model can be in the hands of developers in Sub-Saharan Africa or Southeast Asia within hours of its release, where it can be adapted to local needs on local infrastructure.

This has already begun to reshape global technology dynamics:

  • China, in response to U.S. export controls, has focused on efficiency innovations that achieve competitive performance using widely available components, bypassing restrictions and building an independent ecosystem.

  • India is leveraging its substantial technical talent to build AI for domestic needs—like healthcare in resource-constrained environments and education in multilingual contexts. Its high rate of AI tool usage reflects a deep integration of these technologies into distinctly Indian approaches.

  • West Asia, with massive investments like the UAE’s MGX fund and Saudi Arabia’s “Project Transcendence,” is building indigenous capabilities for applications like Sharia-compliant finance and Arabic language processing that understands regional dialects.

  • The African continent, while facing resource constraints, sees grassroots development flourish. Community organizations, universities, and diaspora communities build systems that serve local needs, such as a model that understands a local language or local agricultural conditions, operating independently of national policies.

This reverse innovation pattern, where solutions from developing regions are adopted in developed ones, represents a significant shift. Instead of a one-way flow of technology, AI innovations are now moving in multiple directions simultaneously, creating a multipolar AI world where accountability and governance can no longer be dictated by a handful of nations.

Regulation, Meet Reality

As this new ecosystem emerges, the old world’s power structures are trying to impose order, creating a fascinating paradox: regulations designed to control AI often have the unintended consequence of strengthening the very underground movements they cannot touch.

The EU AI Act, the world’s most comprehensive attempt at regulation, treats AI development as a corporate activity. It imposes extensive documentation, risk assessment, and auditing requirements. For a large bank, this is manageable; for a 20-person startup, €400,000 in compliance costs can be fatal. For an informal, decentralized project, it’s simply irrelevant. The result is a bifurcation: regulated, expensive, and slow innovation in the corporate sphere, and unregulated, fast, and experimental innovation in the underground.

American export controls create similar paradoxes. The January 2025 implementation of a three-tier global system restricts access to advanced semiconductors based on geopolitical relationships rather than technical requirements. Tier 1 countries (18 close allies) face minimal restrictions. Tier 2 countries receive fixed allocations of Nvidia’s H100-equivalent GPUs through 2027. Tier 3 countries face comprehensive restrictions designed to prevent them from developing competitive AI capabilities.

The policy assumes that AI development requires access to the most advanced semiconductors and that controlling semiconductor access will control AI capability development. But the underground AI networks have responded by focusing on efficiency innovations that achieve competitive performance using widely available hardware. DeepSeek’s breakthrough represents just one example of how resource constraints drive innovation in directions that bypass regulatory restrictions.

The export controls also assume that AI development happens through formal organizations that can be identified and regulated. But when AI development happens through informal international networks, export controls become difficult to enforce. A researcher in a Tier 3 country can contribute to AI development by participating in open-source projects, sharing techniques through online forums, or collaborating with researchers in less restricted countries. The contributions may be valuable even if the researcher doesn’t have direct access to restricted hardware.

Chinese responses to U.S. export controls illustrate how regulatory restrictions can accelerate rather than prevent alternative development paths. Instead of trying to access restricted Western technologies, Chinese companies have invested heavily in alternative approaches. The USD 47.5 billion semiconductor fund, CNY 60 billion national AI fund, and CNY 138 billion in local venture guidance funds represent massive state investment in developing indigenous capabilities that don’t depend on Western technology access.

But the Chinese response goes beyond just state investment. The “Delete A” directive to remove American technology from Chinese systems has accelerated the development of alternative software stacks, alternative hardware architectures, and alternative approaches to AI development. The recruitment programs offering signing bonuses of USD 420,000 to USD 700,000 for foreign AI talent represent attempts to build Chinese AI capabilities by attracting researchers from less restricted environments.

Some jurisdictions are experimenting with more flexible approaches. Singapore’s voluntary Model AI Governance Framework provides guidance for AI development without imposing rigid requirements. The framework recognizes that different organizations have different capabilities and different risk profiles, and it provides tools for self-assessment rather than mandatory compliance requirements.

UAE innovation zones offer even more flexibility, creating regulatory sandboxes where AI developers can experiment with novel approaches without immediately complying with all applicable regulations. The approach recognizes that innovation often requires experimentation that doesn’t fit neatly into existing regulatory categories, and it provides mechanisms for learning about new technologies before finalizing regulatory approaches.

These flexible approaches may be better suited to the reality of distributed AI development than the more rigid frameworks adopted in Europe and the United States of America. They provide guidance and oversight without imposing compliance costs that exclude smaller developers or alternative approaches. They recognize that innovation happens through experimentation and that regulatory frameworks need to accommodate rather than prevent beneficial experimentation.

The combined effect is a form of regulatory-driven innovation that produces alternatives to Western AI systems faster than would have happened under less restrictive policies. The restrictions create incentives for alternative development while the underground networks provide mechanisms for sharing innovations across regulatory boundaries.

This mismatch reveals that regulations assume centralized development and clear accountability. The reality is distributed networks and rapidly evolving applications. More flexible approaches may be better suited to this reality. But even they struggle to govern informal networks, posing a fundamental challenge for policymakers: how do you guide a revolution that refuses to ask for permission?

Persistence of Alternatives

Perhaps the most important insight from examining Samizdat AI is that its ultimate function is to ensure alternatives continue to exist. In a field moving toward convergence, where a few successful techniques dominate, the preservation of different approaches is critical. Underground AI development serves as a cultural and technical repository for possibilities that might otherwise disappear.

This convergence isn’t necessarily problematic for commercial applications, but it can be limiting for communities with different needs, different values, or different relationships with technology. When the available AI systems all reflect similar assumptions about how intelligence should be organized and deployed, communities that don’t share those assumptions have limited options for building AI systems that work well for their contexts.

  • It maintains technical diversity in how intelligence gets organized and deployed.
  • It preserves cultural diversity in how human-machine relationships get structured.
  • It sustains economic diversity in how AI gets funded and controlled.
  • It keeps alternative possibilities alive even when they’re not commercially viable or institutionally supported.

The preservation function becomes more important as AI systems become more integrated into social and economic infrastructure. Once AI becomes essential to how societies function — mediating access to information, facilitating economic transactions, supporting decision-making in critical domains — the available AI systems effectively define the boundaries of social possibility. If all available AI systems reflect similar values and serve similar interests, society’s ability to explore alternative approaches gets constrained.

EleutherAI provides a concrete example of how underground AI development can preserve and develop alternatives to mainstream approaches. The organization emerged from a grassroots Discord collective in 2020, motivated by frustration with the limited availability of large language models for research purposes. When GPT-3 was available only through OpenAI’s API with substantial usage restrictions, EleutherAI set out to create open alternatives that researchers could use without restrictions.

The technical challenges were substantial. Training large language models requires significant computational resources, specialized expertise, and careful coordination across distributed teams. The EleutherAI collective had limited funding, no formal organizational structure, and members scattered across the globe. But they had something that many commercial AI projects lack: a clear mission to create public goods rather than proprietary products, and a commitment to open development processes that enable community participation.

The GPT-Neo series that EleutherAI released demonstrated that high-quality language models could be developed through distributed collaboration rather than centralized corporate research. The models weren’t as large or as sophisticated as contemporary commercial alternatives, but they were freely available, fully documented, and designed to be modified and extended by other researchers.

More importantly, EleutherAI’s approach demonstrated alternative ways of organizing AI development. Instead of the hierarchical, proprietary, commercially driven development processes that characterize most corporate AI research, EleutherAI used open, collaborative, community-driven processes that prioritized scientific understanding and public benefit over commercial advantage.

The approach influenced subsequent AI development in ways that extend far beyond EleutherAI’s specific technical contributions. The emphasis on open development, detailed documentation, and community participation became standard practices for many AI research projects. The focus on creating public goods rather than proprietary products inspired similar efforts by other organizations. The demonstration that distributed collaboration could produce competitive AI systems challenged assumptions about the necessity of centralized corporate control over AI development.

But EleutherAI’s trajectory also illustrates the challenges of sustaining alternative approaches to AI development. As the organization has grown and achieved recognition, it has faced pressure to become more formally structured, more commercially oriented, and more aligned with mainstream AI development practices. The incorporation as a non-profit in 2023 and partnerships with established organizations like Mozilla represent moves toward institutionalization that may be necessary for sustainability but that also risk diluting the grassroots, community-driven character that made EleutherAI distinctive.

Similar tensions appear in other underground AI projects as they grow and achieve success. The pressure to scale, to attract funding, to comply with regulations, and to integrate with mainstream AI ecosystems can gradually erode the alternative approaches that made these projects valuable in the first place. The challenge is maintaining the distinctive characteristics that enable alternative approaches while achieving the scale and stability necessary for long-term impact.

This tension between preservation and evolution reflects broader dynamics in how alternative technologies develop and spread. Innovations that emerge from underground networks often get incorporated into mainstream systems as they prove their value, but the incorporation process can strip away the cultural and political dimensions that made the innovations meaningful for the communities that developed them.

The solution isn’t to prevent successful underground projects from evolving or integrating with mainstream systems. The solution is to ensure that the underground networks continue to generate new alternatives faster than existing alternatives get absorbed into the mainstream. This requires maintaining the conditions that enable distributed innovation: accessible tools, supportive communities, cultural values that prioritize experimentation and diversity, and economic structures that don’t require immediate commercial viability.

The preservation of alternatives also requires conscious effort to document and maintain approaches that might seem obsolete or inefficient by current standards but that might become valuable in different contexts. The history of technology is full of examples where abandoned approaches became important again when circumstances changed — techniques that were too expensive became viable when costs decreased, methods that didn’t scale became relevant for specialized applications, approaches that seemed unnecessary became essential when mainstream alternatives failed.

What Persistence Teaches Us

Looking ahead, the question isn’t whether Samizdat AI will grow — the technical and cultural trends make that inevitable. The question is what happens when these networks become sophisticated enough to provide genuine alternatives to corporate-controlled systems.

Unlike previous underground technology movements that were often co-opted or marginalized (ham radio, early internet culture), Samizdat AI is not just about distributing content; it’s about creating entirely new capabilities grounded in different values. This suggests it may follow a different trajectory, leading to a permanently distributed, culturally diverse, and resiliently decentralized technological ecosystem.

This creates the possibility of a genuinely multipolar AI world where communities can choose systems that align with their values. The diversity would be fundamental—different optimization targets, architectures, and governance models. Such an ecosystem would be more innovative and more resilient to the systemic risks that concentration creates.

The governance challenges are complex, but these networks are already experimenting with solutions like reputation systems and federated governance. The economic models are nascent, but they are being built on principles of commons and collaboration rather than extraction.

The quiet revolution of Samizdat AI isn’t dramatic enough to generate headlines. But revolutions often begin quietly, in the margins. The ghosts in the machine are building their own machines. And those machines are starting to talk back in languages that Silicon Valley never bothered to learn, serving communities that venture capital never thought to fund, embodying values that corporate boardrooms never considered worth preserving.

The future of artificial intelligence is being written in a thousand different scripts, by people who refuse to accept that their needs don’t matter, their cultures don’t count, and their ways of understanding the world aren’t worth augmenting with artificial intelligence that actually understands them.

It’s a future worth preserving, worth building, worth defending.

One model at a time.
One community at a time.
One alternative at a time.