In an exciting development that could reshape the landscape of artificial intelligence (AI), Flower AI and Vana are collaborating to create a new kind of large language model (LLM) called Collective-1, built using a decentralized approach to AI training. Rather than depending on data centers packed with powerful hardware, this approach leverages GPUs spread across the globe.
⚙️ Collective-1: A New Kind of AI Model

Flower AI and Vana’s collaboration has led to the creation of Collective-1, a 7-billion-parameter model, small next to the hundreds of billions of parameters in the models behind ChatGPT and Gemini. Despite its modest size, Collective-1 represents a major leap forward in how AI models are trained.
The companies have employed an innovative training technique that distributes the workload across hundreds of computers connected over the internet, enabling AI models to be built without the need for centralized compute resources or vast data centers. This method could allow smaller companies and research institutions to compete in building powerful AI models, potentially lowering the barrier to entry for AI development.
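To make the idea concrete, here is a minimal sketch (not Flower AI's actual code) of the pattern behind many distributed-training schemes: each machine trains on its own local data for a while, then only the small set of model parameters crosses the network to be averaged. The toy one-parameter model and all names below are illustrative assumptions.

```python
# Sketch of distributed training with periodic parameter averaging
# (local-SGD style). Toy task: fit a single weight w to minimize
# (w*x - y)^2 on each worker's private data shard.
import random

random.seed(0)

def local_shard(true_w=3.0, n=50):
    """Each worker holds its own data; nothing is centralized."""
    xs = [random.uniform(-1, 1) for _ in range(n)]
    return [(x, true_w * x + random.gauss(0, 0.1)) for x in xs]

def local_steps(w, shard, lr=0.1, steps=20):
    """A worker trains independently for several SGD steps."""
    for _ in range(steps):
        x, y = random.choice(shard)
        grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
        w -= lr * grad
    return w

workers = [local_shard() for _ in range(8)]  # 8 machines, anywhere online
w_global = 0.0
for _round in range(10):
    # Everyone trains locally; only parameters travel, not the data.
    local_ws = [local_steps(w_global, shard) for shard in workers]
    w_global = sum(local_ws) / len(local_ws)  # consolidate by averaging

print(w_global)  # converges near the true weight, 3.0
```

The key property is that the data never leaves each worker; only the parameters, which are tiny by comparison, are communicated, which is what makes training over ordinary internet links feasible at all.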
🌍 Breaking the Data Center Mold
The traditional model for developing AI has relied heavily on vast quantities of data and compute power concentrated in large data centers. Only the wealthiest companies and nations with access to advanced GPUs could afford to build the most powerful AI systems. However, with distributed training, the landscape could change.
Flower AI’s approach allows for more flexible AI model training by spreading the computational workload across various locations, potentially bringing advanced AI capabilities to countries with less infrastructure. Nic Lane, cofounder of Flower AI, explains that this approach could “scale far beyond the size of Collective-1,” with plans for a 30-billion-parameter model and even a 100-billion-parameter model later this year.
🔄 Photon: The Tool Revolutionizing Distributed AI
To make distributed training more efficient, Flower AI and its collaborators developed a tool called Photon, which improves how calculations are split between distant GPUs. Photon builds on previous approaches, like Google DeepMind’s DiPaCo, offering a more efficient way to consolidate training results and scale the process over time. While this distributed approach is slower than training inside a single data center, it offers greater flexibility, allowing new hardware to be incorporated as needed.
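Photon itself is not public, but the consolidation step it accelerates can be pictured with a simple hedged sketch: parameters from whichever workers report in, including hardware that joined mid-run, are merged by an average weighted by how much data each worker processed. The function and variable names below are hypothetical, not Photon's API.

```python
# Illustrative consolidation step for a dynamic pool of workers.
def consolidate(results):
    """results: list of (params, n_samples) from workers that reported in."""
    total = sum(n for _, n in results)
    dim = len(results[0][0])
    merged = [0.0] * dim
    for params, n in results:
        # Workers that processed more data get proportionally more weight.
        for i, p in enumerate(params):
            merged[i] += p * (n / total)
    return merged

# Two long-running workers report; a third joined late with less data.
reports = [([1.0, 2.0], 100), ([1.2, 1.8], 100), ([0.8, 2.2], 50)]
print(consolidate(reports))  # weighted average, approximately [1.04, 1.96]
```

Because the merge only needs whoever has reported so far, machines can join or drop out between rounds without stalling the run, which is the flexibility the article describes.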
💡 Unlocking New Data for AI Training
Another key aspect of this distributed approach is that it allows access to a more diverse range of data. Vana’s contribution is vital here, as the company has created software that allows users to share personal data from platforms like X, Reddit, and Telegram with AI developers. This data is typically not available for use in training traditional models due to privacy concerns, but Vana’s system gives users control over how their data is used and ensures they benefit from their contributions.
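As a purely illustrative sketch of that consent model (Vana's real software is more involved, and every name here is a hypothetical stand-in), the core idea is that a record only enters the training pipeline if its owner has opted in:

```python
# Hypothetical consent-gated data access; field names are illustrative.
from dataclasses import dataclass

@dataclass
class UserRecord:
    source: str        # e.g. "reddit", "telegram"
    text: str
    consented: bool    # set by the data's owner, revocable at any time

def training_view(records):
    """Yield only the data its owners have agreed to share."""
    return [r.text for r in records if r.consented]

records = [
    UserRecord("reddit", "post A", True),
    UserRecord("telegram", "message B", False),
    UserRecord("x", "post C", True),
]
print(training_view(records))  # ['post A', 'post C']
```

Keeping the consent flag attached to each record, rather than stripping it at collection time, is what lets users retain control over how their contributions are used downstream.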
By including decentralized and privacy-sensitive data in training, this model could unlock new opportunities for industries like healthcare and finance, which rely on sensitive, personal data for developing AI models. According to Mirco Musolesi, a computer scientist at University College London, this distributed approach could revolutionize data privacy in AI by allowing data to be used without centralizing it, reducing the risks associated with traditional data collection methods.
🚀 The Future of AI: What’s Next?
As the AI industry evolves, it’s clear that traditional, centralized models for training AI will no longer be the only option. The distributed approach, as demonstrated by Flower AI and Vana, could democratize AI development, enabling smaller players and countries with less infrastructure to compete in building powerful AI models. Whether this new approach can scale to the level of the largest industry leaders remains to be seen, but the potential for disruption is immense.
This shift in how AI models are built may signal a new era in which distributed computing and user-contributed data play a much more central role in shaping the future of machine learning.
Would you contribute your data to a decentralized AI model like Collective-1? Let us know your thoughts in the comments below!