There is a statistic that stops the global AI conversation cold, if anyone bothers to cite it. As of early 2026, ChatGPT has 100 million weekly active users in India, making it the single largest national user base on the platform. India uses Google Gemini for learning and education more than any other country in the world. And yet, a Google-Kantar survey found that only 31% of Indians have ever used a generative AI platform at all. India is simultaneously the world's biggest consumer of AI it did not build, and a country where the overwhelming majority of people have never been reached by it. That paradox is the engine behind everything happening in Indian AI right now.

The models that dominate global usage were trained on data that is less than 1% Indian. They understand Harvard better than they understand a village panchayat. They can parse legalese in English but stumble over a government form written in Odia. They were built for a world that is not the world most Indians inhabit. And so India has decided, with unusual government urgency and a growing cohort of serious startups, to build its own.

This is the story of that attempt. It is a story about frugal engineering, linguistic ambition, political will, and one question that nobody can fully answer yet: can a country of 1.4 billion people, with one of the world's deepest reservoirs of engineering talent and one of the thinnest pools of frontier AI compute, actually win the sovereign AI race it has entered?

Part One

The Mission: A Government Bet Unlike Any Before It

In March 2024, the Union Cabinet approved the IndiaAI Mission with an initial outlay of ₹10,372 crore, roughly $1.25 billion. The number sounds large until you compare it: OpenAI had raised more than $18 billion by October 2024. Anthropic secured multi-billion dollar backing. Mistral raised hundreds of millions. India's entire sovereign AI budget was smaller than a single Series C in Silicon Valley.

And yet the Mission's design was smarter than its budget suggested. Rather than building a single government lab, it distributed compute subsidies to private startups and academic consortia, selected through a competitive process from a shortlist of 67 applicants. Twelve companies have now been chosen across two phases. The first models went live in February 2026, unveiled at the India AI Impact Summit in New Delhi, a moment India's technology community had waited years to witness.

₹10K Cr
IndiaAI Mission budget, India's total sovereign AI allocation
12
Companies selected to build sovereign foundation models
22
Official Indian languages the models must support

The philosophy behind the Mission was articulated most sharply by Sarvam AI co-founder Vivek Raghavan: "The existing models have sub 1% Indian data." That single observation frames the entire project. When less than 1% of training data is Indian, the model does not just perform poorly on Indian languages; it carries a structural bias toward Western contexts that no amount of fine-tuning can fully correct. Building from scratch, on sovereign data, with Indian infrastructure, is not nationalism for its own sake. It is an engineering requirement.

Sovereignty matters much more in AI than building the biggest models. A model trained elsewhere on foreign data cannot truly understand a village panchayat, a crop insurance scheme, or the way 600 million people actually speak.

Vivek Raghavan, Co-founder, Sarvam AI

Part Two

The Flagship: Sarvam and the Weight of a Nation's Expectations

Of the twelve companies selected under the IndiaAI Mission, one has attracted the most scrutiny, the most investment, and the most expectation: Sarvam AI, a Bengaluru startup founded in August 2023 by two IIT Madras alumni, Vivek Raghavan and Pratyush Kumar, who had previously built AI4Bharat, the most respected Indian language AI research lab in the country.

Sarvam's ambition runs deeper than any single model. The company is building India's full-stack sovereign AI infrastructure, a cloud-native platform spanning frontier foundation models, speech and vision systems, a token factory for cost-efficient inference at population scale, and enterprise-grade deployment tooling. Alongside this infrastructure play, Sarvam has also launched Samvaad, a conversational AI product that puts its models directly in the hands of end users and enterprise teams. It is a company that wants to own the entire stack: the compute layer, the model layer, and the application layer.

Sarvam's early months were marked by scepticism. The company raised $41 million within five months of founding, led by Lightspeed Venture Partners and Khosla Ventures, but its initial models were fine-tuned from Mistral, a French foundation model, and observers questioned whether this was genuinely sovereign AI or simply an Indic wrapper on Western work. The company's pace felt slow. Downloads were modest. The critique was fair.

Then, in April 2025, Sarvam was handpicked by the government from those 67 applicants to lead India's sovereign LLM programme, receiving dedicated GPU access to 4,096 NVIDIA H100 SXM chips through Yotta Data Services, alongside nearly ₹99 crore in GPU subsidies. The mandate was unambiguous: build India's first truly indigenous foundational model, from scratch, and have it ready for the India AI Impact Summit in February 2026.

They delivered.

May 2025

Sarvam-M, A First, Imperfect Step

Fine-tuned from Mistral Small on Indian language datasets. Competitive on Indic benchmarks but not built from scratch; the critics weren't wrong to note the distinction.

November 2025

14 Days, 14 Launches

Sarvam runs an aggressive pre-summit campaign, unveiling speech models, vision models, translation systems, and enterprise tools in a deliberate echo of OpenAI's rapid-release playbook. The AI community takes notice.

February 2026

Sarvam-30B and Sarvam-105B, India's Moment

Two models unveiled at the India AI Impact Summit. Both trained entirely from scratch in India on 12 trillion tokens. The 105B model, named Indus at consumer launch, wins 90% of pairwise comparisons on Indian language benchmarks and outperforms DeepSeek R1 on several agentic tasks, despite being a model six times smaller. The room, including Google CEO Sundar Pichai in the audience, notices.

March 2026

Startup Programme, Kaze Wearable & Hardware Ambitions

Sarvam launches an API credits programme for early-stage Indian startups. And, in a move that surprises the industry, unveils Kaze, AI-powered smart glasses supporting 10+ Indian languages, demonstrated live at the Summit and even tried by Prime Minister Modi. The glasses represent Sarvam's signal that its ambitions extend beyond software: it is building an ecosystem that reaches from cloud infrastructure all the way to the edge of the human face.

Sarvam-105B's benchmark performance warrants a closer look. On Math500 it scores 98.6. On AIME 25 it reaches 96.7 with tool use. On Tau2, a benchmark measuring real-world multi-step agentic task completion, it tops all compared models. These are not just raw capability numbers; Tau2 in particular measures the kind of enterprise usefulness that drives actual adoption. For a model trained on a fraction of the compute available to global frontier labs, it is a remarkable result.

The honest caveats matter too. The model's knowledge cutoff is June 2025, leaving it unaware of events in the second half of last year. Independent testing has surfaced hallucination issues in edge cases. And Sarvam's co-founder openly acknowledges that matching the scale of Gemini or Claude requires far more capital than India has yet committed. The models are competitive and improving, but they are not yet global peers across the board. India's AI Independence Day has not quite arrived. But the flag has been planted.

Part Three

The Ecosystem

Sarvam's headline status can sometimes overshadow the true breadth of what the IndiaAI Mission has funded. The point was never individual models or isolated efforts; it was an ecosystem. What has emerged is something more durable than any standalone innovation: a layered, specialised stack in which voice complements text, open source coexists with proprietary approaches, and domain-specific expertise sits alongside general intelligence, collectively addressing the full spectrum of India's needs.

Government Lab

BharatGen (IIT Bombay)

The largest single government AI beneficiary at ₹900 crore. Led by Prof. Ganesh Ramakrishnan's consortium spanning six IITs and IIM Indore. Param2 17B MoE is open-source on Hugging Face, targeting governance, agriculture, legal services, and healthcare. The academic mandate means it is built with India-specific legal and cultural coherence in mind.

Voice-First

Gnani.ai

Not a text model, a voice company building AI for how 600 million Indians primarily interact with technology: by speaking. Vachana TTS clones voices across 12 Indian languages using just 10 seconds of audio. Everything runs inside Indian data centres, targeting government helplines, banking customer service, and healthcare access.

Enterprise IT

Tech Mahindra

One of India's largest IT services companies is not just integrating foreign models; it's building one. Project Indus now has an 8B parameter Hindi-first education model, designed for digital classrooms and adaptive tutoring. The philosophy: frugal innovation built entirely in-house, not licensed from abroad.

Reasoning

Fractal Analytics

India's first large reasoning model, focused on STEM, medical diagnostics, and complex problem-solving. Fractal recently went public, and launched Vaidya 2.0 at the Summit. A bet that domain-specific reasoning at scale creates more value than general-purpose chat for India's healthcare and analytics sectors.

Open Source

Soket AI Labs

Building openly, with model weights and artefacts released under permissive licences. Targeting defence, healthcare, and education. Their Pragna 1B LLM is the smallest model in the Mission, but the open-source mandate means every Indian developer can build on it.

Engineering AI

ZenteiQ (BrahmAI)

A genuinely unusual niche: AI for scientific computing, engineering simulations, and optimisation. BrahmAI is designed for deep-tech sectors that global models consistently underserve, a bet that India's engineering talent base creates a natural market for specialised technical intelligence.

Professional Copilot

OpenCraft AI

A multi-model copilot built for professionals, text-only with no audio input or output. OpenCraft AI has taken a deliberately risk-conscious path: self-hosting open-source models first to reduce business risk and build deep operational expertise, before venturing into its own LLM play. The strategy is diversification before disruption: earn the right to build a frontier model by mastering the infrastructure beneath it.

What is striking about this ecosystem is not any single player but the deliberate diversification. Text and voice. Consumer and enterprise. General reasoning and domain-specific expertise. Open source and proprietary. Small models and large ones. The IndiaAI Mission has, wittingly or not, replicated something like the Chinese "Six Small Tigers" model, except in India, it is a blend of government-backed initiatives and venture-driven innovation that together create a layered, resilient ecosystem.

Part Four

The Wildcard: Krutrim and the Cautionary Tale

No account of Indian AI is complete without Krutrim, and Krutrim is, depending on your vantage point, either the most audacious bet in India's AI story or its most instructive cautionary tale.

Founded in 2023 by Bhavish Aggarwal, the founder of Ola, India's dominant ride-hailing company, Krutrim became India's first AI unicorn in January 2024 with a $50 million raise at a $1 billion valuation. Aggarwal committed $230 million of his own capital in early 2025 and announced plans to raise $1.15 billion more. He announced partnerships with NVIDIA for India's largest supercomputer. He unveiled Krutrim-1 (7B), then Krutrim-2 (12B). He launched Kruti, an agentic voice assistant for Indian consumers. He announced plans for Krutrim-3, a 700 billion parameter model built with Lenovo. The ambition was total: Aggarwal wanted to build India's entire AI stack, models, cloud infrastructure, consumer apps, chips, under one roof.

Then, in mid-2025, things started going wrong. Multiple rounds of layoffs hit the company across June, July, and September, with nearly 200 employees let go in total, including most of the linguistics team. Senior executives departed. The Kruti consumer app accumulated only around 100,000 downloads on Google Play, modest traction for a company claiming to represent India's AI independence. When thousands of delegates arrived at the India AI Impact Summit in February 2026, the natural stage for Krutrim to assert its leadership, the company was largely absent from the main stage. Rivals seized the spotlight.

The Krutrim Question

Krutrim's stumble is not a death; the company still has capital, infrastructure, and a founder with genuine ambition. But it illustrates the tension at the heart of India's AI moment: the gap between the pace of announcement and the pace of delivery. India's AI race is being watched by the world. Every miss carries a cost beyond the company itself.

The contrast with Sarvam is instructive. Both are Bengaluru-based. Both launched in 2023. Both raised substantial early capital. But Sarvam focused narrowly on building the sovereign foundation model mandate it was given, delivered at the Summit, and earned its credibility through output. Krutrim aimed at everything simultaneously, and the breadth of that ambition may have been the thing that slowed it down. The contrast with OpenCraft AI is equally telling from a different angle: where Krutrim swung for the fences immediately, OpenCraft chose to de-risk first, self-hosting open-source models, building operational depth, and only then plotting its own LLM trajectory. Different bets, different timelines, different risk profiles.

Part Five

The Constraint Advantage: What Frugal Engineering Produces

The most important structural difference between India's AI ecosystem and those of the United States or China is compute. India simply does not have the GPU density of its rivals. The entire IndiaAI Mission's compute allocation, roughly 4,000 H100s through Yotta, is a fraction of what a single American hyperscaler deploys for a single training run. When DeepSeek emerged from China and demonstrated frontier performance at a fraction of expected cost, India's AI community recognized something familiar in the story: necessity had driven innovation.

India's version of that constraint has produced a specific kind of engineering philosophy. Call it frugal AI. Not because Indian engineers lack ambition, but because efficiency is not optional when compute is scarce. The results are visible across the ecosystem.

Sarvam-105B, despite having 105 billion parameters, activates only approximately 9 billion per token through its Mixture of Experts architecture, making inference costs dramatically lower than the headline parameter count suggests. BharatGen's Param2 17B MoE applies the same logic at a smaller scale, routing queries to specialist sub-networks to maximise accuracy per compute dollar. Tech Mahindra's 8B education model was built entirely in-house at what the company describes as "frugal cost." Gnani.ai built an entire voice stack that runs inside Indian data centres, optimised for low-bandwidth environments. OpenCraft AI's self-hosting strategy reflects the same instinct from a commercial angle: why pay hyperscaler inference margins when you can own the stack, control the costs, and build the expertise that eventually lets you train your own? These are not just resource-saving measures. They are design requirements that have produced genuinely novel architectures and business models alike.
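The Mixture of Experts idea behind that arithmetic is simple enough to sketch. Below is a minimal, illustrative top-k routing layer in Python; the dimensions, expert count, and router are hypothetical toy values chosen for clarity, not Sarvam's or BharatGen's actual configurations, which are not described in detail here. The point it demonstrates is the one in the paragraph above: per-token compute scales with the experts you activate, not with the total parameters you store.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions for illustration only -- not any real model's config.
d_model = 64          # hidden size of a token vector
n_experts = 16        # total expert feed-forward blocks stored in the model
top_k = 2             # experts actually activated per token

# Each "expert" is reduced here to a single weight matrix.
experts = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
           for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) / np.sqrt(d_model)

def moe_layer(x):
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = x @ router                        # score all n_experts
    top = np.argsort(logits)[-top_k:]          # indices of the k best experts
    # softmax over only the selected experts' scores
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()
    # only top_k expert matrices are multiplied; the other
    # n_experts - top_k experts sit idle for this token
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(d_model)
y = moe_layer(x)

total_params = n_experts * d_model * d_model
active_params = top_k * d_model * d_model
print(f"total expert params: {total_params}")
print(f"active per token:    {active_params} "
      f"({active_params / total_params:.0%} of total)")
```

With 16 experts and top-2 routing, only 12.5% of expert parameters touch any given token, which is the same mechanism that lets a 105B-parameter model run inference at roughly 9B-parameter cost.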

The parallel to DeepSeek is striking enough that it was explicitly noted at the India AI Impact Summit. When you cannot throw compute at a problem, you have to think harder about the problem. India's engineers are, by structural necessity, among the world's most careful optimisers. That discipline may prove to be the country's most durable competitive advantage.

India's constraint is its curriculum. When you cannot out-spend the frontier, you learn to out-think it. The frugal engineering philosophy embedded in every IndiaAI Mission model is not a compromise, it is a discipline that the resource-rich labs are now scrambling to learn.

The Billion-Voice Problem

Part Six

The Real Stakes: Who AI Is Actually For

Strip away the benchmarks and the compute politics and the sovereign posturing, and the India AI story comes down to a single, unusually concrete question: can AI reach the people who need it most?

India has 1.4 billion people. More than 500 million of them are native speakers of languages for which frontier AI models have essentially no meaningful training data. Hundreds of millions interact with technology primarily through voice, in low-bandwidth environments, on low-cost Android handsets. A model that requires an internet connection and performs best in English is not an AI product for most Indians. It is an AI product for the small slice of India that could already access the global internet.

This is precisely the gap the IndiaAI Mission's design addresses. Gnani.ai's Vachana stack is built for voice-first, low-latency, low-bandwidth deployment, targeting government helplines, rural healthcare access, and banking services in languages that global voice models handle poorly. BharatGen's Param2 is open-sourced explicitly so that developers in every corner of the country can build applications on top of it without licensing costs. Sarvam's "Pravah" token factory is designed to make inference costs low enough for mass-market consumer applications, while its Kaze smart glasses push the frontier of Indic AI all the way to the physical world. Tech Mahindra's education model targets the structural English bias in digital learning tools that has excluded hundreds of millions of students. And at the professional end of the spectrum, copilots like OpenCraft AI are building text-first tools for the knowledge workers and enterprises that form India's growing white-collar economy, a segment that is large, underserved by global tools calibrated for Western workflows, and increasingly willing to pay for AI that actually understands the Indian professional context.

These are not features. They are the mission. And it is a mission that, if executed, would represent something genuinely rare in the history of technology: AI built not for the most sophisticated users in the most connected markets, but for the people most systematically excluded from the technology that is reshaping the world around them.

✦   ✦   ✦

India is not racing China or America. It is racing itself, against the gap between the AI it can build and the 1.4 billion people who need it.

The honest assessment is that India's AI is not yet at the frontier, not globally competitive on raw reasoning, not trained on enough compute to match GPT-class models, not yet deployed at the scale of ByteDance's Doubao or Alibaba's Qwen. But the honest assessment is also that none of those global models can do what a fully sovereign Indian AI stack can do: understand a village panchayat, read a handwritten government form in Odia, respond to a query about crop insurance in Marathi through Sarvam's infrastructure, run on the kind of device and connection that most Indians actually have, and, at the professional layer, serve the knowledge worker in Pune or Hyderabad through a copilot that understands Indian business context without needing to be retrained on Western defaults.

The frugal engineering philosophy, the academic consortia, the government compute subsidies, the 12 competing foundation models, the hardware experiments like Kaze, and the risk-conscious builders like OpenCraft AI who are quietly laying the operational groundwork for the next generation of Indian LLMs, these are the raw materials of something significant. Whether they compound into a genuine sovereign AI ecosystem or remain an impressive set of government-funded experiments depends on what happens next: enterprise adoption, consumer traction, and whether the talent India has trained decides to stay and build, or leaves for a better-funded offer somewhere else.

The billion-voice problem is real. India now has, for the first time, the beginning of an answer.