Indian Firms Training LLMs: Challenges, Support, and Architectural Innovations
Exploring how Indian firms are training Large Language Models (LLMs) amidst challenges.
Background Context
LLMs are trained using clusters of Graphics Processing Units (GPUs), which are specialized processors designed to accelerate machine learning tasks.
Training involves feeding the model massive datasets, allowing it to learn patterns and relationships in the data. The process is computationally intensive and can cost millions of dollars due to the expense of GPUs and electricity.
Data is crucial for training LLMs. The model's performance depends on the quality and diversity of the training data. A lack of data in specific languages or domains can lead to poor performance in those areas.
The Mixture of Experts (MoE) architecture is a key innovation for improving the efficiency of LLMs. Instead of activating all parameters during inference, MoE models only activate a fraction, reducing computational costs and improving speed.
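The routing idea behind MoE can be sketched with a toy gating function. This is an illustrative sketch of generic top-k routing, not Sarvam's or any production implementation; all function names and the random gate scores are made up for the example.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_route(gate_scores, top_k):
    """Pick the top_k experts by gate score and renormalize their weights.

    Only the selected experts run on this token; the rest stay idle,
    which is how MoE cuts per-token compute."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))

random.seed(0)
scores = [random.gauss(0, 1) for _ in range(128)]  # toy gate logits for 128 experts
routing = moe_route(scores, top_k=6)
print(routing)  # 6 (expert_index, weight) pairs; weights sum to 1
```

In a real MoE layer the gate scores come from a learned router network and each selected expert is a full feed-forward block, but the selection-and-renormalize step is the core of the efficiency gain.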
Why It Matters Now
The development of LLMs in India is significant due to the potential for these models to better understand and serve the Indian context.
Challenges include the scarcity of data in Indian languages and the limited availability of capital for training. The IndiaAI Mission is addressing these challenges by subsidizing training efforts and providing access to GPUs.
Indian firms are exploring different architectures, such as the Mixture of Experts (MoE), to create more efficient and cost-effective LLMs. This is crucial for making AI accessible and affordable for Indian users.
The Ministry of Electronics and Information Technology (MeitY) is encouraging domestic LLM development, believing that foreign-developed LLMs may not prioritize Indian languages and contexts.
Key Takeaways
- LLMs are AI systems trained on vast amounts of text data.
- Training LLMs requires significant computational resources and capital.
- Data scarcity in Indian languages is a major challenge.
- The IndiaAI Mission is subsidizing LLM training efforts in India.
- Mixture of Experts (MoE) architecture improves LLM efficiency.
- Domestic LLM development is crucial for serving the Indian context.
- Sarvam AI and BharatGen are examples of Indian firms training LLMs.
Different Perspectives
- Some argue that foreign-developed LLMs can be adapted to the Indian context through translation and fine-tuning.
- Others believe that training LLMs from scratch on Indian data is necessary for optimal performance.
- There are differing views on the role of government subsidies in promoting AI development.
On February 18, 2026, Bengaluru-based AI startup Sarvam launched two indigenous large language models (LLMs) trained specifically on Indian languages, with 30 billion and 105 billion parameters respectively. IT Minister Ashwini Vaishnaw selected Sarvam AI as the first startup from 67 shortlisted companies to develop India’s first indigenous foundational model under the IndiaAI Mission. Sarvam AI secured approximately ₹99 crore in subsidies for acquiring 4,096 NVIDIA H100 GPUs. The 30-billion-parameter model handles real-time conversations with a 32,000-token context window, while the 105-billion-parameter model offers a 128,000-token window for more complex tasks.
Sarvam's models are optimized for voice-first interactions and accessible in 22 Indian languages. The 30B model uses 19 layers with 128 experts and a top-6 routing strategy, while the 105B model scales to 32 layers and employs top-8 routing over 128 experts. The company plans to open-source the 30B and 105B models. Sarvam also launched ‘Pravah’, an AI token factory that will produce tokens for industrial use across a variety of models, aiming to make AI available to everybody at a fraction of the cost.
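The routing figures above imply how sparse each forward pass is. Simple arithmetic gives the share of the expert pool a single token touches, assuming per-token top-k routing; this ignores the shared (non-expert) parameters that run on every token, so it is a rough illustration, not a disclosed spec of Sarvam's models.

```python
def active_fraction(num_experts, top_k):
    """Fraction of the expert pool a single token activates under top-k routing."""
    return top_k / num_experts

# Expert counts and top-k values are from the article; the printed fractions
# are derived arithmetic, not published figures.
for name, experts, k in [("30B model", 128, 6), ("105B model", 128, 8)]:
    frac = active_fraction(experts, k)
    print(f"{name}: top-{k} of {experts} experts activates {frac:.2%} of the expert pool per token")
```

Top-6 of 128 experts works out to under 5% of the expert pool per token, which is why MoE models can grow to very large total parameter counts while keeping per-token compute modest.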
These developments are crucial for India's technological independence and data governance, aligning with the IndiaAI Mission's goal of fostering homegrown AI systems. This is relevant for UPSC exams, particularly in the Science and Technology section (GS Paper III), as it highlights advancements in AI, government initiatives, and the challenges of developing indigenous technologies.
Key Facts
Sarvam AI released two LLMs with 30 billion and 105 billion parameters.
LLMs are trained on clusters of GPUs, costing millions of dollars.
The IndiaAI Mission has commissioned over 36,000 GPUs in data centers.
Sarvam received access to 4,096 GPUs from the government.
The government subsidy for Sarvam is estimated at almost ₹100 crore.
BharatGen, an IIT Bombay-incubated firm, trained a multilingual 17 billion parameter model.
UPSC Exam Angles
GS Paper III (Science and Technology): Developments in AI and their applications.
GS Paper II (Governance): Government policies and interventions for development of the technology sector.
Potential questions on the role of indigenous technology in achieving self-reliance and promoting inclusive growth.
In Simple Words
Think of LLMs as computer programs that can understand and speak languages. Indian companies are now building these programs, but it's hard because they need lots of data and powerful computers. The government is helping by providing resources to train these AI models.
India Angle
This means that in the future, AI assistants and chatbots will be able to understand and respond in Indian languages more accurately. This could help shopkeepers manage inventory, farmers get weather updates, and students access educational resources.
For Instance
Imagine a farmer using an AI-powered app to get real-time advice on crop management in their local language. This app is powered by an LLM trained on Indian agricultural data, making it more relevant and useful.
If these AI models are built in India, they'll be better at understanding our needs and culture. This can lead to more useful and accessible technology for everyone.
Indian-made AI can understand Indian needs better.
Bengaluru-based startup Sarvam AI released two Large Language Models (LLMs) with 30 billion and 105 billion parameters, respectively. LLMs are trained on clusters of Graphics Processing Units (GPUs), at a cost of millions of dollars. Training LLMs on Indian soil with Indian capital faces challenges due to scarce data sources and limited capital.
The IndiaAI Mission has subsidized training efforts in India by commissioning over 36,000 GPUs in data centers. Sarvam received access to 4,096 GPUs, with an estimated subsidy of almost ₹100 crore. The Ministry of Electronics and Information Technology encourages domestic LLM development, believing foreign-developed LLMs may not prioritize Indian languages.
Sarvam's models signify progress in developing inexpensive LLMs. A key breakthrough behind them is the Mixture of Experts (MoE) architecture, which activates only a fraction of a model's parameters per token, reducing the computing resources required.
Expert Analysis
The launch of Sarvam AI's LLMs highlights several key concepts in the field of artificial intelligence and India's technological landscape. The IndiaAI Mission, under which Sarvam AI was selected, is a government initiative aimed at fostering the development of indigenous AI capabilities. This mission reflects a strategic push towards technological sovereignty, ensuring that India has its own AI infrastructure tailored to its unique linguistic and cultural context. The mission provides funding, infrastructure support, and a framework for collaboration between startups, research institutions, and government agencies. The ₹99 crore subsidy for acquiring NVIDIA GPUs exemplifies the government's commitment to providing the necessary resources for AI development.
The architecture of Sarvam AI's models, particularly the Mixture of Experts (MoE) approach, is a crucial concept in understanding their efficiency. MoE involves activating only a fraction of the model's parameters at a time, reducing computational costs without sacrificing performance. This is particularly important for deploying AI models in resource-constrained environments, such as feature phones or edge devices. The 30B model's 19-layer depth with 128 experts and the 105B model's 32 layers with 128 experts demonstrate the scalability and complexity of this architecture.
Another significant concept is voice-first AI, which prioritizes voice interactions as the primary mode of communication with AI systems. Sarvam's models are optimized for voice-first applications across 22 Indian languages, addressing the needs of a large population that may not be literate or digitally savvy. This approach involves training models on high-quality datasets of Indian languages, accounting for linguistic nuances and cultural context. The development of multilingual AI chatbots like 'Vikram' showcases the potential of voice-first AI to bridge the digital divide and provide access to information and services for underserved communities.
For UPSC aspirants, understanding these concepts is crucial for both prelims and mains. Prelims may involve questions on the IndiaAI Mission, MoE architecture, or the applications of voice-first AI. Mains questions could explore the challenges and opportunities of developing indigenous AI capabilities, the role of government support in fostering innovation, or the ethical considerations of AI deployment in diverse cultural contexts.
Visual Insights
Key Statistics from India's LLM Development
Highlights key figures related to the IndiaAI mission and LLM development in India.
- IndiaAI Mission Budget: ₹10,371.92 crore. Indicates the government's commitment to fostering AI development in India.
- GPUs Commissioned by IndiaAI Mission: 36,000+. Addresses the critical need for computing infrastructure for training LLMs.
- Subsidy for Sarvam AI GPUs: ₹100 crore. Illustrates the financial support provided to domestic AI startups.
Frequently Asked Questions
1. What's the core difference between the 30-billion and 105-billion parameter LLMs that Sarvam AI launched, and why does it matter?
The primary difference lies in their capacity for handling complex tasks. The 30-billion-parameter model is designed for real-time conversations with a 32,000-token context window, while the 105-billion-parameter model, with a 128,000-token window, is built for more intricate operations. This matters because it allows for a tiered approach, where simpler tasks are handled efficiently and more complex problems can be tackled with a more powerful model.
Exam Tip
Remember the parameter counts (30B vs 105B) and their corresponding context window/token sizes. UPSC might frame a question around the capabilities of each model, testing your understanding of their relative strengths.
2. How does the IndiaAI mission's commissioning of 36,000 GPUs relate to Sarvam AI receiving 4,096 GPUs?
The IndiaAI mission is a broader initiative to bolster AI infrastructure in India. Sarvam AI's access to 4,096 GPUs is a direct result of this mission. The government is investing in creating a pool of computational resources that can be accessed by Indian startups and researchers to develop indigenous AI models. Sarvam AI was selected as the first startup to develop India's first indigenous foundational model under this mission.
Exam Tip
UPSC might ask about the IndiaAI Mission's objectives and how it supports the development of indigenous AI. Remember that providing computational resources is a key aspect of this mission.
3. Why is the Indian government subsidizing AI development, and what are the potential benefits and risks?
The Indian government is subsidizing AI development to foster indigenous AI capabilities tailored to Indian languages and cultural nuances. Global LLMs often exhibit biases and fail to address the contextual realities of India's multilingual population. Potential benefits include:
- Economic growth through AI-driven innovation.
- Improved public services through AI applications in healthcare, education, and governance.
- Reduced reliance on foreign AI technologies.
- Job creation in the AI sector.
- Addressing bias and fairness issues in AI systems.
Exam Tip
When discussing government subsidies, always consider both the potential benefits (economic growth, improved services) and potential risks (inefficiency, dependence on subsidies).
4. How does Sarvam AI's focus on 'voice-first interactions' connect with broader trends in technology and accessibility in India?
Voice-first interactions are crucial for accessibility in India, where many users may not be literate or comfortable using text-based interfaces. By optimizing for voice, Sarvam AI is tapping into a significant user base and addressing the digital divide. This aligns with the broader trend of making technology more inclusive and accessible to diverse populations.
Exam Tip
Consider the social implications of technology. Voice-first AI can be a powerful tool for inclusion, especially in countries with diverse languages and literacy levels.
5. What specific details about Sarvam AI and the IndiaAI Mission could be potential MCQ traps in the Prelims exam?
Here's a potential MCQ trap:
- The Trap: A question stating Sarvam AI received ₹1,000 crore in subsidies.
- Why it's a trap: The actual subsidy is approximately ₹100 crore.
Exam Tip
Focus on precise numerical data. Examiners often create traps by slightly altering numbers. Always double-check figures related to funding, parameters, and dates.
6. If a Mains question asks me to 'Critically examine' the IndiaAI Mission, what are some balanced arguments I should include?
When critically examining the IndiaAI Mission, consider these points:
- Potential Benefits: Fostering indigenous AI, addressing language biases, promoting economic growth, and improving public services.
- Potential Drawbacks: Risk of inefficiency, dependence on government subsidies, potential for misuse of AI technologies, and ethical concerns.
- Balanced Argument: While the IndiaAI Mission has the potential to create significant benefits for India, it is crucial to address the potential drawbacks and ensure that AI development is aligned with ethical principles and societal values.
Exam Tip
In 'critically examine' questions, always present both the positive and negative aspects of the topic. Avoid taking a one-sided approach.
Practice Questions (MCQs)
1. Consider the following statements regarding the IndiaAI Mission: 1. It aims to foster the development of indigenous AI capabilities tailored to India's linguistic and cultural context. 2. The mission provides funding, infrastructure support, and a framework for collaboration between startups, research institutions, and government agencies. 3. The mission mandates that all AI models developed under it must be open source. Which of the statements given above is/are correct?
- A. 1 and 2 only
- B. 2 and 3 only
- C. 1 and 3 only
- D. 1, 2 and 3
Answer: A
Statement 1 is CORRECT: The IndiaAI Mission aims to foster the development of indigenous AI capabilities tailored to India's linguistic and cultural context. Statement 2 is CORRECT: The mission provides funding, infrastructure support, and a framework for collaboration between startups, research institutions, and government agencies. Statement 3 is INCORRECT: While open sourcing is encouraged, it is not mandatory for all AI models developed under the mission.
2. In the context of Large Language Models (LLMs), what is the primary advantage of using a Mixture of Experts (MoE) architecture?
- A. It increases the number of parameters in the model, leading to higher accuracy.
- B. It reduces computational costs by activating only a fraction of the parameters at a time.
- C. It allows the model to process multiple languages simultaneously.
- D. It eliminates the need for training data, as the model can learn from existing knowledge.
Answer: B
The Mixture of Experts (MoE) architecture reduces computational costs by activating only a fraction of the model's parameters at a time. This allows for efficient deployment of LLMs, especially in resource-constrained environments.
3. Which of the following initiatives is primarily aimed at overcoming India's literacy, digital, and linguistic barriers to make AI tools more accessible?
- A. Digital India Initiative
- B. Bhashini Initiative
- C. AIKosh
- D. National Education Policy
Answer: B
The Bhashini Initiative is primarily aimed at overcoming India's literacy, digital, and linguistic barriers to make AI tools more accessible. It works in tandem with AIKosh, a high-quality data capture initiative and dataset platform.
Source Articles
How are Indian firms training LLMs? | Explained - The Hindu
AI: is India falling behind? - The Hindu
On AI, India’s enthusiasm contends with fundamental constraints - The Hindu
AI for all: On the India AI Impact Summit 2026 - The Hindu
White House to host Big Tech after pledge to rein in power costs - The Hindu
About the Author
Ritu Singh, Tech & Innovation Current Affairs Researcher
Ritu Singh writes about Science & Technology at GKSolver, breaking down complex developments into clear, exam-relevant analysis.