Technology
The Technology Behind Vikasit
Frontier AI capability meets Indian language mastery.
Our Approach: Continual Pre-Training
Instead of training from scratch, we build on Qwen3's world-class foundation and add deep Indian language expertise.
Qwen3-235B
World-class coding, math, reasoning (Apache 2.0)
+ 2T Indic Tokens
AIKosh + curated Indian language corpus across 22+ languages
Vikasit
Global intelligence + Indian language fluency
This approach is 7-10x more cost-efficient than training from scratch while delivering a model that excels at both global tasks and Indian languages.
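The 7-10x figure can be sanity-checked with back-of-envelope arithmetic, since training cost scales roughly linearly with tokens processed at a fixed model size. The from-scratch token budget below is an illustrative assumption, not a published number:

```python
# Back-of-envelope check of the continual pre-training cost ratio.
# Assumption: a comparable frontier model trained from scratch would need
# roughly 15-20T tokens; continual pre-training here uses 2T Indic tokens.

def cost_ratio(scratch_tokens_t: float, continual_tokens_t: float) -> float:
    """Ratio of from-scratch training cost to continual pre-training cost,
    treating cost as proportional to tokens processed."""
    return scratch_tokens_t / continual_tokens_t

low = cost_ratio(15, 2)   # conservative from-scratch budget
high = cost_ratio(20, 2)  # generous from-scratch budget
print(f"{low:.1f}x - {high:.1f}x")  # roughly the 7-10x range claimed above
```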
Model Family
Three sizes for every deployment scenario.
Vikasit-8B
Edge & Mobile
8B parameters
Runs on phones and edge devices. Optimized for low-latency inference with full 22-language support.
Vikasit-32B
Enterprise
32B dense
Single GPU deployment for production workloads. Dense architecture delivers consistent performance for enterprise applications.
Vikasit-235B
Frontier
235B MoE (22B active)
Flagship model with Mixture-of-Experts architecture. Frontier-level coding, reasoning, and Indian language mastery.
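The reason a 235B Mixture-of-Experts model is practical to serve is that per-token inference compute scales with the *active* parameters (22B), not the total. A rough sketch using the common ~2·N FLOPs-per-token approximation; the comparison is illustrative, not a measured benchmark:

```python
# Rough per-token inference compute: dense vs. Mixture-of-Experts.
# Uses the common ~2 * N_active FLOPs-per-token approximation.

def flops_per_token(active_params_b: float) -> float:
    """Approximate forward-pass FLOPs per token (params in billions)."""
    return 2 * active_params_b * 1e9

dense = flops_per_token(235)  # hypothetical dense 235B model
moe = flops_per_token(22)     # 235B MoE with 22B active per token
print(f"MoE uses ~{dense / moe:.1f}x less compute per token")
```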
Infrastructure Stack
Production-grade infrastructure for sovereign AI.
vLLM
High-throughput inference engine with continuous batching and PagedAttention
Kong Gateway
API authentication, rate limiting, and intelligent model routing
OpenAI-Compatible API
Drop-in replacement — existing tools and SDKs just work
Kubernetes (K3s)
Container orchestration with GPU-aware scheduling and auto-scaling
Prometheus + Grafana
Real-time monitoring of GPU utilization, latency, and throughput
AIKosh Integration
5,500+ government datasets across 20 sectors for training data
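Because the stack speaks the OpenAI chat-completions wire format, any OpenAI SDK or plain HTTP client can target it by swapping the base URL. A minimal stdlib sketch; the endpoint URL and model name are placeholder assumptions, not real deployments:

```python
import json
import urllib.request

# Minimal client for an OpenAI-compatible /v1/chat/completions endpoint.
# BASE_URL and the model name below are illustrative assumptions.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, messages: list, api_key: str = "EMPTY"):
    """Build an OpenAI-format chat-completions HTTP request (not yet sent)."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request(
    "vikasit-8b",
    [{"role": "user", "content": "नमस्ते! भारत की राजधानी क्या है?"}],
)
# urllib.request.urlopen(req) would send it to a running server.
```

Because the request shape is unchanged from OpenAI's API, existing tooling only needs the base URL pointed at the gateway.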
Custom Indic Tokenizer
Existing multilingual tokenizers require 4-8 tokens per Indic word vs. 1.4 for English — making Indian language inference 3-5x more expensive. Our custom tokenizer achieves 1.5-2.2 tokens per word across all 22 Indian languages, cutting inference costs by 2-3x.
GPT-4 (Hindi): 5.2 tokens/word
Sarvam-1 (Hindi): 1.8 tokens/word
Vikasit target: 1.5 tokens/word
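The "tokens per word" numbers above are a tokenizer-fertility metric: total tokens emitted divided by whitespace-delimited words. A minimal sketch of the computation; the sample token counts are illustrative, not measured:

```python
# Tokenizer "fertility": average tokens emitted per word. Higher fertility
# means longer sequences and proportionally higher inference cost for the
# same text.

def fertility(tokens: list, text: str) -> float:
    """Tokens per whitespace-delimited word."""
    return len(tokens) / len(text.split())

# Illustrative (not measured) token counts mirroring the figures above:
english = "the quick brown fox jumps over the lazy dog"    # 9 words
hindi = "तेज़ भूरी लोमड़ी आलसी कुत्ते के ऊपर कूदती है"      # 9 words
print(fertility(["tok"] * 13, english))  # ~1.4 tokens/word
print(fertility(["tok"] * 47, hindi))    # ~5.2 tokens/word
```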
Open Source Commitment
Everything we build is open-source under Apache 2.0 — models, tokenizer, benchmarks, training code, and data processing tools. We're building India's AI commons.