Vikasit 120B
Datacenter MoE. Frontier reasoning at low inference cost.
Overview
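Vikasit 120B is a datacenter-class Mixture-of-Experts (MoE) model with 116.8B total parameters, of which only about 5.1B are active per token. It delivers frontier-level reasoning and tool use at a fraction of the serving cost of a comparable dense model, and is served live via the Vikasit API.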
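As a rough illustration of what the two parameter counts imply, the sketch below computes the share of weights used for each token. It assumes per-token compute scales approximately with active parameters, which ignores attention and memory-bandwidth costs, so treat the result as an order-of-magnitude guide rather than a serving-cost figure.

```python
# Parameter counts taken from the overview text.
TOTAL_PARAMS = 116.8e9
ACTIVE_PARAMS = 5.1e9


def active_fraction(total: float, active: float) -> float:
    """Share of the model's weights that participate in one token's forward pass."""
    return active / total


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"{frac:.1%} of weights are active per token")
```

In other words, fewer than one in twenty weights is used for any given token, even though all of them must be resident in memory.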
Specifications
- Total parameters: 116.8B
- Active parameters: 5.1B per token
- Architecture: Mixture-of-Experts
- Experts: 128 total, 4 activated per token
- Layers: 36
- Attention: GQA (64 query / 8 KV heads), alternating banded-window and dense layers, attention sinks
- Context window: 131,072 tokens (128K)
- Modalities: Text in → text out
- License: Apache 2.0
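To make the "128 total, 4 activated per token" figure concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not confirmed to match this model's actual router: the function name `route_token`, the random logits, and the choice to apply softmax only over the selected experts are all assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 128  # from the specification list
TOP_K = 4          # experts activated per token


def route_token(router_logits, k=TOP_K):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


random.seed(0)
# Hypothetical router scores for a single token; a real router computes
# these from the token's hidden state.
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
print(selected)
```

Only the four selected experts run for this token, and their outputs would be combined using the returned weights. The remaining 124 experts are skipped, which is where the gap between total and active parameters comes from.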
Capabilities
- Frontier-level reasoning with ~5B active parameters per token
- Strong agentic tool-use and coding
- 128K context
- Configurable reasoning effort
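How reasoning effort is selected through the Vikasit API is not documented here. The sketch below assumes the upstream GPT-OSS convention of stating the level in the system message (for example `Reasoning: high`); the helper `build_messages` is hypothetical, and the API may instead expose a dedicated request parameter.

```python
VALID_EFFORTS = ("low", "medium", "high")


def build_messages(user_prompt: str, effort: str = "medium") -> list:
    """Build a chat message list that requests a reasoning level.

    Assumption: the served model honours a 'Reasoning: <level>' line in the
    system message, as upstream GPT-OSS does. Check the API docs to confirm.
    """
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return [
        {"role": "system", "content": f"Reasoning: {effort}"},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("Prove that 17 is prime.", effort="high")
print(messages[0]["content"])
```

The resulting list can be passed as the `messages` argument of a chat completion call. Higher effort generally trades latency and output tokens for more thorough reasoning.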
Benchmarks
| Benchmark | Score |
|---|---|
| GPQA-Diamond | 80.1 |
| AIME 2025 | 92.5 |
| SWE-bench Verified | 62.4 |
| Humanity's Last Exam | 14.9 |
| Aider-Polyglot | 44.4 |
| MMLU | 90.0 |
| MMLU-Pro | N/A |
Scores are taken from the official GPT-OSS-120B model card (OpenAI, arXiv:2508.10925) in high-reasoning mode. The card reports MMLU (90.0) but not MMLU-Pro.
Hardware & deployment
| Precision | Approx. weight memory |
|---|---|
| bf16 | ~234 GB |
| MXFP4 | ~63 GB |
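The figures in the table above can be approximated from the parameter count alone. The sketch below is a back-of-the-envelope estimate covering weights only (no KV cache or activations). It assumes 16 bits per parameter for bf16 and 4.25 bits per parameter for MXFP4 (4-bit values plus one shared 8-bit scale per 32-value block), applied uniformly to every weight. In practice some tensors may stay in higher precision, so the MXFP4 result is a lower bound.

```python
TOTAL_PARAMS = 116.8e9  # total parameter count from the specifications


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
# Assumption: MXFP4 averages 4 + 8/32 = 4.25 bits per parameter.
mxfp4_gb = weight_memory_gb(TOTAL_PARAMS, 4.25)

print(f"bf16:  ~{bf16_gb:.0f} GB")
print(f"MXFP4: ~{mxfp4_gb:.0f} GB (lower bound)")
```

The bf16 estimate matches the table, and the MXFP4 estimate lands slightly under it, as expected if a small portion of the weights is kept at higher precision.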
Quick start
Call Vikasit 120B through the OpenAI-compatible Vikasit AI API at https://api.vikasit.ai/v1 using the model id vikasit-120b.
# pip install openai
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.vikasit.ai/v1",
    api_key=os.environ["VIKASIT_API_KEY"],
)
resp = client.chat.completions.create(
    model="vikasit-120b",
    messages=[
        {"role": "user", "content": "Explain Vikasit 120B in one sentence."}
    ],
)
print(resp.choices[0].message.content)

# or with curl
curl https://api.vikasit.ai/v1/chat/completions \
  -H "Authorization: Bearer $VIKASIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vikasit-120b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Limitations
- Text-only (no vision/audio)
- English-centric relative to Indic-focused models
Vikasit 120B FAQ
How much does Vikasit 120B cost?
Vikasit 120B is served through the Vikasit AI API with usage-based, pay-as-you-go pricing, billed per million input and output tokens. See the Vikasit AI pricing page for current rates. Because it is built on the open-weight GPT-OSS-120B (OpenAI, Apache 2.0), you can also self-host the weights at no license cost and pay only for your own compute.
Is Vikasit 120B open weight?
Yes. Vikasit 120B is built on GPT-OSS-120B (OpenAI, Apache 2.0) and distributed under the Apache 2.0 license, so the weights are openly available for self-hosting, fine-tuning, and commercial use, subject to the upstream license terms.
How do I use Vikasit 120B with the OpenAI SDK?
The Vikasit AI API is OpenAI-compatible. Point the OpenAI client's base URL at https://api.vikasit.ai/v1, set your Vikasit API key, and pass "vikasit-120b" as the model. The quick-start snippet above shows the exact Python call.
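If the `openai` package is not available, the same request can be built with the Python standard library. This is a minimal sketch that mirrors the curl call from the quick start. The helper name `build_chat_request` and the placeholder key are illustrative, and the response shape in the commented-out section is assumed to follow the OpenAI chat completions schema.

```python
import json
import os
import urllib.request

API_BASE = "https://api.vikasit.ai/v1"  # base URL from the quick start
MODEL_ID = "vikasit-120b"               # model id from the quick start


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Falls back to a placeholder so the example runs without a real key.
api_key = os.environ.get("VIKASIT_API_KEY", "placeholder-key")
request = build_chat_request("Hello", api_key)
print(request.get_method(), request.full_url)

# To send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```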
What context window does Vikasit 120B support?
Vikasit 120B supports a 131,072-token (128K) context window. It is a Mixture-of-Experts model with 116.8B total and 5.1B active parameters; full specifications are listed in the Specifications section above.
License & attribution
Apache 2.0
Built on GPT-OSS-120B (OpenAI, Apache 2.0). Upstream copyright, license, and attribution notices are retained.