Text / Reasoning · Open weights · 4B

Vikasit 4B

Balanced small model. Good code completion and multi-turn chat.

Overview

Vikasit 4B is the balanced small model: solid coding, multi-turn chat, and strong reasoning for a 4B model (see Benchmarks). Its native 32K context extends to 131K via YaRN. It runs on a single consumer GPU or a strong laptop.

Specifications

Total parameters   4B
Architecture       Dense transformer
Layers             36
Attention          GQA (32 query / 8 KV heads), tied embeddings
Context window     32K native, 131K via YaRN
Vocabulary         151,669
Modalities         Text in → text out
License            Apache 2.0
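
The layer count and KV-head count in the table determine how much memory the KV cache needs per token. The following is a rough sketch of that arithmetic, not a measured figure: it assumes a head dimension of 128 (not stated on this page) and a bf16 cache at 2 bytes per value. The function name `kv_cache_bytes` is illustrative.

```python
def kv_cache_bytes(tokens: int, layers: int = 36, kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size in bytes for one sequence of `tokens` tokens.

    layers and kv_heads come from the specification table.
    head_dim=128 is an ASSUMPTION; bytes_per_value=2 models a bf16 cache.
    """
    # Each layer stores one key vector and one value vector per KV head per token.
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return tokens * per_token


for context in (32_768, 131_072):
    gib = kv_cache_bytes(context) / 2**30
    print(f"{context:>7} tokens: {gib:.1f} GiB")
```

Under these assumptions the cache comes to about 4.5 GiB at the native 32K context and about 18 GiB at 131K, in addition to the weights. With 32 KV heads instead of 8 the same calculation would give four times as much, which is the saving GQA provides.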

Capabilities

  • Code completion and generation
  • Multi-turn chat with good instruction following
  • 131K extended context (YaRN)
  • Thinking and non-thinking modes
  • 119 languages, with strong English and major Indian languages
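
As a hedged sketch of switching between thinking and non-thinking modes: upstream Qwen3 exposes an `enable_thinking` switch in its chat template, which vLLM and SGLang accept through `chat_template_kwargs`. This assumes Vikasit 4B keeps the upstream chat template unchanged, which this page does not confirm. The helper name `chat_kwargs` and the model name `vikasit-4b` are illustrative.

```python
from typing import Any


def chat_kwargs(prompt: str, thinking: bool,
                model: str = "vikasit-4b") -> dict[str, Any]:
    """Build keyword arguments for client.chat.completions.create().

    ASSUMPTION: the server applies a Qwen3-style chat template that honours
    the `enable_thinking` flag passed via `chat_template_kwargs`.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # extra_body is forwarded verbatim in the request JSON by the OpenAI SDK.
        "extra_body": {"chat_template_kwargs": {"enable_thinking": thinking}},
    }


fast = chat_kwargs("Summarise GQA in one line.", thinking=False)
print(fast["extra_body"])
```

With an OpenAI SDK client pointed at your own server, the result can be passed as `client.chat.completions.create(**fast)`.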

Benchmarks

Benchmark          Score
MMLU-Pro           45.0
GPQA-Diamond       55.9
AIME 2025          65.6
MATH-500           97.0
LiveCodeBench v5   54.2
BFCL v3            65.9
IFEval             81.9
HumanEval          N/A

Scores are the thinking-mode instruct numbers from the Qwen3 Technical Report, except MMLU-Pro, which is the base-model figure.

Hardware & deployment

Precision   Memory
bf16        ~8 GB
INT4        ~2.5 GB
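
The figures above follow roughly from parameter count times bits per weight. A minimal sketch of that arithmetic, treating the model as exactly 4 billion parameters (an approximation) and counting weights only:

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


PARAMS = 4e9  # ASSUMPTION: "4B" taken as exactly 4 billion parameters

print(f"bf16: {weight_memory_gb(PARAMS, 16):.1f} GB")
print(f"INT4: {weight_memory_gb(PARAMS, 4):.1f} GB")
```

This gives 8.0 GB for bf16 and 2.0 GB for INT4. The table's ~2.5 GB for INT4 is higher than the raw estimate; a plausible reason is quantization metadata and layers kept at higher precision, though the page does not say. Neither number includes the KV cache or activations.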

Quick start

Vikasit 4B is an open-weight model. Self-host it with any OpenAI-compatible inference server and call it with the OpenAI SDK as shown below.

OpenAI-compatible Python (self-hosted, e.g. vLLM)
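# pip install openai
import os
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    # Most self-hosted servers accept any token unless started with an API key
    # (e.g. vLLM's --api-key); set OPENAI_API_KEY in that case.
    api_key=os.environ.get("OPENAI_API_KEY", "sk-local"),
)

resp = client.chat.completions.create(
    model="vikasit-4b",  # must match the model name your server exposes
    messages=[
        {"role": "user", "content": "Explain Vikasit 4B in one sentence."}
    ],
)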

print(resp.choices[0].message.content)
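
For multi-turn chat, the chat completions API is stateless, so the caller resends the whole conversation on every turn. The following is a minimal sketch of that bookkeeping. The helper `chat_turn` and the `send` callable are hypothetical names introduced here so the example runs without a server; an echo function stands in for the model.

```python
from typing import Callable

Message = dict[str, str]


def chat_turn(history: list[Message], user_text: str,
              send: Callable[[list[Message]], str]) -> str:
    """Append the user message, get a reply from `send`, and record it."""
    history.append({"role": "user", "content": user_text})
    reply = send(history)  # the full history goes to the model each turn
    history.append({"role": "assistant", "content": reply})
    return reply


def echo(messages: list[Message]) -> str:
    """Offline stand-in for a real model call."""
    return f"echo: {messages[-1]['content']}"


history: list[Message] = []
chat_turn(history, "Hello", echo)
chat_turn(history, "And again", echo)
print([m["role"] for m in history])
```

Against a live server, `send` could wrap the call from the quick start, for example `lambda msgs: client.chat.completions.create(model="vikasit-4b", messages=msgs).choices[0].message.content`.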

Limitations

  • Not ideal for very long agentic chains
  • Code quality below dedicated coder models

Vikasit 4B FAQ

How much does Vikasit 4B cost?

Vikasit 4B is an open-weight model built on Qwen3-4B. The weights are free to self-host under the Apache 2.0 license; you pay only for the hardware or cloud GPUs you run it on. Typical deployments fit the memory profiles listed in the hardware section above.

Is Vikasit 4B open weight?

Yes. Vikasit 4B is built on Qwen3-4B and distributed under the Apache 2.0 license, so the weights are openly available for self-hosting, fine-tuning, and commercial use, subject to the upstream license terms.

How do I run Vikasit 4B?

Because Vikasit 4B is open weight, you can self-host it: serve the weights (built on Qwen3-4B, Apache 2.0) with any OpenAI-compatible inference server, such as vLLM or SGLang, then call it with the OpenAI SDK by setting the base URL to your own endpoint.

What context window does Vikasit 4B support?

Vikasit 4B supports a 32K-token native context window, extendable to 131K tokens via YaRN. It is a 4B-parameter dense transformer; full specifications are listed in the table above.
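
As a hedged sketch of how the extended window is usually switched on: upstream Qwen3 documents a `rope_scaling` entry of type `yarn` whose factor is the ratio of the target to the native context length. The snippet below builds that entry from the two figures above. It assumes Vikasit 4B follows the upstream Qwen3 configuration unchanged, and `yarn_rope_scaling` is an illustrative helper name.

```python
import json

NATIVE_CONTEXT = 32_768    # "32K native"
TARGET_CONTEXT = 131_072   # "131K via YaRN"


def yarn_rope_scaling(native: int, target: int) -> dict:
    """Build a Qwen3-style YaRN rope_scaling entry.

    ASSUMPTION: the key names mirror the upstream Qwen3 documentation.
    """
    return {
        "rope_type": "yarn",
        "factor": target / native,  # 131072 / 32768 = 4.0
        "original_max_position_embeddings": native,
    }


print(json.dumps(yarn_rope_scaling(NATIVE_CONTEXT, TARGET_CONTEXT)))
```

The resulting JSON is typically placed in the model's `config.json` or passed to the inference server together with a matching maximum model length. Check your server's documentation for the exact option names.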

License & attribution

Apache 2.0

Built on Qwen3-4B (Apache 2.0). Upstream copyright, license, and attribution notices are retained.