Vikasit 3.5 4B
Next-gen 4B with improved reasoning and multimodal awareness.
Overview
Vikasit 3.5 4B is the next-generation 4B model: a hybrid-attention MoE with improved reasoning and multimodal awareness over the previous 4B. It has a 262K-token native context window that extends to roughly 1M tokens with YaRN, and it targets strong quality from a small footprint.
Specifications
| Specification | Value |
|---|---|
| Total parameters | 4B total |
| Architecture | Hybrid MoE (Gated DeltaNet + sparse MoE) |
| Layers | 32 |
| Context window | 262K native, ~1M via YaRN |
| Modalities | Text in → text out (multimodal-capable base) |
| License | Apache 2.0 |
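The YaRN extension in the context-window entry comes down to a scaling factor: the target length divided by the native length. The sketch below is illustrative only. It assumes "262K" means 262,144 tokens and "~1M" means 1,048,576 tokens, and it assumes the runtime accepts a Hugging Face-style `rope_scaling` entry with the key names used for earlier Qwen releases. Check the upstream model card for the exact keys before relying on them.

```python
import json

NATIVE_CONTEXT = 262_144    # assumption: "262K" read as 2**18 tokens
TARGET_CONTEXT = 1_048_576  # assumption: "~1M" read as 2**20 tokens


def yarn_rope_scaling(native: int, target: int) -> dict:
    """Build a Hugging Face-style YaRN rope_scaling entry (key names assumed)."""
    if target <= native:
        raise ValueError("YaRN is only needed when the target exceeds the native context")
    return {
        "rope_type": "yarn",
        "factor": target / native,  # how far positions are stretched
        "original_max_position_embeddings": native,
    }


config = yarn_rope_scaling(NATIVE_CONTEXT, TARGET_CONTEXT)
print(json.dumps(config, indent=2))
```

Under these assumptions the factor is 4.0. A smaller factor may be preferable when prompts stay well below 1M tokens, since position scaling can affect quality on short inputs.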
Capabilities
- Improved reasoning over previous 4B
- 262K native context, ~1M via YaRN
- Strong instruction following (IFEval 89.8)
- Thinking and non-thinking modes
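Switching between thinking and non-thinking modes is usually done per request. The sketch below builds the request arguments for both modes. It assumes the serving stack forwards `chat_template_kwargs` to the chat template and that the template honors an `enable_thinking` flag, as vLLM and SGLang do for earlier Qwen models; both names are assumptions for this model, and `build_chat_request` is a hypothetical helper.

```python
def build_chat_request(prompt: str, thinking: bool) -> dict:
    """Return keyword arguments for client.chat.completions.create()."""
    return {
        "model": "vikasit-3.5-4b",
        "messages": [{"role": "user", "content": prompt}],
        # extra_body is passed through by the OpenAI SDK unchanged; the
        # chat_template_kwargs / enable_thinking names are assumptions.
        "extra_body": {"chat_template_kwargs": {"enable_thinking": thinking}},
    }


fast = build_chat_request("Summarize YaRN in one line.", thinking=False)
deep = build_chat_request("Prove that sqrt(2) is irrational.", thinking=True)
print(fast["extra_body"])
print(deep["extra_body"])
```

With a client configured for your endpoint, the call would be `client.chat.completions.create(**fast)`.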
Benchmarks
| Benchmark | Score |
|---|---|
| MMLU-Pro | 79.1 |
| GPQA-Diamond | 76.2 |
| LiveCodeBench v6 | 55.8 |
| IFEval | 89.8 |
| HMMT Feb 2025 | 74.0 |
| MATH-500 | N/A |
Built on Qwen3.5-4B; scores are taken from the Qwen3.5-4B Hugging Face model card. HMMT Feb 2025 is reported instead of AIME 2025.
Hardware & deployment
| Precision | Memory |
|---|---|
| bf16 | ~8 GB |
| INT4 | ~2.5 GB |
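The figures in the table can be sanity-checked with weights-only arithmetic: parameter count times bytes per parameter. This is a rough sketch, assuming exactly 4e9 parameters and counting 1 GB as 10^9 bytes. It ignores the KV cache, activations, and runtime overhead.

```python
PARAMS = 4e9  # "4B total" from the specifications

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int4": 0.5}


def weight_memory_gb(params: float, precision: str) -> float:
    """Weights-only memory in GB (10**9 bytes); excludes KV cache and activations."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(PARAMS, precision):.1f} GB")
```

The bf16 estimate matches the table. The raw INT4 estimate comes out lower than the ~2.5 GB listed; the difference is plausibly quantization scales and tensors kept at higher precision, though that explanation is an assumption rather than something the source states.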
Quick start
Vikasit 3.5 4B is an open-weight model. Self-host it with any OpenAI-compatible inference server and call it with the OpenAI SDK as shown below.
# pip install openai
from openai import OpenAI

# Point the SDK at your self-hosted, OpenAI-compatible endpoint.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-local",  # placeholder; self-hosted servers typically accept any token
)

# Send a single-turn chat request to the served model.
resp = client.chat.completions.create(
    model="vikasit-3.5-4b",
    messages=[
        {"role": "user", "content": "Explain Vikasit 3.5 4B in one sentence."}
    ],
)
print(resp.choices[0].message.content)

Limitations
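- The hybrid-attention (Gated DeltaNet) kernels require a recent version of the inference runtime
- Some classic benchmarks, such as MATH-500, are not published for the base model, so no score is listed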
Vikasit 3.5 4B FAQ
How much does Vikasit 3.5 4B cost?
Vikasit 3.5 4B is an open-weight model built on Qwen3.5-4B. The weights are free to self-host under the Apache 2.0 license; you pay only for the hardware or cloud GPUs you run them on. Memory requirements are listed in the Hardware & deployment table above.
Is Vikasit 3.5 4B open weight?
Yes. Vikasit 3.5 4B is built on Qwen3.5-4B and distributed under the Apache 2.0 license, so the weights are openly available for self-hosting, fine-tuning, and commercial use, subject to the upstream license terms.
How do I run Vikasit 3.5 4B?
Because Vikasit 3.5 4B is open weight, you can self-host it: load the Qwen3.5-4B-based weights into any OpenAI-compatible inference server (such as vLLM or SGLang), then call it with the OpenAI SDK by setting the base URL to your own endpoint.
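As one way to start such a server, the sketch below assembles a `vllm serve` command line. The model path `path/to/vikasit-3.5-4b` is a hypothetical placeholder, the helper `vllm_serve_command` is illustrative, and the served name and port are chosen only to match the quick start; adjust all of them to your deployment.

```python
import shlex


def vllm_serve_command(
    model_path: str,
    served_name: str = "vikasit-3.5-4b",
    port: int = 8000,
    max_model_len: int = 262_144,  # assumption: "262K" read as 262,144 tokens
) -> list[str]:
    """Assemble a `vllm serve` invocation as an argument list."""
    return [
        "vllm", "serve", model_path,
        "--served-model-name", served_name,  # name clients pass as `model`
        "--port", str(port),
        "--max-model-len", str(max_model_len),
    ]


# Hypothetical local path; replace with wherever the weights actually live.
cmd = vllm_serve_command("path/to/vikasit-3.5-4b")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` or pasted into a shell. Lowering `max_model_len` reduces the memory reserved for the KV cache on smaller GPUs.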
What context window does Vikasit 3.5 4B support?
Vikasit 3.5 4B supports a 262K-token native context window, extendable to roughly 1M tokens with YaRN. It is a 4B-parameter hybrid MoE model (Gated DeltaNet + sparse MoE); full specifications are listed in the table above.
License & attribution
Apache 2.0
Built on Qwen3.5-4B (Apache 2.0). Upstream copyright, license, and attribution notices are retained.