Mistral AI · 4 sizes · General LLM
Mistral models: sizes & hardware to run them
The Mistral family spans 4 sizes from 7B to 47B. Each size maps to a different hardware tier. The table below lists the approximate memory each needs at 4-bit quantization and the device we’d start with for a private local deployment.
Tools · Code · Multilingual · Long context
Sizes & hardware
| Model | Params | Context | ~VRAM @ 4-bit | Minimum device | Recommended |
|---|---|---|---|---|---|
| Mistral 7B | 7B | 32K | ~5GB | NVIDIA GeForce RTX 3060 12GB | NVIDIA B200 (placeholder) |
| Mistral Nemo 12B | 12B | 128K | ~8GB | NVIDIA GeForce RTX 3060 12GB | NVIDIA B200 (placeholder) |
| Mistral Small 24B | 24B | 32K | ~14GB | Intel Arc A770 16GB | NVIDIA B200 (placeholder) |
| Mixtral 8x7B (MoE) | 47B (≈13B active) | 32K | ~28GB | NVIDIA GeForce RTX 5090 (placeholder) | NVIDIA B200 (placeholder) |
Memory figures are approximate working-set estimates (weights + KV cache at modest context); treat them as ballpark values rather than exact requirements. Device picks come from our compatibility engine, which lists the best on-prem fit first.
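As a rough illustration of how a "weights + KV cache" estimate can be put together, here is a minimal sketch. It is not the compatibility engine's actual formula. The function name and parameters are hypothetical, and the layer and head counts are assumed from Mistral 7B's published configuration (32 layers, 8 KV heads, head dimension 128), with a 16-bit KV cache and a 4K-token context standing in for "modest context".

```python
def estimate_vram_gb(params_billion, bits_per_weight, n_layers, n_kv_heads,
                     head_dim, context_tokens, kv_bytes_per_value=2):
    """Rough working set in GB (1 GB = 1e9 bytes): quantized weights + KV cache.

    Hypothetical helper for illustration; ignores runtime buffers,
    activation memory, and quantization metadata (scales, zero points).
    """
    # Weights: one value per parameter at the chosen bit width.
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    # KV cache: a key and a value tensor per layer, each holding
    # n_kv_heads * head_dim values for every token in the context.
    kv_cache_bytes = (2 * n_layers * n_kv_heads * head_dim
                      * context_tokens * kv_bytes_per_value)
    return (weight_bytes + kv_cache_bytes) / 1e9


# Assumed Mistral 7B shape: 32 layers, 8 KV heads, head_dim 128.
mistral_7b = estimate_vram_gb(
    params_billion=7, bits_per_weight=4,
    n_layers=32, n_kv_heads=8, head_dim=128,
    context_tokens=4096,
)
print(f"Mistral 7B @ 4-bit, 4K context: ~{mistral_7b:.1f} GB before runtime overhead")
```

The result lands a little below the table's ~5GB figure, which is expected: real 4-bit formats store extra scale data per block, and inference runtimes reserve their own buffers. Raising `context_tokens` shows why long-context use needs more headroom than the table's baseline.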
Open each size
Mistral 7B
Lightweight enough for 8GB GPUs; a quick, permissively licensed assistant.
Mistral Nemo 12B
About 8GB at 4-bit, so it fits a 12GB GPU; 16GB+ leaves headroom for its 128K context. An openly licensed mid-size model built with NVIDIA.
Mistral Small 24B
About 14GB at 4-bit: fits a 16GB card, and a 24GB card leaves headroom for longer context. A capable, openly licensed mid-size model that sits between the 14B and 32B classes.
Mixtral 8x7B (MoE)
~28GB+ at 4-bit; suits 48GB pro cards or unified-memory machines. Sparse activation (about 13B of its 47B parameters per token) gives good speed for the quality.
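To make the Mixtral trade-off concrete, the sketch below separates the two quantities that sparse activation decouples: memory scales with total parameters, while per-token compute scales with active parameters. Both helper names are hypothetical, the parameter counts are the rounded figures from the table above, and the "2 FLOPs per active parameter per token" rule is a common back-of-envelope approximation, not a measured benchmark.

```python
def weight_memory_gb(total_params_billion, bits_per_weight=4):
    """Weights-only memory in GB: every expert must be resident, active or not."""
    return total_params_billion * bits_per_weight / 8


def forward_gflops_per_token(active_params_billion):
    """Approximate forward-pass cost: ~2 FLOPs per active parameter per token."""
    return 2 * active_params_billion


# Rounded figures from the table: 47B total, about 13B active per token.
mixtral_memory = weight_memory_gb(47)
mixtral_compute = forward_gflops_per_token(13)
dense_47b_compute = forward_gflops_per_token(47)

print(f"Mixtral weights @ 4-bit: ~{mixtral_memory:.1f} GB")
print(f"Per-token compute: ~{mixtral_compute} GFLOPs "
      f"vs ~{dense_47b_compute} GFLOPs for a dense 47B model")
```

Under these assumptions the model needs the memory of a 47B model but does roughly the per-token work of a 13B one, which is why it favors large-memory devices over raw compute.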
Run Mistral models inside a private AI Business OS
Run your own AI agents on hardware you control: private by design, with no data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.