BrainOutput

LLM Hardware Requirements

Approximate memory each open model needs at each quantization level, and the smallest catalog device that can run it. Figures are working-set estimates (weights plus KV cache at a modest context length), so treat them as approximate. As a rule of thumb, budget for the memory shown in the 4-bit (Q4) column.
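
As a rough illustration of how a working-set figure like these can be approximated, the sketch below adds weight storage to a flat KV-cache allowance. The function name `estimate_memory_gb` and the default 1 GB allowance are assumptions made for this example, not the method used to produce the table.

```python
def estimate_memory_gb(params_billions, bits_per_weight, kv_cache_gb=1.0):
    """Rough working-set estimate: weight storage plus a flat KV-cache allowance.

    kv_cache_gb is an assumed allowance for a modest context length;
    real KV-cache size depends on the model architecture and context used.
    """
    # 1 billion parameters at 8 bits per weight occupy about 1 GB.
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + kv_cache_gb


if __name__ == "__main__":
    # An ~8B-parameter model at three nominal precisions.
    for label, bits in [("Q4", 4), ("Q8", 8), ("FP16", 16)]:
        print(label, estimate_memory_gb(8, bits), "GB")
```

For an ~8B model this gives about 9 GB at 8 bits and 17 GB at 16 bits. A nominal 4 bits per weight gives a lower number than the table's Q4 column, which is plausible because practical 4-bit formats typically store extra scaling data alongside the weights, so their effective bits per weight is somewhat above 4.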

General LLM

| Model | Params | Context | Q4 | Q8 | FP16 | Minimum device |
|---|---|---|---|---|---|---|
| Llama 3.2 1B | ~1B | 128K | ~1GB | ~1.5GB | ~3GB | NVIDIA GeForce RTX 3060 12GB |
| Llama 3.2 3B | ~3B | 128K | ~2.5GB | ~4GB | ~7GB | NVIDIA GeForce RTX 3060 12GB |
| Llama 3.1 8B | ~8B | 128K | ~6GB | ~9GB | ~17GB | NVIDIA GeForce RTX 3060 12GB |
| Llama 3.1 70B | ~70B | 128K | ~42GB | ~75GB | ~140GB | NVIDIA RTX A6000 |
| Llama 3.3 70B | ~70B | 128K | ~42GB | ~75GB | ~140GB | NVIDIA RTX A6000 |
| Llama 3.1 405B | ~405B | 128K | ~230GB | ~410GB | ~810GB | Supermicro 8x H100 SuperServer |
| Qwen2.5 7B | ~7B | 128K | ~5.5GB | ~8GB | ~15GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen2.5 14B | ~14B | 128K | ~10GB | ~16GB | ~30GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen2.5 32B | ~32B | 128K | ~20GB | ~34GB | ~64GB | NVIDIA GeForce RTX 3090 |
| Qwen2.5 72B | ~72B | 128K | ~44GB | ~78GB | ~145GB | Apple Mac mini (M4 Pro) |
| Qwen3 8B | ~8B | 128K | ~6GB | ~9GB | ~17GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen3 14B | ~14B | 128K | ~10GB | ~16GB | ~30GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen3 32B | ~32B | 128K | ~20GB | ~34GB | ~64GB | NVIDIA GeForce RTX 3090 |
| Qwen3 235B-A22B (MoE) | ~235B | 128K | ~130GB | ~235GB | ~470GB | NVIDIA B200 (placeholder) |
| Mistral 7B | ~7B | 32K | ~5GB | ~8GB | ~15GB | NVIDIA GeForce RTX 3060 12GB |
| Mistral Small 24B | ~24B | 32K | ~14GB | ~25GB | ~48GB | Intel Arc A770 16GB |
| Mixtral 8x7B (MoE) | ~47B | 32K | ~28GB | ~50GB | ~90GB | NVIDIA GeForce RTX 5090 (placeholder) |
| Gemma 2 9B | ~9B | 8K | ~7GB | ~10GB | ~19GB | NVIDIA GeForce RTX 3060 12GB |
| Gemma 2 27B | ~27B | 8K | ~17GB | ~29GB | ~54GB | NVIDIA GeForce RTX 3090 |
| Phi-3 Medium (14B) | ~14B | 128K | ~9GB | ~15GB | ~28GB | NVIDIA GeForce RTX 3060 12GB |
| Phi-4 (14B) | ~14B | 16K | ~9GB | ~15GB | ~28GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen2.5 0.5B | ~0.5B | 32K | ~0.4GB | ~0.6GB | ~1GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen2.5 1.5B | ~1.5B | 32K | ~1GB | ~1.7GB | ~3GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen2.5 3B | ~3B | 32K | ~2.2GB | ~3.4GB | ~6GB | NVIDIA GeForce RTX 3060 12GB |
| Gemma 2 2B | ~2B | 8K | ~1.6GB | ~2.4GB | ~4GB | NVIDIA GeForce RTX 3060 12GB |
| Gemma 3 4B | ~4B | 128K | ~3GB | ~4.5GB | ~8GB | NVIDIA GeForce RTX 3060 12GB |
| Gemma 3 12B | ~12B | 128K | ~8GB | ~13GB | ~24GB | NVIDIA GeForce RTX 3060 12GB |
| Gemma 3 27B | ~27B | 128K | ~17GB | ~29GB | ~54GB | NVIDIA GeForce RTX 3090 |
| Phi-3.5 Mini (3.8B) | ~3.8B | 128K | ~2.5GB | ~4GB | ~8GB | NVIDIA GeForce RTX 3060 12GB |
| Mistral Nemo 12B | ~12B | 128K | ~8GB | ~13GB | ~24GB | NVIDIA GeForce RTX 3060 12GB |
| Granite 3 2B | ~2B | 128K | ~1.6GB | ~2.4GB | ~4GB | NVIDIA GeForce RTX 3060 12GB |
| Granite 3 8B | ~8B | 128K | ~6GB | ~9GB | ~17GB | NVIDIA GeForce RTX 3060 12GB |
| SmolLM2 1.7B | ~1.7B | 8K | ~1.1GB | ~1.9GB | ~3.4GB | NVIDIA GeForce RTX 3060 12GB |
| Qwen2.5 7B Instruct | ~7.6B | 33K | ~4.9GB | ~8.4GB | ~15.2GB | NVIDIA GeForce RTX 3060 12GB |
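
The "smallest device that can run it" lookup can be sketched as a search over devices sorted by memory, using the Q4 figure as the budget. The `CATALOG` list below is a hypothetical four-device subset with nominal memory capacities, and `min_device` is an illustrative helper. The real catalog, and any headroom it reserves, may differ.

```python
# Hypothetical catalog subset: (device name, nominal memory in GB).
CATALOG = [
    ("NVIDIA GeForce RTX 3060 12GB", 12),
    ("Intel Arc A770 16GB", 16),
    ("NVIDIA GeForce RTX 3090", 24),
    ("NVIDIA RTX A6000", 48),
]


def min_device(required_gb, catalog=CATALOG):
    """Return the name of the smallest device whose memory covers required_gb.

    Returns None when nothing in the catalog is large enough.
    Assumes all nominal memory is usable, which is optimistic in practice.
    """
    for name, memory_gb in sorted(catalog, key=lambda item: item[1]):
        if memory_gb >= required_gb:
            return name
    return None


if __name__ == "__main__":
    # Q4 budgets taken from the table above.
    for model, q4_gb in [("Llama 3.1 8B", 6), ("Mistral Small 24B", 14),
                         ("Qwen2.5 32B", 20), ("Llama 3.1 70B", 42)]:
        print(model, "->", min_device(q4_gb))
```

With this subset, the four example models resolve to the same minimum devices the table lists for them. Models whose Q4 budget exceeds 48 GB return `None` here only because larger systems were left out of the sketch.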

Reasoning

Coding LLM

Embedding

Vision / Multimodal

Size a machine for your private AI Business OS

Run your own AI agents on hardware you control — private by design, no per-seat data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.
