DeepSeek · 7 sizes · Reasoning / Coding LLM

DeepSeek models: sizes & hardware to run them

The DeepSeek family spans 7 sizes from 1.5B to 671B parameters. Each size maps to a different hardware tier. The table below gives the approximate memory each needs at 4-bit quantization and the device we’d start with for a private local deployment.

Reasoning · Code · Long context

Sizes & hardware

Model	Params	Context	~VRAM @ 4-bit	Minimum device	Recommended
DeepSeek-R1 Distill 1.5B	1.5B	128K	~1.5GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)
DeepSeek-R1 Distill 8B	8B	128K	~6GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)
DeepSeek-R1 Distill 14B	14B	128K	~10GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)
DeepSeek-Coder V2 (class)	16B	128K	~11GB	Intel Arc A770 16GB	NVIDIA B200 (placeholder)
DeepSeek-R1 Distill 32B	32B	128K	~20GB	NVIDIA GeForce RTX 3090	NVIDIA B200 (placeholder)
DeepSeek-R1 Distill Llama 70B	70B	128K	~42GB	NVIDIA RTX A6000	NVIDIA B200 (placeholder)
DeepSeek-R1 671B (MoE)	671B (≈37B active)	128K	~400GB	Supermicro 8x H100 SuperServer	Supermicro 8x H100 SuperServer

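Memory figures are approximate working-set estimates (weights plus KV cache at modest context); treat them as ballpark values rather than exact requirements. Contexts approaching the 128K limit need considerably more memory for the KV cache. Device picks come from our compatibility engine, best on-prem fit first.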
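As a rough cross-check on the table, the 4-bit figures can be approximated with a simple rule of thumb. The sketch below assumes quantized weights take `bits / 8` bytes per parameter and adds a flat 20% for KV cache and runtime buffers. The function name and the 20% overhead are our own assumptions for illustration, not the output of the compatibility engine, and the rule tracks the larger sizes more closely than the smallest ones.

```python
def estimate_vram_gb(params_billion: float, bits: int = 4, overhead: float = 0.20) -> float:
    """Rough working-set estimate in GB: quantized weights plus a flat overhead share.

    The 20% overhead default is an assumed allowance for KV cache at modest
    context and runtime buffers; real usage varies by runtime and context length.
    """
    weights_gb = params_billion * bits / 8  # 1B parameters at 8 bits is about 1 GB
    return weights_gb * (1 + overhead)


# Compare a few sizes against the table's ~VRAM @ 4-bit column.
for name, params in [
    ("DeepSeek-R1 Distill 32B", 32),
    ("DeepSeek-R1 Distill Llama 70B", 70),
    ("DeepSeek-R1 671B (MoE)", 671),
]:
    print(f"{name}: ~{estimate_vram_gb(params):.1f} GB")
```

Note that for the MoE model the total parameter count (671B), not the active count (≈37B), drives the memory estimate, because every expert has to be resident.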

Open each size

Reasoning

DeepSeek-R1 Distill 1.5B

Runs on almost any hardware, including CPUs and mini PCs. A reasoning model you can put on the edge.

Reasoning

DeepSeek-R1 Distill 8B

Fits 8GB+ GPUs at 4-bit and runs on most entry-level hardware. A reasoning model for a single office box.

Reasoning

DeepSeek-R1 Distill 14B

Needs ~10GB at 4-bit, so it fits a 12GB card; 16GB+ leaves headroom for longer context. A mid-size reasoning model for analysis-heavy private agents.

Coding LLM

DeepSeek-Coder V2 (class)

The compact 16B variant fits 16GB+ cards at 4-bit; the full-size 236B MoE variant needs well over 100GB, which means multi-GPU or cloud.

Reasoning

DeepSeek-R1 Distill 32B

Needs a 24GB+ card (RTX 3090/4090) at 4-bit. The best locally runnable reasoning option for most teams.

Reasoning

DeepSeek-R1 Distill Llama 70B

Top of the distilled line: ~42GB at 4-bit needs a 48GB card, a 64GB+ unified-memory Mac, or multi-GPU.

Reasoning

DeepSeek-R1 671B (MoE)

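Datacenter, multi-node, or cloud. The full R1 is a mixture-of-experts model at frontier scale: only about 37B parameters are active per token, but all 671B must sit in memory (~400GB at 4-bit), so plan for an 8-GPU server or a cluster.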
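The per-size guidance above amounts to a lookup: given a device's memory, which sizes fit at 4-bit? A minimal sketch follows, using the ~VRAM figures from the table. The helper name `models_that_fit` and the 1.5GB default headroom are hypothetical choices for illustration; for multi-GPU boxes, pass the combined memory and expect some extra overhead from splitting the model.

```python
# Approximate VRAM at 4-bit in GB, copied from the "Sizes & hardware" table.
VRAM_4BIT_GB = {
    "DeepSeek-R1 Distill 1.5B": 1.5,
    "DeepSeek-R1 Distill 8B": 6,
    "DeepSeek-R1 Distill 14B": 10,
    "DeepSeek-Coder V2 (class)": 11,
    "DeepSeek-R1 Distill 32B": 20,
    "DeepSeek-R1 Distill Llama 70B": 42,
    "DeepSeek-R1 671B (MoE)": 400,
}


def models_that_fit(device_gb: float, headroom_gb: float = 1.5) -> list[str]:
    """Return the models whose 4-bit estimate plus headroom fits in device_gb.

    headroom_gb is an assumed safety margin for the display, drivers, and
    other processes sharing the device.
    """
    return [
        name
        for name, need_gb in VRAM_4BIT_GB.items()
        if need_gb + headroom_gb <= device_gb
    ]


# Example: a 24GB card such as an RTX 3090.
for name in models_that_fit(24):
    print(name)
```

Raising `headroom_gb` is the simplest way to budget for longer contexts, since the table's figures assume modest context.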

Run DeepSeek models inside a private AI Business OS

Run your own AI agents on hardware you control: private by design, with no data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.
