Microsoft · 3 sizes · General LLM

Phi models: sizes & hardware to run them

The Phi family covered here spans three models from 3.8B to 14B parameters. Each maps to a hardware tier: the table below gives the approximate memory each needs at 4-bit quantization and the device we’d start with for a private local deployment.

Reasoning · Code · Multilingual · Long context

Sizes & hardware

Model	Params	Context	~VRAM @ 4-bit	Minimum device	Recommended
Phi-3.5 Mini (3.8B)	3.8B	128K	~2.5GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)
Phi-3 Medium (14B)	14B	128K	~9GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)
Phi-4 (14B)	14B	16K	~9GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)

Memory figures are approximate working-set estimates (weights plus KV cache at a modest context length); treat them as rough guides rather than exact requirements. Device picks come from our compatibility engine, listed with the best on-premises fit first.
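
As a rough illustration of how a working-set estimate like the ones above can be put together, here is a minimal Python sketch. It is not the compatibility engine's actual method. The function names, the 10% headroom factor, and the architecture numbers in the example (layer count, KV heads, head dimension) are assumptions chosen for illustration; read the real values from the model's published configuration before relying on the result.

```python
def weight_gb(params_billion: float, bits: int = 4) -> float:
    """Memory for the quantized weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits / 8


def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache size in GB.

    Each token stores one key and one value vector (hence the factor 2)
    per layer and per KV head, at bytes_per_value bytes per element
    (2 bytes assumes a 16-bit cache).
    """
    total_bytes = (2 * layers * kv_heads * head_dim
                   * context_tokens * bytes_per_value)
    return total_bytes / 1e9


def fits(required_gb: float, device_gb: float, headroom: float = 0.9) -> bool:
    """True if the estimate fits while leaving some memory free.

    The 0.9 headroom factor is an assumed safety margin for runtime
    buffers and the display; it is not a measured value.
    """
    return required_gb <= device_gb * headroom


if __name__ == "__main__":
    # Illustrative 14B-class model; the architecture numbers are assumptions.
    weights = weight_gb(14, bits=4)
    cache = kv_cache_gb(layers=40, kv_heads=10, head_dim=128,
                        context_tokens=4096)
    total = weights + cache
    print(f"weights ~{weights:.1f} GB, KV cache ~{cache:.2f} GB, "
          f"total ~{total:.1f} GB")
    print("fits a 12 GB GPU:", fits(total, 12))
```

The weights term scales with parameter count and bit width, while the KV cache term scales linearly with context length, which is why long-context use can need far more memory than the table's modest-context figures suggest.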

Open each size

Phi-3.5 Mini (3.8B)

Runs on 8GB GPUs, a Mac mini, or even a strong CPU. A small reasoning-leaning model with a permissive MIT license.

Phi-3 Medium (14B)

~9GB at 4-bit, so a 12GB GPU is the floor; comfortable on a 16GB GPU or Apple silicon. The MIT license is attractive for commercial use.

Phi-4 (14B)

Runs on a 16GB GPU or Apple silicon at 4-bit. A current small model with strong reasoning and an MIT license.

Run Phi models inside a private AI Business OS

Run your own AI agents on hardware you control: private by design, with no data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.
