Google · 3 sizes · General LLM

Gemma 3 models: sizes & hardware to run them

The Gemma 3 family spans 3 sizes from 4B to 27B parameters. Each size maps to a different hardware tier. Below is the approximate memory each needs at 4-bit quantization and the device we’d start with for a private local deployment.

Vision · Multilingual · Long context

Sizes & hardware

Model	Params	Context	~VRAM @ 4-bit	Minimum device	Recommended
Gemma 3 4B	4B	128K	~3GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)
Gemma 3 12B	12B	128K	~8GB	NVIDIA GeForce RTX 3060 12GB	NVIDIA B200 (placeholder)
Gemma 3 27B	27B	128K	~17GB	NVIDIA GeForce RTX 3090	NVIDIA B200 (placeholder)

Memory figures are approximate working-set estimates (weights plus KV cache at modest context); treat them as rough guides, since actual usage rises with longer contexts and varies by runtime. Device picks come from our compatibility engine, best on-prem fit first.
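
For a feel of where numbers like these come from, here is a minimal back-of-the-envelope sketch. It is not our compatibility engine's formula: the `estimate_vram_gb` helper and its 25% overhead allowance (standing in for KV cache and runtime buffers at modest context) are assumptions chosen for illustration.

```python
import math


def estimate_vram_gb(params_billion: float, bits: int = 4, overhead: float = 0.25) -> int:
    """Rough working-set estimate in whole GB, rounded up.

    overhead is an assumed allowance for KV cache and runtime buffers,
    not a measured value.
    """
    # 1B parameters at 8 bits is roughly 1 GB, so 4-bit weights take about half that.
    weights_gb = params_billion * bits / 8
    return math.ceil(weights_gb * (1 + overhead))


for name, params in [("Gemma 3 4B", 4), ("Gemma 3 12B", 12), ("Gemma 3 27B", 27)]:
    print(f"{name}: ~{estimate_vram_gb(params)}GB")
# Gemma 3 4B: ~3GB
# Gemma 3 12B: ~8GB
# Gemma 3 27B: ~17GB
```

With this assumed overhead the sketch happens to land on the table's figures. Long contexts grow the KV cache well beyond a flat allowance, so budget extra memory if you plan to use much of the 128K window.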

Open each size

General LLM

Gemma 3 4B

8GB GPUs, a Mac mini, or a small mini PC at 4-bit. A current small generalist with long context and image input.

General LLM

Gemma 3 12B

16GB+ GPUs at 4-bit. A current mid-size generalist with long context and image input.

General LLM

Gemma 3 27B

A 24GB card (RTX 3090/4090) or 32GB+ Mac at 4-bit. The flagship Gemma 3 with long context and vision.
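
To check a machine you already own against the sizes above, a simple fit test is often enough. The sketch below copies the ~VRAM @ 4-bit figures from the Sizes & hardware table; the `models_that_fit` helper and its 2GB headroom (reserved for the display, drivers, and a longer context) are hypothetical choices for illustration, not part of our compatibility engine.

```python
# ~VRAM @ 4-bit figures copied from the Sizes & hardware table (GB).
VRAM_AT_4BIT_GB = {
    "Gemma 3 4B": 3,
    "Gemma 3 12B": 8,
    "Gemma 3 27B": 17,
}


def models_that_fit(device_vram_gb: float, headroom_gb: float = 2.0) -> list[str]:
    """Return the sizes whose estimated working set fits on the device.

    headroom_gb is an assumed safety margin, not a measured requirement.
    """
    usable = device_vram_gb - headroom_gb
    return [name for name, need in VRAM_AT_4BIT_GB.items() if need <= usable]


print(models_that_fit(12))  # e.g. a 12GB card
# ['Gemma 3 4B', 'Gemma 3 12B']
print(models_that_fit(24))  # e.g. a 24GB card
# ['Gemma 3 4B', 'Gemma 3 12B', 'Gemma 3 27B']
```

Raise the headroom if the GPU also drives a desktop or if you expect long prompts; lower it only for a dedicated, headless inference box.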

Run Gemma 3 models inside a private AI Business OS

Run your own AI agents on hardware you control — private by design, no per-seat data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.
