NVIDIA RTX 4090: Specs & Local-AI Compatibility
The consumer flagship: 24 GB of VRAM and about 1 TB/s of memory bandwidth run quantized 32B-class models well.
Specs
- Memory: 24 GB
- Memory type: GDDR6X
- Bandwidth: 1,008 GB/s
- Approx. FP16: 165 TFLOPS
- Architecture: Ada Lovelace
- Process: TSMC 4N
- Power: 450 W
- Launch: 2022
Models this chip can run
Open models graded for a single NVIDIA RTX 4090, best fit first.
- Gemma 2 27B · Gemma · ~27B · 8K ctx · Gemma Terms of Use
  Fits at Q4_K_M (~17 GB) with ~4.1 GB headroom; about 1 concurrent instance.
  Q4_K_M · ~17 GB · Runs well
- Gemma 3 27B · Gemma 3 · ~27B · 128K ctx · Gemma Terms of Use
  Fits at Q4_K_M (~17 GB) with ~4.1 GB headroom; about 1 concurrent instance.
  Q4_K_M · ~17 GB · Runs well
- Mistral Small 24B · Mistral · ~24B · 32K ctx · Apache-2.0
  Fits at Q4_K_M (~14 GB) with ~7.1 GB headroom; about 1 concurrent instance.
  Q4_K_M · ~14 GB · Runs well
- DeepSeek-Coder V2 (class) · DeepSeek · ~16B · 128K ctx · DeepSeek License
  Fits at Q8_0 (~18 GB) with ~3.1 GB headroom; about 1 concurrent instance.
  Q8_0 · ~18 GB · Runs well
- StarCoder2 15B · StarCoder · ~15B · 16K ctx · BigCode OpenRAIL-M
  Fits at Q8_0 (~17 GB) with ~4.1 GB headroom; about 1 concurrent instance.
  Q8_0 · ~17 GB · Runs well
- Qwen2.5 14B · Qwen · ~14B · 128K ctx · Apache-2.0
  Fits at Q8_0 (~16 GB) with ~5.1 GB headroom; about 1 concurrent instance.
  Q8_0 · ~16 GB · Runs well
- Qwen3 14B · Qwen · ~14B · 128K ctx · Apache-2.0
  Fits at Q8_0 (~16 GB) with ~5.1 GB headroom; about 1 concurrent instance.
  Q8_0 · ~16 GB · Runs well
- Phi-3 Medium (14B) · Phi · ~14B · 128K ctx · MIT
  Fits at Q8_0 (~15 GB) with ~6.1 GB headroom; about 1 concurrent instance.
  Q8_0 · ~15 GB · Runs well
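The headroom figures in the list above appear to follow simple arithmetic: total VRAM minus the model size minus a fixed reserve of about 2.9 GB (for example, 24 - 17 - 4.1 = 2.9, and the same reserve falls out of every entry). The page does not say what that reserve covers or how the instance count is derived, so the sketch below is an assumption inferred from the listed numbers. The `RESERVE_GB` constant, the `fit` helper, and the floor-division rule for concurrent instances are all hypothetical.

```python
import math

VRAM_GB = 24.0    # from the spec list
RESERVE_GB = 2.9  # assumption: inferred from the listed headroom values


def fit(model_gb: float, vram_gb: float = VRAM_GB, reserve_gb: float = RESERVE_GB):
    """Return (headroom in GB, concurrent instances) for one quantized model.

    Headroom is what remains after the reserve and one copy of the model.
    The instance count is a guess: how many full copies fit in usable VRAM.
    """
    usable_gb = vram_gb - reserve_gb
    headroom_gb = usable_gb - model_gb
    instances = math.floor(usable_gb / model_gb) if headroom_gb >= 0 else 0
    return round(headroom_gb, 1), instances


# Sizes as listed above for a few of the models.
for name, size_gb in [
    ("Gemma 2 27B (Q4_K_M)", 17.0),
    ("Mistral Small 24B (Q4_K_M)", 14.0),
    ("DeepSeek-Coder V2 class (Q8_0)", 18.0),
]:
    headroom, instances = fit(size_gb)
    print(f"{name}: ~{headroom} GB headroom, about {instances} instance(s)")
```

Under these assumptions the helper reproduces the listed headroom for every entry, and each model fits exactly once because none is smaller than half of the usable 21.1 GB.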
Build a private AI Business OS on the NVIDIA RTX 4090
Run your own AI agents on hardware you control: private by design, with no data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.