BrainOutput
Intel · Datacenter accelerator · Provisional

Intel Gaudi 3 128GB: Specs & Local-AI Compatibility

A 128GB AI accelerator for training and inference. Verify that it fits your software ecosystem before committing.

Some details here are provisional (placeholder). Treat specs as approximate and verify against the manufacturer before relying on them or purchasing.

Specs

Memory: 128 GB
Memory type: HBM2e
Bandwidth: 3,700 GB/s
Approx. FP16: to verify
Architecture: Gaudi 3
Process: TSMC 5nm
Power: 900 W
Launch: 2024

Models this chip can run

Open models graded for a single Intel Gaudi 3 128GB, best fit first.

  • Qwen2.5 72B
    Qwen · ~72B · 128K ctx · Qwen License

    Fits at Q8_0 (~78GB) with ~34.6GB headroom — about 1 concurrent instance.

    Q8_0 · ~78GB · Runs well
  • Llama 3.1 70B
    Llama · ~70B · 128K ctx · Llama Community License

    Fits at Q8_0 (~75GB) with ~37.6GB headroom — about 1 concurrent instance.

    Q8_0 · ~75GB · Runs well
  • Llama 3.3 70B
    Llama · ~70B · 128K ctx · Llama Community License

    Fits at Q8_0 (~75GB) with ~37.6GB headroom — about 1 concurrent instance.

    Q8_0 · ~75GB · Runs well
  • DeepSeek-R1 Distill Llama 70B
    DeepSeek · ~70B · 128K ctx · MIT

    Fits at Q8_0 (~75GB) with ~37.6GB headroom — about 1 concurrent instance.

    Q8_0 · ~75GB · Runs well
  • Mixtral 8x7B (MoE)
    Mistral · ~47B · 32K ctx · Apache-2.0

    Fits at FP16 (~90GB) with ~22.6GB headroom — about 1 concurrent instance.

    FP16 · ~90GB · Runs well
  • CodeLlama 34B
    CodeLlama · ~34B · 16K ctx · Llama Community License

    Fits at FP16 (~68GB) with ~44.6GB headroom — about 1 concurrent instance.

    FP16 · ~68GB · Runs well
  • Qwen2.5 32B
    Qwen · ~32B · 128K ctx · Apache-2.0

    Fits at FP16 (~64GB) with ~48.6GB headroom — about 1 concurrent instance.

    FP16 · ~64GB · Runs well
  • Qwen3 32B
    Qwen · ~32B · 128K ctx · Apache-2.0

    Fits at FP16 (~64GB) with ~48.6GB headroom — about 1 concurrent instance.

    FP16 · ~64GB · Runs well
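
The headroom figures in the list above do not equal 128GB minus the model footprint. They are all consistent with a usable budget of about 112.6GB, that is, 88% of the 128GB capacity, with the remainder presumably reserved for runtime overhead. The 88% factor is an inference from the listed numbers, not a documented rule, and the instance count below (usable budget divided by footprint, rounded down) is likewise an assumption. A minimal Python sketch of that arithmetic:

```python
import math

TOTAL_GB = 128.0          # memory capacity from the Specs section
USABLE_FRACTION = 0.88    # ASSUMPTION: inferred from the listed headroom values


def usable_budget_gb(total_gb=TOTAL_GB, usable_fraction=USABLE_FRACTION):
    """Memory assumed to be available for model weights."""
    return total_gb * usable_fraction


def headroom_gb(footprint_gb):
    """Usable budget left over after loading one copy of the model."""
    return usable_budget_gb() - footprint_gb


def concurrent_instances(footprint_gb):
    """ASSUMPTION: whole copies of the model that fit in the usable budget."""
    return math.floor(usable_budget_gb() / footprint_gb)


# Footprints as listed on this page (GB).
LISTED_FOOTPRINTS_GB = {
    "Qwen2.5 72B (Q8_0)": 78,
    "Llama 3.1 70B (Q8_0)": 75,
    "Mixtral 8x7B (FP16)": 90,
    "CodeLlama 34B (FP16)": 68,
    "Qwen2.5 32B (FP16)": 64,
}

if __name__ == "__main__":
    for name, gb in LISTED_FOOTPRINTS_GB.items():
        print(f"{name}: ~{headroom_gb(gb):.1f}GB headroom, "
              f"about {concurrent_instances(gb)} concurrent instance(s)")
```

Headroom is what remains for the KV cache and activations, so long contexts or large batches can consume it quickly. Treat the result as a rough fit check rather than a guarantee.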

Build a private AI Business OS on Intel Gaudi 3 128GB

Run your own AI agents on hardware you control — private by design, no per-seat data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.
