BrainOutput

Compatible models for Gigabyte MI300X Server

Open models graded for the Gigabyte MI300X Server (top config: 1024GB, ~1536GB AI memory), best fit first. Lower configurations run fewer of these.

  • DeepSeek-R1 671B (MoE)
    DeepSeek · ~671B · 128K ctx · MIT

    Fits at FP16 (~1340GB) with ~11.7GB headroom — about 1 concurrent instance.

    FP16 · ~1340GB · Runs well
  • Llama 3.1 405B
    Llama · ~405B · 128K ctx · Llama Community License

    Fits at FP16 (~810GB) with ~541.7GB headroom — about 1 concurrent instance.

    FP16 · ~810GB · Runs well
  • Qwen3 235B-A22B (MoE)
    Qwen · ~235B · 128K ctx · Apache-2.0

    Fits at FP16 (~470GB) with ~881.7GB headroom — about 2 concurrent instances.

    FP16 · ~470GB · Runs well
  • Qwen2.5 72B
    Qwen · ~72B · 128K ctx · Qwen License

    Fits at FP16 (~145GB) with ~1206.7GB headroom — about 9 concurrent instances.

    FP16 · ~145GB · Runs well
  • Llama 3.1 70B
    Llama · ~70B · 128K ctx · Llama Community License

    Fits at FP16 (~140GB) with ~1211.7GB headroom — about 9 concurrent instances.

    FP16 · ~140GB · Runs well
  • Llama 3.3 70B
    Llama · ~70B · 128K ctx · Llama Community License

    Fits at FP16 (~140GB) with ~1211.7GB headroom — about 9 concurrent instances.

    FP16 · ~140GB · Runs well
  • DeepSeek-R1 Distill Llama 70B
    DeepSeek · ~70B · 128K ctx · MIT

    Fits at FP16 (~140GB) with ~1211.7GB headroom — about 9 concurrent instances.

    FP16 · ~140GB · Runs well
  • Mixtral 8x7B (MoE)
    Mistral · ~47B · 32K ctx · Apache-2.0

    Fits at FP16 (~90GB) with ~1261.7GB headroom — about 15 concurrent instances.

    FP16 · ~90GB · Runs well
  • CodeLlama 34B
    CodeLlama · ~34B · 16K ctx · Llama Community License

    Fits at FP16 (~68GB) with ~1283.7GB headroom — about 19 concurrent instances.

    FP16 · ~68GB · Runs well
  • Qwen2.5 32B
    Qwen · ~32B · 128K ctx · Apache-2.0

    Fits at FP16 (~64GB) with ~1287.7GB headroom — about 21 concurrent instances.

    FP16 · ~64GB · Runs well
  • Qwen3 32B
    Qwen · ~32B · 128K ctx · Apache-2.0

    Fits at FP16 (~64GB) with ~1287.7GB headroom — about 21 concurrent instances.

    FP16 · ~64GB · Runs well
  • DeepSeek-R1 Distill 32B
    DeepSeek · ~32B · 128K ctx · MIT

    Fits at FP16 (~64GB) with ~1287.7GB headroom — about 21 concurrent instances.

    FP16 · ~64GB · Runs well
  • Qwen2.5-Coder 32B
    Qwen · ~32B · 128K ctx · Apache-2.0

    Fits at FP16 (~64GB) with ~1287.7GB headroom — about 21 concurrent instances.

    FP16 · ~64GB · Runs well
  • Gemma 2 27B
    Gemma · ~27B · 8K ctx · Gemma Terms of Use

    Fits at FP16 (~54GB) with ~1297.7GB headroom — about 25 concurrent instances.

    FP16 · ~54GB · Runs well
  • Gemma 3 27B
    Gemma 3 · ~27B · 128K ctx · Gemma Terms of Use

    Fits at FP16 (~54GB) with ~1297.7GB headroom — about 25 concurrent instances.

    FP16 · ~54GB · Runs well
  • Mistral Small 24B
    Mistral · ~24B · 32K ctx · Apache-2.0

    Fits at FP16 (~48GB) with ~1303.7GB headroom — about 28 concurrent instances.

    FP16 · ~48GB · Runs well
  • DeepSeek-Coder V2 (class)
    DeepSeek · ~16B · 128K ctx · DeepSeek License

    Fits at FP16 (~33GB) with ~1318.7GB headroom — about 40 concurrent instances.

    FP16 · ~33GB · Runs well
  • StarCoder2 15B
    StarCoder · ~15B · 16K ctx · BigCode OpenRAIL-M

    Fits at FP16 (~30GB) with ~1321.7GB headroom — about 45 concurrent instances.

    FP16 · ~30GB · Runs well
  • Qwen2.5 14B
    Qwen · ~14B · 128K ctx · Apache-2.0

    Fits at FP16 (~30GB) with ~1321.7GB headroom — about 45 concurrent instances.

    FP16 · ~30GB · Runs well
  • Qwen3 14B
    Qwen · ~14B · 128K ctx · Apache-2.0

    Fits at FP16 (~30GB) with ~1321.7GB headroom — about 45 concurrent instances.

    FP16 · ~30GB · Runs well
  • Phi-3 Medium (14B)
    Phi · ~14B · 128K ctx · MIT

    Fits at FP16 (~28GB) with ~1323.7GB headroom — about 48 concurrent instances.

    FP16 · ~28GB · Runs well
  • Phi-4 (14B)
    Phi · ~14B · 16K ctx · MIT

    Fits at FP16 (~28GB) with ~1323.7GB headroom — about 48 concurrent instances.

    FP16 · ~28GB · Runs well
  • DeepSeek-R1 Distill 14B
    DeepSeek · ~14B · 128K ctx · MIT

    Fits at FP16 (~30GB) with ~1321.7GB headroom — about 45 concurrent instances.

    FP16 · ~30GB · Runs well
  • Qwen2.5-Coder 14B
    Qwen · ~14B · 128K ctx · Apache-2.0

    Fits at FP16 (~30GB) with ~1321.7GB headroom — about 45 concurrent instances.

    FP16 · ~30GB · Runs well
  • CodeLlama 13B
    CodeLlama · ~13B · 16K ctx · Llama Community License

    Fits at FP16 (~26GB) with ~1325.7GB headroom — about 51 concurrent instances.

    FP16 · ~26GB · Runs well
  • Gemma 3 12B
    Gemma 3 · ~12B · 128K ctx · Gemma Terms of Use

    Fits at FP16 (~24GB) with ~1327.7GB headroom — about 56 concurrent instances.

    FP16 · ~24GB · Runs well
  • Mistral Nemo 12B
    Mistral · ~12B · 128K ctx · Apache-2.0

    Fits at FP16 (~24GB) with ~1327.7GB headroom — about 56 concurrent instances.

    FP16 · ~24GB · Runs well
  • Gemma 2 9B
    Gemma · ~9B · 8K ctx · Gemma Terms of Use

    Fits at FP16 (~19GB) with ~1332.7GB headroom — about 71 concurrent instances.

    FP16 · ~19GB · Runs well
  • Llama 3.1 8B
    Llama · ~8B · 128K ctx · Llama Community License

    Fits at FP16 (~17GB) with ~1334.7GB headroom — about 79 concurrent instances.

    FP16 · ~17GB · Runs well
  • Qwen3 8B
    Qwen · ~8B · 128K ctx · Apache-2.0

    Fits at FP16 (~17GB) with ~1334.7GB headroom — about 79 concurrent instances.

    FP16 · ~17GB · Runs well
  • Granite 3 8B
    Granite · ~8B · 128K ctx · Apache-2.0

    Fits at FP16 (~17GB) with ~1334.7GB headroom — about 79 concurrent instances.

    FP16 · ~17GB · Runs well
  • DeepSeek-R1 Distill 8B
    DeepSeek · ~8B · 128K ctx · MIT

    Fits at FP16 (~17GB) with ~1334.7GB headroom — about 79 concurrent instances.

    FP16 · ~17GB · Runs well
  • Qwen2.5 7B Instruct
    Qwen2.5 · ~7.6B · 33K ctx · Apache-2.0

    Fits at FP16 (~15.2GB) with ~1336.5GB headroom — about 88 concurrent instances.

    FP16 · ~15.2GB · Runs well
  • Qwen2.5 Coder 7B Instruct
    Qwen2.5 · ~7.6B · 131K ctx · Apache-2.0

    Fits at FP16 (~15.2GB) with ~1336.5GB headroom — about 88 concurrent instances.

    FP16 · ~15.2GB · Runs well
  • Qwen2.5 7B
    Qwen · ~7B · 128K ctx · Apache-2.0

    Fits at FP16 (~15GB) with ~1336.7GB headroom — about 90 concurrent instances.

    FP16 · ~15GB · Runs well
  • Mistral 7B
    Mistral · ~7B · 32K ctx · Apache-2.0

    Fits at FP16 (~15GB) with ~1336.7GB headroom — about 90 concurrent instances.

    FP16 · ~15GB · Runs well
  • Qwen2.5-Coder 7B
    Qwen · ~7B · 128K ctx · Apache-2.0

    Fits at FP16 (~15GB) with ~1336.7GB headroom — about 90 concurrent instances.

    FP16 · ~15GB · Runs well
  • CodeLlama 7B
    CodeLlama · ~7B · 16K ctx · Llama Community License

    Fits at FP16 (~14GB) with ~1337.7GB headroom — about 96 concurrent instances.

    FP16 · ~14GB · Runs well
  • StarCoder2 7B
    StarCoder · ~7B · 16K ctx · BigCode OpenRAIL-M

    Fits at FP16 (~14GB) with ~1337.7GB headroom — about 96 concurrent instances.

    FP16 · ~14GB · Runs well
  • Gemma 3 4B
    Gemma 3 · ~4B · 128K ctx · Gemma Terms of Use

    Fits at FP16 (~8GB) with ~1343.7GB headroom — about 168 concurrent instances.

    FP16 · ~8GB · Runs well
  • Phi-3.5 Mini (3.8B)
    Phi · ~3.8B · 128K ctx · MIT

    Fits at FP16 (~8GB) with ~1343.7GB headroom — about 168 concurrent instances.

    FP16 · ~8GB · Runs well
  • Llama 3.2 3B
    Llama · ~3B · 128K ctx · Llama Community License

    Fits at FP16 (~7GB) with ~1344.7GB headroom — about 193 concurrent instances.

    FP16 · ~7GB · Runs well
  • Qwen2.5 3B
    Qwen · ~3B · 32K ctx · Qwen Research License

    Fits at FP16 (~6GB) with ~1345.7GB headroom — about 225 concurrent instances.

    FP16 · ~6GB · Runs well
  • StarCoder2 3B
    StarCoder · ~3B · 16K ctx · BigCode OpenRAIL-M

    Fits at FP16 (~6GB) with ~1345.7GB headroom — about 225 concurrent instances.

    FP16 · ~6GB · Runs well
  • Gemma 2 2B
    Gemma · ~2B · 8K ctx · Gemma Terms of Use

    Fits at FP16 (~4GB) with ~1347.7GB headroom — about 337 concurrent instances.

    FP16 · ~4GB · Runs well
  • Granite 3 2B
    Granite · ~2B · 128K ctx · Apache-2.0

    Fits at FP16 (~4GB) with ~1347.7GB headroom — about 337 concurrent instances.

    FP16 · ~4GB · Runs well
  • SmolLM2 1.7B
    SmolLM · ~1.7B · 8K ctx · Apache-2.0

    Fits at FP16 (~3.4GB) with ~1348.3GB headroom — about 397 concurrent instances.

    FP16 · ~3.4GB · Runs well
  • Qwen2.5 1.5B
    Qwen · ~1.5B · 32K ctx · Apache-2.0

    Fits at FP16 (~3GB) with ~1348.7GB headroom — about 450 concurrent instances.

    FP16 · ~3GB · Runs well
  • DeepSeek-R1 Distill 1.5B
    DeepSeek · ~1.5B · 128K ctx · MIT

    Fits at FP16 (~4GB) with ~1347.7GB headroom — about 337 concurrent instances.

    FP16 · ~4GB · Runs well
  • Qwen2.5-Coder 1.5B
    Qwen · ~1.5B · 32K ctx · Apache-2.0

    Fits at FP16 (~3GB) with ~1348.7GB headroom — about 450 concurrent instances.

    FP16 · ~3GB · Runs well
  • Llama 3.2 1B
    Llama · ~1B · 128K ctx · Llama Community License

    Fits at FP16 (~3GB) with ~1348.7GB headroom — about 450 concurrent instances.

    FP16 · ~3GB · Runs well
  • Qwen2.5 0.5B
    Qwen · ~0.5B · 32K ctx · Apache-2.0

    Fits at FP16 (~1GB) with ~1350.7GB headroom — about 1351 concurrent instances.

    FP16 · ~1GB · Runs well
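
The headroom and instance figures in the list above appear to follow simple arithmetic: headroom is a fixed usable memory budget minus the model's FP16 footprint, and the instance count is that budget divided by the footprint, rounded down. The Python sketch below reproduces the listed numbers under two assumptions that the page itself does not state: a usable budget of about 1351.7GB (inferred from the first entry, 1340GB + 11.7GB) and a weights-only FP16 estimate of 2 bytes per parameter. The names `USABLE_GB`, `fp16_weights_gb`, and `fit` are hypothetical, chosen only for illustration.

```python
import math

# Assumption: usable budget in GB, inferred from the listing (1340 + 11.7).
# This is not an official figure for the server.
USABLE_GB = 1351.7


def fp16_weights_gb(params_billion: float) -> float:
    """Weights-only FP16 estimate: 2 bytes per parameter, so GB = 2 x billions."""
    return params_billion * 2.0


def fit(model_gb: float, usable_gb: float = USABLE_GB) -> tuple[float, int]:
    """Return (headroom in GB, whole concurrent instances) for one footprint."""
    headroom = round(usable_gb - model_gb, 1)
    instances = math.floor(usable_gb / model_gb)
    return headroom, instances


if __name__ == "__main__":
    # Footprints taken from the list above.
    for name, gb in [
        ("DeepSeek-R1 671B", 1340),
        ("Llama 3.1 405B", 810),
        ("Qwen3 235B-A22B", 470),
        ("Qwen2.5 14B", 30),
    ]:
        headroom, n = fit(gb)
        print(f"{name}: ~{headroom}GB headroom, about {n} instance(s)")
```

Some listed footprints sit slightly above the 2-bytes-per-parameter estimate (for example, ~30GB for a ~14B model), so `fit` takes the listed footprint as input rather than deriving it. The instance count here is a memory-only bound; it ignores KV cache, activations, and how a model is sharded across accelerators, all of which lower the practical number.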


Run these models on the Gigabyte MI300X Server

Run your own AI agents on hardware you control — private by design, no per-seat data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.