Compatible devices for CodeLlama 13B

Every hardware profile in our catalog graded for CodeLlama 13B, best fit first. For sellable vendor configurations, see the device catalog.

Supermicro 8x H100 SuperServer
Supermicro · AI Servers
Fits at FP16 (~26GB) with ~537.2GB headroom — about 21 concurrent instances.
FP16 · ~26GB · Runs well
Dell PowerEdge XE9680
Dell · AI Servers
Fits at FP16 (~26GB) with ~537.2GB headroom — about 21 concurrent instances.
FP16 · ~26GB · Runs well
AMD Instinct MI300X
AMD · Datacenter GPUs
Fits at FP16 (~26GB) with ~143GB headroom — about 6 concurrent instances.
FP16 · ~26GB · Runs well
NVIDIA H200 (141GB)
NVIDIA · Datacenter GPUs
Fits at FP16 (~26GB) with ~98.1GB headroom — about 4 concurrent instances.
FP16 · ~26GB · Runs well
Cloud H200 141GB (profile)
Cloud · Cloud GPU Profiles
Fits at FP16 (~26GB) with ~98.1GB headroom — about 4 concurrent instances.
FP16 · ~26GB · Runs well
NVIDIA H100 (80GB)
NVIDIA · Datacenter GPUs
Fits at FP16 (~26GB) with ~44.4GB headroom — about 2 concurrent instances.
FP16 · ~26GB · Runs well
Cloud H100 80GB (profile)
Cloud · Cloud GPU Profiles
Fits at FP16 (~26GB) with ~44.4GB headroom — about 2 concurrent instances.
FP16 · ~26GB · Runs well
HP Z8 Fury G5 Workstation
HP · AI Workstations
Fits at FP16 (~26GB) with ~58.5GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
Lenovo ThinkStation PX Workstation
Lenovo · AI Workstations
Fits at FP16 (~26GB) with ~58.5GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
Supermicro AI Workstation
Supermicro · AI Workstations
Fits at FP16 (~26GB) with ~58.5GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
Apple Mac Studio (M2 Ultra)
Apple · Apple Silicon
Fits at FP16 (~26GB) with ~108.4GB headroom — about 5 concurrent instances.
FP16 · ~26GB · Runs well
Quad RTX 4090 AI Workstation (reference profile)
Reference · AI Workstations
Fits at FP16 (~26GB) with ~58.5GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
Dell Precision 7960 AI Workstation
Dell · AI Workstations
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
NVIDIA A100 80GB
NVIDIA · Datacenter GPUs
Fits at FP16 (~26GB) with ~44.4GB headroom — about 2 concurrent instances.
FP16 · ~26GB · Runs well
Cloud A100 80GB (profile)
Cloud · Cloud GPU Profiles
Fits at FP16 (~26GB) with ~44.4GB headroom — about 2 concurrent instances.
FP16 · ~26GB · Runs well
Apple Mac Studio (M4 Max)
Apple · Apple Silicon
Fits at FP16 (~26GB) with ~63.6GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
NVIDIA DGX Spark (GB10)
NVIDIA · AI Appliances
Fits at FP16 (~26GB) with ~63.6GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
ASUS Ascent GX10 (GB10)
ASUS · AI Appliances
Fits at FP16 (~26GB) with ~63.6GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
Dell Pro Max with GB10
Dell · AI Appliances
Fits at FP16 (~26GB) with ~63.6GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
AMD Ryzen AI Max Mini PC (Strix Halo class)
AMD · Mini PCs
Fits at FP16 (~26GB) with ~63.6GB headroom — about 3 concurrent instances.
FP16 · ~26GB · Runs well
Coding Agent Workstation (reference profile)
Reference · AI Workstations
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
NVIDIA L40S
NVIDIA · Datacenter GPUs
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
Cloud L40S 48GB (profile)
Cloud · Cloud GPU Profiles
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
Apple Mac mini (M4 Pro)
Apple · Apple Silicon
Fits at FP16 (~26GB) with ~18.8GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
Law Firm Private AI Box (reference profile)
Reference · AI Appliances
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
NVIDIA RTX 6000 Ada Generation
NVIDIA · Professional GPUs
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
AMD Radeon PRO W7900
AMD · Professional GPUs
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
NVIDIA RTX A6000
NVIDIA · Professional GPUs
Fits at FP16 (~26GB) with ~16.2GB headroom — about 1 concurrent instance.
FP16 · ~26GB · Runs well
Accounting / Odoo AI Box (reference profile)
Reference · AI Appliances
Fits at Q8_0 (~14GB) with ~7.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
Small Business Mini PC (reference profile)
Reference · Mini PCs
Fits at Q8_0 (~14GB) with ~8.4GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
NVIDIA GeForce RTX 4090
NVIDIA · Consumer GPUs
Fits at Q8_0 (~14GB) with ~7.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
Apple Mac mini (M4)
Apple · Apple Silicon
Fits at Q8_0 (~14GB) with ~8.4GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
AMD Radeon RX 7900 XTX
AMD · Consumer GPUs
Fits at Q8_0 (~14GB) with ~7.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
NVIDIA GeForce RTX 3090
NVIDIA · Consumer GPUs
Fits at Q8_0 (~14GB) with ~7.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
Dual RTX 3060 Local Server (reference profile)
Reference · AI Servers
Fits at Q8_0 (~14GB) with ~7.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
Local Office AI Appliance (reference profile)
Reference · AI Appliances
Fits at Q8_0 (~14GB) with ~0.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
Hotel AI Automation Box (reference profile)
Reference · AI Appliances
Fits at Q8_0 (~14GB) with ~0.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
Intel Arc A770 16GB
Intel · Consumer GPUs
Fits at Q8_0 (~14GB) with ~0.1GB headroom — about 1 concurrent instance.
Q8_0 · ~14GB · Runs well
Intel Arc B580 12GB
Intel · Consumer GPUs
Fits at Q4_K_M (~8GB) with ~2.6GB headroom — about 1 concurrent instance.
Q4_K_M · ~8GB · Runs well
NVIDIA GeForce RTX 3060 12GB
NVIDIA · Consumer GPUs
Fits at Q4_K_M (~8GB) with ~2.6GB headroom — about 1 concurrent instance.
Q4_K_M · ~8GB · Runs well
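
The fit, headroom, and instance figures in the listings above can be reproduced with a little arithmetic. The sketch below is an illustration, not the catalog's actual grading code. It assumes that usable memory equals the listed headroom plus the model footprint, that the highest-precision format that fits is preferred, and that the instance count is usable memory divided by footprint, rounded down. The page does not state these rules, although they are consistent with every entry listed here. The function name `size_for` is hypothetical, and the estimate leaves out KV cache and runtime overhead, which grow with context length and batch size.

```python
import math

# Approximate CodeLlama 13B footprints in GB, as listed on this page.
# Ordered from highest to lowest precision.
FOOTPRINT_GB = {"FP16": 26.0, "Q8_0": 14.0, "Q4_K_M": 8.0}


def size_for(usable_gb):
    """Return (format, headroom_gb, instances) for a device, or None if nothing fits.

    usable_gb is the memory assumed to be available to the model
    (taken here as listed headroom + footprint; an assumption, not a catalog field).
    """
    # Dicts preserve insertion order, so FP16 is tried first.
    for fmt, need_gb in FOOTPRINT_GB.items():
        if usable_gb >= need_gb:
            headroom = round(usable_gb - need_gb, 1)
            # Assumed rule: whole copies of the weights that fit in usable memory.
            instances = math.floor(usable_gb / need_gb)
            return fmt, headroom, instances
    return None


# Usable-memory values back-computed from the listings above.
for name, usable in [
    ("8x H100 server", 563.2),
    ("H200 (141GB)", 124.1),
    ("RTX 4090", 21.1),
    ("RTX 3060 12GB", 10.6),
]:
    print(name, size_for(usable))
```

Treat the instance count as an upper bound on weight copies rather than a throughput estimate: a device listed with ~0.1GB of headroom has almost no room left for context.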

Run CodeLlama 13B privately

Run your own AI agents on hardware you control — private by design, no per-seat data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.
