Local LLM Library & Hardware Requirements
67 models you can search and filter by type, capability and size. Memory figures are approximate working-set estimates at the stated quantization; they map model sizes to hardware tiers and are not exact benchmarks. Open each model for compatible devices and a recommended build.
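As a rough illustration of how a working-set estimate can be derived from a parameter count, here is a minimal sketch. It is not the formula behind the figures on this page: the function name `estimate_memory_gb`, the 4.5 bits per weight and the 10% overhead are all assumptions chosen for illustration. For MoE models the total parameter count is used, since all experts typically have to be resident in memory even though only some are active per token.

```python
def estimate_memory_gb(total_params_b: float,
                       bits_per_weight: float = 4.5,
                       overhead: float = 0.10) -> float:
    """Rough working-set estimate in GB for a quantized model.

    total_params_b  -- total parameters in billions (for MoE, the total
                       count, not the active count).
    bits_per_weight -- assumed average bits per weight for a ~Q4 quantization.
    overhead        -- assumed fractional allowance for runtime buffers and
                       a modest context; long contexts need more.
    """
    # Billions of parameters times bytes per parameter gives gigabytes.
    weights_gb = total_params_b * bits_per_weight / 8
    return weights_gb * (1 + overhead)


for name, params_b in [("Llama 3.1 70B", 70), ("Qwen2.5 32B", 32), ("Llama 3.1 8B", 8)]:
    print(f"{name}: ~{estimate_memory_gb(params_b):.0f} GB")
```

Under these assumptions the sketch gives roughly 43, 20 and 5 GB, which is in the same range as the 42, 20 and 6 GB listed for those models. The gap at the small end suggests that fixed overheads matter more for small models than a flat percentage captures.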
DeepSeek-R1 671B (MoE)
Reasoning · DeepSeek · DeepSeek · ~671B (37B active) · 128K ctx · Reasoning, Code, Long context · ~Q4 memory: 400GB · frontier reasoning, MoE efficiency, MIT license

Llama 3.1 405B
General LLM · Llama · Meta · ~405B · 128K ctx · Tools, Reasoning, Multilingual, Long context · ~Q4 memory: 230GB · frontier quality, complex reasoning

Qwen3 235B-A22B (MoE)
General LLM · Qwen · Alibaba · ~235B (22B active) · 128K ctx · Tools, Reasoning, Code, Multilingual, Long context · ~Q4 memory: 130GB · MoE efficiency, frontier-class quality, reasoning

Qwen2.5 72B
General LLM · Qwen · Alibaba · ~72B · 128K ctx · Tools, Code, Reasoning, Multilingual, Long context · ~Q4 memory: 44GB · coding, reasoning, multilingual

Llama 3.1 70B
General LLM · Llama · Meta · ~70B · 128K ctx · Tools, Reasoning, Multilingual, Long context · ~Q4 memory: 42GB · high quality, reasoning, tool use

Llama 3.3 70B
General LLM · Llama · Meta · ~70B · 128K ctx · Tools, Reasoning, Multilingual, Long context · ~Q4 memory: 42GB · high quality, reasoning, agents

DeepSeek-R1 Distill Llama 70B
Reasoning · DeepSeek · DeepSeek · ~70B · 128K ctx · Reasoning, Long context · ~Q4 memory: 42GB · strong reasoning, agents, MIT license

Mixtral 8x7B (MoE)
General LLM · Mistral · Mistral AI · ~47B (13B active) · 32K ctx · Tools, Multilingual · ~Q4 memory: 28GB · throughput, MoE efficiency, general assistant

CodeLlama 34B
Coding LLM · CodeLlama · Meta · ~34B · 16K ctx · Code · ~Q4 memory: 21GB · code, repo understanding, refactoring

Qwen2.5 32B
General LLM · Qwen · Alibaba · ~32B · 128K ctx · Tools, Code, Reasoning, Multilingual, Long context · ~Q4 memory: 20GB · coding, reasoning, agents

Qwen3 32B
General LLM · Qwen · Alibaba · ~32B · 128K ctx · Tools, Reasoning, Code, Multilingual, Long context · ~Q4 memory: 20GB · thinking mode, reasoning, coding

DeepSeek-R1 Distill 32B
Reasoning · DeepSeek · DeepSeek · ~32B · 128K ctx · Reasoning, Long context · ~Q4 memory: 20GB · strong reasoning, agents, MIT license

Qwen2.5-Coder 32B
Coding LLM · Qwen · Alibaba · ~32B · 128K ctx · Code, Tools, Long context · ~Q4 memory: 20GB · coding, agents, repo understanding

Gemma 2 27B
General LLM · Gemma · Google · ~27B · 8K ctx · Multilingual · ~Q4 memory: 17GB · general assistant, quality responses

Gemma 3 27B
General LLM · Gemma 3 · Google · ~27B · 128K ctx · Vision, Multilingual, Long context · ~Q4 memory: 17GB · quality responses, multilingual, vision

Mistral Small 24B
General LLM · Mistral · Mistral AI · ~24B · 32K ctx · Tools, Code, Multilingual, Long context · ~Q4 memory: 14GB · balanced quality, permissive license, agents

DeepSeek-Coder V2 (class)
Coding LLM · DeepSeek · DeepSeek · ~16B · 128K ctx · Code, Long context · ~Q4 memory: 11GB · coding, fill-in-the-middle, repo-scale context

StarCoder2 15B
Coding LLM · StarCoder · BigCode · ~15B · 16K ctx · Code · ~Q4 memory: 10GB · code, repo understanding, fill-in-the-middle

Qwen2.5 14B
General LLM · Qwen · Alibaba · ~14B · 128K ctx · Tools, Code, Reasoning, Multilingual, Long context · ~Q4 memory: 10GB · coding, reasoning, multilingual

Qwen3 14B
General LLM · Qwen · Alibaba · ~14B · 128K ctx · Tools, Reasoning, Code, Multilingual, Long context · ~Q4 memory: 10GB · thinking mode, reasoning, coding

Phi-3 Medium (14B)
General LLM · Phi · Microsoft · ~14B · 128K ctx · Reasoning, Long context · ~Q4 memory: 9GB · reasoning, compact, MIT license

Phi-4 (14B)
General LLM · Phi · Microsoft · ~14B · 16K ctx · Reasoning, Code · ~Q4 memory: 9GB · reasoning, math, MIT license

DeepSeek-R1 Distill 14B
Reasoning · DeepSeek · DeepSeek · ~14B · 128K ctx · Reasoning, Long context · ~Q4 memory: 10GB · reasoning, analysis, MIT license

Qwen2.5-Coder 14B
Coding LLM · Qwen · Alibaba · ~14B · 128K ctx · Code, Tools, Long context · ~Q4 memory: 10GB · coding, repo understanding, agents

CodeLlama 13B
Coding LLM · CodeLlama · Meta · ~13B · 16K ctx · Code · ~Q4 memory: 8GB · code, fill-in-the-middle, repo understanding

LLaVA 13B (vision)
Vision / Multimodal · LLaVA · LLaVA · ~13B · 4K ctx · Vision · ~Q4 memory: 9GB · image understanding, visual Q&A, captioning

Gemma 3 12B
General LLM · Gemma 3 · Google · ~12B · 128K ctx · Vision, Multilingual, Long context · ~Q4 memory: 8GB · quality responses, multilingual, vision

Mistral Nemo 12B
General LLM · Mistral · Mistral AI · ~12B · 128K ctx · Tools, Multilingual, Long context · ~Q4 memory: 8GB · multilingual, long-context, tool use

Llama 3.2 Vision 11B
Vision / Multimodal · Llama · Meta · ~11B · 128K ctx · Vision, Long context · ~Q4 memory: 9GB · image + text reasoning, document understanding

Gemma 2 9B
General LLM · Gemma · Google · ~9B · 8K ctx · Multilingual · ~Q4 memory: 7GB · quality responses, compact

Llama 3.1 8B
General LLM · Llama · Meta · ~8B · 128K ctx · Tools, Multilingual, Long context · ~Q4 memory: 6GB · general assistant, summarization, tool use

Qwen3 8B
General LLM · Qwen · Alibaba · ~8B · 128K ctx · Tools, Reasoning, Code, Multilingual, Long context · ~Q4 memory: 6GB · thinking mode, multilingual, agents

Granite 3 8B
General LLM · Granite · IBM · ~8B · 128K ctx · Tools, Multilingual, Long context · ~Q4 memory: 6GB · tool use, enterprise-friendly, long-context

DeepSeek-R1 Distill 8B
Reasoning · DeepSeek · DeepSeek · ~8B · 128K ctx · Reasoning, Long context · ~Q4 memory: 6GB · step-by-step reasoning, compact, MIT license

LLaVA-Llama3 8B (vision)
Vision / Multimodal · LLaVA · LLaVA · ~8B · 8K ctx · Vision · ~Q4 memory: 6.5GB · image understanding, visual Q&A, stronger language backbone

MiniCPM-V 8B (vision)
Vision / Multimodal · MiniCPM · OpenBMB · ~8B · 32K ctx · Vision, Multilingual · ~Q4 memory: 7GB · document/OCR vision, image understanding, multilingual

Qwen2.5 7B Instruct
General LLM · Qwen2.5 · Alibaba · ~7.6B · 33K ctx · ~Q4 memory: 4.9GB

Qwen2.5 Coder 7B Instruct
Coding LLM · Qwen2.5 · Alibaba · ~7.6B · 131K ctx · Code, Long context · ~Q4 memory: 4.9GB

Qwen2.5 7B
General LLM · Qwen · Alibaba · ~7B · 128K ctx · Tools, Code, Multilingual, Long context · ~Q4 memory: 5.5GB · multilingual, coding, structured output

Mistral 7B
General LLM · Mistral · Mistral AI · ~7B · 32K ctx · Tools · ~Q4 memory: 5GB · fast, low latency, general assistant

Qwen2.5-Coder 7B
Coding LLM · Qwen · Alibaba · ~7B · 128K ctx · Code, Tools, Long context · ~Q4 memory: 5.5GB · coding, in-editor completion, low latency

CodeLlama 7B
Coding LLM · CodeLlama · Meta · ~7B · 16K ctx · Code · ~Q4 memory: 5GB · code, fill-in-the-middle, low latency

StarCoder2 7B
Coding LLM · StarCoder · BigCode · ~7B · 16K ctx · Code · ~Q4 memory: 5GB · code, fill-in-the-middle, repo understanding

Qwen2-VL 7B (vision)
Vision / Multimodal · Qwen · Alibaba · ~7B · 32K ctx · Vision, Multilingual · ~Q4 memory: 7GB · image understanding, document/screenshot parsing, OCR-style tasks

LLaVA 7B (vision)
Vision / Multimodal · LLaVA · LLaVA · ~7B · 4K ctx · Vision · ~Q4 memory: 6GB · image understanding, visual Q&A, captioning

Gemma 3 4B
General LLM · Gemma 3 · Google · ~4B · 128K ctx · Vision, Multilingual, Long context · ~Q4 memory: 3GB · compact, multilingual, vision

Phi-3.5 Mini (3.8B)
General LLM · Phi · Microsoft · ~3.8B · 128K ctx · Reasoning, Long context, Multilingual · ~Q4 memory: 2.5GB · compact, reasoning, MIT license

Llama 3.2 3B
General LLM · Llama · Meta · ~3B · 128K ctx · Tools, Multilingual, Long context · ~Q4 memory: 2.5GB · compact, fast, instruction following

Qwen2.5 3B
General LLM · Qwen · Alibaba · ~3B · 32K ctx · Tools, Multilingual · ~Q4 memory: 2.2GB · compact, fast, instruction following

StarCoder2 3B
Coding LLM · StarCoder · BigCode · ~3B · 16K ctx · Code · ~Q4 memory: 2.2GB · code, fill-in-the-middle, low latency

Gemma 2 2B
General LLM · Gemma · Google · ~2B · 8K ctx · Multilingual · ~Q4 memory: 1.6GB · tiny, quality responses, fast

Granite 3 2B
General LLM · Granite · IBM · ~2B · 128K ctx · Tools, Multilingual, Long context · ~Q4 memory: 1.6GB · compact, tool use, enterprise-friendly

Moondream 2 (vision)
Vision / Multimodal · Moondream · Moondream · ~1.8B · 2K ctx · Vision · ~Q4 memory: 1.5GB · tiny vision, edge, captioning

SmolLM2 1.7B
General LLM · SmolLM · Hugging Face · ~1.7B · 8K ctx · Tools · ~Q4 memory: 1.1GB · tiny, edge / CPU, fast

Qwen2.5 1.5B
General LLM · Qwen · Alibaba · ~1.5B · 32K ctx · Tools, Multilingual · ~Q4 memory: 1GB · tiny, fast, edge

DeepSeek-R1 Distill 1.5B
Reasoning · DeepSeek · DeepSeek · ~1.5B · 128K ctx · Reasoning, Long context · ~Q4 memory: 1.5GB · tiny reasoning, edge, step-by-step

Qwen2.5-Coder 1.5B
Coding LLM · Qwen · Alibaba · ~1.5B · 32K ctx · Code · ~Q4 memory: 1GB · code, fill-in-the-middle, low latency

Llama 3.2 1B
General LLM · Llama · Meta · ~1B · 128K ctx · Tools, Multilingual, Long context · ~Q4 memory: 1GB · tiny, edge / CPU, fast

BGE-M3 Embeddings (class)
Embedding · BAAI · BAAI · 8K ctx · Embedding, Multilingual · local footprint not listed · multilingual retrieval, long documents, RAG

Qwen2.5 0.5B
General LLM · Qwen · Alibaba · ~0.5B · 32K ctx · Tools, Multilingual · ~Q4 memory: 0.4GB · tiny, edge / CPU, fast

mxbai-embed-large (class)
Embedding · Mixedbread · Mixedbread · 0.5K ctx · Embedding · local footprint not listed · quality retrieval, RAG, lightweight

Snowflake Arctic Embed (class)
Embedding · Snowflake · Snowflake · 0.5K ctx · Embedding · local footprint not listed · quality retrieval, RAG, lightweight

Nomic Embed Text (class)
Embedding · Nomic · Nomic · 8K ctx · Embedding · local footprint not listed · fast retrieval, lightweight, RAG

all-MiniLM (class)
Embedding · Sentence-Transformers · Sentence-Transformers · 0.5K ctx · Embedding · local footprint not listed · tiny, very fast, RAG

Claude (Anthropic API)
API · Claude · Anthropic · 200K ctx · Tools, Reasoning, Code, Vision, Long context, Multilingual · Hosted API, no local footprint · frontier quality, long context, strong reasoning & coding

GPT-class (OpenAI API)
API · GPT · OpenAI · 128K ctx · Tools, Reasoning, Code, Vision, Long context, Multilingual · Hosted API, no local footprint · frontier quality, broad ecosystem, tool use

Gemini-class (Google API)
API · Gemini · Google · 1000K ctx · Tools, Reasoning, Code, Vision, Long context, Multilingual · Hosted API, no local footprint · very long context, multimodal, frontier quality
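To show how entries like these can be filtered by size and capability, here is a small sketch that picks models fitting a memory budget. The `Model` class, the `models_that_fit` function and the `headroom` factor (an assumed margin left free for context and the rest of the system) are hypothetical and not part of this site. The five entries are copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    kind: str
    q4_memory_gb: float
    capabilities: frozenset


# A handful of entries taken from the catalog above.
CATALOG = [
    Model("Qwen2.5 72B", "General LLM", 44,
          frozenset({"Tools", "Code", "Reasoning", "Multilingual", "Long context"})),
    Model("Qwen2.5-Coder 32B", "Coding LLM", 20,
          frozenset({"Code", "Tools", "Long context"})),
    Model("Phi-4 (14B)", "General LLM", 9, frozenset({"Reasoning", "Code"})),
    Model("Llama 3.1 8B", "General LLM", 6,
          frozenset({"Tools", "Multilingual", "Long context"})),
    Model("StarCoder2 3B", "Coding LLM", 2.2, frozenset({"Code"})),
]


def models_that_fit(catalog, budget_gb, capability=None, headroom=0.8):
    """Return models whose ~Q4 estimate fits in budget_gb * headroom,
    optionally restricted to one capability tag, largest first."""
    usable_gb = budget_gb * headroom
    hits = [
        m for m in catalog
        if m.q4_memory_gb <= usable_gb
        and (capability is None or capability in m.capabilities)
    ]
    return sorted(hits, key=lambda m: m.q4_memory_gb, reverse=True)


# Example: coding-capable models for a device with 24 GB of memory.
for m in models_that_fit(CATALOG, budget_gb=24, capability="Code"):
    print(f"{m.name}: ~{m.q4_memory_gb} GB")
```

With the assumed 20% margin, a 24 GB device leaves about 19 GB usable, so the 20 GB Qwen2.5-Coder 32B is excluded and the sketch returns Phi-4 (14B) and StarCoder2 3B. Whether that margin is too cautious depends on context length and on what else shares the memory.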
On the frontier API models: Claude (Anthropic API), GPT-class (OpenAI API) and Gemini-class (Google API) are listed only as quality and cost comparison anchors for the hybrid strategy. They run as hosted services, not on local hardware, and send data to the provider.
A note on honesty: model sizes, context windows and footprints change between releases, and licenses vary. Treat figures here as approximate guidance and verify the exact variant and its terms before deploying. Reasoning, vision and API entries are flagged accordingly.
Run these models inside a private AI Business OS
Run your own AI agents on hardware you control — private by design, no per-seat data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.