Local LLM library and hardware requirements
67 models you can search and filter by type, capability, and size. Memory figures are working-set estimates per quantization (treat them as approximate, ±): they map model sizes to hardware tiers and are not exact benchmarks. Open each model to see compatible devices and a recommended configuration.
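As a rough illustration of how a "~Q4 memory" figure can be approximated from a parameter count, here is a minimal sketch. The function name, the 4.5 bits per weight, and the 10% runtime overhead are assumptions chosen for illustration; they are not the method used to produce the figures in this catalog. For MoE models, pass the total parameter count, since all experts must be resident in memory even though only some are active per token.

```python
def estimate_q4_memory_gb(params_billion: float,
                          bits_per_weight: float = 4.5,
                          overhead_fraction: float = 0.10) -> float:
    """Approximate working-set memory in GB (1 GB = 1e9 bytes).

    bits_per_weight and overhead_fraction are illustrative assumptions:
    4-bit formats usually store some extra scale data per block, and the
    runtime needs room for the KV cache and buffers.
    """
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * (1 + overhead_fraction)


if __name__ == "__main__":
    # A ~70B dense model under the assumptions above
    print(round(estimate_q4_memory_gb(70), 1))  # prints 43.3
```

Under these assumptions a ~70B model comes out near 43GB, in the same range as the 42GB listed for the 70B entries, which is the level of agreement the ± caveat implies.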
DeepSeek-R1 671B (MoE)
Reasoning · DeepSeek · DeepSeek · ~671B (37B active) · 128K ctx · ~Q4 memory: 400GB
Capabilities: Reasoning, Code, Long context. Tags: frontier reasoning, MoE efficiency, MIT license.

Llama 3.1 405B
General LLM · Llama · Meta · ~405B · 128K ctx · ~Q4 memory: 230GB
Capabilities: Tools, Reasoning, Multilingual, Long context. Tags: frontier quality, complex reasoning.

Qwen3 235B-A22B (MoE)
General LLM · Qwen · Alibaba · ~235B (22B active) · 128K ctx · ~Q4 memory: 130GB
Capabilities: Tools, Reasoning, Code, Multilingual, Long context. Tags: MoE efficiency, frontier-class quality, reasoning.

Qwen2.5 72B
General LLM · Qwen · Alibaba · ~72B · 128K ctx · ~Q4 memory: 44GB
Capabilities: Tools, Code, Reasoning, Multilingual, Long context. Tags: coding, reasoning, multilingual.

Llama 3.1 70B
General LLM · Llama · Meta · ~70B · 128K ctx · ~Q4 memory: 42GB
Capabilities: Tools, Reasoning, Multilingual, Long context. Tags: high quality, reasoning, tool use.

Llama 3.3 70B
General LLM · Llama · Meta · ~70B · 128K ctx · ~Q4 memory: 42GB
Capabilities: Tools, Reasoning, Multilingual, Long context. Tags: high quality, reasoning, agents.

DeepSeek-R1 Distill Llama 70B
Reasoning · DeepSeek · DeepSeek · ~70B · 128K ctx · ~Q4 memory: 42GB
Capabilities: Reasoning, Long context. Tags: strong reasoning, agents, MIT license.

Mixtral 8x7B (MoE)
General LLM · Mistral · Mistral AI · ~47B (13B active) · 32K ctx · ~Q4 memory: 28GB
Capabilities: Tools, Multilingual. Tags: throughput, MoE efficiency, general assistant.

CodeLlama 34B
Coding LLM · CodeLlama · Meta · ~34B · 16K ctx · ~Q4 memory: 21GB
Capabilities: Code. Tags: code, repo understanding, refactoring.

Qwen2.5 32B
General LLM · Qwen · Alibaba · ~32B · 128K ctx · ~Q4 memory: 20GB
Capabilities: Tools, Code, Reasoning, Multilingual, Long context. Tags: coding, reasoning, agents.

Qwen3 32B
General LLM · Qwen · Alibaba · ~32B · 128K ctx · ~Q4 memory: 20GB
Capabilities: Tools, Reasoning, Code, Multilingual, Long context. Tags: thinking mode, reasoning, coding.

DeepSeek-R1 Distill 32B
Reasoning · DeepSeek · DeepSeek · ~32B · 128K ctx · ~Q4 memory: 20GB
Capabilities: Reasoning, Long context. Tags: strong reasoning, agents, MIT license.

Qwen2.5-Coder 32B
Coding LLM · Qwen · Alibaba · ~32B · 128K ctx · ~Q4 memory: 20GB
Capabilities: Code, Tools, Long context. Tags: coding, agents, repo understanding.

Gemma 2 27B
General LLM · Gemma · Google · ~27B · 8K ctx · ~Q4 memory: 17GB
Capabilities: Multilingual. Tags: general assistant, quality responses.

Gemma 3 27B
General LLM · Gemma 3 · Google · ~27B · 128K ctx · ~Q4 memory: 17GB
Capabilities: Vision, Multilingual, Long context. Tags: quality responses, multilingual, vision.

Mistral Small 24B
General LLM · Mistral · Mistral AI · ~24B · 32K ctx · ~Q4 memory: 14GB
Capabilities: Tools, Code, Multilingual, Long context. Tags: balanced quality, permissive license, agents.

DeepSeek-Coder V2 (class)
Coding LLM · DeepSeek · DeepSeek · ~16B · 128K ctx · ~Q4 memory: 11GB
Capabilities: Code, Long context. Tags: coding, fill-in-the-middle, repo-scale context.

StarCoder2 15B
Coding LLM · StarCoder · BigCode · ~15B · 16K ctx · ~Q4 memory: 10GB
Capabilities: Code. Tags: code, repo understanding, fill-in-the-middle.

Qwen2.5 14B
General LLM · Qwen · Alibaba · ~14B · 128K ctx · ~Q4 memory: 10GB
Capabilities: Tools, Code, Reasoning, Multilingual, Long context. Tags: coding, reasoning, multilingual.

Qwen3 14B
General LLM · Qwen · Alibaba · ~14B · 128K ctx · ~Q4 memory: 10GB
Capabilities: Tools, Reasoning, Code, Multilingual, Long context. Tags: thinking mode, reasoning, coding.

Phi-3 Medium (14B)
General LLM · Phi · Microsoft · ~14B · 128K ctx · ~Q4 memory: 9GB
Capabilities: Reasoning, Long context. Tags: reasoning, compact, MIT license.

Phi-4 (14B)
General LLM · Phi · Microsoft · ~14B · 16K ctx · ~Q4 memory: 9GB
Capabilities: Reasoning, Code. Tags: reasoning, math, MIT license.

DeepSeek-R1 Distill 14B
Reasoning · DeepSeek · DeepSeek · ~14B · 128K ctx · ~Q4 memory: 10GB
Capabilities: Reasoning, Long context. Tags: reasoning, analysis, MIT license.

Qwen2.5-Coder 14B
Coding LLM · Qwen · Alibaba · ~14B · 128K ctx · ~Q4 memory: 10GB
Capabilities: Code, Tools, Long context. Tags: coding, repo understanding, agents.

CodeLlama 13B
Coding LLM · CodeLlama · Meta · ~13B · 16K ctx · ~Q4 memory: 8GB
Capabilities: Code. Tags: code, fill-in-the-middle, repo understanding.

LLaVA 13B (vision)
Vision / Multimodal · LLaVA · LLaVA · ~13B · 4K ctx · ~Q4 memory: 9GB
Capabilities: Vision. Tags: image understanding, visual Q&A, captioning.

Gemma 3 12B
General LLM · Gemma 3 · Google · ~12B · 128K ctx · ~Q4 memory: 8GB
Capabilities: Vision, Multilingual, Long context. Tags: quality responses, multilingual, vision.

Mistral Nemo 12B
General LLM · Mistral · Mistral AI · ~12B · 128K ctx · ~Q4 memory: 8GB
Capabilities: Tools, Multilingual, Long context. Tags: multilingual, long-context, tool use.

Llama 3.2 Vision 11B
Vision / Multimodal · Llama · Meta · ~11B · 128K ctx · ~Q4 memory: 9GB
Capabilities: Vision, Long context. Tags: image + text reasoning, document understanding.

Gemma 2 9B
General LLM · Gemma · Google · ~9B · 8K ctx · ~Q4 memory: 7GB
Capabilities: Multilingual. Tags: quality responses, compact.

Llama 3.1 8B
General LLM · Llama · Meta · ~8B · 128K ctx · ~Q4 memory: 6GB
Capabilities: Tools, Multilingual, Long context. Tags: general assistant, summarization, tool use.

Qwen3 8B
General LLM · Qwen · Alibaba · ~8B · 128K ctx · ~Q4 memory: 6GB
Capabilities: Tools, Reasoning, Code, Multilingual, Long context. Tags: thinking mode, multilingual, agents.

Granite 3 8B
General LLM · Granite · IBM · ~8B · 128K ctx · ~Q4 memory: 6GB
Capabilities: Tools, Multilingual, Long context. Tags: tool use, enterprise-friendly, long-context.

DeepSeek-R1 Distill 8B
Reasoning · DeepSeek · DeepSeek · ~8B · 128K ctx · ~Q4 memory: 6GB
Capabilities: Reasoning, Long context. Tags: step-by-step reasoning, compact, MIT license.

LLaVA-Llama3 8B (vision)
Vision / Multimodal · LLaVA · LLaVA · ~8B · 8K ctx · ~Q4 memory: 6.5GB
Capabilities: Vision. Tags: image understanding, visual Q&A, stronger language backbone.

MiniCPM-V 8B (vision)
Vision / Multimodal · MiniCPM · OpenBMB · ~8B · 32K ctx · ~Q4 memory: 7GB
Capabilities: Vision, Multilingual. Tags: document/OCR vision, image understanding, multilingual.

Qwen2.5 7B Instruct
General LLM · Qwen2.5 · Alibaba · ~7.6B · 33K ctx · ~Q4 memory: 4.9GB

Qwen2.5 Coder 7B Instruct
Coding LLM · Qwen2.5 · Alibaba · ~7.6B · 131K ctx · ~Q4 memory: 4.9GB
Capabilities: Code, Long context.

Qwen2.5 7B
General LLM · Qwen · Alibaba · ~7B · 128K ctx · ~Q4 memory: 5.5GB
Capabilities: Tools, Code, Multilingual, Long context. Tags: multilingual, coding, structured output.

Mistral 7B
General LLM · Mistral · Mistral AI · ~7B · 32K ctx · ~Q4 memory: 5GB
Capabilities: Tools. Tags: fast, low latency, general assistant.

Qwen2.5-Coder 7B
Coding LLM · Qwen · Alibaba · ~7B · 128K ctx · ~Q4 memory: 5.5GB
Capabilities: Code, Tools, Long context. Tags: coding, in-editor completion, low latency.

CodeLlama 7B
Coding LLM · CodeLlama · Meta · ~7B · 16K ctx · ~Q4 memory: 5GB
Capabilities: Code. Tags: code, fill-in-the-middle, low latency.

StarCoder2 7B
Coding LLM · StarCoder · BigCode · ~7B · 16K ctx · ~Q4 memory: 5GB
Capabilities: Code. Tags: code, fill-in-the-middle, repo understanding.

Qwen2-VL 7B (vision)
Vision / Multimodal · Qwen · Alibaba · ~7B · 32K ctx · ~Q4 memory: 7GB
Capabilities: Vision, Multilingual. Tags: image understanding, document/screenshot parsing, OCR-style tasks.

LLaVA 7B (vision)
Vision / Multimodal · LLaVA · LLaVA · ~7B · 4K ctx · ~Q4 memory: 6GB
Capabilities: Vision. Tags: image understanding, visual Q&A, captioning.

Gemma 3 4B
General LLM · Gemma 3 · Google · ~4B · 128K ctx · ~Q4 memory: 3GB
Capabilities: Vision, Multilingual, Long context. Tags: compact, multilingual, vision.

Phi-3.5 Mini (3.8B)
General LLM · Phi · Microsoft · ~3.8B · 128K ctx · ~Q4 memory: 2.5GB
Capabilities: Reasoning, Long context, Multilingual. Tags: compact, reasoning, MIT license.

Llama 3.2 3B
General LLM · Llama · Meta · ~3B · 128K ctx · ~Q4 memory: 2.5GB
Capabilities: Tools, Multilingual, Long context. Tags: compact, fast, instruction following.

Qwen2.5 3B
General LLM · Qwen · Alibaba · ~3B · 32K ctx · ~Q4 memory: 2.2GB
Capabilities: Tools, Multilingual. Tags: compact, fast, instruction following.

StarCoder2 3B
Coding LLM · StarCoder · BigCode · ~3B · 16K ctx · ~Q4 memory: 2.2GB
Capabilities: Code. Tags: code, fill-in-the-middle, low latency.

Gemma 2 2B
General LLM · Gemma · Google · ~2B · 8K ctx · ~Q4 memory: 1.6GB
Capabilities: Multilingual. Tags: tiny, quality responses, fast.

Granite 3 2B
General LLM · Granite · IBM · ~2B · 128K ctx · ~Q4 memory: 1.6GB
Capabilities: Tools, Multilingual, Long context. Tags: compact, tool use, enterprise-friendly.

Moondream 2 (vision)
Vision / Multimodal · Moondream · Moondream · ~1.8B · 2K ctx · ~Q4 memory: 1.5GB
Capabilities: Vision. Tags: tiny vision, edge, captioning.

SmolLM2 1.7B
General LLM · SmolLM · Hugging Face · ~1.7B · 8K ctx · ~Q4 memory: 1.1GB
Capabilities: Tools. Tags: tiny, edge / CPU, fast.

Qwen2.5 1.5B
General LLM · Qwen · Alibaba · ~1.5B · 32K ctx · ~Q4 memory: 1GB
Capabilities: Tools, Multilingual. Tags: tiny, fast, edge.

DeepSeek-R1 Distill 1.5B
Reasoning · DeepSeek · DeepSeek · ~1.5B · 128K ctx · ~Q4 memory: 1.5GB
Capabilities: Reasoning, Long context. Tags: tiny reasoning, edge, step-by-step.

Qwen2.5-Coder 1.5B
Coding LLM · Qwen · Alibaba · ~1.5B · 32K ctx · ~Q4 memory: 1GB
Capabilities: Code. Tags: code, fill-in-the-middle, low latency.

Llama 3.2 1B
General LLM · Llama · Meta · ~1B · 128K ctx · ~Q4 memory: 1GB
Capabilities: Tools, Multilingual, Long context. Tags: tiny, edge / CPU, fast.

BGE-M3 Embeddings (class)
Embedding · BAAI · BAAI · 8K ctx · Hosted API, no local footprint
Capabilities: Embedding, Multilingual. Tags: multilingual retrieval, long documents, RAG.

Qwen2.5 0.5B
General LLM · Qwen · Alibaba · 32K ctx · ~Q4 memory: 0.4GB
Capabilities: Tools, Multilingual. Tags: tiny, edge / CPU, fast.

mxbai-embed-large (class)
Embedding · Mixedbread · Mixedbread · 0.5K ctx · Hosted API, no local footprint
Capabilities: Embedding. Tags: quality retrieval, RAG, lightweight.

Snowflake Arctic Embed (class)
Embedding · Snowflake · Snowflake · 0.5K ctx · Hosted API, no local footprint
Capabilities: Embedding. Tags: quality retrieval, RAG, lightweight.

Nomic Embed Text (class)
Embedding · Nomic · Nomic · 8K ctx · Hosted API, no local footprint
Capabilities: Embedding. Tags: fast retrieval, lightweight, RAG.

all-MiniLM (class)
Embedding · Sentence-Transformers · Sentence-Transformers · 0.5K ctx · Hosted API, no local footprint
Capabilities: Embedding. Tags: tiny, very fast, RAG.

Claude (Anthropic API)
API · Claude · Anthropic · 200K ctx · Hosted API, no local footprint
Capabilities: Tools, Reasoning, Code, Vision, Long context, Multilingual. Tags: frontier quality, long context, strong reasoning & coding.

GPT-class (OpenAI API)
API · GPT · OpenAI · 128K ctx · Hosted API, no local footprint
Capabilities: Tools, Reasoning, Code, Vision, Long context, Multilingual. Tags: frontier quality, broad ecosystem, tool use.

Gemini-class (Google API)
API · Gemini · Google · 1000K ctx · Hosted API, no local footprint
Capabilities: Tools, Reasoning, Code, Vision, Long context, Multilingual. Tags: very long context, multimodal, frontier quality.
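The filtering this catalog describes (by type, capability, and size) can be sketched as a small lookup over records. This is a minimal sketch, not the site's implementation: `ModelEntry`, `models_that_fit`, and the 20% memory headroom are hypothetical names and values introduced for illustration, and the sample holds only five entries copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    kind: str
    q4_memory_gb: float
    capabilities: frozenset


# Five entries copied from the catalog; the full list has 67.
CATALOG = [
    ModelEntry("Llama 3.3 70B", "General LLM", 42,
               frozenset({"Tools", "Reasoning", "Multilingual", "Long context"})),
    ModelEntry("Qwen2.5-Coder 32B", "Coding LLM", 20,
               frozenset({"Code", "Tools", "Long context"})),
    ModelEntry("Phi-4 (14B)", "General LLM", 9,
               frozenset({"Reasoning", "Code"})),
    ModelEntry("Llama 3.1 8B", "General LLM", 6,
               frozenset({"Tools", "Multilingual", "Long context"})),
    ModelEntry("StarCoder2 3B", "Coding LLM", 2.2,
               frozenset({"Code"})),
]


def models_that_fit(catalog, memory_budget_gb, required=frozenset(), headroom=0.2):
    """Return entries whose ~Q4 memory fits the budget, largest first.

    headroom is an assumed safety margin, since the listed figures
    are approximate and other software also needs memory.
    """
    usable_gb = memory_budget_gb * (1 - headroom)
    fits = [m for m in catalog
            if m.q4_memory_gb <= usable_gb and required <= m.capabilities]
    return sorted(fits, key=lambda m: m.q4_memory_gb, reverse=True)


if __name__ == "__main__":
    for m in models_that_fit(CATALOG, 24, required=frozenset({"Code"})):
        print(m.name, m.q4_memory_gb)
```

With a 24GB budget and the assumed 20% headroom, the 20GB Qwen2.5-Coder 32B entry is excluded; setting `headroom=0` includes it, which shows how sensitive the choice is to the margin given that the memory figures are themselves estimates.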
Explore by family
Compare all sizes of a family side by side, with the hardware each one needs.
About the frontier API models: Claude (Anthropic API), GPT-class (OpenAI API), and Gemini-class (Google API) are listed only as quality and cost references for the hybrid strategy. They run as hosted services, not on local hardware, and send data to the provider.
A note on honesty: model sizes, context windows, and memory footprints change between versions, and licenses vary. Treat the values as rough guidance and verify the exact variant and its terms before deploying. Reasoning, vision, and API entries are labeled accordingly.
Run these models inside a private AI Business OS
Run your own AI agents on hardware you control — private by design, no per-seat data leaving your premises. BrainOutput helps you pick the right machine and turn it into a working AI Business OS.
Explore the AI Business OS