Catalog
17 commercial-safe base models
We’ve vetted each license. Models tuned from a Llama base must include “Llama” in the model name; all other families allow unrestricted commercial use. Tuned weights belong to you, and Yachay never reads them.
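The naming rule is easy to check before you publish a tuned model. Here is a minimal sketch in Python; the function name `check_tuned_model_name` and the case-insensitive match are illustrative assumptions, not part of any Yachay API. It checks only what the rule above says (the name contains "Llama"), so consult the Llama license text for any stricter placement requirement.

```python
def check_tuned_model_name(base_family: str, tuned_name: str) -> bool:
    """Return True if tuned_name satisfies the catalog's naming rule.

    Hypothetical helper: models tuned from a Llama base must include
    "Llama" in their name; other families have no naming rule here.
    """
    if base_family.strip().lower() == "llama":
        # Assumption: the match is case-insensitive and position-agnostic.
        return "llama" in tuned_name.lower()
    return True


if __name__ == "__main__":
    print(check_tuned_model_name("Llama", "Llama-3.1-8B-support-bot"))
    print(check_tuned_model_name("Llama", "support-bot-v2"))
    print(check_tuned_model_name("Qwen", "support-bot-v2"))
```

The first and third calls pass; the second fails because a Llama-derived model is named without "Llama".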
Llama (5)
Meta's open weights — strongest general-purpose base.
Llama 4 Scout
109B params (17B active) · Llama Community
general chat · code · long context · multilingual
Typical tune: ~$80 · LoRA
Llama 3.3 70B
70B params · Llama Community
general chat · instruction following · reasoning
Typical tune: ~$80 · QLoRA
Llama 3.1 8B
8B params · Llama Community
general chat · instruction following · popular tuning base
Typical tune: ~$10 · LoRA
Llama 3.2 3B
3B params · Llama Community
edge · fast tuning · summarization
Typical tune: ~$10 · LoRA
Llama 3.2 1B
1B params · Llama Community
edge · extraction · classification
Typical tune: ~$10 · LoRA
Qwen (4)
Alibaba's Apache 2.0 multilingual workhorse.
Qwen 3 32B
32B params · Apache 2.0
multilingual · code · reasoning
Typical tune: ~$40 · QLoRA
Qwen 3 14B
14B params · Apache 2.0
multilingual · code
Typical tune: ~$10 · LoRA
Qwen 3 8B
8B params · Apache 2.0
multilingual · tool use · Apache 2.0 mid-tier
Typical tune: ~$10 · LoRA
Qwen 3 4B
4B params · Apache 2.0
edge · Apache 2.0 small
Typical tune: ~$10 · LoRA
Gemma (2)
Google's open-weight family, derived from Gemini research.
Phi (2)
Microsoft's synthetic-data-heavy reasoners (MIT).
Mistral (2)
European Apache 2.0 family — strong cost/perf.
DeepSeek Distill (2)
Distilled reasoning checkpoints — math and chain-of-thought.