DeepSeek Distill
DeepSeek-R1-Distill-Qwen-14B
14B parameters
Best for
- reasoning
- math
License terms
DeepSeek-R1-Distill-Qwen-14B is released under the MIT license. Yachay has vetted it and confirmed that commercial use is allowed.
No naming or attribution constraints on derivatives.
How Yachay tunes it
We default to LoRA on Google Cloud Vertex AI Tuning. Yachay auto-selects the GPU for this model, typically an A100 40 GB (Spot), based on the 28 GB FP16 weight footprint (14B parameters at 2 bytes each). Hyperparameters (epochs, batch size, learning rate) are customisable at submission.
Released January 20, 2025. Weights are pulled from Hugging Face at job start; expect 5–15 minutes of cold start before tuning begins.
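The 28 GB figure above follows from simple arithmetic, sketched here in Python. This is only a back-of-the-envelope check, not Yachay's GPU-selection logic: the function name and constants are hypothetical, the parameter count is the nominal 14B from the model name, and the estimate covers weights only (activations, LoRA adapter gradients, and optimizer state need additional memory).

```python
def weight_footprint_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in decimal gigabytes.

    bytes_per_param defaults to 2 because FP16 stores each weight in 16 bits.
    """
    return num_params * bytes_per_param / 1e9


PARAMS_14B = 14e9    # nominal parameter count (assumption: exactly 14 billion)
GPU_MEMORY_GB = 40   # memory of one A100 40 GB card

footprint = weight_footprint_gb(PARAMS_14B)
headroom = GPU_MEMORY_GB - footprint  # what is left for everything besides weights

print(f"FP16 weights: {footprint:.0f} GB, headroom on a 40 GB card: {headroom:.0f} GB")
# prints: FP16 weights: 28 GB, headroom on a 40 GB card: 12 GB
```

The roughly 12 GB of headroom is one plausible reason a parameter-efficient method such as LoRA, which keeps the base weights frozen, fits on a single 40 GB card.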
Related models
Same family first, then nearest size across families.
Ready to fine-tune DeepSeek-R1-Distill-Qwen-14B?
Sign in with your Condor umbrella account. Pricing starts at about $0.15 per million training tokens on production-size jobs. No upfront bond: a card is kept on file and charged at completion.
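To get a rough sense of what a job might cost, the starting rate above can be turned into a lower-bound estimate. This is a sketch under stated assumptions, not Yachay's billing formula: the function name is hypothetical, the rate is the approximate "from" price quoted above, and it assumes that billed tokens are the total tokens processed (dataset tokens multiplied by epochs), which the page does not confirm.

```python
ASSUMED_RATE_USD_PER_M_TOKENS = 0.15  # approximate starting price; actual rates may be higher


def estimate_floor_cost_usd(
    dataset_tokens: int,
    epochs: int = 1,
    rate_per_million: float = ASSUMED_RATE_USD_PER_M_TOKENS,
) -> float:
    """Lower-bound cost estimate in USD, rounded to cents.

    Assumes billing scales with dataset_tokens * epochs (an assumption).
    """
    if dataset_tokens < 0 or epochs < 1:
        raise ValueError("dataset_tokens must be >= 0 and epochs must be >= 1")
    total_tokens = dataset_tokens * epochs
    return round(total_tokens / 1_000_000 * rate_per_million, 2)


# Example: a 25M-token dataset trained for 2 epochs is 50M training tokens.
print(estimate_floor_cost_usd(25_000_000, epochs=2))
```

Because the quoted price is a starting rate, treat the result as a floor rather than a quote.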