DeepSeek Distill

DeepSeek-R1-Distill-Qwen-14B

14B parameters

License

MIT

Tune type

LoRA

Typical cost

~$10

Cold start

5–15 min

Best for

Reasoning, math

License terms

DeepSeek-R1-Distill-Qwen-14B is released under the MIT license. Yachay has vetted the license and confirmed that commercial use is allowed.

No naming or attribution constraints on derivatives.

How Yachay tunes it

We default to LoRA on Google Cloud Vertex AI Tuning. Yachay auto-selects the GPU for this model, typically an A100 40 GB (Spot), based on the 28 GB FP16 weight footprint. Hyperparameters (epochs, batch size, learning rate) can be customised at submission.
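The 28 GB figure follows from simple arithmetic: FP16 stores each parameter in 2 bytes. The sketch below reproduces it, taking the nominal 14B parameter count at face value and using decimal gigabytes. The function name and the headroom calculation are illustrative assumptions, not Yachay's actual GPU-selection logic.

```python
def weight_footprint_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the model weights alone, in decimal gigabytes.

    bytes_per_param defaults to 2 because FP16 uses 16 bits per parameter.
    """
    return num_params * bytes_per_param / 1e9


# Nominal parameter count from the model card (14B).
fp16_gb = weight_footprint_gb(14e9)

# Memory left on a 40 GB card after loading the base weights.
# Hypothetical illustration only: real headroom also depends on
# activations, adapter weights, and optimizer state.
headroom_gb = 40 - fp16_gb

print(f"FP16 weights: {fp16_gb:.0f} GB, headroom on a 40 GB GPU: {headroom_gb:.0f} GB")
```

Because LoRA trains only small adapter matrices and keeps the base weights frozen, the remaining memory has to cover adapters, activations, and optimizer state rather than a full second copy of the model.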

The model was released on January 20, 2025. Weights are pulled from Hugging Face at job start; expect 5–15 minutes of cold start before tuning begins.

Related models

Same family first, then nearest size across families.

DeepSeek Distill

DeepSeek-R1-Distill-Llama-8B

Ready to fine-tune DeepSeek-R1-Distill-Qwen-14B?

Sign in with your Condor umbrella account. Pricing starts at ~$0.15 per million training tokens on production-size jobs. There is no upfront bond: keep a card on file and you are charged at completion.
