Question 1

What does Yachay actually do?

Accepted Answer

Yachay (Condor Models) is a managed fine-tuning service. You pick a base open-source LLM, upload your training data as JSONL or CSV, pay usage-based, and download a LoRA or QLoRA adapter when training finishes. No GCP account, no GPU, no Python required.

Question 2

What does it cost?

Accepted Answer

Pricing follows a markup curve over raw GCP compute — small jobs (1B–8B base) land between the $5 minimum on trial-size data and ~$15 on production-size, mid-size (24B–32B) are $30–$40, large (70B QLoRA) land ~$80 on trial-size data and ~$160 on production-size (100 MB / 3 epochs), and frontier (Scout/Maverick MoE) start ~$160 and scale up with dataset size. Storage retention is included free for 30 days; extensions are $0.06/GB-month.

Question 3

How does that compare to Fireworks / Together / OpenAI in $/M training tokens?

Accepted Answer

Every estimate also shows a $/M training-tokens rate so you can directly compare. On production-size Llama 3.1 8B LoRA jobs we land around $0.15–$0.20/M tokens — measurably cheaper than Fireworks ($0.50/M) and Together ($0.48/M), and about 20× cheaper than OpenAI gpt-4o-mini fine-tuning ($3.00/M). Smaller trial-size jobs settle near the $5 minimum, which still typically prices below the competitive baseline. We compute the rate from your actual GPU-hour estimate using a 4-bytes-per-token English average; code-heavy data tokenizes denser (~3.5 bytes/token) so the effective rate is slightly lower for code.

Question 4

What dataset formats do you accept?

Accepted Answer

OpenAI chat JSONL (the format the OpenAI fine-tuning API uses), Alpaca JSONL ({instruction, input, output}), ShareGPT JSONL ({conversations: [{from, value}]}), and CSV/TSV (first column = prompt, last column = response). Anything non-native is normalized to OpenAI chat in your browser before upload.

Question 5

What's the difference between LoRA and QLoRA?

Accepted Answer

LoRA trains low-rank adapters in full precision — faster wall-clock, better fidelity, but the base model has to fit on a single GPU. QLoRA quantizes the frozen base to 4-bit so much larger models fit, at a ~1.4× training-time slowdown. We default to LoRA for ≤16B parameters and QLoRA for everything bigger.

Question 6

Can I commercialize the resulting model?

Accepted Answer

Yes — every base model in the catalog is licensed for commercial use. Some upstream licenses (Meta's Llama family in particular) require the derivative name to include the family token. We enforce that at submit time, so you can't accidentally ship a non-compliant model.

Question 7

What if my dataset is bad and the tune fails?

Accepted Answer

For infrastructure-side failures (Spot pre-emption, Vertex internal errors, missing adapter artifact, or our 3× runaway-timeout circuit-breaker tripping) no invoice fires at all — the invoice path only runs on a successful finalize, and since billing is charge-at-completion against your card on file, "no invoice" literally means "no charge." The dashboard surfaces the failure reason on the job page. For dataset-side failures (INVALID_DATASET) the linter catches most issues before submit; a borderline case that slipped past still results in no charge (we eat the GPU cost on this side too, currently — the policy is generous while we're in private beta). Email hello@condorbox.ai with the job ID if you think a failure was misclassified.

Question 8

How is billing handled?

Accepted Answer

One card on file, charged at completion. You save a card once on /dashboard/setup-card via Stripe Checkout in setup mode — no money moves there. Every subsequent submit goes through with zero Stripe redirects. When a job finishes, we issue a Stripe invoice for the realized cost and Stripe charges your saved card off-session. Mid-run cancels charge only the realized compute (pro-rated by elapsed GPU-time at the same per-class rate as a completion) — no flat cancel fee. The card-on-file is the entire payment relationship; updates and replacements happen at /dashboard/setup-card or /portal/billing on the umbrella. One Stripe customer-of-record across the whole CondorBox umbrella, so there's no billing-account proliferation.

Question 9

How long are adapters retained?

Accepted Answer

Free for 30 days after job completion. After that you can either let them auto-delete or extend retention for $0.06/GB-month plus a small handling fee ($0.50), up to 24 months. The Re-tune button on a completed job's page also persists the configuration — even if the adapter expires, you can still recreate it from the same hyperparameters.

Question 10

Do you have an API or CLI?

Accepted Answer

Not yet — v1 is dashboard-only. A REST API and matching CLI are on the v1.1 roadmap; they will mirror the existing dashboard surface (catalog, estimate, submit, status, download) and bill at the same dynamic price as the web flow. If API-first access is a hard requirement, email hello@condorbox.ai and we'll loop you into the early-access list.

Question 11

What's the maximum job size?

Accepted Answer

Per-job caps tier with your payment history: $500 for new accounts, $2,000 once you've completed 3 clean jobs, $5,000 once you've completed 5 jobs AND $10K lifetime spend (about a 70B QLoRA on a long dataset). The cap matches what competitors do — OpenAI and Together both ramp first-job caps over time. Above $5,000 you need enterprise pre-approval — email hello@condorbox.ai with your customer ID and we typically clear it within one business day. Your current tier is shown on /dashboard/account.

Frequently asked questions