HostYourAI offers three execution modes side by side on a single account and credit balance. Start with EU Hosted inference, move to dedicated capacity when the workload or compliance profile requires it.
1. EU Hosted Gateway: pay per token
One OpenAI-compatible API key, model catalog, hyai/auto, and scale-to-zero shared capacity. Best for SaaS integrations, agencies, agent apps, and experimentation. Indicative tariffs (EUR per million tokens):
| Model class | Input €/M | Output €/M |
|---|---|---|
| ≤ 8B (Llama 3.2, Qwen 3 small, Phi-4 mini) | 0.03 | 0.06 |
| ≤ 16B (Llama 3.1 8B, Qwen 2.5 7B, Mistral 7B) | 0.05 | 0.10 |
| ≤ 32B (Qwen 3 14B, Phi-4, Mistral Nemo 12B) | 0.10 | 0.18 |
| ≤ 48B (Mistral Small 3.1, DeepSeek Coder V2 Lite) | 0.15 | 0.25 |
| ≤ 80B (Qwen 3 32B, DeepSeek R1 distill 32B) | 0.25 | 0.40 |
| Large (70B+, Mixtral, DeepSeek-V3) | 0.40 | 0.60 |
Current beta framing: EU Hosted means EU-located GPU processing with shared router capacity. EU Sovereignty Mode is sold separately once a fully EU-sovereign provider chain, DPA, subprocessors, audit export, and support-access controls are active.
2. Dedicated EU Deployment: pay per hour
You pick a GPU class and region, deploy your own vLLM instance, and pay for as long as it runs. Best for custom Hugging Face models, BYOK upstreams, steady high-volume workloads, or when you need full control over the deployment.
| GPU class | Typical use | From (EUR / hr) |
|---|---|---|
| 1× L40S / RTX 4090 (24 to 48 GB) | Models up to ~20B | € 0.40 |
| 1× A100 80 GB / H100 80 GB | Models up to ~70B (quantised) | € 2.20 |
| 2× H100 (160 GB) | 70B fp16 / HA | € 4.40 |
| 4× H100 (320 GB) | Large models / high throughput | € 8.80 |
Indicative: live pricing depends on EU-region GPU availability at our providers. The exact price for each offer is shown before you deploy.
3. Private single-tenant: on request
Need an isolated runtime with dedicated GPUs per customer, at-rest encryption, and a private network policy? For healthcare, government, legal, finance, and workloads that cannot use shared capacity, we scope and price this per project. The configurations below are typical starting points, not a self-serve product.
| Configuration | VRAM | Indicative / month | Setup (one-off) |
|---|---|---|---|
| 1× L40S | 48 GB | from € 1,200 | € 500 |
| 1× H100 | 80 GB | from € 3,500 | € 1,000 |
| 2× H100 | 160 GB | from € 6,500 | € 1,000 |
| 4× H100 | 320 GB | from € 12,500 | € 1,500 |
Indicative, scoped per project. Talk to us via /contact. Confidential computing (TEE) is on the roadmap; we will not price what we have not yet validated.
BYOK: bring your own API key
You can attach your own OpenAI, Anthropic, Google or Mistral API key to an instance. We forward your traffic to the upstream under your contract with them. BYOK currently carries no platform fee: you only pay your own provider. Useful for hybrid setups that mix EU-hosted open-weights with frontier closed models.
Getting started
- Creating an account is free. No credit card to sign up.
- Pay as you go from a single prepaid credit balance. No subscription, no minimum.
- Top up with iDEAL, card or SEPA, then call the Router or deploy an instance.
Billing
- Currency: EUR. VAT added where applicable; reverse-charge for EU B2B with valid VAT number.
- Method: Stripe (credit card, iDEAL, SEPA direct debit). Invoices auto-issued from your dashboard.
- Credits: top up in advance; balance is consumed by all three modes from a single pool.
- Volume / partner tier: for €> 2 000 / month in tokens or one or more single-tenant deployments, we offer a partner tier with discounts, SLAs, and a dedicated technical contact. Contact partners@hostyourai.com.
What's not on the price list
Bespoke procurement, custom contracts, NEN 7510 / BIO audit packages, white-label / reseller arrangements, and confidential-computing deployments are quoted per project. Talk to us via /contact.