Model garden

Llama 4 Maverick 17B 128E

Instantly via the EU router or as a dedicated GPU deployment. Data stays in Europe.

Llama 4 Maverick 17B 128E is a multimodal language model from Meta with 402B parameters, hosted on EU GPUs via an OpenAI-compatible API.

meta-llama/Llama-4-Maverick-17B-128E

text+image->text · meta-llama · EU-hosted

Parameters: 402B
Context window: not specified
Minimum VRAM: 924 GB

POST /api/v1/chat/completions → 200 OK

Specifications

Parameters: 402B
Minimum VRAM: 924 GB
Architecture: Llama4ForConditionalGeneration (vLLM)
License: other
Modality: text+image->text
Released: April 2025
Publisher: meta-llama

Pricing

Input (per 1M tokens): €0.40
Output (per 1M tokens): €0.60

The shared EU router is pay-per-token with scale-to-zero. Dedicated GPU deployments are billed hourly; see pricing.

Call it now

Drop-in replacement for OpenAI: change only the base URL and API key. The Anthropic format (/v1/messages) is supported too.

curl https://hostyourai.com/api/v1/chat/completions \
  -H "Authorization: Bearer hyai-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-4-Maverick-17B-128E",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
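
The same call can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the endpoint and model ID are copied from the curl example, while the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`).

```python
import json
import urllib.request

# Endpoint and model ID are taken from the curl example above.
API_URL = "https://hostyourai.com/api/v1/chat/completions"
MODEL_ID = "meta-llama/Llama-4-Maverick-17B-128E"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl call, without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text.

    Assumption: the response follows the OpenAI chat-completions shape,
    so the reply is found at choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a valid key, `send_chat_request(build_chat_request(api_key, "Hello!"))` would return the model's reply as a string. Building and sending are kept separate so the request can be inspected or logged before any network traffic happens.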

Frequently asked questions

Can I run Llama 4 Maverick 17B 128E in the EU?

Yes. HostYourAI runs Llama 4 Maverick 17B 128E on GPUs in European datacenters via vLLM. Prompts and outputs never leave the EU and there is no US cloud provider in the chain.

Is hosting Llama 4 Maverick 17B 128E GDPR-compliant?

Yes. All processing happens inside the EU, a Data Processing Agreement (DPA) is available, and the subprocessor list is public. The open weights also mean that your data is not used for training.

How much does Llama 4 Maverick 17B 128E cost?

Via the shared EU router you pay €0.40 per million input tokens and €0.60 per million output tokens, with no fixed costs. For high volume or isolation you can also run Llama 4 Maverick 17B 128E as a dedicated GPU instance billed hourly.
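
To get a feel for the per-token prices, the arithmetic can be sketched as follows. The two rates come from this page; the function name `estimate_cost_eur` and the token counts in the usage note are illustrative assumptions, and the sketch covers the shared router only, not hourly dedicated deployments.

```python
from decimal import Decimal

# Shared EU router prices quoted on this page, in EUR per 1M tokens.
INPUT_EUR_PER_MILLION = Decimal("0.40")
OUTPUT_EUR_PER_MILLION = Decimal("0.60")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the pay-per-token cost in EUR for a given token volume."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_EUR_PER_MILLION
        + output_tokens * OUTPUT_EUR_PER_MILLION
    )
    return total / TOKENS_PER_MILLION
```

For example, 2,000,000 input tokens and 500,000 output tokens come to €0.80 + €0.30 = €1.10. `Decimal` is used instead of floats so that currency amounts are not affected by binary rounding.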

Is the API OpenAI-compatible?

Yes. You can use the standard OpenAI SDKs with a custom base URL (https://hostyourai.com/api/v1). The Anthropic Messages API is supported as a drop-in as well.
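
As a rough sketch of what that looks like with the official OpenAI Python SDK: only the base URL and the API key differ from a stock OpenAI setup. The environment variable name `HYAI_API_KEY` and the helper `ask` are hypothetical, and the `openai` package must be installed separately.

```python
import os

# Base URL and model ID as given on this page.
BASE_URL = "https://hostyourai.com/api/v1"
MODEL_ID = "meta-llama/Llama-4-Maverick-17B-128E"


def ask(prompt: str) -> str:
    """Send one user message through the OpenAI SDK and return the reply."""
    # Imported lazily so this module loads even where the SDK is not installed.
    from openai import OpenAI

    client = OpenAI(
        base_url=BASE_URL,
        api_key=os.environ["HYAI_API_KEY"],  # hypothetical variable name
    )
    response = client.chat.completions.create(
        model=MODEL_ID,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Calling `ask("Hello!")` with the key exported in the environment should return the reply text, assuming the service mirrors the OpenAI response format.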

More models from Meta

Llama Guard 4 12B

Llama Guard 4 12B is a multimodal language model from Meta with 12B parameters, hosted on EU GPUs via an OpenAI-compatible API.

Llama 4 Scout 17B 16E

Llama 4 Scout 17B 16E is a multimodal language model from Meta with 109B parameters, hosted on EU GPUs via an OpenAI-compatible API.

Llama 4 Scout 17B 16E Instruct

Llama 4 Scout 17B 16E Instruct is a multimodal language model from Meta with 109B parameters, hosted on EU GPUs via an OpenAI-compatible API.

Llama 4 Maverick 17B 128E Instruct

Llama 4 Maverick 17B 128E Instruct is a multimodal language model from Meta with 402B parameters, hosted on EU GPUs via an OpenAI-compatible API.

Llama 4 Maverick 17B 128E Instruct FP8

Llama 4 Maverick 17B 128E Instruct FP8 is a multimodal language model from Meta with 402B parameters, hosted on EU GPUs via an OpenAI-compatible API.

Llama 3.3 70B Instruct

Llama 3.3 70B Instruct is an open-source language model from Meta with 71B parameters, hosted on EU GPUs via an OpenAI-compatible API.

Try Llama 4 Maverick 17B 128E for free

Creating an account takes a minute. Test Llama 4 Maverick 17B 128E straight away in the playground.
