Model garden

Modelcatalogus

382 open-source modellen, gehost op GPU's in de EU. Eén OpenAI-compatibele API-key, scale-to-zero of dedicated.

382 modellen

GLM 5.2
Z.AI · 753B

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include: - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency - Improved Architecture: We propose IndexShare, which reuses the same indexer across every fou

1M context €0.40 per 1M in 🇪🇺 EU
GLM 5.2 FP8
Z.AI · 753B

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include: - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency - Improved Architecture: We propose IndexShare, which reuses the same indexer across every fou

1M context €0.40 per 1M in 🇪🇺 EU
FastContext 1.0 4B RL
Microsoft · 4B

FastContext-1.0 is a lightweight repository-exploration subagent for LLM coding agents. Instead of letting a single model both explore the repository and solve the task, FastContext separates these two roles: it is invoked on demand by a main coding agent, issues parallel read-only tool calls (READ, GLOB, GREP), and returns compact file paths and line ranges as focused context.

262K context €0.05 per 1M in 🇪🇺 EU
FastContext 1.0 4B SFT
Microsoft · 4B

FastContext-1.0 is a lightweight repository-exploration subagent for LLM coding agents. Instead of letting a single model both explore the repository and solve the task, FastContext separates these two roles: it is invoked on demand by a main coding agent, issues parallel read-only tool calls (READ, GLOB, GREP), and returns compact file paths and line ranges as focused context.

262K context €0.05 per 1M in 🇪🇺 EU
gemma 4 31B it qat w4a16 ct
Google · 34B

[!Note] This model card is for the new versions of the Gemma 4 family optimized with Quantization-Aware Training (QAT), which allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model. Four versions of the QAT checkpoints are available: Unquantized QAT checkpoints (Q40): Half-precision weights extracted from the QAT pipeline, ideal for custom downstream compilation and research. Available for Gemma 4 E2B, E4B, 12B, 26B A4B, and 31B, and their drafter models. GGUF (Q40): Ready-to-deploy formats for broad ecosystem compatibility. Availabl

€0.25 per 1M in multimodal 🇪🇺 EU
gemma 4 26B A4B it qat q4 0 unquantized
Google · 27B

[!Note] This model card is for the new versions of the Gemma 4 family optimized with Quantization-Aware Training (QAT), which allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model. Four versions of the QAT checkpoints are available: Unquantized QAT checkpoints (Q40): Half-precision weights extracted from the QAT pipeline, ideal for custom downstream compilation and research. Available for Gemma 4 E2B, E4B, 12B, 26B A4B, and 31B, and their drafter models. GGUF (Q40): Ready-to-deploy formats for broad ecosystem compatibility. Availabl

€0.25 per 1M in multimodal 🇪🇺 EU
gemma 4 31B it qat q4 0 unquantized
Google · 33B

[!Note] This model card is for the new versions of the Gemma 4 family optimized with Quantization-Aware Training (QAT), which allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model. Four versions of the QAT checkpoints are available: Unquantized QAT checkpoints (Q40): Half-precision weights extracted from the QAT pipeline, ideal for custom downstream compilation and research. Available for Gemma 4 E2B, E4B, 12B, 26B A4B, and 31B, and their drafter models. GGUF (Q40): Ready-to-deploy formats for broad ecosystem compatibility. Availabl

€0.25 per 1M in multimodal 🇪🇺 EU
DeepSeek V4 Pro
DeepSeek · 862B

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens.

1M context €0.40 per 1M in 🇪🇺 EU
DeepSeek V4 Flash
DeepSeek · 158B

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens.

1M context €0.40 per 1M in 🇪🇺 EU
Qwen3.6 27B FP8
Qwen · 28B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3.6 27B
Qwen · 28B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3.6 35B A3B FP8
Qwen · 36B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.6 35B A3B
Qwen · 36B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU
GLM 5.1 FP8
Z.AI · 754B

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

203K context €0.40 per 1M in 🇪🇺 EU
GLM 5.1
Z.AI · 754B

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

203K context €0.40 per 1M in 🇪🇺 EU
gemma 4 31B
Google · 33B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU
gemma 4 26B A4B
Google · 27B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU
gemma 4 26B A4B it
Google · 27B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU
gemma 4 31B it
Google · 33B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3.5 35B A3B GPTQ Int4
Qwen · 36B

[!Note] This repository contains int4-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.5 27B GPTQ Int4
Qwen · 28B

[!Note] This repository contains int4-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3.5 122B A10B GPTQ Int4
Qwen · 125B

[!Note] This repository contains int4-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.5 0.8B Base
Qwen · 0.9B

[!Note] This repository contains model weights and configuration files for the pre-trained only model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, etc. The intended use cases are fine-tuning, in-context learning experiments, and other research or development purposes, not direct interaction. However, the control tokens, e.g., <|imstart| and <|imend| were trained to allow efficient LoRA-style PEFT with the official chat template, mitigating the need to finetune embeddings, a significant optimization given Qwen3.5's larger

€0.03 per 1M in multimodal 🇪🇺 EU
Qwen3.5 0.8B
Qwen · 0.9B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. In light of its parameter scale, the intended use cases are prototyping, task-specific fine-tuning, and other research or development purposes.

€0.03 per 1M in multimodal 🇪🇺 EU
Qwen3.5 2B
Qwen · 2.3B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. In light of its parameter scale, the intended use cases are prototyping, task-specific fine-tuning, and other research or development purposes.

€0.03 per 1M in multimodal 🇪🇺 EU
Qwen3.5 4B
Qwen · 4.7B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.05 per 1M in multimodal 🇪🇺 EU
Qwen3.5 9B
Qwen · 9.7B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.10 per 1M in multimodal 🇪🇺 EU
Qwen3.5 35B A3B FP8
Qwen · 36B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.5 27B FP8
Qwen · 28B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3.5 122B A10B FP8
Qwen · 125B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.5 122B A10B
Qwen · 125B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.5 27B
Qwen · 28B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3.5 35B A3B
Qwen · 36B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.5 397B A17B FP8
Qwen · 403B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3.5 397B A17B
Qwen · 403B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU
GLM 5
Z.AI · 754B

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

203K context €0.40 per 1M in 🇪🇺 EU
GLM 5 FP8
Z.AI · 754B

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

203K context €0.40 per 1M in 🇪🇺 EU
X Reasoner 7B
Microsoft · 8.3B

We introduce X-Reasoner, a vision-language model posttrained solely on general-domain text for generalizable reasoning, using a twostage approach: an initial supervised fine-tuning phase with distilled long chainof-thoughts, followed by reinforcement learning with verifiable rewards. Experiments show that X-Reasoner successfully transfers reasoning capabilities to both multimodal and out-of-domain settings, outperforming existing state-of-theart models trained with in-domain and multimodal data across various general and medical benchmarks. More details can be found in the paper: X-Reasoner: T

128K context €0.10 per 1M in multimodal 🇪🇺 EU
Qwen3 Coder Next FP8
Qwen · 80B

Today, we're announcing Qwen3-Coder-Next-FP8, an open-weight language model designed specifically for coding agents and local development. It features the following key enhancements:

262K context €0.40 per 1M in 🇪🇺 EU
Qwen3 Coder Next
Qwen · 80B

Today, we're announcing Qwen3-Coder-Next, an open-weight language model designed specifically for coding agents and local development. It features the following key enhancements:

262K context €0.40 per 1M in 🇪🇺 EU
GLM OCR
Z.AI · 1.3B

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. It introduces Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization. The model integrates the CogViT visual encoder pre-trained on large-scale image–text data, a lightweight cross-modal connector with efficient token downsampling, and a GLM-0.5B language decoder. Combined with a two-stage pipeline of layout analysis and parallel recognition based on PP-DocLayout-V3, GLM-OCR deliver

€0.03 per 1M in multimodal 🇪🇺 EU
DeepSeek OCR 2
DeepSeek · 3.4B

Inference using Huggingface transformers on NVIDIA GPUs. Requirements tested on python 3.12.9 + CUDA11.8:

8K context €0.03 per 1M in multimodal 🇪🇺 EU
GLM 4.7 Flash
Z.AI · 31B

GLM-4.7-Flash is a 30B-A3B MoE model. As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.

203K context €0.25 per 1M in 🇪🇺 EU
translategemma 27b it
Google · 29B

translategemma 27b it is een multimodaal taalmodel van Google met 29B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU
translategemma 12b it
Google · 13B

translategemma 12b it is een multimodaal taalmodel van Google met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
medgemma 1.5 4b it
Google · 4.3B

medgemma 1.5 4b it is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU
FrogBoss 32B 2510
Microsoft · 32B

FrogBoss is built on the Qwen3-32B transformer architecture with a maximum context length of 64k tokens. The model uses multi-turn debugging workflows and complex code reasoning. Unlike general-purpose LLMs, FrogBoss is specialized for software engineering tasks.

41K context €0.25 per 1M in 🇪🇺 EU
GLM 4.7 FP8
Z.AI · 358B

GLM-4.7, your new coding partner, is coming with the following features:

203K context €0.40 per 1M in 🇪🇺 EU
GLM 4.7
Z.AI · 358B

GLM-4.7, your new coding partner, is coming with the following features:

203K context €0.40 per 1M in 🇪🇺 EU
OptiMind SFT
Microsoft · 21B

OptiMind-SFT is a specialized 20B parameter model designed to bridge the gap between natural language and executable optimization solvers. It automates the translation of complex decision-making problems—such as supply chain planning, scheduling, and resource allocation—into correct MILP formulations.

131K context €0.25 per 1M in 🇪🇺 EU
AutoGLM Phone 9B Multilingual
Z.AI · 0B

⚠️ This project is intended for research and educational purposes only. Any use for illegal data access, system interference, or unlawful activities is strictly prohibited. Please review our Terms of Use carefully.

€0.03 per 1M in multimodal 🇪🇺 EU
AutoGLM Phone 9B
Z.AI · 0B

⚠️ This project is intended for research and educational purposes only. Any use for illegal data access, system interference, or unlawful activities is strictly prohibited. Please review our Terms of Use carefully.

€0.03 per 1M in multimodal 🇪🇺 EU
GLM 4.6V FP8
Z.AI · 108B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.40 per 1M in multimodal 🇪🇺 EU
GLM 4.6V
Z.AI · 108B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.40 per 1M in multimodal 🇪🇺 EU
GLM 4.6V Flash
Z.AI · 10B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.10 per 1M in multimodal 🇪🇺 EU
DeepSeek V3.2
DeepSeek · 685B

We introduce DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:

164K context €0.40 per 1M in 🇪🇺 EU
Ministral 3 3B Instruct 2512 ONNX
Mistral · 3B

[!Tip] This model was contributed by Xenova from Hugging Face. We sincerely appreciate the integration and community collaboration. While preliminary functionality checks have been performed, comprehensive testing has not yet been completed. We recommend you to proceed with caution and conducting your own evaluations for specific use cases. If any issues arise, open a PR/Issue here and we will try to address them promptly.

€0.03 per 1M in multimodal 🇪🇺 EU
WebVIA Agent
Z.AI · 10B

- Repository: https://github.com/zheny2751-dotcom/WebVIA - Paper: https://arxiv.org/pdf/2511.06251

€0.10 per 1M in multimodal 🇪🇺 EU
UI2Code N
Z.AI

- Repository: https://github.com/zai-org/UI2CodeN - Paper: https://arxiv.org/abs/2511.08195

€0.10 per 1M in multimodal 🇪🇺 EU
Fara 7B
Microsoft · 8.3B

Description: Fara-7B is Microsoft's first agentic small language model (SLM) designed specifically for computer use. With only 7 billion parameters, Fara-7B is an ultra-compact Computer Use Agent (CUA) that achieves state-of-the-art performance within its size class and is competitive with larger, more resource-intensive agentic systems.

128K context €0.10 per 1M in multimodal 🇪🇺 EU
Glyph
Z.AI · 10B

- Repository: https://github.com/thu-coai/Glyph - Paper: https://arxiv.org/abs/2510.17800

€0.10 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 2B Instruct
Qwen · 2.1B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.03 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 32B Instruct FP8
Qwen · 33B

This repository contains an FP8 quantized version of the Qwen3-VL-32B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 32B Thinking FP8
Qwen · 33B

This repository contains an FP8 quantized version of the Qwen3-VL-32B-Thinking model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.25 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 32B Instruct
Qwen · 33B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.25 per 1M in multimodal 🇪🇺 EU
DeepSeek OCR
DeepSeek · 3.3B

torch==2.6.0 transformers==4.46.3 tokenizers==0.20.3 einops addict easydict pip install flash-attn==2.7.3 --no-build-isolation

8K context €0.03 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 8B Instruct FP8
Qwen · 8.8B

This repository contains an FP8 quantized version of the Qwen3-VL-8B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.10 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 4B Instruct FP8
Qwen · 4.8B

This repository contains an FP8 quantized version of the Qwen3-VL-4B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.05 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 8B Instruct
Qwen · 8.8B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.10 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 4B Instruct
Qwen · 4.4B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.05 per 1M in multimodal 🇪🇺 EU
Qwen3 VL 30B A3B Instruct FP8
Qwen · 31B

This repository contains an FP8 quantized version of the Qwen3-VL-30B-A3B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.25 per 1M in multimodal 🇪🇺 EU
UserLM 8b
Microsoft · 8B

Unlike typical LLMs that are trained to play the role of the "assistant" in conversation, we trained UserLM-8b to simulate the “user” role in conversation (by training it to predict user turns in a large corpus of conversations called WildChat). This model is useful in simulating more realistic conversations, which is in turn useful in the development of more robust assistants.

8K context €0.10 per 1M in 🇪🇺 EU
Qwen3 VL 30B A3B Instruct
Qwen · 31B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.25 per 1M in multimodal 🇪🇺 EU
GLM 4.6
Z.AI · 357B

Compared with GLM-4.5, GLM-4.6 brings several key improvements:

203K context €0.40 per 1M in 🇪🇺 EU
GLM 4.6 FP8
Z.AI · 358B

Compared with GLM-4.5, GLM-4.6 brings several key improvements:

203K context €0.40 per 1M in 🇪🇺 EU
DeepSeek V3.2 Exp
DeepSeek · 685B

We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.

164K context €0.40 per 1M in 🇪🇺 EU
DeepSeek V3.1 Terminus
DeepSeek · 685B

This update maintains the model's original capabilities while addressing issues reported by users, including:

164K context €0.40 per 1M in 🇪🇺 EU
Qwen3 VL 235B A22B Instruct
Qwen · 236B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.40 per 1M in multimodal 🇪🇺 EU
gpt oss safeguard 20b
OpenAI · 22B

gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss. With these models, you can classify text content based on safety policies that you provide and perform a suite of foundational safety tasks. These models are intended for safety use cases. For other applications, we recommend using gpt-oss models.

131K context €0.25 per 1M in 🇪🇺 EU
gpt oss safeguard 120b
OpenAI · 120B

gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss. With these models, you can classify text content based on safety policies that you provide and perform a suite of foundational safety tasks. These models are intended for safety use cases. For other applications, we recommend using gpt-oss models.

131K context €0.40 per 1M in 🇪🇺 EU
DeepSeek V3.1
DeepSeek · 685B

DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:

164K context €0.40 per 1M in 🇪🇺 EU
DeepSeek V3.1 Base
DeepSeek · 685B

DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:

164K context €0.40 per 1M in 🇪🇺 EU
GLM 4.5V
Z.AI · 108B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.40 per 1M in multimodal 🇪🇺 EU
GLM 4.5V FP8
Z.AI · 108B

Vision-language models (VLMs) have become a key cornerstone of intelligent systems. As real-world AI tasks grow increasingly complex, VLMs urgently need to enhance reasoning capabilities beyond basic multimodal perception — improving accuracy, comprehensiveness, and intelligence — to enable complex problem solving, long-context understanding, and multimodal agents.

€0.40 per 1M in multimodal 🇪🇺 EU
Qwen3 4B Instruct 2507 FP8
Qwen · 4.4B

We introduce the updated version of the Qwen3-4B-FP8 non-thinking mode, named Qwen3-4B-Instruct-2507-FP8, featuring the following key enhancements:

262K context €0.05 per 1M in 🇪🇺 EU
Qwen3 4B Instruct 2507
Qwen · 4B

We introduce the updated version of the Qwen3-4B non-thinking mode, named Qwen3-4B-Instruct-2507, featuring the following key enhancements:

262K context €0.05 per 1M in 🇪🇺 EU
gpt oss 20b
OpenAI · 22B

Welcome to the gpt-oss series, OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

131K context €0.25 per 1M in 🇪🇺 EU
gpt oss 120b
OpenAI · 120B

Welcome to the gpt-oss series, OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

131K context €0.40 per 1M in 🇪🇺 EU
Qwen3 Coder 30B A3B Instruct FP8
Qwen · 31B

Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct-FP8. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:

262K context €0.25 per 1M in 🇪🇺 EU
Qwen3 Coder 30B A3B Instruct
Qwen · 31B

Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:

262K context €0.25 per 1M in 🇪🇺 EU
GLM 4.5 Base
Z.AI · 358B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU
GLM 4.5 Air FP8
Z.AI · 111B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU
GLM 4.5 FP8
Z.AI · 358B

We present GLM-4.5, an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinking and direct response modes. Through multi-stage training on 23T tokens and comprehensive post-training with expert model iteration and reinforcement learning, GLM-4.5 achieves strong performance across agentic, reasoning, and coding (ARC) tasks, scoring 70.1% on TAU-Bench, 91.0% on AIME 24, and 64.2% on SWE-bench Verified. With much fewer parameters than several competitors, GLM-4.5 ranks 3rd ove

131K context €0.40 per 1M in 🇪🇺 EU
GLM 4.5 Air
Z.AI · 110B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU
GLM 4.5
Z.AI · 358B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU
GLM 4.5 Air Base
Z.AI · 110B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU
MediPhi Instruct
Microsoft · 3.8B

The MediPhi Model Collection comprises 7 small language models of 3.8B parameters from the base model Phi-3.5-mini-instruct specialized in the medical and clinical domains. The collection is designed in a modular fashion. Five MediPhi experts are fine-tuned on various medical corpora (i.e. PubMed commercial, Medical Wikipedia, Medical Guidelines, Medical Coding, and open-source clinical documents) and merged back with the SLERP method in their base model to conserve general abilities. One model combined all five experts into one general expert with the multi-model merging method BreadCrumbs. F

131K context €0.05 per 1M in 🇪🇺 EU
Dayhoff 3b GR HM c
Microsoft · 3B

Dayhoff is an Atlas of both protein sequence data and generative language models — a centralized resource that brings together 3.34 billion protein sequences across 1.7 billion clusters of metagenomic and natural protein sequences (GigaRef), 46 million structure-derived synthetic sequences (BackboneRef), and 16 million multiple sequence alignments (OpenProteinSet). These models can natively predict zero-shot mutation effects on fitness, scaffold structural motifs by conditioning on evolutionary or structural context, and perform guided generation of novel proteins within specified families. Le

262K context €0.03 per 1M in 🇪🇺 EU
GLM 4.1V 9B Thinking
Z.AI · 10B

Vision-Language Models (VLMs) have become foundational components of intelligent systems. As real-world AI tasks grow increasingly complex, VLMs must evolve beyond basic multimodal perception to enhance their reasoning capabilities in complex tasks. This involves improving accuracy, comprehensiveness, and intelligence, enabling applications such as complex problem solving, long-context understanding, and multimodal agents.

€0.10 per 1M in multimodal 🇪🇺 EU
GLM 4.1V 9B Base
Z.AI · 10B

Vision-Language Models (VLMs) have become foundational components of intelligent systems. As real-world AI tasks grow increasingly complex, VLMs must evolve beyond basic multimodal perception to enhance their reasoning capabilities in complex tasks. This involves improving accuracy, comprehensiveness, and intelligence, enabling applications such as complex problem solving, long-context understanding, and multimodal agents.

€0.10 per 1M in multimodal 🇪🇺 EU
Phi tiny MoE instruct
Microsoft · 3.8B

Phi-tiny-MoE is a lightweight Mixture of Experts (MoE) model with 3.8B total parameters and 1.1B activated parameters. It is compressed and distilled from the base model shared by Phi-3.5-MoE and GRIN-MoE using the SlimMoE approach, then post-trained via supervised fine-tuning and direct preference optimization for instruction following and safety. The model is trained on Phi-3 synthetic data and filtered public documents, with a focus on high-quality, reasoning-dense content. It is part of the SlimMoE series, which includes a larger variant, Phi-mini-MoE, with 7.6B total and 2.4B activated pa

4K context €0.05 per 1M in 🇪🇺 EU
Phi mini MoE instruct
Microsoft · 7.6B

Phi-mini-MoE is a lightweight Mixture of Experts (MoE) model with 7.6B total parameters and 2.4B activated parameters. It is compressed and distilled from the base model shared by Phi-3.5-MoE and GRIN-MoE using the SlimMoE approach, then post-trained via supervised fine-tuning and direct preference optimization for instruction following and safety. The model is trained on Phi-3 synthetic data and filtered public documents, with a focus on high-quality, reasoning-dense content. It is part of the SlimMoE series, which includes a smaller variant, Phi-tiny-MoE, with 3.8B total and 1.1B activated p

4K context €0.10 per 1M in 🇪🇺 EU
gemma 3n E2B it
Google · 5.4B

gemma 3n E2B it is een multimodaal taalmodel van Google met 5.4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU
gemma 3n E4B it
Google · 7.8B

gemma 3n E4B it is een multimodaal taalmodel van Google met 7.8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
GUI Actor Verifier 2B
Microsoft · 2.2B

This model was introduced in the paper GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents. It is developed based on UI-TARS-2B-SFT and is designed to predict the correctness of an action position given a language instruction. This model is well-suited for GUI-Actor, as its attention map effectively provides diverse candidates for verification with only a single inference.

33K context €0.03 per 1M in multimodal 🇪🇺 EU
DeepSeek R1 0528 Qwen3 8B
DeepSeek · 8.2B

The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528. In the latest update, DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic. Its overall performance is now approaching that of leading models, such as O3 and Gemini 2.5 Pro.

131K context €0.10 per 1M in 🇪🇺 EU
DeepSeek R1 0528
DeepSeek · 685B

The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528. In the latest update, DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic. Its overall performance is now approaching that of leading models, such as O3 and Gemini 2.5 Pro.

164K context €0.40 per 1M in 🇪🇺 EU
medgemma 27b text it
Google · 27B

medgemma 27b text it is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
medgemma 4b it
Google · 4.3B

medgemma 4b it is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU
Qwen3 14B AWQ
Qwen · 15B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.15 per 1M in 🇪🇺 EU
Phi 4 mini reasoning
Microsoft · 3.8B

Phi-4-mini-reasoning is a lightweight open model built upon synthetic data with a focus on high-quality, reasoning dense data further finetuned for more advanced math reasoning capabilities. The model belongs to the Phi-4 model family and supports 128K token context length.

131K context €0.05 per 1M in 🇪🇺 EU
Qwen3 0.6B FP8
Qwen · 0.8B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.03 per 1M in 🇪🇺 EU
Qwen3 1.7B Base
Qwen · 1.7B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Building upon extensive advancements in training data, model architecture, and optimization techniques, Qwen3 delivers the following key improvements over the previously released Qwen2.5:

33K context €0.03 per 1M in 🇪🇺 EU
Qwen3 4B Base
Qwen · 4B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Building upon extensive advancements in training data, model architecture, and optimization techniques, Qwen3 delivers the following key improvements over the previously released Qwen2.5:

33K context €0.05 per 1M in 🇪🇺 EU
Qwen3 235B A22B
Qwen · 235B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.40 per 1M in 🇪🇺 EU
Qwen3 32B
Qwen · 33B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.25 per 1M in 🇪🇺 EU
Qwen3 30B A3B
Qwen · 31B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.25 per 1M in 🇪🇺 EU
Qwen3 14B
Qwen · 15B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.15 per 1M in 🇪🇺 EU
Qwen3 8B
Qwen · 8.2B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.10 per 1M in 🇪🇺 EU
Qwen3 4B
Qwen · 4B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.05 per 1M in 🇪🇺 EU
Qwen3 1.7B
Qwen · 2B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.03 per 1M in 🇪🇺 EU
Qwen3 0.6B
Qwen · 0.8B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.03 per 1M in 🇪🇺 EU
Llama Guard 4 12B
Meta · 12B

Llama Guard 4 12B is een multimodaal taalmodel van Meta met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
Phi 4 reasoning plus
Microsoft · 15B

[!IMPORTANT] To fully take advantage of the model's capabilities, inference must use temperature=0.8, topk=50, topp=0.95, and dosample=True. For more complex queries, set maxnewtokens=32768 to allow for longer chain-of-thought (CoT).

33K context €0.15 per 1M in 🇪🇺 EU
Phi 4 reasoning
Microsoft · 15B

[!IMPORTANT] To fully take advantage of the model's capabilities, inference must use temperature=0.8, topk=50, topp=0.95, and dosample=True. For more complex queries, set maxnewtokens=32768 to allow for longer chain-of-thought (CoT).

33K context €0.15 per 1M in 🇪🇺 EU
gemma 3 12b it qat q4 0 unquantized
Google · 12B

gemma 3 12b it qat q4 0 unquantized is een multimodaal taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
GLM Z1 32B 0414
Z.AI · 33B

The GLM family welcomes a new generation of open-source models, the GLM-4-32B-0414 series, featuring 32 billion parameters. Its performance is comparable to OpenAI's GPT series and DeepSeek's V3/R1 series, and it supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including a large amount of reasoning-type synthetic data, laying the foundation for subsequent reinforcement learning extensions. In the post-training stage, in addition to human preference alignment for dialogue scenarios, we also enhanced the model's performance i

33K context €0.25 per 1M in 🇪🇺 EU
GLM Z1 9B 0414
Z.AI · 9.4B

The GLM family welcomes a new generation of open-source models, the GLM-4-32B-0414 series, featuring 32 billion parameters. Its performance is comparable to OpenAI's GPT series and DeepSeek's V3/R1 series, and it supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including a large amount of reasoning-type synthetic data, laying the foundation for subsequent reinforcement learning extensions. In the post-training stage, in addition to human preference alignment for dialogue scenarios, we also enhanced the model's performance i

33K context €0.10 per 1M in 🇪🇺 EU
GLM 4 32B 0414
Z.AI · 33B

The GLM family welcomes new members, the GLM-4-32B-0414 series models, featuring 32 billion parameters. Its performance is comparable to OpenAI’s GPT series and DeepSeek’s V3/R1 series. It also supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including substantial reasoning-type synthetic data. This lays the foundation for subsequent reinforcement learning extensions. In the post-training stage, we employed human preference alignment for dialogue scenarios. Additionally, using techniques like rejection sampling and reinforc

33K context €0.25 per 1M in 🇪🇺 EU
GLM 4 9B 0414
Z.AI · 9.4B

The GLM family welcomes new members, the GLM-4-32B-0414 series models, featuring 32 billion parameters. Its performance is comparable to OpenAI’s GPT series and DeepSeek’s V3/R1 series. It also supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including substantial reasoning-type synthetic data. This lays the foundation for subsequent reinforcement learning extensions. In the post-training stage, we employed human preference alignment for dialogue scenarios. Additionally, using techniques like rejection sampling and reinforc

33K context €0.10 per 1M in 🇪🇺 EU
Llama 4 Maverick 17B 128E
Meta · 402B

Llama 4 Maverick 17B 128E is een multimodaal taalmodel van Meta met 402B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU
Llama 4 Scout 17B 16E
Meta · 109B

Llama 4 Scout 17B 16E is een multimodaal taalmodel van Meta met 109B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU
Llama 4 Scout 17B 16E Instruct
Meta · 109B

Llama 4 Scout 17B 16E Instruct is een multimodaal taalmodel van Meta met 109B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU
Llama 4 Maverick 17B 128E Instruct
Meta · 402B

Llama 4 Maverick 17B 128E Instruct is een multimodaal taalmodel van Meta met 402B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU
Llama 4 Maverick 17B 128E Instruct FP8
Meta · 402B

Llama 4 Maverick 17B 128E Instruct FP8 is een multimodaal taalmodel van Meta met 402B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU
DeepSeek V3 0324
DeepSeek · 685B

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects.

164K context €0.40 per 1M in 🇪🇺 EU
txgemma 9b chat
Google · 9.2B

txgemma 9b chat is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
txgemma 2b predict
Google · 2.6B

txgemma 2b predict is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen2.5 VL 32B Instruct
Qwen · 33B

In the past five months since Qwen2-VL’s release, numerous developers have built new models on the Qwen2-VL vision-language models, providing us with valuable feedback. During this period, we focused on building more useful vision-language models. Today, we are excited to introduce the latest addition to the Qwen family: Qwen2.5-VL.

128K context €0.25 per 1M in multimodal 🇪🇺 EU
gemma 3 1b it
Google · 1B

gemma 3 1b it is een open-source taalmodel van Google met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
gemma 3 12b it
Google · 12B

gemma 3 12b it is een multimodaal taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
gemma 3 12b pt
Google · 12B

gemma 3 12b pt is een multimodaal taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
gemma 3 27b it
Google · 27B

gemma 3 27b it is een multimodaal taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU
gemma 3 27b pt
Google · 27B

gemma 3 27b pt is een multimodaal taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU
gemma 3 1b pt
Google · 1B

gemma 3 1b pt is een open-source taalmodel van Google met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
gemma 3 4b it
Google · 4.3B

gemma 3 4b it is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU
gemma 3 4b pt
Google · 4.3B

gemma 3 4b pt is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU
Phi 4 mini instruct
Microsoft · 3.8B

🎉Phi-4: [mini-reasoning | reasoning] | [multimodal-instruct | onnx]; [mini-instruct | onnx]

131K context €0.05 per 1M in 🇪🇺 EU
Qwen2.5 VL 7B Instruct AWQ
Qwen · 8.3B

In the past five months since Qwen2-VL’s release, numerous developers have built new models on the Qwen2-VL vision-language models, providing us with valuable feedback. During this period, we focused on building more useful vision-language models. Today, we are excited to introduce the latest addition to the Qwen family: Qwen2.5-VL.

128K context €0.10 per 1M in multimodal 🇪🇺 EU
Qwen2.5 VL 3B Instruct AWQ
Qwen · 3.8B

--- licensename: qwen-research licenselink: https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct-AWQ/blob/main/LICENSE language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers basemodel: - Qwen/Qwen2.5-VL-3B-Instruct ---

128K context €0.05 per 1M in multimodal 🇪🇺 EU
Qwen2.5 VL 72B Instruct
Qwen · 73B

--- license: other licensename: qwen licenselink: https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct/blob/main/LICENSE language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers ---

128K context €0.40 per 1M in multimodal 🇪🇺 EU
Qwen2.5 VL 7B Instruct
Qwen · 8.3B

--- license: apache-2.0 language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers ---

128K context €0.10 per 1M in multimodal 🇪🇺 EU
Qwen2.5 VL 3B Instruct
Qwen · 3.8B

--- licensename: qwen-research licenselink: https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct/blob/main/LICENSE language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers ---

128K context €0.05 per 1M in multimodal 🇪🇺 EU
DeepSeek R1 Distill Qwen 32B
DeepSeek · 33B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.25 per 1M in 🇪🇺 EU
DeepSeek R1 Distill Qwen 7B
DeepSeek · 7.6B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.10 per 1M in 🇪🇺 EU
DeepSeek R1 Distill Llama 70B
DeepSeek · 71B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.40 per 1M in 🇪🇺 EU
DeepSeek R1 Distill Llama 8B
DeepSeek · 8B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.10 per 1M in 🇪🇺 EU
DeepSeek R1 Distill Qwen 1.5B
DeepSeek · 1.8B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.03 per 1M in 🇪🇺 EU
DeepSeek R1
DeepSeek · 685B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

164K context €0.40 per 1M in 🇪🇺 EU
DeepSeek R1 Zero
DeepSeek · 685B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

164K context €0.40 per 1M in 🇪🇺 EU
glm 4 9b hf
Z.AI · 9.4B

If you are using the weights from this repository, please update to

8K context €0.10 per 1M in 🇪🇺 EU
DeepSeek V3
DeepSeek · 685B

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning

164K context €0.40 per 1M in 🇪🇺 EU
cogagent 9b 20241220
Z.AI · 14B

The CogAgent-9B-20241220 model is based on GLM-4V-9B, a bilingual open-source VLM base model. Through data collection and optimization, multi-stage training, and strategy improvements, CogAgent-9B-20241220 achieves significant advancements in GUI perception, inference prediction accuracy, action space completeness, and task generalizability. The model supports bilingual (Chinese and English) interaction with both screenshots and language input.

€0.10 per 1M in multimodal 🇪🇺 EU
phi 4
Microsoft · 15B

Our training data is an extension of the data used for Phi-3 and includes a wide variety of sources from:

16K context €0.15 per 1M in 🇪🇺 EU
Llama 3.3 70B Instruct
Meta · 71B

Llama 3.3 70B Instruct is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
glm edge v 5b
Z.AI · 4.9B

Install the transformers library from the source code:

4K context €0.05 per 1M in multimodal 🇪🇺 EU
glm edge v 2b
Z.AI · 2.1B

Install the transformers library from the source code:

4K context €0.03 per 1M in multimodal 🇪🇺 EU
paligemma2 28b mix 448
Google · 28B

paligemma2 28b mix 448 is een multimodaal taalmodel van Google met 28B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU
paligemma2 3b pt 448
Google · 3B

paligemma2 3b pt 448 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
paligemma2 3b pt 224
Google · 3B

paligemma2 3b pt 224 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
paligemma2 3b ft docci 448
Google · 3B

paligemma2 3b ft docci 448 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
paligemma2 3b mix 224
Google · 3B

paligemma2 3b mix 224 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
glm edge 4b chat
Z.AI · 4.3B

Install the transformers library from the source code:

8K context €0.05 per 1M in 🇪🇺 EU
glm edge 1.5b chat
Z.AI · 1.6B

Install the transformers library from the source code:

8K context €0.03 per 1M in 🇪🇺 EU
Qwen2.5 Coder 14B Instruct AWQ
Qwen · 15B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.15 per 1M in 🇪🇺 EU
Qwen2.5 Coder 32B Instruct AWQ
Qwen · 33B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.25 per 1M in 🇪🇺 EU
Qwen2.5 Coder 32B Instruct
Qwen · 33B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.25 per 1M in 🇪🇺 EU
Qwen2.5 Coder 14B Instruct
Qwen · 15B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.15 per 1M in 🇪🇺 EU
glm 4 9b chat 1m hf
Z.AI · 9.5B

If you are using the weights from this repository, please update to

1M context €0.10 per 1M in 🇪🇺 EU
glm 4 9b chat hf
Z.AI · 9.4B

If you are using the weights from this repository, please update to

131K context €0.10 per 1M in 🇪🇺 EU
OmniParser
Microsoft

This model hub includes a finetuned version of YOLOv8 and a finetuned BLIP-2 model on the above dataset respectively. For more details of the models used and finetuning, please refer to the paper.

€0.10 per 1M in multimodal 🇪🇺 EU
gemma 2 2b jpn it
Google · 2.6B

gemma 2 2b jpn it is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama Guard 3 1B
Meta · 1.5B

Llama Guard 3 1B is een open-source taalmodel van Meta met 1.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama Guard 3 11B Vision
Meta · 11B

Llama Guard 3 11B Vision is een multimodaal taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
Llama 3.2 90B Vision Instruct
Meta · 89B

Llama 3.2 90B Vision Instruct is een multimodaal taalmodel van Meta met 89B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU
Llama 3.2 90B Vision
Meta · 89B

Llama 3.2 90B Vision is een multimodaal taalmodel van Meta met 89B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU
Llama 3.2 11B Vision Instruct
Meta · 11B

Llama 3.2 11B Vision Instruct is een multimodaal taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
Llama 3.2 11B Vision
Meta · 11B

Llama 3.2 11B Vision is een multimodaal taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU
Llama 3.2 3B
Meta · 3.2B

Llama 3.2 3B is een open-source taalmodel van Meta met 3.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama 3.2 3B Instruct
Meta · 3.2B

Llama 3.2 3B Instruct is een open-source taalmodel van Meta met 3.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama 3.2 1B Instruct
Meta · 1.2B

Llama 3.2 1B Instruct is een open-source taalmodel van Meta met 1.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama 3.2 1B
Meta · 1.2B

Llama 3.2 1B is een open-source taalmodel van Meta met 1.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen2.5 Coder 1.5B Instruct
Qwen · 1.5B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.03 per 1M in 🇪🇺 EU
Qwen2.5 1.5B Instruct
Qwen · 1.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU
Qwen2.5 3B Instruct
Qwen · 3.1B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU
Qwen2.5 72B Instruct AWQ
Qwen · 73B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.40 per 1M in 🇪🇺 EU
Qwen2.5 32B Instruct AWQ
Qwen · 33B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.25 per 1M in 🇪🇺 EU
Qwen2.5 14B Instruct AWQ
Qwen · 15B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.15 per 1M in 🇪🇺 EU
Qwen2.5 7B Instruct AWQ
Qwen · 7.6B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.10 per 1M in 🇪🇺 EU
Qwen2.5 1.5B Instruct AWQ
Qwen · 1.8B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU
Qwen2.5 Coder 7B Instruct
Qwen · 7.6B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.10 per 1M in 🇪🇺 EU
Qwen2.5 32B Instruct
Qwen · 33B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.25 per 1M in 🇪🇺 EU
Qwen2.5 14B Instruct
Qwen · 15B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.15 per 1M in 🇪🇺 EU
Qwen2.5 7B Instruct
Qwen · 7.6B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.10 per 1M in 🇪🇺 EU
Qwen2.5 0.5B Instruct
Qwen · 0.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU
Qwen2.5 1.5B
Qwen · 1.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

131K context €0.03 per 1M in 🇪🇺 EU
Qwen2.5 0.5B
Qwen · 0.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU
gemma 7b aps it
Google · 8.5B

gemma 7b aps it is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
DeepSeek Coder V2 Instruct 0724
DeepSeek · 236B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.40 per 1M in 🇪🇺 EU
DeepSeek V2.5
DeepSeek · 236B

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions. For model details, please visit DeepSeek-V2 page for more information.

164K context €0.40 per 1M in 🇪🇺 EU
Qwen2 VL 7B Instruct AWQ
Qwen · 8.3B

We're excited to unveil Qwen2-VL, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.

33K context €0.10 per 1M in multimodal 🇪🇺 EU
Qwen2 VL 7B Instruct
Qwen · 8.3B

We're excited to unveil Qwen2-VL, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.

33K context €0.10 per 1M in multimodal 🇪🇺 EU
Qwen2 VL 2B Instruct
Qwen · 2.2B

We're excited to unveil Qwen2-VL, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.

33K context €0.03 per 1M in multimodal 🇪🇺 EU
Phi 3.5 MoE instruct
Microsoft · 42B

Phi-3.5-MoE is a lightweight, state-of-the-art open model built upon datasets used for Phi-3 - synthetic data and filtered publicly available documents - with a focus on very high-quality, reasoning dense data. The model supports multilingual and comes with 128K context length (in tokens). The model underwent a rigorous enhancement process, incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure precise instruction adherence and robust safety measures.

131K context €0.40 per 1M in 🇪🇺 EU
Phi 3.5 vision instruct
Microsoft · 4.1B

Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which include - synthetic data and filtered publicly available websites - with a focus on very high-quality, reasoning dense data both on text and vision. The model belongs to the Phi-3 model family, and the multimodal version comes with 128K context length (in tokens) it can support. The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.

131K context €0.05 per 1M in multimodal 🇪🇺 EU
Phi 3.5 mini instruct
Microsoft · 3.8B

🎉Phi-4: [multimodal-instruct | onnx]; [mini-instruct | onnx]

131K context €0.05 per 1M in 🇪🇺 EU
LongWriter glm4 9b
Z.AI · 9.4B

LongWriter-glm4-9b is trained based on glm-4-9b, and is capable of generating 10,000+ words at once.

€0.10 per 1M in 🇪🇺 EU
LongWriter llama3.1 8b
Z.AI · 8B

LongWriter-llama3.1-8b is trained based on Meta-Llama-3.1-8B, and is capable of generating 10,000+ words at once.

131K context €0.10 per 1M in 🇪🇺 EU
gemma 2b AWQ
Google · 3B

AWQ quantized version of google/gemma-2b.

8K context €0.03 per 1M in 🇪🇺 EU
Llama Guard 3 8B
Meta · 8B

Llama Guard 3 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Llama Guard 3 8B INT8
Meta · 8B

Llama Guard 3 8B INT8 is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Llama 3.1 405B FP8
Meta · 406B

Llama 3.1 405B FP8 is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 3.1 405B Instruct FP8
Meta · 406B

Llama 3.1 405B Instruct FP8 is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 3.1 8B Instruct
Meta · 8B

Llama 3.1 8B Instruct is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
DeepSeek V2 Chat 0628
DeepSeek · 236B

DeepSeek-V2-Chat-0628 is an improved version of DeepSeek-V2-Chat. For model details, please visit DeepSeek-V2 page for more information.

164K context €0.40 per 1M in 🇪🇺 EU
shieldgemma 9b
Google · 9.2B

shieldgemma 9b is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
shieldgemma 2b
Google · 2.6B

shieldgemma 2b is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama 3.1 405B Instruct
Meta · 406B

Llama 3.1 405B Instruct is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 3.1 70B Instruct
Meta · 71B

Llama 3.1 70B Instruct is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
gemma 2 2b it
Google · 2.6B

gemma 2 2b it is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
gemma 2 2b
Google · 2.6B

gemma 2 2b is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama 3.1 405B
Meta · 406B

Llama 3.1 405B is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 3.1 70B
Meta · 71B

Llama 3.1 70B is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 3.1 8B
Meta · 8B

Llama 3.1 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
codegeex4 all 9b
Z.AI · 9.4B

We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on the GLM-4-9B, significantly enhancing its code generation capabilities. Using a single CodeGeeX4-ALL-9B model, it can support comprehensive functions such as code completion and generation, code interpreter, web search, function call, repository-level code Q&A, covering various scenarios of software development. CodeGeeX4-ALL-9B has achieved highly competitive performance on public benchmarks, such as BigCodeBench and NaturalCodeBench. I

€0.10 per 1M in 🇪🇺 EU
gemma 2 9b
Google · 9.2B

gemma 2 9b is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
gemma 2 9b it
Google · 9.2B

gemma 2 9b it is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
gemma 2 27b
Google · 27B

gemma 2 27b is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
gemma 2 27b it
Google · 27B

gemma 2 27b it is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Florence 2 base ft
Microsoft · 0.2B

This Hub repository contains a HuggingFace's transformers implementation of Florence-2 model from Microsoft.

€0.03 per 1M in multimodal 🇪🇺 EU
Florence 2 large ft
Microsoft · 0.8B

This Hub repository contains a HuggingFace's transformers implementation of Florence-2 model from Microsoft.

€0.03 per 1M in multimodal 🇪🇺 EU
Florence 2 base
Microsoft · 0.2B

This Hub repository contains a HuggingFace's transformers implementation of Florence-2 model from Microsoft.

€0.03 per 1M in multimodal 🇪🇺 EU
Florence 2 large
Microsoft · 0.8B

This is a continued pretrained version of Florence-2-large model with 4k context length, only 0.1B samples are used for continue pretraining, thus it might not be trained well. In addition, OCR task has been updated with line separator ('\n'). COCO OD AP 39.8

€0.03 per 1M in multimodal 🇪🇺 EU
DeepSeek Coder V2 Lite Instruct
DeepSeek · 16B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.15 per 1M in 🇪🇺 EU
DeepSeek Coder V2 Lite Base
DeepSeek · 16B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.15 per 1M in 🇪🇺 EU
DeepSeek Coder V2 Instruct
DeepSeek · 236B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.40 per 1M in 🇪🇺 EU
Qwen2 7B Instruct
Qwen · 7.6B

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 7B Qwen2 model.

33K context €0.10 per 1M in 🇪🇺 EU
glm 4 9b
Z.AI · 9.4B

2024/08/12, 本仓库代码已更新并使用 transformers=4.44.0, 请及时更新依赖。

€0.10 per 1M in 🇪🇺 EU
Qwen2 1.5B Instruct
Qwen · 1.5B

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 1.5B Qwen2 model.

33K context €0.03 per 1M in 🇪🇺 EU
Qwen2 0.5B
Qwen · 0.5B

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the 0.5B Qwen2 base language model.

131K context €0.03 per 1M in 🇪🇺 EU
Phi 3 vision 128k instruct
Microsoft · 4.1B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

131K context €0.05 per 1M in 🇪🇺 EU
DeepSeek V2 Lite Chat
DeepSeek · 16B

Last week, the release and buzz around DeepSeek-V2 have ignited widespread interest in MLA (Multi-head Latent Attention)! Many in the community suggested open-sourcing a smaller MoE model for in-depth research. And now DeepSeek-V2-Lite comes out:

164K context €0.15 per 1M in 🇪🇺 EU
DeepSeek V2 Lite
DeepSeek · 16B

Last week, the release and buzz around DeepSeek-V2 have ignited widespread interest in MLA (Multi-head Latent Attention)! Many in the community suggested open-sourcing a smaller MoE model for in-depth research. And now DeepSeek-V2-Lite comes out:

164K context €0.15 per 1M in 🇪🇺 EU
paligemma 3b ft cococap 448
Google · 2.9B

paligemma 3b ft cococap 448 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
paligemma 3b pt 448
Google · 2.9B

paligemma 3b pt 448 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
paligemma 3b mix 224
Google · 2.9B

paligemma 3b mix 224 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
paligemma 3b pt 224
Google · 2.9B

paligemma 3b pt 224 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU
Phi 3 small 128k instruct
Microsoft · 7.4B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

131K context €0.10 per 1M in 🇪🇺 EU
Phi 3 small 8k instruct
Microsoft · 7.4B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

8K context €0.10 per 1M in 🇪🇺 EU
Phi 3 medium 128k instruct
Microsoft · 14B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

131K context €0.15 per 1M in 🇪🇺 EU
Phi 3 medium 4k instruct
Microsoft · 14B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

4K context €0.15 per 1M in 🇪🇺 EU
DeepSeek V2 Chat
DeepSeek · 236B

Due to the constraints of HuggingFace, the open-source code currently experiences slower performance than our internal codebase when running on GPUs with Huggingface. To facilitate the efficient execution of our model, we offer a dedicated vllm solution that optimizes performance for running our model effectively.

164K context €0.40 per 1M in 🇪🇺 EU
Phi 3 mini 128k instruct
Microsoft · 3.8B

🎉Phi-4: [multimodal-instruct | onnx]; [mini-instruct | onnx]

131K context €0.05 per 1M in 🇪🇺 EU
Phi 3 mini 4k instruct
Microsoft · 3.8B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

4K context €0.05 per 1M in 🇪🇺 EU
DeepSeek V2
DeepSeek · 236B

Due to the constraints of HuggingFace, the open-source code currently experiences slower performance than our internal codebase when running on GPUs with Huggingface. To facilitate the efficient execution of our model, we offer a dedicated vllm solution that optimizes performance for running our model effectively.

164K context €0.40 per 1M in 🇪🇺 EU
Meta Llama Guard 2 8B
Meta · 8B

Meta Llama Guard 2 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Meta Llama 3 8B
Meta · 8B

Meta Llama 3 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Meta Llama 3 8B Instruct
Meta · 8B

Meta Llama 3 8B Instruct is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Meta Llama 3 70B Instruct
Meta · 71B

Meta Llama 3 70B Instruct is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Meta Llama 3 70B
Meta · 71B

Meta Llama 3 70B is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
gemma 1.1 2b it
Google · 2.5B

gemma 1.1 2b it is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
gemma 1.1 7b it
Google · 8.5B

gemma 1.1 7b it is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
codegemma 7b
Google · 8.5B

codegemma 7b is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
codegemma 2b
Google · 2.5B

codegemma 2b is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
CodeLlama 34b Instruct hf
Meta · 34B

CodeLlama 34b Instruct hf is een open-source taalmodel van Meta met 34B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
CodeLlama 70b Instruct hf
Meta · 69B

CodeLlama 70b Instruct hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
CodeLlama 70b hf
Meta · 69B

CodeLlama 70b hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
CodeLlama 13b Instruct hf
Meta · 13B

CodeLlama 13b Instruct hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
CodeLlama 13b hf
Meta · 13B

CodeLlama 13b hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
CodeLlama 7b Instruct hf
Meta · 6.7B

CodeLlama 7b Instruct hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
CodeLlama 7b Python hf
Meta · 6.7B

CodeLlama 7b Python hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
CodeLlama 7b hf
Meta · 7B

CodeLlama 7b hf is een open-source taalmodel van Meta met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
gemma 7b it
Google · 8.5B

gemma 7b it is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
gemma 7b
Google · 8.5B

gemma 7b is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
gemma 2b it
Google · 2.5B

gemma 2b it is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
gemma 2b
Google · 2.5B

gemma 2b is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
deepseek math 7b instruct
DeepSeek · 7B

❗❗❗ Please use chain-of-thought prompt to test DeepSeekMath-Instruct and DeepSeekMath-RL:

4K context €0.10 per 1M in 🇪🇺 EU
deepseek coder 7b instruct v1.5
DeepSeek · 6.9B

Deepseek-Coder-7B-Instruct-v1.5 is continue pre-trained from Deepseek-LLM 7B on 2T tokens by employing a window size of 4K and next token prediction objective, and then fine-tuned on 2B tokens of instruction data.

4K context €0.05 per 1M in 🇪🇺 EU
deepseek moe 16b chat
DeepSeek · 16B

python import torch from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

4K context €0.15 per 1M in 🇪🇺 EU
deepseek moe 16b base
DeepSeek · 16B

modelname = "deepseek-ai/deepseek-moe-16b-base" tokenizer = AutoTokenizer.frompretrained(modelname) model = AutoModelForCausalLM.frompretrained(modelname, torchdtype=torch.bfloat16, devicemap="auto") model.generationconfig = GenerationConfig.frompretrained(modelname) model.generationconfig.padtokenid = model.generationconfig.eostokenid

4K context €0.15 per 1M in 🇪🇺 EU
phi 2
Microsoft · 2.8B

Phi-2 is a Transformer with 2.7 billion parameters. It was trained using the same data sources as Phi-1.5, augmented with a new data source that consists of various NLP synthetic texts and filtered websites (for safety and educational value). When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-2 showcased a nearly state-of-the-art performance among models with less than 13 billion parameters.

2K context €0.03 per 1M in 🇪🇺 EU
Mistral 7B Instruct v0.2
Mistral · 7.2B

py from mistralcommon.tokens.tokenizers.mistral import MistralTokenizer from mistralcommon.protocol.instruct.messages import UserMessage from mistralcommon.protocol.instruct.request import ChatCompletionRequest

33K context €0.10 per 1M in 🇪🇺 EU
LlamaGuard 7b
Meta · 6.7B

LlamaGuard 7b is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
deepseek llm 67b base
DeepSeek · 67B

Introducing DeepSeek LLM, an advanced language model comprising 67 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. In order to foster research, we have made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat open source for the research community.

4K context €0.40 per 1M in 🇪🇺 EU
deepseek llm 7b chat
DeepSeek · 7B

Introducing DeepSeek LLM, an advanced language model comprising 7 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. In order to foster research, we have made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat open source for the research community.

4K context €0.10 per 1M in 🇪🇺 EU
deepseek llm 7b base
DeepSeek · 7B

Introducing DeepSeek LLM, an advanced language model comprising 7 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. In order to foster research, we have made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat open source for the research community.

4K context €0.10 per 1M in 🇪🇺 EU
BPO
Z.AI

- Repository: https://github.com/thu-coai/BPO - Paper: https://arxiv.org/abs/2311.04155 - Data: https://huggingface.co/datasets/THUDM/BPO

4K context €0.10 per 1M in 🇪🇺 EU
deepseek coder 33b instruct
DeepSeek · 33B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.25 per 1M in 🇪🇺 EU
deepseek coder 1.3b instruct
DeepSeek · 1.3B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.03 per 1M in 🇪🇺 EU
deepseek coder 6.7b instruct
DeepSeek · 6.7B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.05 per 1M in 🇪🇺 EU
deepseek coder 1.3b base
DeepSeek · 1.3B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.03 per 1M in 🇪🇺 EU
deepseek coder 6.7b base
DeepSeek · 6.7B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.05 per 1M in 🇪🇺 EU
Mistral 7B Instruct v0.1
Mistral · 7.2B

py from mistralcommon.tokens.tokenizers.mistral import MistralTokenizer from mistralcommon.protocol.instruct.messages import UserMessage from mistralcommon.protocol.instruct.request import ChatCompletionRequest

33K context €0.10 per 1M in 🇪🇺 EU
Mistral 7B v0.1
Mistral · 7.2B

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

33K context €0.10 per 1M in 🇪🇺 EU
phi 1
Microsoft · 1.4B

The language model Phi-1 is a Transformer with 1.3 billion parameters, specialized for basic Python coding. Its training involved a variety of data sources, including subsets of Python codes from The Stack v1.2, Q&A content from StackOverflow, competition code from codecontests, and synthetic Python textbooks and exercises generated by gpt-3.5-turbo-0301. Even though the model and the datasets are relatively small compared to contemporary Large Language Models (LLMs), Phi-1 has demonstrated an impressive accuracy rate exceeding 50% on the simple Python coding benchmark, HumanEval.

2K context €0.03 per 1M in 🇪🇺 EU
phi 1 5
Microsoft · 1.4B

The language model Phi-1.5 is a Transformer with 1.3 billion parameters. It was trained using the same data sources as phi-1, augmented with a new data source that consists of various NLP synthetic texts. When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-1.5 demonstrates a nearly state-of-the-art performance among models with less than 10 billion parameters.

2K context €0.03 per 1M in 🇪🇺 EU
Llama 2 70b chat hf
Meta · 69B

Llama 2 70b chat hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 2 7b chat hf
Meta · 6.7B

Llama 2 7b chat hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Llama 2 7b hf
Meta · 6.7B

Llama 2 7b hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Llama 2 13b hf
Meta · 13B

Llama 2 13b hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Llama 2 13b chat hf
Meta · 13B

Llama 2 13b chat hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Llama 2 70b hf
Meta · 69B

Llama 2 70b hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
CodeGPT small java
Microsoft

CodeGPT small java is een open-source taalmodel van Microsoft, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
CodeGPT small java adaptedGPT2
Microsoft

CodeGPT small java adaptedGPT2 is een open-source taalmodel van Microsoft, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
CodeGPT small py
Microsoft

CodeGPT small py is een open-source taalmodel van Microsoft, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
DialoGPT large
Microsoft

DialoGPT is a SOTA large-scale pretrained dialogue response generation model for multiturn conversations. The human evaluation results indicate that the response generated from DialoGPT is comparable to human response quality under a single-turn conversation Turing test. The model is trained on 147M multi-turn dialogue from Reddit discussion thread.

€0.10 per 1M in 🇪🇺 EU
DialoGPT medium
Microsoft

DialoGPT is a SOTA large-scale pretrained dialogue response generation model for multiturn conversations. The human evaluation results indicate that the response generated from DialoGPT is comparable to human response quality under a single-turn conversation Turing test. The model is trained on 147M multi-turn dialogue from Reddit discussion thread.

€0.10 per 1M in 🇪🇺 EU
DialoGPT small
Microsoft · 0.2B

DialoGPT is a SOTA large-scale pretrained dialogue response generation model for multiturn conversations. The human evaluation results indicate that the response generated from DialoGPT is comparable to human response quality under a single-turn conversation Turing test. The model is trained on 147M multi-turn dialogue from Reddit discussion thread.

€0.03 per 1M in 🇪🇺 EU
deepseek coder 6.7b
DeepSeek · 6.7B

deepseek coder 6.7b is een open-source taalmodel van DeepSeek met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
deepseek coder 7b instruct v1.5
DeepSeek · 7B

deepseek coder 7b instruct v1.5 is een open-source taalmodel van DeepSeek met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
DeepSeek Coder V2 Lite (16B)
DeepSeek · 16B

DeepSeek Coder V2 Lite (16B) is een open-source taalmodel van DeepSeek met 16B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.15 per 1M in 🇪🇺 EU
DeepSeek R1 0528 Qwen3 8B
DeepSeek · 8B

DeepSeek R1 0528 Qwen3 8B is een open-source taalmodel van DeepSeek met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
DeepSeek R1 Distill 1.5B
DeepSeek · 1.5B

DeepSeek R1 Distill 1.5B is een open-source taalmodel van DeepSeek met 1.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
DeepSeek R1 Distill 14B
DeepSeek · 14B

DeepSeek R1 Distill 14B is een open-source taalmodel van DeepSeek met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
DeepSeek R1 Distill 32B
DeepSeek · 32B

DeepSeek R1 Distill 32B is een open-source taalmodel van DeepSeek met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
DeepSeek R1 Distill Llama 70B
DeepSeek · 70B

DeepSeek R1 Distill Llama 70B is een open-source taalmodel van DeepSeek met 70B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
DeepSeek R1 Distill Llama 8B
DeepSeek · 8B

DeepSeek R1 Distill Llama 8B is een open-source taalmodel van DeepSeek met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
DeepSeek V3 (685B MoE)
DeepSeek · 685B

DeepSeek V3 (685B MoE) is een open-source taalmodel van DeepSeek met 685B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
functiongemma 270m
Google

functiongemma 270m is een open-source taalmodel van Google, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
gemma 3 12b
Google · 12B

gemma 3 12b is een open-source taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Gemma 3 1B
Google · 1B

Gemma 3 1B is een open-source taalmodel van Google met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Gemma 3 27B
Google · 27B

Gemma 3 27B is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Gemma 3 4B
Google · 4B

Gemma 3 4B is een open-source taalmodel van Google met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
gemma 3n E2B
Google · 2B

gemma 3n E2B is een open-source taalmodel van Google met 2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Llama 3.1 70B
Meta · 70B

Llama 3.1 70B is een open-source taalmodel van Meta met 70B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 3.1 8B
Meta · 8B

Llama 3.1 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Llama 3.2 11B Vision
Meta · 11B

Llama 3.2 11B Vision is een open-source taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Llama 3.2 1B
Meta · 1B

Llama 3.2 1B is een open-source taalmodel van Meta met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama 3.2 3B
Meta · 3B

Llama 3.2 3B is een open-source taalmodel van Meta met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Llama 3.3 70B
Meta · 70B

Llama 3.3 70B is een open-source taalmodel van Meta met 70B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 4 Maverick (17Bx128E)
Meta · 17B

Llama 4 Maverick (17Bx128E) is een open-source taalmodel van Meta met 17B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Llama 4 Scout (17Bx16E)
Meta · 17B

Llama 4 Scout (17Bx16E) is een open-source taalmodel van Meta met 17B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Loes
HostYourAI · 7B

Sovereign EU model fine-tuned by HostYourAI on dutch-clean.

🇪🇺 EU
Loes (EuroLLM-22B)
HostYourAI · 22B

Sovereign EU model fine-tuned by HostYourAI on loes-xl-pre.

🇪🇺 EU
Loes (Qwen3-14B)
HostYourAI · 14B

Sovereign EU model fine-tuned by HostYourAI on loes-xl-pre.

🇪🇺 EU
medgemma 27b
Google · 27B

medgemma 27b is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
medgemma 4b
Google · 4B

medgemma 4b is een open-source taalmodel van Google met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Mistral Nemo 12B
Mistral · 12B

Mistral Nemo 12B is een open-source taalmodel van Mistral met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Phi 4 (14B)
Microsoft · 14B

Phi 4 (14B) is een open-source taalmodel van Microsoft met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Phi 4 Mini (3.8B)
Microsoft · 3.8B

Phi 4 Mini (3.8B) is een open-source taalmodel van Microsoft met 3.8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen 2.5 14B
Qwen · 14B

Qwen 2.5 14B is een open-source taalmodel van Qwen met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Qwen 2.5 72B
Qwen · 72B

Qwen 2.5 72B is een open-source taalmodel van Qwen met 72B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Qwen 2.5 Coder 1.5B
Qwen · 1.5B

Qwen 2.5 Coder 1.5B is een open-source taalmodel van Qwen met 1.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen 2.5 Coder 32B
Qwen · 32B

Qwen 2.5 Coder 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen 2.5 Coder 7B
Qwen · 7B

Qwen 2.5 Coder 7B is een open-source taalmodel van Qwen met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Qwen 3 0.6B
Qwen · 0.6B

Qwen 3 0.6B is een open-source taalmodel van Qwen met 0.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen 3 4B
Qwen · 4B

Qwen 3 4B is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen 3 Coder 30B-A3B (MoE)
Qwen · 30B

Qwen 3 Coder 30B-A3B (MoE) is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen2.5 32B
Qwen · 32B

Qwen2.5 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen2.5 7B
Qwen · 7B

Qwen2.5 7B is een open-source taalmodel van Qwen met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Qwen2.5 Coder 14B
Qwen · 14B

Qwen2.5 Coder 14B is een open-source taalmodel van Qwen met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.15 per 1M in 🇪🇺 EU
Qwen2.5 VL 32B
Qwen · 32B

Qwen2.5 VL 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen2.5 VL 3B
Qwen · 3B

Qwen2.5 VL 3B is een open-source taalmodel van Qwen met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Qwen2.5 VL 72B
Qwen · 72B

Qwen2.5 VL 72B is een open-source taalmodel van Qwen met 72B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Qwen2.5 VL 7B
Qwen · 7B

Qwen2.5 VL 7B is een open-source taalmodel van Qwen met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Qwen3 1.7B
Qwen · 1.7B

Qwen3 1.7B is een open-source taalmodel van Qwen met 1.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen3 14B
Qwen · 14B

Qwen3 14B is een open-source taalmodel van Qwen met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Qwen3 30B A3B
Qwen · 30B

Qwen3 30B A3B is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen3 30B A3B Instruct 2507
Qwen · 30B

Qwen3 30B A3B Instruct 2507 is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen3 32B
Qwen · 32B

Qwen3 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen3 4B Instruct 2507
Qwen · 4B

Qwen3 4B Instruct 2507 is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Qwen3 4B Thinking 2507
Qwen · 4B

Qwen3 4B Thinking 2507 is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Qwen3 8B
Qwen · 8B

Qwen3 8B is een open-source taalmodel van Qwen met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Qwen3 Coder Next
Qwen

Qwen3 Coder Next is een open-source taalmodel van Qwen, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU
Qwen3 VL 2B
Qwen · 2B

Qwen3 VL 2B is een open-source taalmodel van Qwen met 2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
Qwen3 VL 30B A3B
Qwen · 30B

Qwen3 VL 30B A3B is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen3 VL 32B
Qwen · 32B

Qwen3 VL 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU
Qwen3 VL 4B
Qwen · 4B

Qwen3 VL 4B is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Qwen3 VL 8B
Qwen · 8B

Qwen3 VL 8B is een open-source taalmodel van Qwen met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Qwen3 VL 8B Thinking
Qwen · 8B

Qwen3 VL 8B Thinking is een open-source taalmodel van Qwen met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU
Qwen3Guard Gen 0.6B
Qwen · 0.6B

Qwen3Guard Gen 0.6B is een open-source taalmodel van Qwen met 0.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU
translategemma 4b
Google · 4B

translategemma 4b is een open-source taalmodel van Google met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU
Geen modellen gevonden. Pas je zoekopdracht of filters aan.

Host. Route. Ship.

Geen creditcard nodig. Betaal naar gebruik, stop wanneer je wilt.

Begin vandaag gratis met hosten