Open-source LLM hosting in de EU — modelcatalogus

GLM 5.2

Z.AI · 753B

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include: - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency - Improved Architecture: We propose IndexShare, which reuses the same indexer across every fou

1M context €0.40 per 1M in 🇪🇺 EU

GLM 5.2 FP8

Z.AI · 753B

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include: - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency - Improved Architecture: We propose IndexShare, which reuses the same indexer across every fou

1M context €0.40 per 1M in 🇪🇺 EU

FastContext 1.0 4B RL

Microsoft · 4B

FastContext-1.0 is a lightweight repository-exploration subagent for LLM coding agents. Instead of letting a single model both explore the repository and solve the task, FastContext separates these two roles: it is invoked on demand by a main coding agent, issues parallel read-only tool calls (READ, GLOB, GREP), and returns compact file paths and line ranges as focused context.

262K context €0.05 per 1M in 🇪🇺 EU

FastContext 1.0 4B SFT

Microsoft · 4B

FastContext-1.0 is a lightweight repository-exploration subagent for LLM coding agents. Instead of letting a single model both explore the repository and solve the task, FastContext separates these two roles: it is invoked on demand by a main coding agent, issues parallel read-only tool calls (READ, GLOB, GREP), and returns compact file paths and line ranges as focused context.

262K context €0.05 per 1M in 🇪🇺 EU

gemma 4 31B it qat w4a16 ct

Google · 34B

[!Note] This model card is for the new versions of the Gemma 4 family optimized with Quantization-Aware Training (QAT), which allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model. Four versions of the QAT checkpoints are available: Unquantized QAT checkpoints (Q40): Half-precision weights extracted from the QAT pipeline, ideal for custom downstream compilation and research. Available for Gemma 4 E2B, E4B, 12B, 26B A4B, and 31B, and their drafter models. GGUF (Q40): Ready-to-deploy formats for broad ecosystem compatibility. Availabl

€0.25 per 1M in multimodal 🇪🇺 EU

gemma 4 26B A4B it qat q4 0 unquantized

Google · 27B

[!Note] This model card is for the new versions of the Gemma 4 family optimized with Quantization-Aware Training (QAT), which allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model. Four versions of the QAT checkpoints are available: Unquantized QAT checkpoints (Q40): Half-precision weights extracted from the QAT pipeline, ideal for custom downstream compilation and research. Available for Gemma 4 E2B, E4B, 12B, 26B A4B, and 31B, and their drafter models. GGUF (Q40): Ready-to-deploy formats for broad ecosystem compatibility. Availabl

€0.25 per 1M in multimodal 🇪🇺 EU

gemma 4 31B it qat q4 0 unquantized

Google · 33B

[!Note] This model card is for the new versions of the Gemma 4 family optimized with Quantization-Aware Training (QAT), which allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model. Four versions of the QAT checkpoints are available: Unquantized QAT checkpoints (Q40): Half-precision weights extracted from the QAT pipeline, ideal for custom downstream compilation and research. Available for Gemma 4 E2B, E4B, 12B, 26B A4B, and 31B, and their drafter models. GGUF (Q40): Ready-to-deploy formats for broad ecosystem compatibility. Availabl

€0.25 per 1M in multimodal 🇪🇺 EU

DeepSeek V4 Pro

DeepSeek · 862B

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens.

1M context €0.40 per 1M in 🇪🇺 EU

DeepSeek V4 Flash

DeepSeek · 158B

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens.

1M context €0.40 per 1M in 🇪🇺 EU

Qwen3.6 27B FP8

Qwen · 28B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3.6 27B

Qwen · 28B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3.6 35B A3B FP8

Qwen · 36B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.6 35B A3B

Qwen · 36B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU

GLM 5.1 FP8

Z.AI · 754B

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

203K context €0.40 per 1M in 🇪🇺 EU

GLM 5.1

Z.AI · 754B

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

203K context €0.40 per 1M in 🇪🇺 EU

gemma 4 31B

Google · 33B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU

gemma 4 26B A4B

Google · 27B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU

gemma 4 26B A4B it

Google · 27B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU

gemma 4 31B it

Google · 33B

Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3.5 35B A3B GPTQ Int4

Qwen · 36B

[!Note] This repository contains int4-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.5 27B GPTQ Int4

Qwen · 28B

[!Note] This repository contains int4-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3.5 122B A10B GPTQ Int4

Qwen · 125B

[!Note] This repository contains int4-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.5 0.8B Base

Qwen · 0.9B

[!Note] This repository contains model weights and configuration files for the pre-trained only model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, etc. The intended use cases are fine-tuning, in-context learning experiments, and other research or development purposes, not direct interaction. However, the control tokens, e.g., <|imstart| and <|imend| were trained to allow efficient LoRA-style PEFT with the official chat template, mitigating the need to finetune embeddings, a significant optimization given Qwen3.5's larger

€0.03 per 1M in multimodal 🇪🇺 EU

Qwen3.5 0.8B

Qwen · 0.9B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. In light of its parameter scale, the intended use cases are prototyping, task-specific fine-tuning, and other research or development purposes.

€0.03 per 1M in multimodal 🇪🇺 EU

Qwen3.5 2B

Qwen · 2.3B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. In light of its parameter scale, the intended use cases are prototyping, task-specific fine-tuning, and other research or development purposes.

€0.03 per 1M in multimodal 🇪🇺 EU

Qwen3.5 4B

Qwen · 4.7B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.05 per 1M in multimodal 🇪🇺 EU

Qwen3.5 9B

Qwen · 9.7B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.10 per 1M in multimodal 🇪🇺 EU

Qwen3.5 35B A3B FP8

Qwen · 36B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.5 27B FP8

Qwen · 28B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3.5 122B A10B FP8

Qwen · 125B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.5 122B A10B

Qwen · 125B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.5 27B

Qwen · 28B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3.5 35B A3B

Qwen · 36B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.5 397B A17B FP8

Qwen · 403B

[!Note] This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original model.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3.5 397B A17B

Qwen · 403B

[!Note] This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

€0.40 per 1M in multimodal 🇪🇺 EU

GLM 5

Z.AI · 754B

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

203K context €0.40 per 1M in 🇪🇺 EU

GLM 5 FP8

Z.AI · 754B

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

203K context €0.40 per 1M in 🇪🇺 EU

X Reasoner 7B

Microsoft · 8.3B

We introduce X-Reasoner, a vision-language model posttrained solely on general-domain text for generalizable reasoning, using a twostage approach: an initial supervised fine-tuning phase with distilled long chainof-thoughts, followed by reinforcement learning with verifiable rewards. Experiments show that X-Reasoner successfully transfers reasoning capabilities to both multimodal and out-of-domain settings, outperforming existing state-of-theart models trained with in-domain and multimodal data across various general and medical benchmarks. More details can be found in the paper: X-Reasoner: T

128K context €0.10 per 1M in multimodal 🇪🇺 EU

Qwen3 Coder Next FP8

Qwen · 80B

Today, we're announcing Qwen3-Coder-Next-FP8, an open-weight language model designed specifically for coding agents and local development. It features the following key enhancements:

262K context €0.40 per 1M in 🇪🇺 EU

Qwen3 Coder Next

Qwen · 80B

Today, we're announcing Qwen3-Coder-Next, an open-weight language model designed specifically for coding agents and local development. It features the following key enhancements:

262K context €0.40 per 1M in 🇪🇺 EU

GLM OCR

Z.AI · 1.3B

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. It introduces Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization. The model integrates the CogViT visual encoder pre-trained on large-scale image–text data, a lightweight cross-modal connector with efficient token downsampling, and a GLM-0.5B language decoder. Combined with a two-stage pipeline of layout analysis and parallel recognition based on PP-DocLayout-V3, GLM-OCR deliver

€0.03 per 1M in multimodal 🇪🇺 EU

DeepSeek OCR 2

DeepSeek · 3.4B

Inference using Huggingface transformers on NVIDIA GPUs. Requirements tested on python 3.12.9 + CUDA11.8：

8K context €0.03 per 1M in multimodal 🇪🇺 EU

GLM 4.7 Flash

Z.AI · 31B

GLM-4.7-Flash is a 30B-A3B MoE model. As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.

203K context €0.25 per 1M in 🇪🇺 EU

translategemma 27b it

Google · 29B

translategemma 27b it is een multimodaal taalmodel van Google met 29B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU

translategemma 12b it

Google · 13B

translategemma 12b it is een multimodaal taalmodel van Google met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

medgemma 1.5 4b it

Google · 4.3B

medgemma 1.5 4b it is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU

FrogBoss 32B 2510

Microsoft · 32B

FrogBoss is built on the Qwen3-32B transformer architecture with a maximum context length of 64k tokens. The model uses multi-turn debugging workflows and complex code reasoning. Unlike general-purpose LLMs, FrogBoss is specialized for software engineering tasks.

41K context €0.25 per 1M in 🇪🇺 EU

GLM 4.7 FP8

Z.AI · 358B

GLM-4.7, your new coding partner, is coming with the following features:

203K context €0.40 per 1M in 🇪🇺 EU

GLM 4.7

Z.AI · 358B

GLM-4.7, your new coding partner, is coming with the following features:

203K context €0.40 per 1M in 🇪🇺 EU

OptiMind SFT

Microsoft · 21B

OptiMind-SFT is a specialized 20B parameter model designed to bridge the gap between natural language and executable optimization solvers. It automates the translation of complex decision-making problems—such as supply chain planning, scheduling, and resource allocation—into correct MILP formulations.

131K context €0.25 per 1M in 🇪🇺 EU

AutoGLM Phone 9B Multilingual

Z.AI · 0B

⚠️ This project is intended for research and educational purposes only. Any use for illegal data access, system interference, or unlawful activities is strictly prohibited. Please review our Terms of Use carefully.

€0.03 per 1M in multimodal 🇪🇺 EU

AutoGLM Phone 9B

Z.AI · 0B

⚠️ This project is intended for research and educational purposes only. Any use for illegal data access, system interference, or unlawful activities is strictly prohibited. Please review our Terms of Use carefully.

€0.03 per 1M in multimodal 🇪🇺 EU

GLM 4.6V FP8

Z.AI · 108B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.40 per 1M in multimodal 🇪🇺 EU

GLM 4.6V

Z.AI · 108B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.40 per 1M in multimodal 🇪🇺 EU

GLM 4.6V Flash

Z.AI · 10B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.10 per 1M in multimodal 🇪🇺 EU

DeepSeek V3.2

DeepSeek · 685B

We introduce DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs:

164K context €0.40 per 1M in 🇪🇺 EU

Ministral 3 3B Instruct 2512 ONNX

Mistral · 3B

[!Tip] This model was contributed by Xenova from Hugging Face. We sincerely appreciate the integration and community collaboration. While preliminary functionality checks have been performed, comprehensive testing has not yet been completed. We recommend you to proceed with caution and conducting your own evaluations for specific use cases. If any issues arise, open a PR/Issue here and we will try to address them promptly.

€0.03 per 1M in multimodal 🇪🇺 EU

WebVIA Agent

Z.AI · 10B

- Repository: https://github.com/zheny2751-dotcom/WebVIA - Paper: https://arxiv.org/pdf/2511.06251

€0.10 per 1M in multimodal 🇪🇺 EU

UI2Code N

Z.AI

- Repository: https://github.com/zai-org/UI2CodeN - Paper: https://arxiv.org/abs/2511.08195

€0.10 per 1M in multimodal 🇪🇺 EU

Fara 7B

Microsoft · 8.3B

Description: Fara-7B is Microsoft's first agentic small language model (SLM) designed specifically for computer use. With only 7 billion parameters, Fara-7B is an ultra-compact Computer Use Agent (CUA) that achieves state-of-the-art performance within its size class and is competitive with larger, more resource-intensive agentic systems.

128K context €0.10 per 1M in multimodal 🇪🇺 EU

Glyph

Z.AI · 10B

- Repository: https://github.com/thu-coai/Glyph - Paper: https://arxiv.org/abs/2510.17800

€0.10 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 2B Instruct

Qwen · 2.1B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.03 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 32B Instruct FP8

Qwen · 33B

This repository contains an FP8 quantized version of the Qwen3-VL-32B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 32B Thinking FP8

Qwen · 33B

This repository contains an FP8 quantized version of the Qwen3-VL-32B-Thinking model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.25 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 32B Instruct

Qwen · 33B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.25 per 1M in multimodal 🇪🇺 EU

DeepSeek OCR

DeepSeek · 3.3B

torch==2.6.0 transformers==4.46.3 tokenizers==0.20.3 einops addict easydict pip install flash-attn==2.7.3 --no-build-isolation

8K context €0.03 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 8B Instruct FP8

Qwen · 8.8B

This repository contains an FP8 quantized version of the Qwen3-VL-8B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.10 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 4B Instruct FP8

Qwen · 4.8B

This repository contains an FP8 quantized version of the Qwen3-VL-4B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.05 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 8B Instruct

Qwen · 8.8B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.10 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 4B Instruct

Qwen · 4.4B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.05 per 1M in multimodal 🇪🇺 EU

Qwen3 VL 30B A3B Instruct FP8

Qwen · 31B

This repository contains an FP8 quantized version of the Qwen3-VL-30B-A3B-Instruct model. The quantization method is fine-grained fp8 quantization with block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!

€0.25 per 1M in multimodal 🇪🇺 EU

UserLM 8b

Microsoft · 8B

Unlike typical LLMs that are trained to play the role of the "assistant" in conversation, we trained UserLM-8b to simulate the “user” role in conversation (by training it to predict user turns in a large corpus of conversations called WildChat). This model is useful in simulating more realistic conversations, which is in turn useful in the development of more robust assistants.

8K context €0.10 per 1M in 🇪🇺 EU

Qwen3 VL 30B A3B Instruct

Qwen · 31B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.25 per 1M in multimodal 🇪🇺 EU

GLM 4.6

Z.AI · 357B

Compared with GLM-4.5, GLM-4.6 brings several key improvements:

203K context €0.40 per 1M in 🇪🇺 EU

GLM 4.6 FP8

Z.AI · 358B

Compared with GLM-4.5, GLM-4.6 brings several key improvements:

203K context €0.40 per 1M in 🇪🇺 EU

DeepSeek V3.2 Exp

DeepSeek · 685B

We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.

164K context €0.40 per 1M in 🇪🇺 EU

DeepSeek V3.1 Terminus

DeepSeek · 685B

This update maintains the model's original capabilities while addressing issues reported by users, including:

164K context €0.40 per 1M in 🇪🇺 EU

Qwen3 VL 235B A22B Instruct

Qwen · 236B

Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.

€0.40 per 1M in multimodal 🇪🇺 EU

gpt oss safeguard 20b

OpenAI · 22B

gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss. With these models, you can classify text content based on safety policies that you provide and perform a suite of foundational safety tasks. These models are intended for safety use cases. For other applications, we recommend using gpt-oss models.

131K context €0.25 per 1M in 🇪🇺 EU

gpt oss safeguard 120b

OpenAI · 120B

gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss. With these models, you can classify text content based on safety policies that you provide and perform a suite of foundational safety tasks. These models are intended for safety use cases. For other applications, we recommend using gpt-oss models.

131K context €0.40 per 1M in 🇪🇺 EU

DeepSeek V3.1

DeepSeek · 685B

DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:

164K context €0.40 per 1M in 🇪🇺 EU

DeepSeek V3.1 Base

DeepSeek · 685B

DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:

164K context €0.40 per 1M in 🇪🇺 EU

GLM 4.5V

Z.AI · 108B

This model is part of the GLM-V family of models, introduced in the paper GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

€0.40 per 1M in multimodal 🇪🇺 EU

GLM 4.5V FP8

Z.AI · 108B

Vision-language models (VLMs) have become a key cornerstone of intelligent systems. As real-world AI tasks grow increasingly complex, VLMs urgently need to enhance reasoning capabilities beyond basic multimodal perception — improving accuracy, comprehensiveness, and intelligence — to enable complex problem solving, long-context understanding, and multimodal agents.

€0.40 per 1M in multimodal 🇪🇺 EU

Qwen3 4B Instruct 2507 FP8

Qwen · 4.4B

We introduce the updated version of the Qwen3-4B-FP8 non-thinking mode, named Qwen3-4B-Instruct-2507-FP8, featuring the following key enhancements:

262K context €0.05 per 1M in 🇪🇺 EU

Qwen3 4B Instruct 2507

Qwen · 4B

We introduce the updated version of the Qwen3-4B non-thinking mode, named Qwen3-4B-Instruct-2507, featuring the following key enhancements:

262K context €0.05 per 1M in 🇪🇺 EU

gpt oss 20b

OpenAI · 22B

Welcome to the gpt-oss series, OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

131K context €0.25 per 1M in 🇪🇺 EU

gpt oss 120b

OpenAI · 120B

Welcome to the gpt-oss series, OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

131K context €0.40 per 1M in 🇪🇺 EU

Qwen3 Coder 30B A3B Instruct FP8

Qwen · 31B

Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct-FP8. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:

262K context €0.25 per 1M in 🇪🇺 EU

Qwen3 Coder 30B A3B Instruct

Qwen · 31B

Qwen3-Coder is available in multiple sizes. Today, we're excited to introduce Qwen3-Coder-30B-A3B-Instruct. This streamlined model maintains impressive performance and efficiency, featuring the following key enhancements:

262K context €0.25 per 1M in 🇪🇺 EU

GLM 4.5 Base

Z.AI · 358B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU

GLM 4.5 Air FP8

Z.AI · 111B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU

GLM 4.5 FP8

Z.AI · 358B

We present GLM-4.5, an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinking and direct response modes. Through multi-stage training on 23T tokens and comprehensive post-training with expert model iteration and reinforcement learning, GLM-4.5 achieves strong performance across agentic, reasoning, and coding (ARC) tasks, scoring 70.1% on TAU-Bench, 91.0% on AIME 24, and 64.2% on SWE-bench Verified. With much fewer parameters than several competitors, GLM-4.5 ranks 3rd ove

131K context €0.40 per 1M in 🇪🇺 EU

GLM 4.5 Air

Z.AI · 110B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU

GLM 4.5

Z.AI · 358B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU

GLM 4.5 Air Base

Z.AI · 110B

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

131K context €0.40 per 1M in 🇪🇺 EU

MediPhi Instruct

Microsoft · 3.8B

The MediPhi Model Collection comprises 7 small language models of 3.8B parameters from the base model Phi-3.5-mini-instruct specialized in the medical and clinical domains. The collection is designed in a modular fashion. Five MediPhi experts are fine-tuned on various medical corpora (i.e. PubMed commercial, Medical Wikipedia, Medical Guidelines, Medical Coding, and open-source clinical documents) and merged back with the SLERP method in their base model to conserve general abilities. One model combined all five experts into one general expert with the multi-model merging method BreadCrumbs. F

131K context €0.05 per 1M in 🇪🇺 EU

Dayhoff 3b GR HM c

Microsoft · 3B

Dayhoff is an Atlas of both protein sequence data and generative language models — a centralized resource that brings together 3.34 billion protein sequences across 1.7 billion clusters of metagenomic and natural protein sequences (GigaRef), 46 million structure-derived synthetic sequences (BackboneRef), and 16 million multiple sequence alignments (OpenProteinSet). These models can natively predict zero-shot mutation effects on fitness, scaffold structural motifs by conditioning on evolutionary or structural context, and perform guided generation of novel proteins within specified families. Le

262K context €0.03 per 1M in 🇪🇺 EU

GLM 4.1V 9B Thinking

Z.AI · 10B

Vision-Language Models (VLMs) have become foundational components of intelligent systems. As real-world AI tasks grow increasingly complex, VLMs must evolve beyond basic multimodal perception to enhance their reasoning capabilities in complex tasks. This involves improving accuracy, comprehensiveness, and intelligence, enabling applications such as complex problem solving, long-context understanding, and multimodal agents.

€0.10 per 1M in multimodal 🇪🇺 EU

GLM 4.1V 9B Base

Z.AI · 10B

Vision-Language Models (VLMs) have become foundational components of intelligent systems. As real-world AI tasks grow increasingly complex, VLMs must evolve beyond basic multimodal perception to enhance their reasoning capabilities in complex tasks. This involves improving accuracy, comprehensiveness, and intelligence, enabling applications such as complex problem solving, long-context understanding, and multimodal agents.

€0.10 per 1M in multimodal 🇪🇺 EU

Phi tiny MoE instruct

Microsoft · 3.8B

Phi-tiny-MoE is a lightweight Mixture of Experts (MoE) model with 3.8B total parameters and 1.1B activated parameters. It is compressed and distilled from the base model shared by Phi-3.5-MoE and GRIN-MoE using the SlimMoE approach, then post-trained via supervised fine-tuning and direct preference optimization for instruction following and safety. The model is trained on Phi-3 synthetic data and filtered public documents, with a focus on high-quality, reasoning-dense content. It is part of the SlimMoE series, which includes a larger variant, Phi-mini-MoE, with 7.6B total and 2.4B activated pa

4K context €0.05 per 1M in 🇪🇺 EU

Phi mini MoE instruct

Microsoft · 7.6B

Phi-mini-MoE is a lightweight Mixture of Experts (MoE) model with 7.6B total parameters and 2.4B activated parameters. It is compressed and distilled from the base model shared by Phi-3.5-MoE and GRIN-MoE using the SlimMoE approach, then post-trained via supervised fine-tuning and direct preference optimization for instruction following and safety. The model is trained on Phi-3 synthetic data and filtered public documents, with a focus on high-quality, reasoning-dense content. It is part of the SlimMoE series, which includes a smaller variant, Phi-tiny-MoE, with 3.8B total and 1.1B activated p

4K context €0.10 per 1M in 🇪🇺 EU

gemma 3n E2B it

Google · 5.4B

gemma 3n E2B it is een multimodaal taalmodel van Google met 5.4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU

gemma 3n E4B it

Google · 7.8B

gemma 3n E4B it is een multimodaal taalmodel van Google met 7.8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

GUI Actor Verifier 2B

Microsoft · 2.2B

This model was introduced in the paper GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents. It is developed based on UI-TARS-2B-SFT and is designed to predict the correctness of an action position given a language instruction. This model is well-suited for GUI-Actor, as its attention map effectively provides diverse candidates for verification with only a single inference.

33K context €0.03 per 1M in multimodal 🇪🇺 EU

DeepSeek R1 0528 Qwen3 8B

DeepSeek · 8.2B

The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528. In the latest update, DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic. Its overall performance is now approaching that of leading models, such as O3 and Gemini 2.5 Pro.

131K context €0.10 per 1M in 🇪🇺 EU

DeepSeek R1 0528

DeepSeek · 685B

The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528. In the latest update, DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic. Its overall performance is now approaching that of leading models, such as O3 and Gemini 2.5 Pro.

164K context €0.40 per 1M in 🇪🇺 EU

medgemma 27b text it

Google · 27B

medgemma 27b text it is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

medgemma 4b it

Google · 4.3B

medgemma 4b it is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU

Qwen3 14B AWQ

Qwen · 15B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.15 per 1M in 🇪🇺 EU

Phi 4 mini reasoning

Microsoft · 3.8B

Phi-4-mini-reasoning is a lightweight open model built upon synthetic data with a focus on high-quality, reasoning dense data further finetuned for more advanced math reasoning capabilities. The model belongs to the Phi-4 model family and supports 128K token context length.

131K context €0.05 per 1M in 🇪🇺 EU

Qwen3 0.6B FP8

Qwen · 0.8B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.03 per 1M in 🇪🇺 EU

Qwen3 1.7B Base

Qwen · 1.7B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Building upon extensive advancements in training data, model architecture, and optimization techniques, Qwen3 delivers the following key improvements over the previously released Qwen2.5:

33K context €0.03 per 1M in 🇪🇺 EU

Qwen3 4B Base

Qwen · 4B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Building upon extensive advancements in training data, model architecture, and optimization techniques, Qwen3 delivers the following key improvements over the previously released Qwen2.5:

33K context €0.05 per 1M in 🇪🇺 EU

Qwen3 235B A22B

Qwen · 235B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.40 per 1M in 🇪🇺 EU

Qwen3 32B

Qwen · 33B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.25 per 1M in 🇪🇺 EU

Qwen3 30B A3B

Qwen · 31B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.25 per 1M in 🇪🇺 EU

Qwen3 14B

Qwen · 15B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.15 per 1M in 🇪🇺 EU

Qwen3 8B

Qwen · 8.2B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.10 per 1M in 🇪🇺 EU

Qwen3 4B

Qwen · 4B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.05 per 1M in 🇪🇺 EU

Qwen3 1.7B

Qwen · 2B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.03 per 1M in 🇪🇺 EU

Qwen3 0.6B

Qwen · 0.8B

Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:

41K context €0.03 per 1M in 🇪🇺 EU

Llama Guard 4 12B

Meta · 12B

Llama Guard 4 12B is een multimodaal taalmodel van Meta met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

Phi 4 reasoning plus

Microsoft · 15B

[!IMPORTANT] To fully take advantage of the model's capabilities, inference must use temperature=0.8, topk=50, topp=0.95, and dosample=True. For more complex queries, set maxnewtokens=32768 to allow for longer chain-of-thought (CoT).

33K context €0.15 per 1M in 🇪🇺 EU

Phi 4 reasoning

Microsoft · 15B

[!IMPORTANT] To fully take advantage of the model's capabilities, inference must use temperature=0.8, topk=50, topp=0.95, and dosample=True. For more complex queries, set maxnewtokens=32768 to allow for longer chain-of-thought (CoT).

33K context €0.15 per 1M in 🇪🇺 EU

gemma 3 12b it qat q4 0 unquantized

Google · 12B

gemma 3 12b it qat q4 0 unquantized is een multimodaal taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

GLM Z1 32B 0414

Z.AI · 33B

The GLM family welcomes a new generation of open-source models, the GLM-4-32B-0414 series, featuring 32 billion parameters. Its performance is comparable to OpenAI's GPT series and DeepSeek's V3/R1 series, and it supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including a large amount of reasoning-type synthetic data, laying the foundation for subsequent reinforcement learning extensions. In the post-training stage, in addition to human preference alignment for dialogue scenarios, we also enhanced the model's performance i

33K context €0.25 per 1M in 🇪🇺 EU

GLM Z1 9B 0414

Z.AI · 9.4B

The GLM family welcomes a new generation of open-source models, the GLM-4-32B-0414 series, featuring 32 billion parameters. Its performance is comparable to OpenAI's GPT series and DeepSeek's V3/R1 series, and it supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including a large amount of reasoning-type synthetic data, laying the foundation for subsequent reinforcement learning extensions. In the post-training stage, in addition to human preference alignment for dialogue scenarios, we also enhanced the model's performance i

33K context €0.10 per 1M in 🇪🇺 EU

GLM 4 32B 0414

Z.AI · 33B

The GLM family welcomes new members, the GLM-4-32B-0414 series models, featuring 32 billion parameters. Its performance is comparable to OpenAI’s GPT series and DeepSeek’s V3/R1 series. It also supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including substantial reasoning-type synthetic data. This lays the foundation for subsequent reinforcement learning extensions. In the post-training stage, we employed human preference alignment for dialogue scenarios. Additionally, using techniques like rejection sampling and reinforc

33K context €0.25 per 1M in 🇪🇺 EU

GLM 4 9B 0414

Z.AI · 9.4B

The GLM family welcomes new members, the GLM-4-32B-0414 series models, featuring 32 billion parameters. Its performance is comparable to OpenAI’s GPT series and DeepSeek’s V3/R1 series. It also supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including substantial reasoning-type synthetic data. This lays the foundation for subsequent reinforcement learning extensions. In the post-training stage, we employed human preference alignment for dialogue scenarios. Additionally, using techniques like rejection sampling and reinforc

33K context €0.10 per 1M in 🇪🇺 EU

Llama 4 Maverick 17B 128E

Meta · 402B

Llama 4 Maverick 17B 128E is een multimodaal taalmodel van Meta met 402B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU

Llama 4 Scout 17B 16E

Meta · 109B

Llama 4 Scout 17B 16E is een multimodaal taalmodel van Meta met 109B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU

Llama 4 Scout 17B 16E Instruct

Meta · 109B

Llama 4 Scout 17B 16E Instruct is een multimodaal taalmodel van Meta met 109B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU

Llama 4 Maverick 17B 128E Instruct

Meta · 402B

Llama 4 Maverick 17B 128E Instruct is een multimodaal taalmodel van Meta met 402B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU

Llama 4 Maverick 17B 128E Instruct FP8

Meta · 402B

Llama 4 Maverick 17B 128E Instruct FP8 is een multimodaal taalmodel van Meta met 402B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU

DeepSeek V3 0324

DeepSeek · 685B

DeepSeek-V3-0324 demonstrates notable improvements over its predecessor, DeepSeek-V3, in several key aspects.

164K context €0.40 per 1M in 🇪🇺 EU

txgemma 9b chat

Google · 9.2B

txgemma 9b chat is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

txgemma 2b predict

Google · 2.6B

txgemma 2b predict is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen2.5 VL 32B Instruct

Qwen · 33B

In the past five months since Qwen2-VL’s release, numerous developers have built new models on the Qwen2-VL vision-language models, providing us with valuable feedback. During this period, we focused on building more useful vision-language models. Today, we are excited to introduce the latest addition to the Qwen family: Qwen2.5-VL.

128K context €0.25 per 1M in multimodal 🇪🇺 EU

gemma 3 1b it

Google · 1B

gemma 3 1b it is een open-source taalmodel van Google met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

gemma 3 12b it

Google · 12B

gemma 3 12b it is een multimodaal taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

gemma 3 12b pt

Google · 12B

gemma 3 12b pt is een multimodaal taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

gemma 3 27b it

Google · 27B

gemma 3 27b it is een multimodaal taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU

gemma 3 27b pt

Google · 27B

gemma 3 27b pt is een multimodaal taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU

gemma 3 1b pt

Google · 1B

gemma 3 1b pt is een open-source taalmodel van Google met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

gemma 3 4b it

Google · 4.3B

gemma 3 4b it is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU

gemma 3 4b pt

Google · 4.3B

gemma 3 4b pt is een multimodaal taalmodel van Google met 4.3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in multimodal 🇪🇺 EU

Phi 4 mini instruct

Microsoft · 3.8B

🎉Phi-4: [mini-reasoning | reasoning] | [multimodal-instruct | onnx]; [mini-instruct | onnx]

131K context €0.05 per 1M in 🇪🇺 EU

Qwen2.5 VL 7B Instruct AWQ

Qwen · 8.3B

In the past five months since Qwen2-VL’s release, numerous developers have built new models on the Qwen2-VL vision-language models, providing us with valuable feedback. During this period, we focused on building more useful vision-language models. Today, we are excited to introduce the latest addition to the Qwen family: Qwen2.5-VL.

128K context €0.10 per 1M in multimodal 🇪🇺 EU

Qwen2.5 VL 3B Instruct AWQ

Qwen · 3.8B

--- licensename: qwen-research licenselink: https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct-AWQ/blob/main/LICENSE language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers basemodel: - Qwen/Qwen2.5-VL-3B-Instruct ---

128K context €0.05 per 1M in multimodal 🇪🇺 EU

Qwen2.5 VL 72B Instruct

Qwen · 73B

--- license: other licensename: qwen licenselink: https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct/blob/main/LICENSE language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers ---

128K context €0.40 per 1M in multimodal 🇪🇺 EU

Qwen2.5 VL 7B Instruct

Qwen · 8.3B

--- license: apache-2.0 language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers ---

128K context €0.10 per 1M in multimodal 🇪🇺 EU

Qwen2.5 VL 3B Instruct

Qwen · 3.8B

--- licensename: qwen-research licenselink: https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct/blob/main/LICENSE language: - en pipelinetag: image-text-to-text tags: - multimodal libraryname: transformers ---

128K context €0.05 per 1M in multimodal 🇪🇺 EU

DeepSeek R1 Distill Qwen 32B

DeepSeek · 33B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.25 per 1M in 🇪🇺 EU

DeepSeek R1 Distill Qwen 7B

DeepSeek · 7.6B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.10 per 1M in 🇪🇺 EU

DeepSeek R1 Distill Llama 70B

DeepSeek · 71B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.40 per 1M in 🇪🇺 EU

DeepSeek R1 Distill Llama 8B

DeepSeek · 8B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.10 per 1M in 🇪🇺 EU

DeepSeek R1 Distill Qwen 1.5B

DeepSeek · 1.8B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

131K context €0.03 per 1M in 🇪🇺 EU

DeepSeek R1

DeepSeek · 685B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

164K context €0.40 per 1M in 🇪🇺 EU

DeepSeek R1 Zero

DeepSeek · 685B

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorpor

164K context €0.40 per 1M in 🇪🇺 EU

glm 4 9b hf

Z.AI · 9.4B

If you are using the weights from this repository, please update to

8K context €0.10 per 1M in 🇪🇺 EU

DeepSeek V3

DeepSeek · 685B

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning

164K context €0.40 per 1M in 🇪🇺 EU

cogagent 9b 20241220

Z.AI · 14B

The CogAgent-9B-20241220 model is based on GLM-4V-9B, a bilingual open-source VLM base model. Through data collection and optimization, multi-stage training, and strategy improvements, CogAgent-9B-20241220 achieves significant advancements in GUI perception, inference prediction accuracy, action space completeness, and task generalizability. The model supports bilingual (Chinese and English) interaction with both screenshots and language input.

€0.10 per 1M in multimodal 🇪🇺 EU

phi 4

Microsoft · 15B

Our training data is an extension of the data used for Phi-3 and includes a wide variety of sources from:

16K context €0.15 per 1M in 🇪🇺 EU

Llama 3.3 70B Instruct

Meta · 71B

Llama 3.3 70B Instruct is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

glm edge v 5b

Z.AI · 4.9B

Install the transformers library from the source code:

4K context €0.05 per 1M in multimodal 🇪🇺 EU

glm edge v 2b

Z.AI · 2.1B

Install the transformers library from the source code:

4K context €0.03 per 1M in multimodal 🇪🇺 EU

paligemma2 28b mix 448

Google · 28B

paligemma2 28b mix 448 is een multimodaal taalmodel van Google met 28B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in multimodal 🇪🇺 EU

paligemma2 3b pt 448

Google · 3B

paligemma2 3b pt 448 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

paligemma2 3b pt 224

Google · 3B

paligemma2 3b pt 224 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

paligemma2 3b ft docci 448

Google · 3B

paligemma2 3b ft docci 448 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

paligemma2 3b mix 224

Google · 3B

paligemma2 3b mix 224 is een multimodaal taalmodel van Google met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

glm edge 4b chat

Z.AI · 4.3B

Install the transformers library from the source code:

8K context €0.05 per 1M in 🇪🇺 EU

glm edge 1.5b chat

Z.AI · 1.6B

Install the transformers library from the source code:

8K context €0.03 per 1M in 🇪🇺 EU

Qwen2.5 Coder 14B Instruct AWQ

Qwen · 15B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.15 per 1M in 🇪🇺 EU

Qwen2.5 Coder 32B Instruct AWQ

Qwen · 33B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.25 per 1M in 🇪🇺 EU

Qwen2.5 Coder 32B Instruct

Qwen · 33B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.25 per 1M in 🇪🇺 EU

Qwen2.5 Coder 14B Instruct

Qwen · 15B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.15 per 1M in 🇪🇺 EU

glm 4 9b chat 1m hf

Z.AI · 9.5B

If you are using the weights from this repository, please update to

1M context €0.10 per 1M in 🇪🇺 EU

glm 4 9b chat hf

Z.AI · 9.4B

If you are using the weights from this repository, please update to

131K context €0.10 per 1M in 🇪🇺 EU

OmniParser

Microsoft

This model hub includes a finetuned version of YOLOv8 and a finetuned BLIP-2 model on the above dataset respectively. For more details of the models used and finetuning, please refer to the paper.

€0.10 per 1M in multimodal 🇪🇺 EU

gemma 2 2b jpn it

Google · 2.6B

gemma 2 2b jpn it is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama Guard 3 1B

Meta · 1.5B

Llama Guard 3 1B is een open-source taalmodel van Meta met 1.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama Guard 3 11B Vision

Meta · 11B

Llama Guard 3 11B Vision is een multimodaal taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

Llama 3.2 90B Vision Instruct

Meta · 89B

Llama 3.2 90B Vision Instruct is een multimodaal taalmodel van Meta met 89B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU

Llama 3.2 90B Vision

Meta · 89B

Llama 3.2 90B Vision is een multimodaal taalmodel van Meta met 89B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in multimodal 🇪🇺 EU

Llama 3.2 11B Vision Instruct

Meta · 11B

Llama 3.2 11B Vision Instruct is een multimodaal taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

Llama 3.2 11B Vision

Meta · 11B

Llama 3.2 11B Vision is een multimodaal taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in multimodal 🇪🇺 EU

Llama 3.2 3B

Meta · 3.2B

Llama 3.2 3B is een open-source taalmodel van Meta met 3.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama 3.2 3B Instruct

Meta · 3.2B

Llama 3.2 3B Instruct is een open-source taalmodel van Meta met 3.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama 3.2 1B Instruct

Meta · 1.2B

Llama 3.2 1B Instruct is een open-source taalmodel van Meta met 1.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama 3.2 1B

Meta · 1.2B

Llama 3.2 1B is een open-source taalmodel van Meta met 1.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen2.5 Coder 1.5B Instruct

Qwen · 1.5B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.03 per 1M in 🇪🇺 EU

Qwen2.5 1.5B Instruct

Qwen · 1.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU

Qwen2.5 3B Instruct

Qwen · 3.1B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU

Qwen2.5 72B Instruct AWQ

Qwen · 73B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.40 per 1M in 🇪🇺 EU

Qwen2.5 32B Instruct AWQ

Qwen · 33B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.25 per 1M in 🇪🇺 EU

Qwen2.5 14B Instruct AWQ

Qwen · 15B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.15 per 1M in 🇪🇺 EU

Qwen2.5 7B Instruct AWQ

Qwen · 7.6B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.10 per 1M in 🇪🇺 EU

Qwen2.5 1.5B Instruct AWQ

Qwen · 1.8B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU

Qwen2.5 Coder 7B Instruct

Qwen · 7.6B

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

33K context €0.10 per 1M in 🇪🇺 EU

Qwen2.5 32B Instruct

Qwen · 33B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.25 per 1M in 🇪🇺 EU

Qwen2.5 14B Instruct

Qwen · 15B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.15 per 1M in 🇪🇺 EU

Qwen2.5 7B Instruct

Qwen · 7.6B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.10 per 1M in 🇪🇺 EU

Qwen2.5 0.5B Instruct

Qwen · 0.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU

Qwen2.5 1.5B

Qwen · 1.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

131K context €0.03 per 1M in 🇪🇺 EU

Qwen2.5 0.5B

Qwen · 0.5B

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:

33K context €0.03 per 1M in 🇪🇺 EU

gemma 7b aps it

Google · 8.5B

gemma 7b aps it is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

DeepSeek Coder V2 Instruct 0724

DeepSeek · 236B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.40 per 1M in 🇪🇺 EU

DeepSeek V2.5

DeepSeek · 236B

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions. For model details, please visit DeepSeek-V2 page for more information.

164K context €0.40 per 1M in 🇪🇺 EU

Qwen2 VL 7B Instruct AWQ

Qwen · 8.3B

We're excited to unveil Qwen2-VL, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.

33K context €0.10 per 1M in multimodal 🇪🇺 EU

Qwen2 VL 7B Instruct

Qwen · 8.3B

We're excited to unveil Qwen2-VL, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.

33K context €0.10 per 1M in multimodal 🇪🇺 EU

Qwen2 VL 2B Instruct

Qwen · 2.2B

We're excited to unveil Qwen2-VL, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.

33K context €0.03 per 1M in multimodal 🇪🇺 EU

Phi 3.5 MoE instruct

Microsoft · 42B

Phi-3.5-MoE is a lightweight, state-of-the-art open model built upon datasets used for Phi-3 - synthetic data and filtered publicly available documents - with a focus on very high-quality, reasoning dense data. The model supports multilingual and comes with 128K context length (in tokens). The model underwent a rigorous enhancement process, incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure precise instruction adherence and robust safety measures.

131K context €0.40 per 1M in 🇪🇺 EU

Phi 3.5 vision instruct

Microsoft · 4.1B

Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which include - synthetic data and filtered publicly available websites - with a focus on very high-quality, reasoning dense data both on text and vision. The model belongs to the Phi-3 model family, and the multimodal version comes with 128K context length (in tokens) it can support. The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.

131K context €0.05 per 1M in multimodal 🇪🇺 EU

Phi 3.5 mini instruct

Microsoft · 3.8B

🎉Phi-4: [multimodal-instruct | onnx]; [mini-instruct | onnx]

131K context €0.05 per 1M in 🇪🇺 EU

LongWriter glm4 9b

Z.AI · 9.4B

LongWriter-glm4-9b is trained based on glm-4-9b, and is capable of generating 10,000+ words at once.

€0.10 per 1M in 🇪🇺 EU

LongWriter llama3.1 8b

Z.AI · 8B

LongWriter-llama3.1-8b is trained based on Meta-Llama-3.1-8B, and is capable of generating 10,000+ words at once.

131K context €0.10 per 1M in 🇪🇺 EU

gemma 2b AWQ

Google · 3B

AWQ quantized version of google/gemma-2b.

8K context €0.03 per 1M in 🇪🇺 EU

Llama Guard 3 8B

Meta · 8B

Llama Guard 3 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Llama Guard 3 8B INT8

Meta · 8B

Llama Guard 3 8B INT8 is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Llama 3.1 405B FP8

Meta · 406B

Llama 3.1 405B FP8 is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 3.1 405B Instruct FP8

Meta · 406B

Llama 3.1 405B Instruct FP8 is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 3.1 8B Instruct

Meta · 8B

Llama 3.1 8B Instruct is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

DeepSeek V2 Chat 0628

DeepSeek · 236B

DeepSeek-V2-Chat-0628 is an improved version of DeepSeek-V2-Chat. For model details, please visit DeepSeek-V2 page for more information.

164K context €0.40 per 1M in 🇪🇺 EU

shieldgemma 9b

Google · 9.2B

shieldgemma 9b is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

shieldgemma 2b

Google · 2.6B

shieldgemma 2b is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama 3.1 405B Instruct

Meta · 406B

Llama 3.1 405B Instruct is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 3.1 70B Instruct

Meta · 71B

Llama 3.1 70B Instruct is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

gemma 2 2b it

Google · 2.6B

gemma 2 2b it is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

gemma 2 2b

Google · 2.6B

gemma 2 2b is een open-source taalmodel van Google met 2.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama 3.1 405B

Meta · 406B

Llama 3.1 405B is een open-source taalmodel van Meta met 406B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 3.1 70B

Meta · 71B

Llama 3.1 70B is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 3.1 8B

Meta · 8B

Llama 3.1 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

codegeex4 all 9b

Z.AI · 9.4B

We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on the GLM-4-9B, significantly enhancing its code generation capabilities. Using a single CodeGeeX4-ALL-9B model, it can support comprehensive functions such as code completion and generation, code interpreter, web search, function call, repository-level code Q&A, covering various scenarios of software development. CodeGeeX4-ALL-9B has achieved highly competitive performance on public benchmarks, such as BigCodeBench and NaturalCodeBench. I

€0.10 per 1M in 🇪🇺 EU

gemma 2 9b

Google · 9.2B

gemma 2 9b is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

gemma 2 9b it

Google · 9.2B

gemma 2 9b it is een open-source taalmodel van Google met 9.2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

gemma 2 27b

Google · 27B

gemma 2 27b is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

gemma 2 27b it

Google · 27B

gemma 2 27b it is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Florence 2 base ft

Microsoft · 0.2B

This Hub repository contains a HuggingFace's transformers implementation of Florence-2 model from Microsoft.

€0.03 per 1M in multimodal 🇪🇺 EU

Florence 2 large ft

Microsoft · 0.8B

This Hub repository contains a HuggingFace's transformers implementation of Florence-2 model from Microsoft.

€0.03 per 1M in multimodal 🇪🇺 EU

Florence 2 base

Microsoft · 0.2B

This Hub repository contains a HuggingFace's transformers implementation of Florence-2 model from Microsoft.

€0.03 per 1M in multimodal 🇪🇺 EU

Florence 2 large

Microsoft · 0.8B

This is a continued pretrained version of Florence-2-large model with 4k context length, only 0.1B samples are used for continue pretraining, thus it might not be trained well. In addition, OCR task has been updated with line separator ('\n'). COCO OD AP 39.8

€0.03 per 1M in multimodal 🇪🇺 EU

DeepSeek Coder V2 Lite Instruct

DeepSeek · 16B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.15 per 1M in 🇪🇺 EU

DeepSeek Coder V2 Lite Base

DeepSeek · 16B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.15 per 1M in 🇪🇺 EU

DeepSeek Coder V2 Instruct

DeepSeek · 236B

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found here.

164K context €0.40 per 1M in 🇪🇺 EU

Qwen2 7B Instruct

Qwen · 7.6B

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 7B Qwen2 model.

33K context €0.10 per 1M in 🇪🇺 EU

glm 4 9b

Z.AI · 9.4B

2024/08/12, 本仓库代码已更新并使用 transformers=4.44.0, 请及时更新依赖。

€0.10 per 1M in 🇪🇺 EU

Qwen2 1.5B Instruct

Qwen · 1.5B

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 1.5B Qwen2 model.

33K context €0.03 per 1M in 🇪🇺 EU

Qwen2 0.5B

Qwen · 0.5B

Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the 0.5B Qwen2 base language model.

131K context €0.03 per 1M in 🇪🇺 EU

Phi 3 vision 128k instruct

Microsoft · 4.1B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

131K context €0.05 per 1M in 🇪🇺 EU

DeepSeek V2 Lite Chat

DeepSeek · 16B

Last week, the release and buzz around DeepSeek-V2 have ignited widespread interest in MLA (Multi-head Latent Attention)! Many in the community suggested open-sourcing a smaller MoE model for in-depth research. And now DeepSeek-V2-Lite comes out:

164K context €0.15 per 1M in 🇪🇺 EU

DeepSeek V2 Lite

DeepSeek · 16B

Last week, the release and buzz around DeepSeek-V2 have ignited widespread interest in MLA (Multi-head Latent Attention)! Many in the community suggested open-sourcing a smaller MoE model for in-depth research. And now DeepSeek-V2-Lite comes out:

164K context €0.15 per 1M in 🇪🇺 EU

paligemma 3b ft cococap 448

Google · 2.9B

paligemma 3b ft cococap 448 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

paligemma 3b pt 448

Google · 2.9B

paligemma 3b pt 448 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

paligemma 3b mix 224

Google · 2.9B

paligemma 3b mix 224 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

paligemma 3b pt 224

Google · 2.9B

paligemma 3b pt 224 is een multimodaal taalmodel van Google met 2.9B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in multimodal 🇪🇺 EU

Phi 3 small 128k instruct

Microsoft · 7.4B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

131K context €0.10 per 1M in 🇪🇺 EU

Phi 3 small 8k instruct

Microsoft · 7.4B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

8K context €0.10 per 1M in 🇪🇺 EU

Phi 3 medium 128k instruct

Microsoft · 14B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

131K context €0.15 per 1M in 🇪🇺 EU

Phi 3 medium 4k instruct

Microsoft · 14B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

4K context €0.15 per 1M in 🇪🇺 EU

DeepSeek V2 Chat

DeepSeek · 236B

Due to the constraints of HuggingFace, the open-source code currently experiences slower performance than our internal codebase when running on GPUs with Huggingface. To facilitate the efficient execution of our model, we offer a dedicated vllm solution that optimizes performance for running our model effectively.

164K context €0.40 per 1M in 🇪🇺 EU

Phi 3 mini 128k instruct

Microsoft · 3.8B

🎉Phi-4: [multimodal-instruct | onnx]; [mini-instruct | onnx]

131K context €0.05 per 1M in 🇪🇺 EU

Phi 3 mini 4k instruct

Microsoft · 3.8B

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

4K context €0.05 per 1M in 🇪🇺 EU

DeepSeek V2

DeepSeek · 236B

Due to the constraints of HuggingFace, the open-source code currently experiences slower performance than our internal codebase when running on GPUs with Huggingface. To facilitate the efficient execution of our model, we offer a dedicated vllm solution that optimizes performance for running our model effectively.

164K context €0.40 per 1M in 🇪🇺 EU

Meta Llama Guard 2 8B

Meta · 8B

Meta Llama Guard 2 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Meta Llama 3 8B

Meta · 8B

Meta Llama 3 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Meta Llama 3 8B Instruct

Meta · 8B

Meta Llama 3 8B Instruct is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Meta Llama 3 70B Instruct

Meta · 71B

Meta Llama 3 70B Instruct is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Meta Llama 3 70B

Meta · 71B

Meta Llama 3 70B is een open-source taalmodel van Meta met 71B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

gemma 1.1 2b it

Google · 2.5B

gemma 1.1 2b it is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

gemma 1.1 7b it

Google · 8.5B

gemma 1.1 7b it is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

codegemma 7b

Google · 8.5B

codegemma 7b is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

codegemma 2b

Google · 2.5B

codegemma 2b is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

CodeLlama 34b Instruct hf

Meta · 34B

CodeLlama 34b Instruct hf is een open-source taalmodel van Meta met 34B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

CodeLlama 70b Instruct hf

Meta · 69B

CodeLlama 70b Instruct hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

CodeLlama 70b hf

Meta · 69B

CodeLlama 70b hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

CodeLlama 13b Instruct hf

Meta · 13B

CodeLlama 13b Instruct hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

CodeLlama 13b hf

Meta · 13B

CodeLlama 13b hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

CodeLlama 7b Instruct hf

Meta · 6.7B

CodeLlama 7b Instruct hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

CodeLlama 7b Python hf

Meta · 6.7B

CodeLlama 7b Python hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

CodeLlama 7b hf

Meta · 7B

CodeLlama 7b hf is een open-source taalmodel van Meta met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

gemma 7b it

Google · 8.5B

gemma 7b it is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

gemma 7b

Google · 8.5B

gemma 7b is een open-source taalmodel van Google met 8.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

gemma 2b it

Google · 2.5B

gemma 2b it is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

gemma 2b

Google · 2.5B

gemma 2b is een open-source taalmodel van Google met 2.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

deepseek math 7b instruct

DeepSeek · 7B

❗❗❗ Please use chain-of-thought prompt to test DeepSeekMath-Instruct and DeepSeekMath-RL:

4K context €0.10 per 1M in 🇪🇺 EU

deepseek coder 7b instruct v1.5

DeepSeek · 6.9B

Deepseek-Coder-7B-Instruct-v1.5 is continue pre-trained from Deepseek-LLM 7B on 2T tokens by employing a window size of 4K and next token prediction objective, and then fine-tuned on 2B tokens of instruction data.

4K context €0.05 per 1M in 🇪🇺 EU

deepseek moe 16b chat

DeepSeek · 16B

python import torch from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

4K context €0.15 per 1M in 🇪🇺 EU

deepseek moe 16b base

DeepSeek · 16B

modelname = "deepseek-ai/deepseek-moe-16b-base" tokenizer = AutoTokenizer.frompretrained(modelname) model = AutoModelForCausalLM.frompretrained(modelname, torchdtype=torch.bfloat16, devicemap="auto") model.generationconfig = GenerationConfig.frompretrained(modelname) model.generationconfig.padtokenid = model.generationconfig.eostokenid

4K context €0.15 per 1M in 🇪🇺 EU

phi 2

Microsoft · 2.8B

Phi-2 is a Transformer with 2.7 billion parameters. It was trained using the same data sources as Phi-1.5, augmented with a new data source that consists of various NLP synthetic texts and filtered websites (for safety and educational value). When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-2 showcased a nearly state-of-the-art performance among models with less than 13 billion parameters.

2K context €0.03 per 1M in 🇪🇺 EU

Mistral 7B Instruct v0.2

Mistral · 7.2B

py from mistralcommon.tokens.tokenizers.mistral import MistralTokenizer from mistralcommon.protocol.instruct.messages import UserMessage from mistralcommon.protocol.instruct.request import ChatCompletionRequest

33K context €0.10 per 1M in 🇪🇺 EU

LlamaGuard 7b

Meta · 6.7B

LlamaGuard 7b is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

deepseek llm 67b base

DeepSeek · 67B

Introducing DeepSeek LLM, an advanced language model comprising 67 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. In order to foster research, we have made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat open source for the research community.

4K context €0.40 per 1M in 🇪🇺 EU

deepseek llm 7b chat

DeepSeek · 7B

Introducing DeepSeek LLM, an advanced language model comprising 7 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. In order to foster research, we have made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat open source for the research community.

4K context €0.10 per 1M in 🇪🇺 EU

deepseek llm 7b base

DeepSeek · 7B

Introducing DeepSeek LLM, an advanced language model comprising 7 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. In order to foster research, we have made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat open source for the research community.

4K context €0.10 per 1M in 🇪🇺 EU

BPO

Z.AI

- Repository: https://github.com/thu-coai/BPO - Paper: https://arxiv.org/abs/2311.04155 - Data: https://huggingface.co/datasets/THUDM/BPO

4K context €0.10 per 1M in 🇪🇺 EU

deepseek coder 33b instruct

DeepSeek · 33B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.25 per 1M in 🇪🇺 EU

deepseek coder 1.3b instruct

DeepSeek · 1.3B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.03 per 1M in 🇪🇺 EU

deepseek coder 6.7b instruct

DeepSeek · 6.7B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.05 per 1M in 🇪🇺 EU

deepseek coder 1.3b base

DeepSeek · 1.3B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.03 per 1M in 🇪🇺 EU

deepseek coder 6.7b base

DeepSeek · 6.7B

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various

16K context €0.05 per 1M in 🇪🇺 EU

Mistral 7B Instruct v0.1

Mistral · 7.2B

py from mistralcommon.tokens.tokenizers.mistral import MistralTokenizer from mistralcommon.protocol.instruct.messages import UserMessage from mistralcommon.protocol.instruct.request import ChatCompletionRequest

33K context €0.10 per 1M in 🇪🇺 EU

Mistral 7B v0.1

Mistral · 7.2B

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

33K context €0.10 per 1M in 🇪🇺 EU

phi 1

Microsoft · 1.4B

The language model Phi-1 is a Transformer with 1.3 billion parameters, specialized for basic Python coding. Its training involved a variety of data sources, including subsets of Python codes from The Stack v1.2, Q&A content from StackOverflow, competition code from codecontests, and synthetic Python textbooks and exercises generated by gpt-3.5-turbo-0301. Even though the model and the datasets are relatively small compared to contemporary Large Language Models (LLMs), Phi-1 has demonstrated an impressive accuracy rate exceeding 50% on the simple Python coding benchmark, HumanEval.

2K context €0.03 per 1M in 🇪🇺 EU

phi 1 5

Microsoft · 1.4B

The language model Phi-1.5 is a Transformer with 1.3 billion parameters. It was trained using the same data sources as phi-1, augmented with a new data source that consists of various NLP synthetic texts. When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-1.5 demonstrates a nearly state-of-the-art performance among models with less than 10 billion parameters.

2K context €0.03 per 1M in 🇪🇺 EU

Llama 2 70b chat hf

Meta · 69B

Llama 2 70b chat hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 2 7b chat hf

Meta · 6.7B

Llama 2 7b chat hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Llama 2 7b hf

Meta · 6.7B

Llama 2 7b hf is een open-source taalmodel van Meta met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Llama 2 13b hf

Meta · 13B

Llama 2 13b hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Llama 2 13b chat hf

Meta · 13B

Llama 2 13b chat hf is een open-source taalmodel van Meta met 13B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Llama 2 70b hf

Meta · 69B

Llama 2 70b hf is een open-source taalmodel van Meta met 69B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

CodeGPT small java

Microsoft

CodeGPT small java is een open-source taalmodel van Microsoft, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

CodeGPT small java adaptedGPT2

Microsoft

CodeGPT small java adaptedGPT2 is een open-source taalmodel van Microsoft, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

CodeGPT small py

Microsoft

CodeGPT small py is een open-source taalmodel van Microsoft, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

DialoGPT large

Microsoft

DialoGPT is a SOTA large-scale pretrained dialogue response generation model for multiturn conversations. The human evaluation results indicate that the response generated from DialoGPT is comparable to human response quality under a single-turn conversation Turing test. The model is trained on 147M multi-turn dialogue from Reddit discussion thread.

€0.10 per 1M in 🇪🇺 EU

DialoGPT medium

Microsoft

DialoGPT is a SOTA large-scale pretrained dialogue response generation model for multiturn conversations. The human evaluation results indicate that the response generated from DialoGPT is comparable to human response quality under a single-turn conversation Turing test. The model is trained on 147M multi-turn dialogue from Reddit discussion thread.

€0.10 per 1M in 🇪🇺 EU

DialoGPT small

Microsoft · 0.2B

DialoGPT is a SOTA large-scale pretrained dialogue response generation model for multiturn conversations. The human evaluation results indicate that the response generated from DialoGPT is comparable to human response quality under a single-turn conversation Turing test. The model is trained on 147M multi-turn dialogue from Reddit discussion thread.

€0.03 per 1M in 🇪🇺 EU

deepseek coder 6.7b

DeepSeek · 6.7B

deepseek coder 6.7b is een open-source taalmodel van DeepSeek met 6.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

deepseek coder 7b instruct v1.5

DeepSeek · 7B

deepseek coder 7b instruct v1.5 is een open-source taalmodel van DeepSeek met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

DeepSeek Coder V2 Lite (16B)

DeepSeek · 16B

DeepSeek Coder V2 Lite (16B) is een open-source taalmodel van DeepSeek met 16B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.15 per 1M in 🇪🇺 EU

DeepSeek R1 0528 Qwen3 8B

DeepSeek · 8B

DeepSeek R1 0528 Qwen3 8B is een open-source taalmodel van DeepSeek met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

DeepSeek R1 Distill 1.5B

DeepSeek · 1.5B

DeepSeek R1 Distill 1.5B is een open-source taalmodel van DeepSeek met 1.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

DeepSeek R1 Distill 14B

DeepSeek · 14B

DeepSeek R1 Distill 14B is een open-source taalmodel van DeepSeek met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

DeepSeek R1 Distill 32B

DeepSeek · 32B

DeepSeek R1 Distill 32B is een open-source taalmodel van DeepSeek met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

DeepSeek R1 Distill Llama 70B

DeepSeek · 70B

DeepSeek R1 Distill Llama 70B is een open-source taalmodel van DeepSeek met 70B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

DeepSeek R1 Distill Llama 8B

DeepSeek · 8B

DeepSeek R1 Distill Llama 8B is een open-source taalmodel van DeepSeek met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

DeepSeek V3 (685B MoE)

DeepSeek · 685B

DeepSeek V3 (685B MoE) is een open-source taalmodel van DeepSeek met 685B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

functiongemma 270m

Google

functiongemma 270m is een open-source taalmodel van Google, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

gemma 3 12b

Google · 12B

gemma 3 12b is een open-source taalmodel van Google met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Gemma 3 1B

Google · 1B

Gemma 3 1B is een open-source taalmodel van Google met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Gemma 3 27B

Google · 27B

Gemma 3 27B is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Gemma 3 4B

Google · 4B

Gemma 3 4B is een open-source taalmodel van Google met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

gemma 3n E2B

Google · 2B

gemma 3n E2B is een open-source taalmodel van Google met 2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Llama 3.1 70B

Meta · 70B

Llama 3.1 70B is een open-source taalmodel van Meta met 70B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 3.1 8B

Meta · 8B

Llama 3.1 8B is een open-source taalmodel van Meta met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Llama 3.2 11B Vision

Meta · 11B

Llama 3.2 11B Vision is een open-source taalmodel van Meta met 11B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Llama 3.2 1B

Meta · 1B

Llama 3.2 1B is een open-source taalmodel van Meta met 1B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama 3.2 3B

Meta · 3B

Llama 3.2 3B is een open-source taalmodel van Meta met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Llama 3.3 70B

Meta · 70B

Llama 3.3 70B is een open-source taalmodel van Meta met 70B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 4 Maverick (17Bx128E)

Meta · 17B

Llama 4 Maverick (17Bx128E) is een open-source taalmodel van Meta met 17B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Llama 4 Scout (17Bx16E)

Meta · 17B

Llama 4 Scout (17Bx16E) is een open-source taalmodel van Meta met 17B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Loes

HostYourAI · 7B

Sovereign EU model fine-tuned by HostYourAI on dutch-clean.

🇪🇺 EU

Loes (EuroLLM-22B)

HostYourAI · 22B

Sovereign EU model fine-tuned by HostYourAI on loes-xl-pre.

🇪🇺 EU

Loes (Qwen3-14B)

HostYourAI · 14B

Sovereign EU model fine-tuned by HostYourAI on loes-xl-pre.

🇪🇺 EU

medgemma 27b

Google · 27B

medgemma 27b is een open-source taalmodel van Google met 27B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

medgemma 4b

Google · 4B

medgemma 4b is een open-source taalmodel van Google met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Mistral Nemo 12B

Mistral · 12B

Mistral Nemo 12B is een open-source taalmodel van Mistral met 12B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Phi 4 (14B)

Microsoft · 14B

Phi 4 (14B) is een open-source taalmodel van Microsoft met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Phi 4 Mini (3.8B)

Microsoft · 3.8B

Phi 4 Mini (3.8B) is een open-source taalmodel van Microsoft met 3.8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen 2.5 14B

Qwen · 14B

Qwen 2.5 14B is een open-source taalmodel van Qwen met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Qwen 2.5 72B

Qwen · 72B

Qwen 2.5 72B is een open-source taalmodel van Qwen met 72B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Qwen 2.5 Coder 1.5B

Qwen · 1.5B

Qwen 2.5 Coder 1.5B is een open-source taalmodel van Qwen met 1.5B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen 2.5 Coder 32B

Qwen · 32B

Qwen 2.5 Coder 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen 2.5 Coder 7B

Qwen · 7B

Qwen 2.5 Coder 7B is een open-source taalmodel van Qwen met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Qwen 3 0.6B

Qwen · 0.6B

Qwen 3 0.6B is een open-source taalmodel van Qwen met 0.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen 3 4B

Qwen · 4B

Qwen 3 4B is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen 3 Coder 30B-A3B (MoE)

Qwen · 30B

Qwen 3 Coder 30B-A3B (MoE) is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen2.5 32B

Qwen · 32B

Qwen2.5 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen2.5 7B

Qwen · 7B

Qwen2.5 7B is een open-source taalmodel van Qwen met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Qwen2.5 Coder 14B

Qwen · 14B

Qwen2.5 Coder 14B is een open-source taalmodel van Qwen met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.15 per 1M in 🇪🇺 EU

Qwen2.5 VL 32B

Qwen · 32B

Qwen2.5 VL 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen2.5 VL 3B

Qwen · 3B

Qwen2.5 VL 3B is een open-source taalmodel van Qwen met 3B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Qwen2.5 VL 72B

Qwen · 72B

Qwen2.5 VL 72B is een open-source taalmodel van Qwen met 72B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Qwen2.5 VL 7B

Qwen · 7B

Qwen2.5 VL 7B is een open-source taalmodel van Qwen met 7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Qwen3 1.7B

Qwen · 1.7B

Qwen3 1.7B is een open-source taalmodel van Qwen met 1.7B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen3 14B

Qwen · 14B

Qwen3 14B is een open-source taalmodel van Qwen met 14B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Qwen3 30B A3B

Qwen · 30B

Qwen3 30B A3B is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen3 30B A3B Instruct 2507

Qwen · 30B

Qwen3 30B A3B Instruct 2507 is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen3 32B

Qwen · 32B

Qwen3 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen3 4B Instruct 2507

Qwen · 4B

Qwen3 4B Instruct 2507 is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Qwen3 4B Thinking 2507

Qwen · 4B

Qwen3 4B Thinking 2507 is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Qwen3 8B

Qwen · 8B

Qwen3 8B is een open-source taalmodel van Qwen met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Qwen3 Coder Next

Qwen

Qwen3 Coder Next is een open-source taalmodel van Qwen, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.40 per 1M in 🇪🇺 EU

Qwen3 VL 2B

Qwen · 2B

Qwen3 VL 2B is een open-source taalmodel van Qwen met 2B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

Qwen3 VL 30B A3B

Qwen · 30B

Qwen3 VL 30B A3B is een open-source taalmodel van Qwen met 30B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen3 VL 32B

Qwen · 32B

Qwen3 VL 32B is een open-source taalmodel van Qwen met 32B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.25 per 1M in 🇪🇺 EU

Qwen3 VL 4B

Qwen · 4B

Qwen3 VL 4B is een open-source taalmodel van Qwen met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Qwen3 VL 8B

Qwen · 8B

Qwen3 VL 8B is een open-source taalmodel van Qwen met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Qwen3 VL 8B Thinking

Qwen · 8B

Qwen3 VL 8B Thinking is een open-source taalmodel van Qwen met 8B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.10 per 1M in 🇪🇺 EU

Qwen3Guard Gen 0.6B

Qwen · 0.6B

Qwen3Guard Gen 0.6B is een open-source taalmodel van Qwen met 0.6B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.03 per 1M in 🇪🇺 EU

translategemma 4b

Google · 4B

translategemma 4b is een open-source taalmodel van Google met 4B parameters, gehost op Europese GPU's via een OpenAI-compatibele API.

€0.05 per 1M in 🇪🇺 EU

Modelcatalogus

Host. Route. Ship.