
DeepSeek OCR 2

Available instantly via the EU router or as a dedicated GPU deployment. Data stays in Europe.

Inference with Hugging Face Transformers on NVIDIA GPUs. Requirements were tested on Python 3.12.9 with CUDA 11.8.

deepseek-ai/DeepSeek-OCR-2 (vLLM ready)
text+image->text · deepseek-ai · EU-hosted
Parameters: 3.4B · Context window: 8K · Minimum VRAM: 8 GB
POST /api/v1/chat/completions · 200 OK

Specifications

Parameters: 3.4B
Context window: 8,192 tokens
Minimum VRAM: 8 GB
Architecture: DeepseekOCR2ForCausalLM (vLLM)
License: apache-2.0
Modality: text+image->text
Released: January 2026
Publisher: deepseek-ai

Pricing

Input: €0.03 per 1M tokens
Output: €0.06 per 1M tokens

Shared EU router, pay-per-token, scale-to-zero. Dedicated GPU deployments are billed hourly; see pricing.
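The per-token rates above translate into a bill with simple arithmetic. The following is a minimal sketch of that calculation for the shared router. The function name is illustrative, and how many tokens an image consumes is not stated on this page, so the token counts are assumed to be known by the caller.

```python
from decimal import Decimal

# Rates listed above for the shared EU router, in euros per 1M tokens.
INPUT_EUR_PER_MILLION = Decimal("0.03")
OUTPUT_EUR_PER_MILLION = Decimal("0.06")


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the pay-per-token cost in euros for given token counts."""
    million = Decimal(1_000_000)
    total = (
        Decimal(input_tokens) * INPUT_EUR_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_EUR_PER_MILLION
    )
    return total / million
```

For example, 2,000,000 input tokens and 500,000 output tokens come to €0.06 + €0.03 = €0.09. Decimal is used instead of float to avoid rounding drift when many small amounts are summed.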

✓ Verified working on 24-06-2026 — responded in 761 ms on our EU infrastructure.

Call it now

Drop-in replacement for OpenAI: change only the base URL and API key. The Anthropic format (/v1/messages) is supported too.

curl https://hostyourai.com/api/v1/chat/completions \
  -H "Authorization: Bearer hyai-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-OCR-2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
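
Since the model's modality is text+image->text, an OCR request would normally carry an image rather than plain text. The sketch below builds such a request in Python using only the standard library. It assumes the endpoint accepts the OpenAI-style image_url content part with a base64 data URL, which this page does not confirm. The helper names and the prompt wording are illustrative.

```python
import base64
import json
import urllib.request

API_URL = "https://hostyourai.com/api/v1/chat/completions"  # from the curl example above
MODEL = "deepseek-ai/DeepSeek-OCR-2"


def build_ocr_payload(image_bytes: bytes, prompt: str, mime: str = "image/png") -> dict:
    """Build an OpenAI-style chat payload with one text part and one image part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    # Assumption: images are passed as a base64 data URL.
                    {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}},
                ],
            }
        ],
    }


def send(payload: dict, api_key: str) -> dict:
    """POST the payload and return the decoded JSON response (network call)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A caller would read the image bytes from disk, pass them to build_ocr_payload, and hand the result to send together with an API key. If the response follows the OpenAI schema, the recognized text would be at choices[0].message.content.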

Frequently asked questions

Can I run DeepSeek OCR 2 in the EU?

Yes. HostYourAI runs DeepSeek OCR 2 on GPUs in European datacenters via vLLM. Prompts and outputs never leave the EU and there is no US cloud provider in the chain.

Is hosting DeepSeek OCR 2 GDPR-compliant?

Yes. All processing happens inside the EU, a Data Processing Agreement (DPA) is available, and the subprocessor list is public. Open-source weights also mean no training on your data.

How much does DeepSeek OCR 2 cost?

Via the shared EU router you pay €0.03 per million input tokens and €0.06 per million output tokens, with no fixed costs. For high volume or isolation you can also run DeepSeek OCR 2 as a dedicated hourly GPU instance.

Is the API OpenAI-compatible?

Yes. You use the standard OpenAI SDKs with a custom base URL (https://hostyourai.com/api/v1). The Anthropic Messages API is supported as a drop-in as well.
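
As a rough illustration of the base-URL swap, the sketch below points the official openai Python package at the router. It assumes that package is installed. The helper names are illustrative, and the API key is passed in by the caller rather than read from any particular location.

```python
BASE_URL = "https://hostyourai.com/api/v1"  # base URL given in the answer above
MODEL = "deepseek-ai/DeepSeek-OCR-2"


def client_settings(api_key: str) -> dict:
    """Return constructor arguments that point the OpenAI client at the EU router."""
    return {"base_url": BASE_URL, "api_key": api_key}


def ask(prompt: str, api_key: str) -> str:
    """Send one text prompt through the OpenAI SDK (network call)."""
    # Imported lazily so client_settings() works even without the SDK installed.
    from openai import OpenAI

    client = OpenAI(**client_settings(api_key))
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Everything except the base_url and api_key arguments is unchanged from ordinary OpenAI SDK usage, which is the point of the drop-in claim.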

More models from DeepSeek

DeepSeek V4 Pro

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens.

862B · 1M context

DeepSeek V4 Flash

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models — DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) — both supporting a context length of one million tokens.

158B · 1M context

DeepSeek V3.2

We introduce DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance. Our approach is built upon three key technical breakthroughs.

685B · 164K context

DeepSeek OCR

Requirements: torch==2.6.0, transformers==4.46.3, tokenizers==0.20.3, einops, addict, easydict, plus flash-attn installed with: pip install flash-attn==2.7.3 --no-build-isolation

3.3B · 8K context

DeepSeek V3.2 Exp

We are excited to announce the official release of DeepSeek-V3.2-Exp, an experimental version of our model. As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.

685B · 164K context

DeepSeek V3.1 Terminus

This update maintains the model's original capabilities while addressing issues reported by users.

685B · 164K context

Try DeepSeek OCR 2 for free

Creating an account takes a minute. Test DeepSeek OCR 2 straight away in the playground.
