Phi 3 medium 4k instruct

Available directly via the EU router or as a dedicated GPU deployment. Data stays in Europe.

🎉 Phi-3.5: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct); [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

microsoft/Phi-3-medium-4k-instruct
text->text · microsoft · EU-hosted
14B parameters · 4K context window · 33 GB minimum VRAM

Specifications

Parameters: 14B
Context window: 4,096 tokens
Minimum VRAM: 33 GB
Architecture: Phi3ForCausalLM (vLLM)
License: MIT
Modality: text->text
Released: May 2024
Publisher: microsoft
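
As a rough sanity check on the 33 GB minimum VRAM figure, the sketch below estimates the memory taken by the weights alone. It assumes 16-bit weights (2 bytes per parameter) and decimal gigabytes; both are assumptions, since the page does not say how the figure was derived. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(parameters: float, bytes_per_parameter: int = 2) -> float:
    """Estimate weight memory in decimal GB.

    Assumes 2 bytes per parameter (16-bit weights) unless told otherwise;
    the page does not state the precision actually used.
    """
    return parameters * bytes_per_parameter / 1e9


PARAMETERS = 14e9        # 14B, from the specifications
LISTED_MIN_VRAM_GB = 33  # from the specifications

weights_gb = weight_memory_gb(PARAMETERS)       # 28.0
headroom_gb = LISTED_MIN_VRAM_GB - weights_gb   # 5.0
```

Under these assumptions the weights take about 28 GB, which would leave roughly 5 GB for the KV cache, activations, and runtime overhead. This is one plausible reading of the number, not a statement of how the host sizes its deployments.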

Pricing

Input (per 1M tokens): €0.15
Output (per 1M tokens): €0.25

Shared EU router, pay-per-token, scale-to-zero. Dedicated GPU deployments are billed per hour; see pricing.
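
To see what the per-token prices mean for a given workload, here is a minimal cost estimate in Python. It uses only the two listed rates and assumes plain linear billing with no rounding, minimum charge, or volume discount, none of which the page specifies. The function name `estimate_cost_eur` is hypothetical.

```python
from decimal import Decimal

# Rates from the pricing section, in euros per 1M tokens.
INPUT_EUR_PER_MILLION = Decimal("0.15")
OUTPUT_EUR_PER_MILLION = Decimal("0.25")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the shared-router cost in euros.

    Assumes strictly linear pay-per-token billing (an assumption, not a
    documented billing rule).
    """
    total = (
        Decimal(input_tokens) * INPUT_EUR_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_EUR_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


# Example: 2M input tokens and 500K output tokens.
example_cost = estimate_cost_eur(2_000_000, 500_000)
```

For the example workload this works out to €0.30 for input plus €0.125 for output, so €0.425 in total. `Decimal` is used instead of floats so that small per-request amounts add up without binary rounding noise.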

Call it directly

Drop-in replacement for OpenAI: change only the base URL and the API key. The Anthropic format (/v1/messages) is also supported.

curl https://hostyourai.com/api/v1/chat/completions \
  -H "Authorization: Bearer hyai-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/Phi-3-medium-4k-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
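
The same request can be sketched in Python using only the standard library. The URL, headers, and payload mirror the curl call above. The helper names `build_chat_request` and `send`, and the assumption that the response follows the OpenAI chat-completions shape (`choices[0].message.content`), are illustrative and not confirmed by this page.

```python
import json
import urllib.request

BASE_URL = "https://hostyourai.com/api/v1"    # base URL given on this page
MODEL = "microsoft/Phi-3-medium-4k-instruct"  # model ID given on this page


def build_chat_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a real API key and network access):
#   reply = send(build_chat_request("hyai-...", "Hello!"))
#   # Assumes an OpenAI-style response body:
#   print(reply["choices"][0]["message"]["content"])
```

Building and sending are kept as separate steps so the request can be inspected or logged before any network call is made.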

Frequently asked questions

Can I run Phi 3 medium 4k instruct in the EU?

Yes. HostYourAI runs Phi 3 medium 4k instruct on GPUs in European data centers via vLLM. Prompts and outputs do not leave the EU, and there is no US cloud provider in the chain.

Is hosting Phi 3 medium 4k instruct GDPR-compliant?

Yes. All processing takes place within the EU, a data processing agreement (DPA) is available, and the subprocessor list is public. Open-source weights also mean no training on your data.

What does Phi 3 medium 4k instruct cost?

Via the shared EU router you pay €0.15 per million input tokens and €0.25 per million output tokens, with no fixed costs. For high volumes or isolation, you can also run Phi 3 medium 4k instruct as a dedicated GPU instance billed per hour.

Is the API compatible with OpenAI?

Yes. You use the standard OpenAI SDKs with a custom base URL (https://hostyourai.com/api/v1). The Anthropic Messages API is also supported as a drop-in.

Other models from Microsoft

FastContext 1.0 4B RL

FastContext-1.0 is a lightweight repository-exploration subagent for LLM coding agents. Instead of letting a single model both explore the repository and solve the task, FastContext separates these two roles: it is invoked on demand by a main coding agent, issues parallel read-only tool calls (READ, GLOB, GREP), and returns compact file paths and line ranges as focused context.

4B · 262K context

FastContext 1.0 4B SFT

FastContext-1.0 is a lightweight repository-exploration subagent for LLM coding agents. Instead of letting a single model both explore the repository and solve the task, FastContext separates these two roles: it is invoked on demand by a main coding agent, issues parallel read-only tool calls (READ, GLOB, GREP), and returns compact file paths and line ranges as focused context.

4B · 262K context

X Reasoner 7B

We introduce X-Reasoner, a vision-language model post-trained solely on general-domain text for generalizable reasoning, using a two-stage approach: an initial supervised fine-tuning phase with distilled long chain-of-thoughts, followed by reinforcement learning with verifiable rewards. Experiments show that X-Reasoner successfully transfers reasoning capabilities to both multimodal and out-of-domain settings, outperforming existing state-of-the-art models trained with in-domain and multimodal data across various general and medical benchmarks. More details can be found in the paper: X-Reasoner: T…

8.3B · 128K context

FrogBoss 32B 2510

FrogBoss is built on the Qwen3-32B transformer architecture with a maximum context length of 64k tokens. The model uses multi-turn debugging workflows and complex code reasoning. Unlike general-purpose LLMs, FrogBoss is specialized for software engineering tasks.

32B · 41K context

OptiMind SFT

OptiMind-SFT is a specialized 20B parameter model designed to bridge the gap between natural language and executable optimization solvers. It automates the translation of complex decision-making problems—such as supply chain planning, scheduling, and resource allocation—into correct MILP formulations.

21B · 131K context

Fara 7B

Fara-7B is Microsoft's first agentic small language model (SLM) designed specifically for computer use. With only 7 billion parameters, Fara-7B is an ultra-compact Computer Use Agent (CUA) that achieves state-of-the-art performance within its size class and is competitive with larger, more resource-intensive agentic systems.

8.3B · 128K context
