Model garden

codegeex4 all 9b

Direct via de EU-router of als dedicated GPU-deployment. Data blijft in Europa.

We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on the GLM-4-9B, significantly enhancing its code generation capabilities. Using a single CodeGeeX4-ALL-9B model, it can sup...

zai-org/codegeex4-all-9b
text->text · zai-org · EU-hosted
9.4B
Parameters
Contextvenster
22GB
Minimale VRAM
POST /api/v1/chat/completions200 OK

Specificaties

Parameters 9.4B
Minimale VRAM 22 GB
Architectuur ChatGLMModel (vLLM)
Licentie other
Modaliteit text->text
Uitgebracht July 2024
Uitgever zai-org ↗

Prijzen

€0.10
Input (per 1M tokens)
€0.18
Output (per 1M tokens)

Gedeelde EU-router, pay-per-token, scale-to-zero. Dedicated GPU-deployments worden per uur afgerekend — zie prijzen.

Direct aanroepen

Drop-in vervanger voor OpenAI: wijzig alleen de base-URL en de API-key. Ook het Anthropic-formaat (/v1/messages) wordt ondersteund.

curl https://hostyourai.com/api/v1/chat/completions \
  -H "Authorization: Bearer hyai-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zai-org/codegeex4-all-9b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Veelgestelde vragen

Kan ik codegeex4 all 9b in de EU draaien?

Ja. HostYourAI draait codegeex4 all 9b op GPU's in Europese datacenters via vLLM. Prompts en outputs verlaten de EU niet en er is geen Amerikaanse cloudprovider in de keten.

Is codegeex4 all 9b hosten AVG/GDPR-compliant?

Ja. Alle verwerking vindt plaats binnen de EU, er is een verwerkersovereenkomst (DPA) beschikbaar en de subprocessor-lijst is openbaar. Open-source gewichten betekenen ook: geen training op jouw data.

Wat kost codegeex4 all 9b?

Via de gedeelde EU-router betaal je €0.10 per miljoen input-tokens en €0.18 per miljoen output-tokens, zonder vaste kosten. Voor hoge volumes of isolatie kun je codegeex4 all 9b ook als dedicated GPU-instance per uur draaien.

Is de API compatibel met OpenAI?

Ja. Je gebruikt de standaard OpenAI-SDK's met een aangepaste base-URL (https://hostyourai.com/api/v1). Ook de Anthropic Messages API wordt ondersteund als drop-in.

Andere modellen van Z.AI

GLM 5.2

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include: - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency - Improved Architecture: We propose IndexShare, which reuses the same indexer across every fou

753B 1M context Bekijk model →
GLM 5.2 FP8

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include: - Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work - Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency - Improved Architecture: We propose IndexShare, which reuses the same indexer across every fou

753B 1M context Bekijk model →
GLM 5.1 FP8

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

754B 203K context Bekijk model →
GLM 5.1

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

754B 203K context Bekijk model →
GLM 5

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

754B 203K context Bekijk model →
GLM 5 FP8

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

754B 203K context Bekijk model →

Probeer codegeex4 all 9b gratis

Account aanmaken duurt een minuut. Test codegeex4 all 9b direct in de playground.

Start gratis