Examples and Solutions
Explore practical examples and step-by-step solutions for common AI integration challenges. These guides demonstrate how to resolve technical issues and optimize your AI infrastructure using Ozeki AI Gateway.
How to fix missing think tag for Kimi K2.5
This guide explains how to resolve the missing opening <think> tag in Kimi K2.5 model responses. When the model is served through SGLang or an AI gateway, it returns reasoning content with only a closing </think> tag, which breaks compatibility with clients such as Open WebUI, Cline, and Claude Code. The fix is to route requests through Ozeki AI Gateway and add a request modifier that injects the chat_template_kwargs parameter with thinking set to true, so reasoning blocks are formatted correctly.
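The modifier described above boils down to adding one field to the outgoing chat payload before it reaches the model. A minimal sketch of that transformation in Python (the model name and message content here are illustrative, not taken from the guide; only chat_template_kwargs and thinking come from the source):

```python
import json

def add_thinking_kwargs(request_body: dict) -> dict:
    """Sketch of the request modifier described above: inject
    chat_template_kwargs with thinking enabled so the model emits a
    matching opening <think> tag. Field names other than
    chat_template_kwargs/thinking follow the common OpenAI-style
    chat payload and are assumptions."""
    modified = dict(request_body)  # leave the caller's payload untouched
    modified["chat_template_kwargs"] = {"thinking": True}
    return modified

# A typical chat completion payload before modification (illustrative)
original = {
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Explain mutexes briefly."}],
}

modified = add_thinking_kwargs(original)
print(json.dumps(modified, indent=2))
```

In a gateway deployment this transformation would run server-side on every request, so clients such as Open WebUI need no changes.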
Qwen3 Coder Next NVFP4 setup
Setup guide for the NVFP4-quantized version of Qwen3-Coder-Next, a state-of-the-art code generation model. This 80B-A3B model, built on a hybrid DeltaNet + Attention + MoE architecture, is quantized from 149 GB down to 45 GB while maintaining strong performance. It supports context lengths of up to 262K tokens and runs efficiently on NVIDIA Blackwell GPUs.
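The size reduction quoted above can be sanity-checked with back-of-envelope arithmetic. This sketch assumes BF16 weights at 16 bits per parameter and NVFP4 at roughly 4.5 bits per parameter (4-bit values plus a shared scale per small block); the exact NVFP4 overhead is an assumption, not taken from the guide:

```python
# Rough size estimate for an 80B-parameter checkpoint under two formats.
params = 80e9  # total parameters of the 80B-A3B model

# BF16: 2 bytes/param, reported in binary GiB
bf16_gib = params * 2 / 1024**3

# NVFP4: ~4.5 bits/param (4-bit values + block scales, assumed), decimal GB
nvfp4_gb = params * 4.5 / 8 / 1e9

print(f"BF16 checkpoint:  ~{bf16_gib:.0f} GiB")  # ~149
print(f"NVFP4 checkpoint: ~{nvfp4_gb:.0f} GB")   # ~45
```

Both figures line up with the 149 GB and 45 GB quoted above, which is why the quantized model fits on far fewer GPUs.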
Hardware considerations for training LLMs
This page covers essential hardware requirements and considerations for training large language models, including GPU specifications, memory needs, and optimization strategies.
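One memory estimate of the kind such a page relies on can be sketched directly. Under a common rule of thumb for mixed-precision Adam training (2 B BF16 weights + 2 B BF16 gradients + 4 B FP32 master weights + 4 B + 4 B Adam moments = 16 bytes per parameter, activations excluded), the optimizer/weight state alone is sizable. The 16 B/param figure is a standard approximation assumed here, not a number from the page:

```python
def training_state_gb(num_params: float, bytes_per_param: int = 16) -> float:
    """Approximate weight + gradient + Adam-state memory in decimal GB,
    assuming mixed-precision training at ~16 bytes/param.
    Activation memory is excluded (it depends on batch size and
    sequence length)."""
    return num_params * bytes_per_param / 1e9

# Example: a 7B-parameter model needs roughly 112 GB of state,
# before any activations -- more than a single 80 GB GPU holds.
print(f"7B model: ~{training_state_gb(7e9):.0f} GB of training state")
```

Estimates like this explain why multi-GPU sharding (e.g. optimizer-state partitioning) is usually unavoidable even for mid-sized models.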