Tag: Serving

Apr 11, 2026 · AI Engineering
RadixAttention in SGLang: Prefix Caching Documentation
SGLang's RadixAttention uses a radix tree to reuse KV cache entries across requests that share a prefix. See how it outperforms vLLM's PagedAttention in multi-turn conversations and agent workflows.
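
To make the prefix-caching idea concrete, here is a minimal sketch of how a tree keyed by token IDs can report how much of a new request is already cached. It is an illustration only: the names `Node`, `PrefixCache`, `insert`, and `match_prefix` are hypothetical and are not SGLang's API, and for brevity it uses a plain per-token trie rather than a compressed radix tree and stores no actual KV tensors.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Child nodes keyed by the next token ID.
    children: dict = field(default_factory=dict)


class PrefixCache:
    """Toy prefix index over token IDs (hypothetical, not SGLang's API)."""

    def __init__(self):
        self.root = Node()

    def insert(self, tokens):
        # Record that every prefix of `tokens` is now considered cached.
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, Node())

    def match_prefix(self, tokens):
        # Count how many leading tokens are already present in the tree.
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            node, matched = child, matched + 1
        return matched


cache = PrefixCache()
turn_1 = [1, 2, 3, 4]        # e.g. system prompt plus first user message
cache.insert(turn_1)
turn_2 = turn_1 + [5, 6]     # second turn extends the same conversation
reused = cache.match_prefix(turn_2)
print(f"reused={reused} tokens, to compute={len(turn_2) - reused} tokens")
```

In a multi-turn conversation each new turn repeats the earlier turns as its prefix, so under this scheme only the newly appended tokens would need fresh computation.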