kvcache.ai

Mooncake Public
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ | 4,419 stars | Apache-2.0 | 471 forks | 217 issues (12 need help) | 69 pull requests | Updated Dec 15, 2025
sglang Public Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.

Python | 4 stars | Apache-2.0 | 3,758 forks | 0 issues | 1 pull request | Updated Dec 15, 2025
ktransformers Public
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python | 16,201 stars | Apache-2.0 | 1,183 forks | 382 issues (1 needs help) | 3 pull requests | Updated Dec 12, 2025
sglang_awq Public Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.

Python | 1 star | Apache-2.0 | 3,761 forks | 0 issues | 0 pull requests | Updated Dec 10, 2025
gpustack Public Forked from gpustack/gpustack
GPU cluster manager for optimized AI model deployment

Python | 0 stars | Apache-2.0 | 429 forks | 0 issues | 0 pull requests | Updated Dec 7, 2025
TrEnv-X Public

Go | 70 stars | Apache-2.0 | 2 forks | 0 issues | 0 pull requests | Updated Sep 15, 2025
sglang-npu Public Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.

Python | 0 stars | Apache-2.0 | 3,761 forks | 0 issues | 0 pull requests | Updated Aug 12, 2025
DeepEP_fault_tolerance Public Forked from deepseek-ai/DeepEP
DeepEP: an efficient expert-parallel communication library that supports fault tolerance

Cuda | 3 stars | MIT | 1,031 forks | 0 issues | 0 pull requests | Updated Jul 31, 2025
custom_flashinfer Public Forked from flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving

Cuda | 5 stars | Apache-2.0 | 596 forks | 0 issues | 0 pull requests | Updated Jul 24, 2025
vllm Public Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs

Python | 14 stars | Apache-2.0 | 12,081 forks | 0 issues | 0 pull requests | Updated Mar 27, 2025
