Examples
Training
TRL
Fine-tune Llama 3.1 8B with SFT and QLoRA, single-node or distributed across multiple nodes.
Axolotl
Fine-tune Llama models with FSDP and QLoRA, single-node or distributed across multiple nodes.
Ray+RAGEN
Fine-tune an agent on multiple nodes with RAGEN, verl, and Ray.
Clusters
GCP
Set up GCP A4 and A3 clusters with optimized networking
AWS
Set up AWS EFA clusters with optimized networking
Lambda
Set up Lambda clusters with optimized networking
Crusoe
Set up Crusoe clusters with optimized networking
Nebius
Set up Nebius clusters with optimized networking
NCCL/RCCL tests
Run multi-node NCCL/RCCL tests with MPI
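The standard `all_reduce_perf` binary from nccl-tests (or its rccl-tests counterpart) is typically launched across nodes with `mpirun`, one MPI rank per GPU. A minimal sketch of composing that invocation, assuming a local nccl-tests build at `./build/all_reduce_perf` and placeholder host names:

```python
# Sketch: build a multi-node all_reduce_perf invocation run via MPI.
# Host names, GPU count, and the binary path are assumptions for illustration.
import shlex

def nccl_allreduce_cmd(hosts, gpus_per_node=8, max_bytes="8G"):
    """Compose an mpirun command for nccl-tests' all_reduce_perf."""
    nranks = len(hosts) * gpus_per_node  # one MPI rank per GPU
    return [
        "mpirun", "-np", str(nranks),
        "-H", ",".join(f"{h}:{gpus_per_node}" for h in hosts),
        "--bind-to", "none",
        "./build/all_reduce_perf",  # path into a local nccl-tests checkout
        "-b", "8", "-e", max_bytes,  # sweep message sizes from 8 B up to max
        "-f", "2",                   # double the message size each step
        "-g", "1",                   # one GPU per rank
    ]

cmd = nccl_allreduce_cmd(["node-0", "node-1"])
print(shlex.join(cmd))
```

The reported "busbw" column from the sweep is the usual figure of merit for validating inter-node networking after cluster setup.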
Inference
SGLang
Deploy Qwen3.6-27B with SGLang
vLLM
Deploy Qwen3.6-27B with vLLM
NIM
Deploy a DeepSeek distilled model with NIM
TensorRT-LLM
Deploy Qwen3 with TensorRT-LLM
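All four of these serving stacks can expose an OpenAI-compatible `/v1/chat/completions` route, so a single client works against whichever backend is deployed. A minimal sketch using only the standard library; the base URL and model name are placeholders, not values from any example above:

```python
# Sketch: build a chat-completions request for an OpenAI-compatible server
# (SGLang, vLLM, NIM, and TensorRT-LLM deployments commonly expose this API).
# The endpoint URL and model name below are assumptions for illustration.
import json
import urllib.request

def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Return a POST request for the /v1/chat/completions endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("http://localhost:8000", "placeholder-model", "Hello!")
# Send with urllib.request.urlopen(req) once a server is actually running.
print(req.full_url)
```

Because the request shape is identical across backends, switching from, say, vLLM to SGLang usually only requires changing the base URL and model name.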
Models
DeepSeek V4
Deploy DeepSeek V4 with SGLang on B200:8
Qwen 3.6
Deploy Qwen3.6-27B with SGLang on NVIDIA or AMD