Examples
Single-node training
TRL
Fine-tune Llama 3.1 8B on a custom dataset using TRL (see the minimal sketch below).
Axolotl
Fine-tune Llama 4 on a custom dataset using Axolotl.
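As a taste of what the single-node recipes look like, here is a minimal supervised fine-tuning run with TRL. This is a sketch, not any specific example's script: the `trl-lib/Capybara` dataset stands in for your own data, and the model name and output directory are placeholders.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Stand-in dataset; swap in your own custom dataset.
dataset = load_dataset("trl-lib/Capybara", split="train")

# Output directory and model name are placeholders.
training_args = SFTConfig(output_dir="llama-3.1-8b-sft")

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",
    train_dataset=dataset,
    args=training_args,
)
trainer.train()
```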
Distributed training
TRL
Fine-tune an LLM on multiple nodes with TRL, Accelerate, and DeepSpeed (an Accelerate loop is sketched below).
Axolotl
Fine-tune an LLM on multiple nodes with Axolotl.
Ray+RAGEN
Fine-tune an agent on multiple nodes with RAGEN, verl, and Ray.
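The multi-node recipes share one pattern: a training script written against Accelerate, launched once per node with `accelerate launch`, with DeepSpeed configured at launch time. A minimal, self-contained sketch of such a script, with a toy model and random data standing in for a real LLM and dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Toy model and data stand in for a real LLM and dataset.
model = torch.nn.Linear(128, 2)
dataset = TensorDataset(torch.randn(1024, 128), torch.randint(0, 2, (1024,)))
dataloader = DataLoader(dataset, batch_size=32)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

# Accelerator picks up the topology (number of machines, ranks, DeepSpeed
# stage) from `accelerate launch`, so the same script runs unchanged on
# one GPU or across many nodes.
accelerator = Accelerator()
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for inputs, labels in dataloader:
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    accelerator.backward(loss)
    optimizer.step()
```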
Clusters
NCCL tests
Run multi-node NCCL tests with MPI (a quick collectives sanity check follows this list).
RCCL tests
Run multi-node RCCL tests with MPI.
A3 Mega
Set up GCP A3 Mega clusters with optimized networking.
A3 High
Set up GCP A3 High clusters with optimized networking.
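The NCCL/RCCL recipes run the official test binaries under MPI. Before reaching for those, a quick way to confirm that collectives work across nodes at all is a small `torch.distributed` script, a stand-in rather than the test suites themselves, launched with `torchrun` on each node:

```python
import os
import torch
import torch.distributed as dist

# Launch on each node with e.g.:
#   torchrun --nnodes=2 --nproc-per-node=8 \
#     --rdzv-backend=c10d --rdzv-endpoint=<head-ip>:29500 this_script.py
dist.init_process_group(backend="nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

# Every rank contributes ones; after all_reduce each element must equal
# the world size on every rank.
x = torch.ones(1 << 20, device="cuda")
dist.all_reduce(x)
assert torch.allclose(x, torch.full_like(x, dist.get_world_size()))

if dist.get_rank() == 0:
    print(f"all_reduce OK across {dist.get_world_size()} ranks")
dist.destroy_process_group()
```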
Inference
SGLang
Deploy DeepSeek distilled models with SGLang (all of these servers can speak the OpenAI API; see the client sketch below).
vLLM
Deploy Llama 3.1 with vLLM.
TGI
Deploy Llama 4 with TGI.
NIM
Deploy a DeepSeek distilled model with NIM.
TensorRT-LLM
Deploy DeepSeek models with TensorRT-LLM.
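Whichever server you pick, SGLang, vLLM, TGI, and NIM can all expose an OpenAI-compatible endpoint, so the client side looks the same. A minimal sketch using the `openai` package, where the base URL, API key, and model name are placeholders for your deployment:

```python
from openai import OpenAI

# Base URL, API key, and model name are placeholders for your deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize NCCL in one sentence."}],
)
print(response.choices[0].message.content)
```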
Accelerators
AMD
Deploy and fine-tune LLMs on AMD GPUs (see the backend probe below).
TPU
Deploy and fine-tune LLMs on TPU.
Intel Gaudi
Deploy and fine-tune LLMs on Intel Gaudi.
Tenstorrent
Deploy and fine-tune LLMs on Tenstorrent.
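One detail worth knowing across these backends: on ROCm builds of PyTorch, AMD GPUs show up through the familiar `torch.cuda` API, while TPUs need the separate `torch_xla` plugin (Gaudi and Tenstorrent likewise ship their own integrations). A small, purely illustrative probe that reports which backend a given PyTorch build sees:

```python
import torch

# On ROCm builds torch.version.hip is set; on NVIDIA builds torch.version.cuda is.
if torch.cuda.is_available():
    backend = "ROCm" if torch.version.hip else "CUDA"
    print(f"{backend}: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA/ROCm device visible to this PyTorch build.")

# TPU support comes from the separate torch_xla plugin, if installed.
try:
    import torch_xla.core.xla_model as xm
    print(f"XLA device: {xm.xla_device()}")
except ImportError:
    pass
```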