Using TPUs for fine-tuning and deploying LLMs
If you’re using or planning to use TPUs with Google Cloud, you can now do so via dstack. Just specify the TPU version and the number of cores (separated by a dash) in the gpu property under resources.
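As a minimal sketch of what this might look like, the configuration below requests a TPU v2 with 8 cores. The configuration type, the name, and the ide value are assumptions chosen for illustration; only the gpu property under resources, with its version-dash-cores format, comes from the description above.

```yaml
# Illustrative sketch: everything except resources.gpu is an assumed placeholder.
type: dev-environment
name: tpu-dev        # hypothetical name
ide: vscode          # assumed IDE choice

resources:
  # TPU version and number of cores, separated by a dash
  gpu: v2-8
```

The same resources block should carry over to other configuration types, such as tasks or services, since the TPU request lives entirely in the gpu property.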
Read below to find out how to use TPUs with dstack for fine-tuning and deploying LLMs, leveraging open-source tools like Hugging Face’s Optimum TPU and vLLM.