Efficient computation

Last updated: May 15, 2023

Efficient training

GPU utilization and profiling

From the resource utilization point of view, a common first thing to check is the GPU utilization. The GPU utilization should ideally be close to 100%. If the utilization is consistently low (for example, under 50%), it might be a sign of a bottleneck in the processing pipeline. For example:

  • There might not be enough CPU cores reserved for data loading
  • File I/O might be too slow (e.g., an overloaded shared file system on a supercomputer)

A GPU load of 100% does not guarantee that the job is actually doing something useful on the GPU. A high reported GPU load is a necessary, but not sufficient, condition for an efficient job.
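
The GPU utilization can be monitored with nvidia-smi on the node, or queried programmatically through NVML. Below is a minimal sketch using the pynvml bindings (an extra dependency, not part of PyTorch):

```python
import pynvml  # provided by the nvidia-ml-py package

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)        # first GPU on the node
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
print(f"GPU utilization: {util.gpu}%, memory controller: {util.memory}%")
pynvml.nvmlShutdown()
```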

The next step would be to check with a profiler, such as the PyTorch profiler, which operations are actually being performed. CSC’s Machine learning guide has a short tutorial on how to use the PyTorch profiler.
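
As a minimal sketch, assuming a toy linear model and a single training step (replace with your own code), the PyTorch profiler can be used like this:

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Hypothetical toy model and input; replace with your own training step.
model = torch.nn.Linear(1024, 1024).cuda()
x = torch.randn(64, 1024, device="cuda")

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    loss = model(x).sum()
    loss.backward()

# Show the operations that consumed the most GPU time.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```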

Mixed precision training

A simple way to speed up training is to enable mixed precision training, and for many software packages this is already the default. In mixed precision training, some floating-point values are stored with reduced 16-bit precision in cases where the loss of precision is not critical. A simple way to do this is to enable Automatic Mixed Precision (AMP) in PyTorch, as in the sketch below.
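
A minimal sketch of AMP in a PyTorch training loop, using a toy model and random data as stand-ins for a real training setup:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy model and data; replace with your own training setup.
model = torch.nn.Linear(1024, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(512, 1024), torch.randint(0, 10, (512,)))
loader = DataLoader(dataset, batch_size=64)

scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid fp16 gradient underflow

for inputs, targets in loader:
    inputs, targets = inputs.cuda(), targets.cuda()
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():   # forward pass runs in mixed precision
        outputs = model(inputs)
        loss = torch.nn.functional.cross_entropy(outputs, targets)
    scaler.scale(loss).backward()     # backward pass on the scaled loss
    scaler.step(optimizer)            # unscales gradients and takes the optimizer step
    scaler.update()
```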

Efficient inference

To use the trained large language models, the models need to be deployed and served to the users. The user experience, in terms of both the initial startup overhead and the latency of each request, is inherently in tension with the computational resources spent.

If resources were unlimited, the whole collection of available models could be kept in memory at all times. With limited server capacity, and with energy usage in mind, it is more efficient to operate on a shared resource that runs other workloads while the demand for inference is low, which reduces the idle power consumption of the system.

Dynamic loading

To avoid wasting computational resources on a shared cluster, models should only be loaded when they are required. This could be implemented, for example, by signaling a backend when a user enters a website, or by providing an API call that loads a model as the first step of inference; see the sketch below.
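
A minimal sketch of the pattern, with a hypothetical load_model() helper standing in for the actual checkpoint loading:

```python
import threading

# The load_model() helper and the checkpoint handling inside it are
# hypothetical placeholders; only the lazy-loading pattern is the point here.
_models = {}
_lock = threading.Lock()

def load_model(name: str):
    # Placeholder: in practice this would read the checkpoint from disk
    # (e.g. with torch.load) and move the model to the GPU.
    ...

def get_model(name: str):
    """Return the requested model, loading it on first use only."""
    with _lock:                               # avoid loading the same model twice
        if name not in _models:
            _models[name] = load_model(name)  # loaded only when actually needed
        return _models[name]

# An inference API would call get_model(name) as its first step, so that
# idle models never occupy GPU memory.
```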

Job scheduling

To improve utilization, and thereby make it possible to tune for efficiency, multiple users should operate through the same system, possibly on a single copy of the network. The users would interact via a job scheduling or queue system; this could be implemented, for example, with a request queue in front of a shared inference worker, as in the sketch below.
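
A minimal sketch of the idea, with a hypothetical run_inference() placeholder standing in for the actual model call and Python threads standing in for separate users:

```python
import queue
import threading

# Many users submit requests; a single worker serves them against one
# shared copy of the model.
request_queue = queue.Queue()

def run_inference(prompt):
    # Placeholder for the actual model call.
    return f"response to: {prompt}"

def worker():
    while True:
        prompt, slot = request_queue.get()
        slot["output"] = run_inference(prompt)  # one request at a time on the shared model
        slot["done"].set()
        request_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(prompt):
    """Called by each user-facing frontend; blocks until the shared worker answers."""
    slot = {"done": threading.Event()}
    request_queue.put((prompt, slot))
    slot["done"].wait()
    return slot["output"]

print(submit("Hello"))
```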

Scaling the throughput

TODO

Reduced precision inference

TODO. The software stack is important here and can make this approach unfeasible if the reduced precision reduces accuracy but does not improve throughput.
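
As an illustration, a model and its inputs can be cast to 16-bit floats in PyTorch. This is only a sketch with a toy model; the effect on both accuracy and throughput should always be measured on the target hardware:

```python
import torch

# Toy model; whether fp16 actually improves throughput depends on the GPU
# and the software stack, so this should always be benchmarked.
model = torch.nn.Linear(1024, 1024).cuda().eval()
x = torch.randn(8, 1024, device="cuda")

model_fp16 = model.half()        # cast the weights to 16-bit floats
with torch.no_grad():
    y = model_fp16(x.half())     # inputs must be cast to the same precision
```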

Model compilation

To optimize models for inference, several optimized model formats, such as ONNX, are available.
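
As an illustration, a PyTorch model can be exported to ONNX roughly as follows (toy model, placeholder file name):

```python
import torch

# Toy model; the file name, input shape and axis names are arbitrary placeholders.
model = torch.nn.Linear(1024, 10).eval()
example_input = torch.randn(1, 1024)

torch.onnx.export(
    model,
    example_input,                  # example input used to trace the model
    "model.onnx",                   # output file
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},  # variable batch size
)
```

The exported file can then be served with a runtime such as ONNX Runtime, which can apply further graph-level optimizations.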

TODO: evaluation