Reducing the carbon footprint of natural language processing

The GreenNLP project is building resources for sustainable NLP

The recent dramatic advances in natural language processing (NLP) technology, such as neural machine translation (NMT) and large language models (LLMs), are changing the way people work and interact with technology. These new NLP technologies have the potential to increase productivity and levels of automation in a wide variety of fields.

The downside of the new NLP technology is its enormous energy consumption. At a time when the climate crisis has made energy efficiency essential, advances in NLP are sharply increasing the energy usage of the IT sector. The GreenNLP project addresses this issue by developing more environmentally sustainable ways of building and using NLP applications.

News from GreenNLP

Benchmarking large-scale LLM training on LUMI

CSC, a partner in the GreenNLP project, has evaluated the scalability of large language model (LLM) training on the LUMI supercomputer. The results indicate that there are no fundamental scaling bottlenecks even when training with thousands of GPUs.

11 January 2024

First call for papers: MOOMIN

The GreenNLP project is one of the organizers of the first edition of the MOOMIN workshop on Modular and Open Multilingual NLP, to be held at EACL 2024 on March 21 or 22, 2024.

1 November 2023

Areas of research

Data curation

Reducing training costs through data curation and selection

Compact language models

Decreasing runtime costs with compact language models

Compact translation models

Decreasing runtime costs with compact translation models

Efficient computation

Reducing computation with efficient training and inference procedures

Modular NLP

Cost-efficient components with modular multilingual NLP

Reuse and sustainability

Documentation, packaging and distribution

Consortium partners