
ESG Voc

Introduction

ESGVOC is a Python library designed to streamline the management of controlled vocabularies (CV) used by the climate modelling community publishing datasets related to WCRP-ESMO (https://www.wcrp-esmo.org) activities on the ESGF (https://esgf.llnl.gov). By harmonizing vocabulary terms and providing both a Python API and a CLI for easy access, ESGVOC resolves common issues like inconsistencies, errors, and inefficiencies associated with managing controlled vocabularies.

Why ESGVOC?

Previously, controlled vocabularies were stored in multiple locations and formats, requiring various software implementations to query and interpret data. This approach led to inconsistencies, errors, and inefficiencies.

ESGVOC improves controlled vocabulary management through two main ideas:

  1. Harmonizing terms through a unified CV source
    A single, centralized repository, referred to as the “Universe CV”, hosts all controlled vocabularies. Specialized vocabularies for specific projects reference the Universe CV via streamlined lists of IDs. This ensures consistency and eliminates duplication.

  2. Providing both a Python API and a CLI
    ESGVOC provides a dedicated service for interacting with controlled vocabularies. It enables developers, administrators, and software systems to access vocabularies seamlessly via:

    • A Python API for programmatic interaction.
    • A CLI powered by Typer for command-line use.
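The first idea, project vocabularies that reference a shared universe by ID instead of copying definitions, can be illustrated with a minimal sketch. Everything below is hypothetical: the term IDs, the dictionary layout, and the `resolve_project_terms` helper are invented for illustration and are not ESGVOC's actual data model or API.

```python
# Hypothetical toy "universe": term IDs mapped to their full definitions.
# This is NOT ESGVOC's real data model, only an illustration of the idea.
UNIVERSE = {
    "inst-a": {"id": "inst-a", "kind": "institution", "label": "Example Institute A"},
    "inst-b": {"id": "inst-b", "kind": "institution", "label": "Example Institute B"},
    "freq-mon": {"id": "freq-mon", "kind": "frequency", "label": "Monthly"},
}

# A project vocabulary stores only IDs, never copies of the definitions.
PROJECT_TERM_IDS = ["inst-a", "freq-mon"]


def resolve_project_terms(term_ids, universe):
    """Return the universe definition for each ID, failing loudly on unknown IDs."""
    missing = [tid for tid in term_ids if tid not in universe]
    if missing:
        raise KeyError(f"IDs not found in the universe: {missing}")
    # The same definition objects are returned, so nothing is duplicated.
    return [universe[tid] for tid in term_ids]


resolved = resolve_project_terms(PROJECT_TERM_IDS, UNIVERSE)
for term in resolved:
    print(term["id"], "->", term["label"])
```

Because the project list holds only IDs, correcting a definition in the universe is immediately visible to every project that references it, which is the consistency benefit described above.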

Installation

You can install ESGVOC using recent Python packaging tools. It is available only on PyPI (pypi.org), not on anaconda.org. We recommend the following methods:

Using UV (preferred)

UV is recommended for managing dependencies and isolating the library:

uv add esgvoc

This installs all dependencies. Cached repositories and databases are stored in the .cache directory alongside the .venv folder, which simplifies updates and uninstallation.

Using pip in a virtual environment

Alternatively, you can use a virtual environment:

python -m venv myenv
source myenv/bin/activate
pip install esgvoc

Fetching vocabulary data

Once installed, ESGVOC needs to clone the WCRP CV repositories and cache them in an SQLite database. It primarily uses the following repositories for controlled vocabulary data:

.. warning::
   To be accurate, ESGVOC uses the specific branch "esgvoc" in those repositories.

These repositories are configured by default, so fetching them only requires running:

esgvoc install

This command performs the following actions:

Offline use

If there is no internet access, the library can still be used: copy the repositories into .cache/repos, then run esgvoc install. The library will check the .cache/repos directory for existing repositories.
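The offline procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper names `stage_repos` and `run_install` are hypothetical, the location of the .cache directory depends on how ESGVOC was installed, and the sketch keeps each repository's directory name unchanged because the names ESGVOC expects under .cache/repos are not stated here.

```python
import shutil
import subprocess
from pathlib import Path


def stage_repos(local_clones, cache_dir=Path(".cache")):
    """Copy already-cloned repositories into <cache_dir>/repos (hypothetical helper)."""
    repos_dir = Path(cache_dir) / "repos"
    repos_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for clone in map(Path, local_clones):
        # Assumption: the directory name of each clone is kept unchanged.
        target = repos_dir / clone.name
        # dirs_exist_ok (Python 3.8+) allows the copy to be re-run safely.
        shutil.copytree(clone, target, dirs_exist_ok=True)
        staged.append(target)
    return staged


def run_install():
    """Run the documented `esgvoc install` command; raises if it fails."""
    subprocess.run(["esgvoc", "install"], check=True)
```

A typical use would be to call `stage_repos` with the paths of clones brought over on removable media or a shared drive, and then call `run_install()` from the directory that contains .cache.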

Official controlled vocabulary repositories

Flexibility for other repositories

While designed for these repositories, ESGVOC can use other repositories if they are structured correctly.

Requirements


This introduction covers the general purpose and installation of ESGVOC. In the next sections, we will dive deeper into its functionality, including the Python API and CLI usage.

For more information check these links: