Inputs & Outputs

This page defines the input arguments you can use to refine your estimates, and the final outputs of the tool (the columns of the returned datasets).

Input Arguments

Argument

Description

StartDate

The first date of the range to process jobs for, in YYYY-MM-DD.
Default: January 1st of the current year, e.g. 2025-01-01.

EndDate

The final date of the range to process jobs for, in YYYY-MM-DD.
Default: The current date.

JobIDs

Comma-separated list of the HPC job IDs to filter on (e.g. "id1245,id6789").
Default: “all_jobs”, which processes all jobs run in the specified date range.

Region

UK region of the HPC cluster you are using, needed for carbon intensity data.
This is used to retrieve real-time carbon intensity data from the NESO Carbon Intensity API
corresponding to job start times.

Options: 'North Scotland', 'South Scotland', 'North West England',
'North East England', 'Yorkshire', 'North Wales', 'South Wales',
'West Midlands', 'East Midlands', 'East England',
'South West England', 'South England', 'London', 'South East England'.

Default: 'UK_average', which was 124 gCO2e/kWh in 2024.

Scope3

Option to include scope 3 (embodied) emissions estimates as well as scope 2 in the output.
This feature is only available for a few HPC systems that have undergone lifecycle
assessments to obtain a per node-hour scope 3 emissions factor.

Options: Isambard3, IsambardAI, and Archer2 (see here).
You may also specify a custom numeric value in gCO2e/node-hour for other HPC systems
if these values are available (e.g. 51).

Default: no_scope3 which means only scope 2 (operational) emissions will be calculated
and included in the output.

CSV

Save the final datasets to CSV files for further analysis elsewhere.

Options:
full: entire dataset (all jobs) with all columns (see below).
full_summary: entire dataset with summary columns only.
daily: dataset aggregated by day with all columns.
daily_summary: dataset aggregated by day with summary columns only.
total: dataset aggregated over all jobs with all columns.
total_summary: dataset aggregated over all jobs with summary columns only.
all: all of the above datasets saved to CSV files.
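
The defaults and options in the table above can be summarised in a short Python sketch. The helper name `resolve_arguments`, its parameter names, and the returned dictionary are hypothetical and not part of the tool; only the option names and default values are taken from the table. The check that StartDate is not later than EndDate is an added assumption.

```python
from datetime import date

# Option names copied from the Input Arguments table.
REGIONS = {
    "North Scotland", "South Scotland", "North West England",
    "North East England", "Yorkshire", "North Wales", "South Wales",
    "West Midlands", "East Midlands", "East England",
    "South West England", "South England", "London", "South East England",
}
SCOPE3_SYSTEMS = {"Isambard3", "IsambardAI", "Archer2"}
CSV_OPTIONS = {"full", "full_summary", "daily", "daily_summary",
               "total", "total_summary", "all"}


def resolve_arguments(start_date=None, end_date=None, job_ids="all_jobs",
                      region="UK_average", scope3="no_scope3", csv=None,
                      today=None):
    """Hypothetical helper: apply the documented defaults and validate options."""
    today = today or date.today()
    # StartDate defaults to January 1st of the current year, EndDate to today.
    start = date.fromisoformat(start_date) if start_date else date(today.year, 1, 1)
    end = date.fromisoformat(end_date) if end_date else today
    if start > end:  # assumption: the tool's own behaviour here is not documented
        raise ValueError("StartDate must not be later than EndDate")

    # JobIDs is either "all_jobs" or a comma-separated list of IDs.
    if job_ids == "all_jobs":
        ids = None
    else:
        ids = [j.strip() for j in job_ids.split(",") if j.strip()]

    if region != "UK_average" and region not in REGIONS:
        raise ValueError(f"Unknown Region: {region!r}")

    # Scope3 is "no_scope3", a named system, or a custom gCO2e/node-hour value.
    if scope3 != "no_scope3" and scope3 not in SCOPE3_SYSTEMS:
        scope3 = float(scope3)  # raises ValueError if the value is not numeric

    if csv is not None and csv not in CSV_OPTIONS:
        raise ValueError(f"Unknown CSV option: {csv!r}")

    return {"StartDate": start, "EndDate": end, "JobIDs": ids,
            "Region": region, "Scope3": scope3, "CSV": csv}
```

For example, `resolve_arguments(job_ids="id1245,id6789", scope3="51")` keeps the default date range and region, filters on two job IDs, and treats 51 as a custom scope 3 factor in gCO2e/node-hour.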

Output Data

Below are the columns of the datasets returned once the tool has been run. When using the command-line interface, these datasets are not stored unless you have asked for them to be saved to CSV. The usage pages for each user mode describe how to run the tool and store its outputs.

SLURM-Extracted Job Data columns

Column Name

Description

Job_ID

Unique identifier of the job run on the HPC system.

UserID

Unique numerical ID of the job submitter.

UserName

Username of the person who submitted the job.

PartitionName

Name of the partition the job was submitted to.

PartitionCategory

Processor type of the partition (CPU or GPU).
This is provided by the user in hpc_config.yaml.

NameofJob

Job name given by the user when submitted.

SubmissionTime

Date and time the job was submitted in %Y-%m-%d %H:%M:%S format (Datetime).

StateCode

Numeric code representing the job status (1 = completed/successful, 0 = failed).

TotalNodes

Total number of nodes allocated to the job(s).

CPUsAllocated

Total number of CPU cores allocated to the job(s).

GPUsAllocated

Total number of GPUs allocated to the job(s).

ElapsedRuntime

Total wallclock runtime of the job(s).

CPUusagetime

The actual CPU time consumed by the job(s), summed across all CPUs
(measured time CPUs were actively processing).
Given as a timedelta object (D days HH:MM:SS).

CPUwalltime

Estimated CPU time (NCPUs * ElapsedRuntime).
This is the maximum CPU time if all cores were 100% utilised.
Given as a timedelta object (D days HH:MM:SS).

GPUusagetime

Estimated GPU usage time (GPUsAllocated * ElapsedRuntime).
Assumes 100% GPU utilisation due to a lack of measured GPU usage data available from SLURM.
Given as a timedelta object (D days HH:MM:SS).

RequestedMemoryGB

Amount of memory requested by the user at job submission (in GB).

UsedMemoryGB

Amount of memory actually used by the job(s) (in GB).

RequiredMemoryGB

The estimated minimum amount of memory required for the job(s) to run (in GB).

NodeHours

Calculated total node-hour usage (in hours).

WorkingDirectory

File system path where the job was run from.
e.g. /lus/lfs1aip1/home/d5c/eayliffe.d5c/job
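
The derived time columns above follow simple formulas, sketched below in Python. `format_timedelta` and `derived_times` are hypothetical helpers, not functions of the tool. The CPUwalltime and GPUusagetime formulas come from the table; the NodeHours formula (nodes multiplied by wallclock hours) is an assumption, since the table only says the value is calculated.

```python
from datetime import timedelta


def format_timedelta(td):
    """Render a non-negative timedelta as 'D days HH:MM:SS'."""
    total = int(td.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days} days {hours:02d}:{minutes:02d}:{seconds:02d}"


def derived_times(elapsed, cpus_allocated, gpus_allocated, total_nodes):
    """Hypothetical helper computing the derived columns for one job."""
    return {
        # Upper bound on CPU time: every allocated core busy for the whole run.
        "CPUwalltime": cpus_allocated * elapsed,
        # SLURM provides no measured GPU usage, so 100% utilisation is assumed.
        "GPUusagetime": gpus_allocated * elapsed,
        # Assumption: node-hours = number of nodes x wallclock hours.
        "NodeHours": total_nodes * elapsed.total_seconds() / 3600,
    }


job = derived_times(timedelta(hours=1, minutes=30), cpus_allocated=64,
                    gpus_allocated=4, total_nodes=2)
print(format_timedelta(job["CPUwalltime"]))   # 4 days 00:00:00
print(format_timedelta(job["GPUusagetime"]))  # 0 days 06:00:00
print(job["NodeHours"])                       # 3.0
```

Comparing CPUusagetime with CPUwalltime gives a rough indication of how well the allocated cores were used.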

Energy Data

Column Name

Description

EnergyIPMI_kwh

Total energy consumed (kWh) by the job(s) measured by hardware energy/power counters
(e.g. IPMI or RAPL). This is only logged if energy counters are available on the HPC system.

energy_estimated_kwh

Total energy consumed (kWh) by the job(s) including the datacenter overhead (PUE factor).
This is estimated from usage data and TDP values supplied by the user in hpc_config.yaml.

energy_estimated_noPUE_kwh

Total energy consumed (kWh) by the job(s) without the datacenter overhead (PUE) applied (usage-based estimate).
This allows a valid comparison with energy counter measurements.

CPU_energy_estimated_kwh

Energy consumed by CPUs (usage-based estimate).

GPU_energy_estimated_kwh

Energy consumed by GPUs (usage-based estimate).

memory_energy_estimated_kwh

Energy consumed by memory (usage-based estimate).

required_memory_energy_estimated_kwh

Energy consumed by memory if only the required memory was allocated (usage-based estimate).

energy_requiredMem_estimated_kwh

Total energy consumed (kWh) by the job(s) if only the required memory was allocated.

failed_energy_kwh

Energy consumed by failed jobs only.
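
The relationships between the energy columns can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function names, the parameters (per-core and per-GPU TDP in watts, memory power per GB) and the exact formulas are assumptions; the tool's actual calculation is described in its Methodology. The sketch assumes the usage-based total is the sum of the CPU, GPU and memory components, multiplied by the PUE factor to include datacenter overhead, and that failed jobs are those with StateCode 0.

```python
def estimate_energy_kwh(cpu_hours, gpu_hours, memory_gb, runtime_hours,
                        cpu_tdp_w, gpu_tdp_w, mem_w_per_gb, pue):
    """Hypothetical usage-based estimate; power values would come from a config."""
    cpu = cpu_hours * cpu_tdp_w / 1000                        # Wh -> kWh
    gpu = gpu_hours * gpu_tdp_w / 1000
    memory = memory_gb * mem_w_per_gb * runtime_hours / 1000
    no_pue = cpu + gpu + memory                               # assumption: simple sum
    return {
        "CPU_energy_estimated_kwh": cpu,
        "GPU_energy_estimated_kwh": gpu,
        "memory_energy_estimated_kwh": memory,
        "energy_estimated_noPUE_kwh": no_pue,
        "energy_estimated_kwh": no_pue * pue,                 # overhead applied
    }


def failed_energy_kwh(jobs):
    """Sum the estimated energy of failed jobs (StateCode == 0)."""
    return sum(j["energy_estimated_kwh"] for j in jobs if j["StateCode"] == 0)


# Placeholder inputs for illustration only.
hardware = dict(cpu_tdp_w=10, gpu_tdp_w=300, mem_w_per_gb=0.5, pue=1.5)
allocated = estimate_energy_kwh(cpu_hours=100, gpu_hours=0, memory_gb=100,
                                runtime_hours=2, **hardware)
# Same job, but with only the required memory allocated (here 20 GB).
required = estimate_energy_kwh(cpu_hours=100, gpu_hours=0, memory_gb=20,
                               runtime_hours=2, **hardware)
saving = allocated["energy_estimated_kwh"] - required["energy_estimated_kwh"]
```

Under these assumptions, the difference between the two totals (`saving`) corresponds to the gap between energy_estimated_kwh and energy_requiredMem_estimated_kwh.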

Carbon Emissions Data

Column Name

Description

CarbonIntensity_gCO2e_kwh

Carbon intensity for the selected Region (in gCO2e/kWh), retrieved from the
carbon intensity API for the time of job submission.
This is averaged over all jobs.

Scope2Emissions_gCO2e

Scope 2 (operational) emissions calculated using estimated energy (in gCO2e).

Scope2Emissions_IPMI_gCO2e

Scope 2 (operational) emissions calculated using measured energy (energy counters).

Scope3Emissions_gCO2e

Scope 3 (embodied) emissions estimated for the job(s).
Only shown if the Scope3 argument is set.

Scope2Emissions_requiredMem_gCO2e

Scope 2 emissions produced if only the required memory had been allocated.

Scope2Emissions_failed_gCO2e

Scope 2 emissions associated with the failed jobs only.

TotalEmissions_gCO2e

Total carbon emissions in gCO2e (scope 2 + scope 3).
This includes counter-based scope 2 emissions if energy counters are available,
and usage-based estimates if they aren’t.
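
The emissions columns combine energy, carbon intensity and node-hours. The Python sketch below shows one way the columns could relate, under stated assumptions: scope 2 is energy multiplied by carbon intensity, scope 3 is node-hours multiplied by the per node-hour factor, and the total prefers counter-based scope 2 when a counter reading exists. The function and parameter names are hypothetical, and the sketch uses the counter reading as-is because this page does not say whether PUE is applied to it.

```python
def emissions_gco2e(energy_estimated_kwh, carbon_intensity_g_per_kwh, node_hours,
                    energy_ipmi_kwh=None, scope3_g_per_node_hour=None):
    """Hypothetical helper returning the emissions columns for one job."""
    scope2 = energy_estimated_kwh * carbon_intensity_g_per_kwh
    # Counter-based scope 2 exists only if the system logs energy counters.
    scope2_ipmi = (None if energy_ipmi_kwh is None
                   else energy_ipmi_kwh * carbon_intensity_g_per_kwh)
    # Scope 3 is reported only when the Scope3 argument is set.
    scope3 = (None if scope3_g_per_node_hour is None
              else node_hours * scope3_g_per_node_hour)
    # Total: counter-based scope 2 if available, else the usage-based estimate.
    total = (scope2_ipmi if scope2_ipmi is not None else scope2) + (scope3 or 0.0)
    return {
        "Scope2Emissions_gCO2e": scope2,
        "Scope2Emissions_IPMI_gCO2e": scope2_ipmi,
        "Scope3Emissions_gCO2e": scope3,
        "TotalEmissions_gCO2e": total,
    }


# 2 kWh estimated, UK average intensity of 124 gCO2e/kWh, 3 node-hours,
# and the custom scope 3 factor of 51 gCO2e/node-hour used as an example above.
estimated_only = emissions_gco2e(2.0, 124, node_hours=3.0, scope3_g_per_node_hour=51)
print(estimated_only["TotalEmissions_gCO2e"])  # 401.0

with_counters = emissions_gco2e(2.0, 124, node_hours=3.0, energy_ipmi_kwh=1.5,
                                scope3_g_per_node_hour=51)
print(with_counters["TotalEmissions_gCO2e"])   # 339.0
```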

Equivalents for User Interest

These data are provided as approximate values intended to help contextualise the impact of the user’s computational carbon footprint. See the Methodology for sources and assumptions for these calculations.

Column Name

Description

Cost_GBP

The approximate electricity cost (in British pounds) of running the job(s)
based on the value supplied in hpc_config.yaml.

driving_miles

The equivalent number of miles driven by an average UK car.

tree_absorption_months

The number of months it would take one tree to absorb the total amount of CO2e produced.

uk_houses_daily_emissions

Equivalent number of UK households’ daily emissions from electricity use.

bris_paris_flights

Equivalent number of flights from Bristol to Paris.
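
Each equivalent above is a ratio: total emissions divided by the emissions of one unit (one mile driven, one tree-month of absorption, one household-day, one flight). A minimal Python sketch of that idea follows. The helper `equivalents` is hypothetical, the cost is assumed to be energy multiplied by a price per kWh, and the numeric factors in the example are placeholders chosen for easy arithmetic, not the tool's values; the real factors and their sources are given in the Methodology.

```python
def equivalents(total_emissions_g, energy_kwh, price_gbp_per_kwh, g_per_unit):
    """Hypothetical helper.

    g_per_unit maps an output column name to the gCO2e associated with one
    unit of that equivalent (e.g. gCO2e per mile driven).
    """
    # Assumption: cost is energy multiplied by a price per kWh from the config.
    result = {"Cost_GBP": energy_kwh * price_gbp_per_kwh}
    for column, grams in g_per_unit.items():
        result[column] = total_emissions_g / grams
    return result


# Placeholder factors for illustration only (NOT the tool's values).
example = equivalents(
    total_emissions_g=400.0,
    energy_kwh=2.0,
    price_gbp_per_kwh=0.25,
    g_per_unit={"driving_miles": 200.0},
)
print(example)  # {'Cost_GBP': 0.5, 'driving_miles': 2.0}
```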