# Inputs & Outputs
Before using the tool, it helps to know the **input arguments** available for refining your estimates and the **final outputs** (the columns of the returned datasets).
## Input Arguments
| Argument | Description |
|---------------------|---------------------------------------------------------------------------------|
| `StartDate` | The first date of the range to process jobs for, in `YYYY-MM-DD` format.<br>Default: January 1st of the current year, e.g. `2025-01-01`. |
| `EndDate` | The final date of the range to process jobs for, in `YYYY-MM-DD` format.<br>Default: the current date. |
| `JobIDs` | Comma-separated list of the HPC **job IDs** to filter on (e.g. `"id1245,id6789"`).<br>Default: `all_jobs`, which processes all jobs run in the specified date range. |
| `Region` | UK region of the HPC cluster you are using, needed for carbon intensity data. This is used to retrieve real-time carbon intensity data from the [NESO Carbon Intensity API](https://carbonintensity.org.uk) corresponding to job start times.<br>**Options:** `'North Scotland'`, `'South Scotland'`, `'North West England'`, `'North East England'`, `'Yorkshire'`, `'North Wales'`, `'South Wales'`, `'West Midlands'`, `'East Midlands'`, `'East England'`, `'South West England'`, `'South England'`, `'London'`, `'South East England'`.<br>Default: `'UK_average'`, which was [124 gCO2e/kWh in 2024](https://www.carbonbrief.org/analysis-uks-electricity-was-cleanest-ever-in-2024/). |
| `Scope3` | Option to include scope 3 (embodied) emissions estimates as well as scope 2 in the output. This feature is only available for a few HPC systems that have undergone lifecycle assessments to obtain a **per node-hour scope 3 emissions factor**.<br>**Options:** `Isambard3`, `IsambardAI`, and `Archer2` [(see here)](https://docs.archer2.ac.uk/user-guide/energy/). You may also specify a custom numeric value in gCO2e/node-hour for other HPC systems if such a value is available (e.g. `51`).<br>Default: `no_scope3`, which means only scope 2 (operational) emissions are calculated and included in the output. |
| `CSV` | Save the final datasets to a CSV file for further analysis elsewhere.<br>**Options:**<br>`full`: entire dataset (all jobs) with all columns [(see below)](#output-data).<br>`full_summary`: entire dataset with summary columns only.<br>`daily`: dataset aggregated by day with all columns.<br>`daily_summary`: dataset aggregated by day with summary columns only.<br>`total`: dataset aggregated over all jobs with all columns.<br>`total_summary`: dataset aggregated over all jobs with summary columns only.<br>`all`: all of the above datasets saved to CSV files. |
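
The defaults in the table above can be illustrated with a short sketch. This is not the tool's own code: `resolve_inputs` and its `today` parameter are hypothetical names introduced here only to show how the documented defaults for `StartDate`, `EndDate` and `JobIDs` might be applied.

```python
from datetime import date


def resolve_inputs(start_date=None, end_date=None, job_ids="all_jobs", today=None):
    """Apply the documented defaults (hypothetical helper, not part of the tool)."""
    today = today or date.today()
    # StartDate defaults to January 1st of the current year.
    start = date.fromisoformat(start_date) if start_date else date(today.year, 1, 1)
    # EndDate defaults to the current date.
    end = date.fromisoformat(end_date) if end_date else today
    if start > end:
        raise ValueError("StartDate must not be later than EndDate")
    # "all_jobs" means no filtering; otherwise split the comma-separated IDs.
    if job_ids == "all_jobs":
        ids = None
    else:
        ids = [j.strip() for j in job_ids.split(",") if j.strip()]
    return {"StartDate": start.isoformat(), "EndDate": end.isoformat(), "JobIDs": ids}


print(resolve_inputs(job_ids="id1245,id6789", today=date(2025, 6, 15)))
# {'StartDate': '2025-01-01', 'EndDate': '2025-06-15', 'JobIDs': ['id1245', 'id6789']}
```
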
## Output Data
Below are the columns returned in the output datasets once the tool has been run. If you are using the command-line interface, these are not stored unless you have specified that the data should be saved to CSV. The following pages describe how to use the tool and store outputs for each user mode:
- [Command-line Interface](cli.md)
- [Python Usage](function.md)
- [Interactive Jupyter Interface](jupyter.md)
#### **SLURM-Extracted Job Data columns**
| Column Name | Description |
|---------------------|---------------------------------------------------------------------------------|
| `Job_ID` | Unique identifier of the job run on the HPC system. |
| `UserID` | Unique numerical ID of the job submitter. |
| `UserName` | Username of the person who submitted the job. |
| `PartitionName` | Name of the partition the job was submitted to. |
| `PartitionCategory` | Processor type of the partition (`CPU` or `GPU`). This is provided by the user in `hpc_config.yaml`. |
| `NameofJob` | Job name given by the user when submitted. |
| `SubmissionTime` | Date and time the job was submitted in `%Y-%m-%d %H:%M:%S` format (Datetime). |
| `StateCode` | Numeric code representing the job status (1 = completed/successful, 0 = failed).|
| `TotalNodes` | Total number of nodes allocated to the job(s).|
| `CPUsAllocated` | Total number of CPU cores allocated to the job(s). |
| `GPUsAllocated` | Total number of GPUs allocated to the job(s). |
| `ElapsedRuntime` | Total wallclock runtime of the job(s). |
| `CPUusagetime` | The actual CPU time consumed by the job(s), summed across all CPUs (the measured time CPUs were actively processing). Given as a timedelta object `D days HH:MM:SS`. |
| `CPUwalltime` | Estimated CPU time (`CPUsAllocated` * `ElapsedRuntime`). This is the maximum CPU time if all cores were 100% utilised. Given as a timedelta object `D days HH:MM:SS`. |
| `GPUusagetime` | Estimated GPU usage time (`GPUsAllocated` * `ElapsedRuntime`). Assumes 100% GPU utilisation because SLURM does not provide measured GPU usage data. Given as a timedelta object `D days HH:MM:SS`. |
| `RequestedMemoryGB` | Amount of memory requested by the user at job submission (in GB). |
| `UsedMemoryGB` | Amount of memory actually used by the job(s) (in GB). |
| `RequiredMemoryGB` | The estimated minimum amount of memory required for the job(s) to run (in GB).|
| `NodeHours` | Calculated total node-hour usage (in hours). |
| `WorkingDirectory` | File system path where the job was run from, e.g. `/lus/lfs1aip1/home/d5c/eayliffe.d5c/job`. |
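
As an illustration of the derived time columns, the sketch below computes `CPUwalltime` and `GPUusagetime` from the formulas in the table. The function name `derived_times` is hypothetical, and the `NodeHours` formula (`TotalNodes` multiplied by the elapsed runtime in hours) is an assumption, since the table only states that node-hours are calculated.

```python
from datetime import timedelta


def derived_times(elapsed, cpus_allocated, gpus_allocated, total_nodes):
    """Return (CPUwalltime, GPUusagetime, NodeHours) for one job (illustrative only)."""
    # CPUwalltime: upper bound on CPU time if every allocated core were fully used.
    cpu_walltime = elapsed * cpus_allocated
    # GPUusagetime: assumes 100% GPU utilisation for the whole runtime.
    gpu_usagetime = elapsed * gpus_allocated
    # NodeHours: ASSUMED here to be nodes x elapsed hours.
    node_hours = total_nodes * elapsed.total_seconds() / 3600
    return cpu_walltime, gpu_usagetime, node_hours


elapsed = timedelta(hours=6, minutes=30)
cpu_wall, gpu_use, node_hours = derived_times(
    elapsed, cpus_allocated=144, gpus_allocated=4, total_nodes=2
)
print(cpu_wall)    # 39 days, 0:00:00
print(gpu_use)     # 1 day, 2:00:00
print(node_hours)  # 13.0
```
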
#### **Energy Data**
| Column Name | Description |
|---------------------|---------------------------------------------------------------------------------|
| `EnergyIPMI_kwh` | Total energy consumed (kWh) by the job(s), measured by hardware energy/power counters (e.g. **IPMI or RAPL**). This is only logged if energy counters are available on the HPC system. |
| `energy_estimated_kwh` | Total energy consumed (kWh) by the job(s), including the datacenter overhead (PUE factor). This is estimated from usage data and TDP values supplied by the user in `hpc_config.yaml`. |
| `energy_estimated_noPUE_kwh` | Total energy consumed (kWh) by the job(s) without the datacenter overhead (PUE) applied (usage-based estimate). This allows a valid comparison with energy counters. |
| `CPU_energy_estimated_kwh` | Energy consumed by CPUs (usage-based estimate). |
| `GPU_energy_estimated_kwh` | Energy consumed by GPUs (usage-based estimate). |
| `memory_energy_estimated_kwh` | Energy consumed by memory (usage-based estimate). |
| `required_memory_energy_estimated_kwh` | Energy consumed by memory if only the required memory was allocated (usage-based estimate). |
| `energy_requiredMem_estimated_kwh` | Total energy consumed (kWh) by the job(s) if only the required memory was allocated. |
| `failed_energy_kwh` | Energy consumed by failed jobs only. |
#### **Carbon Emissions Data**
| Column Name | Description |
|---------------------|---------------------------------------------------------------------------------|
| `CarbonIntensity_gCO2e_kwh` | Carbon intensity for the selected `Region` (in gCO2e/kWh), retrieved from the carbon intensity API for the time of job submission. This is averaged over all jobs. |
| `Scope2Emissions_gCO2e` | Scope 2 (operational) emissions calculated using estimated energy (in gCO2e). |
| `Scope2Emissions_IPMI_gCO2e` | Scope 2 (operational) emissions calculated using measured energy (energy counters). |
| `Scope3Emissions_gCO2e` | Scope 3 (embodied) emissions estimated for the job(s). Only shown if the `Scope3` argument is set. |
| `Scope2Emissions_requiredMem_gCO2e` | Scope 2 emissions produced if only the required memory had been allocated. |
| `Scope2Emissions_failed_gCO2e` | Scope 2 emissions associated with the failed jobs only. |
| `TotalEmissions_gCO2e` | Total carbon emissions in gCO2e (scope 2 + scope 3). This includes counter-based scope 2 emissions if energy counters are available, and usage-based estimates if they aren't. |
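
The emissions columns above can be tied together with a minimal sketch. It assumes scope 2 emissions are energy (kWh) multiplied by carbon intensity (gCO2e/kWh) and scope 3 emissions are node-hours multiplied by the per node-hour factor. The function name `job_emissions` is hypothetical, and the sketch applies no PUE adjustment to counter readings, which may differ from the tool's behaviour.

```python
def job_emissions(energy_estimated_kwh, carbon_intensity, node_hours,
                  energy_counter_kwh=None, scope3_factor=None):
    """Return the emissions columns for one job in gCO2e (illustrative only)."""
    # Scope 2 from the usage-based energy estimate.
    scope2_estimated = energy_estimated_kwh * carbon_intensity
    # Scope 2 from measured energy, only when counters are available.
    scope2_counter = (
        None if energy_counter_kwh is None else energy_counter_kwh * carbon_intensity
    )
    # Scope 3 only when a per node-hour factor (gCO2e/node-hour) is supplied.
    scope3 = 0.0 if scope3_factor is None else node_hours * scope3_factor
    # The total prefers counter-based scope 2 and falls back to the estimate.
    scope2_for_total = scope2_counter if scope2_counter is not None else scope2_estimated
    return {
        "Scope2Emissions_gCO2e": scope2_estimated,
        "Scope2Emissions_IPMI_gCO2e": scope2_counter,
        "Scope3Emissions_gCO2e": scope3,
        "TotalEmissions_gCO2e": scope2_for_total + scope3,
    }


# 124 gCO2e/kWh is the documented UK average; 51 gCO2e/node-hour is the example factor.
result = job_emissions(9.0, 124, node_hours=13, scope3_factor=51)
print(result["TotalEmissions_gCO2e"])  # 1779.0
```
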
#### **Equivalents for User Interest**
These data are provided as approximate values intended to help contextualise the impact of the user's computational carbon footprint.
See the [Methodology](methodology.md) for the sources and assumptions behind these calculations.
| Column Name | Description |
|---------------------|---------------------------------------------------------------------------------|
| `Cost_GBP` | The approximate electricity cost (in British pounds) of running the job(s), based on the value supplied in `hpc_config.yaml`. |
| `driving_miles` | The equivalent number of miles driven by an average UK car (miles). |
| `tree_absorption_months` | The months for one tree to absorb the total amount of CO2e produced (months). |
| `uk_houses_daily_emissions` | Equivalent number of UK households' daily emissions from electricity use. |
| `bris_paris_flights` | Equivalent number of flights from Bristol to Paris. |