Python Usage

For greater flexibility and integration, you can use GRACE-HPC programmatically by importing and calling a function in a Python script (.py) or Jupyter Notebook (.ipynb). This lets you embed the tool in your own workflows, automate analyses, or build your own custom carbon footprinting solutions.

💡 You must have generated, edited, and saved the hpc_config.yaml configuration file before completing the steps below.

Import the Function

In your Python code, enter:

from gracehpc import gracehpc_run 

Function Arguments

The arguments are identical to those used in the CLI. For detailed descriptions and available options, refer to the Inputs Arguments section.

| Argument  | Type | Format                            | Example                              |
|-----------|------|-----------------------------------|--------------------------------------|
| StartDate | str  | YYYY-MM-DD                        | "2025-01-01"                         |
| EndDate   | str  | YYYY-MM-DD                        | "2025-06-15"                         |
| JobIDs    | str  | Comma-separated list (no spaces)  | "id1234,id5678"                      |
| Region    | str  | UK Region Name                    | "South West England"                 |
| Scope3    | str  | HPC system name or custom value   | "Isambard3" or "51" or "no_scope3"   |
| CSV       | str  | CSV output type                   | "full", "total", etc. or "no_save"   |
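Because every argument is passed as a string, a malformed date or a JobIDs list containing a space is easy to miss until the engine runs. The sketch below is one possible pre-flight check based only on the formats listed in the table above. The helper name `check_gracehpc_args` is hypothetical and not part of GRACE-HPC, and the rule that EndDate must not precede StartDate is an assumption; the package may apply its own validation.

```python
import re
from datetime import datetime

# Four digits, two digits, two digits, separated by hyphens.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_gracehpc_args(StartDate, EndDate, JobIDs=None):
    """Hypothetical pre-flight check; not part of the GRACE-HPC API."""
    for name, value in (("StartDate", StartDate), ("EndDate", EndDate)):
        if not DATE_PATTERN.fullmatch(value):
            raise ValueError(f"{name} must look like YYYY-MM-DD, got {value!r}")
    # strptime also rejects impossible dates such as 2025-02-30.
    start = datetime.strptime(StartDate, "%Y-%m-%d")
    end = datetime.strptime(EndDate, "%Y-%m-%d")
    # Assumption: a range that ends before it starts is a mistake.
    if end < start:
        raise ValueError(f"EndDate {EndDate} is earlier than StartDate {StartDate}")
    # JobIDs is a comma-separated string with no spaces and no empty entries.
    if JobIDs is not None:
        if " " in JobIDs:
            raise ValueError("JobIDs must not contain spaces")
        if any(part == "" for part in JobIDs.split(",")):
            raise ValueError("JobIDs contains an empty entry")
    return True


print(check_gracehpc_args("2025-01-01", "2025-06-15", "id1234,id5678"))
```

If the check passes, the same strings can be handed to gracehpc_run unchanged.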

Run the Engine

Call the function as shown below to run the full engine from a Python script or notebook instead of the command-line interface. It returns three pandas DataFrames.

full_df, daily_df, total_df = gracehpc_run(
    StartDate="2025-01-01", 
    EndDate="2026-01-01",
    Region="South West England",
    JobIDs="id1234,id5678",
    Scope3="Isambard3",
    CSV="no_save"
)

Example function calls:

>>> gracehpc_run(StartDate="2025-06-01", EndDate="2025-07-25", JobIDs="12345,67890", Region="South West England", Scope3="IsambardAI", CSV="all")
>>> gracehpc_run(StartDate="2025-01-01", EndDate="2025-08-01", Region="South West England", Scope3="Isambard3", CSV="full_summary")
>>> gracehpc_run(StartDate="2025-07-16", EndDate="2025-07-17", JobIDs="id1245", Region="London", Scope3="51", CSV="no_save")

For more information on the required arguments, run:

help(gracehpc_run)
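One way to automate repeated runs is to split a long period into calendar-month windows and call the engine once per window. The sketch below only generates the StartDate/EndDate string pairs; `monthly_windows` is a hypothetical helper, not part of GRACE-HPC. Whether GRACE-HPC treats EndDate as inclusive is not stated here, so check that before summing results across adjacent windows.

```python
from datetime import date, timedelta


def monthly_windows(start, end):
    """Split [start, end] into per-calendar-month (StartDate, EndDate) string pairs.

    Hypothetical helper; not part of the GRACE-HPC API.
    """
    first = date.fromisoformat(start)
    last = date.fromisoformat(end)
    windows = []
    cursor = first
    while cursor <= last:
        # First day of the month after the one `cursor` is in.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        # The window ends on the month's last day or on `last`, whichever is earlier.
        window_end = min(next_month - timedelta(days=1), last)
        windows.append((cursor.isoformat(), window_end.isoformat()))
        cursor = next_month
    return windows


for window_start, window_end in monthly_windows("2025-01-15", "2025-03-10"):
    print(window_start, window_end)
    # With gracehpc installed and hpc_config.yaml saved, each window could be run as:
    # full_df, daily_df, total_df = gracehpc_run(
    #     StartDate=window_start, EndDate=window_end,
    #     Region="South West England", Scope3="Isambard3", CSV="no_save")
```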

Output

The output is the same as that printed to the terminal by the CLI; see here for an example.

Function Returns

The results can be captured in three pandas DataFrames for further exploration after the tool has run. Refer to the Output Data section for details on the data included and the corresponding column names.

| DataFrame | Description |
|-----------|-------------|
| full_df   | Full job-level dataset - one row per job. Includes all fields for each job processed in the date range. |
| daily_df  | Daily aggregated dataset - one row per day. Aggregates all fields across all jobs per day. E.g. sums the energy and carbon emissions of jobs submitted in each day, and takes the average carbon intensity value. |
| total_df  | Total summary dataset - one row aggregating all jobs. Includes overall totals or averages for each field in the full_df. |
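To make the relationship between the three DataFrames concrete, the sketch below rebuilds daily-style and total-style summaries from a small made-up job-level table. It is an illustration of the aggregation described above, not GRACE-HPC's implementation. The values are invented, and `carbon_intensity_example` is a placeholder column name; see the Output Data section for the real column names.

```python
import pandas as pd

# Made-up job-level rows standing in for full_df (values are illustrative only).
jobs = pd.DataFrame({
    "SubmissionDate": ["2025-06-16", "2025-06-16", "2025-06-17"],
    "CPU_energy_estimated_kwh": [1.5, 2.5, 4.0],
    "carbon_intensity_example": [100.0, 200.0, 150.0],  # placeholder column name
})

# Daily view: sum the energy and average the carbon intensity per submission date.
daily = jobs.groupby("SubmissionDate", as_index=False).agg(
    CPU_energy_estimated_kwh=("CPU_energy_estimated_kwh", "sum"),
    carbon_intensity_example=("carbon_intensity_example", "mean"),
)

# Total view: a single row covering every job.
total = pd.DataFrame({
    "CPU_energy_estimated_kwh": [jobs["CPU_energy_estimated_kwh"].sum()],
    "carbon_intensity_example": [jobs["carbon_intensity_example"].mean()],
})

print(daily)
print(total)
```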

You can now filter, explore, or process these DataFrames further. See some examples below:

# pandas is needed for the datetime conversion below
import pandas as pd

# Access the energy consumed by CPUs for each job
cpu_energy = full_df["CPU_energy_estimated_kwh"]

# Filter the daily_df to extract data for a specific submission date
# Convert the SubmissionDate column to datetime format
daily_df["SubmissionDate"] = pd.to_datetime(daily_df["SubmissionDate"])
filtered_daily = daily_df[daily_df["SubmissionDate"] == pd.to_datetime("2025-06-16")]

# Find jobs in full_df with a specific job name
ai_job = full_df[full_df["NameofJob"] == "AI_benchmark"]

# Select only completed jobs (StateCode == 1)
completed_jobs = full_df[full_df["StateCode"] == 1]
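As a further illustration, the sketch below ranks job names by their total estimated CPU energy and reports each name's share of the overall figure. So that it runs on its own, it uses a small stand-in for full_df with made-up values; only the column names `NameofJob` and `CPU_energy_estimated_kwh` are taken from the examples above.

```python
import pandas as pd

# Stand-in for full_df with made-up values; a real run would supply this DataFrame.
full_df = pd.DataFrame({
    "NameofJob": ["AI_benchmark", "preprocess", "AI_benchmark", "postprocess"],
    "CPU_energy_estimated_kwh": [3.0, 1.0, 0.5, 0.5],
})

# Total CPU energy per job name, largest first.
energy_by_name = (
    full_df.groupby("NameofJob")["CPU_energy_estimated_kwh"]
    .sum()
    .sort_values(ascending=False)
)

# Each job name's share of the overall CPU energy, as a percentage.
share_pct = 100 * energy_by_name / energy_by_name.sum()

print(energy_by_name)
print(share_pct.round(1))
```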