Python Usage
For greater flexibility and integration, you can use GRACE-HPC programmatically, directly in a Python script (.py) or Jupyter Notebook (.ipynb) by importing and calling a function. This enables you to embed the tool into your own workflows, automate analyses, or even develop your own custom carbon footprinting solutions.
💡 You must have generated, edited, and saved the hpc_config.yaml configuration file before completing the steps below.
Import the Function
In your Python code, enter:
from gracehpc import gracehpc_run
Function Arguments
The arguments are identical to those used in the CLI. For detailed descriptions and available options, refer to the Inputs Arguments section.
| Argument | Type | Format |
|---|---|---|
| StartDate | str | YYYY-MM-DD |
| EndDate | str | YYYY-MM-DD |
| JobIDs | str | Comma-separated list (no spaces) |
| Region | str | UK region name |
| Scope3 | str | HPC system name or custom value |
| CSV | str | CSV output type |
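Because every argument is passed as a string, it can help to prepare and check the values before calling the function. The sketch below is not part of GRACE-HPC: `build_job_ids` and `check_date_range` are hypothetical helpers that build the comma-separated JobIDs string and confirm that both dates follow the YYYY-MM-DD format from the table above. Whether `gracehpc_run` performs similar checks itself is not stated here, so treat this as an optional safeguard.

```python
from datetime import datetime


def build_job_ids(job_ids):
    """Join job IDs into a comma-separated string with no spaces (hypothetical helper)."""
    return ",".join(str(job_id).strip() for job_id in job_ids)


def check_date_range(start_date, end_date):
    """Return both dates unchanged if they are YYYY-MM-DD and correctly ordered (hypothetical helper)."""
    parsed = []
    for value in (start_date, end_date):
        # strptime raises ValueError on a wrong layout; the round trip
        # additionally rejects unpadded forms such as "2025-1-1"
        day = datetime.strptime(value, "%Y-%m-%d")
        if day.strftime("%Y-%m-%d") != value:
            raise ValueError(f"Expected YYYY-MM-DD, got {value!r}")
        parsed.append(day)
    if parsed[0] > parsed[1]:
        raise ValueError("StartDate must not be later than EndDate")
    return start_date, end_date


job_ids = build_job_ids([12345, 67890])
start, end = check_date_range("2025-06-01", "2025-07-25")
```

The resulting `job_ids`, `start`, and `end` strings can then be passed as the JobIDs, StartDate, and EndDate arguments.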
Run the Engine
Call gracehpc_run to run the full engine from a Python script or notebook instead of the command-line interface. The function returns three pandas DataFrames:
full_df, daily_df, total_df = gracehpc_run(
StartDate="2025-01-01",
EndDate="2026-01-01",
Region="South West England",
JobIDs="id1234,id5678",
Scope3="Isambard3",
CSV="no_save"
)
Example function calls:
>>> gracehpc_run(StartDate="2025-06-01", EndDate="2025-07-25", JobIDs="12345,67890", Region="South West England", Scope3="IsambardAI", CSV="all")
>>> gracehpc_run(StartDate="2025-01-01", EndDate="2025-08-01", Region="South West England", Scope3="Isambard3", CSV="full_summary")
>>> gracehpc_run(StartDate="2025-07-07", EndDate="2025-07-16", JobIDs="id1245", Region="London", Scope3="51", CSV="no_save")
For more information on the required arguments, run:
help(gracehpc_run)
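One way to automate repeated analyses is to generate the date windows in code and call the function once per window. The sketch below uses a hypothetical helper, `month_windows`, that is not part of GRACE-HPC. It mirrors the full-year example above by ending each window on the first day of the following month. Whether EndDate is treated as inclusive is not stated here, so adjust the end dates if jobs on a boundary day would otherwise be counted twice.

```python
from datetime import date


def month_windows(year):
    """Return one (StartDate, EndDate) string pair per calendar month (hypothetical helper)."""
    windows = []
    for month in range(1, 13):
        start = date(year, month, 1)
        # First day of the next month, rolling over into the next year after December
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        windows.append((start.isoformat(), end.isoformat()))
    return windows


windows = month_windows(2025)

# Possible usage, assuming gracehpc is installed and hpc_config.yaml is in place:
# from gracehpc import gracehpc_run
# for start, end in windows:
#     full_df, daily_df, total_df = gracehpc_run(
#         StartDate=start, EndDate=end,
#         Region="South West England", Scope3="Isambard3", CSV="no_save",
#     )
```

Generating the windows separately keeps the date arithmetic easy to check before any jobs are queried.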
Output
The output is the same as that printed to the terminal by the CLI; see here for an example.
Function Returns
The results are returned as three pandas DataFrames that you can explore further after running the tool. Refer to the Output Data section for details on the data included and the corresponding column names.
| DataFrame | Description |
|---|---|
| full_df | Full job-level dataset: one row per job. |
| daily_df | Daily aggregated dataset: one row per day. |
| total_df | Total summary dataset: one row aggregating all jobs. |
You can now filter, explore, or process these DataFrames further. See some examples below:
import pandas as pd

# Access the energy consumed by CPUs for each job
cpu_energy = full_df["CPU_energy_estimated_kwh"]
# Filter daily_df to extract data for a specific submission date
# (first convert the SubmissionDate column to datetime format)
daily_df["SubmissionDate"] = pd.to_datetime(daily_df["SubmissionDate"])
filtered_daily = daily_df[daily_df["SubmissionDate"] == pd.to_datetime("2025-06-16")]
# Find jobs in full_df with a specific job name
ai_job = full_df[full_df["NameofJob"] == "AI_benchmark"]
# Select only completed jobs (StateCode == 1)
completed_jobs = full_df[full_df["StateCode"] == 1]