Grid Creation
How to run this?
Go here to instantly run all code below inside of your browser.
Use Case
This is a more advanced use case. If you have little coding experience we recommend following the "Basics" and "Speed-Fuel curve" use cases first.
When employing a Toqua Ship Kernel in a high-intensity routing application, millions of sequential predictions may need to happen within a few minutes. In these cases, calling a Web API like ours will introduce too much latency. To address this issue, a simpler version of our model can be created that can be stored and queried locally. This simplified model is a multi-dimensional grid of precomputed predictions, between which values can be interpolated.
This notebook will explain what such a grid looks like and demonstrate how to create one using Toqua's API.
Feel free to run and experiment in this notebook. Any changes done will not be permanent. Ensure that you have a Toqua API Key that is authorized to use the grid creation endpoints.
Setup
Fill in your Toqua API key and the IMO number of your ship below.
API_KEY = "your-api-key"
IMO_NUMBER = "9999999"
Helper functions
A helper function that keeps the code below uncluttered.
import io
import json
import time
import zipfile
from time import sleep
from typing import Dict

import pandas as pd
import requests


def make_api_call(method, url, payload=None, return_json=True):
    """Send a request to the Toqua API and return the parsed JSON or the raw response."""
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "X-API-Key": API_KEY,
    }
    if method == 'GET':
        response = requests.get(url, headers=headers)
    elif method == 'POST':
        response = requests.post(url, json=payload, headers=headers)
    else:
        raise ValueError(f"Invalid method: {method}")
    # raise an exception on 4xx/5xx responses
    response.raise_for_status()
    if return_json:
        return response.json()
    return response
What is a grid?
First of all, let's make it clear what we mean by "multi-dimensional grid" and how this can represent a Toqua Ship Kernel. It sounds more complicated than it is.
Toy example
Let's say we have a very simple Ship Kernel model. This model has only 2 input parameters: STW and Mean Draft, and can predict a single output parameter: Main Engine Power.
We can represent this model as a 2-dimensional grid where each dimension corresponds to an input parameter and the cells of the grid contain the output parameter. To make this concrete, let's look at the following grid, shown as a simple table:
| Mean Draft [m] \ STW [kn] | 8 | 10 | 12 | 14 |
|---|---|---|---|---|
| 10 | 2000 | 4000 | 7000 | 12000 |
| 16 | 3000 | 5000 | 8000 | 13000 |
Here we have the two input parameters, STW and Mean Draft, corresponding to the horizontal and vertical dimension respectively. As we have 4 different STW values and 2 different Mean Draft values, we have a total of 4 x 2 = 8 input combinations. Feeding these into our simple model results in 8 output values, shown as the value in each cell.
This 2-dimensional grid can approximate an actual model's predictions by interpolating between the values. For example, to get the predicted Main Engine Power at Mean Draft=10 and STW=9, we can linearly interpolate between the Main Engine Power at STW=8 (2000 kW) and STW=10 (4000 kW) for Mean Draft=10. This results in a predicted Main Engine Power of 3000 kW.
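The interpolation described above can be sketched in a few lines. This is only an illustration on the toy table: the names `STW_AXIS`, `POWER_BY_DRAFT` and `interpolate_power` are made up for this example, and it interpolates along the STW axis only, for a Mean Draft that is present in the table.

```python
from bisect import bisect_right

# Toy grid from the table above: Main Engine Power [kW] per (Mean Draft, STW).
STW_AXIS = [8, 10, 12, 14]
POWER_BY_DRAFT = {
    10: [2000, 4000, 7000, 12000],
    16: [3000, 5000, 8000, 13000],
}


def interpolate_power(stw: float, mean_draft: int) -> float:
    """Linearly interpolate Main Engine Power along the STW axis."""
    if not STW_AXIS[0] <= stw <= STW_AXIS[-1]:
        raise ValueError("stw lies outside the grid; extrapolation is not handled")
    powers = POWER_BY_DRAFT[mean_draft]
    # Index of the first grid point to the right of `stw`, clamped to the last one.
    hi = min(bisect_right(STW_AXIS, stw), len(STW_AXIS) - 1)
    lo = hi - 1
    fraction = (stw - STW_AXIS[lo]) / (STW_AXIS[hi] - STW_AXIS[lo])
    return powers[lo] + fraction * (powers[hi] - powers[lo])


print(interpolate_power(9, 10))  # 3000.0
```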
Grid Accuracy
A grid will of course be less accurate than the actual model. Fortunately, we can control the accuracy penalty by choosing more granular dimensions, e.g. choosing STW values of `[8, 9, 10, 11, 12, 13, 14]` would already make this grid more accurate.
We can (and should!) also add more dimensions. Other than STW and Draft, any model input parameter can be used as an additional dimension. However, each new dimension multiplies the grid size by its number of values, so the grid grows exponentially with the number of dimensions.
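To get a feel for how granularity affects accuracy, the sketch below compares the two STW axes mentioned above. It does not use a real Ship Kernel: `cubic_power` is a hypothetical stand-in model (power proportional to the cube of speed, with an arbitrary constant), chosen only because it is curved enough to make the interpolation error visible.

```python
def cubic_power(stw: float) -> float:
    """Hypothetical stand-in for a model: power grows with the cube of speed."""
    return 4.0 * stw ** 3


def interpolate(axis, values, x):
    """Piecewise-linear interpolation of `values` sampled on a sorted `axis`."""
    for lo in range(len(axis) - 1):
        if axis[lo] <= x <= axis[lo + 1]:
            fraction = (x - axis[lo]) / (axis[lo + 1] - axis[lo])
            return values[lo] + fraction * (values[lo + 1] - values[lo])
    raise ValueError("x lies outside the axis")


def max_abs_error(axis):
    """Largest interpolation error over STW 8.0 .. 14.0 in steps of 0.1 kn."""
    values = [cubic_power(stw) for stw in axis]
    queries = [8 + i / 10 for i in range(61)]
    return max(abs(interpolate(axis, values, q) - cubic_power(q)) for q in queries)


coarse = max_abs_error([8, 10, 12, 14])
fine = max_abs_error([8, 9, 10, 11, 12, 13, 14])
print(f"coarse grid max error: {coarse:.1f}, fine grid max error: {fine:.1f}")
```

For this stand-in curve, halving the spacing cuts the worst-case error to roughly a quarter, which is the usual behaviour of linear interpolation on a smooth function. How much a real grid gains depends on the ship model itself.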
Creating a grid
A grid can be created using the Toqua API by providing a "grid specification".
Grid Specification
A grid specification looks a lot like a normal model prediction: it is a dictionary of input parameters mapped to a list of numbers. This list contains the values that are used along the dimension of the grid corresponding to the input parameter.
Our earlier example's grid specification would look as follows:
{
    "STW": [8, 10, 12, 14],
    "Mean Draft": [10, 16]
}
The actual grid is then constructed by combining every value of one parameter with every value of the others, resulting in 4 x 2 = 8 total combinations, as shown earlier in the 2-d table.
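As a minimal sketch of that expansion, the toy specification can be turned into its list of input combinations with `itertools.product`. This mirrors the idea only; how the API builds the grid internally is not described here, and the variable names are illustrative.

```python
from itertools import product

toy_specification = {
    "STW": [8, 10, 12, 14],
    "Mean Draft": [10, 16],
}

names = list(toy_specification)
# One dict per grid point: every STW value paired with every Mean Draft value.
combinations = [
    dict(zip(names, values))
    for values in product(*toy_specification.values())
]

print(len(combinations))  # 8
print(combinations[0])    # {'STW': 8, 'Mean Draft': 10}
```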
Below we've created a grid specification that extends our toy example by including weather parameters and `fuel_specific_energy` as additional dimensions, using the API's parameter names. We will continue with this grid specification throughout this guide.
grid_specification = {
    "sog": [8, 10, 12, 14],            # [kn]
    "draft_avg": [10, 16],             # [m]
    "wave_direction": [0, 90],         # [deg]
    "wave_height": [0.0, 2.0, 4.0],    # [m]
    "current_speed": [0.0],            # [m/s]
    "current_direction": [0.0],        # [deg]
    "wind_speed": [0.0, 10.0, 20.0],   # [m/s]
    "wind_direction": [0.0, 90.0],     # [deg]
    "fuel_specific_energy": [41.5],    # [MJ/kg]
}
The total number of combinations, and therefore the grid size, is 288. This is calculated by multiplying the lengths of all lists: 4 x 2 x 2 x 3 x 1 x 1 x 3 x 2 x 1 = 288.
Note that the maximum allowed grid size may be capped.
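Before sending a request, it can be handy to compute the grid size locally. The helper below is a small sketch (`grid_size` is a name chosen for this example, not part of the API) that multiplies the list lengths without enumerating the combinations.

```python
from math import prod

grid_specification = {
    "sog": [8, 10, 12, 14],
    "draft_avg": [10, 16],
    "wave_direction": [0, 90],
    "wave_height": [0.0, 2.0, 4.0],
    "current_speed": [0.0],
    "current_direction": [0.0],
    "wind_speed": [0.0, 10.0, 20.0],
    "wind_direction": [0.0, 90.0],
    "fuel_specific_energy": [41.5],
}


def grid_size(specification: dict) -> int:
    """Number of grid points: the product of the lengths of all value lists."""
    return prod(len(values) for values in specification.values())


print(grid_size(grid_specification))  # 288

# Adding a single extra draft value multiplies the size by 3/2.
finer = dict(grid_specification, draft_avg=[10, 13, 16])
print(grid_size(finer))  # 432
```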
Grid creation request
A grid creation request can be provided to our API as follows. It contains our grid specification, together with an optional text description of the grid and a key `multi_grid` which can be set to `True` or `False`.
grid_request = {
    "description": "My first grid",
    "multi_grid": False,
    "grid": grid_specification
}
The `multi_grid` key
So far, we've made the assumption that our simple model only predicts a single parameter, Main Engine Power. However, a Toqua Ship Kernel is able to predict additional parameters: STW, Main Engine RPM, Main Engine Fuel Consumption, ... These additional outputs will simply be additional dimensions in our grid.
As mentioned earlier, a grid can quickly get pretty large. Because not all outputs are affected by all input parameters, we implemented an optimization that can be toggled on by setting the key `multi_grid` to `True`. When toggled, the grid will be split up into three smaller grids, each one containing a subset of the output parameters. Together they cover the whole space that would be covered by the original grid, even though the total size of all three grids combined can be more than 500x smaller.
Specifically, this is how the three tables are created:
- The provided `sog` values will control the `sog` inputs for the first table.
- The second table's `stw` input will have the same range as `sog`, but with its lower and upper bounds extended by +/- the maximum given `current_speed`.
- The third table's `me_power` will range from 0 to the ship's MCR.
For simplicity, we've disabled this option.
Creating a grid
Depending on the grid size, a grid can take our models a while to create. That's why we've implemented an asynchronous flow for creating grids.
What this means is that if a grid creation request is sent to our API it will check whether the request is valid, start a grid creation process (called a "job") and immediately return some information about the job. It doesn't immediately return the grid itself. Rather, in the background the job we started will ensure the grid is created at some point in the future. Once this job is completed the grid may be retrieved.
A grid creation job can be started by a POST request on the `https://api.toqua.ai/ships/{imo_number}/models/latest/grid` endpoint.
def create_grid(imo_number, payload):
    url = f"https://api.toqua.ai/ships/{imo_number}/models/latest/grid"
    return make_api_call('POST', url, payload)


# start a new grid creation job
response = create_grid(IMO_NUMBER, grid_request)
print(json.dumps(response, indent=4))
{
    "job_id": "ccb6c324-120e-4952-8510-8f02ab470242",
    "creation_date": "2023-05-09T17:51:18.580732+00:00",
    "grid_size": 288,
    "specification": {
        "description": "My first grid",
        "date": null,
        "multi_grid": false,
        "grid": {
            "sog": [
                8.0,
                10.0,
                12.0,
                14.0
            ],
            "draft_avg": [
                10.0,
                16.0
            ],
            "trim": null,
            "wave_direction": [
                0.0,
                90.0
            ],
            "wave_height": [
                0.0,
                2.0,
                4.0
            ],
            "wave_period": null,
            "current_speed": [
                0.0
            ],
            "current_direction": [
                0.0
            ],
            "wind_direction": [
                0.0,
                90.0
            ],
            "wind_speed": [
                0.0,
                10.0,
                20.0
            ],
            "sea_surface_temperature": null,
            "sea_surface_salinity": null,
            "fuel_specific_energy": [
                41.5
            ]
        }
    },
    "status": "queued",
    "completion_date": null,
    "estimated_completion_date": null
}
Waiting for a grid creation job to finish
To check how our job is progressing, we can retrieve its status. A grid creation job has four possible statuses: `queued`, `in_progress`, `completed` and `error`.
- `queued`: the job is waiting to be started
- `in_progress`: the job has been started and the grid is being created
- `completed`: the job is finished and the grid may be retrieved
- `error`: an error occurred during the creation of the grid
The status can be retrieved by a GET request on the `https://api.toqua.ai/ships/{imo_number}/models/latest/grid/{job_id}/status` endpoint. The `job_id` can be retrieved from the response to our grid creation request.
For larger grids a job can take a while. Below, we've implemented a polling loop that checks the job status every 5 seconds. In case the job takes longer than 10 minutes, we time out.
def get_grid_status(imo_number, job_id):
    url = f"https://api.toqua.ai/ships/{imo_number}/models/latest/grid/{job_id}/status"
    return make_api_call('GET', url)


# get job id
job_id = response['job_id']

# set a time out of 10 min to avoid waiting forever
max_timeout_in_seconds = 60 * 10
timed_out = False
job_in_progress = True
start_time = time.time()

# stop once timed out or job is finished
while job_in_progress and not timed_out:
    response = get_grid_status(IMO_NUMBER, job_id)
    status = response["status"]
    job_in_progress = status in ("queued", "in_progress")
    timed_out = (time.time() - start_time) > max_timeout_in_seconds
    print(f"Job status: {status}...")
    if job_in_progress and not timed_out:
        # poll job status every 5 seconds
        sleep(5)

# the loop only exits with an unfinished job when the time out was hit
if job_in_progress:
    raise TimeoutError(f"We've timed out after {max_timeout_in_seconds} seconds!")
if status == "error":
    raise RuntimeError("Something went wrong during creation of the grid!")

# job finished successfully!
print(f"Job {status}. Creation date: {response['creation_date']}, completion date: {response['completion_date']}")
Job status: completed...
Job completed. Creation date: 2023-05-09T17:51:18.580732+00:00, completion date: 2023-05-09T17:51:19.763026
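If you need the same waiting logic in several places, it can be wrapped in a reusable function. The sketch below is one possible shape, not part of the Toqua API: `wait_for_job` is a hypothetical helper that takes any zero-argument callable returning a status dictionary, which also makes it easy to try out offline with canned replies.

```python
import time
from typing import Callable, Dict


def wait_for_job(
    get_status: Callable[[], Dict],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
) -> Dict:
    """Poll `get_status` until the job is completed, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = get_status()
        status = job["status"]
        if status == "completed":
            return job
        if status == "error":
            raise RuntimeError("Something went wrong during creation of the grid!")
        if status not in ("queued", "in_progress"):
            raise RuntimeError(f"Unexpected job status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job not finished after {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for the API so the sketch runs offline: a canned sequence of replies.
fake_replies = iter([
    {"status": "queued"},
    {"status": "in_progress"},
    {"status": "completed"},
])
job = wait_for_job(lambda: next(fake_replies), poll_interval=0.01)
print(job["status"])  # completed
```

In this notebook, the call would look like `wait_for_job(lambda: get_grid_status(IMO_NUMBER, job_id))`.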
Retrieving the grid
As the job status is set to `completed`, we can retrieve our grid.
The grid can be retrieved by a GET request to `https://api.toqua.ai/ships/{imo_number}/models/latest/grid/{job_id}`.
A ZIP file will be returned. We will write this file to disk so it's easy to manually inspect its contents. The file will contain one .csv file for each grid and a `metadata.json` file describing the file's contents.
def get_grid(imo_number, job_id):
    url = f"https://api.toqua.ai/ships/{imo_number}/models/latest/grid/{job_id}"
    return make_api_call('GET', url, return_json=False)


# get the zip file
response = get_grid(IMO_NUMBER, job_id)

# write the zip file to disk
print("Creating .zip file...")
file_path = job_id + ".zip"
with open(file_path, "wb") as f:
    f.write(response.content)
print(f"Zip file written to {file_path}")
Creating .zip file...
Zip file written to ccb6c324-120e-4952-8510-8f02ab470242.zip
Using the `metadata.json` file
Manually downloading, unzipping and extracting .zip files is not very practical. To automate this process, a standardized `metadata.json` file is present which describes the contents of the zip file. It is a simple file whose schema is specified as an OpenAPI component in https://api.toqua.ai/openapi.json#/components/schemas/ModelGridFilesDescription.
Let's see which files are present in the .zip file:
# read the zip file
zip = zipfile.ZipFile(io.BytesIO(response.content))
# list the files in the zip
file_names = [file_info.filename for file_info in zip.filelist]
print(f"Files in zip: {file_names}")
Files in zip: ['metadata.json', 'end-to-end.csv']
Each of the .csv files corresponds to a grid. Let's inspect the `metadata.json` file.
metadata = json.loads(zip.read("metadata.json"))
print(f"Keys present in metadata: {list(metadata.keys())}")
Keys present in metadata: ['grids', 'job']
The `job` key is a copy of the job specification that was used to create this grid. We won't print it here as it is quite long.
The `grids` key is interesting, as it contains the description of each grid that is present.
grid_descriptions = metadata['grids']
print(f"{len(grid_descriptions)} grid files present")
print(json.dumps(grid_descriptions, indent=4))
1 grid files present
[
    {
        "entrypoint": "sog",
        "targets": [
            "me_fo_consumption",
            "stw",
            "me_rpm",
            "me_power"
        ],
        "inputs": [
            "draft_avg",
            "wave_direction",
            "wave_height",
            "current_speed",
            "current_direction",
            "wind_speed",
            "wind_direction",
            "ship_heading",
            "fuel_specific_energy",
            "sog"
        ],
        "filename": "end-to-end.csv",
        "grid_type": "end-to-end"
    }
]
For each grid present, the `grids` array has one element in the following form:
{
    "entrypoint": string,
    "targets": [string, ...],
    "inputs": [string, ...],
    "filename": string,
    "grid_type": ["sog-stw" or "stw-me_rpm-me_power" or "me_power-me_fo_consumption" or "end-to-end"]
}
- `entrypoint`, `targets` and `inputs` describe the input and output data of the grid. Each string corresponds to a column found in the grid's .csv file.
- `filename` is the filename of the grid in the .zip file.
- `grid_type` describes which kind of grid it is, and can only be one of the 4 types shown below. This element is crucial, as it can be used to automatically find the correct filename, entrypoint, targets and inputs.
  - `sog-stw`: a grid that predicts the `stw` starting from `sog`
  - `stw-me_rpm-me_power`: a grid that predicts the `me_power` and `me_rpm` starting from `stw`
  - `me_power-me_fo_consumption`: a grid that predicts the `me_fo_consumption` starting from the `me_power`
  - `end-to-end`: a grid that predicts the `stw`, `me_rpm`, `me_power` and `me_fo_consumption` starting from `sog`. This grid is retrieved when `multi_grid=false`.
As a small example, let's use the `grid_type` to fetch the correct `.csv` file, load it into a pandas DataFrame and print it out.
def find_filename_of_grid_type(metadata: Dict, grid_type: str):
    for g in metadata["grids"]:
        if g["grid_type"] == grid_type:
            return g["filename"]
    raise Exception(f"Grid type `{grid_type}` not found!")


# try to find the filename of the end-to-end grid
grid_type = "end-to-end"
grid_filename = find_filename_of_grid_type(metadata, grid_type)
print(f"Grid type `{grid_type}` has filename `{grid_filename}`")

# extract the grid file from the zip
grid_file = zip.read(grid_filename)

# read the grid file into a pandas dataframe
df = pd.read_csv(io.BytesIO(grid_file))
print(df.head())
print()
Grid type `end-to-end` has filename `end-to-end.csv`
   draft_avg  wave_direction  wave_height  current_speed  current_direction  \
0       10.0             0.0          0.0            0.0                0.0
1       10.0             0.0          0.0            0.0                0.0
2       10.0             0.0          0.0            0.0                0.0
3       10.0             0.0          0.0            0.0                0.0
4       10.0             0.0          0.0            0.0                0.0

   wind_speed  wind_direction  ship_heading  fuel_specific_energy  sog  \
0         0.0             0.0             0                  41.5  8.0
1         0.0            90.0             0                  41.5  8.0
2        10.0             0.0             0                  41.5  8.0
3        10.0            90.0             0                  41.5  8.0
4        20.0             0.0             0                  41.5  8.0

   me_fo_consumption  stw     me_rpm     me_power  limit_exceeded
0          10.960092  8.0  32.253263  2328.710170             NaN
1          10.960092  8.0  32.253263  2328.710170             NaN
2          13.191427  8.0  31.787412  2833.959285             NaN
3          10.960092  8.0  31.440519  2328.710170             NaN
4          21.716698  8.0  37.100321  4852.612086             NaN
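Once the grid is loaded, it can be queried locally by interpolating between its points, as motivated in the introduction. The sketch below shows one way to do multilinear interpolation in plain Python. It is an illustration under assumptions, not Toqua's reference implementation: the function names are invented for this example, the grid is assumed to be complete (one row per combination), and the sample rows reuse the toy table from earlier rather than real predictions.

```python
from bisect import bisect_right
from itertools import product


def build_grid(rows, input_names, target):
    """Index grid rows by their input values and collect the sorted axes."""
    axes = [sorted({row[name] for row in rows}) for name in input_names]
    values = {tuple(row[name] for name in input_names): row[target] for row in rows}
    return axes, values


def bracket(axis, x):
    """Return the neighbouring grid points around x with their linear weights."""
    if len(axis) == 1:
        # Dimension with a single value: the query value is ignored.
        return [(axis[0], 1.0)]
    if not axis[0] <= x <= axis[-1]:
        raise ValueError(f"{x} lies outside the grid range {axis[0]}..{axis[-1]}")
    hi = min(bisect_right(axis, x), len(axis) - 1)
    lo = hi - 1
    fraction = (x - axis[lo]) / (axis[hi] - axis[lo])
    return [(axis[lo], 1.0 - fraction), (axis[hi], fraction)]


def interpolate(axes, values, point):
    """Multilinear interpolation: weighted sum over the surrounding corners."""
    brackets = [bracket(axes[i], point[i]) for i in range(len(axes))]
    result = 0.0
    for corner in product(*brackets):
        weight = 1.0
        for _, w in corner:
            weight *= w
        key = tuple(coordinate for coordinate, _ in corner)
        result += weight * values[key]
    return result


# Toy rows (not real predictions), shaped like records of a grid .csv file.
toy_rows = [
    {"stw": stw, "draft_avg": draft, "me_power": power}
    for draft, powers in {10: [2000, 4000, 7000, 12000],
                          16: [3000, 5000, 8000, 13000]}.items()
    for stw, power in [(8, powers[0]), (10, powers[1]),
                       (12, powers[2]), (14, powers[3])]
]
axes, values = build_grid(toy_rows, ["stw", "draft_avg"], "me_power")
print(interpolate(axes, values, (9, 13)))  # 3500.0
```

With the DataFrame from above, the rows could be obtained with `df.to_dict("records")` and the input names taken from the `inputs` list in `metadata.json`. Note that this sketch treats every input as a plain number; directional inputs such as `wind_direction` wrap around at 360 degrees, which it does not handle.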
Model limits
As can be seen in the previous output, an additional column is present: `limit_exceeded`. When predicting multiple grids, this column will be present in each grid.
When a ship's limits have been exceeded, this column will contain the name of the primary operational parameter whose limit was exceeded.
Currently, the following will trigger a `limit_exceeded` flag:
- `sog`, `stw`, `me_rpm`, `me_power` or `me_fo_consumption` below 0
- `me_power` > 90% MCR
- `me_power` < 10% MCR
- `me_rpm` > Max RPM
For example, if 90% MCR is 25000 kW and the predicted `me_power` is 25001 kW, then the column will contain `me_power`.
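When using a grid locally, you will often want to separate grid points that exceed the ship's limits from those that do not. The sketch below does this with the standard `csv` module on a made-up excerpt; the sample values are hypothetical, and it assumes that an unflagged point has an empty `limit_exceeded` cell in the .csv file (shown as NaN by pandas).

```python
import csv
import io

# Hypothetical excerpt of a grid file; the values are made up for illustration.
sample_csv = """sog,me_power,limit_exceeded
8.0,2328.7,
12.0,9100.0,
14.0,25001.0,me_power
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# An empty limit_exceeded cell is assumed to mean "within the ship's limits".
within_limits = [row for row in rows if not row["limit_exceeded"]]
exceeded = [row for row in rows if row["limit_exceeded"]]

print(len(within_limits))  # 2
print([row["limit_exceeded"] for row in exceeded])  # ['me_power']
```

With the pandas DataFrame used earlier, the equivalent filter would be `df[df["limit_exceeded"].isna()]`.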