Define a "production" run class in the API to allow import of new pipeline data #69

@angelicalmcgowan

Issue description

Before we can pull runs from the new pipeline to the VIEWS API, we must define and prepare for a production class of runs in the API. Keeping the fatalities002 class name for the new runs is unfortunately not possible due to the discontinuation of surrogate models, which are part of the definition of the fatalities002 class.

Defining the new class is not as simple as modifying the data transfer scripts (available here).

How to pull new data into the API

To pull data into the API, one:

  1. runs the get_views3 script, which fetches the runs from the storage system and packs them into a pgm and a cm data frame, each stored as a local file. This script must be executed on the VPN, so it has to be run from one's laptop.

  2. scp's the two files from the laptop to the API box, using a certificate to gain access.

  3. ssh's into the API box and runs the register_views3 script, which opens the cm and pgm files and ingests them into the API box's private database.
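The copy and registration steps (2 and 3) can be sketched roughly as below. This is only an illustration: the host name, key path, remote directory, file names, and the command used to start register_views3 are all hypothetical placeholders, and it assumes the "certificate" is an identity file passed to scp and ssh with `-i`. The helper only builds the commands; it does not run them.

```python
import shlex
from typing import List, Sequence


def build_transfer_commands(
    local_files: Sequence[str],
    host: str,
    key_path: str,
    remote_dir: str,
    register_cmd: str,
) -> List[List[str]]:
    """Build (but do not execute) the scp and ssh commands for steps 2 and 3.

    All arguments are placeholders supplied by the caller; nothing here
    reflects the real API box configuration.
    """
    # Step 2: copy the pgm and cm files to the API box.
    scp_cmd = ["scp", "-i", key_path, *local_files, f"{host}:{remote_dir}/"]
    # Step 3: run the registration script remotely over ssh.
    ssh_cmd = ["ssh", "-i", key_path, host, register_cmd]
    return [scp_cmd, ssh_cmd]


if __name__ == "__main__":
    commands = build_transfer_commands(
        local_files=["pgm_file", "cm_file"],      # hypothetical file names
        host="user@api-box",                      # hypothetical host
        key_path="path/to/key",                   # hypothetical identity file
        remote_dir="/path/on/api/box",            # hypothetical directory
        register_cmd="<command that runs register_views3>",  # placeholder
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Step 1 is left out on purpose, since the way get_views3 is invoked is not described here.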

Creating a new class of runs requires editing both of those scripts and, before that, manually editing a set of three or four linked SQL tables on the API box, which define the currently allowed run names, the constituent models corresponding to each run, etc. This is, sadly, very fiddly.

Run class naming convention

The new API run naming convention, agreed upon by Jim, Simon, Dylan, and Angelica in late 2024, is production_YYYY_MM, where production becomes the name of the new model and YYYY_MM is appended to the dataset name for each run from that model. YYYY_MM refers to the year and month of production (note that this differs from the current convention, where the date refers to the EndOfHistory).
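As a minimal sketch of the convention, the helpers below build and validate names of the form production_YYYY_MM. The function names `run_name` and `parse_run_name` are made up for illustration and are not part of the existing transfer scripts; how the scripts actually construct dataset names is not shown in this issue.

```python
from __future__ import annotations

import re
from datetime import date

RUN_CLASS = "production"

# production_YYYY_MM, with MM restricted to 01..12
_RUN_NAME_RE = re.compile(r"production_(\d{4})_(0[1-9]|1[0-2])")


def run_name(production_date: date) -> str:
    """Return the run name for the year and month of production."""
    return f"{RUN_CLASS}_{production_date.year:04d}_{production_date.month:02d}"


def parse_run_name(name: str) -> tuple[int, int]:
    """Return (year, month) from a run name, or raise ValueError."""
    match = _RUN_NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid production run name: {name!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(run_name(date(2025, 1, 15)))
    print(parse_run_name("production_2024_12"))
```

Note that the date passed in is the production date, not the EndOfHistory month used by the current convention.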

Implications

Pulling data to the VIEWS API through a new run class comes with implications for the dashboard. The dashboard uses a string-based filter to select which datasets it should import from the VIEWS API. It currently searches for fatalities001 and fatalities002. In order for the new runs to be imported into and displayed on the dashboard, we must add the name of the new run class to the dashboard code. This will likely require some hours from the dashboard consultant Henrik. This step is addressed in a separate issue (#70).
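To illustrate why the dashboard needs a code change, here is a rough sketch of a string-based dataset filter. It assumes the filter is a simple substring match against the dataset name; the actual dashboard code, its language, and its matching rule are not shown in this issue, and the dataset names in the example are invented.

```python
from typing import Iterable, List, Sequence

# Run classes the dashboard currently searches for, per this issue.
CURRENT_RUN_CLASSES = ("fatalities001", "fatalities002")
# The same list with the new class added.
UPDATED_RUN_CLASSES = CURRENT_RUN_CLASSES + ("production",)


def select_datasets(names: Iterable[str], run_classes: Sequence[str]) -> List[str]:
    """Keep dataset names that contain any of the given run class strings."""
    return [name for name in names if any(rc in name for rc in run_classes)]


if __name__ == "__main__":
    # Hypothetical dataset names, for illustration only.
    available = ["fatalities002_example_run", "production_2025_01"]
    print(select_datasets(available, CURRENT_RUN_CLASSES))
    print(select_datasets(available, UPDATED_RUN_CLASSES))
```

With the current list, the production dataset is silently dropped; it only appears once the new class name is added to the filter.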

Tasks

  • Define a production run class in the VIEWS API, incl. updates to the linked SQL tables
  • Update the data transfer scripts to facilitate import of datasets from the production run class into the API
