Steps
Steps are the smallest units of work that can be executed on their own in polaris. All tasks are made up of 1 or more steps, and all steps are set up into subdirectories inside of the work directory for the component. Shared steps should reside somewhere in the work directory above (or possibly inside of) all tasks that share the steps. Steps that belong to only one task should be inside of that task’s work directory. Typically, a user will run all steps in a task but certain tasks may prefer to have steps that are not run by default (e.g. a long forward simulation or optional visualization) but which are available for a user to manually alter and then run on their own.
A step is defined by a class that descends from polaris.Step
.
The child class must override the constructor and must also either override the
polaris.Step.run()
method or define the args
attribute. It will
sometimes also wish to override the polaris.Step.setup()
method,
described below.
Step attributes
As was the case for tasks, the base class polaris.Step
has a
large number of attributes that are useful at different stages (init, setup and
run) of the step.
Some attributes are available after calling the base class’ constructor
super().__init__()
. These include:
self.name
the name of the step
self.component
The component the step belongs to
self.subdir
the subdirectory for the step within the component
self.path
the path within the base work directory of the step, made up of the name of the component and the step’s
subdir
self.ntasks
the number of parallel (MPI) tasks the step would ideally use. Too few cores are available on the system to run
ntasks * cpus_per_task
, the step will run on all available cores as long as this is not belowmin_tasks * min_cpus_per_task
self.min_tasks
the number of MPI tasks the step requires. If the system fewer than
min_tasks * min_cpus_per_task
cores, the step will failself.cpus_per_task
The number of CPUs that each task runs with, or the total number of CPUs the step would ideally run with if python threading or multiprocessing is being used, in which case
ntasks = 1
self.min_cpus_per_task
The minimum number of CPUs that each task runs with, or the minimum total number of CPUs required for the step if python threading or multiprocessing is being used, in which case
ntasks = 1
. Ifntasks > 1
,min_cpus_per_task
much be the same ascpus_per_task
.self.openmp_threads
the number of OpenMP threads the step will use
self.max_memory
An aspirational attribute that will be used in the future to indicate the amount of memory that the step is allowed to use in MB
self.cached
Whether to get all of the outputs for the step from the database of cached outputs for the component that this step belongs to
self.run_as_subprocess
Whether to run this step as a subprocess, rather than just running it directly from the task. It is useful to run a step as a subprocess if there is not a good way to redirect output to a log file (e.g. if the step calls external code that, in turn, calls additional subprocesses).
The default behavior when python code calls one of the
subprocess
functions is that the output goes tostdout
/stderr
(i.e. the terminal). When python code outside of polaris (e.g.jigsawpy
) calls asubprocess
function (e.g. to call JIGSAW), that output goes to the terminal rather than a log file. For most output tostdout
/stderr
likeprint()
statements,check_call()
in MPAS-Tools employs a “trick” to redirect that output to a log file instead. But that doesn’t work withsubprocess
calls. They continue to go to the terminal. However, if we call a given polaris step as a subprocess while redirecting its output to a log file, we can prevent unwanted output from ending up in the terminal (the “outer” subprocess call gets redirected to a log file even when the inner one does not).self.dependencies
A dictionary of steps that this step depends on (i.e. it can’t run until they have finished). Dependencies are used when the names of the files produced by the dependency aren’t known at setup (e.g. because they depend on config options or data read in from files). If the names of this step’s input files are known at setup, it is sufficient (and preferable) to indicate that an output file from another step is an input of this step to establish a dependency.
self.is_dependency
Whether this step is the dependency of one or more other steps
self.args
A list of command-line arguments to call in parallel. This attribute should be defined as an alternative to overriding the
run()
.args
should not include calls to a parallel executable likesrun
or the associated flags for number of MPI tasks, nodes, etc., since these will be added internally by Polaris.
Another set of attributes is not useful until setup()
is called by the
polaris framework:
self.config
Configuration options for this task, a combination of the defaults for the machine, core and configuration
self.config_filename
The local name of the config file that
config
has been written to during setup and read from during runself.work_dir
The step’s work directory, defined during setup as the combination of
base_work_dir
andpath
self.base_work_dir
The base work directory
These can be used to add additional input, output, namelist or streams files based on config options that were not available during init, or which rely on knowing the work directory.
Finally, a few attributes are available only when run()
gets called by the
framework:
self.inputs
a list of absolute paths of input files produced as part of setting up the step. These input files must all exist at run time or the step will raise an exception
self.outputs
a list of absolute paths of output files produced by this step and available as inputs to other tasks and steps. These files must exist after the test has run or an exception will be raised
self.logger
A logger for output from the step. This gets passed on to other methods and functions that use the logger to write their output to the log file.
self.log_filename
The name of a log file where output/errors from the step are being logged, or
None
if output is to stdout/stderrself.machine_info
Information about E3SM supported machines
The inputs and outputs should not be altered but they may be used to get file names to read or write.
Some attributes are also used by the framework to validate variables in output files against a baseline in one is provided:
self.baseline_dir
Location of the same step within the baseline work directory, for use in comparing variables and timers
self.validate_vars
A list of variables for each output file for which a baseline comparison should be performed if a baseline run has been provided. The baseline validation is performed after the step has run.
You can add other attributes to the child class that keeps track of information that the step will need.
As an example,
polaris.landice.tasks.dome.setup_mesh.SetupMesh
keeps track of the
mesh type as an attribute:
from polaris import Step
class SetupMesh(Step):
"""
A step for creating a mesh and initial condition for dome tasks
Attributes
----------
mesh_type : str
The resolution or mesh type of the task
"""
def __init__(self, component, mesh_type):
"""
Update the dictionary of step properties
Parameters
----------
component : polaris.Component
The component this step belongs to
mesh_type : str
The resolution or mesh type of the task
"""
super().__init__(component=component, name='setup_mesh')
self.mesh_type = mesh_type
if mesh_type == 'variable_resolution':
# download and link the mesh
# the empty database is a trick for downloading to the root of
# the local MALI file cache
self.add_input_file(filename='mpas_grid.nc',
target='dome_varres_grid.nc', database='')
self.add_output_file(filename='landice_grid.nc')
constructor
The step’s constructor (__init__()
method) should call the base class’
constructor with super().__init__()
, passing the name of the step, the
task it belongs to, and possibly several optional arguments: the
subdirectory for the step (if not the same as the name), number of MPI tasks,
the minimum number of MPI tasks, the number of CPUs per task, the minimum
number of CPUs per task, the number of OpenMP threads, and (currently as
placeholders) the amount of memory the step is allowed to use.
Then, the step can add inputs and outputs as well as Adding yaml, namelist and streams files, as described below.
As with the task’s constructor, it is important that the step’s constructor doesn’t perform any time-consuming calculations, download files, or otherwise use significant resources because this function is called quite often for every single task and step: when tasks are listed, set up, or cleaned up, and also when suites are set up or cleaned up. However, it is okay to add input, output, streams and namelist files to the step by calling any of the following methods:
Each of these functions just caches information about the the inputs, outputs, namelists, streams or YAML files to be read later if the task in question gets set up, so each takes a negligible amount of time.
If this is a shared step with its own config options, it is also okay to call
mpas_tools.config.MpasConfigParser.add_from_package()
from the
constructor.
The following is from
polaris.ocean.tasks.baroclinic_channel.forward.Forward()
:
from polaris.ocean.model import OceanModelStep
class Forward(OceanModelStep):
"""
A step for performing forward ocean component runs as part of baroclinic
channel tasks.
Attributes
----------
resolution : float
The resolution of the task in km
dt : float
The model time step in seconds
btr_dt : float
The model barotropic time step in seconds
run_time_steps : int or None
Number of time steps to run for
"""
def __init__(self, component, resolution, name='forward', subdir=None,
indir=None, ntasks=None, min_tasks=None, openmp_threads=1,
nu=None, run_time_steps=None):
"""
Create a new task
Parameters
----------
component : polaris.Component
The component the step belongs to
resolution : km
The resolution of the task in km
name : str
the name of the task
subdir : str, optional
the subdirectory for the step. If neither this nor ``indir``
are provided, the directory is the ``name``
indir : str, optional
the directory the step is in, to which ``name`` will be appended
ntasks : int, optional
the number of tasks the step would ideally use. If fewer tasks
are available on the system, the step will run on all available
tasks as long as this is not below ``min_tasks``
min_tasks : int, optional
the number of tasks the step requires. If the system has fewer
than this number of tasks, the step will fail
openmp_threads : int, optional
the number of OpenMP threads the step will use
nu : float, optional
the viscosity (if different from the default for baroclinic channel
tests)
run_time_steps : int, optional
Number of time steps to run for
"""
self.resolution = resolution
self.run_time_steps = run_time_steps
super().__init__(component=component, name=name, subdir=subdir,
indir=indir, ntasks=ntasks, min_tasks=min_tasks,
openmp_threads=openmp_threads)
# make sure output is double precision
self.add_yaml_file('polaris.ocean.config', 'output.yaml')
self.add_input_file(filename='initial_state.nc',
target='../../init/initial_state.nc')
self.add_yaml_file('polaris.ocean.tasks.baroclinic_channel',
'forward.yaml')
if nu is not None:
# update the viscosity to the requested value *after* loading
# forward.yaml
self.add_model_config_options(options=dict(config_mom_del2=nu))
self.add_output_file(
filename='output.nc',
validate_vars=['temperature', 'salinity', 'layerThickness',
'normalVelocity'])
self.dt = None
self.btr_dt = None
Several parameters are passed into the constructor (with defaults if they
are not included) and then passed on to the base class’ constructor: name
,
subdir
, indir
, ntasks
, min_tasks
, cpus_per_task
,
min_cpus_per_task
, and openmp_threads
. Additional parameters nu
and
run_time_steps
are used to determine settings for running the model.
Then, two yaml files with modifications to the model config options are added
(for later processing). An additional model config option, config_mom_del2
is set manually via a python dictionary of namelist options.
Additionally, two input and one output file are added. By providing
validate_vars
to polaris.Step.add_output_file()
, validation of
these 4 variables in the output.nc
output file will automatically be
performed after the step has run if a baseline was provided as part of the call
to polaris setup.
setup()
The setup()
method is called when a user is setting up the step either
as part of a call to polaris setup or polaris suite.
As in constructor, you can add input, output, streams and namelist
files to the step by calling any of the following methods:
Set up should not do any major computations or any time-consuming operations
other than downloading files. Time-consuming work should be saved for
run()
whenever possible.
As an example, here is
polaris.ocean.tasks.global_ocean.mesh.mesh.MeshStep.setup()
:
def setup(self):
"""
Set up the task in the work directory, including downloading any
dependencies.
"""
# get the these properties from the config options
config = self.config
self.cpus_per_task = config.getint('global_ocean',
'mesh_cpus_per_task')
self.min_cpus_per_task = config.getint('global_ocean',
'mesh_min_cpus_per_task')
Some parts of the mesh computation (creating masks for culling) are done using
python multiprocessing, so the cpus_per_task
and min_cpus_per_task
attributes are set to appropriate values based on config options.
constrain_resources()
The constrain_resources()
method is used to update the ntasks
,
min_tasks
, cpus_per_task
, and min_cpus_per_task
attributes prior to
running the step, in case the user has modified these in the config options.
These performance-related attributes affect how the step runs and must be set
prior to runtime, whereas other options can be set within runtime_setup()
.
The framework calls constrain_resources()
within
polaris.run.serial.run_tasks()
, and a step can override this method
if desired, typically to get ntasks
, min_tasks
, cpus_per_task
, etc. from
config options or compute them using an algorithm and set the corresponding
attributes. When overriding constrain_resources
, it is important to also call
the base class’ version of the method with super().constrain_resources()
.
The names of the resources are related to the Slurm naming conventions:
ntasks
The target number of MPI tasks that a step will use if the resources are available.
min_tasks
The minimum number of MPI tasks for a step. If too few resources are available, the step will not run.
cpus_per_task
If
ntasks > 1
, this is typically a number of threads used by each MPI task (e.g. with OpenMP threading). Ifntasks == 1
, this may be the number of target cores used in on-node parallelism like python or c++ threading, or python multiprocessing.cpus_per_task
will automatically be constrained to be less than or equal to the number of cores on a node (and the total available cores). So it may be appropriate to set it to a high value appropriate for machines with large nodes, knowing that it will be constrained to fit on one node.
For MPI applications without threading, cpus_per_task
will always be
1
, the default.
min_cpus_per_task
The minimum number of cores for on-node parallelism (threading, multiprocessing, etc.). If too few resources are available, the step will fail with an error message.
runtime_setup()
The runtime_setup()
method is used to modify any behaviors of the step at
runtime. This includes things like partitioning an MPAS mesh across processors
and computing a times step based on config options that might have been
modified by the user. It must not include modifying the ntasks
, min_tasks
,
cpus_per_task
, min_cpus_per_task
or openmp_threads
attributes.
These attributes must be altered by overriding
constrain_resources().
Typically, runtime_setup()
will only be needed when the args
attribute is
being defined instead of a run()
method. This lets you run a small amount
of python code before launching an command, often using MPI parallelism.
run()
This method defines how the step will run. The contents of run()
can vary
quite a lot between steps.
In the baroclinic channel’s Init
step, the run()
method,
polaris.ocean.tasks.baroclinic_channel.init.Init.run()
,
is quite involved:
import cmocean # noqa: F401
import numpy as np
import xarray as xr
from mpas_tools.io import write_netcdf
from mpas_tools.mesh.conversion import convert, cull
from mpas_tools.planar_hex import make_planar_hex_mesh
from polaris import Step
from polaris.mesh.planar import compute_planar_hex_nx_ny
from polaris.ocean.vertical import init_vertical_coord
from polaris.viz import plot_horiz_field
class Init(Step):
...
def run(self):
"""
Run this step of the task
"""
config = self.config
logger = self.logger
section = config['baroclinic_channel']
resolution = self.resolution
lx = section.getfloat('lx')
ly = section.getfloat('ly')
# these could be hard-coded as functions of specific supported
# resolutions but it is preferable to make them algorithmic like here
# for greater flexibility
nx, ny = compute_planar_hex_nx_ny(lx, ly, resolution)
dc = 1e3 * resolution
ds_mesh = make_planar_hex_mesh(nx=nx, ny=ny, dc=dc,
nonperiodic_x=False,
nonperiodic_y=True)
write_netcdf(ds_mesh, 'base_mesh.nc')
ds_mesh = cull(ds_mesh, logger=logger)
ds_mesh = convert(ds_mesh, graphInfoFileName='culled_graph.info',
logger=logger)
write_netcdf(ds_mesh, 'culled_mesh.nc')
section = config['baroclinic_channel']
use_distances = section.getboolean('use_distances')
gradient_width_dist = section.getfloat('gradient_width_dist')
gradient_width_frac = section.getfloat('gradient_width_frac')
bottom_temperature = section.getfloat('bottom_temperature')
surface_temperature = section.getfloat('surface_temperature')
temperature_difference = section.getfloat('temperature_difference')
salinity = section.getfloat('salinity')
coriolis_parameter = section.getfloat('coriolis_parameter')
ds = ds_mesh.copy()
x_cell = ds.xCell
y_cell = ds.yCell
bottom_depth = config.getfloat('vertical_grid', 'bottom_depth')
ds['bottomDepth'] = bottom_depth * xr.ones_like(x_cell)
ds['ssh'] = xr.zeros_like(x_cell)
init_vertical_coord(config, ds)
x_min = x_cell.min().values
x_max = x_cell.max().values
y_min = y_cell.min().values
y_max = y_cell.max().values
y_mid = 0.5 * (y_min + y_max)
x_perturb_min = x_min + 4.0 * (x_max - x_min) / 6.0
x_perturb_max = x_min + 5.0 * (x_max - x_min) / 6.0
if use_distances:
perturb_width = gradient_width_dist
else:
perturb_width = (y_max - y_min) * gradient_width_frac
y_offset = perturb_width * np.sin(
6.0 * np.pi * (x_cell - x_min) / (x_max - x_min))
temp_vert = (bottom_temperature +
(surface_temperature - bottom_temperature) *
((ds.refZMid + bottom_depth) / bottom_depth))
frac = xr.where(y_cell < y_mid - y_offset, 1., 0.)
mask = np.logical_and(y_cell >= y_mid - y_offset,
y_cell < y_mid - y_offset + perturb_width)
frac = xr.where(mask,
1. - (y_cell - (y_mid - y_offset)) / perturb_width,
frac)
temperature = temp_vert - temperature_difference * frac
temperature = temperature.transpose('nCells', 'nVertLevels')
# Determine y_offset for 3rd crest in sin wave
y_offset = 0.5 * perturb_width * np.sin(
np.pi * (x_cell - x_perturb_min) / (x_perturb_max - x_perturb_min))
mask = np.logical_and(
np.logical_and(y_cell >= y_mid - y_offset - 0.5 * perturb_width,
y_cell <= y_mid - y_offset + 0.5 * perturb_width),
np.logical_and(x_cell >= x_perturb_min,
x_cell <= x_perturb_max))
temperature = (temperature +
mask * 0.3 * (1. - ((y_cell - (y_mid - y_offset)) /
(0.5 * perturb_width))))
temperature = temperature.expand_dims(dim='Time', axis=0)
normal_velocity = xr.zeros_like(ds_mesh.xEdge)
normal_velocity, _ = xr.broadcast(normal_velocity, ds.refBottomDepth)
normal_velocity = normal_velocity.transpose('nEdges', 'nVertLevels')
normal_velocity = normal_velocity.expand_dims(dim='Time', axis=0)
ds['temperature'] = temperature
ds['salinity'] = salinity * xr.ones_like(temperature)
ds['normalVelocity'] = normal_velocity
ds['fCell'] = coriolis_parameter * xr.ones_like(x_cell)
ds['fEdge'] = coriolis_parameter * xr.ones_like(ds_mesh.xEdge)
ds['fVertex'] = coriolis_parameter * xr.ones_like(ds_mesh.xVertex)
ds.attrs['nx'] = nx
ds.attrs['ny'] = ny
ds.attrs['dc'] = dc
write_netcdf(ds, 'initial_state.nc')
plot_horiz_field(ds, ds_mesh, 'temperature',
'initial_temperature.png')
plot_horiz_field(ds, ds_mesh, 'normalVelocity',
'initial_normal_velocity.png', cmap='cmo.balance',
show_patch_edges=True)
Without going into all the details of this method, it creates a mesh that
is periodic in x (but not y), then adds a vertical grid and an initial
condition to an xarray.Dataset
, which is then written out to
the file ocean.nc
.
In the example Forward
step we showed above, there is no run method at all
because we let its superclass ModelStep
define an args
attribute instead.
Rather than call the run()
method, the command given by these arguments
will be run on the commandline. This is capability important for supporting
task parallelism, since each such command may need to run with its own set of
MPI, threading and memory resources.
To get a feel for different types of run()
methods, it may be best to
explore different steps that are already implemented in Polaris.
inputs and outputs
Currently, steps run in sequence in the order they are added to the task
(or in the order they appear in the task’s steps_to_run
attribute).
There are plans to allow tasks and their steps to run in parallel in the
future. For this reason, we require that each step defines a list of the
absolute paths to all input files that could come from other steps (possibly in
other tasks) and all outputs from the step that might be used by other
steps (again, possibly in other tasks). There is no harm in including
inputs to the step that do not come from other steps (e.g. files that will be
downloaded when the task gets set up) as long as they are sure to exist
before the step runs. Likewise, there is no harm in including outputs from the
step that aren’t used by any other steps as long as the step will be sure to
generate them.
The inputs and outputs need to be defined during init of either the step or
the task, or in the step’s setup()
method because they are needed
before run() is called (to determine which steps depend on which
other steps). Inputs are added with polaris.Step.add_input_file()
and outputs with polaris.Step.add_output_file()
. Inputs may be
symbolic links to files in polaris, from the various databases on the
LCRC server,
downloaded from another source, or from another step.
Because the inputs and outputs need to be defined before the step runs, there can be some cases to avoid. The name of an output file should not depend on a config option. Otherwise, if the user changes the config option, the file actually created may have a different name than expected, in which case the step will fail. This would be true even if a subsequent step would have been able to read in the same config option and modify the name of the expected input file.
Along the same lines, an input or output file name should not depend on data from an input file that does not exist during setup(). Since the file does not exist, there is no way to read the file with the dependency within setup() and determine the resulting input or output file name.
Both of these issues have arisen for the
dev-ocean-global-ocean-files-for-e3sm from
dev-ocean-global-ocean tasks. Output files are named using the
“short name” of the mesh in E3SM, which depends both on config options and on
the number of vertical levels, which is read in from a mesh file created in a
previous step. For now, the outputs of this step are not used by any other
steps so it is safe to simply omit them, but this could become problematic in
the future if new steps are added that depend on
dev-ocean-global-ocean-files-for-e3sm. These steps would need to add
the appropriate shared step from files_for_e3sm
as a dependency using
polaris.Step.add_dependency()
.
polaris.Step
includes several methods for adding input, output,
namelist and streams files:
Input files
Typically, a step will add input files with
polaris.Step.add_input_file()
during init or in its setup()
method. It is also possible to add inputs in the task’s
constructor.
It is possible to simply supply the path to an input file as filename
without any other arguments to polaris.Step.add_input_file()
. In
this case, the file name is either an absolute path or a relative path with
respect to the step’s work directory:
def __init__(self, task):
...
self.add_input_file(filename='../setup_mesh/landice_grid.nc')
This is not typically how add_input_file()
is used because input files are
usually not directly in the step’s work directory.
Symlinks to input files
The most common type of input file is the output from another step. Rather than just giving the file name directly, as in the example above, the preference is to place a symbolic link in the work directory. This makes it much easier to see if the file is missing (because symlink will show up as broken) and allows you to refer to a short, local name for the file rather than its full path:
import xarray
def __init__(self, task):
...
self.add_input_file(filename='landice_grid.nc',
target='../setup_mesh/landice_grid.nc')
...
def run(step, test_suite, config, logger):
...
with xarray.open_dataset('landice_grid.nc') as ds:
...
A symlink is not actually created when polaris.Step.add_input_file()
is called. This will not happen until the step gets set up, after calling its
setup() method.
Sometimes you want to create a symlink to an input file in the work directory,
but the relative path between the target and the step’s work directory
isn’t very convenient to determine. This may be because the name of the
subdirectory for this step or the target’s step (or both) depends on
parameters. For such cases, there is a work_dir_target
argument that
allows you to give the path with respect to the base work directory (which is
not yet known at init). Here is an example taken from
polaris.ocean.tasks.global_ocean.forward.ForwardStep
:
def __init__(self, component, mesh, init):
mesh_path = mesh.mesh_step.path
if mesh.with_ice_shelf_cavities:
initial_state_target = f'{init.path}/ssh_adjustment/adjusted_init.nc'
else:
initial_state_target = f'{init.path}/init/initial_state.nc'
self.add_input_file(filename='init.nc',
work_dir_target=initial_state_target)
self.add_input_file(
filename='forcing_data.nc',
work_dir_target=f'{init.path}/init/init_mode_forcing_data.nc')
Symlink to input files from polaris
Another common need is to symlink a data file from within the task or its shared framework:
def __init__(self, component):
...
self.add_input_file(
filename='enthA_analy_result.mat',
package='polaris.landice.tasks.enthalpy_benchmark.A')
Here, we supply the name of the package that the file is in. The polaris framework will take care of figuring out where the package is located.
Downloading and symlinking input files
Another type of input file is one that is downloaded and stored locally. Typically, to save ourselves the time of downloading large files and to reduce potential problems on systems with firewalls, we cache the downloaded files in a location where they can be shared between users and reused over time. These “databases” are subdirectories of the core’s database root on the LCRC server.
To add an input file from a database, call
polaris.Step.add_input_file()
with the database
argument:
self.add_input_file(
filename='topography.nc',
target='BedMachineAntarctica_v2_and_GEBCO_2022_0.05_degree_20220729.nc',
database='bathymetry_database')
In this example from
polaris.ocean.tasks.global_ocean.init.init.Init()
,
the file BedMachineAntarctica_v2_and_GEBCO_2022_0.05_degree_20220729.nc
is
slated for later downloaded from the
Ocean bathymetry database.
The file will be stored in the subdirectory ocean/bathymetry_database
of the path in the database_root
config option in the paths
section of
the config file. The database_root
option is set either by selecting one
of the Supported Machines or in the user’s config file.
You can also specify the database_component
parameter to choose to get
files from a database belonging to another component, e.g.:
self.add_input_file(filename='icePresent_QU60km_polar.nc',
target='icePresent_QU60km_polar.nc',
database='partition',
database_component='seaice')
It is also possible to download files directly from a URL and store them in the step’s working directory:
step.add_input_file(
filename='dome_varres_grid.nc',
url='https://web.lcrc.anl.gov/public/e3sm/polaris/landice/dome_varres_grid.nc')
We recommend against this practice except for very small files.
Copying input files
In nearly all the cases discussed above, a symlink is created to the input
file, usually either from the polaris
package or from one of the databases.
If you wish to copy the file instead of symlinking it (e.g. so a user can make
local modifications), simply add the keyword argument copy=True
to any call
to self.add_input_file()
:
def __init__(self, task):
...
self.add_input_file(filename='landice_grid.nc',
target='../setup_mesh/landice_grid.nc', copy=True)
In this case, a copy of landice_grid.nc
will be made in the step’s work
directory.
Output files
We require that all steps provide a list of any output files that other steps are allowed to use as inputs. This helps us keep track of dependencies and will be used in the future to enable steps to run in parallel as long as they don’t depend on each other. Adding an output file is pretty straightforward:
self.add_output_file(filename='output_file.nc')
polaris.Step.add_output_file()
can be called in a step’s
constructor or setup() method or (less commonly)
in the task’s constructor.
The relative path in filename
is with respect to the step’s work directory,
and is converted to an absolute path internally before the step is run.
You can specify a list of variables to validate against a baseline (if one
is provided) using the validate_vars
attribute:
self.add_output_file(
filename='output.nc',
validate_vars=['temperature', 'salinity', 'layerThickness',
'normalVelocity'])
If a baseline is provided during polaris setup, then after the step
runs, the variables temperature
, salinity
, layerThickness
, and
normalVelocity
in the file output.nc
will be checked against the same
variables in the same file in the baseline run to make sure they are identical.
Cached output files
Many polaris tasks and steps are expensive enough that it can become time consuming to run full workflows to produce meshes and initial conditions in order to test simulations. Therefore, polaris provides a mechanism for caching the outputs of each step in a database so that they can be downloaded and symlinked rather than being computed each time.
Cached output files are be stored in the polaris_cache
database within each
component’s space on that LCRC server (see Downloading and symlinking input files).
If the “cached” version of a step is selected, as we will describe below, each
of the task’s outputs will have a corresponding “input” file added with
the target
being a cache file on the LCRC server and the filename
being
the output file. Polaris uses the cached_files.json
database to know
which cache files correspond to which step outputs.
A developer can indicate that polaris suite includes steps with cached outputs in two ways. First, if all steps in a task should have cached output, the following notation should be used:
ocean/global_ocean/QU240/mesh
cached
ocean/global_ocean/QU240/PHC/init
cached
That is, the word cached
should appear after the task on its own line.
The indentation is for visual clarity and is not required.
Second, ff only some steps in a task should have cached output, they need to be listed explicitly, as follows:
ocean/global_ocean/QUwISC240/mesh
cached: mesh
ocean/global_ocean/QUwISC240/PHC/init
cached: init ssh_adjustment
The line can be indented for visual clarity, but must begin with cached:
,
followed by a list of steps separated by a single space.
Similarly, a user setting up tasks has two mechanisms for specifying which
tasks and steps should have cached outputs. If all steps in a task
should have cached outputs, the suffix c
can be added to the test number:
polaris setup -n 90c 91c 92 ...
In this example, tasks 90 and 91 (mesh
and init
tasks from
the SOwISC12to60
global ocean mesh, in this case) are set up with cached
outputs in all steps and 92 (performance_test
) is not. This approach is
efficient but does not provide any control of which steps use cached outputs
and which do not.
A much more verbose approach is required if some steps use cached outputs and
others do not within a given task. Each task must be set up on its
own with the -t
and --cached
flags as follows:
polaris setup -t ocean/global_ocean/QU240/mesh --cached mesh ...
polaris setup -t ocean/global_ocean/QU240/PHC/init --cached init ...
...
Cache files should be generated by first running the task as normal, then
running the polaris cache command-line tool at the base of the work
directory, providing the names of the steps whose outputs should be added to
the cache. The resulting <component>_cached_files.json
should be copied
to polaris/<component>/cached_files.json
in a polaris branch.
Calls to polaris cache
must be made on Chrysalis or Anvil. If outputs were
produced on another machine, they must be transferred to one of these two
machines before calling polaris cache
. File can be added manually to the
LCRC server and the cached_files.json
databases but this is not
recommended.
More details on cached outputs are available in the compass design document Caching outputs from compass steps.
Adding other steps as dependencies
In some circumstances, it is not feasible to know the output filenames at
when a step gets set up. For example, the filename may depend on config
options that a user could change before running the step. Or the filename
could depend on data read in from files or computed at runtime. In such
circumstances, it is not feasible to specify the output filename with
polaris.Step.add_output_file()
. Nor can other steps that depend
on that output file as an input use polaris.Step.add_input_file()
.
Under these circumstances, it is useful to be able to specify that a step
is a dependency of another (dependent) step. This is accomplished by passing
the dependency to the step’s polaris.Step.add_dependency()
method
either during the creation of the dependent step, within the configure()
method of the parent task, or in the setup()
method of the dependent
step. The dependency does not need to belong to the task, it can be a shared
step, but it should be a step in any tasks that also use the dependent step.
This is because the dependent step will fail to run if the dependency has not
run.
When a step is added as a dependency, after it runs, its state will be stored in a pickle file (see setup module) that contains any modifications to its state during the run. When the dependent step is run, the the state of each dependency is “unpickled” along with the state of the step itself so that the dependent step can make use of runtime updates to its dependencies.
Note
There is some non-negligible overhead involved in pickling and unpickling dependencies so it is preferable to use file-based dependencies where possible.