Config
1 Overview
Model configuration refers to all of the variables needed to configure and run the model. It contains all the input parameters, choices of methods, physical coefficients for parameterizations and various options for I/O. Typically, this is read as a single input file that can also serve as a description of the configuration for provenance.
2 Requirements
2.1 Requirement: Human readability
File or files containing the configuration must be easily understood and modified by non-expert users. It is also desirable to have minimum markup that would interfere with readability.
2.2 Requirement: Standard format
Configuration files must conform to a standard format for ease of parsing and for potential interoperability. Examples of standards typically used include YAML, JSON and XML.
2.3 Requirement: Archiving and provenance
The entire model configuration must be able to be saved and archived to act as model provenance.
2.4 Requirement: Internal accessibility
Although most input parameters will only be relevant to a particular module or class, it is inevitable that some parameterizations or process models will have dependencies on other configuration choices. All configuration variables must therefore be accessible to all other modules/classes.
2.5 Requirement: Support of data types
Model configuration must support parameters in the standard data types, including logical (boolean), integer, float/double, and string data types. Limited support for vectors or arrays of the above types may be desirable.
2.6 Requirement: Efficiency and parallelism
Configuration should never consume significant resources, in either run time or storage. For parallel execution, the configuration will need to be replicated across MPI ranks, though in many cases the variables will be manipulated into a different form (e.g., string variables describing choices will be converted to enums). This may happen before or after replication.
2.8 Desired: Single configuration input
For simplicity, we desire a single file for model configuration where users can define all model configuration details and provenance can be maintained more easily. If multiple files are required, there should be obvious links or includes in the main configuration file to make the dependency clear.
2.9 Requirement: Hierarchy or grouping
For both readability and easier encapsulation by modules/classes, the configuration should be organized in a logical way that makes clear where specific configuration variables are to be used. Parameters primarily or specifically associated with a module or class should be grouped together within the configuration.
2.10 Requirement: Language support
Because the configuration will also be used for provenance, it is likely that the configuration input file will need to be read by other languages outside of the OMEGA C++ model. This requires an ability to parse a configuration input file from other common languages (e.g., Python).
2.11 Requirement: Optional or missing values
If a configuration variable is missing from the configuration input file, there must be an option to either supply a default value or throw an error and exit, depending on the user choice. If a default value is supplied, then the default value must be added to the configuration so that it is included in the output and becomes part of the provenance.
2.12 Requirement: Extra values
If extra or unexpected values are encountered, they will be ignored.
2.13 Desired: Automated generation of default input and error checking
While the source code defines the configuration variables, it would be desirable to have a means to extract from the source code what the code is expecting into a default input config file. This would also enable some external error checking for missing or extra entries.
2.14 Desired: Acceptable values
For users modifying an input configuration, it would be desirable to document the acceptable values or range of values that each variable can be assigned.
3 Algorithmic Formulation
There are no specific algorithms needed for this other than those used in the anticipated parsing/storage packages associated with the standard format chosen.
4 Design
We select the YAML format, which meets the above requirements with improved readability over other standard forms. Within the OMEGA model, we use the yaml-cpp library, a third-party implementation for parsing YAML input and efficiently storing/retrieving information as YAML nodes. Many of the configuration variables will be stored as maps in YAML; the package features a map syntax similar to the C++ standard template library map type.
Because repeatedly extracting variables from a large and complex config structure is inefficient, we plan for a single reading of the full configuration. Each initialization routine in OMEGA will then extract the variables it needs and manipulate them for the later forward integration. In this implementation, we will read/write the configuration from a master rank, and each initialization routine within OMEGA will manipulate and broadcast necessary variables across ranks for parallel execution. Because not all variables will need to be broadcast and many others will be converted to more efficient types (e.g., string options to logical or enums), we believe this model will be more efficient than broadcasting the full config structure and manipulating afterward.
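As a minimal sketch of the string-to-enum conversion described above, the following shows how an option string such as calendarType might be turned into a compact value on the master rank before it is broadcast. The enum CalendarKind, its members, the "gregorian" option and both function names are hypothetical and are not part of the OMEGA design. The MPI broadcast itself is omitted so the example stays self-contained.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical enum for the calendarType option; names are illustrative only.
enum class CalendarKind : int { NoLeap = 0, Gregorian = 1 };

// Convert the string read from the config into a compact enum on the master
// rank. Unknown strings raise an exception rather than silently defaulting.
CalendarKind parseCalendarKind(const std::string &Name) {
   if (Name == "noleap")
      return CalendarKind::NoLeap;
   if (Name == "gregorian")
      return CalendarKind::Gregorian;
   throw std::invalid_argument("unknown calendarType: " + Name);
}

// The integer form is what would be sent to the other ranks instead of the
// original string (the broadcast call itself is not shown here).
int toBroadcastValue(CalendarKind Kind) { return static_cast<int>(Kind); }
```

Under this approach only a single integer per option crosses ranks, and the string comparison happens once during initialization.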
4.1 Data types and parameters
4.1.1 Parameters
There are no global parameters or shared constants.
4.1.2 Class/structs/data types
We define a Config type, which is actually an alias of YAML::Node:
using Config = YAML::Node;
from the yaml-cpp library. A YAML node is more fully and accurately
defined in the YAML specification, but for the purposes of this
document, our configuration is represented in YAML as a set of nested
map nodes, where a map is simply a keyword-value pair. At the lowest
level, these nodes are the simple variable-name: value maps. The next
level up is a map of the module name to the collection of maps associated
with the module. The root node corresponds to the full model configuration
and is simply a collection of all those module maps. I/O stream/file
configuration will be part of this configuration in a design TBD later
but will be a similar hierarchy under the full omega config node.
An example YAML input file might then look like:
omega:
  timeManagement:
    doRestart: false
    restartTimestampName: restartTimestamp
    startTime: 0001-01-01_00:00:00
    stopTime: none
    runDuration: 0010_00:00:00
    calendarType: noleap
    [Other config options in a similar way]
  hmix:
    hmixScaleWithMesh: false
    maxMeshDensity: -1.0
    hmixUseRefWidth: false
    hmixRefWidth: 30.0e3
    [more config options]
  streams:
    mesh:
      type: input
      filenameTemplate: mesh.nc
      inputInterval: initial_only
    output:
      type: output
      filenameTemplate: output/output.$Y-$M-$D_$h.$m.$s.nc
      filenameInterval: 01-00-00_00:00:00
      referenceTime: 0001-01-01_00:00:00
      clobberMode: truncate
      precision: single
      outputInterval: 0001_00:00:00
      contents:
        - tracers
        - layerThickness
        - ssh
        - kineticEnergyCell
        - relativeVorticityCell
        - [other fields]
    [other streams in similar form]
4.2 Methods
Because Config is an alias of YAML::Node, all of the methods in the YAML::Node class remain available, but we will alias or wrap some of the most common in the OMEGA context to be associated with Config.
4.2.1 File read and master config
The most common use case should be creating a Config by reading a YAML configuration file using:
Config omegaConfig = ConfigRead("omega.yml");
where the argument is the name for the YAML input file. In OMEGA, we will retain this master configuration throughout the initialization as omegaConfig.
4.2.2 Get/Retrieval
Once the configuration has been read, we will need to retrieve variables from the Config. Because our config is a hierarchy, there are really no variables at the top level and we need to first retrieve the sub-config associated with the local module/group. In the sample above, if we need to retrieve a variable from the hmix group, we first retrieve the hmix config and then the variable using:
Config hmixConfig = ConfigGet(omegaConfig,"hmix",iErr);
Real refWidth{0.0};
bool useRefWidth{false};
refWidth = ConfigGet(hmixConfig, "hmixRefWidth", iErr);
useRefWidth = ConfigGet(hmixConfig, "hmixUseRefWidth", iErr);
where there is a retrieval function for all supported Omega data types:
bool, I4, I8, R4, R8, Real, std::string. These retrievals are just
overloaded wrappers around the YAML form: configName["varName"].as<type>()
with some error checking and reporting. Rather than a templated form,
we use simple overloading to keep a cleaner interface. If the variable
or config is missing, these functions will print an error message and
return with a non-zero error argument.
Another interface will allow the setting of a default value if the variable is missing from the input config. This interface simply adds the default value as an additional argument, for example:
refWidth = ConfigGet(hmixConfig, "hmixRefWidth", defaultVal, iErr);
In this case, if the variable does not exist, it will not only use the default value but also print a warning that the default is being used because the entry is missing.
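The retrieval behavior described in this section can be sketched without yaml-cpp using a plain string map as a stand-in for one config group. Note that C++ cannot overload functions on return type alone, so this sketch distinguishes the overloads by passing the destination variable by reference. That is an assumption of the sketch, not the interface defined above. The names ToyGroup, lookupText and toyGet, and the error codes 1 (missing) and 2 (failed conversion), are all hypothetical.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

// Stand-in for one config group: every value is stored as text and converted
// on retrieval, loosely mirroring YAML scalars. This is not the yaml-cpp API.
using ToyGroup = std::map<std::string, std::string>;

// Shared helper: look up the raw text and report a missing entry.
static int lookupText(const ToyGroup &Group, const std::string &Name,
                      std::string &Text) {
   auto It = Group.find(Name);
   if (It == Group.end()) {
      std::cerr << "Config error: missing entry " << Name << "\n";
      return 1;
   }
   Text = It->second;
   return 0;
}

// Overloads are selected by the type of the destination argument.
int toyGet(const ToyGroup &Group, const std::string &Name, std::string &Val) {
   return lookupText(Group, Name, Val);
}

int toyGet(const ToyGroup &Group, const std::string &Name, double &Val) {
   std::string Text;
   if (lookupText(Group, Name, Text) != 0)
      return 1;
   std::istringstream In(Text);
   return (In >> Val) ? 0 : 2; // 2 flags a failed conversion
}

int toyGet(const ToyGroup &Group, const std::string &Name, bool &Val) {
   std::string Text;
   if (lookupText(Group, Name, Text) != 0)
      return 1;
   if (Text == "true") {
      Val = true;
      return 0;
   }
   if (Text == "false") {
      Val = false;
      return 0;
   }
   return 2;
}

// Default-value form: a missing entry is filled in and recorded in the group
// so that it appears in any later output (requirement 2.11).
int toyGet(ToyGroup &Group, const std::string &Name, double Default,
           double &Val) {
   if (Group.find(Name) == Group.end()) {
      std::cerr << "Config warning: using default for " << Name << "\n";
      std::ostringstream Out;
      Out << Default;
      Group[Name] = Out.str();
   }
   return toyGet(Group, Name, Val);
}
```

A call such as toyGet(hmix, "hmixRefWidth", refWidth) then picks the double overload from the type of refWidth, and the four-argument form leaves the default stored in the group for provenance.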
4.2.3 Change an existing value
While the intent is for all config variables to be set using the config file read interface above, the capability to modify a value is also required. The syntax is essentially the inverse of the get/retrieval above. As in that case, the sub-group will need to be retrieved first.
Config hmixConfig = ConfigGet(omegaConfig, "hmix", iErr);
ConfigSet(hmixConfig, "hmixRefWidth", 10.0e3, iErr);
ConfigSet(hmixConfig, "hmixUseRefWidth", true, iErr);
There will be overloaded interfaces for each supported type. For literals (as in the example above), they will be cast to an appropriate type according to C++ default type conversion and will be converted to the desired type on retrieval (YAML internal storage is ignorant of the type and only performs the type cast on retrieval).
4.2.3 Adding new entries
It may be necessary to build up a configuration that does not yet exist or add new entries to an existing group. We provide an Add interface to distinguish this case from the Set case above.
// For an existing subgroup:
Config hmixConfig = ConfigGet(omegaConfig, "hmix", iErr);
ConfigAdd(hmixConfig, "hmixRefWidth", 10.0e3, iErr);
ConfigAdd(hmixConfig, "hmixUseRefWidth", true, iErr);
// To add a new subgroup:
Config hmixConfig; // empty Config constructor
ConfigAdd(hmixConfig, "hmixRefWidth", 10.0e3, iErr); // build subgroup
ConfigAdd(hmixConfig, "hmixUseRefWidth", true, iErr);
ConfigAdd(omegaConfig, "hmix", hmixConfig, iErr); // add new subgroup to parent under its group name
As with ConfigSet, there will be overloaded interfaces for each supported type, and literals are cast and converted in the same way.
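The distinction between Set and Add can be illustrated with a small stand-in that stores values as text. The semantics shown here are an assumption: Set fails if the entry is missing, and Add fails if the entry already exists. The text above only states that the two cases are kept distinct. The names ToyGroup, toText, toySet and toyAdd are hypothetical, and a template is used purely for brevity where the design calls for per-type overloads.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Stand-in for one config group, with values held as text (not yaml-cpp).
using ToyGroup = std::map<std::string, std::string>;

// Convert any streamable value to the text form held by the stand-in.
template <typename T> std::string toText(const T &Val) {
   std::ostringstream Out;
   Out << std::boolalpha << Val;
   return Out.str();
}

// Set: change an entry that must already exist (assumed semantics).
template <typename T>
int toySet(ToyGroup &Group, const std::string &Name, const T &Val) {
   auto It = Group.find(Name);
   if (It == Group.end())
      return 1; // nothing to change
   It->second = toText(Val);
   return 0;
}

// Add: create an entry that must not exist yet (assumed semantics).
template <typename T>
int toyAdd(ToyGroup &Group, const std::string &Name, const T &Val) {
   bool Inserted = Group.emplace(Name, toText(Val)).second;
   return Inserted ? 0 : 1; // 1 means the entry was already present
}
```

Keeping the two operations separate means a misspelled name in a Set call is reported as an error instead of silently creating a new entry.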
4.2.4 Existence
It is not expected that a user would test the existence since the Get/Set functions will perform the test internally. However, to satisfy requirement 2.11, we will add a function to test the existence of an entry, given a config or sub-config. Using the hmix example again:
if (ConfigExists(hmixConfig, "hmixRefWidth")) {
// variable exists, do stuff
}
Note that this can also be used to test the existence of a complete sub-group:
bool hmixExists = ConfigExists(omegaConfig, "hmix");
4.2.5 File write
While we may decide to save provenance a different way, a write interface is supplied to write a configuration to an output YAML file:
err = ConfigWrite(myConfig, "outputFileName");
4.2.6 Constructor/destructor
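A default constructor for an empty Config and a destructor will be provided:
Config myConfig; // default constructor creates an empty Config
// myConfig is destroyed automatically when it goes out of scope
Because a Config declared this way is an object rather than a pointer, it is not released with delete. The destructor may still be important to free up space since the Config is likely to only be used during the init phase in the current plan.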
4.3 Documentation and Auto-generation of Default Inputs
In order to keep the source code, input files and documentation consistent and avoid missing or extra entries, we propose inserting a block within each source code header (where other interfaces will be documented). This block would follow Doxygen-like format and look something like:
/// \ConfigInput
/// #
/// # Group description (eg. hmix: the horizontal mix configuration)
/// #
/// groupName:
/// #
/// # Parameter description (eg horizontal mixing coeff)
/// # more description (units, acceptable values or range)
/// #
/// varName1: defaultValue1
/// #
/// # Parameter description
/// #
/// varName2: defaultValue2
/// [ continue for remaining vars in this block]
///
/// \EndConfigInput (might not be necessary?)
The block between the ConfigInput lines could be extracted verbatim
and written (with proper indenting) into a fully documented yaml
default input file or could be extracted and the #-delimited
comments stripped for a more concise yaml input file.
In addition, we may be able to similarly extract the same info for
the User and Developer Guides.
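A minimal sketch of the extraction step is given below, assuming the block markers appear exactly as \ConfigInput and \EndConfigInput on lines that begin with ///. The function name extractConfigInput and its StripComments flag are hypothetical. Whether the end marker is needed at all is still open in the proposal above, and this sketch simply assumes it is present.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Extract the text between "/// \ConfigInput" and "/// \EndConfigInput",
// removing the leading "///" marker and one following space from each line.
// When StripComments is true, lines whose first non-blank character is '#'
// are dropped to give the more concise form of the input file.
std::string extractConfigInput(const std::string &Source, bool StripComments) {
   std::istringstream In(Source);
   std::string Line, Result;
   bool InBlock = false;
   while (std::getline(In, Line)) {
      std::size_t Pos = Line.find_first_not_of(" \t");
      if (Pos == std::string::npos || Line.compare(Pos, 3, "///") != 0)
         continue; // not a documentation line
      std::string Body = Line.substr(Pos + 3);
      if (Body.find("\\EndConfigInput") != std::string::npos) {
         InBlock = false;
         continue;
      }
      if (Body.find("\\ConfigInput") != std::string::npos) {
         InBlock = true;
         continue;
      }
      if (!InBlock)
         continue;
      if (!Body.empty() && Body[0] == ' ')
         Body.erase(0, 1); // drop the single space after "///"
      std::size_t First = Body.find_first_not_of(" \t");
      if (First == std::string::npos)
         continue; // skip blank documentation lines
      if (StripComments && Body[First] == '#')
         continue;
      Result += Body + "\n";
   }
   return Result;
}
```

Indentation after the first space is preserved, so the nesting written in the header carries over to the generated YAML text.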
5 Verification and Testing
The selection of YAML automatically satisfies Requirements 2.1, 2.2 and 2.10. Requirement 2.8 will be enforced by Omega development. The other requirements will be tested with a parallel unit test driver that performs the following tests in order and will output the test name and a PASS/FAIL for use with CTest or other testing frameworks.
5.1 Test default constructor
Create an empty config on master rank using default constructor.
tests constructor needed to satisfy other requirements
5.2 Test Set function
Add configuration variables to the empty config with at least one of each supported type with a few extras to test behavior for other requirements. Also add multiple levels of hierarchy and a large enough number of subgroups and parameters to simulate a full omega config.
tests function needed for requirements 2.3, 2.5, 2.9, 2.11, 2.12
5.3 Test Get function
Retrieve all the variables set in the test above and verify they return identical values.
completes test of 2.4, 2.5, 2.9
Broadcast variables from master after retrieval to provide test of performance for this choice of parallel implementation.
tests 2.6
5.4 Test for missing variables
Inquire for a missing variable and output a PASS if the variable was not found. Also add the missing variable with a default value, then retrieve it to test this behavior.
tests 2.11
5.5 Test write and re-read
Write a YAML output file using the above constructed Config. Add an extra variable to the new file, then read the file back in. Verify that the newly read Config matches the original, ignoring the extra variable.
tests 2.3, 2.12
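The comparison in this last test could be sketched as a one-directional match: every entry of the original must appear in the re-read configuration, while entries found only in the re-read copy are ignored. The nested-map stand-in and the names ToyGroup, ToyConfig and matchesIgnoringExtras are hypothetical. A real test would walk the YAML nodes instead.

```cpp
#include <cassert>
#include <map>
#include <string>

// Two-level stand-in for a config: group name -> (variable name -> text).
using ToyGroup  = std::map<std::string, std::string>;
using ToyConfig = std::map<std::string, ToyGroup>;

// True when every group and entry in Original appears in Reread with the same
// text value. Entries that exist only in Reread are ignored (requirement 2.12).
bool matchesIgnoringExtras(const ToyConfig &Original, const ToyConfig &Reread) {
   for (const auto &GroupPair : Original) {
      auto GroupIt = Reread.find(GroupPair.first);
      if (GroupIt == Reread.end())
         return false; // a whole group went missing
      for (const auto &Entry : GroupPair.second) {
         auto EntryIt = GroupIt->second.find(Entry.first);
         if (EntryIt == GroupIt->second.end() ||
             EntryIt->second != Entry.second)
            return false; // entry missing or value changed
      }
   }
   return true;
}
```

The check is deliberately asymmetric: swapping the arguments makes the extra variable count as a missing entry, so the argument order matters in the test driver.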