Unified Mesh: River Network Preparation
date: 2026/04/19
Contributors:
Xylar Asay-Davis
Codex
Claude
Summary
This design describes the shared prepare_river_network step for the unified
global base-mesh workflow, together with the associated tasks that can run the
shared river steps on their own. The purpose of the step is to simplify a
global river dataset into products that can be consumed directly by
build_sizing_field without re-reading or reinterpreting the raw source data.
The shared river-network workflow is implemented in Polaris pull request https://github.com/E3SM-Project/polaris/pull/556.
The preferred first source is HydroRIVERS or an equivalent global flowline
dataset. Unlike the standalone mpas_land_mesh workflow, the Polaris design
makes the downstream interface explicit. In particular, the workflow
distinguishes between the authoritative simplified river network, the
target-grid products needed by build_sizing_field, and the mesh-conditioned
products needed by create_base_mesh, rather than overloading a single raster
with mixed semantics.
Because river-network simplification and river-driven meshing are the parts of
the workflow where Xylar’s design intuition is currently weakest, the first
Polaris design should preserve the mpas_land_mesh river algorithms as closely
as is practical.
The implementation aligns prepare_river_network with the shared target-grid
tier and coastline interpretation chosen for the workflow, while deferring
river-outlet reconciliation until after an MPAS base mesh exists.
Success means that Polaris gains a documented, reusable river-network preprocessing workflow that preserves the major hydrographic controls relevant for mesh generation and makes its outputs easy to inspect and easy for downstream steps to consume.
Workflow Context
The overall unified-mesh workflow is described in Unified Mesh: Global Base Mesh Workflow.
The upstream unified-mesh workflow design is:
The downstream unified-mesh workflow designs are:
Requirements
Requirement: Downstream-Ready River Network Products
Date last modified: 2026/05/16
Contributors:
Xylar Asay-Davis
Codex
prepare_river_network shall provide source-level, target-grid, and
mesh-conditioned river products that can be consumed directly by
build_sizing_field and create_base_mesh.
The shared products shall retain the major river-network information needed for mesh refinement and direct cell-center placement, including channel locations and basin-root provenance.
The downstream sizing-field and base-mesh steps shall not need to rerun HydroRIVERS filtering, network reconstruction, or coastline-aware river clipping and simplification.
Coastline-aware river clipping shall be local to river-line geometry. It shall remove only the portions of each retained river line that fall inside the coastal exclusion band, preserving valid inland pieces rather than pruning whole trees or short inland fragments.
Requirement: Hydrologically Meaningful Simplification
Date last modified: 2026/05/15
Contributors:
Xylar Asay-Davis
Codex
The first implementation shall preserve the dominant global river main stems and major tributaries needed to inform mesh resolution. Terminal river segments shall be retained as basin roots for traversal and grouping, not as coastline-reconciled outlet products.
The design shall support filtering by drainage area and by proximity so the retained network reflects the target mesh scale rather than the full source dataset density.
The simplification shall preserve connectivity and confluence structure rather than reducing the product to disconnected local segments.
Where practical, the first Polaris design shall preserve the existing
mpas_land_mesh river-network algorithms rather than redesigning them.
Requirement: Deferred Outlet Reconciliation
Date last modified: 2026/05/15
Contributors:
Xylar Asay-Davis
Codex
The pre-base-mesh river workflow shall not snap river outlets to the coastline, write separate outlet products, or refine the sizing field based on outlet mask cells.
The workflow shall preserve enough basin-root provenance, through
outlet_hyriv_id, outlet_drainage_area, and river_network_rank, for
downstream workflows to identify, select, and optionally write per-catchment
products without rerunning HydroRIVERS simplification. Outlet/coastline
reconciliation shall still occur after the MPAS base mesh exists.
Requirement: Standalone River-Network Task
Date last modified: 2026/05/15
Contributors:
Xylar Asay-Davis
Codex
Claude
Polaris shall provide a standalone task per named unified mesh that runs the
full shared river-network workflow for that mesh, including HydroRIVERS
simplification, channel rasterization, and coastline-aware
clipping, together with the shared upstream steps it depends on (for example
e3sm/init/topo/combine and prepare_coastline).
The standalone task shall make it practical to inspect retained basins, target-grid river-channel masks, and clipped river geometry without running the full unified mesh workflow.
The same shared steps and configuration shall be reusable from the full unified workflow when settings match.
Requirement: Reproducible Source Data Access
Date last modified: 2026/04/19
Contributors:
Xylar Asay-Davis
Codex
All source datasets needed by prepare_river_network shall be obtained either
from documented public sources or, if that is not feasible, from the Polaris
database.
The preferred implementation shall download raw source data from public sources and perform any needed preprocessing within Polaris rather than requiring users to provide local input-file paths.
Adding preprocessed artifacts to the Polaris database should be treated as a fallback for cases where the source data are not publicly distributable or the required preprocessing cannot be reproduced robustly within Polaris.
Algorithm Design
Algorithm Design: Downstream-Ready River Network Products
Date last modified: 2026/05/16
Contributors:
Xylar Asay-Davis
Codex
Claude
The current implementation separates source-level hydrographic products from target-grid products rather than trying to make one step serve both roles. This aligns with the design intent that downstream consumers should not need to reinterpret HydroRIVERS or infer outlet semantics from one overloaded raster.
At the source level, the workflow writes:
simplified_river_network.geojson, containing retained segments with hyriv_id, main_riv, ord_stra, drainage_area, next_down, endorheic, outlet_hyriv_id, outlet_drainage_area, and river_network_rank; networks are ordered largest-first by terminal-root drainage area, and the rank field makes the N largest networks directly selectable without relying on feature order alone. The outlet_hyriv_id field is retained as basin-root provenance for future catchment grouping, not as a coastline-reconciled outlet product.
At the target-grid level, the workflow writes:
river_network.nc, with river_channel_mask.
This is intentionally clearer than the standalone workflow’s mixed raster
semantics. The present implementation does not yet add stream-order rasters or
basin IDs, but it does establish a clean product split that the
build_sizing_field implementation now consumes directly.
For base-mesh consumers, the workflow also writes a mesh-conditioned product set:
clipped_river_network.geojson, containing river segments clipped inland of the coastline and simplified for direct JIGSAW geometry use, with valid inland pieces preserved even when one source feature is split by the coastal exclusion band, and with networks ordered largest-first by terminal-root drainage area; and
clipped_river_network.nc, containing masks regenerated from the clipped network for diagnostics.
These products are where the river workflow becomes aware of the selected
unified mesh and its direct cell-placement needs. build_sizing_field uses the
target-grid masks, while create_base_mesh consumes the conditioned vector
geometry.
Generating the clipped products requires evaluating the coastline’s
signed_distance field along each retained river line. The implementation
first densifies each line at the coastline-grid scale, then batches all sampled
coordinates from all segments into a single array, performs one vectorised
bilinear-interpolation call over the entire network, and splits the resulting
distance values back to the corresponding per-segment slices. The clipped
geometry is then built by retaining sampled intervals farther inland than the
configured clip distance and linearly interpolating exact threshold crossings.
This makes clipping local to the geometry near the coastline while avoiding
artificial inland gaps. Short retained inland pieces are preserved; only
degenerate pieces with fewer than two distinct points are removed.
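The interval-retention rule described above can be sketched for a single densified line as follows. This is a minimal illustration, not the Polaris implementation: the function name clip_line_by_inland_distance is hypothetical, the distance array is assumed to be positive inland of the coastline (the actual sign convention of signed_distance is not stated here), and coordinates are treated as planar.

```python
import numpy as np


def clip_line_by_inland_distance(points, inland_distance, clip_distance):
    """Split one densified line into pieces farther inland than clip_distance.

    points: (n, 2) sampled coordinates along the line.
    inland_distance: (n,) distances, ASSUMED positive inland of the coast.
    Returns a list of (m, 2) arrays, one per retained piece.
    """
    points = np.asarray(points, dtype=float)
    excess = np.asarray(inland_distance, dtype=float) - clip_distance
    keep = excess > 0.0
    pieces = []
    current = []
    for i in range(len(points)):
        if i > 0 and keep[i] != keep[i - 1]:
            # Linearly interpolate the exact threshold crossing.
            frac = excess[i - 1] / (excess[i - 1] - excess[i])
            current.append(points[i - 1] + frac * (points[i] - points[i - 1]))
            if not keep[i]:
                # Leaving the retained region closes the current piece.
                pieces.append(np.array(current))
                current = []
        if keep[i]:
            current.append(points[i])
    if current:
        pieces.append(np.array(current))
    # Drop only degenerate pieces with fewer than two distinct points.
    return [p for p in pieces if len(np.unique(p, axis=0)) >= 2]
```

A line that enters and leaves the exclusion band several times yields one piece per inland interval, so short inland pieces survive regardless of their length.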
Algorithm Design: Hydrologically Meaningful Simplification
Date last modified: 2026/05/16
Contributors:
Xylar Asay-Davis
Codex
The current Polaris implementation is a focused reimplementation built around
HydroRIVERS attributes such as HYRIV_ID, MAIN_RIV, ORD_STRA,
UPLAND_SKM, NEXT_DOWN, and ENDORHEIC. Its staged logic is:
Filter source flowlines by a minimum drainage-area threshold tied to the intended river-refinement scale.
Merge multiple source features with the same hyriv_id into one canonical segment when needed.
Validate that the retained NEXT_DOWN graph is acyclic before attempting basin traversal.
Identify terminal basin roots from segments with next_down == 0.
Traverse upstream iteratively from each terminal root, keeping the largest upstream segment at each confluence as the main stem.
Retain additional tributaries when either their drainage area exceeds a configurable fraction of the largest upstream branch at the current confluence or their minimum distance from the already retained basin skeleton exceeds the branch-distance tolerance.
The key point is that simplification should be basin-aware and topology-aware. The Polaris design should preserve connectivity and confluences, not just apply independent Douglas-Peucker style simplification to each source feature.
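The staged logic above can be sketched on a toy network as follows. This is an illustrative reduction under stated assumptions, not the Polaris code: the function name simplify_network and the dictionary layout are hypothetical, duplicate merging and cycle validation are omitted, and only the area-ratio test is shown (the distance-tolerance fallback needs geometry and a spatial index).

```python
def simplify_network(segments, min_area, area_ratio):
    """Return the ids of retained segments.

    segments: dict mapping id -> {"next_down": int, "area": float};
    this layout is an assumption made for the sketch.
    """
    # Filter by the minimum drainage-area threshold.
    kept = {sid: s for sid, s in segments.items() if s["area"] >= min_area}

    # Build the upstream adjacency map from the downstream pointers.
    upstream = {}
    for sid, seg in kept.items():
        upstream.setdefault(seg["next_down"], []).append(sid)

    # Terminal basin roots have next_down == 0.
    roots = [sid for sid, seg in kept.items() if seg["next_down"] == 0]

    retained = set()
    for root in roots:
        stack = [root]  # explicit stack: iterative, not recursive
        while stack:
            sid = stack.pop()
            retained.add(sid)
            branches = sorted(
                upstream.get(sid, []),
                key=lambda b: kept[b]["area"],
                reverse=True,
            )
            if not branches:
                continue
            # The largest upstream branch continues the main stem.
            largest_area = kept[branches[0]]["area"]
            stack.append(branches[0])
            # Other branches must pass the area-ratio test.
            for branch in branches[1:]:
                if kept[branch]["area"] >= area_ratio * largest_area:
                    stack.append(branch)
    return retained
```

Because every retained segment is reached by walking upstream from a root, the result stays connected by construction, which is the property that per-feature line simplification cannot guarantee.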
The Polaris implementation intentionally differs from the standalone
mpas_land_mesh simplification algorithm in the mechanics of basin
construction. The standalone workflow performs a greedy reverse search for each
individual basin: it rebuilds a pyrivergraph, updates headwater stream order,
merges and defines stream segments, and recursively grows an R-tree of retained
flowlines from the outlet upstream. At each step, nearby branches can be kept,
rejected, or replaced by a larger branch depending on the order in which the
greedy search encounters them.
Polaris keeps the same design intent but uses a smaller algorithm tied directly
to HydroRIVERS. It uses the NEXT_DOWN attributes as the authoritative
downstream graph, validates that the retained graph is acyclic, constructs an
upstream adjacency map, and processes each terminal root independently. Branch
selection is deterministic and local to each confluence: keep the largest
upstream branch, then keep other upstream branches that pass the area-ratio
test or the distance-tolerance fallback. The retained set is not later mutated
by replacing smaller branches with larger nearby ones. This makes the step
easier to test, allows basin traversal to run in parallel, and avoids importing
the broader mpas_land_mesh/pyflowline helper stack into Polaris.
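As a rough illustration of treating NEXT_DOWN as the authoritative downstream graph, the sketch below validates that the graph is acyclic and builds an upstream adjacency map. The function names and the plain-dictionary input are assumptions made for this example and do not reflect the actual helper signatures in simplify.py.

```python
def validate_acyclic(next_down):
    """Raise ValueError if following next_down ever revisits a segment.

    next_down: dict mapping segment id -> downstream id (0 = terminal root).
    """
    done = set()
    for start in next_down:
        path = []
        on_path = set()
        sid = start
        # Walk downstream until a root, a missing id, or a checked segment.
        while sid in next_down and sid not in done:
            if sid in on_path:
                raise ValueError(f"cycle in NEXT_DOWN graph at segment {sid}")
            on_path.add(sid)
            path.append(sid)
            sid = next_down[sid]
        done.update(path)


def build_upstream_map(next_down):
    """Invert the downstream pointers into id -> list of upstream ids."""
    upstream = {sid: [] for sid in next_down}
    for sid, down in next_down.items():
        if down in upstream:
            upstream[down].append(sid)
    return upstream
```

Each segment is walked at most once, so the check is linear in the number of segments, and the resulting map is read-only during traversal, which is what lets terminal roots be processed independently.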
Algorithm Design: Deferred Outlet Reconciliation
Date last modified: 2026/05/15
Contributors:
Xylar Asay-Davis
Codex
Claude
Outlet and coastline reconciliation is intentionally deferred until after an MPAS base mesh exists. Before that point, snapping HydroRIVERS terminal points to coastline cells and refining outlet mask cells adds complexity without a clear benefit because the base-mesh workflow clips near-coast river geometry and the sizing-field workflow blends land resolution toward ocean resolution near the coastline.
The pre-base-mesh river workflow therefore keeps terminal-root provenance on
retained river segments through outlet_hyriv_id, outlet_drainage_area, and
river_network_rank. Rasterization produces the channel mask needed by the
sizing field, and clipped vector products provide the river geometry needed by
JIGSAW. Downstream workflows that need outlet locations or catchment-specific
files can group segments by outlet_hyriv_id, select the largest basins by
river_network_rank, and perform outlet/coastline reconciliation later.
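A downstream script might group and select segments along these lines. This is a hedged sketch: it assumes the fields are stored under each GeoJSON feature's "properties" mapping, and the helper names group_by_basin and select_largest_networks are hypothetical.

```python
from collections import defaultdict


def group_by_basin(features):
    """Group GeoJSON features by the basin root recorded on each segment."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature["properties"]["outlet_hyriv_id"]].append(feature)
    return dict(groups)


def select_largest_networks(features, count):
    """Keep features from the `count` largest networks (rank is 1-based)."""
    return [
        feature
        for feature in features
        if feature["properties"]["river_network_rank"] <= count
    ]
```

Selecting by rank rather than by position in the file keeps such scripts independent of feature order.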
Algorithm Design: Standalone River-Network Task
Date last modified: 2026/05/15
Contributors:
Xylar Asay-Davis
Codex
Claude
The current standalone task design uses one thin wrapper per named unified
mesh, UnifiedRiverNetworkTask, rather than separate source-level and
lat-lon tasks. Each task wraps the full shared river-network step chain for
its mesh — coastline steps, simplification, rasterization, clipping, and
visualization — so all products can be inspected together without running the
full unified mesh workflow.
Organizing by mesh name rather than by resolution keeps the task structure consistent with the sizing-field and base-mesh task families and avoids creating standalone tasks for resolutions that are not tied to a specific mesh configuration.
Implementation
Implementation: Downstream-Ready River Network Products
Date last modified: 2026/05/16
Contributors:
Xylar Asay-Davis
Codex
Claude
The file naming and class layout are now concrete. The river implementation is
organized under polaris/tasks/mesh/spherical/unified/river/ as:
simplify.py (SimplifyRiverNetworkStep) for HydroRIVERS download, unpacking and source-level simplification;
rasterize.py (RasterizeRiverLatLonStep) for target-grid rasterization of retained river channels;
clip.py (ClipRiverNetworkStep) for coastline-aware clipping and conditioning of retained river geometry for final mesh generation;
viz.py (VizRiverStep) for diagnostic plotting and text summaries;
steps.py for shared-step setup helpers (get_unified_mesh_river_steps());
task.py and tasks.py for standalone task wrappers; and
the configuration sections are loaded from the unified mesh config.
This implementation prioritizes a clean output contract over carrying forward
the standalone workflow’s mixed raster conventions or writing default
per-catchment GeoJSON files. A single ranked GeoJSON keeps the authoritative
simplified network in one file while still allowing scripts to reproduce the
standalone workflow’s “largest N basins” exports by filtering on
river_network_rank.
The simplification step obtains HydroRIVERS through add_input_file() using
the public archive URL in the river network config section, with the Polaris
database still available as a fallback cache location. The rasterization step
then consumes the shared coastline grid for the selected convention and writes a
channel-only mask. The ClipRiverNetworkStep consumes the simplified network
together with the selected coastline product and writes the clipped river
geometry consumed by the unified base-mesh step.
The coastline-aware clipping in condition_base_mesh_river_segments() uses
coastline-grid-scale line densification before signed-distance sampling, so a
river line with endpoints inside the coastal exclusion band can still retain an
inland middle portion. All sampled coordinates are stacked into one array,
_interpolate_signed_distance() is called once, and the resulting
signed-distance values are split back to per-segment slices with np.split().
The helper then retains only intervals outside the coastal exclusion band,
interpolates boundary crossings, preserves all valid inland fragments, and
falls back to unsimplified clipped geometry if Douglas-Peucker simplification
would make a piece degenerate. The historical minimum-length option is retained
for configuration compatibility but no longer removes valid inland pieces.
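The densify, stack, interpolate-once, and split pattern can be sketched as below. The names densify and sample_all_segments are hypothetical, the interpolate argument merely stands in for _interpolate_signed_distance() (whose real signature is not shown here), and spacing is measured in planar coordinate units as a simplifying assumption.

```python
import numpy as np


def densify(points, max_spacing):
    """Insert points so consecutive samples are at most max_spacing apart."""
    points = np.asarray(points, dtype=float)
    out = [points[:1]]
    for start, end in zip(points[:-1], points[1:]):
        count = max(1, int(np.ceil(np.linalg.norm(end - start) / max_spacing)))
        frac = np.arange(1, count + 1)[:, None] / count
        out.append(start + frac * (end - start))
    return np.concatenate(out, axis=0)


def sample_all_segments(segment_points, interpolate):
    """Evaluate `interpolate` once for all segments, then split per segment.

    segment_points: list of (n_i, 2) arrays.
    interpolate: callable mapping an (N, 2) array to an (N,) array.
    """
    lengths = [len(p) for p in segment_points]
    stacked = np.concatenate(segment_points, axis=0)
    values = interpolate(stacked)  # single vectorised call
    # Split boundaries are the cumulative lengths, excluding the final total.
    return np.split(values, np.cumsum(lengths)[:-1])
```

Batching keeps the number of interpolation calls independent of the number of river segments, while np.split() restores the per-segment view that the clipping logic needs.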
Implementation: Hydrologically Meaningful Simplification
Date last modified: 2026/05/16
Contributors:
Xylar Asay-Davis
Codex
Claude
The current simplification logic lives in
simplify_river_network_feature_collection() in
polaris/tasks/mesh/spherical/unified/river/simplify.py. It uses small focused
helpers for canonicalizing segments, validating downstream topology, filtering
by drainage area, and traversing retained basin structure from all terminal
roots.
The traversal is iterative rather than recursive, so very deep main stems do not depend on Python recursion limits. When multiple CPUs are available, terminal basins are distributed across forked worker processes that share the read-only HydroRIVERS segment map, upstream adjacency map, and spatial index. Each worker returns the retained segments for one basin root, and the parent process merges those basin-local results before annotating network rank.
After basin traversal, the implementation annotates each retained segment with
outlet_drainage_area and river_network_rank. The rank is 1-based, with
rank 1 assigned to the retained terminal basin with the largest outlet drainage
area. These properties are preserved by the canonical RiverSegment read/write
helpers and are carried through coastline conditioning so downstream products do
not silently drop the network-selection metadata.
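The rank annotation can be illustrated with the sketch below. It is not the Polaris helper: the function name and plain-dictionary segments are assumptions, each basin root is assumed to be present among the retained segments with hyriv_id equal to its own outlet_hyriv_id, and the tie-break on id is an arbitrary choice made here for determinism.

```python
def annotate_network_rank(segments):
    """Add outlet_drainage_area and a 1-based river_network_rank in place.

    segments: list of dicts with hyriv_id, drainage_area, outlet_hyriv_id.
    Returns the segments ordered largest network first.
    """
    area_by_id = {s["hyriv_id"]: s["drainage_area"] for s in segments}
    # Order basin roots by descending drainage area (ties broken by id).
    outlets = sorted(
        {s["outlet_hyriv_id"] for s in segments},
        key=lambda oid: (-area_by_id[oid], oid),
    )
    rank = {oid: i for i, oid in enumerate(outlets, start=1)}
    for seg in segments:
        seg["outlet_drainage_area"] = area_by_id[seg["outlet_hyriv_id"]]
        seg["river_network_rank"] = rank[seg["outlet_hyriv_id"]]
    # sorted() is stable, so segment order within a network is preserved.
    return sorted(segments, key=lambda s: s["river_network_rank"])
```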
The implementation favors a compact Polaris-native reimplementation over a
direct migration of mpas_land_mesh helper layers. No clear defect emerged from
the current unit tests, but this remains an area where additional comparison
against real HydroRIVERS output would strengthen confidence.
Implementation: Deferred Outlet Reconciliation
Date last modified: 2026/05/15
Contributors:
Xylar Asay-Davis
Codex
Claude
The current implementation removes coastline matching and inland-sink treatment
from the pre-base-mesh river products. river_network.nc contains
river_channel_mask only, and the simplified/clipped GeoJSON products keep
basin-root provenance and network-selection metadata but no coastline-snapped
outlet products. Outlet snapping and catchment-specific outlet products are
deferred to downstream workflows that operate after the MPAS base mesh exists.
Implementation: Standalone River-Network Task
Date last modified: 2026/05/11
Contributors:
Xylar Asay-Davis
Codex
Claude
The current implementation adds one lightweight task wrapper per named unified
mesh in polaris/tasks/mesh/spherical/unified/river/task.py and avoids any
separate task-specific river-processing code path. UnifiedRiverNetworkTask
wraps the full shared step chain for its mesh — coastline steps, simplification
(SimplifyRiverNetworkStep), rasterization (RasterizeRiverLatLonStep),
clipping (ClipRiverNetworkStep), and visualization — so all products can be
inspected together. Task registration is handled by add_river_tasks() in
tasks.py, which iterates over UNIFIED_MESH_NAMES and registers one task per
mesh.
Testing
Testing and Validation: Downstream-Ready River Network Products
Date last modified: 2026/05/23
Contributors:
Xylar Asay-Davis
Codex
Claude
Unit tests in tests/mesh/spherical/unified/test_river.py verify the
target-grid product contract. Specifically:
test_build_river_network_dataset_contract_and_channel_mask verifies that build_river_network_dataset() writes the expected channel-only mask variable (river_channel_mask) without outlet-matching attributes.
test_mesh_river_step_factories_use_mesh_subdirs verifies that get_unified_mesh_river_steps() creates SimplifyRiverNetworkStep, RasterizeRiverLatLonStep, and ClipRiverNetworkStep with the expected mesh-specific subdirectories.
test_mesh_river_step_factories_reuse_shared_configs verifies step and config identity across multiple calls to get_unified_mesh_river_steps().
The coastline-aware conditioning tests in the same file verify
condition_base_mesh_river_segments(), including local clipping through
multiple entries and exits from the coastal exclusion band, densification before
signed-distance sampling, preservation of short inland pieces, and safe
simplification fallback. The test_base_mesh.py tests then verify that
UnifiedBaseMeshStep converts the prepared clipped_river_network.geojson
product into JIGSAW line constraints rather than raw river geometry.
build_sizing_field unit tests consume the target-grid river masks. The full
river workflow feeding the sizing-field task and the final base-mesh task has
been run on real data for all four named unified meshes.
Testing and Validation: Hydrologically Meaningful Simplification
Date last modified: 2026/05/23
Contributors:
Xylar Asay-Davis
Codex
Claude
Unit tests in tests/mesh/spherical/unified/test_river.py validate
simplification behavior on synthetic networks:
test_simplify_river_network_traverses_all_terminal_segments verifies that all retained terminal segments are traversed and that outlet_hyriv_id, outlet_drainage_area, and river_network_rank are preserved as basin-root provenance and network-selection metadata.
test_simplify_river_network_handles_deep_main_stem confirms correctness for a 1500-segment chain without Python recursion limits.
test_simplify_river_network_rejects_next_down_cycles verifies that cyclic NEXT_DOWN graphs are rejected with a clear error.
test_simplify_river_network_preserves_branch_traversal_order verifies that multi-branch confluence structure is retained correctly.
test_convert_hydrorivers_shapefile_to_geojson verifies shapefile conversion.
test_unpack_hydrorivers_archive verifies archive unpacking.
test_drainage_area_threshold_auto_derived_from_config and test_branch_distance_tolerance_auto_derived_from_config verify that simplification thresholds are derived correctly from mesh configs.
The simplification has been exercised on the full global HydroRIVERS dataset for all four named unified meshes, and the resulting river networks were inspected visually and found to reflect the major hydrographic controls at each resolution.
Testing and Validation: Deferred Outlet Reconciliation
Date last modified: 2026/05/16
Contributors:
Xylar Asay-Davis
Codex
Claude
Unit tests in tests/mesh/spherical/unified/test_river.py cover the
channel-only pre-base-mesh products:
test_build_river_network_dataset_contract_and_channel_mask verifies the channel-only raster contract.
test_build_river_network_dataset_applies_physical_channel_buffer verifies the physical buffer applied to rasterized channel cells.
test_condition_base_mesh_river_segments_clips_then_simplifies, test_condition_base_mesh_river_segments_keeps_short_fragments, test_condition_base_mesh_river_segments_keeps_reentry_pieces, test_condition_base_mesh_river_segments_densifies_before_clipping, and test_condition_base_mesh_river_segments_simplify_fallback_keeps_geometry verify the coastline clipping applied before base-mesh conditioning.
The visualization step writes river_network_overlay.png,
rasterized_river_network.png, and debug_summary.txt, making the simplified,
clipped, and rasterized channel products straightforward to inspect in task
runs.
Testing and Validation: Standalone River-Network Task
Date last modified: 2026/05/23
Contributors:
Xylar Asay-Davis
Codex
Claude
Unit tests in tests/mesh/spherical/unified/test_river.py verify the
standalone task structure:
test_add_river_tasks_registers_mesh_tasks verifies that add_river_tasks() registers one UnifiedRiverNetworkTask per name in UNIFIED_MESH_NAMES, that each task subdirectory is spherical/unified/<mesh_name>/river/task, and that each task name is river_network_<mesh_name>_task.
test_mesh_river_step_factories_use_mesh_subdirs verifies mesh-specific subdirectories for the simplify, rasterize, and clip steps.
test_mesh_river_step_factories_reuse_shared_configs verifies that step and config instances are shared across multiple get_unified_mesh_river_steps() calls for the same mesh.
Standalone river tasks have been run for all four named unified meshes, showing the expected rasterized river networks and visualization overlays at each resolution. The full end-to-end workflow through sizing-field construction, base-mesh generation, topography remap, and mesh culling has been completed for all four meshes. The resulting culled ocean and land meshes were visually verified to be consistent with expectations.