AI Training Datasets
The E3SM project and Allen Institute for AI (Ai2) have developed several datasets for AI and machine learning applications. These datasets have been postprocessed for ingestion by the ACE/FourCastNet emulator.
Dataset Details
EAMv2: 73-year EAMv2 simulation (F2010, perpetual 2010 forcing, repeating annual SST cycle from the 2005-2014 average) with 6-hourly outputs. For more details, see Duncan et al. 2024.
EAMv3: 51-year EAMv3 AMIP-style simulation (1970-2020, F2010 with AMIP SSTs, constant 2010 CO2). Includes multiple ENSO cycles and a global warming trend. For more details, see Wu et al. 2025.
E3SMv3: Coupled pre-industrial and historical training data (coming soon)
SCREAMv1: Simple Cloud-Resolving E3SM Atmosphere Model version 1 training data (coming soon)
Tip
Check the archive_contents text file to see which files are included in each tar archive. You can selectively download only the files you need.
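The same selectivity can be applied when unpacking an archive that has already been downloaded. The following is a minimal sketch using only Python's standard tarfile and fnmatch modules. The helper names list_members and extract_matching, the archive name, and the file pattern are all hypothetical and are not part of the dataset distribution; the actual naming of files inside each archive should be taken from the archive_contents file.

```python
import fnmatch
import tarfile
from pathlib import Path


def list_members(archive_path):
    """Return the names of all entries in a tar archive without extracting it."""
    with tarfile.open(archive_path, "r:*") as tar:
        return tar.getnames()


def extract_matching(archive_path, pattern, dest_dir):
    """Extract only regular files whose names match a shell-style pattern.

    Note that fnmatch's '*' also matches '/' so '*.nc' matches files in
    any subdirectory of the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:*") as tar:
        wanted = [
            member
            for member in tar.getmembers()
            if member.isfile() and fnmatch.fnmatch(member.name, pattern)
        ]
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(path=dest, members=wanted, **kwargs)
    return [member.name for member in wanted]


if __name__ == "__main__":
    # Hypothetical archive name and pattern, for illustration only.
    archive = "example_archive.tar"
    if Path(archive).exists():
        print(list_members(archive))
        print(extract_matching(archive, "*.nc", "extracted"))
```

Listing members first mirrors the role of the archive_contents file: it shows what an archive holds before anything is written to disk, so only the needed files take up space.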