Zstash documentation
What is zstash?
Zstash is an HPSS long-term archiving solution for E3SM.
Zstash is written entirely in Python using standard libraries. Its design is intentionally minimalistic to provide an effective long-term HPSS archiving solution without creating an overly complicated (and hard to maintain) tool.
Key features:
Files are archived into standard tar files with a user-specified maximum size.
Tar files are first created locally, then transferred to HPSS.
Checksums (md5) of input files are computed on-the-fly during archiving. For large files, this saves a considerable amount of time compared to separate checksumming and archiving steps.
Checksums and additional metadata (size, modification time, tar file and offset) are stored in a sqlite3 index database.
The database enables faster retrieval of individual files by recording which tar file holds a given file, as well as its location (offset) within that tar file.
File integrity is verified by computing checksums on-the-fly while extracting files.
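The features above can be illustrated with a minimal sketch using only the Python standard library. This is not zstash's actual code: the `HashingReader` class, the `archive` and `extract` functions, the `files` table layout, and the file names in the demo are all hypothetical, chosen only to show how an md5 can be computed while data streams into a tar file, how an offset can be recorded in a sqlite3 index, and how that offset allows a single file to be read back and verified without scanning the whole archive.

```python
import hashlib
import os
import sqlite3
import tarfile
import tempfile


class HashingReader:
    """File-like wrapper that updates an md5 digest as data is read."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.md5 = hashlib.md5()

    def read(self, size=-1):
        chunk = self.fileobj.read(size)
        self.md5.update(chunk)
        return chunk


def archive(paths, tar_path, db_path):
    """Write paths into one tar file and record each member in an index."""
    con = sqlite3.connect(db_path)
    # Hypothetical schema for illustration, not zstash's actual table layout.
    con.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(name TEXT, size INTEGER, mtime INTEGER, md5 TEXT, tar TEXT, offset INTEGER)"
    )
    with open(tar_path, "wb") as out, tarfile.open(fileobj=out, mode="w") as tar:
        for path in paths:
            info = tarfile.TarInfo(name=os.path.basename(path))
            info.size = os.path.getsize(path)
            info.mtime = int(os.path.getmtime(path))
            offset = out.tell()  # position where this member's header starts
            with open(path, "rb") as src:
                reader = HashingReader(src)
                tar.addfile(info, reader)  # data is hashed while it is copied
            con.execute(
                "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?)",
                (info.name, info.size, info.mtime, reader.md5.hexdigest(),
                 tar_path, offset),
            )
    con.commit()
    con.close()


def extract(name, db_path, dest_dir):
    """Fetch one file via its recorded offset and verify its md5."""
    con = sqlite3.connect(db_path)
    row = con.execute(
        "SELECT tar, offset, md5 FROM files WHERE name = ?", (name,)
    ).fetchone()
    con.close()
    if row is None:
        raise KeyError(name)
    tar_path, offset, expected = row
    digest = hashlib.md5()
    dest = os.path.join(dest_dir, name)
    with open(tar_path, "rb") as f:
        f.seek(offset)  # jump straight to the member, skipping earlier ones
        with tarfile.open(fileobj=f, mode="r:") as tar:
            member = tar.next()
            src = tar.extractfile(member)
            with open(dest, "wb") as out:
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    digest.update(chunk)  # checksum computed during extraction
                    out.write(chunk)
    if digest.hexdigest() != expected:
        raise ValueError(f"md5 mismatch for {name}")
    return dest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for fname, data in [("a.txt", b"alpha\n"), ("b.txt", b"beta\n" * 1000)]:
            p = os.path.join(tmp, fname)
            with open(p, "wb") as fh:
                fh.write(data)
            paths.append(p)
        tar_path = os.path.join(tmp, "000000.tar")
        db_path = os.path.join(tmp, "index.db")
        archive(paths, tar_path, db_path)
        os.mkdir(os.path.join(tmp, "out"))
        print(extract("b.txt", db_path, os.path.join(tmp, "out")))
```

Because each file is read only once during archiving and once during extraction, the checksum costs no extra pass over the data. The sketch omits the maximum tar size and the transfer to HPSS.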
Source code is available on GitHub: https://github.com/E3SM-Project/zstash.
This documentation reflects the v1.0.0 release of zstash.