SPRAS Designs
SPRAS makes a few high-level design decisions. We motivate them here.
Immutable Outputs
During benchmarking runs, SPRAS data is uploaded to the Open Science
Data Federation (OSDF). OSDF enforces an immutable file structure:
files can never be deleted or rewritten.
By default, SPRAS files are not immutable. However, enabling the
immutable_files parameter in the SPRAS configuration makes files
fully immutable, so that a given file name is never written with
different data.
To do this, SPRAS tags all datasets, gold standards, and algorithms with a version hash, which effectively identifies the current version of the code SPRAS uses to process that data.
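As a rough illustration of why tagging helps, the sketch below appends a shortened version hash to a name, so that a change in the processing code yields a new name instead of an overwrite. The function tagged_name, the separator, and the 8-character truncation are hypothetical choices for this example and are not taken from the SPRAS code base.

```python
def tagged_name(name: str, version_hash: str, length: int = 8) -> str:
    """Append a shortened version hash to a dataset or algorithm name.

    Hypothetical helper: the naming scheme here is an assumption made
    for illustration, not the scheme SPRAS actually uses.
    """
    if not version_hash:
        raise ValueError("version_hash must be a non-empty string")
    return f"{name}-{version_hash[:length]}"


if __name__ == "__main__":
    # Two different code versions map the same name to different tags,
    # so neither output would overwrite the other.
    print(tagged_name("my_algorithm", "abcdef0123456789"))
    print(tagged_name("my_algorithm", "0123456789abcdef"))
```

Under this scheme, an unchanged version hash reproduces the same name, while any change to the hash produces a distinct one.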
In the implementation, this version hash is the hash of the RECORD
file, which contains hashes of all ‘installed’ files. When SPRAS is not
installed in development mode (i.e. without the --editable flag),
the RECORD file includes hashes of all Python source files, so the
version hash changes whenever the source code changes, as desired.
In development mode, the RECORD file does not change when the source
code is changed, so the version hash does not track source changes.
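The idea of hashing the RECORD file can be sketched with the standard library alone. This is a minimal sketch under assumptions, not the SPRAS implementation: the helper names, the choice of SHA-256, and the distribution name "spras" are all assumed for illustration. It uses importlib.metadata to read the RECORD file of an installed distribution.

```python
import hashlib
from importlib import metadata
from typing import Optional


def hash_record_text(record_text: str) -> str:
    """Hash the full text of a RECORD file (SHA-256 is an assumed choice)."""
    return hashlib.sha256(record_text.encode("utf-8")).hexdigest()


def version_hash(dist_name: str) -> Optional[str]:
    """Return a hash of the installed distribution's RECORD file.

    Returns None if the distribution is not installed or has no
    RECORD file. Both helper names are hypothetical.
    """
    try:
        record_text = metadata.distribution(dist_name).read_text("RECORD")
    except metadata.PackageNotFoundError:
        return None
    if record_text is None:
        return None
    return hash_record_text(record_text)


if __name__ == "__main__":
    # "spras" is an assumed distribution name; this prints None when
    # no distribution with that name is installed.
    print(version_hash("spras"))
```

Because each RECORD entry carries the hash of one installed file, any change to an installed source file changes the RECORD text and therefore the resulting version hash.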