A well-maintained data management practice depends on a predefined data structure in the file system and on all users in each group following the guidelines. We differentiate three main data sets in Computerome 2.0: user files under /home/people/, project files under /home/projects/, and reference databases under /home/databases/.
Recommended data structure within each project directory:
./apps: for project-specific applications, used when the project cannot use the standard applications provided, for instance, in /services/tools. Common candidates include: anaconda, perl, qiime, R, ncbi-blast, samtools, bamtools, bedtools, java.
./apps/modulefiles: for project-specific module files.
./backup: this directory is reserved for a copy of critical data, meaning raw data and scripts. It is the only directory that is backed up for disaster recovery purposes. The group owner is responsible for ensuring that a copy of the project's critical files is physically stored in this directory. It is not possible to execute jobs using files in this directory. Note that backup is not to be confused with roll-back: data stored on Computerome 2.0 can be reverted up to 4 weeks back in time, with the exception of ./scratch.
./data/: reserved for analyses and generated data; this is the general working directory.
./people/<user>: each project member's project-related content.
./scratch: for temporary data, relevant for some projects only. Changes in this directory are not logged and therefore cannot be reverted.
If you previously used Computerome 1.0, you might notice that the ./archive directory has been removed from the directory tree. As Computerome 2.0 does not provide cold storage (data retention) services, project owners are expected to remove retired data from Computerome 2.0 once the computation is complete or the project is terminated.
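Since keeping ./backup up to date is left to the group owner, it can help to script the copy. The following is a minimal Python sketch, not an official Computerome tool: the function name copy_to_backup, the project name example_project and the data/raw location of the raw data are all hypothetical, and the demonstration runs on a temporary directory rather than a real project.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_backup(project_root: Path, relative_paths: list[str]) -> list[Path]:
    """Copy files (given relative to project_root) into project_root/backup.

    The relative path of each file is preserved below backup/.
    Returns the paths of the copies that were written.
    """
    backup_root = project_root / "backup"
    copied = []
    for rel in relative_paths:
        source = project_root / rel
        target = backup_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied


if __name__ == "__main__":
    # Demonstrate on a throwaway directory, not on a real project.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "example_project"  # hypothetical project name
        raw_dir = project / "data" / "raw"       # assumed location of raw data
        raw_dir.mkdir(parents=True)
        (raw_dir / "reads.fastq").write_text("@read1\nACGT\n+\nIIII\n")

        for copy in copy_to_backup(project, ["data/raw/reads.fastq"]):
            print(copy.relative_to(project))
```

Preserving the relative path under ./backup keeps it obvious where each file came from if it ever has to be restored.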
Files under /home/people/
Each user has their own directory under /home/people/<user>/, which contains the environment setup and everything else that is considered strictly user-specific. For security reasons, only the user has access to this directory.
Because users might switch projects, graduate, change workplaces, etc., no project data or anything else project-related may be kept in this directory; it must be stored under the project directories instead, under /home/projects/<project_name>/. To enforce this rule in Computerome 2.0, there is a 10 GB quota on this directory.
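To see how close a personal directory is to the 10 GB limit, the file sizes below it can be summed. The sketch below is only an approximation under stated assumptions: it treats 10 GB as 10 * 1000**3 bytes and adds up apparent file sizes, which may differ from how the file system actually accounts for the quota. The names directory_size and quota_fraction are illustrative.

```python
import os
from pathlib import Path

# Assumption: the 10 GB quota is counted in decimal gigabytes.
QUOTA_BYTES = 10 * 1000**3


def directory_size(root: Path) -> int:
    """Return the summed apparent size in bytes of all regular files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks, count real files only
                total += os.lstat(path).st_size
    return total


def quota_fraction(root: Path, quota_bytes: int = QUOTA_BYTES) -> float:
    """Return the used share of the quota (1.0 means the quota is reached)."""
    return directory_size(root) / quota_bytes


if __name__ == "__main__":
    home = Path.home()  # on Computerome this would be /home/people/<user>/
    used = directory_size(home)
    print(f"{home}: {used / 1000**3:.2f} GB of {QUOTA_BYTES / 1000**3:.0f} GB used")
```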
Project files under /home/projects/
All project-related data must be stored under /home/projects/<project_name>/. To keep a coherent data structure across all projects, each project is created with the same default directory tree. As each project has different workflows and processes, the structure can be further tailored by adding subdirectories.
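As an illustration of tailoring the default tree, the following Python sketch checks that the default directories are present and then adds project-specific subdirectories. It is a sketch under assumptions: DEFAULT_TREE is taken from the recommended data structure in this document, the helper names and the extra subdirectories (data/alignments, data/reports) are hypothetical, and the demonstration works on a temporary directory.

```python
import tempfile
from pathlib import Path

# Default directories as listed in the recommended data structure;
# "apps/modulefiles" implies "apps".
DEFAULT_TREE = ["apps/modulefiles", "backup", "data", "people", "scratch"]


def missing_default_dirs(project_root: Path) -> list[str]:
    """Return the default directories that do not exist below project_root."""
    return [d for d in DEFAULT_TREE if not (project_root / d).is_dir()]


def add_subdirectories(project_root: Path, extra: list[str]) -> None:
    """Create additional subdirectories, leaving the default tree untouched."""
    for rel in extra:
        (project_root / rel).mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "example_project"  # hypothetical project name
        # Mimic the tree a newly created project starts with.
        add_subdirectories(project, DEFAULT_TREE)
        # Tailor the structure with hypothetical workflow-specific folders.
        add_subdirectories(project, ["data/alignments", "data/reports"])
        print("missing default directories:", missing_default_dirs(project))
```

Adding new folders below the existing top-level directories, rather than next to them, keeps the layout comparable across projects.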
Reference databases under /home/databases/
Computerome provides access to a set of read-only reference databases under /home/databases/. Feel free to check the list of available data sets and use them in your projects. Users may request the download of additional reference databases by sending a request to firstname.lastname@example.org.