4.9 Accessing data
Please refer to this section of the emLab SOP for a description of the data directory structure for our emLab GRIT data storage space.
All data in the emLab GRIT data storage space can be directly accessed
on each of the servers (sequoia and quebracho) without any changes to
the directory paths. All data in the emlab/data and
emlab/projects/current-projects directories physically lives on
high-speed hard drives attached to sequoia, so if you need to work on
data in these directories, you will get the best computing performance
when using sequoia. Please refer to this section of the emLab SOP for
a code snippet that can be used to directly access data on the server
in R.
In addition to the emLab GRIT data storage space, which is shared across all members of our team, each user also has a private, user-specific storage space. All GRIT users get 50GB of free personal storage by default.

As a general best practice, we recommend storing all data on the emLab data storage space, and storing only cloned GitHub repos and user-specific R packages and settings in your personal user space. For example, store all project-specific data in the appropriate directory under the emLab data storage space, but keep your cloned GitHub repos in your personal storage space. By default, repos you clone from GitHub are stored in your personal storage space, along with your user-specific R packages and configurations.

If your personal storage space exceeds 50GB for any reason, it will stop working, so make sure you always leave a safe buffer. If you keep only cloned GitHub repos and R packages in your personal user space, you should not need to worry about hitting the 50GB limit. You can check your current personal storage usage by typing df -h in the terminal and looking for the row that contains your username.
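
The df -h check described above can be wrapped in a few small shell helpers. This is a minimal sketch, not an official emLab or GRIT tool: the function names, the 80% warning threshold, and the assumption that your personal storage space is your home directory ($HOME) are all illustrative, so adjust them to match your setup.

```shell
#!/bin/sh
# Sketch only: LIMIT_GB mirrors the 50GB default quoted in this section.
# The function names and the 80% warning threshold are hypothetical.
LIMIT_GB=50
WARN_PERCENT=80

# Integer percentage of the limit used.
# $1 = usage in kilobytes, $2 = limit in gigabytes.
percent_of_limit() {
  echo $(( $1 * 100 / ($2 * 1024 * 1024) ))
}

# Print the df -h header plus any row that mentions the current username.
show_personal_space() {
  df -h | awk -v user="$(id -un)" 'NR == 1 || index($0, user)'
}

# Measure a directory (default: $HOME, ASSUMED to be the personal space)
# and warn on stderr once usage passes WARN_PERCENT of the limit.
check_personal_space() {
  dir=${1:-$HOME}
  used_kb=$(du -sk "$dir" 2>/dev/null | cut -f1)
  [ -n "$used_kb" ] || return 1
  pct=$(percent_of_limit "$used_kb" "$LIMIT_GB")
  echo "$dir uses ${pct}% of the ${LIMIT_GB}GB limit"
  if [ "$pct" -ge "$WARN_PERCENT" ]; then
    echo "Warning: consider cleaning up before you hit the limit" >&2
  fi
}
```

After sourcing these definitions, running show_personal_space lists the matching df -h row, and running check_personal_space with no arguments reports how much of the limit your home directory appears to use. Note that du can be slow on directories with many files, and its total may differ from what the storage quota itself counts.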