Lab Share
This is the main working directory for your research data, where researchers set up and organize data within the lab. We strongly advise creating top-level project folders that correspond to your data use agreements (DUAs).
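As a minimal sketch of that layout, the helper below creates one top-level folder per DUA. The function name make_project_dirs and the project names in the usage line are hypothetical placeholders, not part of the ReD Environment; substitute your own lab path and names that match your DUAs.

```shell
# Hypothetical helper: create one top-level folder per DUA under a lab share.
# Usage: make_project_dirs <lab_root> <project> [<project> ...]
make_project_dirs() {
    lab_root="$1"
    shift
    for project in "$@"; do
        # -p: do not fail if the folder already exists
        mkdir -p "${lab_root}/${project}"
    done
}

# Example call (project names are placeholders):
# make_project_dirs /data/labs/urcdtest_lab dua_project_a dua_project_b
```

Keeping each DUA in its own top-level folder makes it easier to request different permissions for individual projects later.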
File Tiers
Historically, in high-performance computing, only filesystems such as "scratch" or "temp" were performant, and your lab share was on a separate NFS volume. Within the ReD Environment, we have simplified storage by using performant storage for your Lab Share, so there is no need to move data back and forth. We use FSx for Lustre as a file cache in front of S3. Files are first written to an SSD layer before migrating to S3. To keep data costs manageable, our S3 layer has a policy engine that moves files between tiers:
- Files older than 30 days are migrated to S3 Infrequent Access. Retrieval time is not affected.
- Files older than 90 days are migrated to Glacier. Retrieval will be slower.
If slow retrieval becomes problematic, please submit a ticket.
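To get a rough idea of which files may already have moved to a colder tier, you can list files by age. This is only a sketch: the helper name files_older_than is hypothetical, and it assumes the policy engine measures age by last modification time, which this page does not confirm.

```shell
# Hypothetical helper: list regular files last modified more than N whole days ago.
# Assumption: the tiering policy keys on modification time (not confirmed here).
# Usage: files_older_than <directory> <days>
files_older_than() {
    find "$1" -type f -mtime +"$2" -print
}

# Example calls against the test project shown on this page:
# files_older_than /data/labs/urcdtest_lab 30   # candidates for the infrequent-access tier
# files_older_than /data/labs/urcdtest_lab 90   # candidates for Glacier
```

If a job depends on files in the second list, expect the first read to be slower than usual.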
Syntax
Lab Share paths look like /data/labs/XXXX_lab, where XXXX is typically your PI's last name, center, or project group. For example, Univ RCD has a test project "urcdtest":
[syockel_test@ip-10-3-2-176 urcdtest_lab]$ pwd
/data/labs/urcdtest_lab
Permissions
Permissions are set to read/write for lab groups by default. If more granular permissions are needed, please submit a request ticket.
[syockel_test@ip-10-3-2-176 labs]$ ls -ld urcdtest_lab/
drwxrws---+ 5 root urc-urcdtest-lab 33280 Jan 8 22:13 urcdtest_lab/
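In the listing above, the leading "d" indicates a directory. The next three triads are the permission sets for the user, the group, and others. Here, rwx grants read, write, and execute to the user; rws grants read and write to the group, with the "s" showing that both execute and the special SGID (set-group-ID) bit are set; and the final "---" grants nothing to others. The trailing "+" indicates that an access control list (ACL) is also applied. The SGID bit has the following effects: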
- If set on an executable file, the file runs with the privileges of the group that owns it (similar to SUID, which does the same for the owning user)
- If set on a directory, any files created in the directory will have their group ownership set to the group that owns the directory, rather than the creating user's primary group
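The directory behavior can be tried out safely in a scratch location. The sketch below uses a hypothetical helper, make_shared_dir, to build a directory with the same mode bits as the listing above; it assumes GNU coreutils (stat -c), as found on typical Linux systems.

```shell
# Hypothetical helper: create a directory with mode drwxrws--- (SGID set).
# Usage: make_shared_dir <path>
make_shared_dir() {
    mkdir -p "$1"
    # Leading 2 = SGID bit; 770 = rwx for user and group, nothing for others
    chmod 2770 "$1"
}

# Try it in a temporary location
demo="$(mktemp -d)/shared"
make_shared_dir "$demo"
stat -c '%A %G %n' "$demo"            # mode column shows drwxrws---

# A file created inside inherits the directory's group
touch "$demo/result.txt"
stat -c '%G %n' "$demo/result.txt"
```

On your Lab Share this is already configured, so files you create there should be owned by the lab group without any extra steps.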
Proper Usage
Your Lab Share can handle I/O-intensive jobs as well as large numbers of concurrent jobs. Because it is backed by a parallel filesystem, you can use MPI I/O and RDMA functions for increased performance.