- Main page: User Guide
File storage at ACENET is implemented with one of two file system technologies:
- Lustre, a parallel file system widely used in high-performance computing, at Mahone, Fundy, and Placentia.
- Oracle's SAM-QFS at Glooscap.
In early 2016 ACENET replaced a variety of aging storage hardware and software at Mahone, Fundy, and Placentia in order to assure data continuity. Lustre was also introduced at that time, and the following changes affecting users were made:
- The /globalscratch file system was merged with /home. Files formerly in /globalscratch/$USER should now be found in /home/$USER/scratch, which is no longer a symbolic link but a real subdirectory. Scripts or programs which explicitly refer to /globalscratch will have to be edited.
- The quota command is now a wrapper around the lfs quota command. The appearance of its output has changed somewhat, but your quota standing is now available immediately.
- Quotas have been adjusted to reflect the merged filesystems, and a new file count quota has been imposed with a default quota of 180,000 files per user.
- The tape layer of the old storage systems has not been replaced, for reasons of cost. While file restoration after deletion or other forms of accidental loss was never officially supported, such recovery is now practically impossible in every case.
- Red Hat Enterprise Linux 6 (RHEL6) is now the default operating system on all four clusters. This simplifies job submission for those users who have been obliged to request os=RHEL6 for the last year or so, and certain applications that were previously only available on RHEL6, such as MATLAB, can now run on any node.
- Policy document: ACENET Data Policies
ACENET does not provide backup services. Our filesystems are built with RAID redundancy to protect your data from the loss due to hardware failures, but we do not protect you from accidentally deleting or changing your own files. Users are therefore strongly encouraged to make off-site (or multi-site) copies of their critical data. Source code and other such key files should be managed with a version control tool such as Git, Subversion, Mercurial, or CVS.
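A minimal sketch of putting key files under Git, as recommended above. The directory and file names here are placeholders, and the demonstration uses a throwaway directory; you would run the equivalent commands in your own project directory:

```shell
# Create a throwaway project directory for the demonstration.
proj=$(mktemp -d)
cd "$proj"

# Put the directory under Git and record a first snapshot.
git init -q
echo 'print("hello")' > run.py
git add run.py
git -c user.name=demo -c user.email=demo@example.org commit -q -m "Initial snapshot"

# One commit is now recorded.
git log --oneline

# For the off-site copy, add a remote of your own and push to it, e.g.:
#   git remote add origin <your-remote-url>
#   git push -u origin master
```

Pushing to a remote hosted at your home institution or an external service provides the off-site copy of critical files recommended above.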
You should also be aware of your home institution's data storage policies and follow them. Some institutions offer network backup facilities which you might be able to take advantage of. MUN users can take advantage of MUN's RDB system for backing up data from Placentia.
ACENET does not provide permanent data archiving.
Data retention policy
Data stored in expired accounts is subject to deletion after a grace period of 4 months.
There are three types of disk space available to the user on most ACENET clusters: one Permanent Storage area and two kinds of Temporary Storage. The general outline of the ACENET storage system is given below.
- Permanent Storage system on every cluster
||Main storage (/home)||critical data and code||network|
- Temporary Storage system on every cluster
||No-quota Scratch (/nqs)||temporary data, large data||network|
||Local Scratch||temporary data, fast read/write access||node-local|
Main storage (your home directory) is your personal and permanent space for research-critical data and code. This is where you should put your data prior to and after computations, and where you should keep source code and executables. It is located in /home/<username>, where <username> is replaced with your username. When you log in, this is your current working directory. You may create whatever subdirectories you like here. Main storage is networked storage shared among all compute nodes, via Lustre (or NFS, the Network File System, at Glooscap).
Storage quotas are implemented at all clusters. The default quota values (soft limits) are given in the table below. The hard limit quotas are 5-10% higher (except at Glooscap). The grace period for exceeding the soft limit is one week.
|| ||Mahone||Fundy||Placentia||Glooscap|
||bytes per user||150 GB||155 GB||75 GB||61 GB|
||files per user||180,000||180,000||180,000||no limit|
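With a file-count quota in effect it is useful to know how many files a directory tree contains. A small illustration with a throwaway directory (run the same find against /home/$USER to count your own files):

```shell
# Build a small directory tree containing five files.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
for i in 1 2 3 4 5; do touch "$demo/sub/file$i.dat"; done

# Count regular files recursively; prints 5 here.
find "$demo" -type f | wc -l

rm -rf "$demo"
```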
Your usage and limit information can be found with the quota command:
$ quota
You can also use du to determine how much space your files occupy:
$ du -h --max-depth=1 /home/$USER/
In the table below, the Disk Allocation Unit (DAU) sizes on ACENET clusters are given. Where two numbers are specified for the DAU, as X (Y) KB, the first 8 blocks of a file will be X KB each and the remaining blocks Y KB each.
|| ||Mahone||Fundy||Placentia||Glooscap|
||DAU||4 KB||4 KB||4 KB||4 (64) KB|
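The practical effect of the DAU is that even a tiny file occupies one full allocation unit on disk. A quick way to see the difference between a file's apparent size and its allocated size (stat -c and the du options shown are GNU extensions):

```shell
# Create a 1-byte file.
f=$(mktemp)
printf 'x' > "$f"

# Apparent size: 1 byte.
stat -c 'apparent size: %s bytes' "$f"

# Allocated size: one full allocation unit, e.g. 4.0K on a 4 KB DAU.
du -h "$f"

rm -f "$f"
```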
No-quota Scratch (NQS) is temporary network storage that has no per-user quota limit, but gets cleaned periodically to get rid of old files. It's available at
/nqs/<username>/ on every cluster to users who have requested access to it.
NQS is designed to allow you to store large amounts of data on a temporary basis, for example, files generated and consumed during a single job that cannot be stored on Main Storage or Global Scratch due to the per-user quotas. Because no quotas are enforced on NQS, there is an irreducible risk that the filesystem will fill up. Should that occur, existing data on
/nqs may be unrecoverable. This makes it unsuitable for storage of critical data. Long-term storage of data, critical or not, is also not appropriate, since it increases the risk of the filesystem filling up during its intended use.
You are expected to delete your files from
/nqs once the associated job or jobs are complete. Technical staff also reserve the right to delete files manually in the event of a manifest risk of a fill-up emergency.
To ensure that these guidelines are followed and
/nqs stays usable for its intended purpose, files which have not been accessed for 31 days are automatically deleted. The deletion routine will notify you seven days in advance of removing any of your files if you keep a file named
/home/<username>/.nqs in your home directory with these contents:
The NQS filesystems have the following characteristics:
|| ||Mahone||Fundy||Placentia||Glooscap|
||size||12 T||13 T||12 T||19 T|
||DAU||4 KB||4 KB||4 KB||4 KB|
If you want to check how much space is used or available in NQS, use the following command:
$ df -h /nqs/$USER
To examine the last access time of your files:
$ ls -lu /nqs/$USER   # in the given directory
$ ls -luR /nqs/$USER  # in subdirectories too, recursively
To find, recursively, files which have not been accessed in, for example, the last 24 days:
$ find /nqs/$USER -type f -atime +24
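The behaviour of -atime can be checked with an artificially aged file. The directory here is a throwaway stand-in for /nqs/$USER, and touch -a -d is a GNU extension:

```shell
demo=$(mktemp -d)
touch "$demo/fresh.txt"                      # access time is now
touch -a -d '30 days ago' "$demo/stale.txt"  # access time set 30 days back

# Only stale.txt is more than 24 days old by access time.
find "$demo" -type f -atime +24

rm -rf "$demo"
```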
- Main page: Local Scratch
Each compute node has its own disk (or in some cases, solid state memory) which is not shared with other compute nodes. We refer to this as local disk. If it is used to store temporary files for an individual job, then we refer to that as "local scratch storage".
Local scratch has an advantage over network storage: it is not prone to slowdowns when cluster load is high. If your application does a high volume of input/output then using local scratch might result in more predictable run times. However, local scratch is more complicated to use than network storage. If you are willing to invest some effort into learning how to use node-local disk in general and the specifics of ACENET's node-local scratch in particular, then please read Local Scratch.
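The usual pattern is to stage input onto local scratch at the start of a job, compute there, and copy results back before the job ends. A self-contained sketch, with two temporary directories standing in for the home directory and the node-local scratch (in a real job the scheduler typically provides the scratch path; see Local Scratch for the specifics):

```shell
home=$(mktemp -d)     # stands in for /home/$USER in this sketch
scratch=$(mktemp -d)  # stands in for the node-local scratch directory

echo "input data" > "$home/input.dat"

cp "$home/input.dat" "$scratch/"                               # 1. stage input onto local disk
tr 'a-z' 'A-Z' < "$scratch/input.dat" > "$scratch/output.dat"  # 2. compute with local I/O
cp "$scratch/output.dat" "$home/"                              # 3. copy results back

cat "$home/output.dat"    # prints: INPUT DATA

rm -rf "$home" "$scratch"
```

Step 3 is essential: local scratch belongs to the node, so anything not copied back is lost once the job finishes.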