From 5154ac1b7939fc2f13896f45c7a6d2e4cbd7b29f Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 14:22:40 -0500 Subject: [PATCH 01/26] goodbye gpfs and minor reformat modified: docs/hpc/03_storage/01_intro_and_data_management.mdx modified: docs/hpc/03_storage/02_available_storage_systems.md modified: docs/hpc/03_storage/05_research_project_space.mdx modified: docs/hpc/03_storage/06_best_practices.md modified: docs/hpc/12_tutorial_intro_shell_hpc/03_moving_looking.mdx --- .../01_intro_and_data_management.mdx | 127 ++++++------------ .../02_available_storage_systems.md | 25 +--- .../03_storage/05_research_project_space.mdx | 2 +- docs/hpc/03_storage/06_best_practices.md | 1 - .../03_moving_looking.mdx | 3 +- 5 files changed, 46 insertions(+), 112 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index bc718ecc7f..3c9e21ff86 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -1,131 +1,88 @@ # HPC Storage -The NYU HPC clusters are served by a General Parallel File System (GPFS) cluster and an all Flash VAST storage cluster. - -The NYU HPC team supports data storage, transfer, and archival needs on the HPC clusters, as well as collaborative research services like the [Research Project Space (RPS)](./05_research_project_space.mdx). - -## Highlights -- 9.5 PB Total GPFS Storage - - Up to 78 GB per second read speeds - - Up to 650k input/output operations per second (IOPS) -- Research Project Space (RPS): RPS volumes provide working spaces for sharing data and code amongst project or lab members +The Torch HPC cluster is served by an all Flash VAST storage cluster. The HPC team supports data storage, transfer, and archival needs on the HPC clusters, as well as collaborative research services like the [Research Project Space (RPS)](./05_research_project_space.mdx). ## Introduction to HPC Data Management -The NYU HPC Environment provides access to a number of ***file systems*** to better serve the needs of researchers managing data during the various stages of the research data lifecycle (data capture, analysis, archiving, etc.). Each HPC file system comes with different features, policies, and availability. - -In addition, a number of ***data management tools*** are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. +The NYU HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the research data life cycle (data capture, analysis, archiving, etc.). Each HPC file-system comes with different features, policies, and availability. -Multiple ***public data sets*** are available to all users of the HPC environment, such as a subset of The Cancer Genome Atlas (TCGA), the Million Song Database, ImageNet, and Reference Genomes. +In addition, a number of data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./04_globus.md) as the primary data transfer tool. -Below is a list of file systems with their characteristics and a summary table. 
Reviewing the list of available file systems and the various Scenarios/Use cases that are presented below, can help select the right file systems for a research project. As always, if you have any questions about data storage in the HPC environment, you can request a consultation with the HPC team by sending email to [hpc@nyu.edu](mailto:hpc@nyu.edu). +Multiple ***public data sets*** are available to all users of the HPC environment, such as a subset of The Cancer Genome Atlas (`TCGA`), the Million Song Database, `ImageNet`, and Reference Genomes. More information can be found in the [datasets section](../04_datasets/01_intro.md). -### Data Security Warning -::::warning -#### Moderate Risk Data - HPC Approved +:::warning[HPC is only approved for Moderate Risk Data] - The HPC Environment has been approved for storing and analyzing **Moderate Risk research data**, as defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). -- **High Risk** research data, such as those that include Personal Identifiable Information (**PII**) or electronic Protected Health Information (**ePHI**) or Controlled Unclassified Information (**CUI**) **should NOT be stored in the HPC Environment**. -:::note -only the Office of Sponsored Projects (OSP) and Global Office of Information Security (GOIS) are empowered to classify the risk categories of data. +- **High Risk** research data, such as those that include Personal Identifiable Information (**PII**) or electronic Protected Health Information (**ePHI**) or Controlled Unclassified Information (**CUI**) **should NOT be stored in the HPC Environment**. +- Only the Office of Sponsored Projects (OSP) and Global Office of Information Security (GOIS) are empowered to classify the risk categories of data. ::: -:::tip -#### High Risk Data - Secure Research Data Environments (SRDE) Approved -Because the HPC system is not approved for High Risk data, we recommend using an approved system like the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md). +:::info[SRDE for High Risk Data] +Because the HPC system is not approved for High Risk data, we recommend using an approved system like the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md). ::: -:::: -### Data Storage options in the HPC Environment -#### User Home Directories -Every individual user has a home directory (under **`/home/$USER`**, environment variable **`$HOME`**) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) **30,000** per user. Users can check their quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. +## Data Storage options in the HPC Environment -User home directories are backed up daily and old files under **`$HOME`** are not purged. +Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various Scenarios/Use cases that are presented below, can help select the right file-systems for a research project. -The User home directories are available on all HPC clusters (Torch) and on every cluster node (login nodes, compute nodes) as well as and Data Transfer Node (gDTN). 
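Whichever file-systems you end up using, two habits help from the start: check your quota and inode usage regularly, and refer to each file-system through its environment variable rather than a hard-coded path. A minimal sketch (run on a login node; a full sample of the `myquota` output is shown on the best-practices page):

```sh
# Report disk-space and inode usage on each HPC file-system
myquota

# Every file-system has an environment variable pointing at your
# personal directory on it -- prefer these over hard-coded paths
echo "home:    $HOME"
echo "scratch: $SCRATCH"
echo "archive: $ARCHIVE"
```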
+### User Home Directories
+Every individual user has a home directory (under **`/home/$USER`**, environment variable **`$HOME`**) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and a limit of **30,000** inodes (files) per user. Users can check their quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. User home directories are backed up daily and old files under **`$HOME`** are not purged. The user home directories are available on every cluster node (login nodes, compute nodes) as well as the Data Transfer Node (gDTN).

 :::warning
 Avoid changing file and directory permissions in your home directory to allow other users to access files.
 :::
-User Home Directories are not ideal for sharing files and folders with other users. HPC Scratch or [Research Project Space (RPS)](./05_research_project_space.mdx) are better file systems for sharing data.
-:::warning
-**One of the common issues that users report regarding their home directories is running out of inodes,** i.e. the number of files stored under their home exceeds the inode limit, which by default is set to 30,000 files. This typically occurs when users install software under their home directories, for example, when working with Conda and Julia environments, that involve many small files.
-:::
+User Home Directories are not ideal for sharing files and folders with other users. HPC Scratch or [Research Project Space (RPS)](./05_research_project_space.mdx) are better file-systems for sharing data.

-:::tip
+:::tip[`inode` limits]
+- One of the common issues that users report regarding their home directories is running out of inodes (i.e. the number of files stored under their home exceeds the inode limit), which by default is set to 30,000 files.
 - To find out the current space and inode quota utilization and the distribution of files under your home directory, please see: [Understanding user quota limits and the myquota command.](./06_best_practices.md#user-quota-limits-and-the-myquota-command)
-- **Working with Conda environments:** To avoid running out of inode limits in home directories, the HPC team recommends **setting up conda environments with Singularity overlay images**
+- Working with `conda` environments: To avoid hitting the inode limit in home directories, the HPC team recommends **setting up `conda` environments with Singularity overlay images** as [described here](../07_containers/03_singularity_with_conda.md). Avoid creating `conda` environments in your `$HOME` directory.
 :::

-#### HPC Scratch
-The HPC scratch file system is the HPC file system where most of the users store research data needed during the analysis phase of their research projects. The scratch file system provides ***temporary*** storage for datasets needed for running jobs.
-
-Files stored in the HPC scratch file system are subject to the **HPC Scratch old file purging policy:** Files on the /scratch file system that have not been accessed for 60 or more days will be purged.
-
-Every user has a dedicated scratch directory (**/scratch/$USER**) with **5 TB** disk quota and **1,000,000 inodes** (files) limit per user.
+### HPC Scratch
+The HPC scratch is an all-flash (VAST) file-system where most users store research data needed during the analysis phase of their research projects. The scratch file-system provides ***temporary*** storage for datasets needed for running jobs.
Every user has a dedicated scratch directory (**/scratch/$USER**) with a **5 TB** disk quota and a limit of **5,000,000** inodes (files) per user. The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as the Data Transfer Node (gDTN).

-The scratch file system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN).
-
-:::warning
-There are **No Back ups of the scratch file system.** ***Files that were deleted accidentally or removed due to storage system failures CAN NOT be recovered.***
+:::warning[Scratch Purging Policy]
+- Files on the /scratch file-system that have not been accessed for 60 or more days will be purged.
+- There are no backups of the scratch file-system. Files that were deleted accidentally or removed due to storage system failures CANNOT be recovered.
 :::

 :::tip

-- Since there are ***no back ups of HPC Scratch file system***, users should not put important source code, scripts, libraries, executables in `/scratch`. These important files should be stored in file systems that are backed up, such as `/home` or [Research Project Space (RPS)](./05_research_project_space.mdx). Code can also be stored in a ***git*** repository.
-- ***Old file purging policy on HPC Scratch:*** All files on the HPC Scratch file system that have not been accessed ***for more than 60 days*** will be removed. It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off.
+- Since there are ***no backups of the HPC Scratch file-system***, users should not put important source code, scripts, libraries, or executables in `/scratch`. These important files should be stored in file-systems that are backed up, such as `/home` or [Research Project Space (RPS)](./05_research_project_space.mdx). Code can also be stored in a `git` repository.
+- ***Old file purging policy on HPC Scratch:*** All files on the HPC Scratch file-system that have not been accessed ***for more than 60 days*** will be removed. It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off.
 - To find out the user's current disk space and inode quota utilization and the distribution of files under your scratch directory, please see: [Understanding user quota Limits and the myquota command.](./06_best_practices.md#user-quota-limits-and-the-myquota-command)
-- Once a research project completes, users should archive their important files in the [HPC Archive file system](./01_intro_and_data_management.mdx#hpc-archive).
+- Once a research project completes, users should archive their important files in the [HPC Archive file-system](./01_intro_and_data_management.mdx#hpc-archive).
 :::

-#### HPC Vast
-The HPC Vast all-flash file system is the HPC file system where users store research data needed during the analysis phase of their research projects, particularly for high I/O data that can bottleneck on the scratch file system. The Vast file system provides ***temporary*** storage for datasets needed for running jobs.
-
-Files stored in the HPC vast file system are subject to the ***HPC Vast old file purging policy:*** Files on the `/vast` file system that have not been accessed for **60 or more days** will be purged.
- -Every user has a dedicated vast directory (**`/vast/$USER`**) with **2 TB** disk quota and **5,000,000 inodes** (files) limit per user. - -The vast file system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). - -:::warning -There are **No Back ups** of the vastsc file system. ***Files that were deleted accidentally or removed due to storage system failures CAN NOT be recovered.*** -::: - -:::tip -- Since there are ***no back ups of HPC Vast file system***, users should not put important source code, scripts, libraries, executables in `/vast`. These important files should be stored in file systems that are backed up, such as `/home` or [Research Project Space (RPS)](./05_research_project_space.mdx). Code can also be stored in a ***git*** repository. -- ***Old file purging policy on HPC Vast:*** All files on the HPC Vast file system that have not been accessed ***for more than 60 days will be removed.*** It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off. -- To find out the user's current disk space and inode quota utilization and the distribution of files under your vast directory, please see: [Understanding user quota Limits and the myquota command.](./06_best_practices.md#user-quota-limits-and-the-myquota-command) -- Once a research project completes, users should archive their important files in the [HPC Archive file system](./01_intro_and_data_management.mdx#hpc-archive). -::: - -#### HPC Research Project Space -The HPC Research Project Space (RPS) provides data storage space for research projects that is easily shared amongst collaborators, ***backed up***, and ***not subject to the old file purging policy***. HPC RPS was introduced to ease data management in the HPC environment and eliminate the need of having to frequently copying files between Scratch and Archive file systems by having all projects files under one area. ***These benefits of the HPC RPS come at a cost***. The cost is determined by the allocated disk space and the number of files (inodes). +### HPC Research Project Space +The HPC Research Project Space (RPS) provides data storage space for research projects that is easily shared amongst collaborators, ***backed up***, and ***not subject to the old file purging policy***. HPC RPS was introduced to ease data management in the HPC environment and eliminate the need of having to frequently copying files between Scratch and Archive file-systems by having all projects files under one area. ***These benefits of the HPC RPS come at a cost***. The cost is determined by the allocated disk space and the number of files (inodes). - For detailed information about RPS see: [HPC Research Project Space](./05_research_project_space.mdx) -#### HPC Work -The HPC team makes available a number of public datasets that are commonly used in analysis jobs. The data sets are available Read-Only under **`/scratch/work/public`**. - -For some of the datasets users must provide a signed usage agreement before accessing. - -Public datasets available on the HPC clusters can be viewed on the [Datasets page](../04_datasets/01_intro.md). +### HPC Work +The HPC team makes available a number of public datasets that are commonly used in analysis jobs. The data-sets are available Read-Only under `/scratch/work/public`. 
For some of the datasets users must provide a signed usage agreement before accessing. Public datasets available on the HPC clusters can be viewed on the [Datasets page](../04_datasets/01_intro.md). -#### HPC Archive -Once the Analysis stage of the research data lifecycle has completed, _HPC users should **tar** their data and code into a single tar.gz file and then copy the file to their archive directory (**`/archive/$USER`**_). The HPC Archive file system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. The rather low limit on the number of inodes per user is intentional. The archive file system is available only ***on login nodes*** of Torch. The archive file system is backed up daily. +### HPC Archive +Once the Analysis stage of the research data lifecycle has completed, HPC users should **tar** their data and code into a single tar.gz file and then copy the file to their archive directory (`/archive/$USER`). The HPC Archive file-system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. The rather low limit on the number of inodes per user is intentional. The archive file-system is available only ***on login nodes*** of Torch. The archive file-system is backed up daily. -- Here is an example ***tar*** command that combines the data in a directory named ***my_run_dir*** under ***`$SCRATCH`*** and outputs the tar file in the user's ***`$ARCHIVE`***: +- Here is an example ***tar*** command that combines the data in a directory named `my_run_dir` under `$SCRATCH` and outputs the tar file in the user's `$ARCHIVE`: ```sh # to archive `$SCRATCH/my_run_dir` tar cvf $ARCHIVE/simulation_01.tar -C $SCRATCH my_run_dir ``` -#### NYU (Google) Drive +### NYU (Google) Drive Google Drive ([NYU Drive](https://www.nyu.edu/life/information-technology/communication-and-collaboration/document-collaboration-and-sharing/nyu-drive.html)) is accessible from the NYU HPC environment and provides an option to users who wish to archive data or share data with external collaborators who do not have access to the NYU HPC environment. -As of December 2023, storage limits were applied to all faculty, staff, and sutdent NYU Google accounts. Please see [Google Workspace Storage](https://www.nyu.edu/life/information-technology/about-nyu-it/key-projects-and-initiatives/google-workspace-storage.html) for details +As of December 2023, storage limits were applied to all faculty, staff, and student NYU Google accounts. Please see [Google Workspace Storage](https://www.nyu.edu/life/information-technology/about-nyu-it/key-projects-and-initiatives/google-workspace-storage.html) for details. -There are also limits to the data transfer rate in moving to/from Google Drive. Thus, moving many small files to Google Drive is not going to be efficient. +There are also limits to the data transfer rate in moving to/from Google Drive. Thus, moving many small files to Google Drive is not going to be efficient. Please read the [Instructions on how to use cloud storage within the NYU HPC Environment](./08_transferring_cloud_storage_data_with_rclone.md). -Please read the [Instructions on how to use cloud storage within the NYU HPC Environment](./08_transferring_cloud_storage_data_with_rclone.md). 
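For example, once a Google Drive remote has been set up with `rclone config` (the instructions linked above walk through the configuration), copying a results directory out of scratch might look like the sketch below; `mygdrive` is a placeholder for whatever name you gave the remote, and the module version may differ:

```sh
# Run transfers on a data-transfer node (gDTN), not on a login node
module load rclone

# 'mygdrive' is the remote name chosen during `rclone config`
rclone copy "$SCRATCH/results/run01" mygdrive:archive/run01 --progress
```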
+### HPC Storage Comparison Table -#### HPC Storage Mounts Comparison Table - +| Space | Environment Variable | Purpose | Backed Up / Flushed | Quota Disk Space / # of Files | +|-----------------------------|----------------------|-------------------------------------------------------|-------------------------------------|------------------------------------| +| /home | $HOME | Personal user home space that is best for small files | YES / NO | 50 GB / 30 K | +| /scratch | $SCRATCH | Best for large files | NO / Files not accessed for 60 days | 5 TB / 5 M | +| /archive | $ARCHIVE | Long-term storage | YES / NO | 2 TB / 20 K | +| HPC Research Project Space | NA | Shared disk space for research projects | YES / NO | Payment based TB-year/inodes-year | Please see the next page for best practices for data management on NYU HPC systems. diff --git a/docs/hpc/03_storage/02_available_storage_systems.md b/docs/hpc/03_storage/02_available_storage_systems.md index afe6ef4667..701fb5b805 100644 --- a/docs/hpc/03_storage/02_available_storage_systems.md +++ b/docs/hpc/03_storage/02_available_storage_systems.md @@ -2,30 +2,9 @@ The NYU HPC clusters are served by the following storage systems: -## GPFS -General Parallel File System (GPFS) storage cluster is a high-performance clustered file system developed by IBM that provides concurrent high-speed file access to applications executing on multiple nodes of clusters. - -### Configuration -The NYU HPC cluster storage runs on Lenovo Distributed Storage Solution DSS-G hardware: -- 2x DSS-G 202 - - 116 Solid State Drives (SSDs) - - 464TB raw storage -- 2x DSS-G 240 - - 668 Hard Disk Drives (HDDs) - - 9.1PB raw storage - -### Performance -- Read Speed: 78 GB per second read speeds -- Write Speed: 42 GB per second write speeds -- I/O Performance: up to 650k input/output operations per second (IOPS) - ## Flash Tier Storage (VAST) -An all flash file system, using [VAST Flash storage](https://www.vastdata.com/), is now available on Torch. Flash storage is optimal for computational workloads with high I/O rates. For example, If you have jobs to run with huge amount of tiny files, VAST may be a good candidate. If you and your lab members are interested, please reach out to [hpc@nyu.edu](mailto:hpc@nyu.edu) for more information. -- NVMe interface -- Total size: 778 TB -:::note -/vast is available for all users to read and available to approved users to write data. -::: +An all flash file-system, using [VAST Flash storage](https://www.vastdata.com/), is available as the primary file-system for Torch. Flash storage is optimal for computational workloads with high I/O rates. + ## Research Project Space (RPS) [Research Project Space (RPS)](./05_research_project_space.mdx) volumes provide working spaces for sharing data and code amongst project or lab members. diff --git a/docs/hpc/03_storage/05_research_project_space.mdx b/docs/hpc/03_storage/05_research_project_space.mdx index f5e3e1c362..9f7b958955 100644 --- a/docs/hpc/03_storage/05_research_project_space.mdx +++ b/docs/hpc/03_storage/05_research_project_space.mdx @@ -1,7 +1,7 @@ # Research Project Space (RPS) ## Description -Research Project Space (RPS) volumes provide working space for sharing data and code amongst project or lab members. RPS directories are built on the same parallel file system (GPFS) like HPC Scratch. They are mounted on the cluster Compute Nodes, and thus they can be accessed by running jobs. RPS directories are backed up and there is no old file purging policy. 
These features of RPS simplify the management of data in the HPC environment as users of the HPC Cluster can store their data and code on RPS directories and they do not need to move data between the HPC Scratch and the HPC Archive file systems. +Research Project Space (RPS) volumes provide working space for sharing data and code amongst project or lab members. RPS directories are built on the same parallel file system (VAST) like HPC Scratch. They are mounted on the cluster Compute Nodes, and thus they can be accessed by running jobs. RPS directories are backed up and there is no old file purging policy. These features of RPS simplify the management of data in the HPC environment as users of the HPC Cluster can store their data and code on RPS directories and they do not need to move data between the HPC Scratch and the HPC Archive file systems. :::note - Due to limitations of the underlying parallel file system, ***the total number of RPS volumes that can be created is limited***. diff --git a/docs/hpc/03_storage/06_best_practices.md b/docs/hpc/03_storage/06_best_practices.md index 65705284b0..fa90799177 100644 --- a/docs/hpc/03_storage/06_best_practices.md +++ b/docs/hpc/03_storage/06_best_practices.md @@ -17,7 +17,6 @@ Space Variable /Flushed? Space / Files Space(%) / Files(%) /home $HOME Yes/No 50.0GB/30.0K 8.96GB(17.91%)/33000(110.00%) /scratch $SCRATCH No/Yes 5.0TB/1.0M 811.09GB(15.84%)/2437(0.24%) /archive $ARCHIVE Yes/No 2.0TB/20.0K 0.00GB(0.00%)/1(0.00%) -/vast $VAST No/Yes 2.0TB/5.0M 0.00GB(0.00%)/1(0.00%) ``` Users can find out the number of inodes (files) used per subdirectory under their home directory (`$HOME`), by running the following commands: ```sh diff --git a/docs/hpc/12_tutorial_intro_shell_hpc/03_moving_looking.mdx b/docs/hpc/12_tutorial_intro_shell_hpc/03_moving_looking.mdx index a8ab10fb37..c35ac0046c 100644 --- a/docs/hpc/12_tutorial_intro_shell_hpc/03_moving_looking.mdx +++ b/docs/hpc/12_tutorial_intro_shell_hpc/03_moving_looking.mdx @@ -24,7 +24,6 @@ The NYU HPC clusters have multiple file systems for user’s files. Each file sy | /home | $HOME | Program development space; storing small files you want to keep long term, e.g. source code, scripts. | NO | 20 GB | | /scratch | $SCRATCH | Computational workspace. Best suited to large, infrequent reads and writes. | YES. Files not accessed for 60 days are deleted. | 5 TB | | /archive | $ARCHIVE | Long-term storage | NO | 2 TB | -| /vast | $VAST | Flash memory for high I/O workflows | YES. Files not accessed for 60 days are deleted. | 2 TB | Please see [HPC Storage](../03_storage/01_intro_and_data_management.mdx) for more details. @@ -374,4 +373,4 @@ The directories are listed alphabetical at each level, the files/directories in - To view files, use `ls`. - You can view help for a command with `man command` or `command --help`. - Hit `tab` to autocomplete whatever you’re currently typing. 
-::: \ No newline at end of file +::: From 65930968235133143b3a035c2d4c52338d253069 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 15:46:12 -0500 Subject: [PATCH 02/26] remove spurious page --- docs/cloud/04_dataproc/02_data_management.md | 2 +- .../01_intro_and_data_management.mdx | 19 ++++++++----------- .../02_available_storage_systems.md | 18 ------------------ ...data_transfers.md => 02_data_transfers.md} | 8 ++++---- .../03_storage/{04_globus.md => 03_globus.md} | 0 ...pace.mdx => 04_research_project_space.mdx} | 0 ...best_practices.md => 05_best_practices.md} | 2 +- ...s.md => 06_large_number_of_small_files.md} | 0 ...ferring_cloud_storage_data_with_rclone.md} | 4 ++-- ...ta_on_hpc.md => 08_sharing_data_on_hpc.md} | 0 ...hon_packages_with_virtual_environments.mdx | 2 +- .../05_r_packages_with_renv.mdx | 2 +- .../06_conda_environments.mdx | 2 +- .../10_using_resources_responsibly.mdx | 4 ++-- 14 files changed, 21 insertions(+), 42 deletions(-) delete mode 100644 docs/hpc/03_storage/02_available_storage_systems.md rename docs/hpc/03_storage/{03_data_transfers.md => 02_data_transfers.md} (95%) rename docs/hpc/03_storage/{04_globus.md => 03_globus.md} (100%) rename docs/hpc/03_storage/{05_research_project_space.mdx => 04_research_project_space.mdx} (100%) rename docs/hpc/03_storage/{06_best_practices.md => 05_best_practices.md} (97%) rename docs/hpc/03_storage/{07_large_number_of_small_files.md => 06_large_number_of_small_files.md} (100%) rename docs/hpc/03_storage/{08_transferring_cloud_storage_data_with_rclone.md => 07_transferring_cloud_storage_data_with_rclone.md} (99%) rename docs/hpc/03_storage/{09_sharing_data_on_hpc.md => 08_sharing_data_on_hpc.md} (100%) diff --git a/docs/cloud/04_dataproc/02_data_management.md b/docs/cloud/04_dataproc/02_data_management.md index 4997858f77..94f7a23929 100644 --- a/docs/cloud/04_dataproc/02_data_management.md +++ b/docs/cloud/04_dataproc/02_data_management.md @@ -4,7 +4,7 @@ HDFS stands for Hadoop Distributed File System. HDFS is a highly fault-tolerant ### File Permissions and Access Control Lists -You can share files with others using [access control lists (ACLs)](../../hpc/03_storage/09_sharing_data_on_hpc.md). An ACL gives you per-file, per-directory and per-user control over who has permission to access files. You can see the ACL for a file or directory with the getfacl command: +You can share files with others using [access control lists (ACLs)](../../hpc/03_storage/08_sharing_data_on_hpc.md). An ACL gives you per-file, per-directory and per-user control over who has permission to access files. You can see the ACL for a file or directory with the getfacl command: ```sh hdfs dfs -getfacl /user/_nyu_edu/testdir ``` diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 3c9e21ff86..67c140da54 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -1,13 +1,10 @@ # HPC Storage -The Torch HPC cluster is served by an all Flash VAST storage cluster. The HPC team supports data storage, transfer, and archival needs on the HPC clusters, as well as collaborative research services like the [Research Project Space (RPS)](./05_research_project_space.mdx). 
- -## Introduction to HPC Data Management The NYU HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the research data life cycle (data capture, analysis, archiving, etc.). Each HPC file-system comes with different features, policies, and availability. -In addition, a number of data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./04_globus.md) as the primary data transfer tool. +Numerous data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./03_globus.md) as the primary data transfer tool. -Multiple ***public data sets*** are available to all users of the HPC environment, such as a subset of The Cancer Genome Atlas (`TCGA`), the Million Song Database, `ImageNet`, and Reference Genomes. More information can be found in the [datasets section](../04_datasets/01_intro.md). +Selected public datasets are available to all HPC users, such as a subset of The Cancer Genome Atlas (`TCGA`), the Million Song Database, `ImageNet`, and Reference Genomes. More information can be found in the [datasets section](../04_datasets/01_intro.md). :::warning[HPC is only approved for Moderate Risk Data] - The HPC Environment has been approved for storing and analyzing **Moderate Risk research data**, as defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). @@ -29,11 +26,11 @@ Every individual user has a home directory (under **`/home/$USER`**, environment Avoid changing file and directory permissions in your home directory to allow other users to access files. ::: -User Home Directories are not ideal for sharing files and folders with other users. HPC Scratch or [Research Project Space (RPS)](./05_research_project_space.mdx) are better file-systems for sharing data. +User Home Directories are not ideal for sharing files and folders with other users. HPC Scratch or [Research Project Space (RPS)](./04_research_project_space.mdx) are better file-systems for sharing data. :::tip[`inode` limits] - One of the common issues that users report regarding their home directories is running out of inodes (i.e. the number of files stored under their home exceeds the inode limit), which by default is set to 30,000 files -- To find out the current space and inode quota utilization and the distribution of files under your home directory, please see: [Understanding user quota limits and the myquota command.](./06_best_practices.md#user-quota-limits-and-the-myquota-command) +- To find out the current space and inode quota utilization and the distribution of files under your home directory, please see: [Understanding user quota limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) - Working with `conda` environments: To avoid running out of inode limits in home directories, the HPC team recommends **setting up `conda` environments with Singularity overlay images** as [described here](../07_containers/03_singularity_with_conda.md). Avoid creating `conda` environments in your `$HOME` directory. 
::: @@ -47,15 +44,15 @@ The HPC scratch an all flash (VAST) file-system where most of the users store re :::tip -- Since there are ***no backups of HPC Scratch file-system***, users should not put important source code, scripts, libraries, executables in `/scratch`. These important files should be stored in file-systems that are backed up, such as `/home` or [Research Project Space (RPS)](./05_research_project_space.mdx). Code can also be stored in a `git` repository. +- Since there are ***no backups of HPC Scratch file-system***, users should not put important source code, scripts, libraries, executables in `/scratch`. These important files should be stored in file-systems that are backed up, such as `/home` or [Research Project Space (RPS)](./04_research_project_space.mdx). Code can also be stored in a `git` repository. - ***Old file purging policy on HPC Scratch:*** All files on the HPC Scratch file-system that have not been accessed ***for more than 60 days*** will be removed. It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off. -- To find out the user's current disk space and inode quota utilization and the distribution of files under your scratch directory, please see: [Understanding user quota Limits and the myquota command.](./06_best_practices.md#user-quota-limits-and-the-myquota-command) +- To find out the user's current disk space and inode quota utilization and the distribution of files under your scratch directory, please see: [Understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) - Once a research project completes, users should archive their important files in the [HPC Archive file-system](./01_intro_and_data_management.mdx#hpc-archive). ::: ### HPC Research Project Space The HPC Research Project Space (RPS) provides data storage space for research projects that is easily shared amongst collaborators, ***backed up***, and ***not subject to the old file purging policy***. HPC RPS was introduced to ease data management in the HPC environment and eliminate the need of having to frequently copying files between Scratch and Archive file-systems by having all projects files under one area. ***These benefits of the HPC RPS come at a cost***. The cost is determined by the allocated disk space and the number of files (inodes). -- For detailed information about RPS see: [HPC Research Project Space](./05_research_project_space.mdx) +- For detailed information about RPS see: [HPC Research Project Space](./04_research_project_space.mdx) ### HPC Work The HPC team makes available a number of public datasets that are commonly used in analysis jobs. The data-sets are available Read-Only under `/scratch/work/public`. For some of the datasets users must provide a signed usage agreement before accessing. Public datasets available on the HPC clusters can be viewed on the [Datasets page](../04_datasets/01_intro.md). @@ -74,7 +71,7 @@ Google Drive ([NYU Drive](https://www.nyu.edu/life/information-technology/commun As of December 2023, storage limits were applied to all faculty, staff, and student NYU Google accounts. Please see [Google Workspace Storage](https://www.nyu.edu/life/information-technology/about-nyu-it/key-projects-and-initiatives/google-workspace-storage.html) for details. -There are also limits to the data transfer rate in moving to/from Google Drive. 
Thus, moving many small files to Google Drive is not going to be efficient. Please read the [Instructions on how to use cloud storage within the NYU HPC Environment](./08_transferring_cloud_storage_data_with_rclone.md). +There are also limits to the data transfer rate in moving to/from Google Drive. Thus, moving many small files to Google Drive is not going to be efficient. Please read the [Instructions on how to use cloud storage within the NYU HPC Environment](./07_transferring_cloud_storage_data_with_rclone.md). ### HPC Storage Comparison Table diff --git a/docs/hpc/03_storage/02_available_storage_systems.md b/docs/hpc/03_storage/02_available_storage_systems.md deleted file mode 100644 index 701fb5b805..0000000000 --- a/docs/hpc/03_storage/02_available_storage_systems.md +++ /dev/null @@ -1,18 +0,0 @@ -# Available storage systems - -The NYU HPC clusters are served by the following storage systems: - -## Flash Tier Storage (VAST) -An all flash file-system, using [VAST Flash storage](https://www.vastdata.com/), is available as the primary file-system for Torch. Flash storage is optimal for computational workloads with high I/O rates. - - -## Research Project Space (RPS) -[Research Project Space (RPS)](./05_research_project_space.mdx) volumes provide working spaces for sharing data and code amongst project or lab members. -- RPS directories are available on the Torch HPC cluster. -- There is no old-file purging policy on RPS. -- RPS is backed up. -- There is a cost per TB per year and inodes per year for RPS volumes. - -Please see [Research Project Space](./05_research_project_space.mdx) for more information. - - diff --git a/docs/hpc/03_storage/03_data_transfers.md b/docs/hpc/03_storage/02_data_transfers.md similarity index 95% rename from docs/hpc/03_storage/03_data_transfers.md rename to docs/hpc/03_storage/02_data_transfers.md index 9989801659..af71d5cc8b 100644 --- a/docs/hpc/03_storage/03_data_transfers.md +++ b/docs/hpc/03_storage/02_data_transfers.md @@ -1,14 +1,14 @@ # Data Transfers :::tip Globus -Globus is the recommended tool to use for large-volume data transfers due to the efficiency, reliability, security and ease of use. Use other tools only if you really need to. Detailed instructions available at [Globus](./04_globus.md) +Globus is the recommended tool to use for large-volume data transfers due to the efficiency, reliability, security and ease of use. Use other tools only if you really need to. Detailed instructions available at [Globus](./03_globus.md) ::: ## Data-Transfer nodes Attached to the NYU HPC cluster Torch, the Torch Data Transfer Node (gDTN) are nodes optimized for transferring data between cluster file systems (e.g. scratch) and other endpoints outside the NYU HPC clusters, including user laptops and desktops. The gDTNs have 100-Gb/s Ethernet connections to the High Speed Research Network (HSRN) and are connected to the HDR Infiniband fabric of the HPC clusters. More information on the hardware characteristics is available at [Torch spec sheet](../10_spec_sheet.md). ### Data Transfer Node Access -The HPC cluster filesystems include `/home`, `/scratch`, `/archive` and the [HPC Research Project Space](./05_research_project_space.mdx) are available on the gDTN. The Data-Transfer Node (DTN) can be accessed in a variety of ways +The HPC cluster filesystems include `/home`, `/scratch`, `/archive` and the [HPC Research Project Space](./04_research_project_space.mdx) are available on the gDTN. 
The Data-Transfer Node (DTN) can be accessed in a variety of ways - From NYU-net and the High Speed Research Network: use SSH to the DTN hostname `dtn011.hpc.nyu.edu` or `dtn012.hpc.nyu.edu` :::info @@ -42,12 +42,12 @@ where username would be your user name, project1 a directory to be copied to the ### Windows Tools #### File Transfer Clients -Windows 10 machines may have the Linux Subsystem installed, which will allow for the use of Linux tools, as listed above, but generally it is recommended to use a client such as [WinSCP](https://winscp.net/eng/docs/tunneling) or [FileZilla](https://filezilla-project.org/) to transfer data. Additionally, Windows users may also take advantage of [Globus](./04_globus.md) to transfer files. +Windows 10 machines may have the Linux Subsystem installed, which will allow for the use of Linux tools, as listed above, but generally it is recommended to use a client such as [WinSCP](https://winscp.net/eng/docs/tunneling) or [FileZilla](https://filezilla-project.org/) to transfer data. Additionally, Windows users may also take advantage of [Globus](./03_globus.md) to transfer files. ### Globus Globus is the recommended tool to use for large-volume data transfers. It features automatic performance tuning and automatic retries in cases of file-transfer failures. Data-transfer tasks can be submitted via a web portal. The Globus service will take care of the rest, to make sure files are copied efficiently, reliably, and securely. Globus is also a tool for you to share data with collaborators, for whom you only need to provide the email addresses. -The Globus endpoint for Torch is available at `nyu#torch`. Detailed instructions available at [Globus](./04_globus.md) +The Globus endpoint for Torch is available at `nyu#torch`. Detailed instructions available at [Globus](./03_globus.md) ### rclone rclone - rsync for cloud storage, is a command line program to sync files and directories to and from cloud storage systems such as Google Drive, Amazon Drive, S3, B2 etc. rclone is available on DTNs. [Please see the documentation for how to use it.](https://rclone.org/) diff --git a/docs/hpc/03_storage/04_globus.md b/docs/hpc/03_storage/03_globus.md similarity index 100% rename from docs/hpc/03_storage/04_globus.md rename to docs/hpc/03_storage/03_globus.md diff --git a/docs/hpc/03_storage/05_research_project_space.mdx b/docs/hpc/03_storage/04_research_project_space.mdx similarity index 100% rename from docs/hpc/03_storage/05_research_project_space.mdx rename to docs/hpc/03_storage/04_research_project_space.mdx diff --git a/docs/hpc/03_storage/06_best_practices.md b/docs/hpc/03_storage/05_best_practices.md similarity index 97% rename from docs/hpc/03_storage/06_best_practices.md rename to docs/hpc/03_storage/05_best_practices.md index fa90799177..2f95adc8da 100644 --- a/docs/hpc/03_storage/06_best_practices.md +++ b/docs/hpc/03_storage/05_best_practices.md @@ -39,7 +39,7 @@ $ for d in $(find $(pwd) -maxdepth 1 -mindepth 1 -type d | sort -u); do n_files= ## Large number of small files In case your dataset or workflow requires to use large number of small files, this can create a bottleneck due to read/write rates. -Please refer to [our page on working with a large number of files](./07_large_number_of_small_files.md) to learn about some of the options we recommend to consider. +Please refer to [our page on working with a large number of files](./06_large_number_of_small_files.md) to learn about some of the options we recommend to consider. 
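One commonly recommended option (see the linked page for the cluster-specific details) is to pack the small files into a single archive or image, so that jobs read one large file instead of millions of tiny ones. A minimal sketch, assuming the files live in a directory called `my_dataset`:

```sh
# Bundle the directory into a single compressed archive
tar czf my_dataset.tar.gz my_dataset/

# Or, if squashfs-tools is available, build a compressed read-only
# image that can later be mounted inside a Singularity container
mksquashfs my_dataset/ my_dataset.sqf -comp lz4
```

The Squash File System and Singularity page in the containers section describes how to use such an image inside a job.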
## Installing Python packages :::warning diff --git a/docs/hpc/03_storage/07_large_number_of_small_files.md b/docs/hpc/03_storage/06_large_number_of_small_files.md similarity index 100% rename from docs/hpc/03_storage/07_large_number_of_small_files.md rename to docs/hpc/03_storage/06_large_number_of_small_files.md diff --git a/docs/hpc/03_storage/08_transferring_cloud_storage_data_with_rclone.md b/docs/hpc/03_storage/07_transferring_cloud_storage_data_with_rclone.md similarity index 99% rename from docs/hpc/03_storage/08_transferring_cloud_storage_data_with_rclone.md rename to docs/hpc/03_storage/07_transferring_cloud_storage_data_with_rclone.md index 10316b7868..e306b8177b 100644 --- a/docs/hpc/03_storage/08_transferring_cloud_storage_data_with_rclone.md +++ b/docs/hpc/03_storage/07_transferring_cloud_storage_data_with_rclone.md @@ -1,7 +1,7 @@ # Transferring Cloud Storage Data with rclone ## Transferring files to and from Google Drive with RCLONE -Having access to Google Drive from the HPC environment provides an option to archive data and even share data with collaborators who have no access to the NYU HPC environment. Other options to archiving data include the HPC Archive file system and using [Globus](./04_globus.md) to share data with collaborators. +Having access to Google Drive from the HPC environment provides an option to archive data and even share data with collaborators who have no access to the NYU HPC environment. Other options to archiving data include the HPC Archive file system and using [Globus](./03_globus.md) to share data with collaborators. Access to Google Drive is provided by [rclone](https://rclone.org/drive/) - rsync for cloud storage - a command line program to sync files and directories to and from cloud storage systems such as Google Drive, Amazon Drive, S3, B2 etc. [rclone](https://rclone.org/drive/) is available on Torch cluster as a module, the module versions currently available (March 2025) are: - **rclone/1.68.2** @@ -344,7 +344,7 @@ Please enter 'q' and we're done with configuration. ### Step 4: Transfer :::warning -Please be sure to perform data transfers on a data transfer node (DTN). It can degrade performance for other users to perform transfers on other types of nodes. For more information please see [Data Transfers](./03_data_transfers.md) +Please be sure to perform data transfers on a data transfer node (DTN). It can degrade performance for other users to perform transfers on other types of nodes. For more information please see [Data Transfers](./02_data_transfers.md) ::: Sample commands: diff --git a/docs/hpc/03_storage/09_sharing_data_on_hpc.md b/docs/hpc/03_storage/08_sharing_data_on_hpc.md similarity index 100% rename from docs/hpc/03_storage/09_sharing_data_on_hpc.md rename to docs/hpc/03_storage/08_sharing_data_on_hpc.md diff --git a/docs/hpc/06_tools_and_software/04_python_packages_with_virtual_environments.mdx b/docs/hpc/06_tools_and_software/04_python_packages_with_virtual_environments.mdx index aa29be619c..b8959a0857 100644 --- a/docs/hpc/06_tools_and_software/04_python_packages_with_virtual_environments.mdx +++ b/docs/hpc/06_tools_and_software/04_python_packages_with_virtual_environments.mdx @@ -33,7 +33,7 @@ Thus you can consider the following options: - Reinstall your packages if some of the files get deleted - You can do this manually - You can do this automatically. 
For example, within a workflow of a pipeline software like [Nextflow](https://www.nextflow.io/) -- Pay for "Research Project Space" - for details see [Research Project Space](../03_storage/05_research_project_space.mdx) +- Pay for "Research Project Space" - for details see [Research Project Space](../03_storage/04_research_project_space.mdx) ::: diff --git a/docs/hpc/06_tools_and_software/05_r_packages_with_renv.mdx b/docs/hpc/06_tools_and_software/05_r_packages_with_renv.mdx index 97d2582be4..87e5917519 100644 --- a/docs/hpc/06_tools_and_software/05_r_packages_with_renv.mdx +++ b/docs/hpc/06_tools_and_software/05_r_packages_with_renv.mdx @@ -36,7 +36,7 @@ Thus you can consider the following options: - Reinstall your packages if some of the files get deleted - You can do this manually - You can do this automatically. For example, within a workflow of a pipeline software like [Nextflow](https://www.nextflow.io/) -- Pay for "Research Project Space" - for details see [Research Project Space](../03_storage/05_research_project_space.mdx) +- Pay for "Research Project Space" - for details see [Research Project Space](../03_storage/04_research_project_space.mdx) - Use Singularity and install packages within a corresponding overlay file - Details available at [Squash File System and Singularity](../07_containers/04_squash_file_system_and_singularity.md) ::: diff --git a/docs/hpc/06_tools_and_software/06_conda_environments.mdx b/docs/hpc/06_tools_and_software/06_conda_environments.mdx index 589e56d60f..00290589f6 100644 --- a/docs/hpc/06_tools_and_software/06_conda_environments.mdx +++ b/docs/hpc/06_tools_and_software/06_conda_environments.mdx @@ -41,7 +41,7 @@ Thus you can consider the following options: - Reinstall your packages if some of the files get deleted - You can do this manually - You can do this automatically. For example, within a workflow of a pipeline software like [Nextflow](https://www.nextflow.io/) -- Pay for "Research Project Space" - for details see [Research Project Space](../03_storage/05_research_project_space.mdx) +- Pay for "Research Project Space" - for details see [Research Project Space](../03_storage/04_research_project_space.mdx) - Use Singularity and install packages within a corresponding overlay file - Details available at [Squash File System and Singularity](../07_containers/04_squash_file_system_and_singularity.md) ::: diff --git a/docs/hpc/13_tutorial_intro_hpc/10_using_resources_responsibly.mdx b/docs/hpc/13_tutorial_intro_hpc/10_using_resources_responsibly.mdx index 278f077e41..4355fa78a1 100644 --- a/docs/hpc/13_tutorial_intro_hpc/10_using_resources_responsibly.mdx +++ b/docs/hpc/13_tutorial_intro_hpc/10_using_resources_responsibly.mdx @@ -20,7 +20,7 @@ The widespread usage of scheduling systems where users submit jobs on HPC resour ## Be Kind to the Login Nodes The login node is often busy managing all of the logged in users, creating and editing files and compiling software. If the machine runs out of memory or processing capacity, it will become very slow and unusable for everyone. While the machine is meant to be used, be sure to do so responsibly – in ways that will not adversely impact other users’ experience. -Login nodes are always the right place to launch jobs, but data transfers should be done on the Torch Data Transfer Nodes (gDTNs). Please see more about gDTNs at [Data Transfers](../03_storage/03_data_transfers.md). Similarly, computationally intensive tasks should all be done on compute nodes. 
This refers to not just computational analysis/research tasks, but also to processor intensive software installations and similar tasks. +Login nodes are always the right place to launch jobs, but data transfers should be done on the Torch Data Transfer Nodes (gDTNs). Please see more about gDTNs at [Data Transfers](../03_storage/02_data_transfers.md). Similarly, computationally intensive tasks should all be done on compute nodes. This refers to not just computational analysis/research tasks, but also to processor intensive software installations and similar tasks. :::warning[Login Nodes Are a Shared Resource] Remember, the login node is shared with all other users and your actions could cause issues for other people. Think carefully about the potential implications of issuing commands that may use large amounts of resource. @@ -77,7 +77,7 @@ Make sure you understand what the backup policy is on the system you are using a ::: ## Transferring Data -The most important point about transferring data responsibly on Green is to be sure to use Torch Data Transfer Nodes (gDTNs) or other options like [Globus](../03_storage/04_globus.md). Please see [Data Transfers](../03_storage/03_data_transfers.md) for details. By doing this you'll help to keep the login nodes responsive for all users. +The most important point about transferring data responsibly on Green is to be sure to use Torch Data Transfer Nodes (gDTNs) or other options like [Globus](../03_storage/03_globus.md). Please see [Data Transfers](../03_storage/02_data_transfers.md) for details. By doing this you'll help to keep the login nodes responsive for all users. Being efficient in *how* you transfer data on the gDTNs is also important. It will not only reduce the load on the gDTNs, but also save your time. Be sure to archive and compress you files if possible with `tar` and `gzip`. This will remove the overhead of trying to transfer many files and shrink the size of transfer. Please see [Transferring Files with Remote Computers](./07_transferring_files_remote.mdx) for details. From e432d4dc088cbd4c96e4fe1c3c3bf826e89db9ab Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 15:55:00 -0500 Subject: [PATCH 03/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 67c140da54..39593ddae0 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -17,7 +17,7 @@ Because the HPC system is not approved for High Risk data, we recommend using an ## Data Storage options in the HPC Environment -Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various Scenarios/Use cases that are presented below, can help select the right file-systems for a research project. +Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help select the right file-systems for a research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. 
To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, please see: [Understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command)

 ### User Home Directories
 Every individual user has a home directory (under **`/home/$USER`**, environment variable **`$HOME`**) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and a limit of **30,000** inodes (files) per user. Users can check their quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. User home directories are backed up daily and old files under **`$HOME`** are not purged. The user home directories are available on every cluster node (login nodes, compute nodes) as well as the Data Transfer Node (gDTN).
@@ -38,15 +38,13 @@ User Home Directories are not ideal for sharing files and folders with other use
 The HPC scratch is an all-flash (VAST) file-system where most users store research data needed during the analysis phase of their research projects. The scratch file-system provides ***temporary*** storage for datasets needed for running jobs. Every user has a dedicated scratch directory (**/scratch/$USER**) with a **5 TB** disk quota and a limit of **5,000,000** inodes (files) per user. The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as the Data Transfer Node (gDTN).

 :::warning[Scratch Purging Policy]
-- Files on the /scratch file-system that have not been accessed for 60 or more days will be purged.
+- Files on the /scratch file-system that have not been accessed for 60 or more days will be purged.
 - There are no backups of the scratch file-system. Files that were deleted accidentally or removed due to storage system failures CANNOT be recovered.
+- It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off.
 :::

 :::tip
-
-- Since there are ***no backups of HPC Scratch file-system***, users should not put important source code, scripts, libraries, executables in `/scratch`. These important files should be stored in file-systems that are backed up, such as `/home` or [Research Project Space (RPS)](./04_research_project_space.mdx). Code can also be stored in a `git` repository.
-- ***Old file purging policy on HPC Scratch:*** All files on the HPC Scratch file-system that have not been accessed ***for more than 60 days*** will be removed. It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off.
-- To find out the user's current disk space and inode quota utilization and the distribution of files under your scratch directory, please see: [Understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command)
+- There are no backups of the HPC Scratch file-system, so you should not put important source code, scripts, libraries, or executables in `/scratch`. These files should instead be stored in file-systems that are backed up, such as `/home` or [Research Project Space (RPS)](./04_research_project_space.mdx). Code can also be stored in a `git` repository.
- Once a research project completes, users should archive their important files in the [HPC Archive file-system](./01_intro_and_data_management.mdx#hpc-archive). ::: From e1b860c85a07963a4646f36d62cbef09137bcc69 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 15:57:25 -0500 Subject: [PATCH 04/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 39593ddae0..6c0434c611 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -40,7 +40,7 @@ The HPC scratch an all flash (VAST) file-system where most of the users store re :::warning[Scratch Purging Policy] - Files on the /scratch file-system that have not been accessed for 60 or more days will be purged. 0 There are no backups of the scratch file-system. Files that were deleted accidentally or removed due to storage system failures CANNOT be recovered. -- It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off. +- It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off. ::: :::tip From 4d80eaba8ba2ea792d098f027002240ec0cf2c05 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:00:57 -0500 Subject: [PATCH 05/26] link to library guide --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 6c0434c611..2cedc326db 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -1,6 +1,6 @@ # HPC Storage -The NYU HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the research data life cycle (data capture, analysis, archiving, etc.). Each HPC file-system comes with different features, policies, and availability. +The NYU HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. Numerous data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./03_globus.md) as the primary data transfer tool. 
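
As a minimal illustration of the archive-and-compress advice in the Transferring Data section above (a sketch only: the directory name `my_run_dir` and the file name `results.tar.gz` are placeholders, and the transfer itself would still go through a gDTN or Globus as described in these docs):

```sh
# bundle and compress a results directory into a single archive before transferring it
tar czvf $SCRATCH/results.tar.gz -C $SCRATCH my_run_dir

# list the first entries of the archive to verify it was written correctly
tar tzvf $SCRATCH/results.tar.gz | head
```

Moving one compressed archive avoids the per-file overhead of transferring many small files and usually shrinks the total transfer size.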
From 9b63ba5670d8225d27db9d62606b18ddc4474a83 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:04:54 -0500 Subject: [PATCH 06/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 2cedc326db..6aaabcd097 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -49,16 +49,15 @@ The HPC scratch an all flash (VAST) file-system where most of the users store re ::: ### HPC Research Project Space -The HPC Research Project Space (RPS) provides data storage space for research projects that is easily shared amongst collaborators, ***backed up***, and ***not subject to the old file purging policy***. HPC RPS was introduced to ease data management in the HPC environment and eliminate the need of having to frequently copying files between Scratch and Archive file-systems by having all projects files under one area. ***These benefits of the HPC RPS come at a cost***. The cost is determined by the allocated disk space and the number of files (inodes). -- For detailed information about RPS see: [HPC Research Project Space](./04_research_project_space.mdx) +The HPC Research Project Space (RPS) provides data storage space for research projects that is easily shared amongst collaborators, ***backed up***, and ***not subject to the old file purging policy***. HPC RPS was introduced to ease data management in the HPC environment and eliminate the need of having to frequently copying files between Scratch and Archive file-systems by having all projects files under one area. ***These benefits of the HPC RPS come at a cost***. The cost is determined by the allocated disk space and the number of files (inodes). For detailed information about RPS see: [HPC Research Project Space](./04_research_project_space.mdx) ### HPC Work The HPC team makes available a number of public datasets that are commonly used in analysis jobs. The data-sets are available Read-Only under `/scratch/work/public`. For some of the datasets users must provide a signed usage agreement before accessing. Public datasets available on the HPC clusters can be viewed on the [Datasets page](../04_datasets/01_intro.md). ### HPC Archive -Once the Analysis stage of the research data lifecycle has completed, HPC users should **tar** their data and code into a single tar.gz file and then copy the file to their archive directory (`/archive/$USER`). The HPC Archive file-system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. The rather low limit on the number of inodes per user is intentional. The archive file-system is available only ***on login nodes*** of Torch. The archive file-system is backed up daily. +Once the Analysis stage of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318) has completed, you should compress your data before moving it onto the archive (`/archive/$USER`). For instance you can use the `tar` command to compress all your data into a single `tar.gz` file. The HPC Archive file-system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. 
The rather low limit on the number of inodes per user is intentional. The archive file-system is available only ***on login nodes*** of Torch. The archive file-system is backed up daily.
-- Here is an example ***tar*** command that combines the data in a directory named `my_run_dir` under `$SCRATCH` and outputs the tar file in the user's `$ARCHIVE`:
+- Here is an example `tar` command that combines the data in a directory named `my_run_dir` under `$SCRATCH` and outputs the tar file in the user's `$ARCHIVE`:
 ```sh
 # to archive `$SCRATCH/my_run_dir`
 tar cvf $ARCHIVE/simulation_01.tar -C $SCRATCH my_run_dir
 ```

From 03ac7263c309ce31f12df51db0ba7f010b633a1d Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Tue, 9 Dec 2025 16:07:32 -0500
Subject: [PATCH 07/26] minor cleanup

---
 docs/hpc/03_storage/04_research_project_space.mdx | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/hpc/03_storage/04_research_project_space.mdx b/docs/hpc/03_storage/04_research_project_space.mdx
index 9f7b958955..2925715d36 100644
--- a/docs/hpc/03_storage/04_research_project_space.mdx
+++ b/docs/hpc/03_storage/04_research_project_space.mdx
@@ -3,8 +3,7 @@
 ## Description
 Research Project Space (RPS) volumes provide working space for sharing data and code amongst project or lab members. RPS directories are built on the same parallel file system (VAST) like HPC Scratch. They are mounted on the cluster Compute Nodes, and thus they can be accessed by running jobs. RPS directories are backed up and there is no old file purging policy. These features of RPS simplify the management of data in the HPC environment as users of the HPC Cluster can store their data and code on RPS directories and they do not need to move data between the HPC Scratch and the HPC Archive file systems.

-:::note
-- Due to limitations of the underlying parallel file system, ***the total number of RPS volumes that can be created is limited***.
+:::info
 - There is an annual cost associated with RPS.
 - The disk space and inode usage in RPS directories do not count towards quota limits in other HPC file systems (Home, Scratch, and Archive).
 :::

From 681127aa08fe460d5d72da58b0f4ed46dd843d25 Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Tue, 9 Dec 2025 16:08:35 -0500
Subject: [PATCH 08/26] minor cleanup

---
 .../07_transferring_cloud_storage_data_with_rclone.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/hpc/03_storage/07_transferring_cloud_storage_data_with_rclone.md b/docs/hpc/03_storage/07_transferring_cloud_storage_data_with_rclone.md
index e306b8177b..1040b38b59 100644
--- a/docs/hpc/03_storage/07_transferring_cloud_storage_data_with_rclone.md
+++ b/docs/hpc/03_storage/07_transferring_cloud_storage_data_with_rclone.md
@@ -1,5 +1,9 @@
 # Transferring Cloud Storage Data with rclone

+:::tip Globus
+Globus is the recommended tool for large-volume data transfers due to its efficiency, reliability, security, and ease of use. Use other tools only if you really need to. Detailed instructions are available at [Globus](./03_globus.md).
+:::
+
 ## Transferring files to and from Google Drive with RCLONE
 Having access to Google Drive from the HPC environment provides an option to archive data and even share data with collaborators who have no access to the NYU HPC environment. Other options to archiving data include the HPC Archive file system and using [Globus](./03_globus.md) to share data with collaborators.
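
For a concrete starting point, here is a minimal `rclone` sketch; the remote name `nyudrive` is a placeholder that you create yourself during the interactive `rclone config` step, and the destination folder is only an example:

```sh
# one-time interactive setup of a Google Drive remote (placeholder name: nyudrive)
rclone config

# list the top-level folders visible through the remote
rclone lsd nyudrive:

# copy a compressed results archive from scratch to a Drive folder, showing progress
rclone copy --progress $SCRATCH/results.tar.gz nyudrive:hpc-archive/
```

As with the `tar` example earlier, copying a single compressed archive is far more efficient than moving many small files, given the transfer-rate limits of Google Drive.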
From d4c5e7b47378a2651feabdd7d52125af279e6321 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:13:16 -0500 Subject: [PATCH 09/26] minor cleanup --- .../01_intro_and_data_management.mdx | 22 +++++++++---------- 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 6aaabcd097..709c564db4 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -1,10 +1,10 @@ # HPC Storage -The NYU HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. +The HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. Numerous data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./03_globus.md) as the primary data transfer tool. -Selected public datasets are available to all HPC users, such as a subset of The Cancer Genome Atlas (`TCGA`), the Million Song Database, `ImageNet`, and Reference Genomes. More information can be found in the [datasets section](../04_datasets/01_intro.md). +Selected public datasets are available to all HPC users, such as a subset of The Cancer Genome Atlas (`TCGA`), the Million Song Database, `ImageNet`, and Reference Genomes. More information can be found in the [datasets section](../04_datasets/01_intro.md). :::warning[HPC is only approved for Moderate Risk Data] - The HPC Environment has been approved for storing and analyzing **Moderate Risk research data**, as defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). @@ -15,11 +15,9 @@ Selected public datasets are available to all HPC users, such as a subset of The Because the HPC system is not approved for High Risk data, we recommend using an approved system like the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md). ::: -## Data Storage options in the HPC Environment +Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help you in selecting the right file-systems for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) -Below is a list of file-systems with their characteristics and a summary table. 
Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help select the right file-systems for a research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, please see: [Understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) - -### User Home Directories +## User Home Directories Every individual user has a home directory (under **`/home/$USER`**, environment variable **`$HOME`**) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) **30,000** per user. Users can check their quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. User home directories are backed up daily and old files under **`$HOME`** are not purged. The user home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). :::warning @@ -34,7 +32,7 @@ User Home Directories are not ideal for sharing files and folders with other use - Working with `conda` environments: To avoid running out of inode limits in home directories, the HPC team recommends **setting up `conda` environments with Singularity overlay images** as [described here](../07_containers/03_singularity_with_conda.md). Avoid creating `conda` environments in your `$HOME` directory. ::: -### HPC Scratch +## HPC Scratch The HPC scratch an all flash (VAST) file-system where most of the users store research data needed during the analysis phase of their research projects. The scratch file-system provides ***temporary*** storage for datasets needed for running jobs. Every user has a dedicated scratch directory (**/scratch/$USER**) with **5 TB** disk quota and **5,000,000 inodes** (files) limit per user. The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). :::warning[Scratch Purging Policy] @@ -48,13 +46,13 @@ The HPC scratch an all flash (VAST) file-system where most of the users store re - Once a research project completes, users should archive their important files in the [HPC Archive file-system](./01_intro_and_data_management.mdx#hpc-archive). ::: -### HPC Research Project Space +## HPC Research Project Space The HPC Research Project Space (RPS) provides data storage space for research projects that is easily shared amongst collaborators, ***backed up***, and ***not subject to the old file purging policy***. HPC RPS was introduced to ease data management in the HPC environment and eliminate the need of having to frequently copying files between Scratch and Archive file-systems by having all projects files under one area. ***These benefits of the HPC RPS come at a cost***. The cost is determined by the allocated disk space and the number of files (inodes). For detailed information about RPS see: [HPC Research Project Space](./04_research_project_space.mdx) -### HPC Work +## HPC Work The HPC team makes available a number of public datasets that are commonly used in analysis jobs. The data-sets are available Read-Only under `/scratch/work/public`. For some of the datasets users must provide a signed usage agreement before accessing. 
Public datasets available on the HPC clusters can be viewed on the [Datasets page](../04_datasets/01_intro.md). -### HPC Archive +## HPC Archive Once the Analysis stage of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318) has completed, you should compress your data before moving it onto the archive (`/archive/$USER`). For instance you can use the `tar` command to compress all your data into a single `tar.gz` file. The HPC Archive file-system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. The rather low limit on the number of inodes per user is intentional. The archive file-system is available only ***on login nodes*** of Torch. The archive file-system is backed up daily. - Here is an example `tar` command that combines the data in a directory named `my_run_dir` under `$SCRATCH` and outputs the tar file in the user's `$ARCHIVE`: @@ -63,14 +61,14 @@ Once the Analysis stage of the [research data life cycle](https://guides.nyu.edu tar cvf $ARCHIVE/simulation_01.tar -C $SCRATCH my_run_dir ``` -### NYU (Google) Drive +## NYU (Google) Drive Google Drive ([NYU Drive](https://www.nyu.edu/life/information-technology/communication-and-collaboration/document-collaboration-and-sharing/nyu-drive.html)) is accessible from the NYU HPC environment and provides an option to users who wish to archive data or share data with external collaborators who do not have access to the NYU HPC environment. As of December 2023, storage limits were applied to all faculty, staff, and student NYU Google accounts. Please see [Google Workspace Storage](https://www.nyu.edu/life/information-technology/about-nyu-it/key-projects-and-initiatives/google-workspace-storage.html) for details. There are also limits to the data transfer rate in moving to/from Google Drive. Thus, moving many small files to Google Drive is not going to be efficient. Please read the [Instructions on how to use cloud storage within the NYU HPC Environment](./07_transferring_cloud_storage_data_with_rclone.md). -### HPC Storage Comparison Table +## HPC Storage Comparison Table | Space | Environment Variable | Purpose | Backed Up / Flushed | Quota Disk Space / # of Files | |-----------------------------|----------------------|-------------------------------------------------------|-------------------------------------|------------------------------------| From ae55015a5ec5717e383b982510ed5bc91bc5dc93 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:15:18 -0500 Subject: [PATCH 10/26] add the rps caveat back in --- docs/hpc/03_storage/04_research_project_space.mdx | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/hpc/03_storage/04_research_project_space.mdx b/docs/hpc/03_storage/04_research_project_space.mdx index 2925715d36..38a4cd6977 100644 --- a/docs/hpc/03_storage/04_research_project_space.mdx +++ b/docs/hpc/03_storage/04_research_project_space.mdx @@ -4,6 +4,7 @@ Research Project Space (RPS) volumes provide working space for sharing data and code amongst project or lab members. RPS directories are built on the same parallel file system (VAST) like HPC Scratch. They are mounted on the cluster Compute Nodes, and thus they can be accessed by running jobs. RPS directories are backed up and there is no old file purging policy. 
These features of RPS simplify the management of data in the HPC environment as users of the HPC Cluster can store their data and code on RPS directories and they do not need to move data between the HPC Scratch and the HPC Archive file systems. :::info +- Due to limitations of the underlying parallel file system, the total number of RPS volumes that can be created is limited. - There is an annual cost associated with RPS. - The disk space and inode usage in RPS directories do not count towards quota limits in other HPC file systems (Home, Scratch, and Archive). ::: From da538b1463c53d2b3a35818c77c737f7f0c99f0f Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:17:45 -0500 Subject: [PATCH 11/26] minor cleanup --- .../03_storage/01_intro_and_data_management.mdx | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 709c564db4..eb604b34fd 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -1,18 +1,11 @@ # HPC Storage -The HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. +The HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. Numerous data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./03_globus.md) as the primary data transfer tool. -Numerous data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./03_globus.md) as the primary data transfer tool. - -Selected public datasets are available to all HPC users, such as a subset of The Cancer Genome Atlas (`TCGA`), the Million Song Database, `ImageNet`, and Reference Genomes. More information can be found in the [datasets section](../04_datasets/01_intro.md). - -:::warning[HPC is only approved for Moderate Risk Data] -- The HPC Environment has been approved for storing and analyzing **Moderate Risk research data**, as defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). -- **High Risk** research data, such as those that include Personal Identifiable Information (**PII**) or electronic Protected Health Information (**ePHI**) or Controlled Unclassified Information (**CUI**) **should NOT be stored in the HPC Environment**. -- Only the Office of Sponsored Projects (OSP) and Global Office of Information Security (GOIS) are empowered to classify the risk categories of data. 
-::: -:::info[SRDE for High Risk Data] -Because the HPC system is not approved for High Risk data, we recommend using an approved system like the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md). +:::warning[Only approved for Moderate Risk Data] +- The HPC Environment has been approved for storing and analyzing **Moderate Risk** research data, as defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). +- **High Risk** research data, such as those that include Personal Identifiable Information (**PII**) or electronic Protected Health Information (**ePHI**) or Controlled Unclassified Information (**CUI**) **should NOT be stored in the HPC Environment**. For this, we recommend using the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md) instead. +- The Office of Sponsored Projects (OSP) and Global Office of Information Security (GOIS) are exclusively empowered to classify the risk categories of data. ::: Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help you in selecting the right file-systems for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) From 2bd0278964e1d0446d331b668f2b5d34ea4bdf92 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:21:33 -0500 Subject: [PATCH 12/26] formatting fix --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index eb604b34fd..ddfa43b43f 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -11,7 +11,7 @@ The HPC Environment provides access to a number of file-systems to better serve Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help you in selecting the right file-systems for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) ## User Home Directories -Every individual user has a home directory (under **`/home/$USER`**, environment variable **`$HOME`**) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) **30,000** per user. Users can check their quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. 
User home directories are backed up daily and old files under **`$HOME`** are not purged. The user home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). +Every individual user has a home directory (under `/home/$USER`, environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) **30,000** per user. Users can check their quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. User home directories are backed up daily and old files under `$HOME` are not purged. The user home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). :::warning Avoid changing file and directory permissions in your home directory to allow other users to access files. @@ -26,7 +26,7 @@ User Home Directories are not ideal for sharing files and folders with other use ::: ## HPC Scratch -The HPC scratch an all flash (VAST) file-system where most of the users store research data needed during the analysis phase of their research projects. The scratch file-system provides ***temporary*** storage for datasets needed for running jobs. Every user has a dedicated scratch directory (**/scratch/$USER**) with **5 TB** disk quota and **5,000,000 inodes** (files) limit per user. The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). +The HPC scratch an all flash (VAST) file-system where most of the users store research data needed during the analysis phase of their research projects. The scratch file-system provides ***temporary*** storage for datasets needed for running jobs. Every user has a dedicated scratch directory (`/scratch/$USER`) with **5 TB** disk quota and **5,000,000 inodes** (files) limit per user. The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). :::warning[Scratch Purging Policy] - Files on the /scratch file-system that have not been accessed for 60 or more days will be purged. From 0e12a4615093d7f205ac33adeb86829f2e58cda3 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:26:28 -0500 Subject: [PATCH 13/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index ddfa43b43f..b22788c4cd 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -26,17 +26,17 @@ User Home Directories are not ideal for sharing files and folders with other use ::: ## HPC Scratch -The HPC scratch an all flash (VAST) file-system where most of the users store research data needed during the analysis phase of their research projects. The scratch file-system provides ***temporary*** storage for datasets needed for running jobs. Every user has a dedicated scratch directory (`/scratch/$USER`) with **5 TB** disk quota and **5,000,000 inodes** (files) limit per user. The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). 
+The HPC scratch is an all flash (VAST) file-system you can store research data needed during the analysis phase of their research projects. It provides ***temporary*** storage for datasets needed for running job. Your scratch directory (`/scratch/$USER`) has a limit of **5 TB** disk quota and **5,000,000 inodes** (files). The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). :::warning[Scratch Purging Policy] - Files on the /scratch file-system that have not been accessed for 60 or more days will be purged. -0 There are no backups of the scratch file-system. Files that were deleted accidentally or removed due to storage system failures CANNOT be recovered. +0 There are no backups of the scratch file-system. Files that were deleted accidentally or removed due to storage system failures cannot be recovered. - It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off. ::: :::tip - There are no backups of HPC Scratch file-system and you should not put important source code, scripts, libraries, executables in `/scratch`. These files should instead be stored in file-systems that are backed up, such as `/home` or [Research Project Space (RPS)](./04_research_project_space.mdx). Code can also be stored in a `git` repository. -- Once a research project completes, users should archive their important files in the [HPC Archive file-system](./01_intro_and_data_management.mdx#hpc-archive). +- Upon the completion of your research study, you are encouraged to archive your data in the [HPC Archive file-system](./01_intro_and_data_management.mdx#hpc-archive). ::: ## HPC Research Project Space From 70f16d333b3b0b00c28c978c300ab0f1fa4a88be Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:28:02 -0500 Subject: [PATCH 14/26] anything that is not high risk is okay on hpc, no need to state the same thing two ways --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index b22788c4cd..31ba9a3b27 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -3,9 +3,8 @@ The HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. Numerous data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./03_globus.md) as the primary data transfer tool. :::warning[Only approved for Moderate Risk Data] -- The HPC Environment has been approved for storing and analyzing **Moderate Risk** research data, as defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). 
- **High Risk** research data, such as those that include Personal Identifiable Information (**PII**) or electronic Protected Health Information (**ePHI**) or Controlled Unclassified Information (**CUI**) **should NOT be stored in the HPC Environment**. For this, we recommend using the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md) instead. -- The Office of Sponsored Projects (OSP) and Global Office of Information Security (GOIS) are exclusively empowered to classify the risk categories of data. +- The Office of Sponsored Projects (OSP) and Global Office of Information Security (GOIS) are exclusively empowered to classify the risk categories. These are defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). ::: Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help you in selecting the right file-systems for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) From dde1267d0cae33461cc36a5b96054b7505003d58 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:31:33 -0500 Subject: [PATCH 15/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 31ba9a3b27..258fec89b8 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -47,7 +47,7 @@ The HPC team makes available a number of public datasets that are commonly used ## HPC Archive Once the Analysis stage of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318) has completed, you should compress your data before moving it onto the archive (`/archive/$USER`). For instance you can use the `tar` command to compress all your data into a single `tar.gz` file. The HPC Archive file-system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. The rather low limit on the number of inodes per user is intentional. The archive file-system is available only ***on login nodes*** of Torch. The archive file-system is backed up daily. 
-- Here is an example `tar` command that combines the data in a directory named `my_run_dir` under `$SCRATCH` and outputs the tar file in the user's `$ARCHIVE`: +Here is an example `tar` command that combines the data in a directory named `my_run_dir` under `$SCRATCH` and outputs the tar file in the user's `$ARCHIVE`: ```sh # to archive `$SCRATCH/my_run_dir` tar cvf $ARCHIVE/simulation_01.tar -C $SCRATCH my_run_dir From e00228d870335ad1a40fc260cede3cdef5ed7207 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:34:07 -0500 Subject: [PATCH 16/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 258fec89b8..e33e899c5f 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -10,7 +10,7 @@ The HPC Environment provides access to a number of file-systems to better serve Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help you in selecting the right file-systems for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) ## User Home Directories -Every individual user has a home directory (under `/home/$USER`, environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) **30,000** per user. Users can check their quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. User home directories are backed up daily and old files under `$HOME` are not purged. The user home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). +You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). :::warning Avoid changing file and directory permissions in your home directory to allow other users to access files. 
From 9aecd908aba9b2ab797dee28fde782af2ae2db80 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:48:44 -0500 Subject: [PATCH 17/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index e33e899c5f..32c857355c 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -1,13 +1,13 @@ # HPC Storage -The HPC Environment provides access to a number of file-systems to better serve the needs of researchers managing data during the various stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. Numerous data management tools are available that enable data transfers and data sharing, recommended best practices, and various scenarios and use cases of managing data in the HPC Environment. We strongly recommend [`globus`](./03_globus.md) as the primary data transfer tool. +The HPC environment provides access to various file-systems to better serve your needs for managing data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. Numerous data management tools are available for data sharing and data transfers (though we strongly recommend using [`globus`](./03_globus.md)). :::warning[Only approved for Moderate Risk Data] - **High Risk** research data, such as those that include Personal Identifiable Information (**PII**) or electronic Protected Health Information (**ePHI**) or Controlled Unclassified Information (**CUI**) **should NOT be stored in the HPC Environment**. For this, we recommend using the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md) instead. -- The Office of Sponsored Projects (OSP) and Global Office of Information Security (GOIS) are exclusively empowered to classify the risk categories. These are defined in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). +- The Office of Sponsored Projects (OSP) & Global Office of Information Security (GOIS) are exclusively empowered to classify the risk categories for a dataset as listed in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). ::: -Below is a list of file-systems with their characteristics and a summary table. Reviewing the list of available file-systems and the various scenarios/use cases that are presented below, can help you in selecting the right file-systems for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space, inode quota utilization and the distribution of files under your scratch directory, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) +All available file-systems are listed below followed by a table comparing their differences. 
Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space & inode quota utilization alongside the distribution of files within your directories, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) ## User Home Directories You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). From ac4db7e45a2102002bbab0b1f1f33a59ac406a35 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 16:59:46 -0500 Subject: [PATCH 18/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 32c857355c..b1456fe723 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -1,13 +1,11 @@ # HPC Storage -The HPC environment provides access to various file-systems to better serve your needs for managing data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Each HPC file-system comes with different features, policies, and availability. Numerous data management tools are available for data sharing and data transfers (though we strongly recommend using [`globus`](./03_globus.md)). - :::warning[Only approved for Moderate Risk Data] -- **High Risk** research data, such as those that include Personal Identifiable Information (**PII**) or electronic Protected Health Information (**ePHI**) or Controlled Unclassified Information (**CUI**) **should NOT be stored in the HPC Environment**. For this, we recommend using the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md) instead. +- High Risk data, such as those that include Personal Identifiable Information (PII) or electronic Protected Health Information (ePHI) or Controlled Unclassified Information (CUI) **should NOT be stored in the HPC Environment**. We recommend using the [Secure Research Data Environments (SRDE)](../../srde/01_getting_started/01_intro.md) instead for this. - The Office of Sponsored Projects (OSP) & Global Office of Information Security (GOIS) are exclusively empowered to classify the risk categories for a dataset as listed in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). ::: -All available file-systems are listed below followed by a table comparing their differences. 
Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your research project. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space & inode quota utilization alongside the distribution of files within your directories, refer to the section on [understanding user quota Limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) +The HPC environment provides access to the file-systems listed below to better serve your needs for managing research data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your tasks. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space & inode quota utilization refer to the section on [understanding user quota limits.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) ## User Home Directories You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). From 45465415ad82f6d679964ac76e2732f7c7b3ea31 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 17:07:21 -0500 Subject: [PATCH 19/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index b1456fe723..9976964fbb 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -43,7 +43,7 @@ The HPC Research Project Space (RPS) provides data storage space for research pr The HPC team makes available a number of public datasets that are commonly used in analysis jobs. The data-sets are available Read-Only under `/scratch/work/public`. For some of the datasets users must provide a signed usage agreement before accessing. Public datasets available on the HPC clusters can be viewed on the [Datasets page](../04_datasets/01_intro.md). ## HPC Archive -Once the Analysis stage of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318) has completed, you should compress your data before moving it onto the archive (`/archive/$USER`). For instance you can use the `tar` command to compress all your data into a single `tar.gz` file. The HPC Archive file-system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. The rather low limit on the number of inodes per user is intentional. The archive file-system is available only ***on login nodes*** of Torch. 
The archive file-system is backed up daily. +Once the Analysis stage of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318) has completed, you should compress your data before moving it onto the archive (`/archive/$USER`). For instance you can use the `tar` command to compress all your data into a single `tar.gz` file. The HPC Archive file-system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and is limited to **20,000 inode (files)**. The rather low limit on the number of inodes per user is intentional. The archive file-system is available only ***on login nodes*** of Torch. The archive file-system is backed up daily. Here is an example `tar` command that combines the data in a directory named `my_run_dir` under `$SCRATCH` and outputs the tar file in the user's `$ARCHIVE`: ```sh From bd74be8476326506b01be82e9d56ba59cf00d8ec Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 17:11:16 -0500 Subject: [PATCH 20/26] minor cleanup --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 9976964fbb..1ebbcc5c47 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -23,15 +23,14 @@ User Home Directories are not ideal for sharing files and folders with other use ::: ## HPC Scratch -The HPC scratch is an all flash (VAST) file-system you can store research data needed during the analysis phase of their research projects. It provides ***temporary*** storage for datasets needed for running job. Your scratch directory (`/scratch/$USER`) has a limit of **5 TB** disk quota and **5,000,000 inodes** (files). The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). +The HPC scratch is an all flash (VAST) file-system you can store research data needed during the analysis phase of their research projects. It provides ***temporary*** storage for datasets needed for running job. Your scratch directory (`/scratch/$USER`) has a limit of **5 TB** disk quota and **5,000,000 inodes** (files). The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). There are no backups for this file-system and files that are deleted accidentally or removed due to storage system failures cannot be recovered. :::warning[Scratch Purging Policy] -- Files on the /scratch file-system that have not been accessed for 60 or more days will be purged. -0 There are no backups of the scratch file-system. Files that were deleted accidentally or removed due to storage system failures cannot be recovered. -- It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your HPC account being turned off. +- Files on the `/scratch` file-system that have not been accessed for 60 or more days will be purged. +- It is a policy violation to use scripts to change the file access time. Any user found to be violating this policy will have their HPC account locked. A second violation may result in your access to HPC being revoked. 
::: -:::tip +:::tip[Avoiding data loss from purging] - There are no backups of HPC Scratch file-system and you should not put important source code, scripts, libraries, executables in `/scratch`. These files should instead be stored in file-systems that are backed up, such as `/home` or [Research Project Space (RPS)](./04_research_project_space.mdx). Code can also be stored in a `git` repository. - Upon the completion of your research study, you are encouraged to archive your data in the [HPC Archive file-system](./01_intro_and_data_management.mdx#hpc-archive). ::: From ff08242baf806b8d8ad548ea50b71cf7c2d96254 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 17:17:43 -0500 Subject: [PATCH 21/26] myquota cross reference instead of ancienct external link --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index 1ebbcc5c47..11c9121a73 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -8,7 +8,7 @@ The HPC environment provides access to the file-systems listed below to better serve your needs for managing research data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your tasks. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space & inode quota utilization refer to the section on [understanding user quota limits.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) ## User Home Directories -You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the [myquota](http://www.info-ren.org/projects/ckp/tech/software/version/myquota.html) command. Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). +You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the `myquota`command as [described here](./05_best_practices.md#user-quota-limits-and-the-myquota-command). Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). :::warning Avoid changing file and directory permissions in your home directory to allow other users to access files. 
From 02b2cf5573f0260f2ae5010971792770ec0b5b62 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 17:20:28 -0500 Subject: [PATCH 22/26] myquota output from torch instead of greene --- docs/hpc/03_storage/05_best_practices.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/docs/hpc/03_storage/05_best_practices.md b/docs/hpc/03_storage/05_best_practices.md index 2f95adc8da..30ab86593b 100644 --- a/docs/hpc/03_storage/05_best_practices.md +++ b/docs/hpc/03_storage/05_best_practices.md @@ -11,12 +11,15 @@ Users can check their current utilization of quota using the myquota command. Th In the following example the user who executes the `myquota` command is out of inodes in their home directory. The user inode quota limit on the `/home` file system **30.0K inodes** and the user has **33000 inodes**, thus **110%** of the inode quota limit. ```sh $ myquota -Hostname: log-1 at Sun Mar 21 21:59:08 EDT 2021 -Filesystem Environment Backed up? Allocation Current Usage -Space Variable /Flushed? Space / Files Space(%) / Files(%) -/home $HOME Yes/No 50.0GB/30.0K 8.96GB(17.91%)/33000(110.00%) -/scratch $SCRATCH No/Yes 5.0TB/1.0M 811.09GB(15.84%)/2437(0.24%) -/archive $ARCHIVE Yes/No 2.0TB/20.0K 0.00GB(0.00%)/1(0.00%) +Quota Information for NetID +Hostname: torch-login-2 at 2025-12-09 17:18:24 + +Filesystem Environment Backed up? Allocation Current Usage +Space Variable /Flushed? Space / Files Space(%) / Files(%) + +/home $HOME YES/NO 0.05TB/0.03M 0.0TB(0.0%)/54(0%) +/scratch $SCRATCH NO/YES 5.0TB/5.0M 0.0TB(0.0%)/1(0%) +/archive $ARCHIVE YES/NO 2.0TB/0.02M 0.0TB(0.0%)/1(0%) ``` Users can find out the number of inodes (files) used per subdirectory under their home directory (`$HOME`), by running the following commands: ```sh From a2ee30cfb1996aab31a87a9d9519815fd0c39786 Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 17:29:06 -0500 Subject: [PATCH 23/26] use du --inodes instead of tedious for loop --- docs/hpc/03_storage/05_best_practices.md | 55 +++++++++++++++--------- 1 file changed, 34 insertions(+), 21 deletions(-) diff --git a/docs/hpc/03_storage/05_best_practices.md b/docs/hpc/03_storage/05_best_practices.md index 30ab86593b..b020990b9c 100644 --- a/docs/hpc/03_storage/05_best_practices.md +++ b/docs/hpc/03_storage/05_best_practices.md @@ -2,8 +2,8 @@ ## User Quota Limits and the myquota command All users have quote limits set on HPC fie systems. There are several types of quota limits, such as limits on the amount of disk space (disk quota), number of files (inode quota) etc. The default user quota limits on HPC file systems are listed [on our Data Management page](./01_intro_and_data_management.mdx#hpc-storage-mounts-comparison-table). -:::warning -_One of the common issues users report is running out of inodes in their home directory._ This usually occurs during software installation, for example installing conda environment under their home directory. Running out of quota causes a variety of issues such as running user jobs being interrupted or users being unable to finish the installation of packages under their home directory. +:::warning[Home directory inode quotas] +_One of the common issues users report is running out of inodes in their home directory._ This usually occurs during software installation, for example installing conda environment under their home directory. 
Running out of quota causes a variety of issues such as running user jobs being interrupted or users being unable to finish the installation of packages under their home directory.
 :::
 
 Users can check their current utilization of quota using the myquota command. The myquota command provides a report of the current quota limits on mounted file systems, the user's quota utilization, as well as the percentage of quota utilization.
@@ -21,33 +21,46 @@ Space Variable /Flushed? Space / Files Space(%) / File
 /scratch $SCRATCH NO/YES 5.0TB/5.0M 0.0TB(0.0%)/1(0%)
 /archive $ARCHIVE YES/NO 2.0TB/0.02M 0.0TB(0.0%)/1(0%)
 ```
-Users can find out the number of inodes (files) used per subdirectory under their home directory (`$HOME`), by running the following commands:
+You can use the following command to print the number of files (inodes) within each sub-folder of a given directory:
 ```sh
 $cd $HOME
-$ for d in $(find $(pwd) -maxdepth 1 -mindepth 1 -type d | sort -u); do n_files=$(find $d | wc -l); echo $d $n_files; done
-/home/netid/.cache 1507
-/home/netid/.conda 2
-/home/netid/.config 2
-/home/netid/.ipython 11
-/home/netid/.jupyter 2
-/home/netid/.keras 2
-/home/netid/.local 24185
-/home/netid/.nv 2
-/home/netid/.sacrebleu 46
-/home/netid/.singularity 1
-/home/netid/.ssh 5
-/home/netid/.vscode-server 7216
+$ du --inodes -h --max-depth=1
+6 ./.ssh
+88 ./.config
+2 ./.vnc
+2 ./.aws
+3 ./.lmod.d
+5.3K ./.local
+3 ./.dbus
+408 ./ondemand
+2 ./.virtual_documents
+6 ./.nv
+6.7K ./.pixi
+33 ./workshop_scripts
+5 ./.cupy
+6 ./.gnupg
+1 ./.emacs.d
+194 ./.nextflow
+6 ./.terminfo
+2 ./.conda
+2 ./.singularity
+3 ./.vast-dev
+1 ./custom
+185 ./genai-workshop
+6 ./.atuin
+1 ./.apptainer
+9 ./.subversion
+4 ./packages
+1.4K ./.cache
+15K .
 ```
 
 ## Large number of small files
-In case your dataset or workflow requires to use large number of small files, this can create a bottleneck due to read/write rates.
-
-Please refer to [our page on working with a large number of files](./06_large_number_of_small_files.md) to learn about some of the options we recommend to consider.
+If your dataset or workflow requires using a large number of small files, this can create a bottleneck due to read/write rates. Please refer to [our page on working with a large number of files](./06_large_number_of_small_files.md) to learn about some of the options we recommend considering.
 
 ## Installing Python packages
 :::warning
-Your home directory has a relatively small number of inodes.
-If you create a conda or python environment in you home directory, this can eat up all the inodes.
+Your home directory is limited to a relatively small number of inodes (30,000). Creating conda/python environments in your home directory can easily exhaust your inode quota.
 :::
 Please review the [Package Management section](../06_tools_and_software/01_intro.md#package-management-for-r-python--julia-and-conda-in-general) of the [Torch Software Page](../06_tools_and_software/01_intro.md).
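Building on the `du --inodes` example above, the same command can be combined with `sort` to surface the sub-directories that hold the most files. This is a minimal sketch assuming GNU `du` (which supplies the `--inodes` option, as on typical Linux nodes); `-h` is dropped so that the raw counts sort numerically.

```sh
# Sketch: show the ten entries under $HOME that consume the most inodes,
# largest last. The final line is the total for $HOME itself.
du --inodes --max-depth=1 "$HOME" | sort -n | tail -n 10
```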
From 2129b6e76d2f4f0f07dbcdb6454bad98c977c8a2 Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Tue, 9 Dec 2025 17:30:11 -0500
Subject: [PATCH 24/26] Update docs/hpc/03_storage/01_intro_and_data_management.mdx

Co-authored-by: Robert Young
---
 docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx
index 11c9121a73..1cdf26cd76 100644
--- a/docs/hpc/03_storage/01_intro_and_data_management.mdx
+++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx
@@ -23,7 +23,7 @@ User Home Directories are not ideal for sharing files and folders with other use
 :::
 
 ## HPC Scratch
-The HPC scratch is an all flash (VAST) file-system you can store research data needed during the analysis phase of their research projects. It provides ***temporary*** storage for datasets needed for running job. Your scratch directory (`/scratch/$USER`) has a limit of **5 TB** disk quota and **5,000,000 inodes** (files). The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as Data Transfer Node (gDTN). There are no backups for this file-system and files that are deleted accidentally or removed due to storage system failures cannot be recovered.
+The HPC scratch is an all flash (VAST) file-system where you can store research data needed during the analysis phase of your research projects. It provides ***temporary*** storage for datasets needed for running jobs. Your scratch directory (`/scratch/$USER`) has a limit of **5 TB** disk quota and **5,000,000 inodes** (files). The scratch file-system is available on all nodes (compute, login, etc.) on Torch as well as the Data Transfer Node (gDTN). There are no backups for this file-system and files that are deleted accidentally or removed due to storage system failures cannot be recovered.
 
 :::warning[Scratch Purging Policy]
 - Files on the `/scratch` file-system that have not been accessed for 60 or more days will be purged.
From 6451eb5000df2aaa53b01bbf1508587f068930b1 Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Tue, 9 Dec 2025 17:32:39 -0500
Subject: [PATCH 25/26] per reviewer comment

---
 docs/hpc/03_storage/01_intro_and_data_management.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx
index 1cdf26cd76..a01c8adadb 100644
--- a/docs/hpc/03_storage/01_intro_and_data_management.mdx
+++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx
@@ -8,7 +8,7 @@
 The HPC environment provides access to the file-systems listed below to better serve your needs for managing research data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your tasks. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space & inode quota utilization refer to the section on [understanding user quota limits.](./05_best_practices.md#user-quota-limits-and-the-myquota-command)
 
 ## User Home Directories
-You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files.
Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the `myquota`command as [described here](./05_best_practices.md#user-quota-limits-and-the-myquota-command). Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) as well as and the Data Transfer Node (gDTN). +You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the `myquota`command as [described here](./05_best_practices.md#user-quota-limits-and-the-myquota-command). Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) and the Data Transfer Node (gDTN). :::warning Avoid changing file and directory permissions in your home directory to allow other users to access files. From b27f0a3ee5857c606ee2d5b478df053b1787a3ac Mon Sep 17 00:00:00 2001 From: Sajid Ali Date: Tue, 9 Dec 2025 17:34:46 -0500 Subject: [PATCH 26/26] fix remark lint error --- docs/hpc/03_storage/01_intro_and_data_management.mdx | 6 +++--- .../{05_best_practices.md => 05_best_practices.mdx} | 0 2 files changed, 3 insertions(+), 3 deletions(-) rename docs/hpc/03_storage/{05_best_practices.md => 05_best_practices.mdx} (100%) diff --git a/docs/hpc/03_storage/01_intro_and_data_management.mdx b/docs/hpc/03_storage/01_intro_and_data_management.mdx index a01c8adadb..836f04f426 100644 --- a/docs/hpc/03_storage/01_intro_and_data_management.mdx +++ b/docs/hpc/03_storage/01_intro_and_data_management.mdx @@ -5,10 +5,10 @@ - The Office of Sponsored Projects (OSP) & Global Office of Information Security (GOIS) are exclusively empowered to classify the risk categories for a dataset as listed in the [NYU Electronic Data and System Risk Classification Policy](https://www.nyu.edu/about/policies-guidelines-compliance/policies-and-guidelines/electronic-data-and-system-risk-classification.html). ::: -The HPC environment provides access to the file-systems listed below to better serve your needs for managing research data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your tasks. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space & inode quota utilization refer to the section on [understanding user quota limits.](./05_best_practices.md#user-quota-limits-and-the-myquota-command) +The HPC environment provides access to the file-systems listed below to better serve your needs for managing research data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your tasks. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. 
To find out your current disk space & inode quota utilization refer to the section on [understanding user quota limits.](./05_best_practices.md#user-quota-limits-and-the-myquota-command)
+The HPC environment provides access to the file-systems listed below to better serve your needs for managing research data during all stages of the [research data life cycle](https://guides.nyu.edu/dataservices#s-lg-box-33756318). Reviewing the list of available file-systems and their intended uses can help you in selecting the right file-system for your tasks. Please note that there are strict limits on the size and number of files you are allowed to have on each filesystem. To find out your current disk space & inode quota utilization refer to the section on [understanding user quota limits.](./05_best_practices.mdx#user-quota-limits-and-the-myquota-command)
 
 ## User Home Directories
-You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the `myquota`command as [described here](./05_best_practices.md#user-quota-limits-and-the-myquota-command). Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) and the Data Transfer Node (gDTN).
+You have access to a home directory at `/home/$USER` (accessible via the environment variable `$HOME`) for permanently storing code and important configuration files. Home Directories provide limited storage space (**50 GB**) and inodes (files) capacity **30,000**. You can check your quota utilization using the `myquota` command as [described here](./05_best_practices.mdx#user-quota-limits-and-the-myquota-command). Home directories are backed up daily and old files under `$HOME` are not purged. Home directories are available on every cluster node (login nodes, compute nodes) and the Data Transfer Node (gDTN).
 
 :::warning
 Avoid changing file and directory permissions in your home directory to allow other users to access files.
@@ -18,7 +18,7 @@ User Home Directories are not ideal for sharing files and folders with other use
 
 :::tip[`inode` limits]
 - One of the common issues that users report regarding their home directories is running out of inodes (i.e. the number of files stored under their home exceeds the inode limit), which by default is set to 30,000 files
-- To find out the current space and inode quota utilization and the distribution of files under your home directory, please see: [Understanding user quota limits and the myquota command.](./05_best_practices.md#user-quota-limits-and-the-myquota-command)
+- To find out the current space and inode quota utilization and the distribution of files under your home directory, please see: [Understanding user quota limits and the myquota command.](./05_best_practices.mdx#user-quota-limits-and-the-myquota-command)
 - Working with `conda` environments: To avoid running out of inode limits in home directories, the HPC team recommends **setting up `conda` environments with Singularity overlay images** as [described here](../07_containers/03_singularity_with_conda.md). Avoid creating `conda` environments in your `$HOME` directory.
 :::
 
diff --git a/docs/hpc/03_storage/05_best_practices.md b/docs/hpc/03_storage/05_best_practices.mdx
similarity index 100%
rename from docs/hpc/03_storage/05_best_practices.md
rename to docs/hpc/03_storage/05_best_practices.mdx
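After a rename such as the one in the final patch above, it can be worth double-checking that no page still links to the old filename. The following is a minimal sketch, assuming GNU `grep` run from the repository root; the path and pattern are illustrative and not part of the patch itself.

```sh
# Sketch: search the HPC docs for links that still point at the old
# 05_best_practices.md name. Matching ".md" followed by ')' or '#' avoids
# flagging the renamed ".mdx" links.
grep -rn --include='*.md' --include='*.mdx' '05_best_practices\.md[)#]' docs/hpc/
```

An empty result (exit status 1) suggests every cross-reference was updated along with the rename.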