HPC User Guidelines

HPC clusters are shared among multiple users, so your actions can have serious impacts on the HPC system(s) and can also affect other users. All users are expected to use their HPC account, storage and computational resources in a responsible, ethical and professional manner. The following policies and rules are in place to ensure fair use of the computing resources and to prevent unauthorized or malicious use. Any abuse or misuse of the HPC system, or violation of these policies, will result in termination of the user account.

  1. Usage and Accounting: All research usage is monitored and consolidated under accounts for postdocs, students, research staff, and external collaborators. Users may not use their HPC account for any commercial or paid work.
  2. External Collaborators: Cluster use by external collaborators is permitted upon request to IT with suitable justification. All external use is subject to periodic review.
  3. Usage Allocations: Computational usage is measured and tracked in terms of Service Units (SUs). One SU is equivalent to one core-hour of computing (one hour of elapsed time consuming a single compute core and/or a corresponding portion of the memory on a single node). The number of SUs accumulated per hour may depend on the use of special resources and/or the job priority selected when a job is submitted; a brief illustrative sketch of this accounting appears after this list. Depending on need and resource availability, users may not necessarily be granted access to all clusters. Resources on the cluster may include both publicly available nodes and storage accessible to all users, and nodes and storage that are reserved for exclusive or high-priority access by specific users.
  4. Storage and Quotas: During the 2017 calendar year, the HPC has been migrating from the Production (2) cluster to the new cluster, code-named Yamato. Scheduling and storage arrangements will be set out in terms of the storage available per user.
  5. Scheduling Policies:  The HPC cluster uses a job queuing/scheduling system to manage user jobs. In addition, there may be limits on the number of jobs that any one user may run concurrently, as well as on the aggregate number of cores that a user may be using at any one time. Please refer to the user guide for more information on the scheduling system.
  6. User Support and Assistance: The NWU staff is responsible for installing and supporting the HPC hardware, system software, and a number of widely used software tools, libraries, and applications (e.g. Gaussian®, MedeA®, Materials Studio®). In addition, the NWU staff provides user assistance with cluster usage and the installation of less widely used software packages. To facilitate support activities and maintain a high level of service quality, the NWU makes use of a support software system to track requests and problem reports. Users are strongly encouraged to make requests or report problems to NWU staff by email (hpc@nwu.ac.za) to ensure that such requests or reports are properly entered into the support software system.
  7. Additional Considerations: The HPC cluster is a shared facility used by numerous users, so the NWU takes several steps to ensure safe and equitable access for all users. 
  8. Regulated Data: No regulated data may be stored or used on the NWU-HPC cluster because of ethical and data-sensitivity concerns. Such data include, but are not limited to, so-called “3-Lock” data such as electronic Protected Health Information (ePHI), HIPAA-regulated data, or other data subject to governmental regulations or private data use agreements.
  9. Accounts and Authentication: Each user is permitted only a single account, usually corresponding to the user’s staff/student number. Access to a cluster requires connecting from an NWU IP address to a login node using suitable authentication. For access from an off-campus location, a VPN connection is required.
  10. Login Nodes: The login nodes on the cluster are shared by all users and are intended only for lightweight activities such as editing, reviewing program input and output files, file transfers, and job submission and monitoring. Users are not permitted to run programs (including interactive programs such as Matlab®, R, or Mathematica®) on the login nodes, and the system administrators reserve the right to terminate, without prior notice, any user session found to be consuming excessive resources on a login node. For security reasons, the system software on the login nodes may be updated frequently and become inconsistent with the system software installed on the compute nodes, so it is strongly recommended that all compilations and other program-building activities be performed on a compute node.
  11. Maintenance Periods: The NWU strives to operate its cluster on a 7×24 basis, except for regularly scheduled maintenance periods (approximately twice per year for up to 5 days). The NWU will publish a regular maintenance schedule and will notify users well in advance of scheduled maintenance periods. Users are responsible for managing their workloads accordingly. In the rare event of an emergency maintenance period, users may receive little or no notice.
  12. Licensed Software: NWU has licensed a number of commercial software products for use on the cluster, and those are available to all users, subject to limits on the number of simultaneous users. In addition, a number of individual users have licensed commercial software products for use only by them. Users are responsible for ensuring that they abide by all license terms and requirements of the software that they use on the HPC cluster.
  13. Security: Users are responsible for using HPC resources and facilities in an efficient, effective, ethical and lawful manner. Security is a major concern on the HPC cluster; users are therefore requested to apply appropriate security best practices to secure their accounts, including keeping their passwords secret.
  14. Scheduling System: Access to HPC clusters is to be via a secure communication channel (e.g. SSH) to the respective master/login nodes. All computational jobs run on the cluster systems must be submitted via the resource management system. Compute nodes are intended for handling heavy computational work and must be accessed via the resource management system only; direct access to compute nodes is not permitted. This enables resources to be allocated sensibly and used most efficiently. Please refer to the user guide for more information on the scheduling system.
  15. Data Retention: You are responsible for your own data. Only data stored in /home is backed up. Data stored on the HPC will be kept for a maximum period of one year; thereafter, the data will be removed to make space for new users. Should you need to keep your data for a longer period, additional arrangements should be made.
  16. Acknowledgements: Researchers are required to give full acknowledgements to the HPC in all outputs, including journal articles, conference proceedings or presentations, student dissertations/theses, technical reports, etc. that contain work conducted using the HPC resources. The following form of acknowledgement must be included:   "Computations were performed using facilities provided by the North-West University’s High Performance Computing service: http://hpc.nwu.ac.za."
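To make the Service Unit arithmetic referred to in item 3 concrete, the following is a minimal sketch of core-hour charging. The function and the priority/special-resource multiplier are illustrative assumptions only; the actual charging factors are defined by the NWU HPC service.

```python
# Minimal sketch of Service Unit (SU) accounting: 1 SU = 1 core-hour.
# The "multiplier" argument stands in for hypothetical special-resource or
# priority factors; the real factors are set by the NWU HPC service.

def service_units(cores: int, wall_hours: float, multiplier: float = 1.0) -> float:
    """SUs charged for a job: cores x elapsed wall-clock hours x multiplier."""
    return cores * wall_hours * multiplier

# Example: 8 cores for 3 hours at the default rate costs 24 SUs.
print(service_units(cores=8, wall_hours=3.0))                   # 24.0
# The same job with a hypothetical high-priority factor of 2.0 costs 48 SUs.
print(service_units(cores=8, wall_hours=3.0, multiplier=2.0))   # 48.0
```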

Disclaimers: Every effort will be made to provide for uninterrupted completion of all jobs that start to run. However, in rare instances, unforeseen circumstances may cause job failures or require suspending or even killing jobs without warning. Users are strongly urged to make use of checkpointing or other standard programming practices to provide for this possibility. Every effort will be made to ensure that data stored on the cluster are not lost or damaged. However, only home directories will be backed up (daily in most cases), and any data that have not yet been backed up are subject to loss due to hardware failures or other circumstances. It is each user’s responsibility to ensure that important files created in project storage are preserved by copying them to alternative storage facilities intended for that purpose.
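As an illustration of the checkpointing practice recommended in the disclaimer above, here is a minimal, generic sketch in Python. The checkpoint file name and the loop body are hypothetical placeholders; applications that provide their own restart mechanisms should use those instead.

```python
import os
import pickle

CHECKPOINT = "checkpoint.pkl"  # hypothetical file name in the job's working directory

def load_state():
    """Resume from the last checkpoint if one exists; otherwise start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0.0}

def save_state(state):
    """Write the checkpoint atomically so an interrupted job cannot corrupt it."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state()
for step in range(state["step"], 100_000):
    state["total"] += step * 0.001    # placeholder for the real computation
    state["step"] = step + 1
    if step % 10_000 == 0:            # save progress periodically
        save_state(state)
save_state(state)
print("final total:", state["total"])
```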