New Systems Migration

This page provides details specifically about migrating calculations and data off ACENET legacy systems.


Since 2016, Compute Canada has been carrying out one of the largest advanced research computing renewals in Canada’s history. Funding from the Canada Foundation for Innovation (CFI) and provincial partners is being used to replace aging systems with new national systems that will consolidate resources. For more information on the national migration process and the latest update to the technology refresh program, please visit the Compute Canada website.

ACENET's four primary computing clusters - Mahone, Fundy, Glooscap and Placentia - were scheduled for decommissioning on 31 March 2018.

Due to high national demand for computing resources under the 2018-19 Resource Allocation Competition (RAC), Compute Canada and the four regional partners asked CFI to fund maintenance and operations for a number of systems across the country for an additional year. The intent is to bridge resources until the newest Compute Canada cluster, Niagara (a Large Parallel system with 60,000 cores), becomes fully operational.

On 16 March 2018, CFI approved Compute Canada’s request. As a result, the Placentia and Glooscap clusters will remain operational until 31 March 2019 in order to accommodate the RAC award recipients.

Mahone and Fundy were decommissioned on 31 March 2018.

Thank you for your patience during this migration period. We are working hard to limit disruptions and their impact on your research during this transition. If you have questions or concerns about any aspect of the migration process, please contact ACENET support.

Decreased performance and limited support

"Maintenance and operations funding" does not mean the systems are under comprehensive vendor support programs; the cost of that would be prohibitive. Only critical components (interconnects and shared storage) will be maintained, and even that is subject to the availability and cost of parts. Users should ensure that all key data is regularly backed up, either:

  • to physical media in their possession (e.g. downloaded to USB drive), or
  • to another, newer research computing cluster.

Users should also expect a gradual loss of processing capacity as compute nodes fail and are not repaired.
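
One way to check that a backup copy is complete is to compare checksums of the original files against the copy. The following is a minimal sketch, not an ACENET-provided tool: the function names `sha256_of` and `compare_trees` are illustrative, and it assumes both the original directory and the backup are reachable as local paths (for example, a mounted USB drive).

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue  # directories are covered by the files inside them
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(str(rel))
    return problems
```

An empty list from `compare_trees("results", "/media/usb/results")` (both paths hypothetical) would indicate that every file in the original has an identical counterpart in the backup.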

Moving data to new systems

See General Directives for Migration on the Compute Canada Documentation Wiki for step-by-step instructions on how to transfer your data. In summary, the steps are:

  1. Delete any unneeded files.
  2. Archive and compress your files before transferring them, if appropriate.
  3. Choose a system to land on. Cedar and Graham are both suitable replacements for legacy ACENET systems.
  4. Transfer the data using Globus if possible; sftp, scp, or rsync if not.
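
Steps 2 and 4 above can be sketched in Python for cases where Globus is not an option. This is an illustrative outline only: the helper names (`make_archive`, `rsync_command`, `as_shell_line`), the directory name, and the username are hypothetical, and the rsync flags shown (`-a`, `-v`, `--partial`) are just one reasonable choice. The command is built but not executed, so it can be reviewed first.

```python
import shlex
import tarfile
from pathlib import Path


def make_archive(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tar archive (step 2)."""
    source_dir = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory's own name,
        # so the archive unpacks into a single top-level folder.
        tar.add(source_dir, arcname=source_dir.name)
    return Path(archive_path)


def rsync_command(archive_path, username, host="cedar.computecanada.ca"):
    """Build (but do not run) an rsync command for step 4.

    A destination of "user@host:" with nothing after the colon places
    the file in the remote home directory.
    """
    return ["rsync", "-av", "--partial", str(archive_path), f"{username}@{host}:"]


def as_shell_line(command):
    """Render the argument list as a single shell-quoted line for review."""
    return shlex.join(command)
```

For example, `as_shell_line(rsync_command(make_archive("project", "project.tar.gz"), "someuser"))` produces a command line that can be pasted into a terminal or passed as a list to `subprocess.run`.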

As always, our technical staff are ready to help you in any way. Please contact support with any questions or concerns.

Accessing the new systems

SSH access

You can SSH to the new servers using the following login nodes:

cedar.computecanada.ca
graham.computecanada.ca

You will need to use your Compute Canada username and password to log in (the same credentials you use for the CCDB). If you have difficulty logging in or have forgotten your Compute Canada username or password, contact support.

Globus endpoints

You can transfer files to and from the new systems using the Globus endpoints:

computecanada#cedar 
computecanada#graham
acenet#glooscap
acenet#placentia

Instructions on using Globus can be found here.

Training on the new systems

There is a great deal of training available on how to use the new systems, both from ACENET staff and through webinars provided by our partners SHARCNET and WestGrid. You can view upcoming sessions on our training page.

How-to videos are available on ACENET's YouTube channel and Compute Canada's.

ACENET research consultants will be available for online office hours at the following times:

  • Tuesdays, 10h00-11h00 Atlantic time (10h30-11h30 Newfoundland)
  • Wednesdays, 11h00-12h00 Atlantic time (11h30-12h30 Newfoundland)
  • Thursdays, 11h00-12h00 Atlantic time (11h30-12h30 Newfoundland)

To contact a research consultant during online office hours, connect to the ACENET Office Hours Google Hangout. You will be able to ask questions by text or by voice, and you and the research consultant will be able to share screens. (If you want to use voice, video, and screen-sharing features, we recommend using Chrome or Safari browsers; recent versions of Firefox may not support these.)

RAC Allocations

2018-2019

A significant number of 2018-2019 RAC allocations have been assigned to Glooscap and Placentia, in keeping with CFI's request that the systems being maintained clearly serve as national resources and are not treated as regional or local resources.

  • If you have received a 2018-2019 RAC allocation on one of these systems and you do not already hold an ACENET account, email support@ace-net.ca and ask for an account to be created. The PI of a research group must hold an active ACENET account in order for any other members of the group to hold accounts.
  • 2018-2019 RAC allocations have been implemented as priority adjustments via the Grid Engine share tree. A group without a RAC allocation has 100 shares, notionally corresponding to one CPU. A RAC-holding group has a larger share, in direct proportion to the number of CPUs it was awarded on the system (Glooscap or Placentia). Other queue policies (e.g. the short.q, medium.q, and long.q definitions) are unchanged. If these policies prevent you from using your RAC allocation, please contact support@ace-net.ca describing your needs, and we will discuss modifications to the queue policies to enable your research. See also Job Control.
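
The share arithmetic described above can be illustrated with a short calculation. This sketch assumes strict proportionality at 100 shares per awarded CPU, which is one reading of "100 shares, notionally corresponding to one CPU"; the exact values configured on Glooscap and Placentia may differ. The function and group names are hypothetical. It also ignores past usage and other scheduling policies, so the fractions are nominal entitlements rather than predictions of actual throughput.

```python
BASE_SHARES = 100  # shares held by a group without a RAC allocation (from the text)


def shares_for(rac_cpus):
    """Shares for a group, ASSUMING 100 shares per awarded CPU.

    Groups with no RAC award (0 CPUs) keep the base 100 shares.
    """
    return BASE_SHARES * max(rac_cpus, 1)


def share_fractions(groups):
    """Map each group name to its fraction of the total shares."""
    shares = {name: shares_for(cpus) for name, cpus in groups.items()}
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}
```

Under these assumptions, a group awarded 50 CPUs would hold 5,000 shares, 50 times the 100 shares of a group without an allocation.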

No data retention after decommissioning

Following the decommissioning dates (31 March 2018 for Fundy and Mahone, 31 March 2019 for Glooscap and Placentia), user data may be deleted without further notice. ACENET will not retain any long-term or backup copies of user data. Users should take the appropriate steps to comply with any data management requirements of their institution and funding agency.

Continued ACENET Support

While the new national infrastructure is not located in Atlantic Canada, ACENET will continue to be the primary support organization for research groups in Atlantic Canada, providing training, research consulting, and troubleshooting. Reach out to your local team, or contact us.
