Infrastructure renewal completed events
Jump to navigation
Jump to search
This page provides details of completed events which are part of the infrastructure renewal activities.
Start Time | End Time | Status | System | Type | Description |
---|---|---|---|---|---|
Jan 13, 2025 | Feb 14, 2025 | Complete | Béluga, Narval | Temporary Reduction | Performance and stability tests on Rorqual will require the shutdown of all Béluga compute nodes and about half of Narval compute nodes from 8 a.m. on January 13 until 12 p.m. (noon) on January 31, 2025 (EST). Login nodes and data access will remain operational. On Narval, approximately 50% of nodes from each category (CPU, GPU, and large memory) will be shut down. During the shutdown time, Béluga Storage will be mounted to Narval (/lustre01, /lustre02, /lustre03, /lustre04 of Beluga). Béluga and Juno cloud instances are unaffected. Jobs on Béluga scheduled to complete after 8 a.m. on January 13 will remain queued until the cluster resumes.
Jan 30, 2025 UPDATE: Narval's compute capacity is at 100% until February 6, then again at 30% for the last Rorqual tests. Béluga and Narval should be back to 100% capacity on February 14. For updated information, please see https://status.alliancecan.ca. |
Jan 22, 2025 | Jan 22, 2025 (1 day) | Complete | Niagara, Mist | Outage | Niagara and Mist compute nodes will be shut down on January 22, 2025 from 8 AM to 5 PM EST to support ongoing system improvements and the integration with the new system, Trillium.
The login nodes, file systems, and the HPSS system will remain available. The scheduler will hold jobs that are submitted until the maintenance has finished. |
Jan 13, 2025 | Jan 21, 2025 (9 days) | Complete | Cedar (100%) | Outage | The Cedar compute cluster will be shut down in preparation for the infrastructure renewal. Jobs submitted to the cluster will queue and may start running if they can complete before the shutdown. Jobs that cannot run will remain in the queue until the cluster is fully operational on January 21. The Cedar /scratch filesystem will be migrated to new storage. Please move any important data immediately to your /project, /nearline, or /home directory.
Cedar cloud will remain operational during this period. |
Nov 25, 2024 | Nov 26, 2024 (1 day) | Complete | Niagara | Outage | A full power shutdown will take place for main panel upgrades ahead of Trillium cluster setup. All Niagara services, including the cluster and scheduler, will pause during this time. The scheduler will hold jobs that cannot finish before the start of the shutdown. Users are encouraged to submit smaller, short-duration jobs to optimize idle node usage before the maintenance begins. |
Nov 7, 2024 | Nov 8, 2024 (1 day) | Complete | Niagara | Outage | All systems and storage at the SciNet Datacenter (Niagara, Mist, HPSS, Rouge, Teach, JupyterHub, Balam) will be unavailable from 7 a.m. to 5 p.m. ET. This outage is necessary for installing new electrical equipment (UPS) as part of a systems refresh. The scheduler will pause jobs unable to finish before the shutdown. Users can prioritize short jobs to utilize otherwise idle nodes prior to maintenance. |
Nov 7, 2024, 6 a.m. PST | Nov 8, 2024, 6 a.m. PST | Complete | Cedar | Outage | Cedar compute nodes will be unavailable during this period. However, Cedar login nodes, storage, and cloud services will remain operational and unaffected. |