Trillium

From Alliance Doc
Jump to navigation Jump to search
Other languages:
Availability: Spring 2025
Login node: to be determined
Globus endpoint: to be determined
Data transfer node (rsync, scp, sftp,...): to be determined
Portal: to be determined

Trillium is a large parallel cluster built by Lenovo Canada and hosted by SciNet at the University of Toronto.

Installation and transition

Due to limits on available power and cooling capacity there will be an interim period in which a significant portion of the old Niagara will be shut down in order to provide power for the new system's acceptance testing and transition. We'll update you when we have a better idea of Trillium's installation schedule.

Storage

Parallel storage: 29 petabytes, NVMe SSD based storage from VAST Data.

High-performance network

  • Nvidia “NDR” Infiniband network
    • 400 Gbit/s network bandwidth for CPU nodes
    • 800 Gbit/s network bandwidth for GPU nodes
    • Fully non-blocking, meaning every node can talk to every other node at full bandwidth simultaneously.

Node characteristics

nodes cores available memory CPU GPU
1224 192 768GB DDR5 2 x AMD EPYC 9655 (Zen 5) @ 2.6 GHz, 384MB cache L3
60 96 768GB DDR5 1 x AMD EPYC 9654 (Zen 4) @ 2.4 GHz, 384MB cache L3 4 x NVidia H100 SXM (80 GB memory)