Trillium

Availability: Spring 2025
Login node: to be determined
Globus endpoint: to be determined
Data transfer node (rsync, scp, sftp,...): to be determined
Portal: to be determined

Trillium is a large parallel cluster built by Lenovo Canada and hosted by SciNet at the University of Toronto.

Installation and transition

Because available power and cooling capacity are limited, there will be an interim period during which a significant portion of the old Niagara cluster will be shut down to provide power for the new system's acceptance testing and transition. We will post an update once Trillium's installation schedule is better known.

Storage

Parallel storage: 29 petabytes of NVMe SSD-based storage from VAST Data.

High-performance network

  • NVIDIA “NDR” InfiniBand network
    • 400 Gbit/s network bandwidth for CPU nodes
    • 800 Gbit/s network bandwidth for GPU nodes
    • Fully non-blocking, meaning every node can talk to every other node at full bandwidth simultaneously.
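
The link rates above are quoted in Gbit/s, while data sizes are usually given in bytes. The sketch below converts the nominal rates to GB/s and estimates an idealized transfer time. It assumes 8 bits per byte and ignores protocol overhead, latency, and contention, so real throughput will be lower. The function and variable names are made up for this illustration.

```python
# Nominal per-node link rates from the list above, in Gbit/s.
LINK_RATE_GBIT = {"cpu": 400, "gpu": 800}


def gbit_to_gbyte_per_s(gbit_per_s: float) -> float:
    """Convert a nominal rate in Gbit/s to GB/s (8 bits per byte)."""
    return gbit_per_s / 8


def transfer_seconds(size_gb: float, node_type: str) -> float:
    """Idealized time to move size_gb gigabytes over one node's link.

    Assumption: the full nominal rate is achieved, with no overhead.
    """
    return size_gb / gbit_to_gbyte_per_s(LINK_RATE_GBIT[node_type])


for node_type, rate in LINK_RATE_GBIT.items():
    print(f"{node_type}: {rate} Gbit/s is about {gbit_to_gbyte_per_s(rate):.0f} GB/s")
```

For example, `transfer_seconds(768, "cpu")` gives a lower bound on the time needed to send the equivalent of one CPU node's full memory over its link.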

Node characteristics

nodes  cores  available memory  CPU                                                    GPU
1224   192    768 GB DDR5       2 x AMD EPYC 9655 (Zen 5) @ 2.6 GHz, 384 MB L3 cache   -
60     96     768 GB DDR5       1 x AMD EPYC 9654 (Zen 4) @ 2.4 GHz, 384 MB L3 cache   4 x NVIDIA H100 SXM (80 GB memory)
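
The per-node figures in the table can be combined into system-wide totals. The sketch below does that arithmetic. The totals are derived from the table rather than stated on this page, and the `NodeType` class and the "cpu"/"gpu" labels are names chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str             # label chosen for this sketch
    nodes: int            # number of nodes of this type
    cores_per_node: int
    memory_gb: int        # available memory per node
    gpus_per_node: int = 0
    gpu_memory_gb: int = 0  # memory per GPU


# Values copied from the node characteristics table.
NODE_TYPES = [
    NodeType("cpu", nodes=1224, cores_per_node=192, memory_gb=768),
    NodeType("gpu", nodes=60, cores_per_node=96, memory_gb=768,
             gpus_per_node=4, gpu_memory_gb=80),
]


def totals(node_types):
    """Sum nodes, cores, memory, and GPUs across all node types."""
    return {
        "nodes": sum(t.nodes for t in node_types),
        "cores": sum(t.nodes * t.cores_per_node for t in node_types),
        "memory_gb": sum(t.nodes * t.memory_gb for t in node_types),
        "gpus": sum(t.nodes * t.gpus_per_node for t in node_types),
        "gpu_memory_gb": sum(
            t.nodes * t.gpus_per_node * t.gpu_memory_gb for t in node_types
        ),
    }


summary = totals(NODE_TYPES)
for key, value in summary.items():
    print(f"{key}: {value:,}")
```

With the table's values, this gives 1,284 nodes, 240,768 CPU cores, and 240 GPUs in total.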