Trillium

From Alliance Doc

Revision as of 03:41, 5 November 2024



{| class="wikitable"
|-
| Availability || Spring 2025
|-
| Login node || to be determined
|-
| Globus endpoint || to be determined
|-
| Data transfer node (rsync, scp, sftp,...) || to be determined
|-
| Portal || to be determined
|}

This page describes Trillium, the large parallel cluster hosted by SciNet at the University of Toronto.

The Trillium cluster will be deployed in the spring of 2025.

This cluster, built by Lenovo, will consist of:

* 1,224 CPU nodes, each with
** two 96-core AMD EPYC “Zen5” processors (192 cores per node)
** 768 GiB of memory
* 60 GPU nodes, each with
** four NVIDIA H100 SXM GPUs with 80 GB of memory each
** one 96-core AMD EPYC “Zen4” processor
** 768 GiB of memory
* NVIDIA “NDR” InfiniBand network
** 400 Gbps network bandwidth for CPU nodes
** 800 Gbps network bandwidth for GPU nodes
** fully non-blocking, meaning every node can talk to every other node at full bandwidth simultaneously
* Parallel storage: 29 petabytes, all-flash, from VAST Data
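The node counts above imply the cluster's aggregate scale. A quick back-of-the-envelope sketch in Python (the inputs are taken directly from the list; the totals are computed here, not quoted from the source):

```python
# Per-node figures from the hardware list above.
cpu_nodes = 1224
cores_per_cpu_node = 2 * 96      # two 96-core "Zen5" processors
gpu_nodes = 60
gpus_per_gpu_node = 4            # four H100 SXM GPUs
mem_per_node_gib = 768           # same for CPU and GPU nodes

# Aggregate totals implied by those figures.
cpu_cores_total = cpu_nodes * cores_per_cpu_node            # 235,008 cores
gpus_total = gpu_nodes * gpus_per_gpu_node                  # 240 H100 GPUs
cpu_mem_total_tib = cpu_nodes * mem_per_node_gib / 1024     # 918 TiB

print(cpu_cores_total, gpus_total, cpu_mem_total_tib)
```

In other words, the CPU partition alone provides roughly 235,000 cores and close to a pebibyte of memory.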