Meltdown and Spectre bugs

From Alliance Doc

Revision as of 20:40, 10 January 2018

Meltdown and Spectre are bugs related to speculative execution in a variety of CPU architectures developed over the past ten to fifteen years. They affect, in particular, processors from Intel and AMD, including those in use on Compute Canada clusters. A detailed discussion of the two bugs can be found on this page, and Compute Canada personnel are currently patching all systems vulnerable to attacks based on these bugs. The performance degradation you will observe as a consequence of the patches depends on the software you are using and how it interacts with the operating system. In general, the more filesystem activity and other input/output operations a program performs during its execution, the more likely it is to slow down. Some benchmarks of the performance loss for AI and machine learning codes are publicly available, and we recommend that you run some simple tests of your own to see whether your code suffers any substantial loss of performance.
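As a starting point for such a test, the following minimal Python sketch times two toy workloads: a pure computation loop, which makes almost no system calls, and a loop of small file writes and reads, which makes many. The function names, file counts, and payload size are illustrative assumptions, not part of any Compute Canada tooling. Comparing the timings on a patched and an unpatched node (or before and after a maintenance window) should give a rough idea of which kind of workload is more affected; it is not a substitute for benchmarking your real application.

```python
import os
import tempfile
import time


def time_cpu_bound(iterations=200_000):
    """Time a pure-computation loop that makes no system calls."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    return time.perf_counter() - start


def time_io_bound(n_files=200, payload=b"x" * 4096):
    """Time many small file writes and reads (several system calls each)."""
    start = time.perf_counter()
    # Files go to the default temporary directory; on a cluster, point
    # tempfile at the filesystem your jobs actually use (assumption).
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n_files):
            path = os.path.join(tmp, f"f{i}.bin")
            with open(path, "wb") as fh:
                fh.write(payload)
            with open(path, "rb") as fh:
                fh.read()
    return time.perf_counter() - start


def run_benchmark(repeats=5):
    """Return the best-of-N time in seconds for each workload."""
    return {
        "cpu_bound": min(time_cpu_bound() for _ in range(repeats)),
        "io_bound": min(time_io_bound() for _ in range(repeats)),
    }


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

Taking the best of several repeats reduces noise from other activity on a shared node. If the I/O-bound timing grows noticeably after patching while the CPU-bound timing does not, that is consistent with the general pattern described above.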

What are the impacts?

Availability impacts

Patching the vulnerabilities requires updating the operating system and rebooting the nodes. For compute nodes, this is typically done in a rolling fashion, so each node is unavailable for a short period of time. This may impair the scheduling of large jobs, but typically goes unnoticed by users. Some nodes, however, such as login nodes and cloud hosts, will see a short interruption of service.

Performance impacts

What is Compute Canada doing about it?

Teams managing the Compute Canada clusters are acting diligently to update their servers as needed, as patches are released by the various vendors. Many servers have already been patched, but some may require further updates as vendors release new patches.
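If you are curious whether the node you are logged in to reports any mitigations, recent Linux kernels expose their status as text files under `/sys/devices/system/cpu/vulnerabilities`. The sketch below reads whatever files are present there. It assumes a Linux system whose kernel provides this directory; on older or not yet updated kernels the directory does not exist and the function returns an empty result. The function name is hypothetical, chosen only for illustration.

```python
import os


def read_mitigation_status(base="/sys/devices/system/cpu/vulnerabilities"):
    """Return {vulnerability name: kernel-reported status} for one node.

    An empty dict means the kernel does not expose this directory,
    which by itself says nothing about whether the node is patched.
    """
    status = {}
    if not os.path.isdir(base):
        return status
    for name in sorted(os.listdir(base)):
        path = os.path.join(base, name)
        try:
            with open(path) as fh:
                status[name] = fh.read().strip()
        except OSError:
            # Entry exists but could not be read as a text file.
            status[name] = "unreadable"
    return status


if __name__ == "__main__":
    report = read_mitigation_status()
    if not report:
        print("No vulnerability status exposed by this kernel.")
    for name, text in report.items():
        print(f"{name}: {text}")
```

The output is only what the kernel itself reports; firmware and microcode updates applied by the cluster teams may not be fully reflected in it.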

What should I do about it?

References

  1. Other general information about Spectre and Meltdown is available on the US-CERT web site.
    • Includes comprehensive links to vendor patch sites.