Known issues

Shared issues

  • The status page at http://status.computecanada.ca/ is not yet updated automatically, so it may lag in showing current status.
  • CC clusters are affected by the recent Meltdown and Spectre vulnerabilities and will be patched, which involves updating the OS and the CPU microcode. Read more at Meltdown and Spectre bugs.

Scheduler issues

No known issues at this time.

Quota and filesystem problems

Missing project folder

  • Upon creation of a new account for a Principal Investigator, the PROJECT storage space might not be allocated until the next business day.

Cedar only

  • /nearline storage is not yet available.

Graham only

  • A component of the /project filesystem malfunctioned while it was being returned to service, as described here. Some details of the restoration process:
    • A list of affected files (.files_being_restored) is in the base directory of each project. For privacy reasons, this list is only readable by the project owner (sponsor or PI).
    • Affected files behave oddly: they cannot be removed, because removal references an offline storage component and the deallocation fails, but they can be moved (a plain "mv" may still fail). Affected files show up in directory listings with "?" in some fields.
    • We are restoring each affected file to a separate location, then moving the broken file out of the way and moving the restored copy into place. This is happening incrementally, and every restored file is noted in another per-project list (.files_restored); a small script for comparing the two lists is sketched below this list.
    • Based on the overall number of files affected, we expect the entire process to take 8 to 10 days. The order in which files are restored depends on tape scheduling, so it is difficult to predict.
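
Project owners who want to track the restoration can compare the two per-project lists themselves. The Python sketch below is illustrative only: it assumes each list contains one file path per line (the exact format is not documented here), and the project directory shown is a hypothetical placeholder to replace with your own project's base directory.

    #!/usr/bin/env python3
    # Sketch: report restoration progress for one project on Graham.
    # Assumptions: .files_being_restored and .files_restored each hold one
    # path per line; the project path below is a placeholder.
    from pathlib import Path

    project = Path("/project/def-someuser")  # hypothetical; use your own project's base directory

    affected = set((project / ".files_being_restored").read_text().splitlines())
    restored_list = project / ".files_restored"
    restored = set(restored_list.read_text().splitlines()) if restored_list.exists() else set()

    print(f"{len(restored)} of {len(affected)} affected files restored so far")
    for path in sorted(affected - restored):
        print("still pending:", path)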