<languages />
== Computational resources ==
<b>CPU</b> (pronounced as separate letters): The abbreviation for central processing unit. Sometimes referred to simply as the central processor, but more commonly called the processor, the CPU is the brains of the computer, where most calculations take place.
<b>GPU:</b> GPU computing is the use of a graphics processing unit (GPU) to accelerate applications such as deep learning, analytics, and engineering workloads. GPU accelerators now power energy-efficient data centres in government labs, universities, enterprises, and small-and-medium businesses around the world, and they play a huge role in accelerating applications in platforms ranging from artificial intelligence to cars, drones, and robots.
<b>VCPU:</b> Stands for virtual central processing unit. One or more VCPUs are assigned to every Virtual Machine (VM) within a cloud environment. Each VCPU is seen as a single physical CPU core by the VM’s operating system.
<b>VGPU:</b> Stands for virtual graphics processing unit (VGPU). One or more VGPUs can be assigned to Virtual Machines (VM) within a cloud environment. Each VGPU is seen as a single physical GPU device by the VM's operating system.
<b>Reference GPU Unit (RGU):</b> RGU is a unit measuring the amount of GPU resources that are used. It represents the "cost" of utilizing a particular GPU model, whose RGU value varies based on performance. For example: 1 GPU A100-40GB = 4.0 RGU; 1 GPU V100-16GB = 2.2 RGU; 1 GPU P100-12GB = 1.0 RGU.
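To illustrate the accounting, here is a minimal Python sketch using the RGU values quoted above; the <code>RGU_PER_GPU</code> table and <code>rgu_cost</code> function are illustrative names, not part of any scheduler API.

<syntaxhighlight lang="python">
# RGU ("cost") per GPU model, taken from the examples above.
RGU_PER_GPU = {
    "A100-40GB": 4.0,
    "V100-16GB": 2.2,
    "P100-12GB": 1.0,
}

def rgu_cost(model, n_gpus=1):
    """Total RGU consumed by n_gpus GPUs of the given model."""
    return RGU_PER_GPU[model] * n_gpus

print(rgu_cost("V100-16GB", 2))  # 4.4 RGU for two V100-16GB GPUs
</syntaxhighlight>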
== Resource allocations ==
== Batch computing ==
<b>Cluster:</b> A group of interconnected compute nodes managed as a unit by a scheduling program.
<b>Compute node:</b> A computational unit of a cluster, one or more of which can be allocated to a job. A node has its own operating system image, one or more CPU cores and some memory (RAM). Depending on the cluster, nodes can be used by jobs in either an exclusive or a shared manner.
'''Core-year:''' The equivalent of using 1 CPU core continuously for a full year. Using 12 cores for a month or 365 cores for a single day is likewise equivalent to 1 core-year. Compute allocations are expressed in core-years.
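The conversion is simple arithmetic; a minimal sketch (the <code>core_years</code> function is illustrative only):

<syntaxhighlight lang="python">
def core_years(cores, fraction_of_year):
    """Core-years consumed by running `cores` cores for a fraction of a year."""
    return cores * fraction_of_year

print(core_years(12, 1 / 12))    # 12 cores for one month -> 1.0 core-year
print(core_years(365, 1 / 365))  # 365 cores for one day  -> 1.0 core-year
</syntaxhighlight>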
'''Core-equivalent:''' A core-equivalent is a bundle made up of a single core and some amount of associated memory. In other words, a core-equivalent is a core plus the amount of memory considered to be associated with each core on a given system. See the detailed explanation [[Allocations_and_compute_scheduling|here]].
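As a sketch of the bundling idea only: both the 4 GB-per-core figure and the rule of charging the larger of the core count and the memory demand in core-sized bundles are assumptions for illustration; see the linked page for how a given system actually accounts for this.

<syntaxhighlight lang="python">
import math

def core_equivalents(cores, mem_gb, mem_per_core_gb=4.0):
    """Larger of the cores requested and the memory request in core-sized bundles."""
    return max(cores, math.ceil(mem_gb / mem_per_core_gb))

print(core_equivalents(1, 16.0))  # 1 core + 16 GB at 4 GB/core -> 4 core-equivalents
</syntaxhighlight>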
'''GPU-year:''' A GPU-year is the equivalent of using 1 GPU continuously for a full year, or 12 GPUs for a month.
'''RGU-year:''' An RGU-year is a calculated value obtained by multiplying GPU-years by the RGU value of a given GPU model. For example, 10 GPU-years of an A100-40GB (which ''costs'' 4 RGU) equals 40 RGU-years.
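Continuing the sketch above, the worked example as a short (illustrative) computation:

<syntaxhighlight lang="python">
def rgu_years(gpu_years, rgu_per_gpu):
    """RGU-years: GPU-years multiplied by the RGU value of the GPU model."""
    return gpu_years * rgu_per_gpu

print(rgu_years(10, 4.0))  # 10 GPU-years of A100-40GB (4 RGU) -> 40.0 RGU-years
</syntaxhighlight>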
'''Head or Login node:''' Typically when you access a cluster you are accessing a head node, also called a gateway or login node. A head node is configured to be the launching point for jobs running on the cluster. When you are told or asked to log in to or access a cluster, invariably you are being directed to the head node, which is often nothing more than a node configured to act as a middle point between the actual cluster and the outside network.
'''Serial job:''' A job that requires a single CPU core to run.
'''Uneven usage:''' Most schedulers are tuned to deliver a certain number of core-years over a fixed period of time, assuming relatively consistent usage of the system. Users may have very inconsistent workloads, with significant peaks and valleys in their usage. They therefore may need a “burst” of compute resources in order to use their RAC allocation effectively. Normally we expect allocations to be used in a relatively even way throughout the award period. If you anticipate having bursty workloads or variable usage, please indicate that in your RAC application. If you are having problems running jobs, contact [[Technical support]].
== Memory ==