Cloud troubleshooting guide: Difference between revisions
Revision as of 22:17, 9 December 2020
This page describes how to troubleshoot some issues that come up frequently when using the Compute Canada cloud service. It covers solutions you can try yourself, as well as advice about submitting a trouble ticket and what information to include in it. Not all issues can be solved by you, the user; some require a system administrator. If you work through this guide and it advises you to submit a ticket, the issue is likely one you cannot easily solve yourself.
Issue: I can't log in to the cloud
- You need to specifically apply for a cloud project in order to log in to our cloud service. If you have not applied for and been granted a cloud project, you will not be able to log in and will get the error message “Invalid Credentials”. You can apply for a cloud project here: CC cloud project and RAS request form
- Once you have applied for a cloud project it can take a few days for your request to be approved. When it is approved you will receive an email with important information for accessing your project. If you have not received this confirmation email, but more than 3 business days have passed since you submitted your request, submit a ticket to cloud@computecanada.ca with your name, institution and the email address you used to submit the request.
- Make sure you are logging into the correct cloud. Your confirmation email will tell you which cloud is hosting your project. Login links for the different clouds can be found on the Cloud Wiki page in the section “Using the Cloud”.
- If you have a confirmed cloud project and are unable to log in, check the System status page to see if there is an incident affecting service on your cloud.
- Make sure you are using the correct username. You need to use your Compute Canada username, the same as you would use to log in to an HPC cluster. Do not use your email address. Test logging in at this link to see whether it is an issue with your username or password.
- If your password is rejected, reset it by visiting this link.
- If you have followed these steps and still can’t log in to your cloud project, it's time to submit a ticket. Email cloud@computecanada.ca with your username, project name, and which cloud you are trying to access. Please also describe the steps you've taken so far.
- A discussion of best practices when submitting a ticket can be found on the Compute Canada Technical Support page.
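As an illustration of the details the last steps ask for, here is a minimal Python sketch that assembles a login ticket. The function name `build_login_ticket`, the subject line wording, and the sample values are assumptions made for this example; only the support address and the list of required details come from this guide.

```python
from email.message import EmailMessage

SUPPORT_ADDRESS = "cloud@computecanada.ca"  # address given in this guide


def build_login_ticket(username, project, cloud, steps_taken):
    """Return an EmailMessage holding the details the guide asks for.

    Hypothetical helper: it only formats text and does not send anything.
    """
    if "@" in username:
        # The guide says to use the Compute Canada username, not an email address.
        raise ValueError("use your Compute Canada username, not your email address")

    fields = {"Username": username, "Project name": project, "Cloud": cloud}
    missing = [name for name, value in fields.items() if not value.strip()]
    if missing:
        raise ValueError("missing ticket details: " + ", ".join(missing))

    lines = [f"{name}: {value}" for name, value in fields.items()]
    lines.append("Steps taken so far:")
    lines.extend(f"  {i}. {step}" for i, step in enumerate(steps_taken, start=1))

    msg = EmailMessage()
    msg["To"] = SUPPORT_ADDRESS
    msg["Subject"] = f"Cannot log in to {cloud} cloud project {project}"
    msg.set_content("\n".join(lines))
    return msg
```

Calling `build_login_ticket("jdoe", "def-jdoe", "Arbutus", ["Reset password"])` (made-up values) returns a message whose body can be pasted into your mail client.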
Issue: I can't reach my virtual machine
- If you cannot connect to your virtual machine (VM, also known as an "instance"), or cannot connect to some service hosted in the cloud, check the Compute Canada System Status page. If there is an incident on your hosting cloud, wait until the incident is resolved then try to connect again.
- If there is no incident reported on the System Status page for the cloud hosting your project, try to log in to the OpenStack dashboard for your cloud project. For example, if your project is hosted at Arbutus use this link to log in: https://arbutus.cloud.computecanada.ca. Login links for other clouds can be found on the cloud wiki page.
- If you cannot reach the login page for your cloud, verify that you have internet connectivity: try to reach https://www.google.com with a browser, for example. If you have internet connectivity but cannot reach the login page for your cloud, submit a ticket to the cloud queue by emailing cloud@computecanada.ca. Include your name, username, hosting cloud, project name, and the steps you have taken thus far. For more on submitting tickets see Technical Support.
- If you are able to reach the login page for your cloud but cannot log in, please see the “I can't log in to the cloud” section earlier on this page.
- If you are able to log in to your cloud dashboard, there are a few things you can do to see if your VM is running:
- Navigate to the Instances screen on the left-side menu. Look at the Power State for your VM. It should be “Running”. If it is not in the “Running” state (for example, “Shut Down”) try to restart it using the "Actions" menu on the right-hand side. Select "Start Instance" or "Resume Instance" depending on what options are available to you.
- Look through the action logs to try to figure out why it was taken out of the running state. From the Instances screen, click on your Instance name (VM name) and then click on the “Action Log” tab. This will show all the actions that have been applied to your VM. If there is an action you don’t recognize, contact support (email: cloud@computecanada.ca) for help to figure out what it was. Include your name, username, hosting cloud, project name, and the User ID from the action log for the action you want to investigate.
- The "log" tab from the same screen will show you the console log for your VM. Look at that for error messages as well.
- If you can't restart your VM, submit a ticket to the cloud queue by emailing cloud@computecanada.ca. Include your name, username, hosting cloud, project name, VM ID, the issue you are seeing, and the steps you have taken to trouble-shoot it so far. You can find the VM ID by clicking on your instance, then looking at the overview tab. For more on submitting tickets see Technical Support.
- Can you reach your VM with Secure Shell (SSH) protocol?
- If you can’t reach your application or web service hosted on your VM, but you have followed the steps above and your VM is running, then try to connect using SSH. You can find instructions for doing this in the Cloud Quick Start Guide; scroll down to the section near the bottom of the page called “Connecting to your VM with SSH”.
- If you are getting a login prompt, verify you are using the correct key pair and username. You can check the key pair by clicking on “Instances” under “Compute” in the OpenStack dashboard, looking under the “Key Pair” column, and making sure you are using that key pair to log in. The username depends on the operating system of your VM (note: if you explicitly changed your username with a custom CloudInit script, it will be whatever you changed it to):
  Operating System   Username
  Debian             debian
  Ubuntu             ubuntu
  CentOS             centos
  Fedora             fedora
- If you are not getting a login prompt, you can double-check your security settings:
- Verify your own IP address has not changed. You can check your own IP address at this link: https://ipv4.icanhazip.com/. Your IP address must be unblocked in the security settings in order to reach your VM, so if it has changed you will need to add a new rule to your security group.
- Check that your IP address is unblocked for SSH connections to your VM. You can do this by clicking on “Network” in the left-hand side navigation panel, then “Security groups”. Click “Manage Rules” for the security group for your VM. Unless you have set up a separate group for your VM, this will be the “default” group; you can check this by going to the instance overview page. There should be a TCP rule there to allow ingress SSH at your-ip-address/32. If this rule is not there, click “Add Rule”, select SSH from the list, then enter your-ip-address/32 in the CIDR field at the bottom and click “Add”.
- If you have completed all these steps and still cannot connect to your instance, it’s time to submit a ticket. Send an email to cloud@computecanada.ca and provide the cloud name, project name, instance UUID, and all information collected from the above steps. To find the UUID, click on Instances in the Compute menu on the left-hand side, click on the name of the instance you are having trouble with, then look at the ID field in the Overview tab for that instance. The UUID is a long alphanumeric sequence.
- More information for contacting support and ticket submission best practices etc. can be found on the Compute Canada Technical Support page.
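Some of the connectivity and security-group checks above can also be approximated from your own computer. The following is a rough Python sketch using only the standard library; the function names are made up for this example, and it does not query OpenStack, so the CIDR values must be copied by hand from the “Manage Rules” screen.

```python
import ipaddress
import socket

# Default login names listed above (a custom CloudInit script may change them).
DEFAULT_USERNAMES = {
    "Debian": "debian",
    "Ubuntu": "ubuntu",
    "CentOS": "centos",
    "Fedora": "fedora",
}


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def ip_is_allowed(my_ip, rule_cidrs):
    """Return True if my_ip falls inside any CIDR copied from the SSH ingress rules."""
    address = ipaddress.ip_address(my_ip)
    return any(
        address in ipaddress.ip_network(cidr, strict=False) for cidr in rule_cidrs
    )
```

For example, `port_is_open("arbutus.cloud.computecanada.ca", 443)` tests whether the dashboard login page is reachable, `port_is_open(vm_address, 22)` does the same for SSH on your VM's public address, and `ip_is_allowed("203.0.113.7", ["203.0.113.7/32"])` checks a documentation-range IP against a /32 rule. A closed port does not tell you whether the cause is the security group or the VM itself, so treat the result as a hint rather than a diagnosis.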
Issue: I can't delete my volume
- Check to see if the volume you are trying to delete is attached to a running virtual machine (VM, also known as an instance): volumes that are attached to running VMs cannot be deleted. You can check this by logging into the cloud dashboard for your project (see the cloud wiki page for a list of the cloud login links), opening the Volumes menu in the left-side navigation menu, then selecting “Volumes”. You will be presented with a list of all your volumes. If the “Attached To” column is empty then your volume is not attached. If there is a VM listed there, the volume is attached to that VM and will need to be detached before it can be deleted. For instructions on how to detach a volume see the “Detaching a Volume” section of the OpenStack Wiki page.
- Once you have detached the volume, check its status: open the Volumes menu on the left-hand side and select "Volumes" from the menu. Now look at the "Status" column for your volume. If it is still listed as "In-use" then you need to submit a ticket: email cloud@computecanada.ca with any information collected during troubleshooting, your username, project name, cloud name, and volume UUID. If it is listed as "Available", proceed to the next step.
- Now check to see if there is a snapshot of the volume: If there are snapshots of the volume you are trying to delete, you must delete the snapshots first, then you can delete the source volume. To check if your volume has snapshots, open the “Volumes” menu on the left side panel, and then select “Snapshots”. You will see a list of all your snapshots. Look at the volume name column, to see if there are any snapshots of the volume you wish to delete. To delete a snapshot click the drop-down arrow in the “Actions” column in the row for the snapshot you wish to delete and select “Delete Volume Snapshot”.
- If you have followed these steps and still can’t delete your volume, it's time to submit a ticket. Email cloud@computecanada.ca with any information collected during troubleshooting, your username, project name, cloud name, and volume UUID.
- More information for contacting support and ticket submission best practices etc. can be found on the Compute Canada Technical Support page.
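The order of the checks above can be summarized as a small decision function. This is only a sketch in Python: the function name and its arguments are hypothetical, and the values are read by hand from the Volumes and Snapshots screens rather than fetched from OpenStack.

```python
def next_step_for_volume(status, attached_to, snapshot_count):
    """Return the next troubleshooting step for a volume that will not delete.

    status:         text from the "Status" column, e.g. "In-use" or "Available"
    attached_to:    VM name from the "Attached To" column, or "" if empty
    snapshot_count: number of snapshots listed for this volume
    """
    if attached_to:
        # Attached volumes cannot be deleted; detach first.
        return f"detach the volume from {attached_to}"
    if status.strip().lower() == "in-use":
        # Detached but still marked in use: this needs an administrator.
        return "submit a ticket: volume is detached but still shown as In-use"
    if snapshot_count > 0:
        # Snapshots must be removed before their source volume.
        return f"delete the {snapshot_count} snapshot(s) of this volume first"
    return "delete the volume; if that fails, submit a ticket"
```

For instance, a detached volume with status "Available" and two snapshots yields the advice to delete the snapshots first.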
Issue: My virtual machine won't launch
- First check to see if launching your virtual machine (VM, also known as an "instance") would put your project over quota: your cloud project has a limit on the number of VMs, CPUs and GB of RAM you can have in use at any given time. If you try to launch a VM that would cause you to exceed your project quota (whether due to the number of VMs, CPUs or RAM), then your VM launch will fail. You can check your project quota by logging into your project cloud dashboard (see the cloud wiki page for a list of cloud login links) and, in the left-side navigation menu, selecting Compute, then Overview. It will show you how much of your allotted resources are currently in use. If you need more resources for your project you can request them using this request form. Details for resource request limits and how to obtain large resource allocations (>10TB) can be found at the Cloud RAS Allocations wiki page.
- If you are getting the following error: "Error: Failed to perform requested operation on instance "instance_name", the instance has an error status: Please try again later [Error: No valid host was found. There are not enough hosts available.]" then check the following:
- You may be choosing an inappropriate availability zone when you are launching your instance. The first section you need to fill in when launching your instance is the “Details” section which includes your instance name (name for your virtual machine), the description, and the Availability Zone. The default setting is “Any Availability Zone” which allows the software to choose a Zone based on your requirements and the system availability. If you manually select the zone yourself instead of using the default option you may see this “not enough hosts” error. You can fix this by setting your availability zone back to “Any Availability Zone”.
- If you are still getting this “not enough hosts” error, or you have extenuating reasons why you need a particular availability zone, send an email to cloud@computecanada.ca and provide the cloud name, project name, all the steps you have taken to troubleshoot thus far, and, if applicable, the reasons why you need a particular availability zone.
- If you have followed these steps and still can’t get your instance to boot, it's time to submit a ticket. Email cloud@computecanada.ca with any information collected during troubleshooting, your username, project name, and cloud name.
- More information for contacting support and ticket submission best practices etc. can be found on the Compute Canada Technical Support page.
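The quota check in the first step is simple arithmetic: current usage plus the new VM must stay within each limit. Below is a minimal Python sketch of that comparison; the `Resources` class, the function name, and the numbers in the usage note are invented for illustration, and the real values would be read from the Compute > Overview screen and the flavor you are launching.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """Hypothetical container for the three limits discussed above."""
    instances: int
    vcpus: int
    ram_gb: int


def exceeded_limits(quota, in_use, requested):
    """Return the names of the quota limits the requested VM would exceed."""
    over = []
    for name in ("instances", "vcpus", "ram_gb"):
        # A launch fails if usage after the launch is above the project limit.
        if getattr(in_use, name) + getattr(requested, name) > getattr(quota, name):
            over.append(name)
    return over
```

With a quota of `Resources(10, 20, 50)`, current usage of `Resources(9, 18, 40)` and a requested VM of `Resources(1, 4, 8)`, the function returns `["vcpus"]`: the instance count and RAM still fit, but the CPU total would reach 22 against a limit of 20.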