Cloud troubleshooting guide
Revision as of 21:40, 9 December 2020
This page describes how to troubleshoot issues that come up frequently when using the Compute Canada cloud service. It covers solutions you can try yourself, as well as advice on submitting a trouble ticket and gathering the most important information for it. Not all issues can be solved by you, the user; some require a system administrator. If you work through this guide and it advises you to submit a ticket, the issue is likely one that cannot easily be solved by the user.
Issue: I can't log in to the cloud
- You need to specifically apply for a cloud project in order to log in to our cloud service. If you have not applied for and been granted a cloud project, you will not be able to log in and will get the error message “Invalid Credentials”. You can apply for a cloud project here: CC cloud project and RAS request form
- If you have applied for a cloud project it can take a few days for your request to be approved, at which time you will receive an email with important information for accessing your project. If you have not received this confirmation email, and more than 3 business days have passed since you submitted your request, we recommend that you submit a ticket to cloud@computecanada.ca with your name, institution and the email address you used to submit the request.
- Make sure you are logging in to the correct cloud. Your confirmation email will tell you which cloud is hosting your project. Login links for the different clouds can be found on the Cloud Wiki page in the section “Using the Cloud”.
- If you have a confirmed cloud project and are unable to log in, check the System status page to see if there is an incident affecting service on your cloud.
- Make sure you are using the correct username. You need to use your Compute Canada username, the same as you would use to log in to an HPC cluster. Do not use your email address. Test logging in at this link to see whether it is an issue with your username and/or password.
- If your password is rejected, reset it by visiting this link.
- If you have followed these steps and still can’t log in to your cloud project, it's time to submit a ticket. Email cloud@computecanada.ca with your username, project name, and which cloud you are trying to access. Please also describe the steps you've taken so far.
- A discussion of best practices when submitting a ticket can be found on the Compute Canada Technical Support page.
Issue: I can't reach my virtual machine
- If you are having any issues connecting to your virtual machine (VM, also known as "instance") or hosted service, the first step is to check the Compute Canada System Status page. If there is an incident on your hosting cloud, you may need to wait until the incident is resolved before connecting to your hosted service or VM.
- If the System Status page shows no incident for the cloud hosting your project, confirm that you can reach the dashboard for your cloud project (for example, to log in to Arbutus use https://arbutus.cloud.computecanada.ca). Login links for other clouds can be found on the cloud wiki page. If you cannot reach the login page for your cloud dashboard and you have verified your internet connectivity (for example, you can reach Google), we recommend that you submit a ticket. To submit a ticket to the cloud queue, email cloud@computecanada.ca. Include your name, username, hosting cloud and project name, and the steps you have taken to troubleshoot so far. For more information on submitting support tickets see the Technical Support page on the Compute Canada wiki.
- If you can reach the login page for your cloud but are having trouble actually logging in, see “Issue: I can't log in to the cloud” at the top of this page for next steps.
- If you are able to log in to your cloud dashboard, there are a few things you can check there to see whether your VM is actually running:
- Navigate to the Instances screen using the left side menu and look at the Power State for your VM. It should be “Running”. If it is not (for example, it shows “Shut Down”), you can try to restart it from the Actions menu on the far right-hand side: select either “Start Instance” or “Resume Instance”, depending on which options are available to you.
- You can look through the action log to try to figure out why the VM was taken out of the running state. From the Instances screen, click on your instance name (VM name) and then click on the “Action Log” tab. This shows all the actions that have been performed on your VM. If there is an action you don’t recognize, you can contact support (email: cloud@computecanada.ca) to try to find out who performed it. Be sure to include your name, username, hosting cloud, project name and the user ID from the action log for the action you want to investigate.
- The “Log” tab on the same screen shows the console log for your VM, which you can also search for error messages.
- If you are unable to restart your VM, we recommend that you submit a ticket. To submit a ticket to the cloud queue, email cloud@computecanada.ca. Include your name, username, hosting cloud, project name and VM ID (you can find this by clicking on your instance, then looking at the Overview tab), the steps you have taken to troubleshoot and the issue you are seeing. For more information on submitting support tickets see the Technical Support page on the Compute Canada wiki.
- Can you reach your VM with Secure Shell (SSH) protocol?
- If you can’t reach the application or web service hosted on your VM, but you have followed the previous steps and your VM is running, try to connect using SSH. You can find instructions in the Cloud Quick Start Guide, in the section near the bottom of the page called “Connecting to your VM with SSH”.
- If you are getting a login prompt, verify that you are using the correct key pair and username. To check the key pair, click on “Instances” under “Compute” in the OpenStack dashboard, look at the “Key Pair” column, and make sure you are using that key pair to log in. The username depends on the operating system of your VM (note: if you explicitly changed the username with a custom CloudInit script, it will be whatever you changed it to):
| Operating System | Username |
| Debian | debian |
| Ubuntu | ubuntu |
| CentOS | centos |
| Fedora | fedora |

- If you are not getting a login prompt, double-check your security settings:
- Verify that your own IP address has not changed. You can check your IP address at https://ipv4.icanhazip.com/. Your IP address must be unblocked in the security settings in order to reach your VM, so if it has changed you will need to add a new rule to your security group.
- Check that your IP address is unblocked for SSH connections to your VM. To do this, click on “Network” in the left-hand navigation panel, then “Security Groups”. Click “Manage Rules” for the security group for your VM (unless you have set up a separate group for your VM, this will be the “default” group; you can check this on the instance overview page). There should be a TCP rule there allowing SSH ingress from your-ip-address/32. If this rule is missing, click “Add Rule”, select SSH from the list, enter your-ip-address/32 in the CIDR field at the bottom and click “Add”.
- If you have completed all these steps and still cannot connect to your instance, it’s time to submit a ticket. Send an email to cloud@computecanada.ca and provide the cloud name, project name, instance UUID and all information collected from the above steps. To find the UUID, click on Instances in the Compute menu on the left-hand side, click on the name of the instance you are having trouble with, then look at the ID field in the Overview tab. The UUID is a long alphanumeric sequence.
- More information on contacting support, including best practices for submitting a ticket, can be found on the Compute Canada Technical Support page.
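The SSH checks above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a Compute Canada tool: the function names are hypothetical, the username mapping simply restates the table above, and the IP addresses in the usage note are documentation-range placeholders.

```python
import ipaddress
import socket

# Default login usernames per operating system, restating the table above.
DEFAULT_USERNAMES = {
    "debian": "debian",
    "ubuntu": "ubuntu",
    "centos": "centos",
    "fedora": "fedora",
}


def default_username(operating_system):
    """Return the default SSH username for an OS name, or None if unknown."""
    return DEFAULT_USERNAMES.get(operating_system.strip().lower())


def ip_allowed(my_ip, allowed_cidrs):
    """Return True if my_ip falls inside any CIDR range from the SSH rules."""
    address = ipaddress.ip_address(my_ip)
    return any(
        address in ipaddress.ip_network(cidr, strict=False)
        for cidr in allowed_cidrs
    )


def ssh_port_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

For example, `ip_allowed("203.0.113.7", ["203.0.113.7/32"])` reports whether your current address matches a rule you copied from the security group, and `ssh_port_reachable("<your VM's public IP>")` tells you whether anything answers on port 22 at all. A failed connection only shows that the port is unreachable from where you are; it does not say whether the cause is the security group, the VM itself or your own network.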
Issue: I can't delete my volume
- Check whether the volume you are trying to delete is attached to a running virtual machine (VM, also known as "instance"). Volumes that are attached to running VMs cannot be deleted. To check, log in to the cloud dashboard for your project (see the cloud wiki page for a list of the cloud login links), open the Volumes menu in the left side navigation menu, then select Volumes. You will be presented with a list of all your volumes. If the “Attached To” column is empty, your volume is not attached. If a VM is listed there, the volume is attached to that VM and must be detached before it can be deleted. For instructions on how to detach a volume see the “Detaching a Volume” section of the OpenStack Wiki page.
- Once the volume is detached, check its status: open the Volumes menu on the left-hand side and select "Volumes" from the menu. Now look at the "Status" column for your volume. If it is still listed as "In-use", you need to submit a ticket: email cloud@computecanada.ca with any information collected during troubleshooting, your username, project name, cloud name and volume UUID. If it is listed as "Available", proceed to the next step.
- Now check whether there is a snapshot of the volume. If there are snapshots of the volume you are trying to delete, you must delete the snapshots first; then you can delete the source volume. To check whether your volume has snapshots, open the “Volumes” menu on the left side panel, and then select “Snapshots”. You will see a list of all your snapshots. Look at the volume name column to see whether there are any snapshots of the volume you wish to delete. To delete a snapshot, click the drop-down arrow in the “Actions” column in the row for that snapshot and select “Delete Volume Snapshot”.
- If you have followed these steps and still can’t delete your volume, it's time to submit a ticket. Email cloud@computecanada.ca with any information collected during troubleshooting, your username, project name, cloud name and volume UUID.
- More information on contacting support, including best practices for submitting a ticket, can be found on the Compute Canada Technical Support page.
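The order of the checks above can be summarized as a small decision function. This is only a sketch of the logic in this section: the function name and its parameters are hypothetical, and the values are ones you would read off the dashboard by hand, not ones fetched from OpenStack.

```python
def next_step_for_volume(attached_to, status, snapshot_count):
    """Return the next troubleshooting action for a volume that won't delete.

    attached_to: VM name shown in the "Attached To" column, or "" if empty.
    status: value of the "Status" column, e.g. "In-use" or "Available".
    snapshot_count: number of snapshots whose source is this volume.
    """
    if attached_to:
        return "detach the volume from " + attached_to
    if status.strip().lower() == "in-use":
        # Detached but still reported as in use: this needs an administrator.
        return "submit a ticket"
    if snapshot_count > 0:
        return "delete the volume's snapshots first"
    return "delete the volume; if that fails, submit a ticket"
```

For instance, `next_step_for_volume("", "Available", 2)` points you at the snapshots, which is the case most easily overlooked because the volume itself looks ready to delete.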
Issue: My virtual machine won't launch
- First check whether your virtual machine (VM, also known as "instance") would put you over quota. Your cloud project has a limit on the number of VMs, CPUs and GB of RAM you can have in use at any given time. If you try to launch a VM that would cause you to exceed your project quota (whether in number of VMs, CPUs or RAM), the launch will fail. To check your project quota, log in to your project's cloud dashboard (see the cloud wiki page for a list of cloud login links) and, in the left side navigation menu, select Compute, then Overview. It will show you how much of your allotted resources are currently in use. If you need more resources for your project you can request them using this request form. Details for resource request limits and how to obtain large resource allocations (>10TB) can be found at the Cloud RAS Allocations wiki page.
- If you are getting the following error: "Error: Failed to perform requested operation on instance "instance_name", the instance has an error status: Please try again later [Error: No valid host was found. There are not enough hosts available.]" then check the following:
- You may be choosing an inappropriate availability zone when you are launching your instance. The first section you need to fill in when launching your instance is the “Details” section which includes your instance name (name for your virtual machine), the description, and the Availability Zone. The default setting is “Any Availability Zone” which allows the software to choose a Zone based on your requirements and the system availability. If you manually select the zone yourself instead of using the default option you may see this “not enough hosts” error. You can fix this by setting your availability zone back to “Any Availability Zone”.
- If you are still getting this “not enough hosts” error, or you have extenuating reasons why you need a particular availability zone, send an email to cloud@computecanada.ca and provide the cloud name, project name, all the steps you have taken to troubleshoot so far, and, if applicable, the reasons why you need a particular availability zone.
- If you have followed these steps and still can’t get your instance to boot, it's time to submit a ticket. Email cloud@computecanada.ca with any information collected during troubleshooting, your username, project name and cloud name.
- More information on contacting support, including best practices for submitting a ticket, can be found on the Compute Canada Technical Support page.
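The quota check described above is simple arithmetic: current usage plus the new VM's requirements must not exceed the limit for any resource. A minimal sketch follows, with hypothetical resource names and numbers that you would replace with the figures shown on your Compute Overview page.

```python
def exceeded_quotas(quota, in_use, requested):
    """Return the sorted names of resources a launch would push over quota.

    quota, in_use, requested: dicts mapping a resource name to a number.
    Resources missing from in_use or requested are counted as zero.
    """
    over = []
    for resource, limit in quota.items():
        total = in_use.get(resource, 0) + requested.get(resource, 0)
        if total > limit:
            over.append(resource)
    return sorted(over)


# Hypothetical figures for illustration only.
quota = {"instances": 10, "vcpus": 20, "ram_gb": 50}
in_use = {"instances": 3, "vcpus": 16, "ram_gb": 30}
new_vm = {"instances": 1, "vcpus": 8, "ram_gb": 16}

print(exceeded_quotas(quota, in_use, new_vm))  # prints ['vcpus']
```

In this example the launch would fail on CPUs alone (16 + 8 = 24, over the limit of 20), even though the instance count and RAM are within quota, so choosing a smaller flavor or freeing CPUs would be enough.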