Transferring data: Difference between revisions

Undo revision 37686 by Diane27 (talk)
No edit summary
(Undo revision 37686 by Diane27 (talk))
Line 3: Line 3:
<translate>
<translate>
<!--T:4-->
<!--T:4-->
=General  transfer tools=
The following tools can be used to transfer data within Computa Canada resources or to and from external computers:
The following tools can be used to transfer data within Compute Canada resources or to and from external computers:
* Secure copy [https://en.wikipedia.org/wiki/Secure_copy scp] (examples [http://www.hypexr.org/linux_scp_help.php here])
* Secure copy [https://en.wikipedia.org/wiki/Secure_copy scp] (examples [http://www.hypexr.org/linux_scp_help.php here])
* [[Transferring_data#SFTP | SFTP]]
* [[Transferring_data#SFTP | SFTP]]
* [https://en.wikipedia.org/wiki/Rsync rsync]
* [https://en.wikipedia.org/wiki/Rsync rsync]


=Between Compute Canada resources= <!--T:3-->
==Between Compute Canada resources== <!--T:3-->
[[Globus]] is the preferred tool for transferring data between Compute Canada systems; use it whenever it is available.
[[Globus]] is the preferred tool for transferring data between Compute Canada systems; use it whenever it is available.


=To and from your personal computer= <!--T:1-->
==To and from your personal computer== <!--T:1-->
You will need software that supports secure transfer of files between your computer and the Compute Canada machines. The commands <code>scp</code> and <code>sftp</code> can be used in a command-line environment on '''Linux''' or '''Mac''' OS X computers. On '''Microsoft Windows''' platforms, [https://docs.computecanada.ca/wiki/Connecting_with_MobaXTerm/en MobaXterm] offers both a graphical file transfer function and a [[Linux introduction|command-line]] interface via [[SSH]], while [http://winscp.net/eng/index.php WinSCP] is another free program that supports file transfer. [https://docs.computecanada.ca/wiki/Connecting_with_PuTTY/en PuTTY] comes with <code>pscp</code> and <code>psftp</code> which are essentially the same as the Linux and Mac command line programs.
You will need software that supports secure transfer of files between your computer and the Compute Canada machines. The commands <code>scp</code> and <code>sftp</code> can be used in a command-line environment on '''Linux''' or '''Mac''' OS X computers. On '''Microsoft Windows''' platforms, [https://docs.computecanada.ca/wiki/Connecting_with_MobaXTerm/en MobaXterm] offers both a graphical file transfer function and a [[Linux introduction|command-line]] interface via [[SSH]], while [http://winscp.net/eng/index.php WinSCP] is another free program that supports file transfer. [https://docs.computecanada.ca/wiki/Connecting_with_PuTTY/en PuTTY] comes with <code>pscp</code> and <code>psftp</code> which are essentially the same as the Linux and Mac command line programs.


Line 18: Line 17:
If it takes more than one minute to move your files to or from Compute Canada servers, we recommend you install and try [[Globus#Personal_Computers|Globus Personal Connect]]. Once a [[Globus]] transfer is set up, it continues in the background without further attention from you. Most Compute Canada legacy systems can be reached with Globus.
If it takes more than one minute to move your files to or from Compute Canada servers, we recommend you install and try [[Globus#Personal_Computers|Globus Personal Connect]]. Once a [[Globus]] transfer is set up, it continues in the background without further attention from you. Most Compute Canada legacy systems can be reached with Globus.


=From the World Wide Web= <!--T:5-->
==From the World Wide Web== <!--T:5-->
The standard tool for downloading data from websites is [https://en.wikipedia.org/wiki/Wget wget].
The standard tool for downloading data from websites is [https://en.wikipedia.org/wiki/Wget wget].


=Synchronizing files= <!--T:6-->
==Synchronizing files== <!--T:6-->
To synchronize or "sync" files (or directories) stored in two different locations means to ensure that the two copies are the same. Here are several different ways to do this.
To synchronize or "sync" files (or directories) stored in two different locations means to ensure that the two copies are the same. Here are several different ways to do this.


==Globus Transfer== <!--T:7-->
===Globus Transfer=== <!--T:7-->
We find Globus usually gives the best performance and reliability.
We find Globus usually gives the best performance and reliability.


Line 49: Line 48:
For more information about Globus please see [[Globus]].
For more information about Globus please see [[Globus]].


==Rsync== <!--T:12-->
===Rsync=== <!--T:12-->
[https://en.wikipedia.org/wiki/Rsync Rsync] is a popular tool for ensuring that two separate datasets are the same, but it can be quite slow if there are many files or if there is high latency between the two sites, for example when they are geographically distant or on different networks. By default, rsync checks the modification time and size of each file and transfers the file only if either one does not match. If you expect modification times not to match on the two systems, you can use the <code>-c</code> option, which computes checksums at the source and destination and transfers a file only if its checksums do not match.
[https://en.wikipedia.org/wiki/Rsync Rsync] is a popular tool for ensuring that two separate datasets are the same, but it can be quite slow if there are many files or if there is high latency between the two sites, for example when they are geographically distant or on different networks. By default, rsync checks the modification time and size of each file and transfers the file only if either one does not match. If you expect modification times not to match on the two systems, you can use the <code>-c</code> option, which computes checksums at the source and destination and transfers a file only if its checksums do not match.


==Using checksums to check if files match== <!--T:13-->
===Using checksums to check if files match=== <!--T:13-->
If Globus is unavailable between the two systems being synchronized and rsync is taking too long, you can run a [https://en.wikipedia.org/wiki/Checksum checksum] utility on both systems to determine whether the files match. In this example we use <code>sha1sum</code>.
If Globus is unavailable between the two systems being synchronized and rsync is taking too long, you can run a [https://en.wikipedia.org/wiki/Checksum checksum] utility on both systems to determine whether the files match. In this example we use <code>sha1sum</code>.
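
As a minimal sketch of such a comparison, the script below builds one checksum list per location and compares the lists. The directories <code>site_a</code> and <code>site_b</code> and the file names are hypothetical, and both directories live on one machine purely for illustration. In practice you would generate each list on its own system and copy one list to the other before comparing.

```shell
# Hypothetical directories standing in for the two systems.
work=$(mktemp -d)
mkdir "$work/site_a" "$work/site_b"

# run.dat is identical on both sides; other.dat differs.
printf 'results v1\n' > "$work/site_a/run.dat"
cp "$work/site_a/run.dat" "$work/site_b/run.dat"
printf 'results v1\n' > "$work/site_a/other.dat"
printf 'results v2\n' > "$work/site_b/other.dat"

# Build a checksum list for each location, sorted by file name so the lists line up.
(cd "$work/site_a" && find . -type f -exec sha1sum {} + | sort -k 2) > "$work/a.sha1"
(cd "$work/site_b" && find . -type f -exec sha1sum {} + | sort -k 2) > "$work/b.sha1"

# Any line that differs between the lists names a file whose contents do not match.
mismatched=$(diff "$work/a.sha1" "$work/b.sha1" | awk '/^[<>]/ {print $3}' | sort -u)
echo "$mismatched"
```

Only the files listed in <code>mismatched</code> need to be transferred again, which avoids re-sending data that already matches.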


Line 84: Line 83:
}}
}}


=SFTP= <!--T:21-->
==SFTP== <!--T:21-->
[https://en.wikipedia.org/wiki/SSH_File_Transfer_Protocol SFTP] (Secure File Transfer Protocol) transfers files between machines over the SSH protocol, which encrypts the data in transit.
[https://en.wikipedia.org/wiki/SSH_File_Transfer_Protocol SFTP] (Secure File Transfer Protocol) transfers files between machines over the SSH protocol, which encrypts the data in transit.

