Tar: Difference between revisions
Revision as of 21:33, 27 November 2016
This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.
How to prepare archives of your data before the migration process?
Before migrating data from one place to another, or between Compute Canada clusters or those of its regional partners, users should archive their data. It is easier for the secure copy protocol to transfer one archive file of a reasonable size than thousands of small files. Transferring a whole directory file by file, especially a directory with a large number of files, can also slow down or interrupt the file system and affect the migration. This page gives examples and practical commands for preparing your archives so that file transfers take less time and run more efficiently.
Why should you archive and compress your files?
Archiving your files reduces the total number of files you need to migrate and gives you more control over your data when you move it from one place to another. In addition, compression reduces the space your data needs and speeds up the migration.
What is an archive or a tar-file?
An archive puts many files or directories together into a single file on your disk. You can restore individual files from the archive, or the whole archive, on any Unix-like system (Unix, Linux, macOS, Cygwin). Archives are created with utilities such as tar.
How do you archive files with the tar command?
The primary archiving utility is the tar command, which bundles files or directories together into a single archive file with the extension [.tar]. It is designed to store files in, and extract files from, an archive file known as a tar-file: [your_archive_name.tar].
Depending on the options and arguments given (see below), this command can create an archive, add files to an existing archive, list the content of an archive, extract one or more files, or extract the whole archive. Giving a directory name always includes all of its sub-directories in the archive, unless the command is invoked with the --exclude option, which leaves the specified files out when the archive is created.
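As a minimal sketch of the behaviour described above, the following commands archive a directory together with its sub-directories while leaving out some files with --exclude. The directory and file names (project, results, debug.log) are hypothetical and only serve the illustration; the commands run in a temporary directory.

```shell
# Work in a throwaway directory so that no real data is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: a directory with one sub-directory and a log file.
mkdir -p project/results
echo "data"  > project/results/run1.dat
echo "notes" > project/readme.txt
echo "noise" > project/debug.log

# Archive the whole directory (sub-directories are included automatically),
# but leave out every file whose name ends in .log.
tar --exclude='*.log' -cf project.tar project

# List the content of the archive to check what was stored.
tar -tf project.tar
```

The listing should show the files under project/ except debug.log.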
How to compress/uncompress tar files?
Once the archive [your_archive.tar] is created, you can reduce its size with one of the compression utilities that are usually installed on Linux and macOS systems:
- gzip: this utility compresses any file or archive file [your_file or your_archive_name.tar] and produces the output file [your_file.gz or your_archive_name.tar.gz]. The command gunzip can be used later to uncompress it and retrieve your original file or archive file [your_file or your_archive_name.tar].
- bzip2: another utility that compresses any file or archive file [your_file or your_archive_name.tar] and produces the output file [your_file.bz2 or your_archive_name.tar.bz2]. The command bunzip2 can be invoked later to uncompress it and retrieve the original file or archive [your_file or your_archive_name.tar].
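The gzip and gunzip round trip described above can be sketched as follows. The directory data and the file sample.txt are hypothetical names created only for this example, and the sizes printed depend on your system.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: repetitive text, which compresses well.
mkdir -p data
for i in $(seq 1 1000); do
    echo "line $i of sample data"
done > data/sample.txt

# Create the archive and record its size in bytes.
tar -cf data.tar data
size_tar=$(wc -c < data.tar)

# gzip replaces data.tar with the compressed file data.tar.gz.
gzip data.tar
size_gz=$(wc -c < data.tar.gz)
echo "data.tar: $size_tar bytes, data.tar.gz: $size_gz bytes"

# gunzip restores the original data.tar and removes data.tar.gz.
gunzip data.tar.gz
ls -l data.tar
```

The same pattern applies to bzip2 and bunzip2, with data.tar.bz2 as the compressed file.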
Note: Condensed commands can combine tar and gzip (or bzip2), or tar and gunzip (or bunzip2), in a single step. If you are not familiar with the combined commands, you can create your compressed archives and retrieve your original files in two steps:
- Compressing: tar [some-adequate-options] followed by gzip or bzip2.
- Uncompressing: gunzip (or bunzip2) followed by tar [some-adequate-options].
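The two routes can be compared with a short sketch. The directory jobs and the archive names are hypothetical; the combined form relies on the -z option of tar, which filters the archive through gzip.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data.
mkdir -p jobs
echo "job script" > jobs/run.sh

# Two steps: archive first, then compress.
tar -cf jobs_two_steps.tar jobs
gzip jobs_two_steps.tar              # produces jobs_two_steps.tar.gz

# One step: -z asks tar to filter the archive through gzip.
tar -czf jobs_one_step.tar.gz jobs

# Two steps in reverse: uncompress, then extract into a separate directory.
mkdir restored_two_steps
gunzip jobs_two_steps.tar.gz         # restores jobs_two_steps.tar
tar -xf jobs_two_steps.tar -C restored_two_steps

# One step: uncompress and extract at once.
mkdir restored_one_step
tar -xzf jobs_one_step.tar.gz -C restored_one_step

# Both routes should restore identical files.
diff -r restored_two_steps restored_one_step && echo "both routes restore the same files"
```

The two-step form is easier to follow when you are learning the commands; the one-step form avoids writing the uncompressed archive to disk.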
These archiving utilities are invoked with options and arguments. For more details on how to use them, type in your terminal: man <command>.
The general syntax for tar, gzip, gunzip, bzip2 and bunzip2 is as follows:
tar [option(s)] [your_file.tar or your_archive_name.tar] [filename(s), directory or directories]
gzip [your_file or your_archive_name.tar]
gunzip [your_file.gz or your_archive_name.tar.gz]
bzip2 [your_file or your_archive_name.tar]
bunzip2 [your_file.bz2 or your_archive_name.tar.bz2]
Note that:
- gunzip is only used to uncompress files with the gz extension.
- bunzip2 is only used to uncompress files with the bz2 extension.
These are the most common options for the tar command:
-c: create a new archive.
-v: verbosely list the files that are processed.
-f: the argument that follows is the archive file name.
-t: list the content of an archive file.
-r: add files to an existing archive.
-A: append one archive to the end of another.
-x: extract files from an archive.
-z: filter the archive through gzip.
-C directory: perform a chdir [change directory] operation on directory before performing the c (create) or r (replace) operation on the files that follow.
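A short walk-through of several of these options, as a sketch on hypothetical sample data (the names documents, report, backup.tar, restore and single are made up for the example). Using -C together with -x to choose the extraction directory is an assumption beyond the description above, although it works with both GNU tar and BSD tar.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data.
mkdir -p documents report
echo "draft"   > documents/draft.txt
echo "summary" > report/summary.txt

# -c creates the archive, -v lists the files processed, -f names the archive.
tar -cvf backup.tar documents

# -r adds more files to the existing (uncompressed) archive.
tar -rvf backup.tar report

# -t lists the content of the archive without extracting anything.
tar -tvf backup.tar

# -x extracts; -C changes to the target directory before extracting.
mkdir restore
tar -xvf backup.tar -C restore

# Extract a single file by giving its path as stored in the archive.
mkdir single
tar -xvf backup.tar -C single report/summary.txt
```

Note that -r only works on an uncompressed archive; a .tar.gz file has to be uncompressed before files can be added to it.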
Common and useful commands for preparing your archives:
To illustrate the different commands and how to use the archiving utilities, we use a directory that looks like a home directory, or any other directory that contains files and sub-directories. Suppose that you have already removed the data you do not need and that the remaining data is ready for migration. One step remains before the migration: archiving and compressing your data. The following shows the most common uses of the archiving and compressing utilities with suitable options. As an example, we use a directory called Migration (substitute the name of your own directory) and see how to apply the different utilities to it.
On your terminal, change the directory to Migration (or the directory you want to work with) then:
- Use pwd {print working directory} to see the current working path.
- Use the ls {list} command to see the files and the sub-directories in the current working path.
- Use du -sh {disk usage} to see the size of the files, directories and sub-directories. This information helps you decide how to prepare your archives: which files to put together and which to compress separately.
As shown in this example:
[user_name@localhost]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ tests/ work/
[user_name@localhost]$ pwd
/global/scratch/user_name/Migration
[user_name@localhost]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ tests/ work/
[user_name@localhost]$ du -sh *
3,0K bin
876K documents
136K jobs
12K new.log.dat
68K programs
1,8M report
120K results
48K tests
46K work
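Since the number of files matters as much as their size for a transfer, it can also help to count the files under each entry. The following is only a sketch on a small stand-in for the Migration directory; the files it creates are hypothetical, and the find and wc commands are an addition to the pwd, ls and du commands shown above.

```shell
# Build a small stand-in for the Migration directory in a temporary location.
workdir=$(mktemp -d)
mkdir -p "$workdir/Migration/bin" "$workdir/Migration/results"
echo '#!/bin/sh' > "$workdir/Migration/bin/run.sh"
echo "1 2 3"     > "$workdir/Migration/results/out.dat"
echo "log"       > "$workdir/Migration/new.log.dat"
cd "$workdir/Migration"

# Where am I, what is here, and how big is each entry?
pwd
ls
du -sh *

# Count the regular files under each top-level entry; entries with many
# small files are good candidates for being archived.
for entry in *; do
    count=$(find "$entry" -type f | wc -l | tr -d ' ')
    echo "$entry: $count file(s)"
done
```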