Tar: Difference between revisions
=== How to tar a given directory? ===
Now, we can go back to our test example and create an archive called '''results.tar''' from the directory '''results'''. In your terminal, type:
<source lang="console">
[user_name@localhost]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ tests/ work/
</source>
Then:
<source lang="console">
[user_name@localhost]$ tar -cvf results.tar results
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
</source>
Using the <code>ls</code> command, we can see the newly created tar file:
<source lang="console">
[user_name@localhost]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ results.tar tests/ work/
</source>
In this example, we have invoked the <code>tar</code> command with the options '''c''' {for create}, '''v''' {for verbose} and '''f''' {for file}. We named the archive '''results.tar'''; any name would work, but it is better to keep a name similar to that of the file or directory being archived. That makes it easier to recognize your data later without having to extract the archive to see what it contains.
To put several directories into one tar file, for example an archive called '''full_results.tar''' containing the directories '''results''', '''report''' and '''documents''', proceed as follows:
<source lang="console">
[user_name@localhost]$ tar -cvf full_results.tar results report documents/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
report/
report/report-2016.pdf
report/report-a.pdf
documents/
documents/1504.pdf
documents/ff.doc
</source>
<source lang="console">
[user_name@localhost]$ ls
bin/ documents/ full_results.tar jobs/ new.log.dat programs/ report/ results/ results.tar tests/ work/
</source>
=== How to tar all the files or directories that start with a given letter, for example "r"? ===
In our working directory, two directories start with r ('''report''' and '''results'''). The shell expands <code>r*</code> to every matching name before <code>tar</code> runs. The pattern matches files as well as directories, so an existing '''results.tar''' in the working directory would also be included.
<source lang="console">
[user_name@localhost]$ tar -cvf archive.tar r*
report/
report/report-2016.pdf
report/report-a.pdf
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
</source>
In this example, we combined the contents of the directories '''results''' and '''report''' into a single archive called '''archive.tar'''.
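The two archiving commands above can be tried safely on a throwaway tree. The following is only a minimal sketch: the directory and file names are hypothetical stand-ins for the example data, and everything is created in a temporary directory that is removed when the shell exits.

```shell
# Hypothetical practice tree; none of these names come from a real project.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p results/Res-01 report documents
echo "run 1" > results/log1.dat
echo "run 2" > results/Res-01/log.a
echo "draft" > report/notes.txt
echo "paper" > documents/paper.txt

# Several directories in one archive: list them after the archive name.
tar -cf full_results.tar results report documents

# The shell expands r* to "report results" before tar starts.
tar -cf archive.tar r*

# Count the entries (directories and files) stored in each archive.
full_entries=$(tar -tf full_results.tar | wc -l | tr -d ' ')
r_entries=$(tar -tf archive.tar | wc -l | tr -d ' ')
echo "full_results.tar: $full_entries entries, archive.tar: $r_entries entries"
```

The original directories are left in place, so the archives can be compared against them afterwards.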
=== How to see the content of a tar file? ===
From our previous example, consider the tar file '''results.tar''', which contains all the files and sub-directories of the directory '''results'''. To see which files it holds, invoke the '''-t''' option. Combined with '''v''', it also shows additional information about each file, such as permissions, owner, size and date.
<source lang="console">
[user_name@localhost]$ tar -tvf results.tar
drwxrwxr-x name name 0 2016-11-20 11:02 results/
-rw-r--r-- name name 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- name name 10909 2016-11-16 16:31 results/log5.dat
drwxrwxr-x name name 0 2016-11-16 19:36 results/Res-01/
-rw-r--r-- name name 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- name name 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
drwxrwxr-x name name 0 2016-11-16 19:37 results/Res-02/
-rw-r--r-- name name 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- name name 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
</source>
In this example, the <code>tar</code> command was invoked with the options '''t''' {for list}, '''v''' {for verbose} and '''f''' {for file}. It shows all the files in the tar file along with their permissions, ownership, size and date.
If you only want the names of the files in the tar file, use <code>-tf</code> instead of <code>-tvf</code>:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
</source>
If you are interested in the number of entries in the tar file, combine one of the previous commands with a pipe { | } and <code>wc -l</code> {word count, with the option -l to count only lines}. This counts the number of lines in the output of the command before the pipe symbol.
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | wc -l
9
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | wc -l
9
</source>
From this example, we have a total of 9 entries. This number includes all the files and sub-directories in the directory '''results''', as well as the directory itself.
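Because the count above includes directories, it can help to count regular files separately. The sketch below assumes that directory entries are listed with a trailing slash, as in the listings shown earlier; the tree it builds is hypothetical and lives in a temporary directory.

```shell
# Hypothetical sample data in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p results/Res-01
echo a > results/log1.dat
echo b > results/Res-01/log.x
tar -cf results.tar results

# Every entry, directories included.
total=$(tar -tf results.tar | wc -l | tr -d ' ')

# Directory entries end with "/", so drop those lines to count files only.
files_only=$(tar -tf results.tar | grep -v '/$' | wc -l | tr -d ' ')

echo "all entries: $total, regular files: $files_only"
```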
The options in the previous commands can also be given separately or in their long form. For example:
* <code>-tvf</code> is equivalent to <code>-t -v -f</code>
* <code>-v</code> is equivalent to <code>--verbose</code>
* <code>-t</code> is equivalent to <code>--list</code>
* <code>--file=results.tar</code> is equivalent to <code>-f results.tar</code>
Note: The option <code>-f</code> or <code>--file=</code> must come immediately before the name of the tar file.
=== How to search for a given file in the tar archive without extracting it? ===
We have seen how to list the files in an archive. You can read through the listing to look for your file, or combine the list command with a pipe and <code>grep</code>. For example, let us see if we can find the file '''log.15Feb16.4''' (its path is results/Res-01/log.15Feb16.4):
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | grep -a log.15Feb16.4
results/Res-01/log.15Feb16.4
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | grep -a log.15Feb16.4
-rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
</source>
Now let us look for another file, for example '''pbs_file''', which does not exist in our archive:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | grep -a pbs_file
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | grep -a pbs_file
</source>
As you can see, the output of the commands is empty, meaning that the file does not exist in the archive.
To list all the files in the archive whose names start with, for example, log, type in your terminal (the pattern is quoted so that the shell passes it to <code>grep</code> unchanged, and the leading / anchors it to the start of a file name inside a directory):
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | grep -a '/log'
results/log1.dat
results/log5.dat
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
</source>
Or add the '''v''' option for more details:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | grep -a '/log'
-rw-r--r-- MEDUSA/None 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-16 16:31 results/log5.dat
-rw-r--r-- MEDUSA/None 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
-rw-r--r-- MEDUSA/None 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
</source>
Note:
* The <code>more</code> command can also be used after the pipe symbol to page through the list of files in an archive or a compressed file.
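When a directory name also contains the search text, a plain pattern returns more than intended. The sketch below shows one way to match only entries whose file name starts with log, using an extended regular expression. The sample tree, including the deliberately misleading directory catalog, is hypothetical.

```shell
# Hypothetical sample data in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p results/catalog
echo a > results/log1.dat
echo b > results/catalog/data.txt
echo c > results/catalog/log.2
tar -cf results.tar results

# Quote the pattern so the shell passes it to grep unchanged.
# (^|/) anchors the match to the start of a path component;
# [^/]*$ keeps the match inside the last component, i.e. the file name.
matches=$(tar -tf results.tar | grep -E '(^|/)log[^/]*$')
match_count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')

echo "$matches"
echo "matching files: $match_count"
```

Here results/catalog/data.txt is not reported even though its path contains the letters log, because the match is tied to the beginning of the file name.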
=== How to append files to an existing archive? ===
The '''r''' option adds files to an existing archive without having to create a new one, or to extract the archive and run <code>tar</code> again. As a quick example, let us add the file '''new.log.dat''' to the archive '''results.tar''':
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -rf results.tar new.log.dat
</source>
Here, the <code>tar</code> command added the file new.log.dat at the end of the archive results.tar.
To check, list the files in the tar file with the options seen previously:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-16 16:31 results/log5.dat
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:36 results/Res-01/
-rw-r--r-- MEDUSA/None 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:37 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
-rw-r--r-- MEDUSA/None 10905 2016-11-20 11:16 new.log.dat
</source>
Note: Files cannot be added to compressed archives (gzip or bzip2). Files can only be added to plain tar archives.
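One way around this restriction is to decompress the archive, append to the plain tar file, and compress it again. The following is a sketch under the assumption that there is enough disk space for the uncompressed archive; the file names are hypothetical sample data.

```shell
# Hypothetical sample data in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir results
echo a > results/log1.dat
echo b > new.log.dat
tar -czf results.tar.gz results

gzip -d results.tar.gz             # leaves the plain archive results.tar
tar -rf results.tar new.log.dat    # append to the plain archive
gzip results.tar                   # compress again to results.tar.gz

# Confirm that the appended file is now listed exactly once.
appended=$(tar -tzf results.tar.gz | grep -c '^new\.log\.dat$')
echo "new.log.dat entries: $appended"
```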
The <code>-r</code> option can also be used to append one or more directories to an existing tar file. Let us add '''report''' to '''results.tar''' from our previous example:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -rf results.tar report/
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-16 16:31 results/log5.dat
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:36 results/Res-01/
-rw-r--r-- MEDUSA/None 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:37 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
-rw-r--r-- MEDUSA/None 10905 2016-11-20 11:16 new.log.dat
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 report/
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-2016.pdf
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-a.pdf
[MEDUSA@MEDUSA-PC Migration]$ tar -rf results.tar report/
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
new.log.dat
new.log.dat
report/
report/report-2016.pdf
report/report-a.pdf
</source>
=== How to combine two archive files with the concatenate option? ===
Just as we can add a file to an archive, we can add one archive to another. This is done with the '''-A''' option. Let us add the archive '''report.tar''' (made from the directory report) to the archive '''results.tar''':
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ report.tar results/ results.tar tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-20 16:16 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-20 16:16 results/log5.dat
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-01/
-rw-r--r-- MEDUSA/None 11682 2016-11-20 16:16 results/Res-01/log.15Feb16.4
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.4
[MEDUSA@MEDUSA-PC Migration]$ tar -A -f results.tar report.tar
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-20 16:16 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-20 16:16 results/log5.dat
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-01/
-rw-r--r-- MEDUSA/None 11682 2016-11-20 16:16 results/Res-01/log.15Feb16.4
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.4
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 report/
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-2016.pdf
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-a.pdf
</source>
In the above example, we used <code>tar</code> with '''-A''' {for concatenate} (<code>tar -A -f results.tar report.tar</code>) to add the contents of the archive report.tar to the archive results.tar, as you can see by comparing the output of <code>tar -tvf results.tar</code> before and after the operation.
The option <code>-A</code> is equivalent to <code>--catenate</code> or <code>--concatenate</code>, and it can be combined with the short or long form of the other options:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -A -f full-results.tar report.tar
[MEDUSA@MEDUSA-PC Migration]$ tar -A --file=full-results.tar report.tar
[MEDUSA@MEDUSA-PC Migration]$ tar --list --file=full-results.tar
</source>
=== How to extract the whole archive? ===
To extract an archive, use the option '''x''' {for extract} with '''f''' {for file}; '''v''' {for verbose} can also be added.
Let us extract the whole archive '''results.tar'''. If we extract it in the same directory, we have to make sure that no directory named results already exists there; otherwise the extracted data go into that directory. It is also possible to extract the archive into another directory with the '''-C''' option. For example, we create a directory '''new_results''' and extract the data from the archive results.tar into it:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ tar -xvf results.tar -C new_results/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
new.log.dat
report/
report/report-2016.pdf
report/report-a.pdf
[MEDUSA@MEDUSA-PC Migration]$ ls new_results/
new.log.dat report/ results/
</source>
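The target directory given to <code>-C</code> has to exist before <code>tar</code> runs. The sketch below creates it first, then also extracts a single member by naming its path after the archive. All names are hypothetical sample data in a temporary directory.

```shell
# Hypothetical sample data in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p results/Res-01
echo a > results/log1.dat
echo b > results/Res-01/log.x
tar -cf results.tar results

# -C needs an existing directory, so create it before extracting.
mkdir new_results
tar -xf results.tar -C new_results

# Naming a member after the archive extracts only that member.
mkdir one_file
tar -xf results.tar -C one_file results/Res-01/log.x

extracted=$(find one_file -type f | wc -l | tr -d ' ')
echo "files extracted into one_file: $extracted"
```

The member path must be written exactly as it appears in the output of <code>tar -tf</code>.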
=== How to compress your file (or files), or your tar archive? ===
Continuing from our previous example, we use <code>gzip</code> or <code>bzip2</code> to compress the files '''new.log.dat''' and '''results.tar'''. Each command replaces the original file with its compressed version.
Using gzip:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat new_results/ programs/ report/ results/ results.tar tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ gzip new.log.dat
[MEDUSA@MEDUSA-PC Migration]$ gzip results.tar
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat.gz new_results/ programs/ report/ results/ results.tar.gz tests/ work/
</source>
Using bzip2:
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat new_results/ programs/ report/ results/ results.tar tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ bzip2 new.log.dat
[MEDUSA@MEDUSA-PC Migration]$ bzip2 results.tar
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat.bz2 new_results/ programs/ report/ results/ results.tar.bz2 tests/ work/
</source>
To compress the archive at the same time as it is created, add the option '''z''' for gzip or '''j''' for bzip2 to the <code>tar</code> command:
<source lang="console">
$ tar -cvzf abc.tar.gz ./new/
</source>
The extension of the file name does not really matter to <code>tar</code>. '''.tar.gz''' and '''.tgz''' are common extensions for archives compressed with gzip; '''.tar.bz2''' and '''.tbz''' are commonly used for archives compressed with bzip2.
<source lang="console">
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ tar -cvzf results.tar.gz results/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ results.tar.gz tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ tar -cvjf results.tar.bz2 results/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ results.tar.bz2 results.tar.gz tests/ work/
</source>
Notes:
* The extension .tgz can be used instead of .tar.gz
* The extension .tbz can be used instead of .tar.bz2
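The same '''z''' option is used when listing or extracting a gzip-compressed archive. The sketch below makes a round trip with hypothetical sample data in a temporary directory; for a bzip2 archive, '''j''' would be used in place of '''z'''.

```shell
# Hypothetical sample data in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir results
echo a > results/log1.dat
tar -czf results.tar.gz results

# List the compressed archive without extracting it.
listed=$(tar -tzf results.tar.gz | grep -c 'log1\.dat$')

# Extract it into a separate directory and read the file back.
mkdir restored
tar -xzf results.tar.gz -C restored
restored_text=$(cat restored/results/log1.dat)

echo "listed: $listed, restored content: $restored_text"
```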
Revision as of 23:28, 30 November 2016
Archiving means creating one file that contains a number of smaller files within it. Archiving data can improve the efficiency of file storage, and of file transfers. It is faster for the secure copy protocol (scp), for example, to transfer one archive file of a reasonable size than thousands of small files of equal total size.
Compressing means encoding a file such that the same information is contained in fewer bytes of storage. The advantage for long-term data storage should be obvious. For data transfers, the time spent compressing the data can be balanced against the time saved moving fewer bytes as described in this discussion of data compression and transfer from the US National Center for Supercomputing Applications.
Use tar to archive files and directories
The primary archiving utility on all Linux and Unix-like systems is the tar command. It will bundle a bunch of files or directories together and generate a single file, called an archive file or tar-file. By convention an archive file has .tar as the file name extension.
When you archive a directory with tar, it will by default include all files and sub-directories contained in it, and sub-sub-directories contained in those, and so on. So
[name@server ~]$ tar --create --file project1.tar project1
will pack all the contents of directory project1/ into the file project1.tar. The original directory will be unchanged, so this may double the amount of disk space occupied!
You can extract files from the archive using the same command with a different option:
[name@server ~]$ tar --extract --file project1.tar
If there is no directory with the original name, it will be created. If a directory of that name exists and contains files of the same names as in the archive file, they will be overwritten.
How to compress and uncompress tar files
tar can compress an archive file at the same time it creates it. There are a number of compression methods to choose from. We recommend either xz or gzip, which can be used like so:
[name@server ~]$ tar --create --xz --file project1.tar.xz project1
[name@server ~]$ tar --extract --xz --file project1.tar.xz
[name@server ~]$ tar --create --gzip --file project1.tar.gz project1
[name@server ~]$ tar --extract --gzip --file project1.tar.gz
Typically, --xz will produce a smaller compressed file (a "better compression ratio") but takes longer and uses more RAM while working [1]. --gzip does not typically compress as small, but may be used if you encounter difficulties due to insufficient memory or excessive run time during tar --create.
You can also run tar --create first without compression and then use the commands xz or gzip in a separate step, although there is rarely a reason to do so. Similarly you can run xz -d or gzip -d to decompress an archive file before running tar --extract, but again there is rarely a reason to do so.
Common tar options
These are the most common options for the tar command. There are two synonymous forms for each, a single-letter form prefixed with a single dash, and a whole-word form prefixed with a double dash:
* -c or --create: Create a new archive.
* -f or --file=: Following is the archive file name.
* -x or --extract: Extract files from archive.
* -t or --list: List the contents of an archive file.
* -J or --xz: Compress or uncompress with xz.
* -z or --gzip: Compress or uncompress with gzip.
Single-letter options can be combined with a single dash, so for example
[name@server ~]$ tar -cJf project1.tar.xz project1
is equivalent to
[name@server ~]$ tar --create --xz --file=project1.tar.xz project1
There are many more options for tar, and they may depend on the version you are using. You can get a complete list of the options available on your system with man tar or tar --help. Note in particular that some older systems might not support --xz compression.
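The equivalence of the short and long forms can be checked by building the same archive both ways and comparing the listings. This is only a sketch: it uses gzip rather than xz because, as noted above, xz may not be available everywhere, and the directory project1 it creates is throwaway sample data.

```shell
# Throwaway sample data in a temporary directory.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir project1
echo hello > project1/readme.txt

# Same operation, once with combined single-letter options...
tar -czf short.tar.gz project1
# ...and once with the whole-word options.
tar --create --gzip --file=long.tar.gz project1

# Both archives should list the same members.
short_list=$(tar -tzf short.tar.gz)
long_list=$(tar --list --gzip --file=long.tar.gz)

if [ "$short_list" = "$long_list" ]; then
    echo "listings match"
else
    echo "listings differ"
fi
```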
How to tar a given directory?
Now, we can go back to our test example and try to create an archive called results.tar for the directory results; on your terminal type:
[user_name@localhost]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ tests/ work/
Then:
[user_name@localhost]$ tar -cvf results.tar results
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
Using ls
command we can see the tar file created:
[user_name@localhost]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ results.tar tests/ work/
In this example, we have invoked the tar
command with the options c {for create}, v {for verbosity} and f {for file}. As a name for the archive, we have used results.tar; this name can be something else but it is better to keep similar name as the file or directory we want to tar. It is easier to recognize your data later without having to uncompress them to see what data you have in this file.
If we want to add more directories to a tar file; for example, an archive file called full_results.tar that for the directories results, reports and documents, we can proceed as follow:
[user_name@localhost]$ tar -cvf full_results.tar results report documents/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
report/
report/report-2016.pdf
report/report-a.pdf
documents/
documents/1504.pdf
documents/ff.doc
[user_name@localhost]$ ls
bin/ documents/ full_results.tar jobs/ new.log.dat programs/ report/ results/ results.tar tests/ work/
How to tar for example all the files or directories that start with a given a letter, "r" for example:
In our working directory, we have two directories that starts with r (report, results).
[user_name@localhost]$ tar -cvf archive.tar r*
report/
report/report-2016.pdf
report/report-a.pdf
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
In this example, we put together the content of the directories results and report into one single archive called archive.tar.
How to see the content of a tar file?
From our previous example, let us consider the tar file results.tar that corresponds all the files and sub-directories in the directory of interest results to see what are the files in it. This can be achieved by invoking the –t option. This gives also additional information about the files like permission, date, owner, etc.
[user_name@localhost]$ tar -tvf results.tar
drwxrwxr-x name name 0 2016-11-20 11:02 results/
-rw-r--r-- name name 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- name name 10909 2016-11-16 16:31 results/log5.dat
drwxrwxr-x name name 0 2016-11-16 19:36 results/Res-01/
-rw-r--r-- name name 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- name name 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
drwxrwxr-x name name 0 2016-11-16 19:37 results/Res-02/
-rw-r--r-- name name 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- name name 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
In this example, tar command was invoked with the option t {for list}, v {for virbosity} and {f for file}. This command shows all the files that are in the tar file with additional information about the permission, the date, ownership .... If you are interested just in listing the files in the tar file, use the following options (tf instead of tvf):
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
If you are interested in the number of files in the tar file, it is possible to combine one the previous commands with a pipe { | } and wc -l { word count with the option -l to count only the number of lines}. This command counts the number of lines in the output from the command before the pipe symbol.
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | wc -l 9 [MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | wc -l 9
From this example, we have a total of 9. This number include all the files and subdirectories that are in the directory results including this directory itself. The options in the previous commands van be invoked separately. For example: The option -tvf is equivalent to -t -v -f The option -v is equivalent to --verbose The option -t is equivalent to -t The option --file=results.tar is equivalent to -f results.tar Note: The option -f or --file= comes always before the tar file.
How to search for a given file in the tar archive file without un-tarring the archive? We have seen previously how to list the files in the archive. So, it possible to list the files and look at for your file or use the list commands combined with pipe an grep commands. For example, let us see if we can fine the file: log.15Feb16.4 (the path to this file is: results/Res-01/log.15Feb16.4).
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | grep -a log.15Feb16.4
results/Res-01/log.15Feb16.4
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | grep -a log.15Feb16.4 -rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
Now, we can try see if we can find another file called for example pbs_file (this file does not exist in our archive):
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | grep -a pbs_file
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | grep -a pbs_file
As you can see, the output of the commands is empty meaning that the file does not exist in the archive. If you want to list all the files that start for example by log in the archive, type on your terminal:
[MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar | grep -a log*
results/log1.dat
results/log5.dat
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
Or add the v option for more details:
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar | grep -a log*
-rw-r--r-- MEDUSA/None 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-16 16:31 results/log5.dat
-rw-r--r-- MEDUSA/None 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
-rw-r--r-- MEDUSA/None 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
Note: · The command more can be also invoked after the pipe symbol to list the files in the archive or the compressed file. How to append a file or files or add a new file to the end of archive or tar file? The r option can be used to add files to existing archives, without having to create new ones or extract the archive and run tar again to create the archive. Here is a quick example: let us add the file new.log.dat to the archive results.tar
[MEDUSA@MEDUSA-PC Migration]$ tar -rf results.tar new.log.dat
Here, the tar command added the file new.log.dat at the end of the archive results.tar.
To check out use the previous options to list the files in the tar file:
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-16 16:31 results/log5.dat
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:36 results/Res-01/
-rw-r--r-- MEDUSA/None 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:37 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
-rw-r--r-- MEDUSA/None 10905 2016-11-20 11:16 new.log.dat
Note: Files cannot be added to compressed archives (gz or bzip2). Files can only be added to plain tar archives. The ‘-r‘ option in the tar command can also be used to append or add a directory or directories to existing tar file. Let’s add report to results.tar from our previous example:
[MEDUSA@MEDUSA-PC Migration]$ tar -rf results.tar report/
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-16 16:31 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-16 16:31 results/log5.dat
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:36 results/Res-01/
-rw-r--r-- MEDUSA/None 11672 2016-11-16 15:10 results/Res-01/log.15Feb16.1
-rw-r--r-- MEDUSA/None 11682 2016-11-16 15:10 results/Res-01/log.15Feb16.4
drwxrwxr-x MEDUSA/None 0 2016-11-16 19:37 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-16 15:10 results/Res-02/log.15Feb16.balance.b.4
-rw-r--r-- MEDUSA/None 10905 2016-11-20 11:16 new.log.dat
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 report/
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-2016.pdf
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-a.pdf
[MEDUSA@MEDUSA-PC Migration]$ tar -rf results.tar report/ [MEDUSA@MEDUSA-PC Migration]$ tar -tf results.tar results/ results/log1.dat results/log5.dat results/Res-01/ results/Res-01/log.15Feb16.1 results/Res-01/log.15Feb16.4 results/Res-02/ results/Res-02/log.15Feb16.balance.b.1 results/Res-02/log.15Feb16.balance.b.4 new.log.dat new.log.dat report/ report/report-2016.pdf report/report-a.pdf
How to add two archive files with concatenate option?
As we can add a file to archive it is possible to add an archive to another archive. This can be done by invoking the -A option. Let us add the archive report.tar (for the directory report) to the archive results.tar.
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ report.tar results/ results.tar tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-20 16:16 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-20 16:16 results/log5.dat
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-01/
-rw-r--r-- MEDUSA/None 11682 2016-11-20 16:16 results/Res-01/log.15Feb16.4
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.4
[MEDUSA@MEDUSA-PC Migration]$ tar -A -f results.tar report.tar
[MEDUSA@MEDUSA-PC Migration]$ tar -tvf results.tar
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/
-rw-r--r-- MEDUSA/None 10905 2016-11-20 16:16 results/log1.dat
-rw-r--r-- MEDUSA/None 10909 2016-11-20 16:16 results/log5.dat
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-01/
-rw-r--r-- MEDUSA/None 11682 2016-11-20 16:16 results/Res-01/log.15Feb16.4
drwxr-xr-x MEDUSA/None 0 2016-11-20 16:16 results/Res-02/
-rw-r--r-- MEDUSA/None 34111 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.1
-rw-r--r-- MEDUSA/None 34117 2016-11-20 16:16 results/Res-02/log.15Feb16.balance.b.4
drwxrwxr-x MEDUSA/None 0 2016-11-20 11:02 report/
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-2016.pdf
-rw-r--r-- MEDUSA/None 924729 2015-11-20 04:14 report/report-a.pdf
In the above example, we used tar with -A {for concatenate} (tar -A -f results.tar report.tar) to append the contents of the archive report.tar to the archive results.tar, as you can see by comparing the output of tar -tvf results.tar before and after the operation. The long forms of -A are --catenate and --concatenate; likewise, -f can be written --file and -t can be written --list:
[MEDUSA@MEDUSA-PC Migration]$ tar -A -f full-results.tar report.tar
[MEDUSA@MEDUSA-PC Migration]$ tar -A --file=full-results.tar report.tar
[MEDUSA@MEDUSA-PC Migration]$ tar --list --file=full-results.tar
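To see why -A is needed rather than -r, it can help to compare the two on small test archives. This is a sketch under the assumption that GNU tar is installed; the scratch directory, the file contents and the name wrong.tar are made up for the illustration.

```shell
# Sketch: concatenate two archives with -A and compare with -r (assumes GNU tar).
# All names below are illustrative and live in a throwaway directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir results report
echo "run 1" > results/log1.dat
echo "summary" > report/report-a.txt

tar -cf results.tar results
tar -cf report.tar report
cp results.tar wrong.tar           # copy used for the -r comparison

before=$(tar -tf results.tar)
tar -A -f results.tar report.tar   # members of report.tar are appended
after=$(tar --list --file=results.tar)

tar -rf wrong.tar report.tar       # -r stores report.tar itself as one member
wrong=$(tar -tf wrong.tar)

echo "after -A:"; echo "$after"
echo "after -r:"; echo "$wrong"
```

With -A the members of the second archive become members of the first; with -r the second archive is stored as a single file inside the first, which is usually not what is wanted.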
How to extract the whole archive?
To extract an archive, we use the option x {for extract} together with f {for file}; v {for verbosity} can also be added. Let us extract the whole archive results.tar. If we extract it in the same directory, we have to make sure that no directory with the same name already exists; otherwise the extracted data go into that directory and can overwrite files of the same name. It is also possible to extract the archive into another directory with the -C option. For example, we create a directory new_results and extract the data from the archive results.tar into it:
[MEDUSA@MEDUSA-PC Migration]$ tar -xvf results.tar -C new_results/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.1
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
new.log.dat
report/
report/report-2016.pdf
report/report-a.pdf
[MEDUSA@MEDUSA-PC Migration]$ ls new_results/
new.log.dat report/ results/
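The extraction into a separate directory can be sketched as follows. The target directory must exist before tar is called with -C, so the sketch creates it first; the scratch directory and file contents are assumptions made for the illustration.

```shell
# Sketch: extract an archive into another directory with -C.
# Paths are illustrative and live in a throwaway directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir results
echo "run 1" > results/log1.dat
tar -cf results.tar results

mkdir new_results                      # -C needs an existing directory
tar -xvf results.tar -C new_results    # extract below new_results/
extracted=$(cat new_results/results/log1.dat)
echo "extracted content: $extracted"
```

The original results directory is left untouched, so this is a simple way to inspect an archive without risking the files already in the working directory.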
How to compress your file (or files), or your tar archive?
Continuing from our previous example, we use gzip or bzip2 to compress the files new.log.dat and results.tar.
Using gzip:
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat new_results/ programs/ report/ results/ results.tar tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ gzip new.log.dat
[MEDUSA@MEDUSA-PC Migration]$ gzip results.tar
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat.gz new_results/ programs/ report/ results/ results.tar.gz tests/ work/
Using bzip2:
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat new_results/ programs/ report/ results/ results.tar tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ bzip2 new.log.dat
[MEDUSA@MEDUSA-PC Migration]$ bzip2 results.tar
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat.bz2 new_results/ programs/ report/ results/ results.tar.bz2 tests/ work/
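As the listings show, gzip and bzip2 replace the original file with the compressed one. The sketch below reproduces this with gzip on a small test archive and compares the sizes; it assumes gzip and a tar that accepts the z option are installed, and the generated log file is invented for the illustration. bzip2 is expected to behave the same way with the .bz2 extension.

```shell
# Sketch: compress an existing tar archive with gzip and compare sizes.
# The log file is generated here purely as test data.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir results
i=0
while [ "$i" -lt 1000 ]; do
    echo "step $i energy 0.000"
    i=$((i + 1))
done > results/log1.dat

tar -cf results.tar results
size_before=$(wc -c < results.tar | tr -d ' ')

gzip results.tar                      # results.tar is replaced by results.tar.gz
size_after=$(wc -c < results.tar.gz | tr -d ' ')
echo "plain: $size_before bytes, gzip: $size_after bytes"

listing=$(tar -tzf results.tar.gz)    # the compressed archive can still be listed
echo "$listing"
```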
To create a compressed archive in a single step, use the option z {for gzip} or j {for bzip2} with tar:
$ tar -cvzf abc.tar.gz ./new/
The extension of the file name does not really matter to tar. ".tar.gz" and ".tgz" are common extensions for archives compressed with gzip; ".tar.bz2" and ".tbz" are commonly used for archives compressed with bzip2.
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ tar -cvzf results.tar.gz results/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ results.tar.gz tests/ work/
[MEDUSA@MEDUSA-PC Migration]$ tar -cvjf results.tar.bz2 results/
results/
results/log1.dat
results/log5.dat
results/Res-01/
results/Res-01/log.15Feb16.4
results/Res-02/
results/Res-02/log.15Feb16.balance.b.1
results/Res-02/log.15Feb16.balance.b.4
[MEDUSA@MEDUSA-PC Migration]$ ls
bin/ documents/ jobs/ new.log.dat programs/ report/ results/ results.tar.bz2 results.tar.gz tests/ work/
Notes:
* The extension tgz can be used instead of tar.gz.
* The extension tbz can be used instead of tar.bz2.
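The point about extensions can be checked with a short test: the same directory is archived once as .tar.gz and once as .tgz, and both archives are listed with the same options. This is a sketch that assumes a tar supporting the z and j options; the bzip2 part is only run if the bzip2 program is found, and all file names are invented for the illustration.

```shell
# Sketch: create compressed archives in one step and compare extensions.
# Names are illustrative and live in a throwaway directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir results
echo "run 1" > results/log1.dat

tar -czf results.tar.gz results       # gzip-compressed archive
tar -czf results.tgz results          # same format, shorter extension
gz_listing=$(tar -tzf results.tar.gz)
tgz_listing=$(tar -tzf results.tgz)
echo "$gz_listing"

bz_listing=""
if command -v bzip2 >/dev/null 2>&1; then
    tar -cjf results.tar.bz2 results  # bzip2-compressed archive
    bz_listing=$(tar -tjf results.tar.bz2)
    echo "$bz_listing"
fi
```

Keeping one of the conventional extensions is still worthwhile, because it tells other users (and some tools) which program was used to compress the archive.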