Tar
Revision as of 20:47, 30 November 2016
Archiving means creating one file that contains a number of smaller files within it. Archiving data can improve the efficiency of file storage, and of file transfers. It is faster for the secure copy protocol (scp), for example, to transfer one archive file of a reasonable size than thousands of small files of equal total size.
Compressing means encoding a file such that the same information is contained in fewer bytes of storage. For long-term data storage, the advantage is that the data occupies less space. For data transfers, the time spent compressing the data must be weighed against the time saved by moving fewer bytes, as described in this discussion of data compression and transfer from the US National Center for Supercomputing Applications.
Use tar to archive files and directories
The primary archiving utility on all Linux and Unix-like systems is the tar command. It bundles a number of files or directories together into a single file, called an archive file or tar-file. By convention an archive file has .tar as the file name extension.
When you archive a directory with tar, it will by default include all files and sub-directories contained in it, and sub-sub-directories contained in those, and so on. So
[name@server ~]$ tar --create --file project1.tar project1
will pack all the contents of directory project1/ into the file project1.tar. The original directory will be unchanged, so this may double the amount of disk space occupied!
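Before removing the original directory to reclaim space, it can be worth checking what actually went into the archive. The following is a minimal sketch, not a prescribed procedure: it assumes a scratch directory from mktemp, and the sample file names are made up for illustration. It uses the --list option to show the archive contents and du to compare disk usage.

```shell
# Scratch directory with hypothetical sample data, so nothing real is touched.
workdir=$(mktemp -d)
mkdir -p "$workdir/project1/results"
echo "alpha" > "$workdir/project1/notes.txt"
echo "beta" > "$workdir/project1/results/run1.txt"

# Create the archive; --directory makes tar resolve "project1" inside $workdir.
tar --create --file "$workdir/project1.tar" --directory "$workdir" project1

# List the archive contents without extracting anything.
listing=$(tar --list --file "$workdir/project1.tar")
echo "$listing"

# Show the space used by the directory and by the archive.
du -sh "$workdir/project1" "$workdir/project1.tar"
```

If the listing shows every file you expect, the archive is a reasonable candidate to keep in place of the directory.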
You can extract files from the archive using the same command with a different option:
[name@server ~]$ tar --extract --file project1.tar
If there is no directory with the original name, it will be created. If a directory of that name exists and contains files of the same names as in the archive file, they will be overwritten.
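One way to avoid overwriting files by accident is to extract into a separate, empty directory. The sketch below assumes a scratch directory from mktemp and a hypothetical target directory named restored; it relies on the --directory option, which tells tar where to unpack.

```shell
# Scratch directory with a hypothetical project and its archive.
workdir=$(mktemp -d)
mkdir -p "$workdir/project1"
echo "original" > "$workdir/project1/notes.txt"
tar --create --file "$workdir/project1.tar" --directory "$workdir" project1

# Edit the file after archiving; extracting in place would overwrite this edit.
echo "edited later" > "$workdir/project1/notes.txt"

# Extract into a separate directory instead of on top of the working copy.
mkdir "$workdir/restored"
tar --extract --file "$workdir/project1.tar" --directory "$workdir/restored"

cat "$workdir/project1/notes.txt"            # working copy, left untouched
cat "$workdir/restored/project1/notes.txt"   # copy taken from the archive
```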
How to compress and uncompress tar files
tar can compress an archive file at the same time it creates it. There are a number of compression methods to choose from. We recommend either xz or gzip, which can be used like so:
[name@server ~]$ tar --create --xz --file project1.tar.xz project1
[name@server ~]$ tar --extract --xz --file project1.tar.xz
[name@server ~]$ tar --create --gzip --file project1.tar.gz project1
[name@server ~]$ tar --extract --gzip --file project1.tar.gz
Typically, --xz will produce a smaller compressed file (a "better compression ratio") but takes longer and uses more RAM while working [1]. --gzip typically does not compress as well, but may be used if you encounter difficulties due to insufficient memory or excessive run time during tar --create.
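To see the trade-off on your own data, you can create the same archive with each method and compare the sizes. This is only a rough sketch: the generated log file is hypothetical sample data, and the results on real data may differ considerably. The xz step is skipped when the xz program is not installed.

```shell
# Scratch directory with repetitive, compressible text (hypothetical sample data).
workdir=$(mktemp -d)
mkdir "$workdir/project1"
for i in $(seq 1 2000); do
    echo "sample $i: temperature=20.5 pressure=101.3 status=ok"
done > "$workdir/project1/log.txt"

# Same content three ways: uncompressed, gzip, and (if available) xz.
tar --create --file "$workdir/project1.tar" --directory "$workdir" project1
tar --create --gzip --file "$workdir/project1.tar.gz" --directory "$workdir" project1
if command -v xz >/dev/null 2>&1; then
    tar --create --xz --file "$workdir/project1.tar.xz" --directory "$workdir" project1
fi

# Print the size in bytes of each archive that was created.
wc -c "$workdir"/project1.tar*
```

Prefixing each tar command with time gives a rough idea of the run-time cost as well.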
You can also run tar --create first without compression and then use the commands xz or gzip in a separate step, although there is rarely a reason to do so. Similarly you can run xz -d or gzip -d to decompress an archive file before running tar --extract, but again there is rarely a reason to do so.
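For completeness, the two-step route looks roughly like this. It is a sketch using gzip on a hypothetical one-file project in a scratch directory; xz and xz -d can be substituted in the same way, producing a .tar.xz file instead.

```shell
# Scratch directory with a hypothetical one-file project.
workdir=$(mktemp -d)
mkdir "$workdir/project1"
echo "hello" > "$workdir/project1/notes.txt"

# Step 1: archive without compression. Step 2: compress the archive.
tar --create --file "$workdir/project1.tar" --directory "$workdir" project1
gzip "$workdir/project1.tar"         # replaces project1.tar with project1.tar.gz

# Reverse order: decompress first, then extract into a separate directory.
gzip -d "$workdir/project1.tar.gz"   # restores project1.tar
mkdir "$workdir/out"
tar --extract --file "$workdir/project1.tar" --directory "$workdir/out"
cat "$workdir/out/project1/notes.txt"
```

Note that gzip and xz replace the input file by default, so the uncompressed archive does not linger and take up extra space.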
Common tar options
These are the most common options for the tar command. There are two synonymous forms for each: a single-letter form prefixed with a single dash, and a whole-word form prefixed with a double dash:
* -c or --create: Create a new archive.
* -f or --file=: Following is the archive file name.
* -x or --extract: Extract files from archive.
* -t or --list: List the contents of an archive file.
* -J or --xz: Compress or uncompress with xz.
* -z or --gzip: Compress or uncompress with gzip.
Single-letter options can be combined with a single dash, so for example
[name@server ~]$ tar -cJf project1.tar.xz project1
is equivalent to
[name@server ~]$ tar --create --xz --file=project1.tar.xz project1
There are many more options for tar, and they may depend on the version you are using. You can get a complete list of the options available on your system with man tar or tar --help. Note in particular that some older systems might not support --xz compression.
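If you are unsure whether a given system supports --xz, one practical check is simply to try it on a tiny file. The sketch below is one possible approach rather than an official test: the probe file and the xz_support variable are hypothetical names chosen for illustration.

```shell
# Try to create a small xz-compressed archive and record whether it worked.
workdir=$(mktemp -d)
echo "probe" > "$workdir/probe.txt"

if tar --create --xz --file "$workdir/probe.tar.xz" --directory "$workdir" probe.txt 2>/dev/null; then
    xz_support="yes"
else
    xz_support="no"
fi
echo "tar --xz support: $xz_support"

# Clean up the scratch directory.
rm -r "$workdir"
```

If the result is "no", --gzip remains an alternative.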