tar(1) is the GNU tape archiver. It takes several files
or directories and combines them into one large file. This allows you to
compress an entire directory tree, which gzip and bzip2 cannot do on their
own because they only compress individual files. tar
has many command line options, which are explained in its man page. This
section will just cover the most common uses of tar.
The most common use for tar is to decompress and
unarchive a package that you've downloaded from a web site or ftp site.
Most files will come with a .tar.gz extension. This is
commonly known as a “tarball”. It means that several files were
archived using tar and then compressed using
gzip. You might also see this listed as a
.tar.Z file. It means the same thing, but this is usually
encountered on older Unix systems.
Alternatively, you might find a .tar.bz2 file somewhere.
Kernel source is distributed as such because it is a smaller download. As you
might have guessed, this is several files archived with tar
and then bzipped.
You can get to all the files in this archive by making use of
tar and some command line arguments. Unarchiving a
tarball makes use of the -z flag, which means to first
run the file through gunzip and decompress it. The most
common way to decompress a tarball is like so:
$ tar -xvzf hejaz.tar.gz
That's quite a few options. So what do they all mean? The
-x means to extract. This is important, as it tells
tar exactly what to do with the input file. In this case,
we'll be splitting it back up into all the files that it came from.
-v means to be verbose. This will list all the files
that are being unarchived. It is perfectly acceptable to leave this option
off, if somewhat boring. Alternatively, you could use
-vv to be very verbose and list even more information
about each file being unarchived. The -z option tells
tar to run hejaz.tar.gz through
gunzip first. And finally, the -f
option tells tar that the next string on the command line
is the file to operate on.
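The following is a minimal, self-contained sketch of the extraction described above. It builds a small stand-in tarball first so that there is something to unpack. The scratch directory, the file names inside the archive, and the use of -c to build the sample archive are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch: build a tiny tarball, then unpack it with -x, -v, -z and -f.
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT        # remove the scratch directory on exit

# Create a small tree that stands in for a downloaded package.
mkdir -p "$work/src/hejaz/docs" "$work/out"
echo "hello" > "$work/src/hejaz/README"
echo "notes" > "$work/src/hejaz/docs/notes.txt"

# Pack the tree so there is an archive to extract (-c creates an archive).
(cd "$work/src" && tar -czf "$work/hejaz.tar.gz" hejaz)

# Extract into the current directory:
#   -x extract, -v list each file, -z gunzip first, -f names the archive.
cd "$work/out"
tar -xvzf "$work/hejaz.tar.gz"
```

Dropping the -v from the last command performs the same extraction without listing the files.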
There are a few other ways to write this same command. On older systems
lacking a decent copy of GNU tar, you might see it written like so:
$ gzip -dc hejaz.tar.gz | tar -xvf -
This command line decompresses the file and sends the output to
tar. The -d option tells gzip to decompress, and -c tells it to write
the result to standard out instead of to a file, so this command writes the
decompressed archive to standard out. The pipe then sends it to
tar for unarchiving. The “-” means to operate
on standard input. It will unarchive the stream of data that it gets from
gzip and write that to the disk.
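As a rough illustration of the pipe form, the sketch below creates a small sample archive and then unpacks it by streaming the output of gzip into tar. The scratch directory and file names are hypothetical, chosen only so the example can run on its own.

```shell
#!/bin/sh
# Sketch: decompress with gzip and let tar read the stream from standard input.
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT        # remove the scratch directory on exit

# Build a sample archive: tar writes to standard out ("-"), gzip compresses it.
mkdir -p "$work/src/hejaz" "$work/out"
echo "hello" > "$work/src/hejaz/README"
(cd "$work/src" && tar -cf - hejaz | gzip -c > "$work/hejaz.tar.gz")

# gzip -d decompresses, -c writes to standard out; tar -f - reads standard input.
cd "$work/out"
gzip -dc "$work/hejaz.tar.gz" | tar -xvf -
```

Note that tar is run without -z here, because the data it receives has already been decompressed.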
Another way to write the first command line is to leave off the dash before
the options, like so:
$ tar xvzf hejaz.tar.gz
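You might also encounter a bzipped archive. The version of
tar that comes with Slackware Linux can handle these the
same as gzipped archives. Instead of the -z command
line option, you'd use -j (some older versions of tar used -y instead).
A bzipped copy of the archive above, for example, would be unpacked like so:
$ tar -xvjf hejaz.tar.bz2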
It is important to note that tar will place the unarchived
files in the current directory. So, if you had an archive in
/tmp that you wanted to decompress
into your home directory, there are two options. First, the archive could be
moved into your home directory and then run through tar.
Or, you could change to your home directory and specify the full path to the
archive file on the command line:
$ tar -xvzf /tmp/bar.tar.gz
The contents of the archive will be dumped into your home directory, and the
original compressed archive file will still be in /tmp.
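The sketch below illustrates the second approach: change to the destination directory, then give tar the full path to the archive. To keep it self-contained and harmless, it uses stand-in directories under a scratch location rather than the real /tmp and home directory. Those directory names and the archive contents are assumptions for the example.

```shell
#!/bin/sh
# Sketch: extract an archive that lives elsewhere into a chosen directory.
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT        # remove the scratch directory on exit

# "$work/tmp" stands in for /tmp, "$work/home" for your home directory.
mkdir -p "$work/tmp" "$work/home" "$work/src/bar"
echo "data" > "$work/src/bar/file.txt"
(cd "$work/src" && tar -czf "$work/tmp/bar.tar.gz" bar)

# tar unpacks into the current directory, so change there first.
# The subshell keeps the directory change local to this one command.
(cd "$work/home" && tar -xvzf "$work/tmp/bar.tar.gz")

# The original archive is left untouched where it was.
ls "$work/tmp"
```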
The second most common operation with tar is making your
own archives. Making an archive is no more complicated than unarchiving other
files; it just takes a different set of command line options.
To create a compressed tar archive of all the files in the current directory
(including any subdirectories and their files), you would use
tar like so:
$ tar -cvzf archive.tar.gz .
In this command line, the -c option tells
tar to create an archive, while the -z
option runs the resulting archive file through gzip to
compress it. archive.tar.gz is the file that you want to
create. You can call it anything you want, and if you include a full path
name, it will put the archive in that directory. Here is an example of that:
$ tar -cvzf /tmp/archive.tar.gz .
The archive would then go into /tmp.
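Here is a small sketch of creating an archive of the current directory and writing it to a different location. The directory layout and file names are hypothetical. The -t option used at the end is not covered above: it lists an archive's contents instead of extracting them, and is used here only to confirm what was stored.

```shell
#!/bin/sh
# Sketch: archive the current directory into a file stored somewhere else.
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT        # remove the scratch directory on exit

# A sample directory tree to archive.
mkdir -p "$work/project/sub" "$work/dest"
echo "one" > "$work/project/a.txt"
echo "two" > "$work/project/sub/b.txt"

# -c create, -v list each file, -z compress with gzip, -f names the archive.
cd "$work/project"
tar -cvzf "$work/dest/archive.tar.gz" .

# List the archive contents (-t) to check what went in.
listing=$(tar -tzf "$work/dest/archive.tar.gz")
echo "$listing"
```

Writing the archive outside the directory being archived, as done here, avoids tar encountering its own output file while it reads the directory.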
You can also name all the files and directories that you want included in
the archive at the end of the command. In this case, the
. is the directory to include in the archive. This
could easily be replaced with a list of various files, or whatever you want
to archive.
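To illustrate replacing the . with an explicit list, the sketch below archives one file and one directory while leaving another file out. All file names are made up for the example, and -t is used only to list the resulting archive.

```shell
#!/bin/sh
# Sketch: archive a chosen file and directory instead of the whole tree.
set -eu

work=$(mktemp -d)                 # scratch directory (hypothetical layout)
trap 'rm -rf "$work"' EXIT        # remove the scratch directory on exit

mkdir -p "$work/project/docs"
cd "$work/project"
echo "readme"  > README
echo "guide"   > docs/guide.txt
echo "scratch" > scratch.tmp      # deliberately not named on the command line

# Only README and the docs directory (with its contents) are archived.
tar -cvzf "$work/archive.tar.gz" README docs

# List the archive contents to see what was included.
listing=$(tar -tzf "$work/archive.tar.gz")
echo "$listing"
```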