http://invisible-island.net/autoconf/
Copyright © 2014–2015,2016 by Thomas E. Dickey
There is no standard version of the tar program. This may surprise those who assume that, because it is available “everywhere”, it must be a standard, or who have read comments to that effect. In POSIX (since 2001), the equivalent of tar is the pax program. As noted in the rationale:
The pax utility was new for the ISO POSIX-2:1993 standard. It represents a peaceful compromise between advocates of the historical tar and cpio utilities.
Arnold Robbins (in Unix in a Nutshell) gives more detail:
pax [options] [patterns]
Portable Archive Exchange program. When members of the first POSIX 1003.2 working group could not standardize on either tar or cpio, they invented this program. (See also cpio and tar.)
GNU/Linux and Mac OS X use almost identical versions of pax, developed by the OpenBSD team, based on the original freely available version by Keith Muller.
I used (and preferred) cpio starting in early 1986, when I wrote sccs_tools. My project had about twenty tape cartridges storing snapshots of the project's sources. The cpio program was used for writing and reading those tapes. I continued to use cpio for my own backups when I started development with Linux early in 1994. Here is a fragment from my backup script from May 1994:
cpio --verbose --reset-access-time --format=ustar -B -o -O $DST
The nice thing about cpio was that it accepts a list of pathnames from its standard input. Sadly, cpio was not prevalent on the systems where I was developing at that time, and I began to rely upon tar. Unlike cpio, tar requires its pathnames to be given on the command-line, limiting its use of standard input/output to the actual data being processed. Aside from doing my backups, I also exchange data with others in tar-files. The compelling reason for using tar is that (unlike cpio) if I provide a tar-file to others, they are likely to have a program to read it.
Unlike cpio, there were several implementations of tar. Others have tabulated differences (I will not summarize those here).
Usually tar is just a “given”, used for distributing and receiving a set of files.
Initially, in 1997 (see CHANGES2.8), tar was simply one of several programs for which I chose to compile-in full pathnames, to match the previous hand-crafted makefiles as well as to ensure that a specific program was run, rather than just any program named “tar”:
1997-04-02 * refine CF_PATH_PROG to allow for machines that haven't the given programs, by using only the program name and added configure option --disable-full-paths to enforce this behavior. - TD
1997-03-23 * Add autoconf tests for paths of programs, including sendmail vs mmdf - TD
Later, I documented the variables in lynx.cfg which could override the compiled-in pathnames:
2002-12-01 (2.8.5dev.11) * document xxx_PATH variables in lynx.cfg -TD
That included TAR_PATH.
Early in 2004, I extended the check for tar to include similar programs (referring to CHANGES):
2004-01-28 (2.8.5pre.4) * modify configure check for tar to test several common variants including star, modify makefile.in to use the configured 'tar' program (request by FLWM) -TD
The last step addressed the more common tar (or pax!) variants. Here is the configure check which I wrote:
dnl CF_TAR_OPTIONS version: 1 updated: 2004/01/26 20:58:41
dnl --------------
dnl This is just a list of the most common tar options, allowing for variants
dnl that can operate with the "-" standard input/output option.
AC_DEFUN([CF_TAR_OPTIONS],
[
case ifelse($1,,tar,$1) in
*pax)
TAR_UP_OPTIONS="-w"
TAR_DOWN_OPTIONS="-r"
TAR_PIPE_OPTIONS=""
TAR_FILE_OPTIONS="-f"
;;
*star)
TAR_UP_OPTIONS="-c -f"
TAR_DOWN_OPTIONS="-x -U -f"
TAR_PIPE_OPTIONS="-"
TAR_FILE_OPTIONS=""
;;
*tar)
# FIXME: some versions of tar require, some don't allow the "-"
TAR_UP_OPTIONS="-cf"
TAR_DOWN_OPTIONS="-xf"
TAR_PIPE_OPTIONS="-"
TAR_FILE_OPTIONS=""
;;
esac
AC_SUBST(TAR_UP_OPTIONS)
AC_SUBST(TAR_DOWN_OPTIONS)
AC_SUBST(TAR_FILE_OPTIONS)
AC_SUBST(TAR_PIPE_OPTIONS)
])dnl
It supplements this chunk:
CF_PATH_PROG(TAR, tar, pax gtar gnutar bsdtar star)
CF_TAR_OPTIONS($TAR)
AC_DEFINE_UNQUOTED(TAR_UP_OPTIONS, "$TAR_UP_OPTIONS")
AC_DEFINE_UNQUOTED(TAR_DOWN_OPTIONS, "$TAR_DOWN_OPTIONS")
AC_DEFINE_UNQUOTED(TAR_FILE_OPTIONS, "$TAR_FILE_OPTIONS")
AC_DEFINE_UNQUOTED(TAR_PIPE_OPTIONS, "$TAR_PIPE_OPTIONS")
With these parameters of “tar” it was possible to rework some hardcoded command-lines to un-tar files which were downloaded by lynx, e.g., (and simplifying):
gzip -dc filename.tar.gz | $TAR_PATH $TAR_DOWN_OPTIONS $TAR_PIPE_OPTIONS
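The substitution can be sketched outside of lynx. The following is a Python illustration rather than lynx's own C code: the table copies the values set by CF_TAR_OPTIONS above, while the names `TAR_OPTIONS` and `untar_command` are hypothetical and exist only for this example.

```python
import shlex

# Option table copied from CF_TAR_OPTIONS.  The suffixes are checked in the
# same order as the "case" patterns (*pax, *star, *tar), since "star" also
# ends with "tar".
TAR_OPTIONS = [
    ("pax",  {"up": "-w",    "down": "-r",       "pipe": "",  "file": "-f"}),
    ("star", {"up": "-c -f", "down": "-x -U -f", "pipe": "-", "file": ""}),
    ("tar",  {"up": "-cf",   "down": "-xf",      "pipe": "-", "file": ""}),
]


def untar_command(tar_path):
    """Return the argument list for extracting an archive from standard input."""
    for suffix, opts in TAR_OPTIONS:
        if tar_path.endswith(suffix):
            # An empty option string contributes no words at all.
            return ([tar_path]
                    + shlex.split(opts["down"])
                    + shlex.split(opts["pipe"]))
    raise ValueError("unrecognized tar program: " + tar_path)


if __name__ == "__main__":
    for program in ("/usr/bin/tar", "/usr/bin/star", "/bin/pax"):
        print(" ".join(untar_command(program)))
```

Keying the table on the program name is exactly the assumption which the next paragraphs criticize; the sketch only shows what the configured values expand to.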
That worked well enough, but there were a few trouble-spots.
The configure check assumes too much about the option syntax, by basing the available options on the tar program name. It would be possible to improve on this by testing the program against known useful options.
The check does not concern itself with the ownership of files which are extracted from the tar archive. Lynx disables setuid operation, but could be run by the root user.
A configure script cannot be counted on to run as root, and cannot test whether a tar program requires some special option to preserve file ownership.
SVR4 tar on AIX, HPUX, Solaris documents these options, with some variations. I omit an unrelated paragraph from the “o” option for brevity:
- o
  When o is used for reading, it causes the extracted file to take on the user and group IDs of the user running the program rather than those on the tape. This is the default for the ordinary user and can be overridden, to the extent that system protections allow, by using the p function modifier.
- p
  Cause file to be restored to the original modes and ownerships written on the archive, if possible. This is the default for the superuser, and can be overridden by the o function modifier. If system protections prevent the ordinary user from executing chown(), the error is ignored, and the ownership is set to that of the restoring process (see chown(2)). The set-user-id, set-group-id, and sticky bit information are restored as allowed by the protections defined by chmod() if the chown() operation above succeeds.
The same options were documented in SunOS 4 tar (with fewer words, of course):
o Suppress information specifying owner and modes of directories which tar normally places in the archive. Such information makes former versions of tar generate an error message like: filename/: cannot create when they encounter it.
p Restore the named files to their original modes, ignoring the present umask(2V). SetUID and sticky information are also extracted if you are the super-user. This option is only useful with the x key letter.
Not all tar programs have made that distinction. In 1997, there was a thread on devel@XFree86.Org with this item:
Date: Sun, 13 Jul 1997 01:24:55 +1000
From: David Dawes <dawes@rf900.physics.usyd.edu.au>
To: devel@XFree86.Org
Subject: Extract utility (was: Re: missing 'p' flag for tar in RELNOTES)

On Fri, Jun 06, 1997 at 10:03:16PM +0200, Matthieu Herrb wrote:
>David Dawes wrote (in a message from Fri 6)
> >
> > Not all versions of tar require the 'p' flag for this. Gnu tar for
> > example doesn't require this. Neither does the 'tar' that comes with
> > Solaris 2.5 (in spite of what the man page implies). Which tar does
> > OpenBSD use?
>
>A modified pax.
>
> > Is using OpenBSD's cpio a better option
> > (if it knows how to extract tar archives)?
>
>it's based on pax too, but it does preserve the file modes on
>extraction, so it's indeed better.
>
> > I'm more and more coming to the conclusion that we should provide an
> > 'extract' binary for each OS that people can use to unpack the .tgz
> > files in a reliable way. I would currently see this as being say GNU
> > tar, with the --unlink flag that some BSD versions have added included
> > and enabled by default, and modified to use zlib to avoid the need for
> > a separate gzip binary.
>
>Yes. For example OpenBSD's pax based commands can't read some tarballs
>made by GNU tar.

I've done some work on this, and I have something which we can hopefully
use for 3.3.1. It is gnu tar 1.12, with support added to make use of zlib
so that it is self-contained. When run as "extract" it sets the -x, -z and
--unlink-first flags, and accepts multiple .tgz files on the command line.
The -t flag can be used to override -x and list the contents. When run
under any other name, it behaves like tar.

The code for this is available as utils-1.0.0.tgz in the beta directory.
Can those who build binary distributions please check that it compiles
and works OK. If there are any problems, let me know. Building it should
only require running 'make' from the utils directory.

>That's the reason for which I didn't contribute back my buid-bindist
>scripts for 3.3. This has forced me to use one of the pax based
>commands. Unfortunatly none of them have the equivalent of the GNU tar
>'--exclude-from' option, so I had to build explicit lists of files to
>include in each tarball.

This binary can be used (under the name gnu-tar) to build the bindists.
In fact, it is probably best to use this one so that compatibility
problems are avoided.

David
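Returning to the first trouble-spot (basing the options on the program name), testing a program against known options could look like the following Python sketch. It is not what the configure script does: the candidate list, the function name `probe` and the injectable `run` parameter are assumptions made for illustration. It also says nothing about file ownership, which as noted cannot be tested without running as root.

```python
import os
import subprocess
import tempfile

# Candidate (create, extract) option pairs, tried in order.  The spellings
# are the ones used by CF_TAR_OPTIONS; the list itself is illustrative.
CANDIDATES = [
    (["-cf", "-"], ["-xf", "-"]),   # traditional tar, archive on stdout/stdin
    (["-w"], ["-r"]),               # pax
]


def probe(program, candidates=CANDIDATES, run=subprocess.run):
    """Return the first (create, extract) pair which round-trips one file,
    or None if no candidate works for this program."""
    for create, extract in candidates:
        with tempfile.TemporaryDirectory() as src, \
             tempfile.TemporaryDirectory() as dst:
            with open(os.path.join(src, "probe.txt"), "w") as fp:
                fp.write("probe\n")
            try:
                # Write the archive to a pipe ...
                made = run([program] + create + ["probe.txt"], cwd=src,
                           stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
                if made.returncode != 0:
                    continue
                # ... and feed it to the same program for extraction.
                got = run([program] + extract, cwd=dst, input=made.stdout,
                          stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            except OSError:
                continue        # program missing or not executable
            if got.returncode == 0 and \
               os.path.exists(os.path.join(dst, "probe.txt")):
                return create, extract
    return None


if __name__ == "__main__":
    for name in ("tar", "pax"):
        print(name, probe(name))
```

Passing `run` as a parameter keeps the sketch testable on a machine without any of these programs installed.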
Beyond the parameterization, lynx's extraction of files from an archive is simplistic: it assumes no errors. In practice, that could fail for any of several reasons. But the most interesting one is due to tar-file format differences, e.g., in the way excessively long pathnames are stored.
Although POSIX documented (with pax) a scheme for storing long filenames in 1989, it was not until the mid-1990s that things started to settle. Not everyone got on board at the same time.
For instance, Ant's documentation for the tar task says:
Early versions of tar did not support path lengths greater than 100 characters. Over time several incompatible extensions have been developed until a new POSIX standard was created that added so called PAX extension headers (as the pax utility first introduced them) that among another things addressed file names longer than 100 characters. All modern implementations of tar support PAX extension headers.
Ant's tar support predates the standard with PAX extension headers, it supports different dialects that can be enabled using the longfile attribute. If the longfile attribute is set to fail, any long paths will cause the tar task to fail. If the longfile attribute is set to truncate, any long paths will be truncated to the 100 character maximum length prior to adding to the archive. If the value of the longfile attribute is set to omit then files containing long paths will be omitted from the archive. Either option ensures that the archive can be untarred by any compliant version of tar.
For more detailed information on Ant, see the documentation on The TAR package.
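The 100-character limit and the later extensions can be observed with any library that writes more than one dialect. Here is a small sketch using Python's standard tarfile module (an illustration chosen for convenience, unrelated to lynx or Ant); the helper name `roundtrip` and the two sample names are made up for the example.

```python
import io
import tarfile


def roundtrip(name, fmt):
    """Archive one small member called `name` in the given format and return
    the names read back, or None if the format cannot represent the name."""
    data = b"x"
    info = tarfile.TarInfo(name)
    info.size = len(data)
    buf = io.BytesIO()
    try:
        with tarfile.open(fileobj=buf, mode="w", format=fmt) as archive:
            archive.addfile(info, io.BytesIO(data))
    except ValueError:
        # tarfile reports "name is too long" for plain ustar headers.
        return None
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as archive:
        return archive.getnames()


# Longer than 100 characters, but ustar can split it at the "/" into its
# separate prefix and name fields.
split_ok = "a" * 60 + "/" + "b" * 60
# A single component longer than 100 characters cannot be split.
too_long = "c" * 150

if __name__ == "__main__":
    for label, fmt in (("ustar", tarfile.USTAR_FORMAT),
                       ("gnu", tarfile.GNU_FORMAT),
                       ("pax", tarfile.PAX_FORMAT)):
        print(label,
              roundtrip(split_ok, fmt) is not None,
              roundtrip(too_long, fmt) is not None)
```

The ustar format stores the second name only if it can be split at a slash; the GNU and pax dialects each add their own (mutually different) extension record for the unsplittable case, which is the kind of incompatibility described above.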
The interesting tar variants of course are those whose behavior I can inspect and compare at different points in time. That equates to saying that I can read the source code.
I have access to a few Unix systems for comparison (AIX 5-7, HPUX 11, Solaris 8-11). Because source is not generally available, there is not much to say.
Illumos (descendant of OpenSolaris) has tar source (and cpio source) in its GitHub repository.
Interestingly enough, it started as 4.3 BSD tar:
/*
 * Copyright (c) 1988, 2010, Oracle and/or its affiliates. All rights reserved.
 * Copyright 2012 Milan Jurik. All rights reserved.
 * Copyright 2015 Joyent, Inc.
 */

/* Copyright (c) 1983, 1984, 1985, 1986, 1987, 1988, 1989 AT&T */
/* All Rights Reserved */

/* Copyright (c) 1987, 1988 Microsoft Corporation */
/* All Rights Reserved */

/*
 * Portions of this source code were derived from Berkeley 4.3 BSD
 * under license from the Regents of the University of California.
 */
For what it's worth, the cpio source also uses BSD code and has similar copyrights:
/*
 * Copyright (c) 1988, 2010, Oracle and/or its affiliates. All rights reserved.
 * Copyright 2012 Milan Jurik. All rights reserved.
 * Copyright (c) 2012 Gary Mills
 */

/* Copyright (c) 1983, 1984, 1985, 1986, 1987, 1988, 1989 AT&T */
/* All Rights Reserved */

/*
 * Portions of this source code were derived from Berkeley 4.3 BSD
 * under license from the Regents of the University of California.
 */
The earliest sources I have at hand for tar are ansitar, published on net.sources at the beginning of July 1983. 3BSD tar is more interesting than ansitar, because the latter works only for tapes, not files. Also, ansitar uses a different header format.
Some of the BSD source code was reportedly AT&T source code, but that is not apparent because AT&T neglected to mark their sources. In reading the BSD source for tar and its manual page, there is no copyright notice applied until 1986 (for the 4.3BSD source code) and 1990 (for the manual page). That is not AT&T:
/*
 * Copyright (c) 1980 Regents of the University of California.
 * All rights reserved. The Berkeley software License Agreement
 * specifies the terms and conditions for redistribution.
 */
The successive releases from CSRG are clearly related (1980 through 1990).
A new implementation (part of pax) by Keith Muller was introduced after that (seen in 4.4BSD-Lite):
/*-
 * Copyright (c) 1992 Keith Muller.
 * Copyright (c) 1992, 1993
 * The Regents of the University of California. All rights reserved.
 *
 * This code is derived from software contributed to Berkeley by
 * Keith Muller of the University of California, San Diego.
Outside the BSD sources, there was another tar implementation. You may find a copy in DECUS as "posixtar" (July 9, 1987):
/*
 * A public domain tar(1) program.
 *
 * Written by John Gilmore, ihnp4!hoptoad!gnu, starting 25 Aug 85.
 *
 * @(#)tar.c 1.21 10/29/86 Public Domain - gnu
 */
This happens to be the same version that John Gilmore posted to mod.sources volume 7 as v07i088: Public-domain TAR program (1986/12/10).
It is likely the version fetched by Stallman as the basis for GNU tar.
A quick check indicates that Gilmore wrote this shortly after leaving Sun:
Here's my two cents on the issue (disclaimer: I was emp #5 of Sun, though I've been gone more than two years). DEC, HP, Apollo, etc were happy with AT&T controlling Unix when it was clear AT&T was not a competitive threat. AT&T's inability to sell computers is legendary. In a tighter partnership with Sun, AT&T might actually be able to make money at computers, which would give the protesters a major competitor rather than a pussycat.
In 1985, Gilmore left Sun with $10,000 in his pocket, a Sun workstation, and significant stock holdings in the company.
It is mentioned in the BACKLOG file for GNU tar 1.12:
1. ....-..-.. John Gilmore: Re: I'm writing a public domain -tar-
2. 1985-09-14 Richard M. Stallman: I'm writing a public domain -tar-
3. 1985-12-03 John Gilmore: Re: tar
4. ....-..-.. David C. Anderson: Re: tar
5. 1986-10-31 John Gilmore: Re: wanted: a VMS program to write UNIX tar tapes
6. 1986-12-22 Richard M. Stallman: I got the tar
7. 1987-02-14 John Gilmore: Re: tar
8. 1987-12-15 Brian Reid: (none)
9. 1988-02-03 Jay Fenlason: (none)
Schilling refers to a version obtained from Sun Users Group as being the first that Gilmore published, and also leads the reader to believe that Gilmore did the work as an employee of Sun. For example:
The social background is: Star is maintained by me since 1982. Gtar started as PD-TAR/SUG-TAR from John Gilmore (a Sun employee) in late 1986 and it was taken by Stallman in 1989. In the early 1990s, the maintained changed frequently and in that time (1993) I first reported the problem. – schily Sep 5 at 9:47
The mod.sources volume 7 files are older, and there are significant differences:
Makefile | 107 +++====
PORTING | 45 !
README | 59 +!==
TODO | 76 ++-==
buffer.c | 712 +++++++++++++++++------===============================
create.c | 526 +-!======================================
extract.c | 407 +++++++-!=======================
list.c | 477 +!!================================
port.c | 431 ++++++++++++++++++++++!========
port.h | 19 =
sugtar/diffarch.c | 319 ++++++++++++++++++++++++
sugtar/open3.h | 45 +++
tar.1 | 185 +-=============
tar.c | 450 +-=================================
tar.h | 176 =============
15 files changed, 1184 insertions(+), 102 deletions(-), 240 modifications(!), 2508 unchanged lines(=)
Likewise, the tie-in to Sun is weaker than stated by Schilling.
Reflecting on it, there are other problems with Schilling's statement. But aside from those I have commented on, there is no independent source of information which can be used to compare against Schilling's account. For each detail where there is another source of information, it differs.
Gilmore made a second posting of pdtar to comp.sources.unix volume 12 as v12i068: Public domain TAR (1987/11/29). One of the differences between the two postings was the addition of wildmat.c, which is present in GNU tar 1.09, indicating that this latter posting was used in the development of GNU tar. First, compare against the volume 7 posting:
Makefile | 157 ++++++!===
PORTING | 57 +!
README | 54 !!
TODO | 69 +-==
buffer.c | 763 +++++++++++++++++++-----!==========================
create.c | 594 ++++++-!!!!!=============================
extract.c | 454 ++++++++++!!!!================
list.c | 507 +++!!!!===========================
names.c | 118 =======
port.c | 541 +++++++++++++++++++++++++++=========
port.h | 29
tar.1 | 215 +++!!!=======
tar.5 | 217 ==============
tar.c | 496 ++++-=============================
tar.h | 180 ===========
volume12/diffarch.c | 323 ++++++++++++++++++++++
volume12/msd_dir.c | 214 ++++++++++++++
volume12/msd_dir.h | 36 ++
volume12/open3.h | 50 +++
volume12/wildmat.c | 132 ++++++++
20 files changed, 2028 insertions(+), 115 deletions(-), 428 modifications(!), 2635 unchanged lines(=)
Now, compare against the SUG version:
Makefile | 157 +++!=====
PORTING | 57 !!
README | 59 !!=
TODO | 64 !==
buffer.c | 688 +++!!=========================================
create.c | 599 +++++-!!!!!=============================
diffarch.c | 324 =====================
extract.c | 451 +++!!!========================
list.c | 509 ++!!=============================
names.c | 118 =======
open3.h | 50 ==
pdtar-volume12/msd_dir.c | 214 ++++++++++++++
pdtar-volume12/msd_dir.h | 36 ++
pdtar-volume12/wildmat.c | 132 +++++++++
port.c | 541 +++++++=============================
port.h | 29
tar.1 | 215 ++!===========
tar.5 | 217 ==============
tar.c | 492 +++!=============================
tar.h | 180 ===========
20 files changed, 872 insertions(+), 41 deletions(-), 343 modifications(!), 3876 unchanged lines(=)
Considering the numbers, it seems that the SUG version is about midway between the Usenet postings for volume 7 and volume 12.
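The arithmetic behind “about midway” can be spelled out from the three summary lines above. This Python sketch is only a convenience for checking the sums; the function names are made up, and adding insertions, deletions and modifications into a single “changed” figure is my own simplification, not a measure defined by the diffstat program.

```python
import re

# Matches the summary line printed at the end of each listing.
SUMMARY = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
    r"(?:, (\d+) modifications?\(!\))?"
    r"(?:, (\d+) unchanged lines?\(=\))?"
)
KEYS = ("files", "insertions", "deletions", "modifications", "unchanged")


def parse_summary(line):
    """Turn one summary line into a dictionary of counts (missing items = 0)."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a diffstat summary: " + line)
    return {key: int(value or 0) for key, value in zip(KEYS, match.groups())}


def changed(stats):
    """Lines which differ in any way between the two trees."""
    return stats["insertions"] + stats["deletions"] + stats["modifications"]


V7_TO_SUG = parse_summary("15 files changed, 1184 insertions(+), 102 deletions(-), "
                          "240 modifications(!), 2508 unchanged lines(=)")
V7_TO_V12 = parse_summary("20 files changed, 2028 insertions(+), 115 deletions(-), "
                          "428 modifications(!), 2635 unchanged lines(=)")
SUG_TO_V12 = parse_summary("20 files changed, 872 insertions(+), 41 deletions(-), "
                           "343 modifications(!), 3876 unchanged lines(=)")

if __name__ == "__main__":
    print("volume 7  -> SUG      ", changed(V7_TO_SUG))
    print("SUG       -> volume 12", changed(SUG_TO_V12))
    print("volume 7  -> volume 12", changed(V7_TO_V12))
```

The first two figures (1526 and 1256) are of similar size and each is a little over half of the third (2571), which is what places the SUG version roughly in the middle.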
Here, pax is mainly of interest because it implements the USTAR (Unix standard tar) format, provided by modern implementations of tar.
The program itself was the result of a failure to agree on whether tar or cpio was the one to standardize, and as a result we have a program which does either.
The newsgroup thread beginning with John S. Quarterman's posting tar vs. cpio to comp.std.unix on June 1, 1987 summarizes the different points of view.
According to Glen Fowler, the first “public implementation” of pax was written by Mark H. Colburn. He posted it to comp.sources.unix as “Usenix/IEEE POSIX replacement for TAR and CPIO” (volume 17, issues 74, 75, 76, 77, 78, and 79, date February 3, 1989).
The manual pages for pax on some Unix vendors attribute pax to Mark H. Colburn:
HPUX:
AUTHOR
    pax was developed by Mark H. Colburn, OSF, and HP.
STANDARDS CONFORMANCE
    pax: XPG4, POSIX.2
IRIX (SGI):
COPYRIGHT Copyright (c) 1989 Mark H. Colburn. All rights reserved. Redistribution and use in source and binary forms are permitted provided that the above copyright notice is duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by Mark H. Colburn and sponsored by The USENIX Association. THE SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. AUTHOR Mark H. Colburn Minnetech Consulting, Inc. 117 Mackubin Street, Suite 1 St. Paul, MN 55102 mark@jhereg.MN.ORG Sponsored by The USENIX Association for public distribution.
SCO:
Copyright Copyright © 1989 Mark H. Colburn. All rights reserved. Redistribution and use in source and binary forms are permitted provided that the above copyright notice is duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by Mark H. Colburn and sponsored by The USENIX Association. THE SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. Author Mark H. Colburn NAPS International 117 Mackubin Street, Suite 1 St. Paul, MN 55102 mark@jhereg.MN.ORG Sponsored by The USENIX Association for public distribution.
but not others:
Solaris (Illumos has no source for pax, but has a manual page).
Tru64 Unix (originally OSF/1).
While there was early discussion (in 1990) for Minix to use Colburn's pax, as of 2015 Minix manual pages list only tar (no pax). This is apparently BSD tar (based on bulk import from NetBSD).
Later implementations include one by Gunnar Ritter (2004), found in the Heirloom Toolchest section of his Heirloom project. Ritter's implementation provides several formats (see manpage).
While there are a few exceptions, e.g., Linux From Scratch which uses Gunnar Ritter's version, most BSD- and Linux-systems provide the implementation by Keith Muller:
FreeBSD and NetBSD began with checkins from 4.4BSD-Lite in 1994.
OpenBSD came later (see OpenBSD source; initial checkin from NetBSD on June 11, 1996). Before importing pax from NetBSD, OpenBSD used GNU tar.
OSX pax comes from OpenBSD, using source from early 1998 according to the CVS identifiers (for example, see pax.c). There are minor changes made by Apple, too small to see here:
Makefile | 54 !
ar_io.c | 1372 ================================
ar_subs.c | 1288 ==============================
buf_subs.c | 1094 =========================
cache.c | 500 ===========
cpio.c | 1284 ==============================
extern.h | 299 =======
file_subs.c | 1117 ==========================
ftree.c | 565 =============
gen_subs.c | 467 ===========
getoldopt.c | 73 =
options.c | 1515 ====================================
osx-tar-20151206/Makefile.postamble | 5
osx-tar-20151206/Makefile.preamble | 1
osx-tar-20151206/PB.project | 54 +
pat_rep.c | 1240 =============================
pax.c | 426 ==========
sel_subs.c | 662 ===============
tables.c | 1434 ==================================
tar.c | 1214 ============================
tty_subs.c | 251 =====
21 files changed, 104 insertions(+), 1 deletion(-), 63 modifications(!), 14747 unchanged lines(=)
You can see the size by ignoring unchanged lines:
Makefile | 80 ++++++++++++++++++++++--------------
ar_io.c | 4 -
ar_subs.c | 4 +
buf_subs.c | 2
cache.c | 25 ++++++++---
cpio.c | 2
extern.h | 2
file_subs.c | 11 ++++
ftree.c | 2
gen_subs.c | 6 +-
getoldopt.c | 2
options.c | 2
osx-tar-20151206/Makefile.postamble | 5 ++
osx-tar-20151206/Makefile.preamble | 1
osx-tar-20151206/PB.project | 54 ++++++++++++++++++++++++
pat_rep.c | 2
pax.c | 4 -
sel_subs.c | 2
tables.c | 2
tar.c | 17 +++----
tty_subs.c | 2
21 files changed, 167 insertions(+), 64 deletions(-)
As of 2015, Debian (see package) uses Muller's version with updates by Thorsten Glaser. The package switched to this combination with Debian 6.0 (squeeze):
mircpio (20080906-1) experimental; urgency=low
  * Initial release
  * Adjust manpages to cope with GNU groffs inferiorities
 -- Thorsten Glaser <tg@freewrt.org> Sun, 07 Sep 2008 01:00:10 +0000
Before that (through Debian 5.0 lenny), it used an earlier port termed pax-1.5, introduced in Debian 2.0 (hamm):
pax (1:1.5-1) unstable; urgency=low
  * Initial Release of the OpenBSD's pax program from Keith Muller
 -- David Frey <dFrey@debian.org> Wed, 10 Dec 1997 12:57:48 +0100
OpenSuSE, Red Hat and related distributions (see rpmfind) use a version ported from OpenBSD by Thorsten Kukuk at SuSE. Kukuk's work stopped with version 3.4, released August 1, 2005 (see ftp directory).
Kukuk's initial port (apparently from OpenBSD CVS early December 2001) went beyond the scope of a port:
In a third of the source files, Kukuk removed the CVS identifiers and the ifdef's used to support pre-ANSI C compilers.
There are a few porting changes scattered in (such as renaming getline to avoid conflict), but those are easily overlooked in the changes to whitespace.
Those modified files are marked copyright by Kukuk.
The source code for each of Kukuk's snapshots has BSD copyrights and licenses (including Kukuk's contributions), but he added a GPL COPYING file to the releases. Perhaps he was confused about the licensing exemption for autoconf- and automake-files.
Kukuk's snapshot was made during a period when Todd Miller was changing all calls to strcpy and strncpy to use his strlcpy. However, pax relied upon the null-padding provided by strncpy, and shortly after Kukuk's initial work Miller reverted and amended the use of strlcpy in pax (for example, revision 1.21 of tar.c).
Kukuk's initial 3.0 port (packaged a few weeks later, in January) simply provided strlcpy in an add-on file. Because Kukuk's port did not incorporate Miller's corrections, it seems there was no communication between the two.
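The difference between the two functions is easy to overlook. The sketch below models it in Python with byte buffers; it is a model of the documented C semantics, not pax's code. The reuse of a buffer which still holds an earlier name is an assumption, chosen as one way in which the missing padding could become visible, and the 16-byte field stands in for the 100-byte name field of a tar header.

```python
def strncpy(dst, src, n):
    """Model of C strncpy: copy up to n bytes of src, then pad with NULs
    until exactly n bytes have been written."""
    text = src.split(b"\0")[0][:n]
    dst[:n] = text + b"\0" * (n - len(text))


def strlcpy(dst, src, n):
    """Model of strlcpy: copy at most n-1 bytes and append a single NUL;
    the rest of the destination is left untouched."""
    if n > 0:
        text = src.split(b"\0")[0][:n - 1]
        dst[:len(text) + 1] = text + b"\0"


NAME_LEN = 16   # shortened for display; assumed stand-in for a header field


def fill(copy):
    """Copy a short name into a field which still holds older contents."""
    field = bytearray(b"previous-entry!!")
    copy(field, b"README", NAME_LEN)
    return bytes(field)


if __name__ == "__main__":
    print(fill(strncpy))   # the name followed only by NUL bytes
    print(fill(strlcpy))   # the name, one NUL, then leftover bytes
```

Both results are valid C strings, which is why the substitution looks harmless; they differ only in the bytes after the terminator, and those bytes matter when the whole fixed-width field is written out.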
Here is a summary of Kukuk's initial port:
Makefile | 21
ar_io.c | 1363 ====================================
ar_subs.c | 1365 ++!!!!!!!!!!!!!!!!!!!!!!!!!!!!======
cache.c | 477 ============
cpio.1 | 295 --------
ftree.c | 539 ==============
gen_subs.c | 465 !!!!!!!!===
options.c | 1761 ++-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!=========
pax-3.0-src/Makefile.am | 28
pax-3.0-src/Makefile.in | 401 ++++++++++
pax.c | 460 !!!!!!=====
pax.h | 244 ======
sel_subs.c | 655 =================
tar.1 | 295 --------
tar.c | 1222 -!!!!!!!!!!!!!!!!!!!!!!=========
15 files changed, 628 insertions(+), 804 deletions(-), 3717 modifications(!), 4442 unchanged lines(=)
If blanks are ignored, the summary line would change to
12 files changed, 723 insertions(+), 288 deletions(-), 1101 modifications(!), 6963 unchanged lines(=)
While Red Hat has provided a port of Muller's pax from OpenBSD since 2000 (see changelog), Mark Sobel's book A Practical Guide to Red Hat Linux 8 (December 2002) says only
The syntax of the pax command is too complex to describe here (as you might expect from looking at all the options available to tar and cpio). If it exists in your system, consult the manual pages. The USENIX Association funded the development of a portable implementation of pax and placed it in the public domain, so this utility is now widely available. Refer to the pax man page for more information.
Checking further (see example), I found no support for Sobel's statement:
pax-1.5-2:
* Fri Jun 30 2000 Preston Brown <pbrown@redhat.com> - debian version, which is a port from OpenBSD's latest.
* Tue Mar 05 2002 Matt Wilson <msw@redhat.com> - pull PAX source tarball from the SuSE package (which is based off this one yet claims copyright on the spec file)
There is a port back to NetBSD in pkgsrc.se, described as “a port of OpenBSD pax for SuSE Linux by Thorsten Kukuk”.
The Austin Group has a credits page where they mention Gunnar Ritter's Heirloom Toolkit. It also refers to Schilling's pax, although the latter appears to be an error:
Working with the Open Source community
The group includes developers from the Open Source community. As part of acknowledging their valuable input the copyright holders have made several grants relating to use of the documentation in those projects. Some of these are listed: the Linux Man Pages project, the FreeBSD project, the NetBSD operating system, the Cygwin Project, Gunnar Ritter's Heirloom Toolkit and other tools, Joerg Schilling's pax and find, Jens Schweikardt book, and the ISPRAS Linux testing project.
For documentation on features, see the GNU tar manual. The manual's notion of history is in terms of random notes about features.
The mail in early 1988 from Jay Fenlason is a hint to when he began work on GNU tar. His progress was reported in successive GNU bulletins:
Bulletins 4 (February 4, 1988) and 5 (June 5, 1988) mention his work.
Bulletin 6 (January 6, 1989) lists the availability of GNU tar 1.07 on the beta tape (see ChangeLog for 1.13).
The earliest versions of GNU tar do not appear to be online. The earliest which you may find are (modified) versions 1.09 for MSDOS:
The two are the same, except that the FreeDOS files contain some additional DOS-specific files written by Kai Uwe Rommel to support direct disk access for OS/2 and DOS. Those would not have been incorporated into the GNU sources.
The accessible source-archives are not much help in researching its early history:
tar: it essentially starts in 1994, with a series of commits by François Pinard.
tar version 1.09 through 1.12
From the latter, the v1.09.tar.gz file is probably useful for comparisons. Comparing against Gilmore's second posting, you can see that GNU tar had grown somewhat (as well as discarding some pieces, such as the manual page in favor of the “texinfo” file):
Makefile | 247 ++!===
PORTING | 57 -
README | 54 -
TODO | 55 -
buffer.c | 1352 +++++++++++++++++++++!!!!!!!==============
create.c | 1276 ++++++++++++++++++++++!!================
diffarch.c | 721 ++++++++++++-!!=======
extract.c | 747 +++++++++!=============
getoldopt.c | 89 ==
list.c | 726 +++++++!==============
msd_dir.c | 218 ======
msd_dir.h | 41 =
names.c | 135 ===
open3.h | 69
paxutils-1.09/COPYING | 249 +++++++
paxutils-1.09/ChangeLog | 636 ++++++++++++++++++++
paxutils-1.09/getdate.y | 882 ++++++++++++++++++++++++++++
paxutils-1.09/getopt.c | 596 ++++++++++++++++++
paxutils-1.09/getopt.h | 102 +++
paxutils-1.09/getopt1.c | 160 +++++
paxutils-1.09/gnu.c | 605 +++++++++++++++++++
paxutils-1.09/mangle.c | 226 +++++++
paxutils-1.09/rmt.h | 77 ++
paxutils-1.09/rtape_lib.c | 620 +++++++++++++++++++
paxutils-1.09/rtape_server.c | 226 +++++++
paxutils-1.09/tar.texinfo | 1289 ++++++++++++++++++++++++++++++++++++++++
paxutils-1.09/update.c | 534 ++++++++++++++++
paxutils-1.09/version.c | 90 ++
port.c | 1319 ++++++++++++++++++++++++=================
port.h | 47
tar.1 | 215 ------
tar.5 | 217 ------
tar.c | 1225 +++++++++++++++++++++++!!!============
tar.h | 297 +++-======
wildmat.c | 151 ===
35 files changed, 10391 insertions(+), 671 deletions(-), 786 modifications(!), 3702 unchanged lines(=)
The change-logs for GNU tar are helpful, since only a half-dozen people have done a significant number of commits to its source archives. Using the script which I wrote for counting changelogs, here are the percentages for developers with at least one percent of the total:
Percent  Name
    2.8  David J MacKenzie
   20.8  François Pinard
    1.5  Jay Fenlason
    2.9  Michael I Bushnell
   28.5  Paul Eggert
    1.3  Pavel Raiskup
   37.3  Sergey Poznyakoff
    4.9  “other”
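The counting itself is simple. The following Python sketch is not the script mentioned above, only an illustration of the idea; it assumes entry headers of the form “date, name, address in angle brackets” and counts one entry per header, and the sample text uses invented names rather than real ChangeLog data.

```python
import re
from collections import Counter

# Assumed header form: "YYYY-MM-DD  Name  <address>" at the start of a line.
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.+?)\s+<[^>]*>\s*$")


def shares(changelog_text, threshold=1.0):
    """Return {name: percent of entries}, folding names below the threshold
    into a single "other" item."""
    counts = Counter()
    for line in changelog_text.splitlines():
        match = HEADER.match(line)
        if match:
            counts[match.group(2)] += 1
    total = sum(counts.values())
    result = {}
    other = 0.0
    for name, count in counts.items():
        percent = 100.0 * count / total
        if percent >= threshold:
            result[name] = round(percent, 1)
        else:
            other += percent
    if other:
        result["other"] = round(other, 1)
    return result


# Invented sample, for illustration only.
SAMPLE = """\
2001-03-04  A. Author  <a@example.org>

\t* tar.c: Some change.

2001-03-02  B. Builder  <b@example.org>

\t* Makefile: Another change.

2001-02-27  A. Author  <a@example.org>

\t* list.c: A third change.

2001-02-20  A. Author  <a@example.org>

\t* extract.c: A fourth change.
"""

if __name__ == "__main__":
    for name, percent in sorted(shares(SAMPLE).items()):
        print("%5.1f  %s" % (percent, name))
```

Whether to count entries, changed files or lines is a design choice; counting entries is the simplest and is what this sketch does.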
GNU tar releases were not at uniform intervals, but it is still useful to see how the contributions break down by time:
Version Date DJM FP JF MIB PE PR SP
1.28 2014-07-27 11.7 5.5 76.6
1.27 2013-10-05 30.5 16.2 43.8
1.26 2011-03-12 72.1 25.6
1.25 2010-11-07 46.9 53.1
1.24 2010-10-24 76.5 23.5
1.23 2010-03-10 92.3
1.22 2009-03-09 100.0
1.21 2008-12-27 96.8
1.20 2008-05-05 8.1 87.8
1.19 2007-10-10 94.4
1.18 2007-06-29 6.7 56.7
1.17 2007-06-08 31.0 68.1
1.16 2006-10-21 19.0 78.6
1.15 2004-12-20 10.0 88.7
1.14 2004-05-11 61.6 33.1
1.13 1997-07-08 9.8 61.0 4.7 9.0 11.5
1.12 1997-04-25 100.0
1.11 1992-09-09 66.9 32.4
1.10 1991-07-01 10.0 25.0 56.0
1.09 1990-10-16 18.6 78.0
1.08 1990-01-26 34.4 29.7
1.07 1989-01-26 100.0
Schily tar (sometimes referred to as "star") was first published at the end of April 1997. It had not been published anywhere before that date.
For instance, Schilling commented in comp.unix.solaris 12/9/1996:
In article <5892f4$2...@news.Informatik.Uni-Oldenburg.DE>, Christian Kuehnke <Christia...@arbi.Informatik.Uni-Oldenburg.DE> wrote: > >j...@cs.tu-berlin.de (Joerg Schilling) writes: >> For a tar implementation that has no known bugs, will read all >> (currently except HP-UX) tar streams and is the fastest implementation ^^^^^^^^^^^^^^^^^^^^^^ if they contain device files >> at all (faster than ufsdump) look at: >> >> ftp://ftp.fokus.gmd.de/pub/unix/star >> >> for the rest of the goods. > >Nice. But why don't you provide the source? I always intended to provide star in source. There are some reasons, why I din't do this up to now: 1) I dont want star to go the same way as gnu tar You remember... Gnu tar has been first written in August 1985 by John Gilmore,ihnp4!hoptoad!gnu. It has been brought to the public at the Sun User Group meeting in december 1987 in San Jose as 'sugtar'. This version was really nice. The actual version has been ported to death. For this reason, I want to have star in my hands until I know the line for portability to other systems is clear. Star has been first written in 1982 by me. The main growth in functionality did come in May 1985. Although star has been designed to be very portable, id did run only on UNOS, SYSVr0-2, SunOS and Solaris. The major porting effort has been taken in 1994. It now runs on SunOS, Solaris, HP-UX, IRIX, Linux, DG/UX, AIX 2) Makefile system In May 1996 I made a makefile sytstem that allows simultaneous compilation on all supported platforms. This still needs some fine tuning until it may do the way to the public. I expect star to be available in souce in January 1997. Joerg P.S. Star has been ported to DG/UX with the help of Data General. It will soon be available on Data General systems as a fast backup. PP.S. GMD in Birlinghoven currently switches from 2MB/s X.25 to 34 MB/s ATM. For this reason our ftp server may not be reacheable from outside germany until the mid of the next week. 
-- EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin j...@cs.tu-berlin.de (uni) If you don't have iso-8859-1 j...@fokus.gmd.de (work) chars my name is URL: http://www.fokus.gmd.de/usr/schilling J"org Schilling
The actual announcement at the end of April 1997 was much longer (and was cross-posted to 26 newsgroups):
Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!news.mira.net.au!news.netspace.net.au!news.mel.connect.com.au!munnari.OZ.AU!news.Hawaii.Edu!news.caldera.com!news.eli.net!uunet!in1.uu.net!160.45.4.4!fu-berlin.de!cs.tu-berlin.de!js From: js@cs.tu-berlin.de (Joerg Schilling) Newsgroups: comp.unix.admin,comp.unix.misc,alt.os.linux,alt.sys.sun,bln.comp.sun,bln.comp.unix,comp.os.linux.development.apps,comp.os.linux.misc,comp.sys.hp.apps,comp.sys.hp.misc,comp.sys.sgi.admin,comp.sys.sgi.apps,comp.sys.sgi.misc,comp.sys.sun.admin,comp.sys.sun.apps,comp.sys.sun.misc,comp.unix.aix,comp.unix.bsd.freebsd.misc,comp.unix.solaris,de.comp.os.linux.misc,de.comp.os.unix,linux.dev.admin,linux.dev.apps,maus.os.linux,maus.os.linux68k,maus.os.unix,uk.comp.os.linux Subject: STAR (tape archiver) source code released Date: 30 Apr 1997 10:57:06 GMT Organization: Technical University of Berlin, Germany Lines: 108 Distribution: inet Message-ID: <5k78i2$fht$1@news.cs.tu-berlin.de> NNTP-Posting-Host: 130.149.25.72 Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Summary: Star is a fast and Posix compliant tape archiver Xref: euryale.cc.adfa.oz.au comp.unix.admin:57542 comp.unix.misc:29041 alt.os.linux:20710 alt.sys.sun:11011 comp.os.linux.development.apps:32517 comp.os.linux.misc:172775 comp.sys.hp.apps:6817 comp.sys.hp.misc:11233 comp.sys.sgi.admin:46008 comp.sys.sgi.apps:14655 comp.sys.sgi.misc:30303 comp.sys.sun.admin:86045 comp.sys.sun.apps:15307 comp.sys.sun.misc:29517 comp.unix.aix:98941 comp.unix.bsd.freebsd.misc:40018 comp.unix.solaris:105077 de.comp.os.unix:409 Star, the fastest tar archiver for UNIX is now available in source. Star has many improvements compared to other tar imlementations (including gnu tar). See below for a short description of the highlight of star. 
Star is located on: ftp://ftp.fokus.gmd.de/pub/unix/star Revision history (short) 1982 First version on UNOS (extract only) 1985 Port to UNIX (fully funtional version) 1985 Added pre Posix method of handling special files/devices 1986 First experiments with fifo as external process. 1993 Remote tape access 1993 diff option 1994 Fifo with shared memory integrated into star 1994 Very long filenames and sparse files 1994 Gnutar and Ustar(Posix) handling added 1994 Xstar format (extended Posix) defined and introduced 1995 Ported to many platforms Supported platforms: SunOS Solaris Linux HP-UX DG/UX IRIX AIX FreeBSD Joerg ------------------------------------------------------------- Star is the fastest known implementation of a tar archiver. Star is able to make backups with more than 12MB/s if the disk and tape drive support such a speed. This is more than double the speed that ufsdump will get. Ampex got 13.5 MB/s with their new DLT tape drive. Ufsdump got a maximum speed of about 6MB/s with the same hardware. Star development started 1982, development is still in progress. The current version of star is stable and I never did my backups with other tools than star. Its main advantages over other tar implementations are: fifo - keeps the tape streaming. This gives you faster backups than you can achieve with ufsdump, if the size of the filesystem is > 1 GByte. pattern matcher - for a convenient user interface (see manual page for more details). To archive/extract a subset of files. sophisticated diff - user tailorable interface for comparing tar archives against file trees This is one of the most interesting parts of the star implementation. no namelen limitation - Pathnames up to 1024 Bytes may be archived. (The same limitation applies to linknames) This limit may be expanded in future without changing the method to record long names. 
deals with all 3 times - stores/restores all 3 times of a file (even creation time) may reset access time after doing backup does not clobber files - more recent copies on disk will not be clobbered from tape This may be the main advantage over other tar implementations. This allows automatically repairing of corruptions after a crash & fsck (Check for differences after doing this with the diff option). automatic byte swap - star automatically detects swapped archives and transparently reads them the right way automatic format detect - star automatically detects several common archive formats and adopts to them. Supported archive types are: Old tar, gnu tar, ansi tar, star. fully ansi compatible - Star is fully ANSI/Posix 1003.1 compatible. See README.otherbugs for a complete description of bugs found in other tar implementations. This is the first source release of star that I put on the net. Have a look at the manual page, it is included in the distribution. Author: Joerg Schilling Seestr. 110 D-13353 Berlin Germany Email: joerg@schily.isdn.cs.tu-berlin.de, js@cs.tu-berlin.de schilling@fokus.gmd.de Please mail bugs and suggestions to me. -- EMail: joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin js@cs.tu-berlin.de (uni) If you don't have iso-8859-1 jes@fokus.gmd.de (work) chars my name is URL: http://www.fokus.gmd.de/usr/schilling J"org Schilling
There are a few points which the reader may not have noticed:
There has been (until perhaps this page) no published benchmark for the performance of Schily tar.
The long list of milestones is interesting but not relevant to Schilling's frequent statement:
the oldest free TAR implementation
Publication date is the relevant detail. Schily tar was first published eight years after GNU tar.
Regarding "Very long filenames", the source copyright for longnames.c says
Copyright (c) 1993, 1995 J. Schilling
which tells us that it may have been started in 1993, but was not complete until 1995. Further, the file's SCCS-ID shows
/* @(#)longnames.c 1.13 96/06/26 Copyright 1993, 1995 J. Schilling */
lengthening the development period another year. In any case, the long-names feature for GNU tar was released earlier than that. Without a published source, it is not possible to determine the extent to which Schilling borrowed, adapted or was otherwise influenced by the previously published work.
The initial release has no change-log, from which one might get clues to investigate inconsistencies in the release announcements.
Schilling added a change-log to version 1.1, released about a month later. The end of that file shows a problem:
Sun Mar 3 17:20:19 1991 Joerg Schilling <joerg@schily.isdn.cs.tu-berlin.de> * buffer.c 1.1 date and time created 91/01/31 17:20:19 by joerg Fri Jun 30 12:01:59 1989 Joerg Schilling <joerg@schily.isdn.cs.tu-berlin.de> * star.c 1.2 star divided into (star extract list create) ... SCCS revision info lost First full version made in 1986
It also notes further development changes to long-names:
Mon Jun 30 01:12:08 1997 Joerg Schilling <joerg@schily.isdn.cs.tu-berlin.de> * longnames.c 1.16 Avoid strcatl() for speed f_name/f_lname bug and bug with non-initialized m_add Mon Jun 9 21:25:18 1997 Joerg Schilling <joerg@schily.isdn.cs.tu-berlin.de> * longnames.c 1.15 NAMSIZ -> props.pr_maxsname/props.pr_maxslname Mon Jun 9 16:56:44 1997 Joerg Schilling <joerg@schily.isdn.cs.tu-berlin.de> * longnames.c 1.14 Bug that caused very long directory names from command line to overwrite the stack (av[i+1)
That is, there is no usable change-history before 1989, and the date given for a complete version is at the outset inconsistent with the release announcement:
1985 Port to UNIX (fully funtional version)
Checking dates, Schilling's change-log started almost six
months after the first public release of GNU tar 1.07 in
January 1989.
If there had been a published version of Schily tar in 1989,
we could gauge how much it had borrowed from BSD tar, and
continuing, how GNU tar influenced Schily tar. But there is
that eight-year delay.
Lists of obscure features (such as "Ustar") get little attention. Numbers are what get readers' attention. Here is one (cited in the usual source of misinformation), from Unix Backup and Recovery by W. Curtis Preston, O'Reilly, 1999:
A Really Fast tar Utility: star
The star utility is the fastest known implementation of tar. It has been tested at speeds exceeding 14 MB/s. (This is more than double the speed that dump gets.) star development started in 1982 and is still in progress. star's main advantages over other tar implementations are:
- FIFO
This is a “double-buffering” system that keeps the tape streaming. This gives you faster backups than you can achieve with dump, if the size of the filesystem is > 1GB.
- Sophisticated diff
It has a user-tailorable interface for comparing tar archives against file trees.
- Longer pathname length
You may archive pathnames up to 1024 bytes, as you can with dump.
- Does not clobber files
More recent copies on disk will not be clobbered from the backup volume. This may be the main advantage over other tar implementations. This allows automatic repair of a corrupted filesystem. (You can check for differences after doing this with the diff option.)
- Automatic byte swap
star automatically detects swapped archives and transparently reads them the right way.
star is available from ftp://ftp.fokus.gmd.de/pub/unix/star.
Both the Schily tar release announcement and Preston's summary are quoted here to make it simpler for the reader to observe how the summary in the book is based on the release announcement. Preston made some adjustments:
ufsdump was altered to “dump”, and
the 13.5 MB/s figure cited for comparison against Ampex was conflated into 14 MB/s attributed to the tar program.
The telling point is that Preston did not add a paragraph or two detailing how the performance was measured.
By the way, star (Schily tar) is not mentioned in the revised edition Backup & Recovery: Inexpensive Backup Solutions for Open Systems (2007). Instead, Preston says (page 106):
Use GNU tar if You Can
GNU tar is an extremely popular utility. Beside being able to read an archive written by any other version of tar, it adds a significant level of functionality. Here are some of its most popular advancements:
- The -d option performs a diff compare between the archive and a filesystem. It does this by reading the tape and comparing its contents against the files that it finds in the filesystem. Any differences are reported.
- The -a option resets access times (atime).
- The -f option runs a script when tar reaches the end of a volume. This can be used to automatically swap volumes with a media changer.
- The -Z and -z options automatically pass the archive through compress or gzip, respectively.
- The -f option supports remote device names.
- By default, GNU tar suppresses a leading slash on absolute pathnames while creating or reading a tar archive. (You can suppress this with the -p option.)
- Some people also prefer the GNU style of arguments that are offered by GNU tar. Instead of tar cvf, you can specify tar -create -verbose -file.
Finally (perhaps not the last word on the topic), there is bsdtar, built upon libarchive (originally on Tim Kientzle's webpage).
As Tim Kientzle relates, he began work on libarchive in 2003 for use in installers.
The first announcement was at the end of 2003:
libarchive/bsdtar snapshot available Tim Kientzle tim at kientzle.com Mon Dec 22 21:17:35 PST 2003 A fairly complete snapshot of libarchive and bsdtar, including source code, complete documentation, and some background about why I'm doing this and what I hope to accomplish is now available: http://people.freebsd.org/~kientzle/libarchive/ It needs a lot of testing still, but is getting to the point that someone other than me should be able to make sense of it. ;-) Feedback appreciated. Tim Kientzle kientzle at freebsd.org
Initially, commits were made to FreeBSD's CVS repository. As of 2015, you can see this in the SVN branches for FreeBSD releases 6, 7, 8 and 9. Starting with FreeBSD 10, development moved to GitHub.
Initial import of libarchive. What it is: A library for reading and writing various streaming archive formats, especially tar and cpio. Being a library, it should be easy to incorporate into pkg_* tools, sysinstall, and any other place that needs to read or write such archives. Features: * Full automatic detection of both compression and archive format. * Extensible internal architecture to make it easy to add new formats. * Support for "pax interchange format," a new POSIX-standard tar format that eliminates essentially all of the restrictions of historic formats. * BSD license Thanks to: jkh for pushing me to start this work, gordon for encouraging me to commit it, bde for answering endless style questions, and many others for feedback and encouragement. Status: Pretty good overall, though there are still a few rough edges and the library could always use more testing. Feedback eagerly solicited.
The NEWS file lists the first milestone:
May 18, 2004: bsdtar can read Solaris, HP-UX, Unixware, star, gtar, and pdtar archives.
NetBSD's pkgsrc repository shows
The Git repository history only goes back to April 2008.
The wiki shows some details of releases.
There are older tarballs in the archive section, going back to February 2006.
With 74 contributors, 4544 commits as of December 7, 2015,
it is comparable in activity to GNU tar (45 contributors,
5448 commits).
That measure may be biased in favor of libarchive,
since the summary lists 3 branches.
From October 1992 through May 2005, I worked initially to collect useful development tools, for use by myself and other developers. I also provided fixes and feedback (e.g., cproto, mawk, vile). After a few years I was involved in development of these tools, to follow up on the fixes I had made, and became more selective about which to become involved with.
I gauged program quality by compiling candidates with gcc compiler warnings turned on, as well as doing test-builds with Unix compilers. For example, I used this script:
#!/bin/sh
# these are my normal development-options
OPTS="-Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow -Wconversion"
gcc $OPTS "$@"
That made it simpler:
good code had few warnings; I could send a patch knowing that it would be treated properly.
bad code had many warnings; I simply deleted the program.
in between meant that I would (as I did for ncurses) start by sending a set of patches to clean those up before addressing my real issue.
For instance, that was what I had in mind when I sent mail to Paul Eggert in 1993, suggesting improvements to rcs. The discussion was inconclusive. A few years later, I read his Usenet postings (such as Re: Reverse function for gmtime()?), with interest.
Still later (probably 1997 or 1998, though I am unable to locate it via Google), I was interested to note an exchange between Eggert and Schilling. Schilling was accusing Eggert of having deliberately implemented long-name support in GNU tar in a way designed to make it incompatible with POSIX. Schilling, of course, phrased his remarks in a more emphatic manner than I report here.
I examined the GNU tar source and read its change-log. According to that (reading it again):
Eggert was cited for only a couple of bug reports for GNU tar before becoming its maintainer in October 1997.
The work mentioned was apparently done by Jay Fenlason
(seen in mention of name mangling late in 1990)
and Michael I Bushnell (completed for version 1.11, September
1992).
I followed up by downloading a copy of Schilling's program. Of course, I screened it for compiler warnings. It was "in between", which calls for a collaborative effort. However, viewing the episode with Eggert, it was obvious that Schilling was no improvement in comparison to Eric Raymond. Attempting to collaborate with Schilling would be comparable to Sindbad's adopting the Old Man of the Sea for a traveling companion.
So I deleted it.
I had occasion to revisit long filenames with tar for ncurses. Juergen Pfeifer added several filenames for the Ada95 binding which were long. That was because they (like Java class names versus filenames) had to match package names, which were long.
Despite my qualms, this was not initially a problem with tar. Later, that changed, and since problem reports were not frequent, it took a while to notice and address the problem. Here are a few mail interchanges to illustrate.
I tried untar'ing a file on ClarkNet's Solaris machine:
From florian@suse.de Sat Apr 3 01:19:48 1999 Received: from smtp-gw.vma.verio.net (smtp-gw.vma.verio.net [207.97.20.30]) by loas.clark.net (8.8.8/8.8.8) with ESMTP id BAA29138 for <dickey@clark.net>; Sat, 3 Apr 1999 01:19:48 -0500 (EST) Received: from Cantor.suse.de (Cantor.suse.de [194.112.123.193]) by smtp-gw.vma.verio.net (8.9.3/8.9.3) with ESMTP id BAA15725 for <dickey@clark.net>; Sat, 3 Apr 1999 01:20:03 -0500 (EST) Received: from Galois.suse.de (Galois.suse.de [194.112.123.130]) by Cantor.suse.de (Postfix) with ESMTP id F084632CE2 for <dickey@clark.net>; Sat, 03 Apr 1999 08:19:13 +0200 (MEST) Received: from knorke.saar.de (knorke.suse.de [10.0.0.254]) by Galois.suse.de (Postfix) with ESMTP id CDE529410 for <dickey@clark.net>; Sat, 3 Apr 1999 08:19:12 +0200 (MEST) Received: (from florian@localhost) by knorke.saar.de (8.8.8/8.8.8) id IAA08351 for dickey@clark.net; Sat, 3 Apr 1999 08:19:12 +0200 From: Florian La Roche <florian@suse.de> Date: Sat, 3 Apr 1999 08:19:12 +0200 To: dickey@clark.net Subject: Re: progress? Message-ID: <19990403081912.A8309@knorke.saar.de> References: <19990403004604.A7286@knorke.saar.de> <199904030226.VAA10981@shell.clark.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.95.4i In-Reply-To: <199904030226.VAA10981@shell.clark.net>; from dickey@clark.net on Fri, Apr 02, 1999 at 09:26:25PM -0500 Sender: florian@knorke.saar.de Status: RO Content-Length: 618 Lines: 19 > close - but there's a problem. I can see the contents, but I get a directory > checksum error trying to untar it. Here's what I get > > -rw------- 1 dickey ipusers 1378639 Apr 2 1999 ncurses-5.0-beta1.tar.gz > > sum: > 60558 2693 ncurses-5.0-beta1.tar.gz > > sum -r: > 31196 2693 ncurses-5.0-beta1.tar.gz knorke:~/source $ sum -r ncurses-5.0-beta1.tar.gz 31196 1347 I cannot reproduce any problem with that file. I have also tried to unpack it on the GNU machine and didn't get any error. Can you try it on a Linux machine? 
(At least with GNU tar to unpack it?) Florian La Roche
It worked for Potorti, but neither of us knew what the @LongLink was:
From dickey Fri Jul 30 09:56:17 1999 Subject: Re: File mode specification error on a tar.gz file To: F.Potorti@cnuce.cnr.it (Francesco Potorti` <F.Potorti@cnuce.cnr.it>) Date: Fri, 30 Jul 1999 09:56:17 -0400 (EDT) In-Reply-To: <m11ABia-001i1aC@fly.cnuce.cnr.it> from "Francesco Potorti` <F.Potorti@cnuce.cnr.it>" at Jul 30, 99 02:24:56 pm X-Mailer: ELM [version 2.4 PL25] Content-Type: text Status: RO Content-Length: 1067 Lines: 35 > > emacs 20.4 > > Download http://www.clark.net/pub/dickey/ncurses/ncurses.tar.gz and put > it in your current directory. > > emacs -q > M-x auto-compression-mode RET > C-x d RET > go to the ncurses.tar.gz line > RET > --> unzipping ncurses.tar.gz...done > Parsing tar file...done > File mode specification error: (wrong-type-argument integerp nil) > > The likely reason is that gnu tar 1.12, when run in listing mode over > that archive, outputs one line like this: > > Lr--r--r-- root/root 103 1999-06-15 03:03 ././@LongLink unknown file type `L' hmm (I have had occasional problems reading those tar files with non-GNU tar, but not seen any thing that I can pinpoint). I'll repack with Solaris tar (which works, afaik). -- did that, will see if I can identify the bogus 'L' entry. thanks. > Even after having read the tar docs I don't understand if that is normal > or not. Anyway, if possible, it would be nice for emacs to handle these > errors. -- Thomas E. Dickey dickey@clark.net http://www.clark.net/pub/dickey
The reason became clear after releasing ncurses 5.2 in 2000 with a few reports from people using Mac OS X and FreeBSD. Starting at that point, I changed my release process to use Solaris tar to create the release tar-balls for ncurses.
Alternatively, I could have used Schily tar. But I chose not to:
If it had some quirk which meant that its files were not readable by some other flavor of tar, then that meant the receiver would have to provide a compatible tar.
Early on, there were few groups which provided precompiled packages for Schily tar.
While it is available as an add-on, nowhere is it the default tar program.
Notably, it is missing from Debian (see package tracker, and bugs, in particular #350624).
Likewise, it did not become part of OpenSolaris (see discussion).
It is not in OpenBSD ports. See this mailing list discussion in 2005 for an explanation.
It does not appear to be in Arch Linux (see package search).
Likewise missing from the HP-UX Porting & Archiving Center (see package search).
In the cases of Debian and OpenSolaris, Schilling antagonized the people whose cooperation was needed (see Garrett D'Amore followup in slides, e.g., provoking this response).
For more context, see the newsgroup threads Re: star vs GNU tar on comp.os.linux.misc and “tar --- sun or GNU” on comp.unix.solaris in 2003, as well as the bug-tar archives (which only go back to 2003).
Before 2003, it is hard to find bug reports (to provide background for this dispute), although there was a mail alias (see example).
The BACKLOG and THANKS files in GNU tar 1.12 do not give hints regarding Schilling's statements.
Using Solaris tar was only a stopgap fix. Fortunately, it turned out, on investigation, that only the development versions (with 8-digit year/month/day added to the pathname) produced pathnames long enough to pass the 100-character threshold.
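The 100-character threshold can be checked mechanically. The following is a minimal sketch, assuming Python's standard tarfile module; the function names and the sample pathnames are hypothetical, chosen only to resemble a release name versus a dated development name. It lists the members of an archive whose names do not fit in the 100-byte name field of the original tar header.

```python
import io
import tarfile

NAME_FIELD = 100  # bytes available for the name in the original tar header


def names_over_threshold(archive, limit=NAME_FIELD):
    """Return member names in `archive` (a file object) longer than `limit` bytes."""
    with tarfile.open(fileobj=archive, mode="r:*") as tar:
        return [name for name in tar.getnames()
                if len(name.encode("utf-8")) > limit]


def sample_archive(names):
    """Build an in-memory archive of empty files, purely for demonstration."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    return buf


# Hypothetical pathnames: the dated directory adds nine characters.
release = "ncurses-5.9/Ada95/src/" + "x" * 40 + ".ads"
snapshot = "ncurses-5.9-20150606/Ada95/src/" + "x" * 80 + ".ads"
print(names_over_threshold(sample_archive([release, snapshot])))
```

Only the second name is reported, since it is 115 bytes long. A check of this kind says nothing about how a given tar program encodes such a name; it only identifies the names that force it to choose an encoding.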
The investigation was part of my check-list for ncurses6. Some of the results are interesting, hence this page.
I began collecting information for this investigation in 2014, creating an outline of this page.
Later, in June 2015, I built each of the GNU tar and Schily tar versions mentioned here using Debian 6 (gcc 4.4.5). I also wrote a test program to verify interoperability of the various tar formats with pathnames of different lengths.
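The test program itself is not reproduced here. As a rough illustration of the idea only (not the program mentioned above), this sketch uses Python's tarfile module to write a single empty file under a given pathname in each of three formats, and reports how the long name was handled. The function name probe and the result labels are made up for this example, and the results describe Python's writer, not GNU tar, Schily tar or BSD tar.

```python
import io
import tarfile

FORMATS = {
    "ustar": tarfile.USTAR_FORMAT,
    "gnu": tarfile.GNU_FORMAT,
    "pax": tarfile.PAX_FORMAT,
}


def probe(name, fmt):
    """Archive one empty file called `name` in format `fmt`; describe the result."""
    buf = io.BytesIO()
    try:
        with tarfile.open(fileobj=buf, mode="w", format=FORMATS[fmt]) as tar:
            tar.addfile(tarfile.TarInfo(name))
    except ValueError:
        return "rejected"          # the writer could not represent the name
    raw = buf.getvalue()
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r:") as tar:
        if tar.getnames() != [name]:
            return "mangled"       # the name did not survive a round trip
    if b"././@LongLink" in raw:
        return "longlink"          # GNU extension: extra 'L' header entry
    if b"@PaxHeader" in raw:
        return "paxheader"         # POSIX.1-2001 extended header
    return "plain"                 # name fits the classic header fields


for length in (100, 101):
    for fmt in FORMATS:
        print(length, fmt, probe("a" * length, fmt))
```

A name with a directory separator in a suitable place (for example, 60 characters, a slash, and 60 more) is still "plain" in ustar format, because the header's prefix field holds the leading part.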
In reviewing the initial results, I found that I should also include multiple versions of BSD tar, to comment on its influence vis-à-vis GNU and Schily tar.
In this study, I acquired for reference these versions of GNU tar:
1.11.8, 1.12, 1.13, 1.14, 1.15, 1.15.1, 1.16, 1.16.1, 1.17, 1.18, 1.19, 1.20, 1.21, 1.22, 1.23, 1.24, 1.25, 1.26, 1.27, 1.27.1, 1.28
For research, there is also a Git repository, but it is not very useful:
The earliest published version (1.07) cannot be found. I found it only mentioned as being on ccb.ucfs.edu in December 1989, and on prep.ai.mit.edu since May 1989.
I also obtained these versions of Schily tar from gd.tuwien.ac.at:
1.0, 1.1, 1.2, 1.3, 1.3.1, 1.4, 1.4.1, 1.4.2, 1.4.3, 1.5, 1.5.1
There is no publicly-accessible source repository for Schily tar. There is a Mercurial repository for Schillix, which matches Illumos up to mid-2010 (when OpenSolaris ended, as reported by The Register), but Schily tar is not there, and comparing the two repositories, it is immediately apparent that Illumos is the ongoing reference implementation because its history continues well past that point, unlike Schillix.
For testing other programs, I used the packaged versions (mainly Debian 6–8 and Solaris 10):
These tar implementations provide several features. For instance, the newer versions provide options for gzip (and other) compression. While I do use those features, they are not that important because they only simplify the use of compression, but do not enhance it, e.g., by making it faster.
Adding compression support to tar is simpler than it may seem, provided that all it does is run an external program. I did this for diffstat with little effort (initially in 2000, later adding a configure check in 2006, etc). Modifying a program to use compression libraries takes appreciably more effort.
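To show how little is involved when the work is delegated to an external program, here is a minimal sketch in Python (it is not the diffstat code, which is in C). The function name tar_through is hypothetical. It assumes the compressor reads standard input and writes standard output, as gzip -c and bzip2 -c do.

```python
import subprocess
import tarfile


def tar_through(compressor_argv, paths, out_path):
    """Write a tar stream for `paths` into an external compressor's stdin.

    `compressor_argv` is the command to run, e.g. ["gzip", "-c"]; its
    standard output is saved in `out_path`.
    """
    with open(out_path, "wb") as out:
        proc = subprocess.Popen(compressor_argv,
                                stdin=subprocess.PIPE, stdout=out)
        try:
            # Mode "w|" writes a pure stream, which is what a pipe requires.
            with tarfile.open(fileobj=proc.stdin, mode="w|") as tar:
                for path in paths:
                    tar.add(path)
        finally:
            proc.stdin.close()     # signal end of input to the compressor
        if proc.wait() != 0:
            raise RuntimeError("compressor failed: %r" % (compressor_argv,))
```

A call such as tar_through(["gzip", "-c"], ["src"], "src.tar.gz") would then behave much like tar piped to gzip. The archiver neither knows nor cares which compressor is at the other end of the pipe, which is why this approach takes little effort compared with linking against a compression library.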
Often, comments are made that the compression is “not really part of tar”, which may or may not be accurate:
libarchive
.
Here is a table comparing the command-line support for compression in these tar implementations:
Date     Format    Program      Version  Feature
1995-06  compress  GNU tar      1.11.8   -Z option
1995-06  gzip      GNU tar      1.11.8   -z option
2004-05  bzip2     GNU tar      1.14     -j option
2010-03  xz        GNU tar      1.23     -J option
2008-04  compress  Schily tar   1.5      -Z option
2002-05  gzip      Schily tar   1.4      -z option
2008-04  bzip2     Schily tar   1.5      -j, -bz options
2013-01  xz        Schily tar   1.5.2    -xz option
2010-03  compress  BSD tar      2.8.3    -Z option
2010-03  gzip      BSD tar      2.8.3    -z option
2010-03  bzip2     BSD tar      2.8.3    -j, -y options
2010-03  xz        BSD tar      2.8.3    -J option
2012-05  compress  Solaris tar  5.11     -Z option
2012-05  gzip      Solaris tar  5.11     -z option
2012-05  bzip2     Solaris tar  5.11     -j option
N/A      xz        Solaris tar  N/A      N/A
2010-11  compress  GNU tar      1.25     auto-sense
2004-12  gzip      GNU tar      1.15     auto-sense
2004-12  bzip2     GNU tar      1.15     auto-sense
2010-10  xz        GNU tar      1.24     auto-sense
2002-05  compress  Schily tar   1.4      auto-sense
2002-05  gzip      Schily tar   1.4      auto-sense
2002-05  bzip2     Schily tar   1.4      auto-sense
2013-01  xz        Schily tar   1.5.2    auto-sense
2010-03  compress  BSD tar      2.8.3    auto-sense
2010-03  gzip      BSD tar      2.8.3    auto-sense
2010-03  bzip2     BSD tar      2.8.3    auto-sense
2010-03  xz        BSD tar      2.8.3    auto-sense
2012-05  compress  Solaris tar  5.11     auto-sense
2012-05  gzip      Solaris tar  5.11     auto-sense
2012-05  bzip2     Solaris tar  5.11     auto-sense
N/A      xz        Solaris tar  N/A      auto-sense
There are a few caveats:
-z option, but it did not work until version 1.4
-tvf”. It is necessary to omit the “-”.
compress, it is necessary to separate the -Z option from cf, e.g., star cf foo -Z bar.
-z option.
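The "auto-sense" entries in the table refer to recognizing a compressed archive without being told its format. One plausible way to do that, shown here as a sketch in Python, is to compare the first few bytes of the file with the well-known magic numbers of the four formats. This is an assumption about the general technique; it is not taken from the source of any of the tar programs in the table, and the name sense_compression is made up for the example.

```python
# Leading bytes ("magic numbers") of the four formats in the table.
MAGIC = (
    (b"\x1f\x9d", "compress"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
)


def sense_compression(head):
    """Guess the compression format from the first bytes of a file.

    Returns the format name, or None if no known magic number matches
    (for instance, an uncompressed tar archive).
    """
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return None


def sense_file(path):
    """Apply sense_compression() to the first six bytes of `path`."""
    with open(path, "rb") as fp:
        return sense_compression(fp.read(6))
```

Having identified the format, a program could then either run the matching external decompressor or call the corresponding library.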
To recap, tar compression is interesting to some extent, because it simplifies ad hoc commands involving tar. I do not use it in the archive script, which I use for preparing tarballs to distribute to others. Rather, in the script, tar pipes to gzip (or bzip2).