
What exactly is the GNU tar ././@LongLink "trick"?

Tags: interop, tar, gnu

I read that a tar entry type of 'L' (76) is used by GNU tar and GNU-compatible tar utilities to indicate that the next entry in the archive has a "long" name. In this case the header block with the entry type of 'L' usually encodes the name ././@LongLink.

My question is: where is the format of the next block described?

The format of a tar archive is very simple: it is just a series of 512-byte blocks. In the normal case, each file in a tar archive is represented as a series of blocks. The first block is a header block, containing the file name, entry type, modified time, and other metadata. The raw file data follows, using as many 512-byte blocks as required. Then comes the header for the next entry.
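To make the block layout concrete, here is a minimal sketch of how the size of an entry's data area can be derived from its header block. The field offsets (size at byte 124, type flag at byte 156) come from the standard ustar header layout rather than from anything stated above, and the names `tar_octal` and `tar_data_blocks` are made up for illustration.

```c
#include <stddef.h>

/* Offsets into the 512-byte header block (ustar layout; assumed here). */
enum {
    TAR_BLOCK    = 512,
    TAR_NAME_OFF = 0,   TAR_NAME_LEN = 100,
    TAR_SIZE_OFF = 124, TAR_SIZE_LEN = 12,
    TAR_TYPE_OFF = 156
};

/* Parse an octal ASCII field such as size or mtime.
   Leading spaces are skipped; parsing stops at the first non-octal byte. */
static unsigned long tar_octal(const unsigned char *field, size_t len)
{
    unsigned long value = 0;
    size_t i = 0;

    while (i < len && field[i] == ' ')
        i++;
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (unsigned long)(field[i] - '0');
    return value;
}

/* Number of 512-byte data blocks that follow a header declaring `size` bytes.
   The last block is zero-padded when size is not a multiple of 512. */
static unsigned long tar_data_blocks(unsigned long size)
{
    return (size + TAR_BLOCK - 1) / TAR_BLOCK;
}
```

The next header then starts `512 * (1 + tar_data_blocks(size))` bytes after the current one.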

If the filename is longer than will fit in the space allocated in the header block, GNU tar apparently uses what's known as "the ././@LongLink trick". I can't find a precise description of it.

When the entry type is 'L', how do I know how long the "long" filename is? Is the long name limited to 512 bytes, in other words, whatever fits in one block?

Most importantly: where is this documented?

asked Jan 16 '10 20:01 by Cheeso


2 Answers

Just by observation of a single archive, here's what I surmised about the 'L' entry type in tar archives, and the "././@LongLink" name:

The 'L' entry type marks a header that is followed by a series of 1 or more 512-byte blocks holding just the filename of a file or directory whose name is over 100 chars. For example, if the filename is 1200 chars long, then the size in the header block will be 1200, and 3 additional blocks of filename data will follow; the last block is only partially filled.

Following that series is another header block in the traditional form: a header with type '0' (regular file) or '5' (directory), followed by the appropriate number of data blocks with the entry data. In this header, the name is truncated to the first 100 characters of the actual name.
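The observation above can be sketched in C as follows. This is only an illustration under the layout just described, not the linked implementation: `read_long_name` is a hypothetical helper, the archive is assumed to be fully loaded in memory, and the offsets 124 (size) and 156 (type flag) are the usual ustar header positions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { BLOCK = 512, SIZE_OFF = 124, SIZE_LEN = 12, TYPE_OFF = 156 };

/* Given a buffer positioned at an 'L' header, return a malloc'ed copy of the
   long name and report how many bytes (header plus padded name blocks) were
   consumed. Returns NULL if this is not an 'L' entry or the data is cut short. */
static char *read_long_name(const unsigned char *buf, size_t avail,
                            size_t *consumed)
{
    size_t i = SIZE_OFF, end = SIZE_OFF + SIZE_LEN, size = 0;
    size_t padded;
    char *name;

    if (avail < BLOCK || buf[TYPE_OFF] != 'L')
        return NULL;

    /* The size field is octal ASCII and gives the length of the name data. */
    while (i < end && buf[i] == ' ')
        i++;
    for (; i < end && buf[i] >= '0' && buf[i] <= '7'; i++)
        size = size * 8 + (size_t)(buf[i] - '0');

    /* The name occupies whole 512-byte blocks; the last one is zero-padded. */
    padded = (size + BLOCK - 1) / BLOCK * BLOCK;
    if (avail - BLOCK < padded)
        return NULL;

    name = malloc(size + 1);
    if (name == NULL)
        return NULL;
    memcpy(name, buf + BLOCK, size);
    name[size] = '\0'; /* GNU tar typically counts a trailing NUL in size;
                          terminating again is harmless either way */
    *consumed = BLOCK + padded;
    return name;
}
```

The caller would then read the header at `buf + *consumed` as the real entry and use the returned string in place of its truncated name field.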

EDIT
See my implementation here: http://cheesoexamples.codeplex.com/SourceControl/changeset/view/99885#1868643

answered Oct 17 '22 07:10 by Cheeso


Information about all of this can be found in the libtar project:

http://www.feep.net/libtar/

The relevant header is libtar.h (as opposed to the POSIX tar.h), which explicitly defines entry types for a long filename and a long symbolic link.

You get the "fake" headers plus data for the long filename/link first, then the "real" header (minus the full filename and symbolic link target) after that:

HEADER type 'L'
BLOCKS of data with the real long filename
HEADER type 'K'
BLOCKS of data with the real symbolic link
HEADER type '0' (or '5' for directory, etc.)
BLOCKS of data with the actual file contents
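A reader for that sequence might look like the sketch below. It is not libtar's code: `struct entry`, `next_entry`, and the 4096-byte name buffers are hypothetical, the archive is assumed to be entirely in memory, an empty name is treated as the end-of-archive marker, and the ustar prefix field is ignored for brevity.

```c
#include <stddef.h>
#include <string.h>

enum { BLK = 512, NAME_LEN = 100, SIZE_OFF = 124, SIZE_LEN = 12,
       TYPE_OFF = 156, LINK_OFF = 157, NAME_BUF = 4096 };

struct entry {
    char   name[NAME_BUF];  /* full name, from 'L' data if present */
    char   link[NAME_BUF];  /* full link target, from 'K' data if present */
    char   type;            /* type flag of the real header */
    size_t size;            /* size of the real entry's data */
};

static size_t field_size(const unsigned char *hdr)
{
    size_t i = SIZE_OFF, end = SIZE_OFF + SIZE_LEN, v = 0;
    while (i < end && hdr[i] == ' ')
        i++;
    for (; i < end && hdr[i] >= '0' && hdr[i] <= '7'; i++)
        v = v * 8 + (size_t)(hdr[i] - '0');
    return v;
}

/* Read one logical entry starting at buf[*pos], folding any preceding 'L'
   and 'K' pseudo-entries into it. Advances *pos past the entry's data.
   Returns 1 on success, 0 at end of archive or on truncated input. */
static int next_entry(const unsigned char *buf, size_t len, size_t *pos,
                      struct entry *out)
{
    int have_name = 0, have_link = 0;

    while (*pos + BLK <= len) {
        const unsigned char *hdr = buf + *pos;
        size_t size, padded;
        char type;

        if (hdr[0] == '\0')
            return 0;                       /* zero block: end of archive */
        size = field_size(hdr);
        padded = (size + BLK - 1) / BLK * BLK;
        if (len - *pos - BLK < padded)
            return 0;                       /* data runs past the buffer */
        type = (char)hdr[TYPE_OFF];
        *pos += BLK + padded;

        if (type == 'L' || type == 'K') {   /* data blocks hold a long string */
            char *dst = (type == 'L') ? out->name : out->link;
            size_t n = size < NAME_BUF - 1 ? size : NAME_BUF - 1;
            memcpy(dst, hdr + BLK, n);
            dst[n] = '\0';
            if (type == 'L') have_name = 1; else have_link = 1;
            continue;                       /* the real header comes later */
        }

        if (!have_name) {                   /* fall back to the header fields */
            memcpy(out->name, hdr, NAME_LEN);
            out->name[NAME_LEN] = '\0';
        }
        if (!have_link) {
            memcpy(out->link, hdr + LINK_OFF, NAME_LEN);
            out->link[NAME_LEN] = '\0';
        }
        out->type = type;
        out->size = size;
        return 1;
    }
    return 0;
}
```

Calling `next_entry` in a loop until it returns 0 yields one `struct entry` per real file, directory, or link, with the 'L' and 'K' pseudo-entries already folded in.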

Of course, under MS-Windows you probably won't handle symbolic links, although symbolic links are said to work as of Win7 (and this is finally official in Win10).

Pertinent definition from libtar.h:

/* GNU extensions for typeflag */
#define GNU_LONGNAME_TYPE   'L'
#define GNU_LONGLINK_TYPE   'K'

answered Oct 17 '22 06:10 by Alexis Wilke