C - Storing a large group of files as a single resource

Tags: c, file, linux

Please forgive me if there is a glaringly obvious answer to this question; I haven't found it because I'm not entirely sure what I'm looking for. It may well be that this duplicates a question I haven't found; sorry.

I have a C executable that uses text, audio, video, icons and a variety of other file types. These files are stored locally; the folder structure is large and deep and would need to be installed alongside the application for it to operate correctly (not that I anticipate it being distributed; I'm looking to package my own work for convenience).

In my opinion it would be more convenient if the file library were stored in a single file that remained accessible to the application, for example alongside /usr/bin/APPLICATION or in the most appropriate location, and was read by the executable when required.

I searched for similar questions and found suggestions that indicated two possible options: Resource Files, which appear to be native to Windows, and Including files at compile time. The first question leads to an answer similar to the second and doesn't answer the question of whether resource files exist for Linux executables. It (like the second) looks at including the data file in the compilation process. This is not so useful, because if I only want to update my resources I'm forced to recompile the entire application (the media is dynamically added).

QUESTION: Is there a way to store a variety of file types in one single file accessible to an executable in Linux, and if so, how would you implement this?

My initial thought was to create a .zip or .gz file, which might also offer compression as an added bonus, but I have no idea how (or if it is even possible) to access data within such a file on the fly. I'm equally uncertain whether there is a specific file type or library that offers a more suitable solution. Also, I know virtually nothing about .dat files; could these be used in this context on a Linux system?

asked Aug 21 '15 23:08

Chortle

2 Answers

I do not understand why you would use a single file at all. Considering the added complexity (and increased chance of bugs creeping in) of file extraction and the associated overheads, I do not see how it would be "more convenient".

> I have a C executable that uses text, audio, video, icons and a variety of different file types.

So do many other Linux applications. The normal approach, when using package management, is to put the architecture-independent data (icons, audio, video, and so on) for application /usr/bin/YOURAPP in /usr/share/YOURAPP/, and architecture-dependent data (like helper binaries) in /usr/lib/YOURAPP/. It is extremely common for the latter two to be full directory trees, sometimes quite deep and wide.

For locally compiled stuff, it is common to put these in /usr/local/bin/YOURAPP, /usr/local/share/YOURAPP/, and /usr/local/lib/YOURAPP/ instead, just to avoid confusing the package manager. (If you check ./configure scripts or read Makefiles, this is the chief purpose of the PREFIX variable they support.)

It is also common for the /usr/bin/YOURAPP to be a simple shell script, setting environment variables, or checking for user-specific overrides (from $HOME/.YOURAPP/), ending up with exec /usr/lib/YOURAPP/YOURAPP.bin [parameters...], which replaces the shell with the actual binary executable without leaving the shell in memory.
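On the C side, the program then only needs to turn a relative resource name into a full path. The following is a minimal sketch, not a standard API: the environment variable YOURAPP_DATADIR (which a wrapper script like the one described above could set), the macro DEFAULT_DATADIR, and both function names are assumptions made up for illustration. A real build would normally derive the default from PREFIX.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compile-time default; a Makefile could override it with
   something like -DDEFAULT_DATADIR='"$(PREFIX)/share/YOURAPP"'. */
#ifndef DEFAULT_DATADIR
#define DEFAULT_DATADIR "/usr/share/YOURAPP"
#endif

/* Writes "<dir>/<relative>" into out. If dir is NULL or empty, the
   compile-time default is used instead.
   Returns 0 on success, -1 if the result does not fit in outsize bytes. */
int build_resource_path(char *out, size_t outsize,
                        const char *dir, const char *relative)
{
    assert(out != NULL && relative != NULL);

    if (dir == NULL || strlen(dir) == 0)
        dir = DEFAULT_DATADIR;

    int n = snprintf(out, outsize, "%s/%s", dir, relative);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}

/* Convenience wrapper: takes the directory from the (hypothetical)
   YOURAPP_DATADIR environment variable, if it is set. */
int resource_path(char *out, size_t outsize, const char *relative)
{
    return build_resource_path(out, outsize,
                               getenv("YOURAPP_DATADIR"), relative);
}
```

With this, resource_path(buf, sizeof buf, "icons/app.png") would yield /usr/share/YOURAPP/icons/app.png unless the environment variable points somewhere else, so the data tree can be updated or relocated without recompiling the executable.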

As an example, /usr/share/octave/ on my machine contains a total of 138 directories (in a hierarchy of up to 7 directories deep) and 1463 files; about ten megabytes of "stuff" all told. LibreOffice, Eagle, Fritzing, and KiCAD take hundreds of megabytes there each, so Octave is not an extreme example in any way either.

answered Sep 30 '22 01:09

Nominal Animal

You have several alternatives (TODO: add more ;)):

You can read some archive file format specifications, write code to read and write those archives, and waste your time doing so.

You can invent a dirty, simple file format, for example ("dsa" stands for "Dirty and Simple Archiver"):

#include <stdint.h>

// Located at the beginning of the file    
struct DSAHeader {
    char            magic[3];            // Shall be (char[]) { 'D', 'S', 'A' }
    unsigned char   endianness;          // The rest of the file is translated according to this field. 0 means little-endian, 1 means big-endian.
    unsigned char   checksum[16];        // MD5 sum of the whole file. (When calculating the checksum, this field is pseudo-filled with zeros.)
    uint32_t        fileCount;
    uint32_t        stringTableOffset;   // A table containing the files' names.
};

// A dsaHeader.fileCount-sized array of DSANodeHeader follows the DSAHeader.
struct DSANodeHeader {
    unsigned char   type;              // 0 means directory, 1 means regular file.
    uint32_t        parentOffset;      // Pointer to the parent directory, or zero if the node is in the root.
    uint32_t        offset;            // The node's type-dependent header starts here.
    uint32_t        nodeSize;          // In bytes for files, and in number of entries for directories.
    uint32_t        dataOffset;        // The file's data starts at this offset for files, and a pointer to the first DSADirectoryEntryHeader for directories.
    uint32_t        filenameOffset;    // Relative to the string table.
};

typedef uint32_t    DSADirectoryEntryHeader;    // Offset to the entry's DSANodeHeader

The "string table" is a contiguous sequence of null-terminated character strings.
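As a rough sketch of how a reader for this format might start, the code below validates the header and looks up a name in the string table. It rests on assumptions the description above does not settle: the on-disk fields are taken to be packed with no padding (so the header occupies 28 bytes), stringTableOffset is taken to be measured from the start of the file, and the names DSAHeaderInfo, dsa_read_u32, dsa_parse_header and dsa_name_at are invented here. Fields are decoded byte by byte rather than by reading the struct directly, so the result does not depend on the host's endianness or struct padding. The checksum is not verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 3 (magic) + 1 (endianness) + 16 (checksum) + 4 + 4 bytes,
   assuming the on-disk layout has no padding. */
#define DSA_HEADER_SIZE 28u

/* Hypothetical in-memory view of the fields we care about. */
struct DSAHeaderInfo {
    int      big_endian;
    uint32_t fileCount;
    uint32_t stringTableOffset;
};

/* Decode a 32-bit unsigned integer stored in the given byte order. */
static uint32_t dsa_read_u32(const unsigned char *p, int big_endian)
{
    if (big_endian)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Parse the header at the start of buf (len bytes of archive data).
   Returns 0 on success, -1 if the data is too short, the magic or
   endianness byte is invalid, or the string table offset is out of range. */
int dsa_parse_header(const unsigned char *buf, size_t len,
                     struct DSAHeaderInfo *out)
{
    assert(buf != NULL && out != NULL);

    if (len < DSA_HEADER_SIZE || memcmp(buf, "DSA", 3) != 0 || buf[3] > 1)
        return -1;

    out->big_endian        = buf[3];
    out->fileCount         = dsa_read_u32(buf + 20, out->big_endian);
    out->stringTableOffset = dsa_read_u32(buf + 24, out->big_endian);

    return (out->stringTableOffset > len) ? -1 : 0;
}

/* Return the name that starts filenameOffset bytes into the string table,
   or NULL if the offset is out of range or no terminating NUL is found
   before the end of the table. */
const char *dsa_name_at(const unsigned char *table, size_t table_len,
                        uint32_t filenameOffset)
{
    if (filenameOffset >= table_len)
        return NULL;
    if (memchr(table + filenameOffset, '\0',
               table_len - filenameOffset) == NULL)
        return NULL;
    return (const char *)(table + filenameOffset);
}
```

Node headers could be decoded the same way, one field at a time. Whole-file data can be handed to these functions from a buffer filled with fread or from a read-only mmap of the archive.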

This format is very simple (and portable ;)). And, as a bonus, if you want (de)compression, you can run the whole file through something like gzip, bzip2, or xz (those programs/formats are archiver-agnostic, i.e., they do not depend on tar, contrary to common belief).

As a last (or first?) resort, you may use an existing library/API for manipulating archives and compressed file formats.

Edit: Added support for directories :).

answered Sep 30 '22 02:09
