Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to store a hash in extended file attributes on OS X with Java?

Preface I am working on a platform in-depended media database written in java where the media files are identified by a file hash. The user shall be able to move the files around, so I do NOT want to rely on any file path. Once imported, I store the path and the hash in my database. I developed a fast file-hash-id algorithm based on a tradeoff between accuracy and performance, but fast is not always fast enough. :)

In order to update and import mediafiles, I need to (re)create the file hashes of all files in my library. My idea is now to calculate the hash just once and store it in the files metadata (extended attributes) to boost performance on filesystems which support extended file attributes. (NTFS, HFS+, ext3...) I already implemented it, and you can find the current source here: archimedesJ.io.metadata

Attempts At a first glance, Java 1.7 offers with the UserDefinedFileAttributeView a nice way to handle metadata. For most platforms this works. Sadly, UserDefinedFileAttributeView does not work on HFS+. Albeit, I do not understand why especially the HFS+ filesystem is not supported - it is one of the leading formats for metadata? (see related Question - which does not provide any solution)

How to store extended file attributes on OS X with Java? In oder to come by this java limitation, I decided to use the xattr commandline tool present on OSX and use it with Javas Process handling to read the output from it. My implementation works, but it is very slow. (Recalculation of the file hash is faster, how ironic! I am testing on a Mac BookPro Retina, with an SSD.)

It turned out, that the xattr tool works quite slow. (Writing is damn slow, but more importantly also reading an attribute is slow) To prove that it is not a Java issue but the tool itself, I have created a simple bash script to use the xattr tool on several files which have my custom attribute:

FILES=/Users/IsNull/Pictures/
for f in $FILES
do
  xattr -p vidada.hash $f
done

If I run it, the lines appear "fast" after each other, but I would expect to show me the output immediately within milliseconds. A little delay is clearly visible and thus I guess the tool is not that fast. Using this in java gives me an additional overhead of creating a process, parsing the output which makes it even a bit slower.

Is there a better way to access the extended attributes on HFS+ with Java? What is a fast way to work with the extended attributes on OS X with Java?

like image 222
IsNull Avatar asked May 02 '13 13:05

IsNull


People also ask

How are extended attributes stored?

Extended attributes are not stored in the main data for files, but in the Attributes area of the volume metadata. As such they are out of reach of normal file tools, and can only be accessed using tools specifically intended to work with xattrs.

What are extended attributes Macos?

Extended attributes are arbitrary metadata stored with a file, but separate from the filesystem attributes (such as modification time or file size). The metadata is often a null-terminated UTF-8 string, but can also be arbitrary binary data. One or more files may be specified on the command line.

Does NFS support extended attributes?

NFS V4 uses the existing POSIX ACLs and the extended attribute support in Linux that is supported by GPFS™.

Which command allows for modifying the extended attributes?

File attributes that are stored in the Attributes file can be view, and something manipulated, with the xattr command. This command be can used to “display, modify, or remove the extended attributes of one or more files, including directories and symbolic links.


2 Answers

OS X's /usr/bin/xattr is probably rather slow because it's implemented as a Python script. The C API for setting extended attributes is setxattr(2). Here's an example:

if(setxattr("/path/to/file",
            attribute_name,
            (void *)attribute_data,
            attribute_size,
            0,
            XATTR_NOFOLLOW) != 0)
{
  /* an error occurred, see errno */
}

You can create a JNI wrapper to access this function from Java; you might also want getxattr(2), listxattr(2), and removexattr(2), depending on what else your app needs to do.

like image 194
jatoben Avatar answered Oct 14 '22 07:10

jatoben


I have created a JNI wrapper for accessing the extended attributes now directly over the C-API. It is a open source Java Maven project and avaiable on GitHub/xattrj

For reference, I post the interesting source pieces here. For the latest sources, please refer to the above project page.

Xattrj.java

public class Xattrj {

    /**
     * Write the extended attribute to the given file
     * @param file
     * @param attrKey
     * @param attrValue
     */
    public void writeAttribute(File file, String attrKey, String attrValue){
        writeAttribute(file.getAbsolutePath(), attrKey, attrValue);
    }

    /**
     * Read the extended attribute from the given file
     * @param file
     * @param attrKey
     * @return
     */
    public String readAttribute(File file, String attrKey){
        return readAttribute(file.getAbsolutePath(), attrKey);
    }

    /**
     * Write the extended attribute to the given file
     * @param file
     * @param attrKey
     * @param attrValue
     */
    private native void writeAttribute(String file, String attrKey, String attrValue);

    /**
     * Read the extended attribute from the given file
     * @param file
     * @param attrKey
     * @return
     */
    private native String readAttribute(String file, String attrKey);


    static {
        try {
            System.out.println("loading xattrj...");
            LibraryLoader.loadLibrary("xattrj");
            System.out.println("loaded!");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

org_securityvision_xattrj_Xattrj.cpp

#include <jni.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "org_securityvision_xattrj_Xattrj.h"
#include <sys/xattr.h>


/**
 * writeAttribute
 * writes the extended attribute
 *
 */
JNIEXPORT void JNICALL Java_org_securityvision_xattrj_Xattrj_writeAttribute
    (JNIEnv *env, jobject jobj, jstring jfilePath, jstring jattrName, jstring jattrValue){

    const char *filePath= env->GetStringUTFChars(jfilePath, 0);
    const char *attrName= env->GetStringUTFChars(jattrName, 0);
    const char *attrValue=env->GetStringUTFChars(jattrValue,0);

    int res = setxattr(filePath,
                attrName,
                (void *)attrValue,
                strlen(attrValue), 0,  0); //XATTR_NOFOLLOW != 0
    if(res){
      // an error occurred, see errno
        printf("native:writeAttribute: error on write...");
        perror("");
    }
}


/**
 * readAttribute
 * Reads the extended attribute as string
 *
 * If the attribute does not exist (or any other error occurs)
 * a null string is returned.
 *
 *
 */
JNIEXPORT jstring JNICALL Java_org_securityvision_xattrj_Xattrj_readAttribute
    (JNIEnv *env, jobject jobj, jstring jfilePath, jstring jattrName){

    jstring jvalue = NULL;

    const char *filePath= env->GetStringUTFChars(jfilePath, 0);
    const char *attrName= env->GetStringUTFChars(jattrName, 0);

    // get size of needed buffer
    int bufferLength = getxattr(filePath, attrName, NULL, 0, 0, 0);

    if(bufferLength > 0){
        // make a buffer of sufficient length
        char *buffer = (char*)malloc(bufferLength);

        // now actually get the attribute string
        int s = getxattr(filePath, attrName, buffer, bufferLength, 0, 0);

        if(s > 0){
            // convert the buffer to a null terminated string
            char *value = (char*)malloc(s+1);
            *(char*)value = 0;
            strncat(value, buffer, s);
            free(buffer);

            // convert the c-String to a java string
            jvalue = env->NewStringUTF(value);
        }
    }
    return jvalue;
}

Now the makefile which has troubled me quite a bit to get things working:

CC=gcc
LDFLAGS= -fPIC -bundle
CFLAGS= -c -shared -I/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers -m64


SOURCES_DIR=src/main/c++
OBJECTS_DIR=target/c++
EXECUTABLE=target/classes/libxattrj.dylib

SOURCES=$(shell find '$(SOURCES_DIR)' -type f -name '*.cpp')
OBJECTS=$(SOURCES:$(SOURCES_DIR)/%.cpp=$(OBJECTS_DIR)/%.o)

all: $(EXECUTABLE)

$(EXECUTABLE): $(OBJECTS)
    $(CC) $(LDFLAGS) $(OBJECTS) -o $@

$(OBJECTS): $(SOURCES)
    mkdir -p $(OBJECTS_DIR)
    $(CC) $(CFLAGS) $< -o $@



clean:
    rm -rf $(OBJECTS_DIR) $(EXECUTABLE)
like image 28
IsNull Avatar answered Oct 14 '22 08:10

IsNull