
Minimal code to reliably store a Java object in a file

In my tiny little standalone Java application I want to store information.

My requirements:

  • read and write Java objects (I do not want to use SQL, and querying is not required)
  • easy to use
  • easy to set up
  • minimal external dependencies

I therefore want to use JAXB to store all the information in a simple XML file in the filesystem. My example application looks like this (copy all the code into a file called Application.java and compile, no additional requirements!):

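import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.transform.stream.StreamSource;

// FIELD access is needed because the fields are not public and have no getters/setters
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
class DataStorage {
    String emailAddress;
    List<String> familyMembers = new ArrayList<>(); // initialised so add() works on the first run
    // List<Address> addresses;
}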

public class Application {

    private static JAXBContext jc;
    private static File storageLocation = new File("data.xml");

    public static void main(String[] args) throws Exception {
        jc = JAXBContext.newInstance(DataStorage.class);

        DataStorage dataStorage = load();

        // the main application will be executed here

        // data manipulation like this:
        dataStorage.emailAddress = "[email protected]";
        dataStorage.familyMembers.add("Mike");

        save(dataStorage);
    }

    protected static DataStorage load() throws JAXBException {
        if (storageLocation.exists()) {
            StreamSource source = new StreamSource(storageLocation);
            return (DataStorage) jc.createUnmarshaller().unmarshal(source);
        }
        return new DataStorage();
    }

    protected static void save(DataStorage dataStorage) throws JAXBException {
        jc.createMarshaller().marshal(dataStorage, storageLocation);
    }
}

How can I overcome these downsides?

  • Starting the application multiple times could lead to inconsistencies: Several users could run the application on a network drive and experience concurrency issues
  • Aborting the write process might lead to corrupted data or to losing all data
slartidan asked Feb 18 '16 10:02


2 Answers

Seeing your requirements:

  • Starting the application multiple times
  • Several users could run the application on a network drive
  • Protection against data corruption

I believe that an XML-based file store will not be sufficient. If you consider a full relational database server overkill, you could still go for H2. This is a very lightweight database that would solve all of the problems above (not perfectly, but surely much better than a handwritten XML store), and it is still very easy to set up and maintain.

You can configure it to persist your changes to disk, and it can run either as a standalone server that accepts multiple connections or embedded in your application.

Regarding the "How do you save the data" part:

In case you do not want to use any advanced ORM library (like Hibernate or any other JPA implementation) you can still use plain old JDBC. Or at least some Spring-JDBC, which is very lightweight and easy to use.

"What do you save"

H2 is a relational database, so whatever you save will end up in columns. But if you really do not plan to query your data (nor to apply migration scripts to it), saving your already XML-serialized objects is an option. You can easily define a table with an ID column plus a "data" VARCHAR column and save your XML there. H2 allows very long VARCHAR values, and a CLOB column is available if the documents grow large.

Note: Saving XML in a relational database is generally not a good idea. I am only advising you to evaluate this option, because you seem confident that you only need a certain set of features from what an SQL implementation can provide.

Gergely Bacso answered Sep 20 '22 05:09


Inconsistencies and concurrency are handled in two ways:

  • by locking
  • by versioning

Corrupted writes cannot be handled very well at the application level. The file system should support journaling, which tries to repair such damage to some extent. You can also do this by

  • making your own journaling file (i.e. a short-lived separate file containing changes to be committed to the real data file).

All of these features are available even in the simplest relational database, e.g. H2 or SQLite, and even a web page can use such features in HTML5. It is overkill to reimplement them from scratch, and a proper implementation of the data storage layer will actually make your simple needs quite complicated.

But, just for the record:

Concurrency handling with locks

  • before you start changing the XML, use a file lock to gain exclusive access to the file (see also "How can I lock a file using java (if possible)")
  • once the update is done and you have successfully closed the file, release the lock
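
The two steps above can be sketched with java.nio file locks. This is a minimal sketch, not a definitive implementation: LockedWriter and writeExclusively are hypothetical names that do not appear in the question, and the XML is passed in as a plain string to keep the example independent of JAXB. Note that Java file locks are platform-dependent and may be only advisory, so they help only if every program touching the file uses them, and their behaviour on network drives is something you would have to verify.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of the question's code.
class LockedWriter {

    // Replaces the content of the file while holding an exclusive lock on it.
    static void writeExclusively(Path file, String xml) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {   // blocks until the exclusive lock is granted
            channel.truncate(0);                  // drop the old content only after locking
            ByteBuffer buffer = ByteBuffer.wrap(xml.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true);                  // ask the OS to flush data and metadata
        } // try-with-resources releases the lock first, then closes the channel
    }
}
```

The file is deliberately opened without TRUNCATE_EXISTING, because that option would wipe the old content before the lock has been acquired.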

Consistency (atomicity) handling with locks

  • other application instances may still try to read the file while one of the apps is writing it. This can cause inconsistency (a so-called dirty read). Ensure that the writer process holds an exclusive lock on the file while writing. If it cannot obtain the exclusive lock, the writer has to wait a bit and retry.

  • an application reading the file should read it (if it can gain access, i.e. no other instance holds an exclusive lock) and then close the file. If reading is not possible (because another app holds the lock), wait and retry.

  • an external application (e.g. Notepad) can still change the XML. You may prefer to hold an exclusive lock even while reading the file.
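
The read side with "wait and retry" might look like the sketch below. The names LockedReader and readShared, as well as the maxAttempts and waitMillis parameters, are assumptions made for this example. It takes a shared lock, which lets several readers in at once but conflicts with a writer's exclusive lock; that is a variation on the last bullet, where an exclusive lock is suggested instead. Whether a program such as Notepad is actually kept out depends on the operating system.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of the question's code.
class LockedReader {

    // Reads the whole file under a shared lock, retrying while another process holds the lock.
    static String readShared(Path file, int maxAttempts, long waitMillis)
            throws IOException, InterruptedException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                // shared = true; returns null if another program holds a conflicting lock
                FileLock lock = channel.tryLock(0L, Long.MAX_VALUE, true);
                if (lock != null) {
                    try {
                        ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
                        while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                            // keep reading until the buffer is full or the file ends
                        }
                        return new String(buffer.array(), 0, buffer.position(),
                                StandardCharsets.UTF_8);
                    } finally {
                        lock.release();
                    }
                }
                Thread.sleep(waitMillis); // writer is busy: wait a bit and retry
            }
        }
        throw new IOException("Could not lock " + file + " after " + maxAttempts + " attempts");
    }
}
```

A call such as LockedReader.readShared(path, 10, 200) would give up after roughly two seconds; both numbers are arbitrary.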

Basic journaling

The idea here is that if you need to do a lot of writes (or if you might later want to roll back your writes), you do not want to touch the real file. Instead:

  • writes go, as changes, to a separate journal file that is created and locked by your app instance

  • your app instance does not lock the main file; it locks only the journal file

  • once all the writes are good to go, your app opens the real file with an exclusive write lock, commits every change from the journal file, and then closes the file.
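
A heavily simplified sketch of these steps follows. JournaledStore and its stage, commit and rollback methods are hypothetical names, and the ".journal" suffix is an assumption. To stay short, the journal holds the complete pending document rather than a list of individual changes, and there is no recovery step that would detect a half-written journal after a crash, which a real implementation would need.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical class for illustration; all names are assumptions.
class JournaledStore {
    private final Path dataFile;
    private final Path journalFile;

    JournaledStore(Path dataFile) {
        this.dataFile = dataFile;
        // The journal lives next to the data file, e.g. data.xml.journal
        this.journalFile = dataFile.resolveSibling(dataFile.getFileName() + ".journal");
    }

    // Records the pending content in the journal; the real file is not touched.
    void stage(String xml) throws IOException {
        writeLocked(journalFile, xml.getBytes(StandardCharsets.UTF_8));
    }

    // Copies the journal into the real file under an exclusive lock, then removes the journal.
    void commit() throws IOException {
        byte[] pending = Files.readAllBytes(journalFile);
        writeLocked(dataFile, pending);
        Files.delete(journalFile);
    }

    // Discards the staged content without ever touching the real file.
    void rollback() throws IOException {
        Files.deleteIfExists(journalFile);
    }

    private static void writeLocked(Path file, byte[] content) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {   // exclusive lock for the duration of the write
            channel.truncate(0);
            ByteBuffer buffer = ByteBuffer.wrap(content);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true);
        }
    }
}
```

Typical use would be new JournaledStore(Paths.get("data.xml")), then stage(xml) as often as needed, and finally either commit() or rollback().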

As you can see, the lock-based solution treats the file as a shared resource that is protected by locks, so only one application can access the file at a time. This solves the concurrency issues, but it also makes file access a bottleneck.

Therefore modern databases such as Oracle use versioning so that readers do not have to wait for writers. Versioning means that both the old and the new version of the file are available at the same time. Readers are served from the old, complete version. Once the new version has been written completely, it is merged into the old version, and the new data becomes available all at once. This is trickier to implement, but since it allows all applications to read in parallel at any time, it scales much better.
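
For a single XML file, a very reduced form of this idea is to write the new version to a temporary file in the same directory and then swap it in with an atomic rename. This is a swapped-in, simpler technique and not how a database implements versioning; VersionedWriter and publish are hypothetical names. Readers see either the old or the new complete file, which also addresses the aborted-write concern from the question, but two concurrent writers can still overwrite each other's changes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of the question's code.
class VersionedWriter {

    // Writes the new version next to the target file, then replaces the target in one step.
    static void publish(Path file, String xml) throws IOException {
        Path directory = file.toAbsolutePath().getParent();
        // Same directory as the target, so the move stays within one file system
        Path next = Files.createTempFile(directory, "data-", ".tmp");
        try {
            Files.write(next, xml.getBytes(StandardCharsets.UTF_8));
            // Whether an existing target is replaced is implementation specific;
            // on common platforms the rename replaces it.
            Files.move(next, file, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(next); // only has an effect if the move did not happen
        }
    }
}
```

If the process is killed while the temporary file is being written, the published file is left untouched and only a stray .tmp file remains. Be aware that files created with createTempFile may carry more restrictive permissions than the original file, which matters when several users share the directory.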

Gee Bee answered Sep 18 '22 05:09