
Recommended json data structure to store chat log in a text file

I have a chat app where I need to store the chat log for each contact. Currently I store a plain array like the one below in a .txt file:

var messages = [{_id: 1, message: "Hello"}, {_id: 2, message: "Hello"}];

I can see the following problems as the log grows:

  1. Reading it back takes a lot of processing to convert the text into an array again.

  2. It takes up a lot of cache memory.

On the other hand, I feel it makes searching messages simple. I would like to know if there are better alternatives to this structure.

Note: I prefer a .txt file over IndexedDB or WebSQL because I do not want to deal with their storage limits.

redV asked Jan 09 '23 13:01


2 Answers

You can drop the keys _id and message and store plain [id, message] arrays instead of objects; that reduces the file size. When searching, you can read the file in chunks starting from the end. You can also keep search indexes in a separate file.

UPDATE:

So your file will be:

[[1,"message1"],[2,"message2"],[1,"message3"]]

It will be much shorter than the version with keys, and with a static schema it is no problem to parse:

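var fs = require('fs');

var messages = [];
// Read the file as text and parse the [[id, message], ...] array.
var file = JSON.parse(fs.readFileSync('filepath.txt', 'utf8'));
for (var i = 0; i < file.length; i++) {
    messages.push({id: file[i][0], message: file[i][1]});
}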
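
Going the other way is just as short. Here is a minimal sketch of the round trip; packMessages and unpackMessages are names I made up for illustration, and the sample data is invented:

```javascript
// Hypothetical helpers: turn {id, message} objects into [id, message] pairs and back.
function packMessages(messages) {
    return messages.map(function (m) {
        return [m.id, m.message];
    });
}

function unpackMessages(rows) {
    return rows.map(function (row) {
        return {id: row[0], message: row[1]};
    });
}

var original = [{id: 1, message: "message1"}, {id: 2, message: "message2"}];

var verbose = JSON.stringify(original);                // objects with keys
var compact = JSON.stringify(packMessages(original));  // plain pairs
var restored = unpackMessages(JSON.parse(compact));

console.log(verbose.length, compact.length);
```

The saving per entry is roughly the length of the repeated key names, so it matters more for short messages than for long ones.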

After a little thinking, I guess it would be better to split the log into one file per date (e.g. 2014-10-26.txt) and search the files one by one. You can stream results over a socket while the search is still running, so the user gets output immediately.
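
A rough sketch of the per-date idea, under a few assumptions of mine: file names use the UTC date, readRows is a callback you supply that returns the parsed [[id, message], ...] array for a file name, and onMatch is called for every hit so you can forward it (to a socket, for instance) right away. None of these names come from a library.

```javascript
// Hypothetical helper: name of the log file for a given timestamp (UTC day).
function logFileNameFor(date) {
    return date.toISOString().slice(0, 10) + ".txt";
}

// Search the files newest first and report each hit as soon as it is found.
// ISO dates sort correctly as plain strings, so sort().reverse() is enough.
function searchLogs(fileNames, readRows, term, onMatch) {
    var newestFirst = fileNames.slice().sort().reverse();
    newestFirst.forEach(function (name) {
        readRows(name).forEach(function (row) {
            if (row[1].indexOf(term) !== -1) {
                onMatch({file: name, id: row[0], message: row[1]});
            }
        });
    });
}
```

Because the reader is passed in, the same function works whether the rows come from fs, the File API, or a cache.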

If RAM is not a big concern, you can even cache file contents in variables to avoid frequent reads from disk, and set a TTL on each cache entry so that files not used for a while are evicted. That keeps memory use in check.
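
One way the cache could look, as a sketch only. createFileCache is a made-up name, load is assumed to be whatever function reads and parses one file, and now is injectable purely so expiry can be tested without waiting:

```javascript
// Hypothetical TTL cache: entries unused for longer than ttlMs are dropped.
function createFileCache(load, ttlMs, now) {
    now = now || Date.now;
    var entries = {}; // file name -> {value, lastUsed}

    function evictExpired() {
        var t = now();
        Object.keys(entries).forEach(function (name) {
            if (t - entries[name].lastUsed > ttlMs) {
                delete entries[name];
            }
        });
    }

    return {
        // Return the cached content, loading it on a miss.
        get: function (name) {
            evictExpired();
            if (!entries[name]) {
                entries[name] = {value: load(name), lastUsed: now()};
            }
            entries[name].lastUsed = now();
            return entries[name].value;
        },
        // Number of files currently held in memory.
        size: function () {
            evictExpired();
            return Object.keys(entries).length;
        }
    };
}
```

Eviction here only happens on access; a timer-based sweep would free memory sooner at the cost of a little more code.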

monkeyinsight answered Jan 22 '23 18:01


I gather from some of the discussion that your 'json text file' will be stored using File API, so there is a definite limit on filesize as opposed to some server where you could assume almost 'infinite' storage. So, you will eventually have to limit the number of messages and contacts.

Since this is a 'chat log', I assume you need to preserve the order of messages and users to be able to display a chat history (if you only wanted to store all messages from individual users, the answer would be quite different.)

So, you are stuck with storing {userid:XXX, message:"YYY"} in a fixed order.

As monkeyinsight suggests, using JSON this will be smallest as an array: [XXX,"YYY"]

To answer your concerns: 1. amount of processing and 2. storage space, it would be possible to store blocks of these messages in separate files along with an index

var message_index = ["msg001.json", "msg002.json", ... ];

where each file contains, say, 1000 messages as

var messages=[[userid,message], ...];

allowing you to drop the oldest file from message_index when you run out of storage or whatever.
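
To make the rolling-block idea concrete, here is a small in-memory sketch. The log object stands in for the stored files (index is the list of block names, oldest first; blocks maps each name to its array), and createLog, blockName and appendMessage are names invented for this example:

```javascript
// Hypothetical in-memory stand-in for the index file plus the block files.
function createLog(blockSize, maxBlocks) {
    return {blockSize: blockSize, maxBlocks: maxBlocks, index: [], blocks: {}, nextId: 1};
}

function blockName(n) {
    return "msg" + String(n).padStart(3, "0") + ".json";
}

// Append [userid, message]; start a new block when the current one is full
// and drop the oldest block once there are more than maxBlocks.
function appendMessage(log, userid, message) {
    var last = log.index[log.index.length - 1];
    if (!last || log.blocks[last].length >= log.blockSize) {
        last = blockName(log.nextId++);
        log.index.push(last);
        log.blocks[last] = [];
        while (log.index.length > log.maxBlocks) {
            delete log.blocks[log.index.shift()];
        }
    }
    log.blocks[last].push([userid, message]);
}
```

With real storage, each assignment to log.blocks[...] would become a file write and each delete a file removal; only the newest block ever needs rewriting.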

The actual files can be reduced in size a little more if you don't save them as JSON and simply use a string, e.g.

var messages=":1:message one:2:message two:3:another message:4:last message";

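// Look up a message by id in a ":id:message:id:message..." string.
var getMsg = function (id, msgs) {
    var key = ":" + id + ":";
    var ikey = msgs.indexOf(key);   // plain substring search, no regex
    var message = {id: id, message: ""};
    if (ikey >= 0) {
        var start = ikey + key.length;
        var end = msgs.indexOf(":", start);   // start of the next entry, if any
        // No further separator means this is the last entry.
        message.message = end === -1 ? msgs.substring(start) : msgs.substring(start, end);
    }
    return message;
};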

var m= getMsg(4, messages);   // returns: {id: 4, message: "last message"}

This saves 4 characters per entry (e.g. ":1:a" vs "[1,'a'],").

You would, of course, have to ensure that the separator (':' in this example) did not occur in a message.
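
If you cannot guarantee that, one option is to escape the separator when writing and unescape when reading. This is only a sketch with an escaping scheme I picked for illustration (percent-style, touching just ':' and '%'); escapeMessage and unescapeMessage are made-up names:

```javascript
// Replace the separator and the escape character itself before storing.
function escapeMessage(text) {
    return text.replace(/[%:]/g, function (ch) {
        return ch === "%" ? "%25" : "%3A";
    });
}

// Reverse the escaping in a single pass so "%253A" decodes to "%3A", not ":".
function unescapeMessage(text) {
    return text.replace(/%(25|3A)/g, function (match, code) {
        return code === "25" ? "%" : ":";
    });
}

var stored = ":1:" + escapeMessage("meet at 10:30");
console.log(stored);
```

The cost is two extra characters for every ':' or '%' in a message, which eats into the 4 characters saved per entry only when those characters are common.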

TonyWilk answered Jan 22 '23 20:01