
How to create a reliable chat using Google App Engine Data Store (HRD)?

In my application I want to have a live chat feature - in which multiple people (perhaps 5 or more) can chat together at the same time.

I am using Java-based Google App Engine. This is really the first time I've tried to use the GAE Datastore; I'm used to Oracle/MySQL, so I suspect my strategy is wrong.

Note: For simplicity, I am omitting any validation/security checks.

In a servlet called WriteMessage I have the following code:

Entity entity = new Entity("ChatMessage");
entity.setProperty("userName", request.getParameter("userName"));
entity.setProperty("message", request.getParameter("message"));
entity.setProperty("time", new Date());
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
datastore.put(entity);

In another servlet called ReadMessages I have the following code:

String id = request.getParameter("id");
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Query query = new Query("ChatMessage");
if (id != null) {
  // Client requested only messages with id greater than this id
  Filter idFilter = new FilterPredicate(Entity.KEY_RESERVED_PROPERTY,
        FilterOperator.GREATER_THAN,
        KeyFactory.createKey("ChatMessage", Long.parseLong(id)));
  query.setFilter(idFilter);
}
PreparedQuery pq = datastore.prepare(query);
JsonArray messages = new JsonArray();
for (Entity result : pq.asIterable()) {
  JsonObject jmsg = new JsonObject();

  // Client will use this id on the next request to read to poll only
  // "new" messages
  jmsg.addProperty("id", result.getKey().getId());
  jmsg.addProperty("userName", (String) result.getProperty("userName"));
  jmsg.addProperty("message", (String) result.getProperty("message"));
  jmsg.addProperty("time", ((Date) result.getProperty("time")).getTime());
  messages.add(jmsg);
}
PrintWriter out = response.getWriter();
out.print(messages.toString());

In the JavaScript client code, the WriteMessage servlet is called any time the user submits a new message, and the ReadMessages servlet is called every second to get new messages.

To optimize, the JavaScript sends the id of the last message it received (or possibly the highest id it has received so far) on subsequent requests to ReadMessages, so that the response only contains messages it hasn't seen before.

This all seems to work at first, but I think maybe there are a couple of things wrong with this code.

Here is what I think is wrong:

  • Some messages might not be read because I am relying on the id of the ChatMessage's key to filter out messages that the JS client has already seen before - I don't think that will be reliable right?

  • Some writes might fail because there might be 5 or 6 incoming writes at the same exact time - and my understanding is that this might result in ConcurrentModificationException if there are too many writes per second.

  • The date passed on the entity is the current date of the JRE on the application server - maybe I should be using something like "sysdate()" in SQL? I don't know if this is actually an issue or not.

How can I fix the code so that:

  1. All chat messages will be written - would it just be best to have a fail-over so that if the request fails the javascript will just re-attempt until successful?

  2. All chat messages will be read (no exceptions)

  3. Clean up old messages so that only 1000 or so messages are stored

codefactor asked May 29 '13 05:05


1 Answer

It's kinda refreshing when someone's actually worked on a problem before posting a question to SO.

While you did list a bunch of valid issues that you're running into with your approach, I'd propose your biggest problem would be cost. You're adding a new entity with each chat message, and furthermore that entity needs to be indexed. So you're talking about multiple write ops for every message sent. You also have to pay for each entity you delete, so you'll have to pay to clean up.

On the plus side of your design, you're not using transactions or ancestors to create your entities, so you shouldn't hit a write perf limit.

On the read side, you read one entity per message, so the costs are going to add up there as well. The fact that you're querying without transactions or ancestor queries means you may not see the latest ChatMessage entities when you query.

Also, unlike auto-increment columns in SQL, automatically allocated GAE datastore ids are not guaranteed to increase monotonically, so querying by id GREATER_THAN is not going to work.
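To make the failure mode concrete, here is a minimal sketch in plain Java with no datastore involved. The `Msg` class, the `newerThan` helper, and the id values are all made up for illustration; the only thing taken from the question is the "id greater than last seen" filter.

```java
import java.util.ArrayList;
import java.util.List;

class IdCursorSketch {
    // Hypothetical stand-in for a stored ChatMessage: its id and the order it arrived in.
    static class Msg {
        final long id;
        final int arrivalOrder;

        Msg(long id, int arrivalOrder) {
            this.id = id;
            this.arrivalOrder = arrivalOrder;
        }
    }

    // Mimics the ReadMessages filter: keep only messages whose id exceeds lastSeenId.
    static List<Msg> newerThan(List<Msg> stored, long lastSeenId) {
        List<Msg> out = new ArrayList<>();
        for (Msg m : stored) {
            if (m.id > lastSeenId) {
                out.add(m);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Msg> stored = new ArrayList<>();
        stored.add(new Msg(5001L, 1));
        stored.add(new Msg(9002L, 2));
        long lastSeen = 9002L;          // the client has polled and seen both messages

        // Assumed scenario: the third message is assigned a smaller id than the second.
        stored.add(new Msg(7003L, 3));

        List<Msg> fresh = newerThan(stored, lastSeen);
        System.out.println(fresh.size()); // prints 0: the third message is never delivered
    }
}
```

The newest message is silently skipped because its id sorts below the client's cursor, which is the "some messages might not be read" problem from the question.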

Now for the suggestion. I warn you, this will be a lot of work.

  1. Minimize the number of entities you use. Instead of adding a new entity per message, use a larger entity that stores multiple messages per entity.

  2. Instead of querying for message entities, fetch them by key. Fetching entities by key will give you strongly consistent results instead of eventually consistent results. This is important if you want to ensure all the latest chat messages are read (no exceptions)

This does introduce two new problems you'll need to deal with:

  • If multiple writes are going to the same entity, you'll hit some sort of write performance limit.

  • Since your entities can grow to be large, you'll need to handle the case to make sure they don't outgrow the 1MB limit.

You'll need two entity Kinds. You'll need a MessageLog Kind that stores multiple messages. You'll probably want to store the messages as a List within the MessageLog. You'll need multiple MessageLog entities for a given chat, primarily for write performance. (search for "Google App Engine Sharding" for more info).

You'll need a Chat Kind that essentially stores a list of MessageLog keys. This allows for multiple Chats to go on. Your original implementation seemed to have just one global chat. Or if you want that, just use a single instance of the Chat.

Neither of these really need to be indexed, since you will fetch everything by Key. This will reduce costs.

When you start a new Chat, you would create a number of MessageLog entities based on the write performance you expect to need. Plan on one MessageLog entity per write per second you expect, so the more people there are in the chat, the more MessageLogs you create. Then create a Chat entity and store the list of MessageLog keys in it.
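The setup step could be sketched roughly as follows. This is plain Java with a `HashMap` standing in for the datastore, so none of it is real App Engine API; the `Chat` and `MessageLog` classes, the `createChat` helper, and the key naming scheme are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ChatSetupSketch {
    // Stand-in for the datastore (key -> entity). A real app would go through DatastoreService.
    static final Map<String, Object> store = new HashMap<>();

    // Hypothetical MessageLog entity: many messages stored inside one entity.
    static class MessageLog {
        final List<String> messages = new ArrayList<>();
    }

    // Hypothetical Chat entity: only the keys of its MessageLog shards.
    static class Chat {
        final List<String> messageLogKeys = new ArrayList<>();
    }

    // Creates one MessageLog per expected write per second, then the Chat that lists them.
    static Chat createChat(String chatKey, int expectedWritesPerSecond) {
        Chat chat = new Chat();
        int shards = Math.max(1, expectedWritesPerSecond);
        for (int i = 0; i < shards; i++) {
            String logKey = chatKey + "/log-" + i; // made-up key naming scheme
            store.put(logKey, new MessageLog());
            chat.messageLogKeys.add(logKey);
        }
        store.put(chatKey, chat);
        return chat;
    }

    public static void main(String[] args) {
        Chat chat = createChat("chat-1", 5);
        System.out.println(chat.messageLogKeys.size()); // prints 5
    }
}
```

Because the Chat holds the keys directly, every later read and write can fetch by key and never needs a query or an index.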

On a message write you would do the following:

  1. Fetch the appropriate Chat entity by key; you now have a list of MessageLogs.

  2. Pick one MessageLog to distribute the load so all writes aren't hitting the same entity. There may be multiple techniques for picking one, but for this example, pick one randomly.

  3. Format the new message and insert it into the MessageLog. You may also consider dropping old messages in the MessageLog at this point. You also want to do some safety checking to ensure that the MessageLog is within the 1MB entity size limit.

  4. Write the MessageLog. This should only incur 1 write op instead of the minimum 3 write ops for writing a new entity.

  5. RECOMMENDED: Append the message to a memcache entry for the given chat that contains the entire chat log.
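The write steps above can be sketched in plain Java as follows. Everything here is an in-memory stand-in rather than datastore code: the `Message` and `MessageLog` classes, the `approxBytes` size estimate, and the `maxBytes` budget are assumptions, and a real implementation would fetch and put the chosen MessageLog by key, ideally inside a transaction.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Random;

class WritePathSketch {
    // Hypothetical message kept inside a MessageLog.
    static class Message {
        final String user;
        final String text;
        final long timeMillis;

        Message(String user, String text, long timeMillis) {
            this.user = user;
            this.text = text;
            this.timeMillis = timeMillis;
        }

        // Rough size estimate: 8 bytes for the timestamp plus the UTF-8 strings.
        int approxBytes() {
            return 8 + user.getBytes(StandardCharsets.UTF_8).length
                     + text.getBytes(StandardCharsets.UTF_8).length;
        }
    }

    // Hypothetical MessageLog shard with a byte budget kept well under the entity limit.
    static class MessageLog {
        final Deque<Message> messages = new ArrayDeque<>();
        final int maxBytes;
        int bytes = 0;

        MessageLog(int maxBytes) {
            this.maxBytes = maxBytes;
        }

        // Drops the oldest messages until the new one fits, then appends it.
        // A single message larger than maxBytes is not handled in this sketch.
        void append(Message m) {
            while (!messages.isEmpty() && bytes + m.approxBytes() > maxBytes) {
                bytes -= messages.removeFirst().approxBytes();
            }
            messages.addLast(m);
            bytes += m.approxBytes();
        }
    }

    // Picks one shard at random, appends the message, and returns the shard to persist.
    static MessageLog writeMessage(List<MessageLog> logs, Random rng, Message m) {
        MessageLog log = logs.get(rng.nextInt(logs.size()));
        log.append(m);
        return log;
    }

    public static void main(String[] args) {
        List<MessageLog> logs = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            logs.add(new MessageLog(900_000)); // assumed margin below the 1MB limit
        }
        MessageLog touched = writeMessage(logs, new Random(),
                new Message("codefactor", "hello", System.currentTimeMillis()));
        System.out.println(touched.messages.size()); // prints 1
    }
}
```

Random selection spreads writes evenly only on average; two writers can still pick the same shard at the same moment, so the retry-on-failure idea from the question remains useful.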

On a read you would do the following:

  1. RECOMMENDED: First check the memcache entry for the given chat. If it exists, just return that and you're done.

  2. Fetch the appropriate Chat entity by key; you now have a list of MessageLogs.

  3. Fetch all the MessageLogs by key. Now you have all the messages in the chat, and they are up to date.

  4. Parse all the MessageLogs and reconstruct the whole chat log.

  5. RECOMMENDED: Store the reconstructed chat log in memcache so you don't have to do it again.

  6. Return the reconstructed chat log.
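A rough sketch of the read steps, again in plain Java. The `HashMap` named `cache` stands in for memcache, the MessageLogs are passed in as already-fetched lists, and ordering by the stored timestamp is an assumption about how the chat log would be reconstructed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReadPathSketch {
    // Hypothetical message kept inside a MessageLog.
    static class Message {
        final String user;
        final String text;
        final long timeMillis;

        Message(String user, String text, long timeMillis) {
            this.user = user;
            this.text = text;
            this.timeMillis = timeMillis;
        }
    }

    // Stand-in for memcache: chat key -> reconstructed chat log.
    static final Map<String, List<Message>> cache = new HashMap<>();

    // Returns the cached log if present; otherwise merges all shards, sorts by time, and caches.
    static List<Message> readChat(String chatKey, List<List<Message>> messageLogs) {
        List<Message> cached = cache.get(chatKey);
        if (cached != null) {
            return cached;
        }
        List<Message> merged = new ArrayList<>();
        for (List<Message> log : messageLogs) {
            merged.addAll(log);
        }
        // List.sort is stable, so messages with equal timestamps keep their shard order.
        merged.sort(Comparator.comparingLong((Message m) -> m.timeMillis));
        cache.put(chatKey, merged);
        return merged;
    }

    public static void main(String[] args) {
        List<Message> logA = new ArrayList<>();
        logA.add(new Message("ann", "third", 30L));
        logA.add(new Message("ann", "first", 10L));
        List<Message> logB = new ArrayList<>();
        logB.add(new Message("bob", "second", 20L));

        List<List<Message>> logs = new ArrayList<>();
        logs.add(logA);
        logs.add(logB);

        for (Message m : readChat("chat-1", logs)) {
            System.out.println(m.timeMillis + " " + m.user + ": " + m.text);
        }
    }
}
```

The cache entry only stays correct if the write path appends to it (or invalidates it) on every new message; otherwise readers keep getting the stale reconstructed log.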

Consider using the Channel API to send the messages to the viewers. Viewers can receive messages more quickly than once per second this way. I personally find that the Channel API isn't 100% reliable, so I wouldn't get rid of the polling entirely, but you might be ok with polling once every 30 seconds as backup.

Imagine a chat with, say, 100 messages in it. Your original plan would cost about 101 read ops to read the 100 messages. In this new method, you'd have something like 5-10 MessageLog entities, so the cost would be 6-11 read ops (one for the Chat entity plus one per MessageLog). If you get a memcache hit, you don't need any read ops. But you do have to write the code to reconstruct the chat log from multiple MessageLog objects.
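The arithmetic behind those numbers, as a tiny sketch. The cost model is an assumption that simply mirrors the figures above: one op for the query or the Chat fetch, plus one op per entity returned. Actual billing may differ.

```java
class ReadCostSketch {
    // Original design: one op for the query plus one per ChatMessage entity returned.
    static int perMessageEntityReads(int messageCount) {
        return 1 + messageCount;
    }

    // Sharded design: one op for the Chat entity plus one per MessageLog fetched by key.
    static int shardedReads(int messageLogCount) {
        return 1 + messageLogCount;
    }

    public static void main(String[] args) {
        System.out.println(perMessageEntityReads(100)); // prints 101
        System.out.println(shardedReads(5));            // prints 6
        System.out.println(shardedReads(10));           // prints 11
    }
}
```

Note that the sharded cost depends on the number of MessageLogs, not on the number of messages, so it stays flat as the chat grows.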

dragonx answered Nov 15 '22 03:11