Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most efficient data structure to represent threaded comments in Java?

I want to represent threaded comments in Java. This would look similar to the way comments are threaded on reddit.com

hello
   hello
      hello
      hello
   hello
   hello
      hello

As in the example above, responses are nested in the HTML with appropriate indentation to reflect their relationship to prior comments.

What would be an efficient way to represent this in Java?

I'm thinking some kind of tree data structure would be appropriate.

But is there one in particular which would be most efficient to minimize tree traversals?

This would be important if I have voting on each comment. Because then the tree would need to be reordered after each vote - a potentially expensive operation computationally.

By the way, if anyone knows of an open source existing implementation of this in Java, that would help too.

like image 460
Hula Avatar asked Dec 10 '22 21:12

Hula


1 Answers

I would use levels of linked lists.

message1
    message2
        message3
        message4
    message5
    message6
        message7

Each node would have a pointer to its:

- forward sibling  (2->5, 3->4, 5->6,                   1/4/6/7->NULL).
- backward sibling (4->3, 5->2, 6->5,                   1/2/3/7->NULL).
- first child      (1->2, 2->3, 6->7,                   3/4/5/7->NULL).
- parent           (2->1, 3->2, 4->2, 5->1, 6->1, 7->6,       1->NULL).

Within each level, messages would be sorted in the list by vote count (or whatever other score you wanted to use).

That would give you maximum flexibility for moving things around and you could move whole sub-trees (e.g., message2) just by changing the links at the parent and that level.

For example, say message6 gets a influx of votes that makes it more popular than message5. The changes are (adjusting both the next and previous sibling pointers):

  • message2 -> message6
  • message6 -> message5
  • message5 -> NULL.

to get:

message1
    message2
        message3
        message4
    message6
        message7
    message5

If it continues until it garners more votes than message2, the following occurs:

  • message6 -> message2
  • message2 -> message5

AND the first-child pointer of message1 is set to message6 (it was message2), still relatively easy, to get:

message1
    message6
        message7
    message2
        message3
        message4
    message5

Re-ordering only needs to occur when a score change results in a message becoming more than its upper sibling or less than its lower sibling. You don't need to re-order after every score change.

like image 76
paxdiablo Avatar answered Dec 12 '22 10:12

paxdiablo