Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best Data Structure for fast retrieval, update, and keeping ordering

The problem is as follows

  • I need to keep track of url + click count.
  • I need to be able to update url quickly with click count when user click on that url.
  • I need to be able to retrieve the top 10 click count URL quickly.

NOTE: Assuming you cannot use the database.

What is the best data structure to achieve the result?

I have thought about using a map before, but map doesn't keep track of ordering of the top 10 clicks.

like image 875
Hong Wei Wang Avatar asked Sep 11 '17 17:09

Hong Wei Wang


People also ask

What is the best data structure for fast retrieval of data?

Hash Table. Hashtable is a list of paired values, the first item in the pair is the key, and the second item is the value. With a hash table, you can access objects by the key, so this structure is high-speed for lookups. Hash tables are faster than the arrays for lookups.

Which data structure is best for retrieval?

The binary tree is used to store elements in sorted order, which makes searching and retrieval easier. Tree data structures have so many types such as Binary Tree, Binary Search Tree, Red-Black trees, B-tree, Weighted-balanced tree, and Heap. It has so many applications in database management.

What data structure keeps order?

Data structures such as binary search trees -- also known as an ordered or sorted binary tree -- provide efficient methods of sorting objects, such as character strings used as tags.

Which data structure has efficient search time?

In computer science, a search data structure is any data structure that allows the efficient retrieval of specific items from a set of items, such as a specific record from a database.


2 Answers

You need an additional List<Map.Entry<URL,Integer>> for holding the top ten, with T being the click count for the lowermost.

  • If you count another click and this count is still not greater than T: do nothing.
  • If the increased count is greater than T, check whether the URL is in the list or not. If it is, do nothing. If it is not, add this entry to the List, sort and delete the last entry if the list has more than 10 entries. Update T.
like image 195
laune Avatar answered Sep 25 '22 21:09

laune


The best data structure I can think is of using the TreeSet.

The elements of TreeSet are sorted, so you can easily find top items. Also make sure for URL you maintain a separate comparator class which implements Comparator, so you can put your logic of keeping elements sorted all the time based on count. Use this comparator while creating the TreeSet. Insertion/Update/delete/Get all operations happen in O(logn)

Here is the code, how you should define the structure.

 TreeSet<URL> treeSet = new TreeSet<URL>(new URLComparator());


class URL {
    private String url;
    int count;

    public URL(String string, int i) {
        url = string;
        count = i;
    }

    @Override
    public int hashCode() {
        return url.hashCode();
    }

    @Override // No need to write this method. Just used it for testing 
    public String toString() {
        return "url : " + url + " ,count : " + count+"\n";
    }

}

One more info- Use hashcode method of your URL class as hashcode of your url.

This is how you define URLComparator class. compare logic is based on URL count.

class URLComparator implements Comparator<URL> {

    @Override
    public int compare(URL o1, URL o2) {
        return new Integer(o2.count).compareTo(o1.count);
    }

}

Testing

TreeSet<URL> treeSet = new TreeSet<URL>(new URLComparator());

treeSet.add(new URL("url1", 12));
treeSet.add(new URL("url2", 0));
treeSet.add(new URL("url3", 5));

System.out.println(treeSet);

Output:-

[url : url1 ,count : 12
, url : url3 ,count : 5
, url : url2 ,count : 0
]

To print top 10 elements, use following code.

Iterator<URL> iterator = treeSet.iterator();
int count = 0;
while(count < 10 && iterator.hasNext() ){
    System.out.println(iterator.next());
    count++;
}
like image 29
nagendra547 Avatar answered Sep 25 '22 21:09

nagendra547