Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data Structure for fast insertion and random access in already sorted data

p = random_point(a,b)
#random_point() returns a tuple/named-tuple (x,y)
#0<x<a 0<y<b
if centers.validates(p):
    centers.insert(p)
#centers is the data structure to store points

In the centers data structure all x and y coordinates are stored in two separate sorted(ascending) lists, one for x and other for y. Each node in x points to the corresponding y, and vice versa, so that they can be separately sorted and still hold the pair property: centers.get_x_of(y) and centers.get_y_of(x)

enter image description here

Properties that I require in data structure:

  1. Fast Insertion, in already sorted data (preferably log n)
  2. Random access
  3. Sort x and y separately, without losing pair property

Initially I thought of using simple Lists, and using Binary search to get the index for inserting any new element. But I found, that, it can be improved using self balancing trees like AVL or B-trees. I could make two trees each for x and y, with each node having an additional pointer that could point from x-tree node to y-tree node.

But I don't know how to build random access functionality in these trees. The function centers.validate() tries to insert x & y, and runs some checks with the neighboring elements, which requires random access:

def validate(p):
    indices = get_index(p)
    #returns a named tuple of indices to insert x and y, Eg: (3,7)
    condition1 =  func(x_list[indices.x-1], p.x) and func(x_list[indices.x+1], p.x)
    condition2 =  func(y_list[indices.y-1], p.y) and func(y_list[indices.y+1], p.y)
    #func is some mathematical condition on neighboring elements of x and y
    return condition1 and condition2

In the above function I need to access neighboring elements of x & y data structure. I think implementing this in trees would complicate it. Are there any combination of data structure that can achieve this? I am writing this in Python(if that can help)

like image 445
Dost Arora Avatar asked Jun 06 '18 13:06

Dost Arora


People also ask

Which data structure has the fastest insertion procedure?

Primitive data types, abstract data types, binary trees, array, B-tree, heap nd trie are some of the major data structure. Amongst these, a balanced binary tree has the fastest insertion procedure with a complexity of O(logn).

Which data structure is best for random access?

An array is a random access data structure, where each element can be accessed directly and in constant time. A typical illustration of random access is a book - each page of the book can be open independently of others. Random access is critical to many algorithms, for example binary search.

Which data structure is better for storing sorted data?

The best data structure for searching sorted integers is an array. You can search it with log(N) operations, and it is more compact (less memory overhead) than a tree.

Which data structure is best for insertion and searching?

std::unordered_map usually implemented as (hashtable) will give you best insert/search time (O(1)) but does not preserve nor provide any order.


1 Answers

Class with 2 dicts that hold the values with the keys being the key of the other dict that contains the related value to the value in this dict. It would need to maintain a list per dict for the current order to call elements of that dict in when calling it (your current sort of that dicts values). You would need a binary or other efficient sort to operate on each dict for insertion, though it would really be using the order list for that dict to find each midpoint key and then checking against value from that key.

like image 59
user85779 Avatar answered Sep 21 '22 21:09

user85779