Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inserting values into a sorted array

What would be the quickest way to insert values into the correct position in a sorted numpy array?

For example, I would like to insert every value of binto a:

a = [1,1,2,4,7,7,11,13,13,13,15,20,25,26,27,30,45,70]

b = [5,7,9,45]

I've tried looping through a for each value of b and inserting it that way. I've also tried the bisect_left method:

for i in b:
a.insert(bisect_left(a,i),i)

Both methods are too slow as I have hundreds of thousands of data elements to go through.

Any ideas?

like image 933
user2957178 Avatar asked Nov 08 '13 13:11

user2957178


People also ask

Why insertion sort is best for sorted array?

Insertion Sort is preferred for fewer elements. It becomes fast when data is already sorted or nearly sorted because it skips the sorted values. Efficiency: Considering average time complexity of both algorithm we can say that Merge Sort is efficient in terms of time and Insertion Sort is efficient in terms of space.

What is used to store elements in sorted order?

Solution. Use the associative container set , declared in <set> , which stores items in sorted order.

What is the time complexity for inserting an item into a sorted array?

The insertion and deletion of elements in a sorted array executes at O(n), due to the need to shift all the elements following the element to be inserted or deleted; in comparison a self-balancing binary search tree inserts and deletes at O(log n).


2 Answers

You could use searchsorted and insert:

a = numpy.array([1,1,2,4,7,7,11,13,13,13,15,20,25,26,27,30,45,70])
b = numpy.array([5,7,9,45])
ii = numpy.searchsorted(a, b)
a = numpy.insert(a, ii, b)
like image 149
Ill-Conditioned Matrix Avatar answered Nov 06 '22 07:11

Ill-Conditioned Matrix


let's note n = len(a) and m = len(b),

  1. you can use a binary search to find each element's position and insert it, that would done in m*n*log(n) time
  2. you can merge both arrays, that would have an n+m complexity
  3. you can use a specialized structure, a balanced binary tree, you can find a lot of implementation of these in python, the time complexity will be mlog(n)

Now given possible values of n and m, you can determine which solution is best, but don't expect to do better than that

like image 30
Samy Arous Avatar answered Nov 06 '22 07:11

Samy Arous