I'm trying to come up with a fast algorithm to compute the quantity <code>b[i]= med |y_i+y_j|, 1<=j!=i<=n</code> when the <code>y_1,...,y_n</code> are sorted already (so <code>b[]</code> is a vector of same length as <code>y[]</code>). I will assume that all elements of <code>y[]</code> are unique and that n is even. So, the code below computes the <code>b[i]</code>'s the naive (<code>O(n**2)</code>) way: (I wrote this in R for convenience, but I'm language agnostic) <pre class="prettyprint"><code>n<-30 a_fast<-b_slow<-rep(NA,n) y<-sort(rnorm(n,100,1)) z<-y for(i in 1:n){ b_slow[i]<-median(abs(y[-i]+y[i])) } </code></pre> I have a tentative proposal --below-- for doing it in <code>O(n)</code>. But it only works if <code>y[]</code> contains positive numbers. My question is: how should I change the fast algorithm to also work when <code>y[]</code> contains both positive and negative numbers? Is this even possible? <h3>EDIT:</h3> And the code below the (tentative) <code>O(n)</code> way (I wrote this in R for convenience, but I'm language agnostic) <pre class="prettyprint"><code>tryA<-floor(1+(n-1)/2+1) tryB<-floor(1+(n-1)/2) medA<-y[tryA] medB<-y[tryB] for(i in 1:(tryA-1)){ a_fast[i]<-medA+y[i] } for(i in tryA:n){ a_fast[i]<-medB+y[i] } </code></pre> <h3>Simple example:</h3> Simple, illustrative example. If we have a vector of length 4 <pre class="prettyprint"><code>-3, -1, 2, 4 </code></pre> Then, for example for i=1, the 3 absolute pairwise sums are <pre class="prettyprint"><code> 4 1 1 </code></pre> and their median is 1. Then, for example for i=2, the 3 absolute pairwise sums are <pre class="prettyprint"><code> 4 1 3 </code></pre> and their median is 3. Here is a longer example with both positive and negative <code>y[]</code>: <pre class="prettyprint"><code> -1.27 -0.69 -0.56 -0.45 -0.23 0.07 0.13 0.46 1.56 1.72 </code></pre> and here are my new <code>b_slow[]</code> (this is the ground thruth, computed the naive way): <pre class="prettyprint"><code>1.20 0.92 1.00 1.01 0.79 0.53 0.56 0.53 1.33 1.49 </code></pre> but now, my new <code>a_fast[]</code> don't match no more: <pre class="prettyprint"><code> -1.20 -0.62 -0.49 -0.38 -0.16 -0.16 -0.10 0.23 1.33 1.49 </code></pre> <h3>EDIT:</h3> Here is my implementation of Francis's solution (up to the point where we have two sorted array, the median of which is easy to compute). I did it in R to stay in the spirit of the question. Nonetheless, I seem to be missing a correction factor for the index (the ww in the code below) so the code below is sometimes off by a little bit. This is because in the definition above we compute the medians over n-1 observations (i!=j). <pre class="prettyprint"><code> n<-100 y<-rnorm(n) y<-sort(y) b<-rep(NA,n) #Naive --O(n**2)-- approch: for(i in 1:n){ b[i]<-median(abs(y[-i]+y[i])) } k<-rep(NA,n) i<-1 k[i]<-min(na.omit(c(which(y+y[i]>0)[1],n))) #binary search: O(log(n)) -- for(i in 2:n){ #O(n) k_prov<-k[i-1] while(y[k_prov]+y[i]>0 && k_prov>0) k_prov<-k_prov-1 k[i]<-max(k_prov+1,1) #for(i in 1:n){ should give the same result. # k[i]<-which(y+y[i]>0)[1] #} } i<-sample(1:n,1) x1<--y[1:(k[i]-1)]-y[i] x2<-y[i]+y[n:k[i]] x3<-c(x1,x2) plot(x3) ww<-ifelse(i<k[i] & i>n/2,n/2+1,n/2) sort(x3)[ww] #this can be computed efficiently: O(log(n)) b[i] #this is the O(n**2) result. </code></pre>

Here is a O(Nxln(N)xln(N)) solution : for all i : 1) find item k such as <code>j<k <=> y[j]+y[i]<0</code> (dichotomy, O(ln(N))) k separates two sorted lists : one above -y[i], the other below -y[i], for which the sign should be changed to get abs(y[i]+y[j]). Now, we are looking for the median of these lists. From here, it is just the problem of finding the median of two sorted lists, repeated n times. 2)Let's pick the maximum (M=abs(y[1]-y[i]) or M=abs(y[size]-y[i])) and minimum (m around k) of these lists and restart a dichotomy (O(ln(N)). Let's start by picking the middle (M+m)/2...at any stage, let pick the middle... 3)Stage of this big dichotomy : How many items y[j]+y[i] are above (M+m)/2 in the first list ? Once again a dichotomy... O(ln(N)). How many items -y[j]-y[i] are above (M+m)/2 in the second list ? Guess what ? Dichotomy... Sum these two numbers. If it is above (size-1)/2, m=(M+m)/2. Otherwise M=(M+m)/2. 4)If m=M stop ! <code>b[i]=m;</code> I guess somebody will come with a better solution... Edit : I should thank @user189035 for his link to an O(ln(n+m)) algorithm to compute the median of two sorted lists. How to find the kth smallest element in the union of two sorted arrays? Bye,

Efficient algorithm to compute the median of pariwise absolute sums of a sorted array

Q: How do you find the median in a sorted array?

Find the median of the two sorted arrays( The median of the array formed by merging both the arrays). Median: The middle element is found by ordering all elements in sorted order and picking out the one in the middle (or if there are two middle numbers, taking the mean of those two numbers).

Q: How do you find the median of a sorted number?

The median is the middle value in a set of data. First, organize and order the data from smallest to largest. To find the midpoint value, divide the number of observations by two. If there are an odd number of observations, round that number up, and the value in that position is the median.

Q: What is the best time complexity to get median of 2 sorted arrays?

In this basic approach to finding the median of two sorted arrays of different lengths, we have traversed the arrays and counted the first m + n sorted elements of the merged array. As we are traversing only the first m + n elements of the arrays, the time complexity is O(m + n).

Tags:

c++

c

language-agnostic

algorithm

I'm trying to come up with a fast algorithm to compute the quantity b[i]= med |y_i+y_j|, 1<=j!=i<=n when the y_1,...,y_n are sorted already (so b[] is a vector of same length as y[]). I will assume that all elements of y[] are unique and that n is even.

So, the code below computes the b[i]'s the naive (O(n**2)) way: (I wrote this in R for convenience, but I'm language agnostic)

n<-30
a_fast<-b_slow<-rep(NA,n)
y<-sort(rnorm(n,100,1))
z<-y
for(i in 1:n){
    b_slow[i]<-median(abs(y[-i]+y[i]))
}

I have a tentative proposal --below-- for doing it in O(n). But it only works if y[] contains positive numbers.

My question is: how should I change the fast algorithm to also work when y[] contains both positive and negative numbers? Is this even possible?

EDIT:

And the code below the (tentative) O(n) way (I wrote this in R for convenience, but I'm language agnostic)

tryA<-floor(1+(n-1)/2+1)
tryB<-floor(1+(n-1)/2)
medA<-y[tryA]
medB<-y[tryB]
for(i in 1:(tryA-1)){
        a_fast[i]<-medA+y[i]
}
for(i in tryA:n){
        a_fast[i]<-medB+y[i]
}

Simple example:

Simple, illustrative example. If we have a vector of length 4

-3, -1, 2, 4

Then, for example for i=1, the 3 absolute pairwise sums are

  4 1 1

and their median is 1.

Then, for example for i=2, the 3 absolute pairwise sums are

  4 1 3

and their median is 3.

Here is a longer example with both positive and negative y[]:

 -1.27 -0.69 -0.56 -0.45 -0.23  0.07  0.13  0.46  1.56  1.72

and here are my new b_slow[] (this is the ground thruth, computed the naive way):

1.20 0.92 1.00 1.01 0.79 0.53 0.56 0.53 1.33 1.49

but now, my new a_fast[] don't match no more:

 -1.20 -0.62 -0.49 -0.38 -0.16 -0.16 -0.10  0.23  1.33  1.49

EDIT:

Here is my implementation of Francis's solution (up to the point where we have two sorted array, the median of which is easy to compute). I did it in R to stay in the spirit of the question.

Nonetheless, I seem to be missing a correction factor for the index (the ww in the code below) so the code below is sometimes off by a little bit. This is because in the definition above we compute the medians over n-1 observations (i!=j).

 n<-100
 y<-rnorm(n)
 y<-sort(y)

 b<-rep(NA,n)
 #Naive --O(n**2)-- approch:
 for(i in 1:n){
     b[i]<-median(abs(y[-i]+y[i]))
 }

 k<-rep(NA,n)
 i<-1
 k[i]<-min(na.omit(c(which(y+y[i]>0)[1],n))) #binary search: O(log(n)) -- 
 for(i in 2:n){                  #O(n)
     k_prov<-k[i-1]
     while(y[k_prov]+y[i]>0 && k_prov>0)     k_prov<-k_prov-1
     k[i]<-max(k_prov+1,1)
     #for(i in 1:n){ should give the same result.
     #   k[i]<-which(y+y[i]>0)[1]
     #}
 }

 i<-sample(1:n,1)
 x1<--y[1:(k[i]-1)]-y[i]
 x2<-y[i]+y[n:k[i]]
 x3<-c(x1,x2)
 plot(x3)
 ww<-ifelse(i<k[i] & i>n/2,n/2+1,n/2)
 sort(x3)[ww]  #this can be computed efficiently: O(log(n))
 b[i]          #this is the O(n**2) result.

463

asked May 15 '14 16:05

user189035

1 Answers

Here is a O(Nxln(N)xln(N)) solution :

for all i :

1) find item k such as j<k <=> y[j]+y[i]<0 (dichotomy, O(ln(N)))

k separates two sorted lists : one above -y[i], the other below -y[i], for which the sign should be changed to get abs(y[i]+y[j]). Now, we are looking for the median of these lists.

From here, it is just the problem of finding the median of two sorted lists, repeated n times.

2)Let's pick the maximum (M=abs(y[1]-y[i]) or M=abs(y[size]-y[i])) and minimum (m around k) of these lists and restart a dichotomy (O(ln(N)). Let's start by picking the middle (M+m)/2...at any stage, let pick the middle...

3)Stage of this big dichotomy : How many items y[j]+y[i] are above (M+m)/2 in the first list ? Once again a dichotomy... O(ln(N)). How many items -y[j]-y[i] are above (M+m)/2 in the second list ? Guess what ? Dichotomy... Sum these two numbers. If it is above (size-1)/2, m=(M+m)/2. Otherwise M=(M+m)/2.

4)If m=M stop ! b[i]=m;

I guess somebody will come with a better solution...

Edit : I should thank @user189035 for his link to an O(ln(n+m)) algorithm to compute the median of two sorted lists. How to find the kth smallest element in the union of two sorted arrays?

Bye,

answered Oct 12 '22 23:10

francis

Related questions
                            
                                std::lock_guard<std::mutex> segfaults on construction?
                            
                                PostMessage: Guaranteed to be delivered?
                            
                                constexpr with std::array - "Non-type template argument is not a constant expression" [duplicate]
                            
                                L2 data and instruction cache decreased suddenly
                            
                                hash_value function in C++11
                            
                                What protection certificates would be required for an application that consists of setup, loading utility and main application?
                            
                                Const and non-const functors
                            
                                "QComboBox Pop-up" expanding and QtWebkit
                            
                                uncallable template constructor? [duplicate]
                            
                                Can I use Visual C++ compiler without Visual Studio? [closed]
                            
                                Are there cases where constexpr should be avoided, even it it could be used?
                            
                                How to optimize this product of three matrices in C++ for x86?
                            
                                How the has_trivial_default_constructor works?
                            
                                C++ Linked List Node with class
                            
                                How to get a file descriptor from a std::basic_ios for clang on OS X?
                            
                                Suggested max size for stack allocations
                            
                                C++ method overloading: base and derived parameters
                            
                                How to use LZMA SDK in C++?
                            
                                How to permanently override HOMEBREW_CC and HOMEBREW_CXX settings?
                            
                                Could non-static member variable be modified in constexpr constructor (C++14)?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Efficient algorithm to compute the median of pariwise absolute sums of a sorted array

Tags:

c++

c

language-agnostic

algorithm

EDIT:

Simple example:

EDIT:

user189035

People also ask

1 Answers

francis

Recent Activity

Donate For Us