Can someone give me ideas for the following problem? Given an array <code>ar[]</code> of length <code>n</code>, and some queries, each query is of the form <code>a, b, c</code>, find the number with the smallest index <code>i</code> such that the index lies in the range <code>[c, n]</code> and such that <code>a < ar[i] < b</code>. There are <code>(n - 1)</code> in total, <code>c</code> goes from <code>1</code> to <code>n - 1</code>. The expected complexity for each query should be about <code>O(log n)</code>, and a precomputation of complexity at most <code>O(n log n)</code> is suitable. Intuitively, segment tree came up to my mind, but I couldn't think of a way of building it, nor what to keep in each node.

I've modified Mason Bryant's technique to something that works. The problems were more bugs in FindLowestIndex and a more major bug searching the tree (it can return multiple results). Despite doing that work, it still doesn't actually solve the problem. <code>O(n log n)</code> time setting up is easy enough, but using this technique I'm only able to get <code>O((log n)^2)</code> query time. I wonder if you have a link to the original problem in case there are more clarifications there? Or I wonder if the problem is really solvable. Or maybe <code>O((log n)^2)</code> is "about" <code>O(log n)</code> as the problem requests, it's less than <code>O(n)</code> anyway. The technique is to store our array in a typical segment tree, but along with the usual segment information we also store, in order, all the indexes of the elements under each node. This extra storage only takes an extra <code>O(n log n)</code> time/space if you add it up properly (n items stored at each of log n levels), so it doesn't hurt our setup time in any way that matters. Then we query the tree to find the minimum set of nodes that are contained by our range of (a, b). This query takes about the same time as a typical segment tree query (<code>O(log n)</code>) and finds at most about 2*log n matching segments. As we query, we find the lowest index in each matching segment that matches our constraint c. We can use a binary search to find this index since we kept the indices in order, so it takes <code>O(log n)</code> time worst case for each matching node. When we add it all up appropriately, the total time is <code>O((log n)^2)</code>. Let me know whichever steps you'd like clarified. C# Code: <pre class="prettyprint lang-cs prettyprint-override"><code>void Test() { DateTime start; TimeSpan at = new TimeSpan(), bt = new TimeSpan(); Random rg = new Random(); for(int i = 0; i < 20; i++) { // build arrays from 5 to 10000 elements long, random values between 0 and 100 // Break even time for queries is around 10000 elements in array for the values and queries used int[] array = (from n in Enumerable.Range(1, rg.Next(5, 10000)) select rg.Next(0, 100)).ToArray<int>(); // Setup should be O(n log n) time/space ArraySearcher Searcher = new ArraySearcher(array); // Test each array a number of times equal to its length, with random values for a, b, c for(int j = 0; j < array.Length; j++) { int a = rg.Next(-1, 101), b = rg.Next(a, 102), c = rg.Next(0, array.Length); start = DateTime.Now; int expected = BruteSearch(array, a, b, c); at += DateTime.Now - start; // Search should be O(log n) time, but in reality is O((log n)^2) :( start = DateTime.Now; int got = Searcher.QuickSearch(a, b, c); bt += DateTime.Now - start; System.Diagnostics.Debug.Assert(got == expected); } } MessageBox.Show(at.ToString() + ", " + bt.ToString()); } int BruteSearch(int[] array, int a, int b, int c) { for(int i = c; i < array.Length; i++) if(a < array[i] && array[i] < b) return i; return -1; } class ArraySearcher { SegmentNode Root; List<SegmentNode> Nodes; public ArraySearcher(int[] array) { Nodes = array.Select((value, index) => new SegmentNode(value, value, new List<int> { index }, null, null)).ToList<SegmentNode>(); // Sorting will take us O(n log n) Nodes.Sort(); // Creating a normal segment tree takes O(n log n) // In addition, in this tree each node stores all the indices below it in order // There are a total of n of these indices at each tree level, kept in order as we go at no extra time cost // There are log n levels to the tree // So O(n log n) extra time and space is spent making our lists, which doesn't hurt our big O notation compared to just sorting this.Root = SegmentNode.Merge(Nodes, 0, Nodes.Count - 1); } public int QuickSearch(int a, int b, int c) { return this.Root.FindLowestIndex(a, b, c); } class SegmentNode : IComparable { public int Min, Max; public List<int> Indices; public SegmentNode Left, Right; public SegmentNode(int Min, int Max, List<int> Indices, SegmentNode Left, SegmentNode Right) { this.Min = Min; this.Max = Max; this.Indices = Indices; this.Left = Left; this.Right = Right; } int IComparable.CompareTo(object other) { return this.Min - ((SegmentNode)other).Min; } public static SegmentNode Merge(List<SegmentNode> Nodes, int Start, int End) { int Count = End - Start; if(Start == End) return Nodes[Start]; if(End - Start == 1) return SegmentNode.Merge(Nodes[Start], Nodes[End]); return SegmentNode.Merge(SegmentNode.Merge(Nodes, Start, Start + Count/2), SegmentNode.Merge(Nodes, Start + Count/2 + 1, End)); } public static SegmentNode Merge(SegmentNode Left, SegmentNode Right) { int LeftCounter = 0, RightCounter = 0; List<int> NewIndices = new List<int>(); while(LeftCounter < Left.Indices.Count || RightCounter < Right.Indices.Count) { if(LeftCounter < Left.Indices.Count && (RightCounter == Right.Indices.Count || Left.Indices[LeftCounter] < Right.Indices[RightCounter])) NewIndices.Add(Left.Indices[LeftCounter++]); else NewIndices.Add(Right.Indices[RightCounter++]); } return new SegmentNode(Left.Min, Right.Max, NewIndices, Left, Right); } public int FindLowestIndex(int SeekMin, int SeekMax, int c) { // This will find at most O(log n) matching segments, always less than 2 from each level of the tree // Each matching segment is binary searched in at worst O(log n) time // Total does indeed add up to O((log n)^2) if you do it right if(SeekMin < this.Min && SeekMax > this.Max) return FindLowestIndex(this.Indices, c); if(SeekMax <= this.Min || SeekMin >= this.Max) return -1; int LeftMatch = this.Left.FindLowestIndex(SeekMin, SeekMax, c); int RightMatch = this.Right.FindLowestIndex(SeekMin, SeekMax, c); if(LeftMatch == -1) return RightMatch; if(RightMatch == -1) return LeftMatch; return LeftMatch < RightMatch ? LeftMatch : RightMatch; } int FindLowestIndex(List<int> Indices, int c) { int left = 0, right = Indices.Count - 1, mid = left + (right - left) / 2; while(left <= right) { if(Indices[mid] == c) return c; if(Indices[mid] > c) right = mid - 1; else left = mid + 1; mid = left + (right - left) / 2; } if(mid >= Indices.Count) return -1; // no exact match, so return the next highest. return Indices[mid]; } } } </code></pre>

Searching an array for integers in a specific range

Q: How do you search an array for a specific element?

Use filter if you want to find all items in an array that meet a specific condition. Use find if you want to check if that at least one item meets a specific condition. Use includes if you want to check if an array contains a particular value. Use indexOf if you want to find the index of a particular item in an array.

Q: What is the most efficient way to search an array of numbers?

Sequential search is the best that we can do when trying to find a value in an unsorted array. 1 But if the array is sorted in increasing order by value, then we can do much better. We use a process called binary search.

Tags:

arrays

algorithm

search

tree

Can someone give me ideas for the following problem?

Given an array ar[] of length n, and some queries, each query is of the form a, b, c, find the number with the smallest index i such that the index lies in the range [c, n] and such that a < ar[i] < b. There are (n - 1) in total, c goes from 1 to n - 1. The expected complexity for each query should be about O(log n), and a precomputation of complexity at most O(n log n) is suitable. Intuitively, segment tree came up to my mind, but I couldn't think of a way of building it, nor what to keep in each node.

739

asked Dec 14 '11 12:12

Chris

2 Answers

Ok, I thought I should implement the O(nlogn)/O(logn) solution I discussed with user12861.

The code works by building n trees, one for each c. Each tree shares most of its content with the smaller ones, with a maximum of logn new nodes. This way the total memory and time cost for preprocessing is limited to O(nlogn).

The actual trees are quite similar to interval trees. The nodes store the minimum and maximum value they span over, and the smallest index they contain. My implementation is however not self-balancing, but this can be added using your favorite heuristics.

Click to copy

import sys

class Node:
    pass
class Interval(Node):
    def __init__(self,left,val,i,right):
        self.i = i; self.val = val
        self.left = left; self.right = right
        self.mini = min(left.mini, i, right.mini)
        self.min = min(left.min, val, right.min)
        self.max = max(left.max, val, right.max)
    def add(self,val,i):
        # We do naive inserting here. In a real worst case logn solution,
        # self-balancing would take place here.
        if (val,i) < (self.val,self.i): return Interval(self.left.add(val,i), self.val, self.i, self.right)
        if (self.val,self.i) < (val,i): return Interval(self.left, self.val, self.i, self.right.add(val,i))
        return self
    def query(self,a,b):
        # If the entire interval is covered, return mini, this is what keeps us
        # at max 2 opened nodes per level in the tree.
        if a <= self.min and self.max <= b: return self.mini
        # Otherwise compose an answer from the subtrees.
        res = sys.maxint
        if a <= self.val <= b: res = min(res, self.i)
        if self.left.min <= b and self.left.max >= a: res = min(res, self.left.query(a,b))
        if self.right.min <= b and self.right.max >= a: res = min(res, self.right.query(a,b))
        return res
class Tip(Node):
    def __init__(self):
        # This is a typical null-pattern idiom. The self.values are set to
        # the min/max identities to ensure to interference with the upwards
        # value propagation.
        self.mini = sys.maxint
        self.min = sys.maxint
        self.max = -sys.maxint
    def add(self,val,i):
        return Interval(Tip(), val, i, Tip())
    def query(self,a,b):
        return sys.maxint

# The input data array
data = [0, 3, 1, 2, 0, 4, 9, 6, 1, 7]
N = len(data)

# Build the array of trees. O(nlogn)
ar = [None]*N + [Tip()]
for i in range(N-1, -1, -1):
    ar[i] = ar[i+1].add(data[i],i)

# The query function. O(logn)
def query(a,b,c):
    return ar[c].query(a,b)

# Test
print query(2, 7, 4) # = 5

174

answered Oct 17 '22 07:10

Thomas Ahle

I've modified Mason Bryant's technique to something that works. The problems were more bugs in FindLowestIndex and a more major bug searching the tree (it can return multiple results).

Despite doing that work, it still doesn't actually solve the problem. O(n log n) time setting up is easy enough, but using this technique I'm only able to get O((log n)^2) query time. I wonder if you have a link to the original problem in case there are more clarifications there? Or I wonder if the problem is really solvable. Or maybe O((log n)^2) is "about" O(log n) as the problem requests, it's less than O(n) anyway.

The technique is to store our array in a typical segment tree, but along with the usual segment information we also store, in order, all the indexes of the elements under each node. This extra storage only takes an extra O(n log n) time/space if you add it up properly (n items stored at each of log n levels), so it doesn't hurt our setup time in any way that matters. Then we query the tree to find the minimum set of nodes that are contained by our range of (a, b). This query takes about the same time as a typical segment tree query (O(log n)) and finds at most about 2*log n matching segments. As we query, we find the lowest index in each matching segment that matches our constraint c. We can use a binary search to find this index since we kept the indices in order, so it takes O(log n) time worst case for each matching node. When we add it all up appropriately, the total time is O((log n)^2).

Let me know whichever steps you'd like clarified.

C# Code:

Click to copy

void Test() {
  DateTime start;
  TimeSpan at = new TimeSpan(), bt = new TimeSpan();

  Random rg = new Random();
  for(int i = 0; i < 20; i++) {
    // build arrays from 5 to 10000 elements long, random values between 0 and 100
    // Break even time for queries is around 10000 elements in array for the values and queries used
    int[] array = (from n in Enumerable.Range(1, rg.Next(5, 10000)) select rg.Next(0, 100)).ToArray<int>();

    // Setup should be O(n log n) time/space
    ArraySearcher Searcher = new ArraySearcher(array);

    // Test each array a number of times equal to its length, with random values for a, b, c
    for(int j = 0; j < array.Length; j++) {
      int a = rg.Next(-1, 101), b = rg.Next(a, 102), c = rg.Next(0, array.Length);
      start = DateTime.Now;
      int expected = BruteSearch(array, a, b, c);
      at += DateTime.Now - start;
      // Search should be O(log n) time, but in reality is O((log n)^2) :(
      start = DateTime.Now;
      int got = Searcher.QuickSearch(a, b, c);
      bt += DateTime.Now - start;
      System.Diagnostics.Debug.Assert(got == expected);
    }
  }
  MessageBox.Show(at.ToString() + ", " + bt.ToString());
}

int BruteSearch(int[] array, int a, int b, int c) {
  for(int i = c; i < array.Length; i++)
    if(a < array[i] && array[i] < b)
      return i;
  return -1;
}

class ArraySearcher {

  SegmentNode Root;
  List<SegmentNode> Nodes;

  public ArraySearcher(int[] array) {
    Nodes = array.Select((value, index) => new SegmentNode(value, value, new List<int> { index }, null, null)).ToList<SegmentNode>();
    // Sorting will take us O(n log n)
    Nodes.Sort();
    // Creating a normal segment tree takes O(n log n)
    // In addition, in this tree each node stores all the indices below it in order
    // There are a total of n of these indices at each tree level, kept in order as we go at no extra time cost
    // There are log n levels to the tree
    // So O(n log n) extra time and space is spent making our lists, which doesn't hurt our big O notation compared to just sorting
    this.Root = SegmentNode.Merge(Nodes, 0, Nodes.Count - 1);
  }

  public int QuickSearch(int a, int b, int c) {
    return this.Root.FindLowestIndex(a, b, c);
  }

  class SegmentNode : IComparable {

    public int Min, Max;
    public List<int> Indices;
    public SegmentNode Left, Right;

    public SegmentNode(int Min, int Max, List<int> Indices, SegmentNode Left, SegmentNode Right) {
      this.Min = Min;
      this.Max = Max;
      this.Indices = Indices;
      this.Left = Left;
      this.Right = Right;
    }

    int IComparable.CompareTo(object other) {
      return this.Min - ((SegmentNode)other).Min;
    }

    public static SegmentNode Merge(List<SegmentNode> Nodes, int Start, int End) {
      int Count = End - Start;
      if(Start == End)
        return Nodes[Start];
      if(End - Start == 1)
        return SegmentNode.Merge(Nodes[Start], Nodes[End]);
      return SegmentNode.Merge(SegmentNode.Merge(Nodes, Start, Start + Count/2), SegmentNode.Merge(Nodes, Start + Count/2 + 1, End));
    }

    public static SegmentNode Merge(SegmentNode Left, SegmentNode Right) {
      int LeftCounter = 0, RightCounter = 0;
      List<int> NewIndices = new List<int>();
      while(LeftCounter < Left.Indices.Count || RightCounter < Right.Indices.Count) {
        if(LeftCounter < Left.Indices.Count && (RightCounter == Right.Indices.Count || Left.Indices[LeftCounter] < Right.Indices[RightCounter]))
          NewIndices.Add(Left.Indices[LeftCounter++]);
        else
          NewIndices.Add(Right.Indices[RightCounter++]);
      }
      return new SegmentNode(Left.Min, Right.Max, NewIndices, Left, Right);
    }

    public int FindLowestIndex(int SeekMin, int SeekMax, int c) {
      // This will find at most O(log n) matching segments, always less than 2 from each level of the tree
      // Each matching segment is binary searched in at worst O(log n) time
      // Total does indeed add up to O((log n)^2) if you do it right
      if(SeekMin < this.Min && SeekMax > this.Max)
        return FindLowestIndex(this.Indices, c);
      if(SeekMax <= this.Min || SeekMin >= this.Max)
        return -1;
      int LeftMatch = this.Left.FindLowestIndex(SeekMin, SeekMax, c);
      int RightMatch = this.Right.FindLowestIndex(SeekMin, SeekMax, c);
      if(LeftMatch == -1)
        return RightMatch;
      if(RightMatch == -1)
        return LeftMatch;
      return LeftMatch < RightMatch ? LeftMatch : RightMatch;
    }

    int FindLowestIndex(List<int> Indices, int c) {
      int left = 0, right = Indices.Count - 1, mid = left + (right - left) / 2;
      while(left <= right) {
        if(Indices[mid] == c)
          return c;
        if(Indices[mid] > c)
          right = mid - 1;
        else
          left = mid + 1;
        mid = left + (right - left) / 2;
      }
      if(mid >= Indices.Count)
        return -1;
      // no exact match, so return the next highest.
      return Indices[mid];
    }
  }
}

answered Oct 17 '22 08:10

user12861

Related questions
                            
                                Find an element in an array, but the element can jump
                            
                                Why can one not assign to arrays inside maps in Go?
                            
                                How a big array allocates memory?
                            
                                Foreach string replacement with multiple variations
                            
                                Understanding memory usage of MySQL result in PHP (PDO)
                            
                                SCSS :nth-child mixin with array
                            
                                CSS Variable arrays - possible?
                            
                                Convert array values to object keys
                            
                                Why doesn't array conform to Equatable, when its items are Equatable in Swift?
                            
                                Why does indexing in an array break type safety in TypeScript?
                            
                                How do I use a default array if no command line arguments are given?
                            
                                Why don't C++ compilers optimize away reads and writes to struct data members as opposed to distinct local variables?
                            
                                Remove elements/objects From Array in ElasticSearch Followed by Matching Query
                            
                                Numpy: find indeces of mask edges
                            
                                Does the "cast to first member of standard layout" type punning rule extend to arrays?
                            
                                Getting Values From Json Data Inside Array in Mysql
                            
                                What is the standard way to set an array of pointers to NULL?
                            
                                How do arrays in JavaScript work ? (ie without a starting dimension size)
                            
                                How to deduce array size from an enum template argument?
                            
                                Arrays as separate type

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With