O(n^2) isn't fast enough in solving this. any faster approaches?

Question

I've been trying to solve this problem on ACM Timus

http://acm.timus.ru/problem.aspx?space=1&num=1932

My first approach is O(n^2) which surely isn't fast enough to pass all tests. The below O(n^2) code gives TL on test 10.

import java.util.*;
import java.io.*;

public class testtest
{
    public static void main(String[] args) throws IOException
    {
        BufferedReader rr = new BufferedReader(new InputStreamReader(System.in));
        int n = Integer.parseInt(rr.readLine());
        String[] p = new String[n];
        for (int i = 0; i < n; i ++)
        {
            p[i] = rr.readLine();
        }
        int[] diff = new int[]{0, 0, 0, 0};
        for (int i = 0; i < n - 1; i ++)
        {
            for (int j = i + 1; j < n; j ++)
            {
                int ans  = (p[i].charAt(0) == p[j].charAt(0) ? 0 : 1) +
                           (p[i].charAt(1) == p[j].charAt(1) ? 0 : 1) +
                           (p[i].charAt(2) == p[j].charAt(2) ? 0 : 1) +
                           (p[i].charAt(3) == p[j].charAt(3) ? 0 : 1);
                diff[ans - 1] ++;
            }
        }
        System.out.print(diff[0] + " " + diff[1] + " " + diff[2] + " " + diff[3]);
    }
}

any idea to make this approach faster? I've noticed that only a limited set of characters are allowed in input ('0' .. '9', 'a' .. 'f') so we can create arrays (memory limitation is enough) to do fast checks if characters have been entered before.

Thanks... I don't need actual implementation, just quick idea/thoughts would be great. EDIT: Thanks for great ideas. I've tried improvements on O(n^2) using bit-logics, but still time limit exceeded. The pascal code is below.

program Project2;

{$APPTYPE CONSOLE}

var
  i, j, n, k, bits: integer;
  arr: array[1..65536] of integer;
  diff: array[1..4] of integer;
  a, b, c, d: char;

function g(c: char): integer; inline;
begin
  if ((c >= '0') and (c <= '9')) then
  begin
    Result := Ord(c) - 48;
  end
  else
  begin
    Result := Ord(c) - 87;
  end;
end;

begin
  Readln(n);
  for i := 1 to n do
  begin
    Read(a); Read(b); Read(c); Readln(d);
    arr[i] := g(a) * 16 * 16 * 16 + g(b) * 16 * 16 + g(c) * 16 + g(d);
    for j := 1 to i - 1 do
    begin
      bits := arr[i] xor arr[j];
      k := ((bits or (bits shr 1) or (bits shr 2) or (bits shr 3)) and $1111) mod 15;
      Inc(diff[k]);
    end;
  end;
  Write(diff[1], ' ', diff[2], ' ', diff[3], ' ', diff[4]);
{$IFNDEF ONLINE_JUDGE}
  Readln;
{$ENDIF}
end.

So I guess, I've to try other better suggestions..

EDIT: I have tried Daniel's algorithm and it is very promising, maybe there is a mistake in the code below, It keeps getting Wrong Answer on Test 10... could anybody take a look? Many thanks...

import java.util.*;
import java.io.*;

public class testtest
{
    private static int g(char ch)
    {
        if ((ch >= '0') && (ch <= '9'))
        {
            return (int)ch - 48;
        }
        return (int)ch - 87;
    }

    public static void main(String[] args) throws IOException
    {
        BufferedReader rr = new BufferedReader(new InputStreamReader(System.in));
        int n = Integer.parseInt(rr.readLine());
        int[] p = new int[n];
        int[] all = new int[65536];
        int[][] miss = new int[4][4096];
        int[] g12 = new int[256];
        int[] g13 = new int[256];
        int[] g14 = new int[256];
        int[] g23 = new int[256];
        int[] g24 = new int[256];
        int[] g34 = new int[256];
        int[][] gg = new int[4][16];
        int same3, same2, same1, same0, same4;
        for (int i = 0; i < n; i ++)
        {
            String s = rr.readLine();
            int x = g(s.charAt(0)) * 4096 + g(s.charAt(1)) * 256 + g(s.charAt(2)) * 16 + g(s.charAt(3));
            p[i] = x;
            all[x] ++;
            miss[0][x >> 4] ++;
            miss[1][(x & 0x000F) | ((x & 0xFF00) >> 4)] ++;
            miss[2][(x & 0x00FF) | ((x & 0xF000) >> 4)] ++;
            miss[3][x & 0x0FFF] ++;
            g12[x >> 8] ++;
            g13[((x & 0x00F0) >> 4) | ((x & 0xF000) >> 8)] ++;
            g14[(x & 0x000F) | ((x & 0xF000) >> 8)] ++;
            g23[(x & 0x0FF0) >> 4] ++;
            g24[(x & 0x000F) | ((x & 0x0F00) >> 4)] ++;
            g34[x & 0x00FF] ++;
            gg[0][x >> 12] ++;
            gg[1][(x & 0xF00) >> 8] ++;
            gg[2][(x & 0xF0) >> 4] ++;
            gg[3][x & 0xF] ++;
        }

        same4 = 0;
        for (int i = 0; i < 65536; i ++)
        {
            same4 += (all[i] - 1) * (all[i]) / 2;
        }

        same3 = 0;
        for (int i = 0; i < 4096; i ++)
        {
            same3 += (miss[0][i] - 1) * (miss[0][i]) / 2;
            same3 += (miss[1][i] - 1) * (miss[1][i]) / 2;
            same3 += (miss[2][i] - 1) * (miss[2][i]) / 2;
            same3 += (miss[3][i] - 1) * (miss[3][i]) / 2;
        }

        same2 = 0;
        for (int i = 0; i < 256; i ++)
        {
            same2 += (g12[i] - 1) * g12[i] / 2;
            same2 += (g13[i] - 1) * g13[i] / 2;
            same2 += (g14[i] - 1) * g14[i] / 2;
            same2 += (g23[i] - 1) * g23[i] / 2;
            same2 += (g24[i] - 1) * g24[i] / 2;
            same2 += (g34[i] - 1) * g34[i] / 2;
        }

        same1 = 0;
        for (int i = 0; i < 16; i ++)
        {
            same1 += (gg[0][i] - 1) * gg[0][i] / 2;
            same1 += (gg[1][i] - 1) * gg[1][i] / 2;
            same1 += (gg[2][i] - 1) * gg[2][i] / 2;
            same1 += (gg[3][i] - 1) * gg[3][i] / 2;
        }

        same3 -= 4 * same4;
        same2 -= 6 * same4 + 3 * same3;
        same1 -= 4 * same4 + 3 * same3 + 2 * same2;
        same0 = (int)((long)(n * (n - 1) / 2) - same4 - same3 - same2 - same1);
        System.out.print(same3 + " " + same2 + " " + same1 + " " + same0);
    }
}

EDIT Finally got AC... thanks Daniel for such great algorithm!

Daniel Fischer · Accepted Answer

For small n, a brute-force O(n²) algorithm checking each pair is faster, of course, so one would want to find a good cut-off point for switching algorithms. Without measuring, I'd expect a value between 200 and 3000 due to not-even-back-of-envelope considerations.

Convert the pirate ID to an int by parsing it as a hexadecimal number. Store the IDs in

int[] pirates = new int[n];

First, count the number of pairs of pirates with identical IDs (this step can be omitted here, since there are none by the problem statement).

int[] allFour = new int[65536];
for(int i = 0; i < n; ++i) {
    allFour[pirate[i]] += 1;
}
int fourIdentical = 0;
for(int i = 0; i < 65536; ++i) {
    fourIdentical += allFour[i]*(allFour[i] - 1) / 2;
}

Next, count the pairs of pirates with three identical nibbles in their ID,

int oneTwoThree(int p) {
    return p >> 4;
}
int oneTwoFour(int p) {
    return (p & 0x000F) | ((p & 0xFF00) >> 4);
}
int oneThreeFour(int p) {
    return (p & 0x00FF) | ((p & 0xF000) >> 4);
}
int twoThreeFour(int p) {
    return p & 0x0FFF;
}

int[] noFour  = new int[4096];
int[] noThree = new int[4096];
int[] noTwo   = new int[4096];
int[] noOne   = new int[4096];

for(int i = 0; i < n; ++i) {
    noFour[oneTwoThree(pirate[i])] += 1;
    noThree[oneTwoFour(pirate[i])] += 1;
    noTwo[oneThreeFour(pirate[i])] += 1;
    noOne[twoThreeFour(pirate[i])] += 1;
}

int threeIdentical = 0;
for(int i = 0; i < 4096; ++i) {
    threeIdentical += noFour[i]*(noFour[i]-1) / 2;
}
for(int i = 0; i < 4096; ++i) {
    threeIdentical += noThree[i]*(noThree[i]-1) / 2;
}
for(int i = 0; i < 4096; ++i) {
    threeIdentical += noTwo[i]*(noTwo[i]-1) / 2;
}
for(int i = 0; i < 4096; ++i) {
    threeIdentical += noOne[i]*(noOne[i]-1) / 2;
}

But, every pair of pirates with four identical nibbles has been counted 4 choose 3 = 4 times here, for each of the possible selections of three nibbles, so we need to subtract that (well, not for the problem, but in principle):

threeIdentical -= 4*fourIdentical;

Then, count the pairs of pirates with two identical nibbles in their ID:

int oneTwo(int p) {
    return p >> 8;
}
int oneThree(int p) {
    return ((p & 0x00F0) >> 4) | ((p & 0xF000) >> 8);
}
int oneFour(int p) {
    return (p & 0x000F) | ((p & 0xF000) >> 8);
}
int twoThree(int p) {
    return (p & 0x0FF0) >> 4;
}
int twoFour(int p) {
    return (p & 0x000F) | ((p & 0x0F00) >> 4);
}
int threeFour(int p) {
    return p & 0x00FF;
}

allocate six arrays of 256 ints and count the number of pirates with the corresponding nibbles in places a and b, like

int twoIdentical = 0;
int[] firstTwo = new int[256];
for(int i = 0; i < n; ++i) {
    firstTwo[oneTwo(pirate[i])] += 1;
}
for(int i = 0; i < 256; ++i) {
    twoIdentical += firstTwo[i]*(firstTwo[i] - 1) / 2;
}
// analogous for the other five possible choices of two nibbles

But, the pairs with four identical nibbles have been counted 4 choose 2 = 6 times here, and the pairs with three identical nibbles have been counted 3 choose 2 = 3 times, so we need to subtract

twoIdentical -= 6*fourIdentical + 3*threeIdentical;

Next, the number of pairs with one identical nibble. I trust you can guess what four functions and arrays are needed. Then we will have counted the pairs with four identical nibbles 4 choose 1 = 4 times, the ones with three identical nibbles 3 choose 1 = 3 times, and the pairs with two identical nibbles 2 choose 1 = 2 times, so

oneIdentical -= 4*fourIdentical + 3*threeIdentical + 2*twoIdentical;

Finally, the number of pairs with no identical nibbles is

int noneIdentical = (int)((long)n*(n-1) / 2) - oneIdentical - twoIdentical - threeIdentical - fourIdentical;

(the cast to long to avoid overflow of n*(n-1)).

Neil · Answer

This might be a fast way to compare pairs of identifiers. It assumes that both identifiers have been read in as hexadecimal numbers rather than strings.

int bits = p[i] ^ p[j];
int ans = ((bits | bits >> 1 | bits >> 2 | bits >> 3) & 0x1111) % 15;
diff[ans - 1]++;

Asiri Rathnayake · Answer

How about this:

Create a 4-dimensional array of integers with each dimension being 16 elements wide with all the elements initialized to 0. So, that would be 16*16*16*16*4 = 262144 bytes, which is about 262KB. Now, index (insert) each of the identifiers into this structure as they are being read. When you are indexing, increment the counter at each dimension...

As you are indexing into this structure, you can incrementally calculate the the required answer.

For an example, let's say I'm indexing the identifier dead. On the first dimension, select the position corresponding to d, let's say the counter at this position is n1, this means that so far there have been n1 identifiers with d at position 1. Now take e and insert it into the correct position on the second dimension... and say the counter at this position is n2, then I know for certain that there are n1 - n2 identifiers which differ in 3 positions compared to the current identifier being inserted...

EDIT: I'm beginning to doubt myself. What if two identifiers are different in the first character but equal in the second? This approach will not be able to pick that up me thinks. Needs more thinking...

EDIT: After pausing on a "smart solution", I tried to improve upon the basic O(n^2) algorithm as below.

Each identifier has 4 hexa-decimal digits and therefore can be packed into a single integer (a 2 byte word would be enough though). So, such an integer would look like:

[ ][ ][ ][ ] * [ ][ ][ ][ ] * [ ][ ][ ][ ] * [ ][ ][ ][ ]

Where each [ ] represents a bit and each [ ][ ][ ][ ] group (half-byte) corresponds to a hexa-decimal digit in the identifier. Now we need a way to compare two of these identifiers fast. Given two identifiers a and b, first we calculate c = a ^ b, which will give us the XOR of the two bit patters (the difference, sort of). Now we compare this bit pattern c with the following constant patterns:

u = [1][1][1][1] * [0][0][0][0] * [0][0][0][0] * [0][0][0][0]

v = [0][0][0][0] * [1][1][1][1] * [0][0][0][0] * [0][0][0][0]

w = [0][0][0][0] * [0][0][0][0] * [1][1][1][1] * [0][0][0][0]

x = [0][0][0][0] * [0][0][0][0] * [0][0][0][0] * [1][1][1][1]

The comparison is like below:

diffs = 0

if (c & u)
  diffs++
if (c & v)
  diffs++
if (c & w)
  diffs++
if (c & x)
  diffs++

At the end of this operation, diffs will contain the number of characters from which a and b differs. Note that the same approach can be implemented without packing the identifiers into integers (i.e keep each hexa-decimal digit on it's own integer / char type) but I thought that the memory saved by this packing might mean something.

IMHO, this is only a minor improvement over the basic O(n^2) algorithm. I'd be pretty disappointed if the accepted answer is this sort of a trickery and not an answer based on a better algorithm / data-structure. Eagerly waiting for someone to offer some advice on such an answer... :-)

fgb · Answer

This can be solved with a single pass through the id list, so O(n) time but you still need to be careful so that the implementation runs in time.

Consider the problem of finds pairs of equal Strings. This can be done by keep track of the count of each String previously found in a map:

long pairs = 0;
Set<String, Long> map = new HashMap<String, Long>();
for(String id:ids) {
    if(map.containsKey(id)) {
        pairs += map.get(id);
        map.put(id, map.get(id) + 1);
    } else {
        map.put(id, 1);
    }
}

Now to find pairs that differ in one character. You can store Strings excluding each character. So for "abcd", store ".bcd", "a.cd", "ab.d", and "abc." in the Map. For each id, you can add up the counts of each String, for each missing character position. This will also count equal Strings, so subtract the count of equal Strings.

Generalize to more missing characters using the Inclusion/Exclusion principle. For example, the count of Strings with 2 different characters from s is:

sum of all character positions `x` and `y`:
    the number of strings equal to `s` excluding characters `x` and `y`
    - the number of strings equal to `s` excluding character `x`
    - the number of strings equal to `s` excluding character `y`
    + the number of strings equal to `s`

For efficiency reasons, you don't want to do string manipulation so the actual implementation should convert the String to integers. Also, instead of a Map you can use a long array (where the index represents the String, and the value is the count of matching Strings).

O(n^2) isn't fast enough in solving this. any faster approaches?

Tags:

java

string

algorithm

justyy

4 Answers

Daniel Fischer

Neil

Asiri Rathnayake

fgb

Recent Activity

Donate For Us

O(n^2) isn't fast enough in solving this. any faster approaches?

Tags:

java

string

algorithm

justyy

4 Answers

Daniel Fischer

Neil

Asiri Rathnayake

fgb

Related questions

Recent Activity

Donate For Us