Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to convert Java long's as Strings while keeping natural order

I'm currently looking at a simple programming problem that might be fun to optimize - at least for anybody who believes that programming is art :) So here is it:

How to best represent long's as Strings while keeping their natural order?

Additionally, the String representation should match ^[A-Za-z0-9]+$. (I'm not too strict here, but avoid using control characters or anything that might cause headaches with encodings, is illegal in XML, has line breaks, or similar characters that will certainly cause problems)

Here's a JUnit test case:

@Test
public void longConversion() {
    final long[] longs = { Long.MIN_VALUE, Long.MAX_VALUE, -5664572164553633853L,
            -8089688774612278460L, 7275969614015446693L, 6698053890185294393L,
            734107703014507538L, -350843201400906614L, -4760869192643699168L,
            -2113787362183747885L, -5933876587372268970L, -7214749093842310327L, };

    // keep it reproducible
    //Collections.shuffle(Arrays.asList(longs));

    final String[] strings = new String[longs.length];
    for (int i = 0; i < longs.length; i++) {
        strings[i] = Converter.convertLong(longs[i]);
    }

    // Note: Comparator is not an option
    Arrays.sort(longs);
    Arrays.sort(strings);

    final Pattern allowed = Pattern.compile("^[A-Za-z0-9]+$");
    for (int i = 0; i < longs.length; i++) {
        assertTrue("string: " + strings[i], allowed.matcher(strings[i]).matches());
        assertEquals("string: " + strings[i], longs[i], Converter.parseLong(strings[i]));
    }
}

and here are the methods I'm looking for

public static class Converter {
    public static String convertLong(final long value) {
        // TODO
    }

    public static long parseLong(final String value) {
        // TODO
    }
}

I already have some ideas on how to approach this problem. Still, I though I might get some nice (creative) suggestions from the community.

Additionally, it would be nice if this conversion would be

  • as short as possible
  • easy to implement in other languages

EDIT: I'm quite glad to see that two very reputable programmers ran into the same problem as I did: using '-' for negative numbers can't work as the '-' doesn't reverse the order of sorting:

  1. -0001
  2. -0002
  3. 0000
  4. 0001
  5. 0002
like image 993
sfussenegger Avatar asked Dec 08 '22 04:12

sfussenegger


1 Answers

Ok, take two:

class Converter {
  public static String convertLong(final long value) {
    return String.format("%016x", value - Long.MIN_VALUE);
  }

  public static long parseLong(final String value) {
    String first = value.substring(0, 8);
    String second = value.substring(8);
    long temp = (Long.parseLong(first, 16) << 32) | Long.parseLong(second, 16);
    return temp + Long.MIN_VALUE;
  }
}

This one takes a little explanation. Firstly, let me demonstrate that it is reversible and the resultant conversions should demonstrate the ordering:

for (long aLong : longs) {
  String out = Converter.convertLong(aLong);
  System.out.printf("%20d %16s %20d\n", aLong, out, Converter.parseLong(out));
}

Output:

-9223372036854775808 0000000000000000 -9223372036854775808
 9223372036854775807 ffffffffffffffff  9223372036854775807
-5664572164553633853 316365a0e7370fc3 -5664572164553633853
-8089688774612278460 0fbba6eba5c52344 -8089688774612278460
 7275969614015446693 e4f96fd06fed3ea5  7275969614015446693
 6698053890185294393 dcf444867aeaf239  6698053890185294393
  734107703014507538 8a301311010ec412   734107703014507538
 -350843201400906614 7b218df798a35c8a  -350843201400906614
-4760869192643699168 3dedfeb1865f1e20 -4760869192643699168
-2113787362183747885 62aa5197ea53e6d3 -2113787362183747885
-5933876587372268970 2da6a2aeccab3256 -5933876587372268970
-7214749093842310327 1be00fecadf52b49 -7214749093842310327

As you can see Long.MIN_VALUE and Long.MAX_VALUE (the first two rows) are correct and the other values basically fall in line.

What is this doing?

Assuming signed byte values you have:

  • -128 => 0x80
  • -1 => 0xFF
  • 0 => 0x00
  • 1 => 0x01
  • 127 => 0x7F

Now if you add 0x80 to those values you get:

  • -128 => 0x00
  • -1 => 0x7F
  • 0 => 0x80
  • 1 => 0x81
  • 127 => 0xFF

which is the correct order (with overflow).

Basically the above is doing that with 64 bit signed longs instead of 8 bit signed bytes.

The conversion back is a little more roundabout. You might think you can use:

return Long.parseLong(value, 16);

but you can't. Pass in 16 f's to that function (-1) and it will throw an exception. It seems to be treating that as an unsigned hex value, which long cannot accommodate. So instead I split it in half and parse each piece, combining them together, left-shifting the first half by 32 bits.

like image 85
cletus Avatar answered Dec 09 '22 18:12

cletus