Maximize AND on a sequence of XORs

Question

Problem

We are given 2 arrays a and b both of length n. We build a third array c by rearranging the values in b. The goal is to find the optimal c that maximizes

result = (a[0] ^ c[0]) & (a[1] ^ c[1]) & ... & (a[n - 1] ^ c[n - 1])

where ^ is XOR and & is AND.

Is it possible to do this efficiently? It's straightforward to iterate through all possible permutations of b, but this is infeasible for large n.

More details

The order of the values in a is fixed.
The order of the values in b may be rearranged to form c. That is, starting with b = [1, 2, 3], it may be that the maximum result is obtained when the values are rearranged to c = [2, 1, 3].
b may be rearranged in-place if needed (I do this in the sample code).
Since the optimal c is not necessarily unique, any optimal c may be returned.
Assume all values are 32-bit unsigned integers.
1 <= n <= 10,000.

Test cases

Input:
a = [3, 4, 5]
b = [6, 7, 8]
Output:
c = [8, 7, 6] (result = 3)

Input:
a = [1, 11, 7, 4, 10, 11]
b = [6, 20, 8, 9, 10, 7]
Output:
c = [8, 6, 10, 9, 7, 20] (result = 9)

Input:
a = [0, 1, 2, 4, 8, 16]
b = [512, 256, 128, 64, 32, 16]
Output:
c = [16, 32, 64, 128, 256, 512] (result = 0)

Sample code

And here is the C++ code I used for a naive solution that exhaustively tries all permutations of b (tested on Windows 10 with Visual Studio 2019):

#include <algorithm>  // next_permutation
#include <cstdint>  // uint32_t
#include <iostream>  // i/o
#include <vector>  // vector

using uint = std::uint32_t;
using uvec = std::vector<uint>;

uint andxor(const uvec& a, const uvec& c)
{
    // Start with all bits set
    uint result = -1;

    for (std::size_t i = 0; i < c.size(); ++i)
    {
        result &= a[i] ^ c[i];
    }
    return result;
}

uvec solvePermute(const uvec& a, uvec& b)
{
    // next_permutation expects a pre-sorted input
    std::sort(b.begin(), b.end());

    // Initialize the result with the first permutation of b
    uvec c = b;
    uint bestResult = andxor(a, c);

    // Try all permutations of b to maximize the result of andxor
    while (std::next_permutation(b.begin(), b.end()))
    {
        uint result = andxor(a, b);
        if (result > bestResult)
        {
            bestResult = result;
            c = b;
        }
    }

    return c;
}

int main()
{
    // First test case
    uvec a{ 3, 4, 5 };
    uvec b{ 6, 7, 8 };

    uvec c = solvePermute(a, b);
    uint bestResult = andxor(a, c);

    std::cout << "Maximum result is " << bestResult << " with c = ";
    for (uint ci : c)
    {
        std::cout << ci << " ";
    }

    return 0;
}

Nico Schertler · Accepted Answer

Dillon's answer already has the most important ideas. Using these ideas, we can solve the problem in linear time and space.

The key goal is to make highly significant bits of the result 1, regardless of whether we sacrifice lesser significant bits. If we focus on a single bit k, then we can do the following:

partition the numbers in a that have their k-th bit set to 1 (a1) and 0 (a0), respectively
partition the numbers in b that have their k-th bit set to 1 (b1) and 0 (b0), respectively
if the number of entries in a1 is equal to those in b0, we can set the k-th bit in the result to 1. In this case, we consider the partitioning successful.

The partitions have the following meaning: In our final c, we need to match the entries of a1 to the entries of b0 and the entries of a0 to b1. If we do this, the XOR operation will result in 1 for all entries and the AND operation will produce an overall 1.

No, how can we use this insight in an algorithm? I choose to represent the partitioning of a by indices (i.e., the partitioning is a set of sets of indices) and the partitioning of b by actual numbers. Initially, we start with a partitioning with only one set each (i.e., the partitioning for a has the set of all indices and the partitioning for b has b as an element). We start with the most significant bit and try to do the partitioning.

If we have a successful partitioning, we end up with two partitions for both a and b (one of them might be empty). Then we already know which numbers from b we can put at what indices. If we violate this result, we get a lesser end result.

If our partitioning is not successful, we simply ignore this step.

Now let's go on the the next bit. We might already have a partitioning that does not only have the initial sets, but something more fine-grained. We don't want to mix the partitions up. So we can partition the partitions with the same approach as before. If we are successful for all partitions, we use the sub-partitions from now on. If not, we use the original partitions.

If we do this for all bits, we end up with a mapping for the numbers in b and the indices they can be placed in to achieve the maximum final result. It might not be a unique mapping. If a partition contains more than one element, any mapping will produce the maximum. So we just need to choose one and have the result.

Here is one example from your question:

a = {    1,    11,     7,     4,    10,    11  }
  = { 0001b, 1011b, 0111b, 0100b, 1010b, 1011b }
b = {    6,     20,     8,     9,    10,     7  }
  = { 0110b, 10100b, 1000b, 1001b, 1010b, 0111b }

And here are the most important algorithm steps:

               index partitioning    |   b partitioning
 -----------+------------------------+-----------------------
initial     | { 0, 1, 2, 3, 4, 5 }   |  {6, 20, 8, 9, 10, 7 }
------------+------------------------+-----------------------
after bit 3 | { 1, 4, 5 }            |  { 6, 20, 7 }
            | { 0, 2, 3 }            |  { 8, 9, 10 }
------------+------------------------+-----------------------
after bit 0 | { 1, 5 }               |  { 6, 20 }
(final)     | { 4 }                  |  { 7 }
            | { 0, 2 }               |  { 8, 10 }
            | { 3 }                  |  { 9 }

So we have a non-unique case. Numbers 6 and 20 could go both at indices 1 and 5. But number 7 must definitely go at index 4. One solution would be:

c = { 8, 6, 10, 9, 7, 20 }

Checking:

a = { 0001b, 1011b, 0111b, 0100b, 1010b,  1011b }
XOR
c = { 1000b, 0110b, 1010b, 1001b, 0111b, 10100b }
-------------------------------------------------
    { 1001b, 1101b, 1101b, 1101b, 1101b, 11111b }

AND = 1001b = 9

And here is some example C++ code. Note that the code's focus is on understandability. There are a few things that can be implemented more efficiently.

#include <iostream>
#include <vector>
#include <cstdint>

struct Partition
{
    std::vector<size_t> indices;
    std::vector<uint32_t> bs;
};

struct Partitioning
{
    bool success;
    Partition p1;
    Partition p2;
};

Partitioning partition(const std::vector<uint32_t>& a, const std::vector<size_t>& indices, const std::vector<uint32_t>& b, size_t bit)
{
    uint32_t mask = 1 << bit;

    Partitioning result;

    // partition the indices of a
    for (size_t i : indices)
    {
        uint32_t n = a[i];
        if (n & mask)
            result.p1.indices.push_back(i);
        else
            result.p2.indices.push_back(i);
    }
    
    // partition b
    for (uint32_t n : b)
        if (n & mask)
            result.p2.bs.push_back(n);
        else
            result.p1.bs.push_back(n);

    // check if we are successful
    bool canMakeBit1 = result.p1.indices.size() == result.p1.bs.size();
    result.success = canMakeBit1;

    return result;
}

void findMax(const std::vector<uint32_t>& a, const std::vector<uint32_t>& b)
{
    std::vector<uint32_t> aIndices(a.size());
    for (size_t i = 0; i < a.size(); ++i)
        aIndices[i] = i;

    // current partitioning
    std::vector<std::vector<uint32_t>> partsIndices;
    partsIndices.push_back(aIndices);
    std::vector<std::vector<uint32_t>> partsBs;
    partsBs.push_back(b);

    // temporary partitionings
    std::vector<Partitioning> partitionings;

    // assume 32 bits
    size_t bit = 32;
    do
    {
        --bit;

        bool success = true;
        partitionings.clear();
        
        // try to partition all current partitions
        for (size_t i = 0; i < partsIndices.size(); ++i)
        {
            partitionings.push_back(partition(a, partsIndices[i], partsBs[i], bit));
            if (!partitionings.back().success)
            {
                success = false;
                break;
            }
        }

        // if all partitionings are successful
        if (success)
        {
            // replace the current partitioning with the new one
            partsIndices.clear();
            partsBs.clear();
            for (auto& p : partitionings)
            {
                if (p.p1.indices.size() > 0)
                {
                    partsIndices.push_back(p.p1.indices);
                    partsBs.push_back(p.p1.bs);
                }

                if (p.p2.indices.size() > 0)
                {
                    partsIndices.push_back(p.p2.indices);
                    partsBs.push_back(p.p2.bs);
                }
            }
        }
    } while (bit > 0);

    // Generate c
    std::vector<uint32_t> c(a.size());    
    for (size_t i = 0; i < partsIndices.size(); ++i)
    {
        const auto& indices = partsIndices[i];
        const auto& bs = partsBs[i];
        for (size_t j = 0; j < indices.size(); ++j)
        {
            c[indices[j]] = bs[j];
        }
    }

    // Print the result
    uint32_t result = 0xffffffff;
    for (size_t i = 0; i < a.size(); ++i)
    {
        std::cout << c[i] << " ";
        result = result & (a[i] ^ c[i]);
    }
    std::cout << std::endl << result << std::endl;
}

int main()
{
    std::vector<uint32_t> a = { 1, 11, 7, 4, 10, 11 };
    std::vector<uint32_t> b = { 6, 20, 8, 9, 10, 7 };
    
    findMax(a, b);

    return 0;
}

Dillon · Answer

TL;DR

I think it's possible to reformulate this as an assignment problem, for which the optimal solution can be found in O(n^3) time. But I did not attempt this in my answer.

Disclaimer

The approach I'll be describing still involves checking permutations. It generally seems to require fewer iterations than the naive approach, but the additional overhead of my solution might actually slow it down overall. I have not compared the runtime of my approach to the naive approach, and I have not exhaustively checked to see if it's free of errors (it seems to work fine for the 3 provided test cases).

That said, I did have some insights along the way, so maybe my attempt at this will help someone else come up with a more efficient solution. Hopefully my explanation is clear and I'm not just rambling.

General idea

Imagine we create an undirected bipartite graph where the two independent sets of nodes correspond to a and b, and each node is an array element.

Let's consider one bit at a time, say, the least-significant bit (LSB). We'll label the LSB as bit 0. Let's also consider the first test case (and for simplicity, we'll only consider the lowest 4 bits instead of all 32):

Input:
a = [3, 4, 5]  // binary: [0011, 0100, 0101]
b = [6, 7, 8]  // binary: [0110, 0111, 1000]

Our graph has 3 nodes in the a set labelled 3, 4, and 5; and it has 3 nodes in the b set labelled 6, 7, and 8. We draw an edge between nodes if the chosen bit (bit 0) of the a node is different from the chosen bit of the b node. For example, we would draw an edge between nodes 3 and 6, because the LSB of 3 (0011) is different from the LSB of 6 (0110). We would not draw an edge between nodes 3 and 7, because the LSB of 3 (0011) is the same as the LSB of 7 (0111). If we work this out all the way, we end up with the following adjacency list for bit 0:

3: 6, 8
4: 7
5: 6, 8

We have two sets of edges that can be made from each node in a:

If the chosen bit is 1, draw an edge to all nodes in b where the chosen bit is 0.
If the chosen bit is 0, draw an edge to all nodes in b where the chosen bit is 1.

We can observe that the chosen bit could potentially be 1 in the final, optimal result if and only if the number of a nodes for which that bit is 1 is the same as the number of b nodes for which that bit is 0. Otherwise, that bit cannot be 1 in the final result, as we would have to pair at least one a node with a same-bit-valued b node, producing 0 for that bit after XOR and thus 0 for that bit after AND.

Now, if we create such a graph for each of the 4 bits and determine which bits are candidates for being 1 in the final result, this gives us an upper bound on the optimal result. This also gives us an obvious lower bound on the result, as we know we could at least find an ordering of b that results in the most-significant candidate bit being set.

The adjacency lists for each bit in our test case are given below.

Bit 0 (LSB)
3: 6, 8
4: 7
5: 6, 8
(candidate bit)

Bit 1
3: 8
4: 6, 7
5: 6, 7
(candidate bit)

Bit 2
3: 6, 7
4: 8
5: 8
(NOT candidate)

Bit 3 (MSB)
3: 8
4: 8
5: 8
(NOT candidate)

Upper bound: 3 (both candidate bits 0 and 1 are set)
Lower bound: 2 (only candidate bit 1 is set)

Finding `c`

To get the optimal c, we need to start with the most-significant candidate bit. We loop through all valid cs (permutations of b where we adhere to the adjacency list for that bit), of which there would hopefully be relatively few compared to the total number of possible permutations of b.

For each of these permutations, we see if any of them are valid for the next-most-significant candidate bit. If none of them are, then we check the bit after that, and so on. If no other bit can be included while adhering to the adjacency list for any permutation on the MSB, then only the MSB can be 1 in the final result and that's the solution. Otherwise, we want to prioritize permutations that work for bits that are more significant, and we need to recursively check the permutations that are valid for all bits up to that point.

What does it mean for a permutation to be "valid"? Essentially, the adjacency list acts as a constraint on the mapping from a to c. If the constraints of one candidate bit do not conflict with those of another candidate bit, we know those bits can both be 1 in the final result. For example, looking at bits 0 and 1, we see that there is a way to satisfy both of these mappings (namely, 3: 8; 4: 7; 5: 6). Therefore, that permutation (c = [8, 7, 6]) is valid for both bits 0 and 1. (And indeed, this permutation turns out to be the optimal one.)

For an example of conflicting constraints, consider bits 0 and 2. Bit 2 requires node 4 to be connected to node 8. That is, at index i which satisfies a[i] == 4, we need to set c[i] = 8 in order for bit 2 to be set in the final result. However, bit 0 requires node 4 to be connected to node 7. This is a conflict, so bits 0 and 2 cannot both be set in the final result. (But bit 2 wasn't a candidate bit anyway, because two of the a nodes (4 and 5) have a 1 for that bit but only one of the b nodes (8) has a 0 for that bit.)

Possible relation to solved graph theory problems

I'm not very familiar with graph theory problems, but it seems to me like this bipartite graph formulation of the problem is related to maximum weighted bipartite matching also known as the assignment problem. Though, our edges are currently unweighted. Maybe an edge could be weighted according to how many bits it exists for? Perhaps a greater weight would be given to more significant bits? I think we would still need to only consider graphs from candidate bits.

Essentially, build a nxn adjacency matrix from the bipartite graph of each candidate bit. Assign uniform weight pow(2, bit) to the edges; so edge weights are 1 for bit 0, 2 for bit 1, 4 for bit 2, and so on. This ensures that more significant candidate bits are preferred over any combination of lesser ones. Then combine the weights across all adjacency matrices to build the final representative adjacency matrix. This would be the input to the assignment problem.

If this comparison is valid, there is the Hungarian algorithm for optimally solving the assignment problem in O(n^3) time. This is significantly better than the O(n!) time needed for the naive permutations approach.

Messy code

For the sake of completeness, I'm including the code I wrote for my proposed approach, although I think it's more worthwhile investigating the potential of reformulating this problem as an assignment problem. To use this code, start from the sample code provided in the question and copy this code into the same file. Inside main, call solvePermuteLess instead of solvePermute.

#include <unordered_map>
#include <unordered_set>

uint recursiveVerification(size_t currentBit,
    std::unordered_map<size_t, size_t>& a2bConstraints,
    const uvec& possibleBits,
    const std::unordered_map<uint, std::vector<size_t>>& aSetMap,
    const std::unordered_map<uint, std::vector<size_t>>& bUnsetMap)
{
    uint bestScore = 0;
    std::unordered_map<size_t, size_t> a2bBest = a2bConstraints;

    uint upperBoundScore = 0;
    for (uint bit = currentBit; bit < possibleBits.size(); ++bit)
    {
        upperBoundScore += 1u << possibleBits[possibleBits.size() - bit - 1];
    }

    for (size_t bit = currentBit; bit < possibleBits.size(); ++bit)
    {
        uint bitValue = possibleBits[possibleBits.size() - bit - 1];
        std::vector<size_t> aSet = aSetMap.at(bitValue);
        std::vector<size_t> bUnset = bUnsetMap.at(bitValue);
        std::sort(bUnset.begin(), bUnset.end());
        do
        {
            // Set up necessary mappings for this permutation
            std::unordered_map<size_t, size_t> a2b;
            for (size_t i = 0; i < aSet.size(); ++i)
            {
                a2b[aSet[i]] = bUnset[i];
            }

            // Check for conflicts in mappings
            bool hasConflicts = false;
            for (auto jt = a2bConstraints.cbegin(); jt != a2bConstraints.cend(); ++jt)
            {
                auto findIt = a2b.find(jt->first);
                if (findIt != a2b.end())
                {
                    // The same value in `a` is being mapped. Make sure it's to
                    // the same value in `b`.
                    if (findIt->second != jt->second)
                    {
                        // Not mapped to same value; invalid permutation
                        hasConflicts = true;
                        break;
                    }
                }
            }
            if (hasConflicts)
            {
                // We found conflicting mappings; try the next permutation
                continue;
            }

            // If we reach this point, there were no mapping conflicts. We know
            // this bit can be set alongside the parent bit. Merge the
            // constraints and then try the next bit.
            for (auto jt = a2bConstraints.cbegin(); jt != a2bConstraints.cend(); ++jt)
            {
                a2b[jt->first] = jt->second;
            }

            // Recursively check permutations of lower bits
            uint score = (1u << bitValue)
                + recursiveVerification(bit + 1, a2b, possibleBits, aSetMap, bUnsetMap);

            // Now a2b contains the best-performing mapping for this
            // permutation. Track the best mapping across all permutations.
            if (score > bestScore)
            {
                bestScore = score;
                a2bBest = a2b;

                // If we achieve the upper-bound result (all bits set), we know
                // we can't do any better; so stop early
                if (bestScore == upperBoundScore)
                {
                    break;
                }
            }
        } while (std::next_permutation(bUnset.begin(), bUnset.end()));

        if (bestScore > 0)
        {
            // We were able to include the current bit, and we already found
            // the optimal permutation for lower bits. We do not need to
            // continue this loop, and we have our final score for this bit.
            break;
        }

        // If we reach this point, we could not include the current bit, so
        // we'll now try the next one
    }

    // Update the global constraints and return the score
    a2bConstraints = a2bBest;
    return bestScore;
}

uvec solvePermuteLess(const uvec& a, uvec& b)
{
    // For each bit, find all values in `a` where it's set and in `b` where
    // it's not set. We know that these are the only values we'd be interested
    // in pairing in the end.
    const uint BITSIZE = 32;
    const size_t N = a.size();
    std::unordered_map<uint, std::vector<size_t>> aSetMap;
    std::unordered_map<uint, std::vector<size_t>> bUnsetMap;
    for (uint bit = 0; bit < BITSIZE; ++bit)
    {
        for (size_t i = 0; i < N; ++i)
        {
            uint aiBit = (a[i] >> bit) & 0x1;
            if (aiBit == 1)
            {
                aSetMap[bit].push_back(i);
            }

            uint biBit = (b[i] >> bit) & 0x1;
            if (biBit == 0)
            {
                bUnsetMap[bit].push_back(i);
            }
        }
    }

    // Find which bits could possibly be set
    uint upperBoundResult = 0;
    uvec possibleBits;
    for (uint bit = 0; bit < BITSIZE; ++bit)
    {
        if (aSetMap[bit].size() == bUnsetMap[bit].size())
        {
            upperBoundResult += 1u << bit;
            possibleBits.push_back(bit);
        }
    }

    // State the upper bound on the result, if we assume all `possibleBits` are
    // set
    std::cout << "Upper bound on result: " << upperBoundResult << "
";
    std::cout << "Possible set bits (LSB = 0): [ ";
    for (uint bit : possibleBits)
    {
        std::cout << bit << " ";
    }
    std::cout << "]
";

    // If there's no hope, just return the original b
    if (possibleBits.empty())
    {
        return b;
    }

    // Also state a lower bound on the result, namely the MSB
    std::cout << "Lower bound on result: " << (1u << possibleBits.back()) << "
";

    // Iterate through all permutations of possibilities for each bit, starting
    // with the MSB (which we know will be part of our solution)
    uint bestScore = 0;
    uvec c = b;
    std::vector<size_t> aSet = aSetMap[possibleBits.back()];
    std::vector<size_t> bUnset = bUnsetMap[possibleBits.back()];
    std::sort(bUnset.begin(), bUnset.end());
    do
    {
        // So far, we are unconstrained with what would be aUnset and bSet, but
        // these might be constrained by later bits. Build the constraints
        // mapping indices of a to indices of b. Currently unconstrained
        // mappings will be treated as wildcards until further constraints are
        // made for lower bits.
        std::unordered_map<size_t, size_t> a2b;
        for (size_t i = 0; i < aSet.size(); ++i)
        {
            a2b[aSet[i]] = bUnset[i];
        }

        // Recursively check permutations of lower bits
        uint score = (1u << possibleBits.back())
            + recursiveVerification(1, a2b, possibleBits, aSetMap, bUnsetMap);

        // If the current permutation outperformed the previous best (or if
        // this is the first permutation), update the global results
        if (score > bestScore)
        {
            bestScore = score;
            
            // Build c using the mappings
            c = uvec(N);
            std::unordered_set<size_t> aUnmappedIndices(N);
            std::unordered_set<size_t> bUnmappedIndices(N);
            for (size_t j = 0; j < N; ++j)
            {
                aUnmappedIndices.insert(j);
                bUnmappedIndices.insert(j);
            }
            for (auto it = a2b.cbegin(); it != a2b.cend(); ++it)
            {
                c[it->first] = b[it->second];
                aUnmappedIndices.erase(it->first);
                bUnmappedIndices.erase(it->second);
            }
            // For unconstrained mappings, use arbitrary ordering
            for (auto ai = aUnmappedIndices.begin(), bi = bUnmappedIndices.begin(); ai != aUnmappedIndices.end(); ++ai, ++bi)
            {
                c[*ai] = b[*bi];
            }

            // If we achieved the upper bound result (all bits set), we know we
            // can't do any better; so we might as well stop searching
            if (bestScore == upperBoundResult)
            {
                break;
            }
        }
    } while (std::next_permutation(bUnset.begin(), bUnset.end()));

    return c;
}

Maximize AND on a sequence of XORs

Tags:

arrays

algorithm

bit-manipulation

dynamic-programming

xor

Problem

More details

Test cases

Sample code

Kolya Ivanov

2 Answers

Nico Schertler

TL;DR

Disclaimer

General idea

Finding `c`

Possible relation to solved graph theory problems

Messy code

Dillon

Recent Activity

Donate For Us

Maximize AND on a sequence of XORs

Tags:

arrays

algorithm

bit-manipulation

dynamic-programming

xor

Problem

More details

Test cases

Sample code

Kolya Ivanov

2 Answers

Nico Schertler

TL;DR

Disclaimer

General idea

Finding c

Possible relation to solved graph theory problems

Messy code

Dillon

Related questions

Recent Activity

Donate For Us

Finding `c`