
Parsing field access flags in java

Tags:

java

parsing

I have an assignment wherein I have to parse the field access flags of a java .class file. The specification for a .class file can be found here: Class File Format (pages 26 and 27 list the access flags and their hex values).

This is fine, I can do this no worries. My issue is that there is a large number of combinations.

I know that public, private and protected are mutually exclusive, which reduces the combinations somewhat. Final and volatile are also mutually exclusive. The rest however are not.

At the moment, I have a large switch statement to do the comparison. I read in the hex value of the access flag and then increment a counter, depending on whether it is public, private or protected. This works fine, but it seems quite messy to just have every combination listed in a switch statement, e.g. public static, public final, public static final, etc.

I thought of doing modulo on the access flag and the appropriate hex value for public, private or protected, but public is 0x0001, so that won't work.

Does anyone else have any ideas as to how I could reduce the amount of cases in my switch statement?

FizzBuzz asked Feb 02 '23 18:02


2 Answers

What is the problem? The specification says that the value is a set of bit flags. That means you should look at the value as a binary number, and you can test whether a specific flag is set by doing a bitwise AND.

E.g.

/*
ACC_VOLATILE = 0x0040 = 01000000
ACC_PUBLIC   = 0x0001 = 00000001
Public and volatile is= 01000001
*/


// Parentheses are required: in Java, > binds tighter than &
publicCount += (flag & ACC_PUBLIC) != 0 ? 1 : 0;
volatileCount += (flag & ACC_VOLATILE) != 0 ? 1 : 0;
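
To make the bitwise-AND idea concrete, here is a minimal sketch. The class name `FieldFlags` and the helpers `has` and `visibility` are made up for illustration; the constant values are the ones the class file format assigns to field access flags. Masking out the three visibility bits first means a switch only needs one case per visibility, no matter which other flags are set.

```java
// Hypothetical helper class; only the flag values come from the class file format.
final class FieldFlags {
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_PRIVATE   = 0x0002;
    static final int ACC_PROTECTED = 0x0004;
    static final int ACC_STATIC    = 0x0008;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_VOLATILE  = 0x0040;
    static final int ACC_TRANSIENT = 0x0080;

    // Mask covering the three mutually exclusive visibility bits.
    static final int VISIBILITY_MASK = ACC_PUBLIC | ACC_PRIVATE | ACC_PROTECTED;

    // True if every bit of 'mask' is set in 'flags'.
    static boolean has(int flags, int mask) {
        return (flags & mask) == mask;
    }

    // Names the visibility after masking away all unrelated bits.
    static String visibility(int flags) {
        switch (flags & VISIBILITY_MASK) {
            case ACC_PUBLIC:    return "public";
            case ACC_PRIVATE:   return "private";
            case ACC_PROTECTED: return "protected";
            case 0:             return "package-private";
            default:            return "invalid"; // more than one visibility bit set
        }
    }

    public static void main(String[] args) {
        int flags = 0x0019; // public static final
        System.out.println(visibility(flags));
        System.out.println(has(flags, ACC_STATIC));
        System.out.println(has(flags, ACC_VOLATILE));
    }
}
```

Each remaining flag (static, final, volatile, transient) can then be tested independently with `has`, so no combination ever needs its own case.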

Kaj answered Feb 05 '23 07:02


If you are trying to avoid a pattern like this one I just stole:

if ((access_flag & ACC_PUBLIC) != 0)
{
    publicCount++;
}
if ((access_flag & ACC_FINAL) != 0)
{
    finalCount++;
}
...

It's a great instinct. I make it a rule never to write code that looks redundant like that. Not only is it error-prone and more code in your class, but copy & paste code is really boring to write.

So the big trick is to make this access "Generic" and easy to understand from the calling class--pull out all the repeated crap and just leave "meat", push the complexity to the generic routine.

So an easy way to call such a method would be something like this: pass it an array of bitfields (each holding some combination of flags to be counted) plus the flags you are interested in, so that you don't waste time testing flags you don't care about:

int[] counts = sumUpBits(arrayOfFlagBitfields, ACC_PUBLIC | ACC_FINAL | ACC_...);

That's really clean, but then how do you access the return fields? I was originally thinking something like this:

System.out.println("Number of public classes="+counts[findBitPosition(ACC_PUBLIC)]);
System.out.println("Number of final classes="+counts[findBitPosition(ACC_FINAL)]);

Most of the boilerplate here is gone except the need to change the bitfields to their position. I think two changes might make it better--encapsulate it in a class and use a hash to track positions so that you don't have to convert bitPosition all the time (if you prefer not to use the hash, findBitPosition is at the end).

Let's try a full-fledged class. How should this look from the caller's point of view?

BitSummer bitSums=new BitSummer(arrayOfFlagBitfields, ACC_PUBLIC, ACC_FINAL);
System.out.println("Number of public classes="+bitSums.getCount(ACC_PUBLIC));
System.out.println("Number of final classes="+bitSums.getCount(ACC_FINAL));

That's pretty clean and easy--I really love OO! Now you just use the bitSums to store your values until they are needed. (It's less boilerplate than storing them in class variables and clearer than using an array or a collection.)

So now to code the class. Note that the constructor uses variable arguments now--less surprise/more conventional and makes more sense for the hash implementation.

By the way, I know this seems like it would be slow and inefficient, but it's probably not bad for most uses--if it is, it can be improved, but this should be much shorter and less redundant than the switch statement (which is really the same as this, just unrolled--however this one uses a hash & autoboxing which will incur an additional penalty).

import java.util.HashMap;

public class BitSummer {
    // sums stores the count for each flag as <flag, count>
    private final HashMap<Integer, Integer> sums = new HashMap<Integer, Integer>();

    // Constructor does all the work, the rest is just an easy lookup.
    public BitSummer(int[] arrayOfFlagBitfields, int... positionsToCount) {

        // Loop over each bitfield we want to count
        for (int bitfield : arrayOfFlagBitfields) {

            // and over each flag to check
            for (int flag : positionsToCount) {
                // Test to see if we actually should count this bitfield as having the flag set
                if ((bitfield & flag) != 0) {
                    // Increment the count; merge starts at 1 when the flag has no entry yet
                    // (a plain sums.get(flag) + 1 would throw on the first hit)
                    sums.merge(flag, 1, Integer::sum);
                }
            }
        }
    }

    // Return the count for a given flag (0 if it was never seen)
    public int getCount(int bit) {
        return sums.getOrDefault(bit, 0);
    }
}

I didn't test this but I think it's fairly close. I wouldn't use it for processing video packets in realtime or anything, but for most purposes it should be fast enough.
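
Since the counting code is untested, one way to sanity-check it is to compare its results against the JDK's own flag tests in `java.lang.reflect.Modifier`. This is only a sketch: the class name `ModifierCrossCheck`, the `count` helper and the sample values are made up, and it relies on the `Modifier` bit values for public, private, protected, static, final, volatile and transient coinciding with the class file field flags.

```java
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.function.IntPredicate;

// Hypothetical cross-check helper, not part of the answer's BitSummer.
final class ModifierCrossCheck {
    // Counts how many flag words satisfy the given test, e.g. Modifier::isPublic.
    static long count(int[] flagsArray, IntPredicate test) {
        return Arrays.stream(flagsArray).filter(test).count();
    }

    public static void main(String[] args) {
        // Made-up sample: public static final, private, public volatile,
        // private final, public
        int[] sample = {0x0019, 0x0002, 0x0041, 0x0012, 0x0001};
        System.out.println("public: " + count(sample, Modifier::isPublic));
        System.out.println("final:  " + count(sample, Modifier::isFinal));
        // Modifier.toString decodes a whole flag word in one call
        System.out.println(Modifier.toString(0x0019));
    }
}
```

If the numbers from `getCount` disagree with these for the same input, the bug is in the counting loop and not in the flag values.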

As for maintainability, the code may look long compared to the original example, but if you have more than 5 or 6 fields to check, it will actually be shorter than the chained if statements, significantly less error-prone and more maintainable, and also more interesting to write.

If you really feel the need to eliminate the hashtable you could easily replace it with a sparse array with the flag's bit position as the index (for instance the count for flag 00001000/0x08 would be stored in the fourth array position, index 3). This would require a function like this to calculate the bit position for array access (both storing in the array and retrieving):

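private int findBitPosition(int flag) {
    int ret = 0;
    // Shift right until the set bit falls off the end; ret counts the shifts
    while ((flag >>>= 1) != 0)
        ret++;
    return ret; // zero-based: 0x01 gives 0, 0x08 gives 3
}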
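
Putting the array idea together might look like the sketch below. The class name `ArrayBitSummer` is hypothetical, and it swaps the hand-written loop for the JDK's `Integer.numberOfTrailingZeros`, which returns the same zero-based position for a single-bit flag. It assumes every flag passed in has exactly one bit set and rejects anything else.

```java
// Hypothetical array-backed variant of BitSummer: no hash, no autoboxing.
final class ArrayBitSummer {
    // One slot per bit of an int; counts[i] holds the count for flag (1 << i).
    private final int[] counts = new int[Integer.SIZE];

    ArrayBitSummer(int[] arrayOfFlagBitfields, int... flagsToCount) {
        for (int bitfield : arrayOfFlagBitfields) {
            for (int flag : flagsToCount) {
                if ((bitfield & flag) != 0) {
                    counts[position(flag)]++;
                }
            }
        }
    }

    // Count for a single-bit flag; 0 if it was never counted.
    int getCount(int flag) {
        return counts[position(flag)];
    }

    // Zero-based index of the flag's bit, e.g. 0x08 -> 3.
    private static int position(int flag) {
        if (Integer.bitCount(flag) != 1) {
            throw new IllegalArgumentException("flag must have exactly one bit set");
        }
        return Integer.numberOfTrailingZeros(flag);
    }
}
```

The caller's view stays the same as with the hash version: construct it with the bitfields and the flags of interest, then call `getCount` per flag.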

That was fun.

Bill K answered Feb 05 '23 08:02