Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

resolving logical operations - AND, OR, looping conditions dynamically

I have an incoming records filter stored with the logical clause as given below.

Acct1 = 'Y' AND Acct2 = 'N' AND Acct3 = 'N' AND Acct4 = 'N' AND Acct5 = 'N' AND ((Acct6 = 'N' OR Acct7 = 'N' AND Acct1 = 'Y') AND Formatted= 'N' AND Acct9 = 'N' AND (Acct10 = 'N' AND Acct11 = 'N') AND EditableField= 'N' )

My data input to this clause will be from Csv file as below.

Country,Type,Usage,Acct1,Acct2,Acct3,Acct4,Acct5,Acct6,Acct7,Formatted,Acct9,Acct10,Acct11,EditableField
USA,Premium,Corporate,Y,N,Y,N,N,N,Y,N,Y,N,Y,N,
Mexico,Premium,Corporate,Y,N,Y,N,Y,N,Y,N,Y,N,Y,N,
USA,Premium,Corporate,Y,N,Y,N,N,N,N,Y,Y,N,Y,N,
USA,Premium,Corporate,Y,N,Y,N,Y,N,Y,Y,Y,N,Y,N,

I will have to filter out the records in the file based on the conditions defined in the clause. This is a example of one simple clause but there will be more inner conditions than this and the clause can be changed whenever the user want and there will be 10 such clauses the records has to pass through sequentially.

So I am looking for a way to dynamically interpret the clause and apply it on the incoming records. Please provide me your suggestions about how to design/ any example if available.

like image 433
Atom Avatar asked Aug 24 '15 22:08

Atom


People also ask

What are the 3 basic logical operation?

There are three logical operators: and , or , and not . The semantics (meaning) of these operators is similar to their meaning in English.

Which logical operator is used for or condition?

The logical OR ( || ) operator (logical disjunction) for a set of operands is true if and only if one or more of its operands is true. It is typically used with boolean (logical) values.

What are the 3 logical operators in Java?

Java has three logical operators: && , || , and ! , which respectively stand for and, or, and not. The results of these operators are similar to their meanings in English. For example, x > 0 && x < 10 is true when x is both greater than zero and less than 10.


2 Answers

Here's the complete solution which does not include third-party libraries like ANTLR or JavaCC. Note that while it's extensible, its capabilities are still limited. If you want to create much more complex expressions, you'd better use grammar generator.

First, let's write a tokenizer which splits the input string to the tokens. Here's the token types:

private static enum TokenType {
    WHITESPACE, AND, OR, EQUALS, LEFT_PAREN, RIGHT_PAREN, IDENTIFIER, LITERAL, EOF
}

The token class itself:

private static class Token {
    final TokenType type;
    final int start; // start position in input (for error reporting)
    final String data; // payload

    public Token(TokenType type, int start, String data) {
        this.type = type;
        this.start = start;
        this.data = data;
    }

    @Override
    public String toString() {
        return type + "[" + data + "]";
    }
}

To simplify the tokenization let's create a regexp which reads the next token from the input string:

private static final Pattern TOKENS = 
        Pattern.compile("(\\s+)|(AND)|(OR)|(=)|(\\()|(\\))|(\\w+)|\'([^\']+)\'");

Note that it has many groups, one group per TokenType in the same order (first comes WHITESPACE, then AND and so on). Finally the tokenizer method:

private static TokenStream tokenize(String input) throws ParseException {
    Matcher matcher = TOKENS.matcher(input);
    List<Token> tokens = new ArrayList<>();
    int offset = 0;
    TokenType[] types = TokenType.values();
    while (offset != input.length()) {
        if (!matcher.find() || matcher.start() != offset) {
            throw new ParseException("Unexpected token at " + offset, offset);
        }
        for (int i = 0; i < types.length; i++) {
            if (matcher.group(i + 1) != null) {
                if (types[i] != TokenType.WHITESPACE)
                    tokens.add(new Token(types[i], offset, matcher.group(i + 1)));
                break;
            }
        }
        offset = matcher.end();
    }
    tokens.add(new Token(TokenType.EOF, input.length(), ""));
    return new TokenStream(tokens);
}

I'm using java.text.ParseException. Here we apply the regex Matcher till the end of the input. If it doesn't match at the current position, we throw an exception. Otherwise we look for found matching group and create a token from it ignoring the WHITESPACE tokens. Finally we add a EOF token which indicates the end of the input. The result is returned as special TokenStream object. Here's the TokenStream class which will help us to do the parsing:

private static class TokenStream {
    final List<Token> tokens;
    int offset = 0;

    public TokenStream(List<Token> tokens) {
        this.tokens = tokens;
    }

    // consume next token of given type (throw exception if type differs)
    public Token consume(TokenType type) throws ParseException {
        Token token = tokens.get(offset++);
        if (token.type != type) {
            throw new ParseException("Unexpected token at " + token.start
                    + ": " + token + " (was looking for " + type + ")",
                    token.start);
        }
        return token;
    }

    // consume token of given type (return null and don't advance if type differs)
    public Token consumeIf(TokenType type) {
        Token token = tokens.get(offset);
        if (token.type == type) {
            offset++;
            return token;
        }
        return null;
    }

    @Override
    public String toString() {
        return tokens.toString();
    }
}

So we have a tokenizer, hoorah. You can test it right now using System.out.println(tokenize("Acct1 = 'Y' AND (Acct2 = 'N' OR Acct3 = 'N')"));

Now let's write the parser which will create the tree-like representation of our expression. First the interface Expr for all the tree nodes:

public interface Expr {
    public boolean evaluate(Map<String, String> data);
}

Its only method used to evaluate the expression for given data set and return true if data set matches.

The most basic expression is the EqualsExpr which is like Acct1 = 'Y' or 'Y' = Acct1:

private static class EqualsExpr implements Expr {
    private final String identifier, literal;

    public EqualsExpr(TokenStream stream) throws ParseException {
        Token token = stream.consumeIf(TokenType.IDENTIFIER);
        if(token != null) {
            this.identifier = token.data;
            stream.consume(TokenType.EQUALS);
            this.literal = stream.consume(TokenType.LITERAL).data;
        } else {
            this.literal = stream.consume(TokenType.LITERAL).data;
            stream.consume(TokenType.EQUALS);
            this.identifier = stream.consume(TokenType.IDENTIFIER).data;
        }
    }

    @Override
    public String toString() {
        return identifier+"='"+literal+"'";
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        return literal.equals(data.get(identifier));
    }
}

The toString() method is just for information, you can remove it.

Next we will define the SubExpr class which is either EqualsExpr or something more complex in parentheses (if we see the parenthesis):

private static class SubExpr implements Expr {
    private final Expr child;

    public SubExpr(TokenStream stream) throws ParseException {
        if(stream.consumeIf(TokenType.LEFT_PAREN) != null) {
            child = new OrExpr(stream);
            stream.consume(TokenType.RIGHT_PAREN);
        } else {
            child = new EqualsExpr(stream);
        }
    }

    @Override
    public String toString() {
        return "("+child+")";
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        return child.evaluate(data);
    }
}

Next is AndExpr which is a set of SubExpr expressions joined by AND operator:

private static class AndExpr implements Expr {
    private final List<Expr> children = new ArrayList<>();

    public AndExpr(TokenStream stream) throws ParseException {
        do {
            children.add(new SubExpr(stream));
        } while(stream.consumeIf(TokenType.AND) != null);
    }

    @Override
    public String toString() {
        return children.stream().map(Object::toString).collect(Collectors.joining(" AND "));
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        for(Expr child : children) {
            if(!child.evaluate(data))
                return false;
        }
        return true;
    }
}

I use Java-8 Stream API in the toString for brevity. If you cannot use Java-8, you may rewrite it with the for loop or remove toString completely.

Finally we define OrExpr which is a set of AndExpr joined by OR (usually OR has lower priority than AND). It's very similar to AndExpr:

private static class OrExpr implements Expr {
    private final List<Expr> children = new ArrayList<>();

    public OrExpr(TokenStream stream) throws ParseException {
        do {
            children.add(new AndExpr(stream));
        } while(stream.consumeIf(TokenType.OR) != null);
    }

    @Override
    public String toString() {
        return children.stream().map(Object::toString).collect(Collectors.joining(" OR "));
    }

    @Override
    public boolean evaluate(Map<String, String> data) {
        for(Expr child : children) {
            if(child.evaluate(data))
                return true;
        }
        return false;
    }
}

And the final parse method:

public static Expr parse(TokenStream stream) throws ParseException {
    OrExpr expr = new OrExpr(stream);
    stream.consume(TokenType.EOF); // ensure that we parsed the whole input
    return expr;
}

So you can parse your expressions to get the Expr objects, then evaluate them against the rows of your CSV file. I assume that you're capable to parse the CSV row into the Map<String, String>. Here's usage example:

Map<String, String> data = new HashMap<>();
data.put("Acct1", "Y");
data.put("Acct2", "N");
data.put("Acct3", "Y");
data.put("Acct4", "N");

Expr expr = parse(tokenize("Acct1 = 'Y' AND (Acct2 = 'Y' OR Acct3 = 'Y')"));
System.out.println(expr.evaluate(data)); // true
expr = parse(tokenize("Acct1 = 'N' OR 'Y' = Acct2 AND Acct3 = 'Y'"));
System.out.println(expr.evaluate(data)); // false
like image 152
Tagir Valeev Avatar answered Sep 27 '22 22:09

Tagir Valeev


I don't know how efficient this will be in Java, but basic string-replace operations could be a simple solution for this.

You start with a query string:

Acct1 = 'Y' AND Acct2 = 'N' AND Acct3 = 'Y' AND Acct4 = 'N' AND Acct5 = 'N' OR ((Acct6 = 'N' OR Acct7 = 'N') AND Acct8 = 'N' AND Acct9 = 'Y' AND (Acct10 = 'N' OR Acct11 = 'N') AND Acct12 = 'N')

For each line in the csv, e.g. Y,N,Y,N,Y,N,Y,N,Y,N,Y,N string-replace the column headers in the query by the values; that gives you:

Y = 'Y' AND N = 'N' AND Y = 'Y' AND N = 'N' AND Y = 'N' OR ((N = 'N' OR Y = 'N') AND N = 'N' AND Y = 'Y' AND (N = 'N' OR Y = 'N') AND N = 'N')

Then replace the comparisons by their boolean value:
- replace N = 'N' and Y = 'Y' by Y
- replace N = 'Y' and Y = 'N' by N

This will result in:

Y AND Y AND Y AND Y AND N OR ((Y OR N) AND Y AND Y AND (Y OR N) AND Y)

Then loop through a number of string-replace operations which replace truthy values by Y and falsey values by N:
- replace Y AND Y by Y
- replace N AND N, N AND Y and Y AND N, by N
- replace Y OR Y, N OR Y and Y OR N, by Y
- replace N OR N by N
- replace (N) by N
- replace (Y) by Y

This will gradually reduce the boolean statement:

Y AND Y AND Y AND Y AND N OR ((Y OR N) AND Y AND Y AND (Y OR N) AND Y)
Y AND Y AND N OR ((Y) AND Y AND (Y) AND Y)
Y AND N OR ( Y AND Y AND Y AND Y)
N OR ( Y AND Y )
N OR ( Y )
Y

If the queries include implicit precedences without brackets, like N AND N OR Y AND Y where you want AND to have precedence over OR, always exhaust the possibilities to replace AND and brackets before replacing OR:

while (string length decreases) {
    while (string length decreases) {
        replace every "(Z)" by "Z"
        replace every "X AND Y" by "Z"
    }
    replace one "X OR Y" by "Z"
}

During this reduction, make sure to check whether the string length has decreased after every iteration, to avoid infinite loops caused by malformed queries.

like image 21