Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java string comparison

I am comparing substrings in two large text files. Very simple, tokenizing into two token containers, comparing with 2 for loops. Performance is disastrous! Does anybody have an advice or idea how to improve performance?

for (int s = 0; s < txtA.TokenContainer.size(); s++) {
    String strTxtA = txtA.getSubStr(s);
    strLengthA = txtA.getNumToken(s);

    if (strLengthA >= dp.getMinStrLength()) {
        int tokenFileB = 1;

        for (int t = 0; t < txtB.TokenContainer.size(); t++) {
            String strTxtB = txtB.getSubStr(t);
            strLengthB = txtB.getNumToken(t);

            if (strTxtA.equalsIgnoreCase(strTxtB)) {
                try {
                    subStrTemp = new SubStrTemp(
                        txtA.ID, txtB.ID, tokenFileA, tokenFileB,
                        (tokenFileA + strLengthA - 1), 
                        (tokenFileB + strLengthB - 1));

                    if (subStrContainer.contains(subStrTemp) == false) {
                        subStrContainer.addElement(subStrTemp);
                    }
                } catch (Exception ex) {
                    logger.error("error");
                }
            }
            tokenFileB += strLengthB;
        }
        tokenFileA += strLengthA;
    }
}

Generally my code reading two large Strings with Java Tokonizer into containers A and B. And then trying to compare substrings.Possision of Substrgs which are existing in both strings to store into a Vector. But performance is awful, also don't really know how to solve it with HashMap.

like image 603
jackdaniels Avatar asked Apr 09 '26 12:04

jackdaniels


1 Answers

Your main problem is that you go through all txtB for each token in txtA.

You should store informations on token from txtA (in a HashMap for instance) and then in a second loop (but not a nested one) you compare the strings with the existing one in the Map.


On the same topic :

  • term frequency using java program
  • How to count words in java
like image 84
Colin Hebert Avatar answered Apr 12 '26 02:04

Colin Hebert



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!