Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can this Java code be improved to find sub-string in a string?

I was recently asked to submit a solution to a problem for a job.

Problem: Find a sub-string in a string.

Input: "Little star's deep dish pizza sure is fantastic."  
Search: "deep dish pizza"  
Output: "Little star's [[HIGHLIGHT]]deep dish pizza[[ENDHIGHLIGHT]] sure is fantastic."

Note that the highlighter doesn't have to have the exact same result on this example, since you are defining what a good snippet is and return the the most relevant snippet with the query terms highlighted.

The most important requirement was to write it as I would write a production code.

My solution was not accepted. How could I have improved it? I know, I could have used:

  1. Knuth–Morris–Pratt algorithm
  2. Regex (could I?)

My QUESTION:

  1. What do tech companies take into consideration when they review a code for a job. I submitted the code the same day, does that help in any way?

  2. In one of the comments, it pointed out, it looks like a school code than production code. How? Any suggestions?

My solution:
FindSubString.java

/**
 * FindSubString.java: Find sub-string in a given query
 * 
 * @author zengr
 * @version 1.0
 */

public class FindSubstring {
    private static final String startHighlight = "[[HIGHLIGHT]]";
    private static final String endHighlight = "[[ENDHIGHLIGHT]]";

    /**
     * Find sub-string in a given query
     * 
     * @param inputQuery: A string data type (input Query)
     * @param highlightDoc: A string data type (pattern to match)
     * @return inputQuery: A String data type.
     */
    public String findSubstringInQuery(String inputQuery, String highlightDoc) {
        try {

            highlightDoc = highlightDoc.trim();

            if (inputQuery.toLowerCase().indexOf(highlightDoc.toLowerCase()) >= 0) {
                // update query if exact doc exists
                inputQuery = updateString(inputQuery, highlightDoc);
            }

            else {
                // If exact doc is not in the query then break it up
                String[] docArray = highlightDoc.split(" ");

                for (int i = 0; i < docArray.length; i++) {
                    if (inputQuery.toLowerCase().indexOf(docArray[i].toLowerCase()) > 0) {
                        inputQuery = updateString(inputQuery, docArray[i]);
                    }
                }
            }
        } catch (NullPointerException ex) {
            // Ideally log this exception
            System.out.println("Null pointer exception caught: " + ex.toString());
        }

        return inputQuery;
    }

    /**
     * Update the query with the highlighted doc
     * 
     * @param inputQuery: A String data type (Query to update)
     * @param highlightDoc: A String data type (pattern around which to update)
     * @return inputQuery: A String data type.
     */
    private String updateString(String inputQuery, String highlightDoc) {
        int startIndex = 0;
        int endIndex = 0;

        String lowerCaseDoc = highlightDoc.toLowerCase();
        String lowerCaseQuery = inputQuery.toLowerCase();

        // get index of the words to highlight
        startIndex = lowerCaseQuery.indexOf(lowerCaseDoc);
        endIndex = lowerCaseDoc.length() + startIndex;

        // Get the highlighted doc
        String resultHighlightDoc = highlightString(highlightDoc);

        // Update the original query
        return inputQuery = inputQuery.substring(0, startIndex - 1) + resultHighlightDoc + inputQuery.substring(endIndex, inputQuery.length());
    }

    /**
     * Highlight the doc
     * 
     * @param inputString: A string data type (value to be highlighted)
     * @return highlightedString: A String data type.
     */
    private String highlightString(String inputString) {
        String highlightedString = null;

        highlightedString = " " + startHighlight + inputString + endHighlight;

        return highlightedString;
    }
}

TestClass.java

/**
 * TestClass.java: jUnit test class to test FindSubString.java
 * 
 * @author zengr
 * @version 1.0
 */

import junit.framework.Test;
import junit.framework.TestCase;
import junit.framework.TestSuite;

public class TestClass extends TestCase
{
    private FindSubstring simpleObj = null;
    private String originalQuery = "I like fish. Little star's deep dish pizza sure is fantastic. Dogs are funny.";

    public TestClass(String name) {
        super(name);
    }

    public void setUp() { 
        simpleObj = new FindSubstring();
    }

    public static Test suite(){

        TestSuite suite = new TestSuite();
        suite.addTest(new TestClass("findSubstringtNameCorrect1Test"));
        suite.addTest(new TestClass("findSubstringtNameCorrect2Test"));
        suite.addTest(new TestClass("findSubstringtNameCorrect3Test"));
        suite.addTest(new TestClass("findSubstringtNameIncorrect1Test"));
        suite.addTest(new TestClass("findSubstringtNameNullTest"));

        return suite;
    }

    public void findSubstringtNameCorrect1Test() throws Exception
    {
        String expectedOutput = "I like fish. Little star's deep [[HIGHLIGHT]]dish pizza[[ENDHIGHLIGHT]] sure is fantastic. Dogs are funny.";
        assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, "dish pizza"));
    }

    public void findSubstringtNameCorrect2Test() throws Exception 
    {
        String expectedOutput = "I like fish. Little star's [[HIGHLIGHT]]deep dish pizza[[ENDHIGHLIGHT]] sure is fantastic. Dogs are funny.";
        assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, "deep dish pizza"));
    }

    public void findSubstringtNameCorrect3Test() throws Exception 
    {
        String expectedOutput = "Hello [[HIGHLIGHT]]how[[ENDHIGHLIGHT]] are [[HIGHLIGHT]]you[[ENDHIGHLIGHT]]r?";
        assertEquals(expectedOutput, simpleObj.findSubstringInQuery("Hello how are your?", "how you"));
    }

    public void findSubstringtNameIncorrect1Test() throws Exception 
    {
        String expectedOutput = "I like fish. Little star's deep dish pizza sure is fantastic. Dogs are funny.";
        assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, "I love Ruby too"));
    }

    public void findSubstringtNameNullTest() throws Exception
    {
        String expectedOutput = "I like fish. Little star's deep dish pizza sure is fantastic. Dogs are funny.";
        assertEquals(expectedOutput, simpleObj.findSubstringInQuery(originalQuery, null));

    }
}
like image 537
zengr Avatar asked Oct 15 '10 04:10

zengr


People also ask

How do you find the sub string of a string?

You first check to see if a string contains a substring, and then you can use find() to find the position of the substring. That way, you know for sure that the substring is present. So, use find() to find the index position of a substring inside a string and not to look if the substring is present in the string.

How do I find a particular substring in a string in Java?

To locate a substring in a string, use the indexOf() method. Let's say the following is our string. String str = "testdemo"; Find a substring 'demo' in a string and get the index.


2 Answers

A few comments;

  • You only highlight the first occurance of the search string.
  • You assume that lower case matching is fine. Unless this was specified as a requirement it might be better to provide two methods, one that respects case and one that ignores case.
  • I would probably check the given parameters and throw a NPE if either of them were null. This would be the first thing my method did. I would clearly document this behaviour in the javadoc.
  • Your method mame is bad; findSubstringInQuery's main task isn't to find, it is to highlight and the inQuery part is superflous. Just call the method highlight or maybe highlightIgnoreCase if you are going to have a highlight that respects case.
  • Your method parameter names are bad. I've looked at your method signature 10 times and still have to look at the method body to remind myself which arg is the search term and which is the text to search. Call them searchTerm and text.
  • Production code doesn't use the default package.
  • Production code doesn't use System.out.println().
  • Your javadoc need improving, it needs to tell the user everything they need to know about the code.
  • I would consider using static methods for a class with no class variables.
  • I would also consider allowing the user to specify their own start and end highlighting markers (I wouldn't use static methods if I did this).
  • I wouldn't trim() unless this was specified as a requirement. If I did, then obviously this behaviour would be documented in the javadoc.

I wouldn't worry about the algorithm used for searching, Knuth-Morris-Pratt looks good but they shouldn't expect you to know about it and implement it unless the job spec specifically asked for experience/expertise in string searching.

like image 83
Qwerky Avatar answered Sep 19 '22 21:09

Qwerky


If this code was submitted to me for review, this is what I would think:

  • The code is overly verbose and complex where it does not need to be. The whole thing can be done in a small method in ten lines of code, including all sanity checks. This method should probably be static.
  • Your code is doing things that (I assume) were not asked for. You were being asked to search for a substring in a string. If it is not found, then that's fine -- no need to split the substring into words and search for each individual word.
  • Unless they asked you to remove leading and trailing whitespace, I would not have called trim()
  • I would not have included the calls to toLowerCase() unless explicitly asked for, although I would have added a comment stating that they could be added if needed. Anyway even if the search is meant to be case insensitive, there are too many redundant calls to toLowerCase() in your code.
  • You should not need to catch NullPointerException -- instead, you should ensure that this exception is never thrown.
like image 44
Grodriguez Avatar answered Sep 22 '22 21:09

Grodriguez