I have this code that determines whether a word (ignoring case) is included in a wordList text file. However, the wordList text file may have more than 65,000 lines, and searching for a single word with my implementation below takes nearly a minute. Can you think of a better implementation?
Thanks!
import java.io.*;
import java.util.*;

public class WordSearch
{
    LinkedList<String> lxx;
    FileReader fxx;
    BufferedReader bxx;

    // Reads the word list file, one word per line, into a linked list.
    public WordSearch(String wordlist)
        throws IOException
    {
        fxx = new FileReader(wordlist);
        bxx = new BufferedReader(fxx);
        lxx = new LinkedList<String>();
        String word;
        while ( (word = bxx.readLine()) != null)
        {
            lxx.add(word);
        }
        bxx.close();
    }

    // Linear, case-insensitive search over the list.
    public boolean inTheList (String theWord)
    {
        for(int i =0 ; i < lxx.size(); i++)
        {
            if (theWord.compareToIgnoreCase(lxx.get(i)) == 0)
            {
                return true;
            }
        }
        return false;
    }
}
In computer science, a search data structure is any data structure that allows the efficient retrieval of specific items from a set of items, such as a specific record from a database. The simplest, most general, and least efficient search structure is merely an unordered sequential list of all the items.
Use a HashSet into which you put a lowercase version of each word. Checking if a HashSet contains a specified string is, on average, a constant-time (read: extremely fast) operation.
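A minimal sketch of the HashSet approach, under a few assumptions: the class name HashedWordSearch and the fromFile helper are hypothetical, and lowercasing with Locale.ROOT is assumed to be an acceptable stand-in for compareToIgnoreCase on ordinary word lists (the two can differ for a few unusual Unicode characters).

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical class name; a sketch, not a drop-in replacement.
class HashedWordSearch {
    private final Set<String> words = new HashSet<>();

    // Stores a lowercase copy of every word so lookups can ignore case.
    HashedWordSearch(Collection<String> wordList) {
        for (String word : wordList) {
            words.add(normalize(word));
        }
    }

    // Hypothetical helper: reads one word per line, like the original constructor.
    static HashedWordSearch fromFile(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return new HashedWordSearch(lines);
    }

    private static String normalize(String word) {
        return word.toLowerCase(Locale.ROOT);
    }

    // Average-case constant-time membership test.
    boolean inTheList(String theWord) {
        return words.contains(normalize(theWord));
    }
}
```

The file is still read once up front, but each call to inTheList then costs one hash lookup instead of a scan of the whole list.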
Hash table. A hash table stores key-value pairs: the first item in each pair is the key, and the second is the value. You access objects by key, and a lookup takes constant time on average, so a hash table is faster for lookups than scanning an unsorted array or list.
What is the most efficient data structure for this? Looking at complexity analysis, Hashtables seem to be the most efficient for lookups.
Since you're searching, you may want to consider sorting the list before searching; then you can do a binary search, which takes O(log n) comparisons instead of the O(n) of a linear search. That helps if you'll perform multiple searches on the same list. If you search only once, the cost of sorting the list isn't worth it.
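The sort-then-binary-search idea could look roughly like the sketch below. The class name SortedWordSearch is hypothetical, and the words are lowercased up front on the assumption that this is an acceptable way to ignore case.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Locale;

// Hypothetical class name; a sketch of sorting once and searching many times.
class SortedWordSearch {
    private final String[] sorted;

    SortedWordSearch(Collection<String> wordList) {
        sorted = new String[wordList.size()];
        int i = 0;
        for (String word : wordList) {
            sorted[i++] = word.toLowerCase(Locale.ROOT);
        }
        // O(n log n), paid once before any search.
        Arrays.sort(sorted);
    }

    // O(log n) per lookup; binarySearch returns a negative value on a miss.
    boolean inTheList(String theWord) {
        String key = theWord.toLowerCase(Locale.ROOT);
        return Arrays.binarySearch(sorted, key) >= 0;
    }
}
```

An array is used here rather than a LinkedList because binary search needs fast access by index.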
Also, doing a linear search on a linked list using "lxx.get(i)" is asking for trouble. LinkedList.get() is O(n), so calling it inside a loop over the whole list makes a single search O(n^2). You can either use an Iterator (easy way: for (String s : lxx)) or switch to a list type that has O(1) access time, such as ArrayList.
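If the linear search is kept, the iterator-based loop might be sketched as follows. The class name LinearWordSearch is hypothetical, and equalsIgnoreCase is swapped in for compareToIgnoreCase(...) == 0, which should give the same answer for this check.

```java
import java.util.List;

// Hypothetical class name; a sketch of the for-each (Iterator) variant.
class LinearWordSearch {
    private final List<String> words;

    LinearWordSearch(List<String> words) {
        this.words = words;
    }

    // The for-each loop walks the list with an Iterator, so each step is O(1)
    // even for a LinkedList, and the whole search is O(n).
    boolean inTheList(String theWord) {
        for (String word : words) {
            if (theWord.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }
}
```

This removes the quadratic behavior but is still a full scan per lookup, so it remains slower than the hash-based or sorted approaches when many words are searched.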