Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the most efficient way to insert a string into an already-sorted array list of strings?

I have an ArrayList that has 17,000 words in it. I need to add a word to the list only if it's not already in and I need to preserve the sort order of the list. i.e., I need to put it into its alphabetically correct location.

I don't know how to find the correct place to insert it. I am using a binary search to find if the word is already in the list and that returns the index if it's in there or -1 if it's not. I was planning on using ArrayList.add(int index, E element) to put it in.

like image 850
Cooldog117 Avatar asked May 07 '12 23:05

Cooldog117


4 Answers

Convert the ArrayList into a TreeSet http://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html

The TreeSet will take care of duplicates for you, and keep the words in alphabetical order.

Example: (WordList is the ArrayList of words)

TreeSet<String> WordSet = new TreeSet<String>(WordList);
like image 178
apnorton Avatar answered Oct 21 '22 11:10

apnorton


Use the built-in binarySearch method. If the key is not found, the number returned is
-(insertionIndex) - 1

like image 34
Jeffrey Avatar answered Oct 21 '22 09:10

Jeffrey


binary search comes to mind, the list api might contain better though

In a binary search you will get to a point where you have 2 items left, one above and one below, with one of them possibly == to your item. For your case you won't have the == case, so return the index of the higher one and insert at its position. I don't know if java has a tuple class, or you can build a container. Either way, return something like:

(bool, int) binSearch(IList list)
  returns true, -1 if found
  returns false, higher of 2 bounds otherwise

obviously this isn't java but it's not a stretch to convert

like image 23
Jean-Bernard Pellerin Avatar answered Oct 21 '22 09:10

Jean-Bernard Pellerin


If you wrote the binary search you could modify it to return the last value searched. This value could either be the location of the matching string or the location where it should be inserted.

That is in a binary search, you subdivide the list until you either find the string or cannot subdivide it any further. That position where you can no longer subdivide the list is the position where your string should be inserted.

like image 29
AaronM Avatar answered Oct 21 '22 11:10

AaronM