Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"Simple" Trie Implementation

I need to implement a Trie (in Java) for a college project. The Trie should be able to add and remove Strings (for phase 1).

I have spent several hours each day (for the last few days) trying to figure out how to do this and FAILED miserably each time.

I require some help, the examples on the internet and my textbook (Data Structures and Algorithms in Java By Adam Drozdek) are not helping.

Information

  1. Node classes I am working with:

    class Node {
        public boolean isLeaf;
    }
    
    class internalNode extends Node {
        public String letters;  //letter[0] = '$' always.
        //See image -> if letter[1] = 'A' then children[1] refers to child node "AMMO"
        //See image -> if letter[2] = 'B' then children[2] refers to internal node "#EU"
        public TrieNode[] children = new TrieNode[2];
    
        public TrieInternalNode(char ch)
        {
            letters = "#" + String.valueOf(ch);//letter[0] = '$' always.
            isLeaf = false;
        }
    }
    
    class leafNode extends Node
    {
        public String word;
        public TrieLeafNode(String word)
        {
            this.word = new String(word);
            isLeaf = true;
        }
    }
    
  2. And here is the pseudo code for insert that I need to follow: (warning it is very vague)

    trieInsert(String K)
    {
        i = 0;
        p = the root;
        while (not inserted)
        {
            if the end of word k is reached
                set the end-of-word marker in p to true;
            else if (p.ptrs[K[i]] == 0)
                create a leaf containing K and put its address in p.ptrs[K[i]];
            else if reference p.ptrs[K[i]] refers to a leaf
            {
                K_L = key in leaf p.ptrs[K[i]]
                do
                {
                    create a nonleaf and put its address in p.ptrs[K[i]];
                    p = the new nonleaf;
                } while (K[i] == K_L[i++]);
            }
            create a leaf containing K and put its address in p.ptrs[K[--i]];
            if the end of word k is reached
                set the end-of-word marker in p to true;
            else
                create a leaf containing K_L and put its address in p.ptrs[K_L[i]];
            else
                p = p.ptrs[K[i++]];
        }
    }
    
  3. I need to implement the following methods.

    public boolean add(String word){...}//adds word to trie structure should return true if successful and false otherwise
    
    public boolean remove(String word){...}//removes word from trie structure should return true if successful and false otherwise
    
  4. I cant find pseudo code for remove, but if insert does not work delete wont help me.

  5. Here is a image of how the Trie that I need to implement should look like.

enter image description here

  1. I am aware that the Trie will still be inefficient if implemented like this, but at the moment I need not worry about this.

  2. The book provides an implementation that is similar to what I need to do but doesn't use the end of word char ('$') and only stores the words without their prefixes in the child nodes http://mathcs.duq.edu/drozdek/DSinJava/SpellCheck.java

Constraints

  1. I need to implement the trie in JAVA.
  2. I may not import or use any of Java's built-in data structures. (ie. no Map, HashMap, ArrayList etc)
  3. I may use Arrays, Java primitive Types and Java Strings.
  4. The Trie must use a $ (dollar) symbol to indicate a end-of-word. (see the image below )

enter image description here

  1. I may asume that now word containing the $symbol will be inserted.
  2. I need to implement the Trie it in the same style as the book does.
  3. Case of words doesn't matter ie. all words will be considered to be lowercase
  4. The Trie should only store the end-of-word character and the characters applicable to a word and not the entire alphabet(like some implementations).

I do not expect anyone to do the implementation for me(unless they have one lying around :P) I just really need help.

like image 260
user3553706 Avatar asked Nov 02 '22 01:11

user3553706


1 Answers

First of all, I don't think you should make leaf nodes and internal nodes separate classes. I recommend making a universal node class with an isLeaf() method. This method would return true if a node has no children.

Here is some higher-level pseudocode for the functions you need to implement. For simplicity, I assume the existence of a method called getIndex() which returns the index corresponding to a character.

Insert(String str)
    Node current = null
    for each character in str
        int index = getIndex(character)
        if current.children[index] has not been initialized
            initialize current.children[index] to be a new Node
        current = current.children[index]

You can easily augment this pseudocode to fit your needs. For example, if you want to return false whenever insertion isn't successful:

  • Return false if the input string is null
  • Return false if the input string contains invalid characters

Now, here is some higher-level pseudocode for remove.

Remove(String str)
    Node current = null
    for each character in str
        int index = getIndex(character)
        current = current.children[index] 

    // At this point, we found the node we want to remove. However, we want to 
    // delete as many ancestor nodes as possible. We can delete an ancestor node 
    // if it is not need it any more. That is, we can delete an ancestor node 
    // if it has exactly one child. 

    Node ancestor = current
    while ancestor is not null
        if ancestor has 2 or more children
            break out of loop 
        else if ancestor has less than 2 children
            Node grandAncestor = ancestor.parent
            if grandAncestor is not null
                reinitialize grandAncestor.children // this has the effect of removing ancestor

        ancestor = ancestor.parent   

At a very high level, we follow the input string to the node we want to remove. After this, we traverse up the tree following parent pointers and delete every node with 1 child (since it is no longer needed). Once we reach a node with 2 children, we stop.

Like Insert, we can easily augment this pseudocode to return false whenever deletion isn't successful:

  • Return false if the input string is null
  • Return false if the input string contains invalid characters
  • Return false if the input string leads to a Node which doesn't exist

It is easiest to implement delete if your Node class has a parent field. However, it is possible to implement the method without parent points, but it is more difficult. You can see an example of the trickier implementation here.

like image 125
Ben Sandler Avatar answered Nov 15 '22 03:11

Ben Sandler