Thursday, August 10, 2017

Facts and Implementation of TRIE data structure

Facts about Binary Search Tree , Balanced Binary Search Tree and Trie

      a)  BST is better in terms of searching node comparing to Linked List but when it comes to skewed BST , it takes n iteration of searching which is same as Linked List.

      b) So as to avoid making skewed BST , its always better to use Balanced BST approach i.e.
|leftSubTree height – rightSubTreeHeight|<=1
In this way , every node will be having 2 nodes before it moves to next node.

Comparison of performance of different data structures


          
      c) TRIE as a data structure is better when it comes for searching in a dictionary. TRIE can have more than 2 child nodes unlike BST.

      d) Considering a dictionary with 2,50,000 words at least. Height would be
log(25*10^4) ~ 18
This means if dictionary has been implemented using BST then for searching any word, in worst case scenario, it has to traverse 18 levels .

If same has been implemented using TRIE ,generally every word is of length not more than 5-8 letter, so when it comes to TRIE data structure , it has to traverse up to 5-8 level as compare to 18 levels and this is significant improvement.

“ALL DICTIONARIES ARE IMPLEMENTED USING TRIE DATA STRUCTURE ONLY”

Using TRIE, we can search the key in O(M) time where M is maximum String length

Implementation of TRIE data structure

Insertion of a TRIE node


    Searching of a word

    a) Prefix based search
    b) Whole string search


   Deletion of TRIE Node


package com.trie;

import java.util.HashMap;
import java.util.Map;

/* TrieNode includes 2 elements
 * Child TrieNodes and word is true or false*/

class TrieNode {
     boolean eow;
     Map<Character, TrieNode> children;

     public TrieNode() {
          eow = false;
          children = new HashMap<Character, TrieNode>();
     }
}

/* Insert/delete/search into trie data structure */

public class Trie {
     private final TrieNode root;

     public Trie() {
          root = new TrieNode();
     }

     /* Time Complexity O(L*N)
      * L-Length of the word
      * N-Number of words
     */
     public void insert(String word) {
          TrieNode current = root;
          for (int i = 0; i < word.length(); i++) {
              char ch = word.charAt(i);
              TrieNode node = current.children.get(ch);
              if (node == null) {
                   node = new TrieNode();
                   current.children.put(ch, node);
              }
              current = node;
          }
          current.eow = true;
     }

     /* Same is used in implementing auto-complete functionality */
     public boolean prefixBasedSearch(String word){
          TrieNode current =root;
          for(int i=0;i<word.length();i++){
              char ch= word.charAt(i);
              TrieNode childNode=current.children.get(ch);
              if(childNode==null)
                   return false;
              else
                   current=childNode;
          }
          return true;
     }
    
     public boolean wholeWordSearch(String word){
          TrieNode current =root;
          for(int i=0;i<word.length();i++){
              char ch= word.charAt(i);
              TrieNode childNode=current.children.get(ch);
              if(childNode==null)
                   return false;
              else
                   current=childNode;
          }
          return current.eow;
     }
    
     /**
     * Delete word from trie.
     */
    public void delete(String word) {
        delete(root, word, 0);
    }
   
    /**
     * Returns true if parent should delete the mapping
     */
    private boolean delete(TrieNode current, String word, int index) {
        if (index == word.length()) {
            //when end of word is reached only delete if currrent.endOfWord is true.
            if (!current.eow) {
                return false;
            }
            current.eow = false;
            //if current has no other mapping then return true
            if(current.children.size() == 0)
              System.out.println("Word "+word + " deleted from TRIE");
            return current.children.size() == 0;
        }
        char ch = word.charAt(index);
        TrieNode node = current.children.get(ch);
        if (node == null) {
            return false;
        }
        boolean shouldDeleteCurrentNode = delete(node, word, index + 1);

        //if true is returned then delete the mapping of character and trienode reference from map.
        if (shouldDeleteCurrentNode) {
            current.children.remove(ch);
            //return true if no mappings are left in the map.
            return current.children.size() == 0;
        }
        return false;
    }

     public static void main(String[] args) {
          Trie trie = new Trie();
          String[] insertStr = { "abc", "abgl", "cdf", "abcd", "lmn" };
          /* Inserting words */
          for (String word : insertStr) {
              trie.insert(word);
          }
         
          String[] prefSearchStr={"ab","lo"};
          /*Prefix based Search String */
          for(String word : prefSearchStr){
              System.out.println("Word with prefix as " +word + " Exist-->" +trie.prefixBasedSearch(word));
          }
         
          String[] wholeWordSearchStr={"lmn","cdf","abgl","abd"};
          /*Whole word Search */
          for(String word : wholeWordSearchStr){
              System.out.println("Whole Word " +word + " Exist-->" +trie.wholeWordSearch(word));
          }
         
          String[] deleteWordStr={"abcdf","lmn"};
          /*Whole word Search */
          for(String word : deleteWordStr){
              trie.delete(word);
          }
     }
}


Output

Word with prefix as ab Exist-->true
Word with prefix as lo Exist-->false
Whole Word lmn Exist-->true
Whole Word cdf Exist-->true
Whole Word abgl Exist-->true
Whole Word abd Exist-->false
Word lmn deleted from TRIE

No comments:

Post a Comment