Trie or prefix tree is a tree-like data structure that implements dictionary abstract data type. In other words, trie allows to store key/value pairs in hierarchical structure. For example, let say we have words {a, as, air, aim, ail, car, i, milk, be, bag}. These words can be represented as follows.
As you can see the root of the trie is an empty string that has five direct children. Every child represents first letter of a given word so that the words with the same prefix share the common nodes. As a result, a key can consist of one or more nodes, but values stored in one node.
Applications
The most common application of the trie is autocomplete and spell-checking subroutines in web browsers, mobile phones and text editors.
The tries have several advantages as compared to hash tables. For instance, weak point of hash tables is hash function that can have collisions during generation of hash codes. As a result, the worst case in lookup data is O(N) where N is number of entries. Lookup in the trie has O(L) time complexity, where L is a length of the searching word. Another advantage is that a trie can provide an alphabetical ordering of the entries by key.
Implementation and operations on tries
In my implementation I used array lists to represent the trie. Every node contains array list of child nodes, character symbol and value that can be null in case if the node is not a key. There are three main operations that can be applied on the trie: insert a new entry into the trie, find entry and delete entry.
Here goes my implementation of the trie data structure on github: https://github.com/alexvolov/algorithms-and-data-structures/blob/master/src/main/java/com/alexvolov/ads/ds/impl/ArrayListTrie.java
Improvements
In general all improvements boils down to compression of the trie. A compressed trie a is trie in which all intermediate nodes that goes to the one single node are deleted. For example, on the diagram below for key “CAR” we have three nodes that can be compressed into one. The same for the key “MILK”.
Time complexity and memory usage
In terms of memory consumption and time complexity the trie doesn’t yield to hash tables and balanced trees. The main operations such as find, delete, insert has O(|Key|) time complexity. For example, lookup from hash table is constant but generation of hash code has O(|Key|) time complexity.
By memory consumption compressed prefix trees surpass hash tables and balanced tries. Because, the keys with the same prefix uses the same memory.