+
+
+
+
Do you know how the "auto-completion feature" provided by software such as IDEs, search engines, command-line interpreters, and text editors works?

+Below is an input box, which has an autocomplete feature for "country names". Try it out!
+
+
+
+
+
+
+
The basic data structure behind all of these is the **trie**.

@@ -14,17 +30,30 @@ String processing is widely used across real-world applications, for example dat
Trie is a very useful and special kind of data structure for string processing.
+Below is a very simple representation of a trie containing the strings `"cat"`, `"bat"`, and `"dog"`.
+
+
+
+Now, suppose we are given an array of strings and are asked to check whether the string `"cat"` is present in it. We can check by brute force, comparing it with each string in the array, which takes $O(N \cdot length("cat"))$ time in the worst case, where $N$ is the number of strings in the array.
+
+If you instead build a trie from all the strings in the array, you can check it in $O(length("cat"))$ time by traversing the trie (confused? we will see how soon). This is why a trie is an efficient information retrieval data structure.
+
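For reference, the brute-force check described above can be sketched as follows. This is only an illustration: the function name `contains_brute_force` is hypothetical and not part of the original article.

```cpp
#include <string>
#include <vector>

// Brute-force membership test: compare the target with every string in the array.
// Each comparison costs up to O(length(target)), so the worst case over N strings
// is O(N * length(target)).
bool contains_brute_force(const std::vector<std::string>& words,
                          const std::string& target) {
    for (const std::string& word : words) {
        if (word == target) {
            return true;  // found an exact match
        }
    }
    return false;  // scanned the whole array without a match
}
```

Calling `contains_brute_force({"bat", "dog", "cat"}, "cat")` has to look at all three strings before it finds the match, which is the worst case the trie avoids.
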
## Introduction
Trie is a tree of nodes, where the specifications of a node can be given as below:
Each node has,
-1. An array of size of the alphabet(see the note below).
+1. An array whose size equals the alphabet size (see the notes below), used to store links to child nodes.
2. A boolean variable.
-**Note:** For an easy understanding purpose, we are assuming that all strings contain lowercase alphabet letters, i.e. `alphabet_size` is $26$. **We can convert characters to a number by using `c-'a'`, `c` is a lowercase character.**
+**Notes**
-**We will see usages of these two variables soon.**
+1. For ease of understanding, we assume that all strings contain only lowercase letters, i.e. `alphabet_size` is $26$.
+2. We discuss the traditional array-based implementation here, although each node could instead use a data structure such as a hash table.
+
+
+
+**We will soon see why we need these two variables.**
```cpp
struct trie_node
@@ -43,9 +72,7 @@ struct trie_node
};
```
-
-
-Now, we have seen how a trie node looks like. Let's see how we are going to store strings in a trie using this kind of node.
+Now we have seen what a trie node looks like. Let's see **how we are going to store strings in a trie using this kind of node.**
## How to insert a string in a trie?
@@ -55,11 +82,8 @@ Look at the image below, which represents a string "act" stored in a trie. Obser
**Note: Empty places in the array hold null values (`nullptr` in C++).**
-What did you observe?
-
-Observations:
-1. **Other than the root node, each node in trie represents a single character.**
-2. **We set isEndofString to true in the node at which the string ends.**
+1. **Other than the root node, each node in the trie represents a single character.** In the above image, the $2^{nd}$, $3^{rd}$, and $4^{th}$ nodes represent `'a'`, `'c'`, and `'t'` respectively.
+2. **We set isEndofString to true in the node at which the string ends.** See the last node in the image above.
Therefore, for the sake of simplicity, we are going to represent the nodes of the trie as below.
@@ -85,7 +109,7 @@ A common prefix of `"ace"` and `"act"` is `"ac"` and therefore we are having the
Therefore, we do not create any new node until we need one, and **a trie is a very efficient way to store data when we have a large list of strings sharing common prefixes.** It is also known as a **prefix tree**.
-Now, observe the trie below, which contains three strings `"act"`, `"ace"` and `"cat"`.
+Now, look at the trie below, which contains the three strings `"act"`, `"ace"`, and `"cat"`.
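
To make the prefix sharing concrete, here is a minimal sketch of insertion and look-up under the lowercase-only assumption. It is not the article's own listing: the names `TrieNode`, `children`, `index_of`, `insert`, `search`, and `count_nodes` are chosen for illustration, and `std::unique_ptr` is used for the links purely as a convenience.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <string>

constexpr int kAlphabetSize = 26;  // assumption: strings use only 'a'..'z'

struct TrieNode {
    // One link per letter; a null pointer means "no child for this letter".
    std::array<std::unique_ptr<TrieNode>, kAlphabetSize> children{};
    bool isEndofString = false;  // true if some stored string ends at this node
};

// Maps a lowercase letter to an array index, i.e. c - 'a'.
int index_of(char c) {
    assert(c >= 'a' && c <= 'z');
    return c - 'a';
}

// Walks down from the root, creating a node only when the link is missing.
void insert(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char c : word) {
        std::unique_ptr<TrieNode>& child = node->children[index_of(c)];
        if (!child) {
            child = std::make_unique<TrieNode>();
        }
        node = child.get();
    }
    node->isEndofString = true;
}

// Returns true only if the exact word was inserted (a bare prefix does not count).
bool search(const TrieNode& root, const std::string& word) {
    const TrieNode* node = &root;
    for (char c : word) {
        node = node->children[index_of(c)].get();
        if (!node) {
            return false;
        }
    }
    return node->isEndofString;
}

// Counts all nodes in the trie, including the root.
int count_nodes(const TrieNode& node) {
    int total = 1;
    for (const auto& child : node.children) {
        if (child) {
            total += count_nodes(*child);
        }
    }
    return total;
}
```

After inserting `"act"`, `"ace"`, and `"cat"` into an empty root, `count_nodes` reports 8 nodes: the root plus 7 character nodes. Without sharing the `"ac"` prefix, the same three strings would need 9 character nodes.
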
@@ -426,7 +450,7 @@ Hashtable can be used to implement a dictionary. After precomputation of hash fo
But as the dictionary is very large there will be collisions between two or more words. Still, you can design a hash table to have efficient look-ups.
-But space usages is very high, as we simply store each word. But what if we design it using a trie?
+But a hash table has very high space usage, as we simply store each word along with its attached data. But what if we design it using a trie?
Since a dictionary has many words sharing common prefixes, a trie can save a substantial amount of memory. A trie supports look-up in $O(\text{word length})$ time, which can be slower than a very efficient hash table.
@@ -435,7 +459,7 @@ Other advantages of the trie are as below:
2. It also supports ordered traversal of words with given prefix
3. No need for complex hash functions
-So, if you want some of the above features then using trie is good for you. Also, we don't have to deal with collisions.
+So, if you want some of the above features, then using a trie is good. Also, we don't have to deal with collisions.
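
As an illustration of ordered traversal of words with a given prefix, the sketch below walks to the node for the prefix and then visits children from `'a'` to `'z'`, which yields the words in lexicographic order. The names `TrieNode`, `children`, `insert`, `collect`, and `words_with_prefix` are hypothetical, and lowercase-only input is assumed.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

constexpr int kAlphabetSize = 26;  // assumption: strings use only 'a'..'z'

struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, kAlphabetSize> children{};
    bool isEndofString = false;
};

void insert(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char c : word) {
        assert(c >= 'a' && c <= 'z');
        std::unique_ptr<TrieNode>& child = node->children[c - 'a'];
        if (!child) {
            child = std::make_unique<TrieNode>();
        }
        node = child.get();
    }
    node->isEndofString = true;
}

// Depth-first walk below `node`; `current` holds the characters on the path so far.
void collect(const TrieNode& node, std::string& current,
             std::vector<std::string>& out) {
    if (node.isEndofString) {
        out.push_back(current);
    }
    for (int i = 0; i < kAlphabetSize; ++i) {
        if (node.children[i]) {
            current.push_back(static_cast<char>('a' + i));
            collect(*node.children[i], current, out);
            current.pop_back();  // backtrack before trying the next letter
        }
    }
}

// Returns every stored word that starts with `prefix`, in lexicographic order.
std::vector<std::string> words_with_prefix(const TrieNode& root,
                                           const std::string& prefix) {
    const TrieNode* node = &root;
    for (char c : prefix) {
        assert(c >= 'a' && c <= 'z');
        node = node->children[c - 'a'].get();
        if (!node) {
            return {};  // no stored word has this prefix
        }
    }
    std::vector<std::string> out;
    std::string current = prefix;
    collect(*node, current, out);
    return out;
}
```

With `"cat"`, `"act"`, `"ace"`, and `"actor"` inserted, `words_with_prefix(root, "ac")` returns `"ace"`, `"act"`, `"actor"` in that order. A hash table cannot produce this list without scanning and sorting its keys.
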
Note that in a dictionary, each word comes with explanations or meanings. That can be handled by separately maintaining an array that stores all of that extra data. Then store one integer in the `TrieNode` structure to hold the index of the corresponding data in the array.
@@ -454,3 +478,151 @@ struct trie_node
The below image shows a typical trie structure for the dictionary.

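The idea of keeping the meanings in a separate array and storing only an index in each node could look roughly like the sketch below. The `Dictionary` class, the `dataIndex` field, and the `-1` sentinel are assumptions made for illustration; the article's own structure may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

constexpr int kAlphabetSize = 26;  // assumption: words use only 'a'..'z'

struct TrieNode {
    std::array<std::unique_ptr<TrieNode>, kAlphabetSize> children{};
    bool isEndofString = false;
    int dataIndex = -1;  // index into the meanings array; -1 means "no data" (assumed sentinel)
};

class Dictionary {
public:
    // Stores `meaning` for `word`, replacing any meaning stored earlier.
    void add(const std::string& word, const std::string& meaning) {
        TrieNode* node = &root_;
        for (char c : word) {
            assert(c >= 'a' && c <= 'z');
            std::unique_ptr<TrieNode>& child = node->children[c - 'a'];
            if (!child) {
                child = std::make_unique<TrieNode>();
            }
            node = child.get();
        }
        if (node->isEndofString) {
            meanings_[node->dataIndex] = meaning;  // word already present
        } else {
            node->isEndofString = true;
            node->dataIndex = static_cast<int>(meanings_.size());
            meanings_.push_back(meaning);
        }
    }

    // Returns a pointer to the stored meaning, or nullptr if the word is absent.
    // The pointer may be invalidated by a later call to add().
    const std::string* lookup(const std::string& word) const {
        const TrieNode* node = &root_;
        for (char c : word) {
            assert(c >= 'a' && c <= 'z');
            node = node->children[c - 'a'].get();
            if (!node) {
                return nullptr;
            }
        }
        return node->isEndofString ? &meanings_[node->dataIndex] : nullptr;
    }

    std::size_t size() const { return meanings_.size(); }

private:
    TrieNode root_;
    std::vector<std::string> meanings_;  // extra data kept outside the trie
};
```

Each node grows by only one integer, while the meanings themselves live in a single array. A look-up walks the trie in $O(\text{word length})$ and then makes one array access.
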
+
+
+
+
+
+
+
+
+