mirror of
https://github.com/dholerobin/Lecture_Notes.git
synced 2025-03-15 13:49:59 +00:00
Update Trie.md
parent
99b27b0733
commit
e5fe2c4725
@ -1,4 +1,5 @@
|
||||
Do you know how autocompletion provided by different softwares like IDEs, Search Engines, command-line interpreters, text editors, etc works?
|
||||
|
||||
Do you know how the "auto-completion feature" provided by software such as IDEs, search engines, command-line interpreters, and text editors works?
|
||||
|
||||

|
||||
|
||||
@ -9,41 +10,41 @@ The basic data structure behind all these scenes is **Trie**.
|
||||
Spell checkers can also be designed using **Trie**.
|
||||
|
||||
# Trie
|
||||
String processing is widely used across real world applications, for example data analytics, search engines, bioinformatics, plagiarism detection, etc.
|
||||
String processing is widely used across real-world applications, for example data analytics, search engines, bioinformatics, plagiarism detection, etc.
|
||||
|
||||
Trie is very useful and special kind of data structure for string processing.
|
||||
Trie is a very useful and special kind of data structure for string processing.
|
||||
|
||||
## Introduction
|
||||
|
||||
Trie is basically a tree of nodes, where specification of a node can be given as below:
|
||||
Trie is a tree of nodes, where the specifications of a node can be given as below:
|
||||
|
||||
Each node has,
|
||||
1. An array of datatype node and of size of alphabet.
|
||||
2. A boolean value(We will see why it is needed).
|
||||
1. An array of pointers to `trie_node`, with one slot per letter of the alphabet (see the note below).
|
||||
2. A boolean variable.
|
||||
|
||||
We will see usages of these two variables soon.
|
||||
**We will see usages of these two variables soon.**
|
||||
|
||||
```cpp
// Lowercase English letters only (see the note below)
const int alphabet_size = 26;

struct trie_node
{
    // Array of pointers of type trie_node,
    // one slot per letter of the alphabet
    vector<trie_node*> link;
    bool isEndofString;

    trie_node(bool end = false)
    {
        link.assign(alphabet_size, nullptr);
        isEndofString = end;
    }
};
```
|
||||
|
||||
**Note:** For easy understanding purpose, we are assuming that all strings contain lowercase alphabet letters that is alphabet size is $26$. **We can convert characters to a number by using `c-'a'`, `c` is a lowercase character.**
|
||||
**Note:** To keep things simple, we assume that all strings contain only lowercase English letters, i.e. `alphabet_size` is $26$. **We can convert a lowercase character `c` to an array index by using `c-'a'`.**
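The `c-'a'` mapping only makes sense under the lowercase assumption in the note. As a minimal sketch (the names `kAlphabetSize`, `charToIndex` and `isValidKey` are hypothetical and not part of the original notes), one way to guard against characters outside that range before touching the array could look like this:

```cpp
#include <string>

// Assumption: keys use only the lowercase letters 'a'..'z'
const int kAlphabetSize = 26;

// Hypothetical helper: maps 'a'..'z' to 0..25 and anything else to -1
int charToIndex(char c)
{
    if (c < 'a' || c > 'z')
        return -1;
    return c - 'a';
}

// Hypothetical helper: true if every character of s fits the assumed alphabet
bool isValidKey(const std::string& s)
{
    for (char c : s)
        if (charToIndex(c) == -1)
            return false;
    return true;
}
```

Calling `isValidKey()` before an insert or search avoids indexing the link array with a negative or too large value when the input contains uppercase letters, digits or spaces.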
|
||||
|
||||

|
||||
|
||||
Now, we have seen how trie node looks like. Let's see how we are going to store strings in a trie using these kind of nodes.
|
||||
Now we have seen what a trie node looks like. Let's see how we are going to store strings in a trie using this kind of node.
|
||||
|
||||
## How to insert a string in a trie?
|
||||
|
||||
@ -51,18 +52,18 @@ Look at the image below, which represents a string "act" stored in a trie. Obser
|
||||
|
||||

|
||||
|
||||
**Note: Empty places in the array have null values(`nullptr` in c++).**
|
||||
|
||||
What did you observe?
|
||||
|
||||
Observations:
|
||||
1. **Other root node, each node in trie represents a single character.**
|
||||
1. **Other than the root node, each node in trie represents a single character.**
|
||||
2. **We set isEndofString to true in the node at which the string ends.**
|
||||
|
||||
Therefore, for the sake of simplicity, we are going to represent the nodes of the trie as below.
|
||||
|
||||

|
||||
|
||||
**Note: Empty places in array have null values.**
|
||||
|
||||
The trie containing the string "act" is then represented as below.
|
||||
|
||||
.jpg)
|
||||
@ -73,13 +74,13 @@ Now, observe the trie below, which contains two strings "act" and "ace".
|
||||
|
||||
.jpg)
|
||||
|
||||
Note that the node representing character `c` in the above trie, in magnified sense would look as below:
|
||||
Note that the node representing character `c` in the above trie, in a magnified sense would look as below:
|
||||
|
||||

|
||||
|
||||
What did you observe?
|
||||
|
||||
Common prefix of `"ace"` and `"act"` is `"ac"` and therefore we are having same nodes until we traverse `"ac"` and then we create a new node for character `e`.
|
||||
The common prefix of `"ace"` and `"act"` is `"ac"`, so both strings share the same nodes while we traverse `"ac"`, and we create a new node only for the character `e`.
|
||||
|
||||
Therefore, we are not creating any new node until we need one and **Trie is a very efficient data storage, when we have a large list of strings sharing common prefixes.** It is also known as **prefix tree**.
|
||||
|
||||
@ -87,24 +88,24 @@ Now, observe the trie below, which contains three strings `"act"`, `"ace"` and `
|
||||
|
||||
.jpg)
|
||||
|
||||
Let's see proper algorithm to insert a string in a trie.
|
||||
Let's see a proper algorithm to insert a string in a trie.
|
||||
|
||||
1. Starting from the root, if there is already a node representing corresponding character of a string, then simply traverse.
|
||||
2. Otherwise, create a new node representing corresponding character.
|
||||
3. At the end of string, set `isEndofString` to true in the last ending node.
|
||||
1. Starting from the root, if there is already a node representing the corresponding character of a string, then simply traverse.
|
||||
2. Otherwise, create a new node representing the corresponding character.
|
||||
3. At the end of the string, set `isEndofString` to true in the last ending node.
|
||||
|
||||
```cpp
void insert(trie_node* root, string s)
{
    trie_node* temp = root;
    int n = s.size();
    for(int i = 0; i < n; i++){
        if(temp->link[s[i]-'a'] == nullptr)
            temp->link[s[i]-'a'] = new trie_node();
        // Traverse using link
        temp = temp->link[s[i]-'a'];
    }
    temp->isEndofString = true;
}
```
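To see the prefix sharing in numbers, here is a minimal self-contained sketch of the same three steps. The names `Node`, `kAlphabetSize` and `insertCounting` are hypothetical; the function additionally returns how many nodes it had to create, which is not something the original `insert()` does.

```cpp
#include <string>
#include <vector>

// Assumption: keys use only the lowercase letters 'a'..'z'
const int kAlphabetSize = 26;

struct Node {
    std::vector<Node*> link;
    bool isEnd;
    Node() : link(kAlphabetSize, nullptr), isEnd(false) {}
    // Frees the whole subtree below this node
    ~Node() { for (Node* child : link) delete child; }
};

// Inserts s and returns the number of nodes that were newly created
int insertCounting(Node* root, const std::string& s)
{
    int created = 0;
    Node* cur = root;
    for (char c : s) {
        int id = c - 'a';
        // Step 2: no node for this character yet, so create one
        if (cur->link[id] == nullptr) {
            cur->link[id] = new Node();
            created++;
        }
        // Step 1: traverse along the (possibly new) link
        cur = cur->link[id];
    }
    // Step 3: mark the end of the string
    cur->isEnd = true;
    return created;
}
```

Inserting `"act"` into an empty trie creates three nodes, while inserting `"ace"` afterwards creates only one, because the nodes for `"ac"` already exist.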
|
||||
|
||||
@ -124,28 +125,54 @@ Observe the trie given below and try to search whether `"on"` is present or not.
|
||||
|
||||
.jpg)
|
||||
|
||||
If you don't have `isEndofString` variable, then you will not be able to correctly check whether `on` is present or not. Because it is prefix of `once`.
|
||||
If you don't have the `isEndofString` variable, then you will not be able to correctly check whether `on` is present or not, because `on` is a prefix of `once` and the nodes for it exist either way.
|
||||
|
||||
**Algorithm**:
|
||||
|
||||
1. Starting from the root, try to traverse corresponding character of the string. If a link is present, then go ahead.
|
||||
1. Starting from the root, try to traverse the corresponding character of the string. If a link is present, then go ahead.
|
||||
2. Otherwise, the given string is not present in the trie.
|
||||
3. If you are successfully able to traverse according to the string, then check whether the query string is really present or not via `isEndofString` variable of a last node.
|
||||
3. If you are successfully able to traverse all corresponding characters of the string, then check whether the query string is present or not via `isEndofString` variable of the last node.
|
||||
|
||||
```cpp
bool search(trie_node* root, string s)
{
    trie_node* temp = root;
    int n = s.size();
    for(int i = 0; i < n; i++){
        // There is no further link
        if(temp->link[s[i]-'a'] == nullptr)
            return false;
        temp = temp->link[s[i]-'a'];
    }
    return temp->isEndofString;
}
```
|
||||
Can you write a recursive version of the above function?
|
||||
|
||||
**Recursive version:**
|
||||
```cpp
// @param: root -> root of the trie
// @param: s -> the string we are searching for
// @param: i -> index of s currently reached via recursive traversal
bool Rec_search(trie_node* root, string& s, int i = 0)
{
    // No link present
    // so string is not present
    if(root == nullptr)
        return false;
    if(i == (int)s.size()) {
        // present
        if(root->isEndofString)
            return true;
        else
            return false;
    }
    // Recursively traverse using links
    return Rec_search(root->link[s[i]-'a'], s, i+1);
}
```
|
||||
|
||||
**Time Complexity:** $O(N)$, where $N$ is the length of the string we are searching for.
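The difference between "the path exists" and "the word is stored" can be made explicit with two small functions. This is only a sketch with hypothetical names (`Node`, `insertWord`, `walk`, `containsWord`, `hasPrefix`); it assumes lowercase keys, as everywhere else in these notes.

```cpp
#include <string>
#include <vector>

// Assumption: keys use only the lowercase letters 'a'..'z'
const int kAlphabetSize = 26;

struct Node {
    std::vector<Node*> link;
    bool isEnd;
    Node() : link(kAlphabetSize, nullptr), isEnd(false) {}
    ~Node() { for (Node* child : link) delete child; }
};

void insertWord(Node* root, const std::string& s)
{
    Node* cur = root;
    for (char c : s) {
        if (cur->link[c - 'a'] == nullptr)
            cur->link[c - 'a'] = new Node();
        cur = cur->link[c - 'a'];
    }
    cur->isEnd = true;
}

// Follows s from root; returns the node reached, or nullptr if a link is missing
const Node* walk(const Node* root, const std::string& s)
{
    const Node* cur = root;
    for (char c : s) {
        cur = cur->link[c - 'a'];
        if (cur == nullptr)
            return nullptr;
    }
    return cur;
}

// True only if s itself was inserted (the end flag is set on the last node)
bool containsWord(const Node* root, const std::string& s)
{
    const Node* last = walk(root, s);
    return last != nullptr && last->isEnd;
}

// True if at least one stored word starts with s (the path merely has to exist)
bool hasPrefix(const Node* root, const std::string& s)
{
    return walk(root, s) != nullptr;
}
```

With only `"once"` stored, `hasPrefix(root, "on")` is true but `containsWord(root, "on")` is false; the second call becomes true only after `"on"` is inserted as well.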
|
||||
|
||||
## Delete
|
||||
|
||||
@ -157,126 +184,126 @@ Things to take care about while you are deleting a string from the trie,
|
||||
1. It should not affect any other string present in the trie.
|
||||
2. Therefore, we are only going to delete **the nodes which are present only due to the presence of the given string**, i.e. nodes that no other string passes through or ends at.
|
||||
|
||||
We are going to use recursive procedure. If the string is not present, then we will return `false` and `true` otherwise.
|
||||
We are going to use a recursive procedure. If the string is not present, then we will return `false` and `true` otherwise. **Recursive procedure for delete is a modified version of the recursive search procedure** and therefore make sure you understand that.
|
||||
|
||||
1. We are traversing trie via the given string recursively.
|
||||
2. While traversing, if we find that no link is present(`nullptr`) for the current character, then string is not present in the trie and return `false`.
|
||||
3. If we are successfully able to traverse the string(`i==s.size())`, then finally check `isEndofString` of the last node. If the string is really present, then return `true`. Otherwise return `false`.
|
||||
4. Now, while backtracking stage of recursion, delete nodes if it is no longer needed after deletion of the given string.
|
||||
Can you figure it out on your own?
|
||||
|
||||
**Procedure:**
|
||||
|
||||
1. We are traversing the trie recursively, the same way as in `Rec_search()` procedure.
|
||||
2. While traversing, if we find that no link is present(`root == nullptr`) for the current character, then the string is not present in the trie and return `false`.
|
||||
3. If we are able to traverse the whole string, i.e. until `i == s.size()`, then check `isEndofString` of the last node. If the string is present (`isEndofString == true`), then set it to `false` and return `true`. Otherwise, the string is not present, so return `false`.
|
||||
4. Now, while backtracking out of the recursion, delete each node that is no longer needed after deletion of the given string, i.e. a node that has no remaining links and does not mark the end of another string.
|
||||
|
||||
Now, go through the code below with very intuitive comments.
|
||||
|
||||
Now, Go through the code below, very intuitive comments are written.
|
||||
```cpp
// Checks whether any link is present
bool isEmptyNode(trie_node* node)
{
    for(auto i:node->link)
        if(i != nullptr)
            return false;
    return true;
}

// Returns true if the string is successfully deleted
// And if the string is not present in the trie then returns false.
// @param: root -> root of the trie
// @param: s -> string we are deleting
// @param: i -> index of @s currently reached via recursive traversal
bool deleteString(trie_node* root, string& s, int i = 0)
{
    // Means string is not present
    if(root == nullptr)
        return false;

    // Successfully traversed the whole string
    if(i == (int)s.size()) {
        // Check whether the string is really present
        // by checking `isEndofString` variable of the last node
        if(root->isEndofString) {
            // delete it
            root->isEndofString = false;
            return true;
        }
        else
            return false;
    }

    bool ans = deleteString(root->link[s[i]-'a'], s, i+1);

    // String is present
    if(ans) {
        trie_node* child = root->link[s[i]-'a'];
        // Check whether any other string passes through
        // or ends at this child node
        // If not, then delete it
        if(isEmptyNode(child) && !child->isEndofString) {
            // Deallocate used memory
            delete child;
            root->link[s[i]-'a'] = nullptr;
        }
        return true;
    }

    // Not present, so return false
    return false;
}
```
|
||||
|
||||
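**Time Complexity:** $O(N)$, where $N$ is the length of the string we are deleting. This treats the alphabet size as a constant, since `isEmptyNode()` scans all $26$ links at each level while backtracking.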
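A case worth checking by hand is deleting a word whose prefix is itself a stored word, for example `"once"` when `"on"` is also present. The sketch below is a self-contained variant of the delete procedure under the usual lowercase assumption; the names `Node`, `insertWord`, `containsWord` and `eraseWord` are hypothetical. It frees a child only when that child has no links and does not end another word.

```cpp
#include <string>
#include <vector>

// Assumption: keys use only the lowercase letters 'a'..'z'
const int kAlphabetSize = 26;

struct Node {
    std::vector<Node*> link;
    bool isEnd;
    Node() : link(kAlphabetSize, nullptr), isEnd(false) {}
    ~Node() { for (Node* child : link) delete child; }
};

void insertWord(Node* root, const std::string& s)
{
    Node* cur = root;
    for (char c : s) {
        if (cur->link[c - 'a'] == nullptr)
            cur->link[c - 'a'] = new Node();
        cur = cur->link[c - 'a'];
    }
    cur->isEnd = true;
}

bool containsWord(const Node* root, const std::string& s)
{
    const Node* cur = root;
    for (char c : s) {
        cur = cur->link[c - 'a'];
        if (cur == nullptr)
            return false;
    }
    return cur->isEnd;
}

bool hasNoLinks(const Node* node)
{
    for (const Node* child : node->link)
        if (child != nullptr)
            return false;
    return true;
}

// Returns true if s was present and has been removed
bool eraseWord(Node* node, const std::string& s, std::size_t i = 0)
{
    if (node == nullptr)
        return false;
    if (i == s.size()) {
        if (!node->isEnd)
            return false;
        node->isEnd = false;   // unmark the word itself
        return true;
    }
    int id = s[i] - 'a';
    if (!eraseWord(node->link[id], s, i + 1))
        return false;
    // Backtracking: free the child only if nothing else needs it
    Node* child = node->link[id];
    if (hasNoLinks(child) && !child->isEnd) {
        delete child;
        node->link[id] = nullptr;
    }
    return true;
}
```

After inserting `"on"` and `"once"` and then erasing `"once"`, the nodes for `c` and `e` are freed, while the node for `n` survives because it still ends the word `"on"`.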
|
||||
|
||||
## Trie as an array
|
||||
|
||||
Availability of dynamic arrays allow use to create Trie without using pointers.
|
||||
The availability of dynamic arrays allows us to create Trie without using pointers.
|
||||
|
||||
Now, we are going to store trie as a dynamic array of `TrieNodes`. In this implementation, we are going to use an array of integers instead of pointers in `TrieNode` and as a link, we are going to store index of a node rather than address of a node in the former case.
|
||||
Now, we are going to store the trie as a dynamic array of `TrieNode`s. In this implementation, each `TrieNode` holds an array of integers instead of pointers, and as a link we store the index of a node rather than the address of a node as in the former case.
|
||||
|
||||

|
||||
|
||||
See the below implementation of trie as an array, which is quite similar and intuitive as previous implementation.
|
||||
See the below implementation of trie as an array, which is quite similar to the previous implementation and just as intuitive.
|
||||
|
||||
```cpp
struct TrieNode
{
    vector<int> id_link;
    bool isEndofString;

    TrieNode(bool end = false)
    {
        isEndofString = end;
        id_link.assign(26,-1);
    }
};

// trie[0] is the root, so the vector must already
// contain one TrieNode before the first insert
void insert(vector<TrieNode>& trie, string s)
{
    int temp = 0;
    int n = s.size();
    for(int i = 0; i < n; i++) {
        if(trie[temp].id_link[s[i]-'a'] == -1) {
            trie[temp].id_link[s[i]-'a'] = (int)trie.size();
            trie.push_back(TrieNode());
        }
        temp = trie[temp].id_link[s[i]-'a'];
    }
    trie[temp].isEndofString = true;
}

bool search(vector<TrieNode>& trie, string s)
{
    int temp = 0;
    int n = s.size();
    for(int i = 0; i < n; i++) {
        if(trie[temp].id_link[s[i]-'a'] == -1)
            return false;
        temp = trie[temp].id_link[s[i]-'a'];
    }
    return trie[temp].isEndofString;
}
```
|
||||
But it has a downside that you can not delete strings present in the trie. Why?
|
||||
But it has a downside: you cannot, in general, remove the nodes of a string present in the trie. Why?
|
||||
|
||||
Try deleting a single node, you will realize that indexes of each subsequent node will change and moreover deleting in an array has a very bad performance.
|
||||
Try deleting a single node (other than the last one): you will realize that the indexes of all subsequent nodes change, so every stored link would have to be updated, and erasing from the middle of an array is slow as well.
|
||||
|
||||
It is easy implemention, but with single downside. Therefore, use as per the requirement.
|
||||
It is an easy implementation, but with a single downside. Therefore, use as per the requirement.
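If removing words is still needed with the array layout, one possible workaround (an assumption of this sketch, not something the original notes propose) is a "lazy" delete: only clear the end flag and leave the nodes in place, so no index ever changes. The names `ArrNode`, `insertWord`, `containsWord` and `eraseLazy` are hypothetical.

```cpp
#include <string>
#include <vector>

// Assumption: keys use only the lowercase letters 'a'..'z'
struct ArrNode {
    std::vector<int> id_link;
    bool isEnd;
    ArrNode() : id_link(26, -1), isEnd(false) {}
};

// trie[0] is the root, so trie must contain one node before the first call
void insertWord(std::vector<ArrNode>& trie, const std::string& s)
{
    int cur = 0;
    for (char c : s) {
        int id = c - 'a';
        if (trie[cur].id_link[id] == -1) {
            // Store the index first, then grow the array
            trie[cur].id_link[id] = (int)trie.size();
            trie.push_back(ArrNode());
        }
        cur = trie[cur].id_link[id];
    }
    trie[cur].isEnd = true;
}

// Returns the index of the node reached by s, or -1 if a link is missing
int walk(const std::vector<ArrNode>& trie, const std::string& s)
{
    int cur = 0;
    for (char c : s) {
        cur = trie[cur].id_link[c - 'a'];
        if (cur == -1)
            return -1;
    }
    return cur;
}

bool containsWord(const std::vector<ArrNode>& trie, const std::string& s)
{
    int last = walk(trie, s);
    return last != -1 && trie[last].isEnd;
}

// Lazy delete: clears the end flag only; no node is removed or renumbered
bool eraseLazy(std::vector<ArrNode>& trie, const std::string& s)
{
    int last = walk(trie, s);
    if (last == -1 || !trie[last].isEnd)
        return false;
    trie[last].isEnd = false;
    return true;
}
```

The trade-off is that memory is never reclaimed: after `eraseLazy()` the array keeps its size, so this only suits workloads where deletions are rare or the trie is rebuilt from time to time.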
|
||||
|
||||
## Count total number of words present in a Trie
|
||||
## Count the total number of words present in a Trie
|
||||
|
||||
How will you find the number of words(strings) present in the trie below?
|
||||
|
||||
@ -284,7 +311,7 @@ How will you find the number of words(strings) present in the trie below?
|
||||
|
||||
Ultimately, it means finding the total number of nodes whose `isEndofString` is `true`, which can easily be done using a recursive traversal of all the nodes present in the trie.
|
||||
|
||||
The basic idea of recursive procedure is as follow:
|
||||
The basic idea of the recursive procedure is as follows:
|
||||
|
||||
Start from the $\text{root}$ node and go through all $26$ positions of the `link` array. For each not-null link, recursively call `countWords()` considering that linked node as a $\text{root}$. And therefore formula will be as below:
|
||||
|
||||
@ -295,135 +322,134 @@ Finally, add $1$ to $\text{TotalWords}$ if the current node has `isEndofString =
|
||||
```cpp
int countWords(trie_node* root)
{
    int total = 0;
    if(root == nullptr)
        return 0;
    for(auto i:root->link)
        if(i != nullptr)
            total += countWords(i);
    total += root->isEndofString;
    return total;
}
```
|
||||
**Time complexity:** $O(\text{Number of nodes present in the trie})$, as we are visiting each and every node.
|
||||
**Time complexity:** $O(\text{Number of nodes present in the trie})$, as we are visiting each and every node. <br>
|
||||
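**Space complexity:** $O(H)$ for the recursion stack, where $H$ is the length of the longest word in the trie; no other extra memory is used.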
|
||||
|
||||
## Print all words stored in Trie
|
||||
|
||||
It is similar to finding total number of words but instead of adding $1$ for each `isEndofString`'s true value, we are going to store the word representing that particular end.
|
||||
It is similar to finding the total number of words but instead of adding $1$ for each `isEndofString`'s true value, we are going to store the word representing that particular end.
|
||||
|
||||
The code is similar as finding total number of words.
|
||||
The code is similar to finding the total number of words.
|
||||
|
||||
```cpp
void printAllWords(trie_node* root, vector<string>& ans, string s="")
{
    if(root == nullptr)
        return;
    for(int i = 0; i < alphabet_size; i++) {
        if(root->link[i] != nullptr) {
            char c = 'a' + i;
            string temp = s;
            temp += c;
            printAllWords(root->link[i], ans, temp);
        }
    }
    if(root->isEndofString)
        ans.push_back(s);
}
```
|
||||
**Time complexity:** $O(\text{Number of nodes present in the trie})$, as we are visiting each and every node.
|
||||
**Time complexity:** $O(\text{Number of nodes present in the trie})$, as we are visiting each and every node. <br>
|
||||
**Space Complexity:** $O(\text{Total length of all words present in the trie})$
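A small variation of the same traversal can return the words in dictionary order while avoiding a string copy per node. The sketch below (hypothetical names `Node`, `insertWord`, `collectSorted`; lowercase keys assumed) records a word before visiting its children and reuses one shared buffer, appending a character on the way down and removing it on the way back.

```cpp
#include <string>
#include <vector>

// Assumption: keys use only the lowercase letters 'a'..'z'
const int kAlphabetSize = 26;

struct Node {
    std::vector<Node*> link;
    bool isEnd;
    Node() : link(kAlphabetSize, nullptr), isEnd(false) {}
    ~Node() { for (Node* child : link) delete child; }
};

void insertWord(Node* root, const std::string& s)
{
    Node* cur = root;
    for (char c : s) {
        if (cur->link[c - 'a'] == nullptr)
            cur->link[c - 'a'] = new Node();
        cur = cur->link[c - 'a'];
    }
    cur->isEnd = true;
}

// Appends every word below node to out, in lexicographic order.
// word holds the characters on the path from the root to node.
void collectSorted(const Node* node, std::string& word,
                   std::vector<std::string>& out)
{
    if (node == nullptr)
        return;
    // Record the word before its extensions, so "on" precedes "once"
    if (node->isEnd)
        out.push_back(word);
    // Visiting links from 'a' to 'z' keeps the output sorted
    for (int i = 0; i < kAlphabetSize; i++) {
        if (node->link[i] != nullptr) {
            word.push_back(static_cast<char>('a' + i));
            collectSorted(node->link[i], word, out);
            word.pop_back();   // undo before trying the next letter
        }
    }
}
```

For a trie holding `"once"`, `"on"`, `"act"` and `"ace"`, calling `collectSorted(&root, word, out)` with an empty `word` fills `out` with `"ace"`, `"act"`, `"on"`, `"once"` in that order.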
|
||||
|
||||
## Auto-suggestion features
|
||||
|
||||
How will you design autocompletion feature using Trie?
|
||||
How will you design the autocompletion feature using Trie?
|
||||
|
||||
For example, we have stored C++ keywords in a trie. Now, when you type `"n"` it should show all keywords starting from `"n"`. For simplicity only keywords starting from `"n"` are shown in the trie below,
|
||||
For example, we have stored C++ keywords in a trie. Now, when you type `"n"` it should show all keywords starting from `"n"`. For simplicity, only keywords starting from `"n"` are shown in the trie below,
|
||||
|
||||
.jpg)
|
||||
|
||||
How will you print all keywords starting from `"n"`? OR how will you print all keywords having `"n"` as prefix?
|
||||
How will you print all keywords starting from `"n"`? OR how will you print all keywords having `"n"` as a prefix?
|
||||
|
||||
Simply use `printAllWords()` on node `n`, and problem is solved!
|
||||
Simply use `printAllWords()` on node `n`, and the problem is solved!
|
||||
|
||||
Common procedure is as below:
|
||||
A common procedure is as below:
|
||||
|
||||
1. Traverse nodes in trie according to the given uncomplete string `s`. If we are successfully able to traverse `s`, then there are keywords having prefix of `s`. Otherwise, there will be nothing to suggest.
|
||||
1. Traverse nodes in the trie according to the given incomplete string `s`. If we are successfully able to traverse `s`, then there are keywords having `s` as a prefix. Otherwise, there will be nothing to suggest.
|
||||
|
||||
2. Now, use `printAllWords()` considering the last node(after traversal of trie according to `s`) as a root.
|
||||
|
||||
```cpp
void autocomplete(trie_node* root, string s)
{
    int n = s.size();
    trie_node* temp = root;
    for(int i = 0; i < n; i++) {
        if(temp->link[s[i]-'a'] == nullptr)
            return;
        temp = temp->link[s[i]-'a'];
    }
    vector<string> suggest;
    printAllWords(temp, suggest, s);
    for(auto i:suggest)
        cout << i << endl;
    /*
    OR
    printAllWords(temp, suggest);
    for(auto i:suggest)
        cout << s << i << endl;
    */
}
```
|
||||
|
||||
|
||||
**Time complexity:** $O(\text{Length of S + Total length of all suggestions excluding common prefix(S) from all})$, where `s` is the string you want suggestions for.
|
||||
**Time complexity:** $O(\text{Length of S + Total length of all suggestions excluding common prefix(S) from all})$, where `s` is the string you want suggestions for. <br>
|
||||
**Space complexity:** $O(\text{Total length of all possible suggestions})$
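In an interactive setting one usually wants only the first few suggestions rather than all of them. As a rough sketch of that idea (the `limit` parameter and the names `Node`, `insertWord`, `collect` and `suggest` are assumptions for illustration), the traversal can stop as soon as enough words have been gathered:

```cpp
#include <string>
#include <vector>

// Assumption: keys use only the lowercase letters 'a'..'z'
const int kAlphabetSize = 26;

struct Node {
    std::vector<Node*> link;
    bool isEnd;
    Node() : link(kAlphabetSize, nullptr), isEnd(false) {}
    ~Node() { for (Node* child : link) delete child; }
};

void insertWord(Node* root, const std::string& s)
{
    Node* cur = root;
    for (char c : s) {
        if (cur->link[c - 'a'] == nullptr)
            cur->link[c - 'a'] = new Node();
        cur = cur->link[c - 'a'];
    }
    cur->isEnd = true;
}

// Gathers words below node in sorted order, stopping once out holds limit words
void collect(const Node* node, std::string& word, std::size_t limit,
             std::vector<std::string>& out)
{
    if (node == nullptr || out.size() >= limit)
        return;
    if (node->isEnd)
        out.push_back(word);
    for (int i = 0; i < kAlphabetSize && out.size() < limit; i++) {
        if (node->link[i] != nullptr) {
            word.push_back(static_cast<char>('a' + i));
            collect(node->link[i], word, limit, out);
            word.pop_back();
        }
    }
}

// Returns at most limit stored words that start with prefix
std::vector<std::string> suggest(const Node* root, const std::string& prefix,
                                 std::size_t limit)
{
    std::vector<std::string> out;
    const Node* cur = root;
    // Step 1: follow the prefix; nothing to suggest if a link is missing
    for (char c : prefix) {
        cur = cur->link[c - 'a'];
        if (cur == nullptr)
            return out;
    }
    // Step 2: gather words from the node reached, treating it as a root
    std::string word = prefix;
    collect(cur, word, limit, out);
    return out;
}
```

With the keywords `"new"`, `"namespace"`, `"noexcept"`, `"not"` and `"nullptr"` stored, `suggest(&root, "no", 10)` yields `"noexcept"` and `"not"`, and `suggest(&root, "n", 2)` yields `"namespace"` and `"new"`.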
|
||||
|
||||
It is widely used feature, as discussed at the start of the article.
|
||||
It is a widely used feature, as discussed at the start of the article.
|
||||
|
||||
There is also something called **"Ternary Search Tree"**. When each node in the trie has most of its links used(having many similar prefixe words), trie is substantially more space efficient and time efficient than ternary search tree.
|
||||
There is also something called **"Ternary Search Tree"**. When each node in the trie has most of its links used(having many similar prefix words), a trie is substantially more space-efficient and time-efficient than the ternary search tree.
|
||||
|
||||
But, If each node stores few links, then ternary search tree is much more space efficient, because we are using $26$ pointers in each node of trie and many of them may be unused.
|
||||
But if each node stores only a few links, then the ternary search tree is much more space-efficient, because we are using $26$ pointers in each node of the trie and many of them may be unused.
|
||||
|
||||
Therefore, use as per the requirements.
|
||||
|
||||
## Dictionary using Trie
|
||||
|
||||
What are common features of an english dictionary?
|
||||
What are the common features of an English dictionary?
|
||||
|
||||
1. Efficient Lookup of words
|
||||
2. As dictionary is very large, Less memory usages
|
||||
2. Low memory usage, as the dictionary is very large
|
||||
|
||||
Hashtable can be used to implement dictionary. After precomputation of hash for each word in $O(M)$, where $M$ is total length of all words in the dictionary, we can have efficient lookups if we design a very efficient hashtable.
|
||||
Hashtable can be used to implement a dictionary. After precomputation of hash for each word in $O(M)$, where $M$ is the total length of all words in the dictionary, we can have efficient lookups if we design a very efficient hashtable.
|
||||
|
||||
But as dictionary is very large there will be collisions between two or more words. But still you can design hash table to have efficient look-ups.
|
||||
But as the dictionary is very large there will be collisions between two or more words. Still, you can design a hash table to have efficient look-ups.
|
||||
|
||||
But space usages is very high, as we simply store each words. But what if we design it using a trie?
|
||||
But the space usage is very high, as we simply store every word in full. But what if we design it using a trie?
|
||||
|
||||
As in a dictionary we have many common-prefix words, trie will save substantial amount of memory consumption. Trie supports look-up in $O(word length)$, which is higher than a very efficient hash table.
|
||||
As in a dictionary we have many common-prefix words, trie will save a substantial amount of memory consumption. Trie supports look-up in $O(\text{word length})$, which is higher than a very efficient hash table.
|
||||
|
||||
Other advantages of trie is as below:
|
||||
Other advantages of the trie are as below:
|
||||
1. Auto-complete feature
|
||||
2. It also supports ordered traversal of words with given prefix
|
||||
3. No need for complex hash functions
|
||||
|
||||
So, if you want some of the above features then using trie is good for you. Also, we don't have to deal with collisions.
|
||||
|
||||
Note that in dictionary along with a word, we have explanations or meanings of that word. That can be handled by seperately maintaining an array which stores all those extra stuffs. Then store one integer in the `TrieNode` structure to store the index of the corresponding data in the array.
|
||||
Note that in the dictionary, along with a word, we have explanations or meanings of that word. That can be handled by separately maintaining an array that stores all that extra data. Then store one integer in the `trie_node` structure to hold the index of the corresponding data in the array.
|
||||
|
||||
```cpp
struct trie_node
{
    // Array of pointers of type
    // trie_node
    vector<trie_node*> link;
    bool isEndofString;
    // To store id of data
    int idOfData;
};
```
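To make the `idOfData` idea concrete, here is a minimal sketch of a dictionary lookup. Everything beyond the `idOfData` field itself is an assumption for illustration: the names `DictNode`, `addWord` and `findId`, the use of `-1` for "no data", and keeping the meanings in a `std::vector<std::string>`.

```cpp
#include <string>
#include <vector>

// Assumption: words use only the lowercase letters 'a'..'z'
const int kAlphabetSize = 26;

struct DictNode {
    std::vector<DictNode*> link;
    bool isEnd;
    int idOfData;   // index into the meanings array, -1 if none
    DictNode() : link(kAlphabetSize, nullptr), isEnd(false), idOfData(-1) {}
    ~DictNode() { for (DictNode* child : link) delete child; }
};

// Stores word in the trie and its meaning in the separate array
void addWord(DictNode* root, std::vector<std::string>& meanings,
             const std::string& word, const std::string& meaning)
{
    DictNode* cur = root;
    for (char c : word) {
        if (cur->link[c - 'a'] == nullptr)
            cur->link[c - 'a'] = new DictNode();
        cur = cur->link[c - 'a'];
    }
    cur->isEnd = true;
    if (cur->idOfData == -1) {
        // New word: append its meaning and remember the index
        cur->idOfData = (int)meanings.size();
        meanings.push_back(meaning);
    } else {
        // Known word: overwrite the stored meaning
        meanings[cur->idOfData] = meaning;
    }
}

// Returns the index of the word's data, or -1 if the word is not stored
int findId(const DictNode* root, const std::string& word)
{
    const DictNode* cur = root;
    for (char c : word) {
        cur = cur->link[c - 'a'];
        if (cur == nullptr)
            return -1;
    }
    return cur->isEnd ? cur->idOfData : -1;
}
```

A lookup is then two steps: `int id = findId(&root, "act");` followed by `meanings[id]` when `id` is not `-1`. The trie nodes stay small because only one integer per node refers to the potentially large explanation text.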
|
||||
|
||||
Below image shows a typical trie structure for dictionary.
|
||||
The below image shows a typical trie structure for the dictionary.
|
||||
|
||||
|
||||
.jpg)
|
||||

|
||||
|