Update DSU.md

2025-07-12 10:21:52 +00:00 · 2020-01-16 20:34:13 +05:30 · 2020-01-16 20:34:13 +05:30 · 29d9121501
commit 29d9121501
parent dcba97aca7
1 changed files with 49 additions and 23 deletions
--- a/Articles/DSU.md
+++ b/Articles/DSU.md
@ -1,14 +1,29 @@
+Suppose you are taking part in a programming contest and one of the problems is: you are given a number of vertices and a list of undirected, unweighted edges between them. The queries ask whether there is a path from some vertex $u$ to $v$. Note that the whole graph may not be connected. How can you solve it?
+
+![enter image description here](https://lh3.googleusercontent.com/H_pixss3v5apcsEkUmk_hAzOhgif-O43ce8IijOw3AhCmATXdw0QpG6eQCJEnmwcLs0NYUa96_XU)
+
+DFS, right? Start a DFS from either $u$ or $v$ and check whether we can reach the other vertex. Done!
+
+But what if the graph is **dynamic**, meaning that apart from the path query you are given another type of query: add an edge to the graph. How do we solve it now?
+
+![](https://lh3.googleusercontent.com/AK-Y9QBBXX0mB1twpZdPPTA2gcEhPjKwAh0cOxGaXltv6S2xcup9HPF2CDpjvhBlp3v4IiS341lz)
+
+DFS again? Yes, you can add any number of edges and still check whether there is a path between vertices $u$ and $v$. But each query then repeats a full traversal, costing $O(V + E)$ time, so with many queries you will get **TLE** (Time Limit Exceeded).
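
To make the cost of the DFS approach concrete, here is a minimal sketch of that baseline. The names `NaiveGraph`, `add_edge` and `has_path` are hypothetical and do not come from the article; vertices are assumed to be numbered from 0 to n-1.

```cpp
#include <vector>

// Hypothetical baseline (not from the article): an adjacency list
// plus a fresh DFS for every path query.
struct NaiveGraph {
    std::vector<std::vector<int>> adj;

    explicit NaiveGraph(int n) : adj(n) {}

    // Adding an undirected edge is cheap: O(1) amortized.
    void add_edge(int u, int v) {
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    // Every query walks the graph again, so it costs O(V + E).
    bool has_path(int u, int v) const {
        std::vector<char> seen(adj.size(), 0);
        std::vector<int> stack = {u};  // explicit stack instead of recursion
        seen[u] = 1;
        while (!stack.empty()) {
            int cur = stack.back();
            stack.pop_back();
            if (cur == v) return true;
            for (int next : adj[cur]) {
                if (!seen[next]) {
                    seen[next] = 1;
                    stack.push_back(next);
                }
            }
        }
        return false;
    }
};
```

With Q queries this does O(Q * (V + E)) work in total, which is what usually exceeds the time limit on large inputs.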
+
+Disjoint Set Union is a data structure that can perform these operations very efficiently.
+
+But what do we mean by the name **"Disjoint Set Union"**? A **set** is a collection of distinct elements. **Disjoint sets** are non-overlapping: in the language of math, if $A$ and $B$ are two disjoint sets then $A \cap B = \emptyset$. **Union** is the operation that combines two disjoint sets.
+
+In the problem stated above, we can treat the connected components as disjoint sets and perform a union whenever we add an edge.
+
+For the queries that ask whether there is a path from u to v, we check whether u and v are in the same disjoint set. If they are, there is a path from u to v; otherwise there is not. Confusing?
+
+Now, let's see how it actually works.

 ## Disjoint Set Union

 Disjoint Set Union is one of the simplest and easiest-to-implement data structures. It is used to keep track of disjoint (non-overlapping) dynamic sets.

-![enter image description here](https://lh3.googleusercontent.com/0_ODdLNCGwS5fJNoV_7_dv-yOcgpqpsoqNZg5pdXY1Ms6cTV8xUFMCORIc3ywty57Dal29hCOULw)
-
-In the image above, $\{a,b,c,d\}$ and $\{e,f\}$ are two non-overlapping(disjoint) sets. Dynamic here means that we can combine any two sets, so they are dynamic.
-
-**Note:** Here disjoint set is represented as a tree and terminology of tree is used.
-
 There are three main operations of this data structure: Make-set, Find and Union

 1. **Make-Set**: This operation creates a disjoint set having a single element.
@ -17,22 +32,24 @@ There are three main operations of this data structure: Make-set, Find and Union

 3. **Union**: This operation unifies two disjoint sets.

-There are many ways we can implement this data structure: Linked list, Array, Trees. But here we will implement using array and represent using tree.
+There are many ways to implement this data structure: linked lists, arrays, trees. Here we will implement it using an array and represent it as a tree.

 **Some Terminologies**

- - **Parent** is a main attribute of an element which represents an element by which a particular element is connected with some disjoint set.
+ - **Parent** is the main attribute of an element (or set): it is the element through which a particular element is connected to its disjoint set.
 In the image below, $c$ is parent of $d$ and $a$ is parent of $c$.

+	**Note:** The image below is just for visualization. If you don't understand it right now, don't worry, you will by the end of the article.
+
 ![enter image description here](https://lh3.googleusercontent.com/iaVsRRMUzGUK-PlEl7gFtUoDnatg9O2tBF-gMJ_qm5FyNWJSWXnCL6jAxX5siijx1L57Tg-3A0HZ)

- - **Root** is an element of a set whose parent is itself. It is unique per set. 
-$a$ is the root element for the left disjoint set.
+ - **Root** is an element of a set whose parent is itself. It is unique per set.
+$a$ is the root element of the disjoint set in the image above.


 ## Operation Make-Set

-Make-Set operation creates a new set having a single element (means $size=1$) which is having a unique id.
+The Make-Set operation creates a new set containing a single element (so its size is 1) with a unique id.

 **Pseudocode:**
 ```	
@ -44,7 +61,7 @@ MAKE-SET(x)
 ```
 Here X is the only element in the set so it is parent of itself.

-The image below represents sets generated by this operation. Where each one having arrow coming to itself, which represents that it is its own parent right now. Each one have size of 1.
+The image below shows the sets generated by this operation. Each element has an arrow pointing to itself, which means that it is currently its own parent. Each set has a size of 1.
 ![enter image description here](https://lh3.googleusercontent.com/UW-R9Hbi7YaCOyrVd2F0ThzzQ9pAF1zqoASJDhGKjbBHN8P-dJJr4sZubW1csc97l6iQMo3L39Bc)

 We are working with arrays, so the code to make $n$ sets is as below:
@ -63,13 +80,15 @@ void Make_sets(int n)
 }
 ```

-**Time Complexity:** Make-Set operation take $O(1)$ time. So creating $N$ sets it will take $O(N)$ time.
+**Time Complexity:** The Make-Set operation takes $O(1)$ time, so creating $N$ sets takes $O(N)$ time.

 ## Operation Find

-$\text{Find}(X)$ basically finds the root element of the disjoint set to which $X$ belongs. 
+$\text{Find}(X)$ basically finds the root element of the disjoint set to which $X$ belongs.

-If we apply $\text{Find}(d)$ or $\text{Find}(c)$ operation for the set in the image below, then it will return '$a$' which is a root element.
+The root serves as a unique ID for a particular disjoint set. (Look at the code for $\text{Find}(X)$)
+
+If we apply the $\text{Find}(d)$ or $\text{Find}(b)$ operation to the set in the image below, it will return '$a$', which is the root element.

 ![enter image description here](https://lh3.googleusercontent.com/j1H9MBKoSzyQV_8ObjBOjD1W2Na57kYg8aGMrbI8dLepF2IIqbRJSKzccH7rgfWrBqgFJ3LtYzAN)

@ -77,7 +96,7 @@ Here the thing to note is that, the root element of a root element of any disjoi

 **Algorithm**

-Until you reach at the root element, traverse the tree of the disjoint set upwards.
+ - Traverse the tree of the disjoint set upwards until you reach the root element.

 **Pseudocode:**
 ```
@ -128,11 +147,10 @@ int Find(x)

 This is too much. Right? What else can we do?

-We have a technique named **"Path compression"**, which burns this time to $O(log^*N)$. $log^*N$ is iterated logarithm-number of time you have to apply $log$ to $N$ before the result is less than or equal to 1.
-
-The idea of the Path compression is: **It re-connects every vertex to the root vertex directly, rather than by a path**.
+We have a technique named **"Path compression"**. The idea of path compression is that **it reconnects every vertex on the traversed path directly to the root vertex, rather than through a path**.

 If we apply $\text{Find}(d)$ operation with the path compression, then the following thing will happen.
+
 ![enter image description here](https://lh3.googleusercontent.com/ltQXkpZAjEO543ibrVodpMMZp2IHXVJ7Rjxevm2ztJQAC67UnvBeMmwEoIB9qZ0_2PgpSs98nWV9)

 How can we do it? It is easy, we just need a little modification in $\text{Find}(X)$.
@ -199,12 +217,20 @@ int Find(x)
 		return parent[x] = Find(parent[x]);
 }
 ```
+**Time complexity of Find:**
+
+ 1. Without path compression: $\mathcal{O}(N)$ in the worst case
+ 2. With path compression (combined with union by size or rank): amortized $\mathcal{O}(\log^*(N))$
+
+**Note:** 
+- $\log^*(N)$ is the **iterated logarithm**: the number of times we have to apply $\log$ to $N$ before the result becomes less than or equal to 1.
+- $\mathcal{O}(\log^*(N))$ is almost constant time because, with base-2 logarithms, $\log^*(N) \le 5$ even for a number as big as $2^{65536}$.
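
To get a feel for how slowly the iterated logarithm grows, the sketch below computes it for 64-bit integers. It assumes base-2 logarithms (the article does not state the base) and the function names `ceil_log2` and `log_star` are hypothetical. Each step rounds the logarithm up to an integer; this does not change the count, because the thresholds 2, 4, 16, 65536 at which the count increases are themselves integers.

```cpp
#include <cstdint>

// Smallest m with 2^m >= n, i.e. ceil(log2(n)). Assumes n >= 1.
// Computed as the bit length of n - 1, so no floating point is involved.
int ceil_log2(std::uint64_t n) {
    int bits = 0;
    for (std::uint64_t v = n - 1; v > 0; v >>= 1) ++bits;
    return bits;
}

// Iterated logarithm (base 2 assumed): how many times log2 must be
// applied to n before the result is <= 1.
int log_star(std::uint64_t n) {
    int count = 0;
    while (n > 1) {
        n = static_cast<std::uint64_t>(ceil_log2(n));
        ++count;
    }
    return count;
}
```

For example, `log_star(16)` is 3 (16, 4, 2, 1) and `log_star(65536)` is 4, while every larger value that fits in 64 bits gives 5.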

 ## Operation Union

-$Union(X,Y)$ operation first of all finds root element of both the disjoint sets containing X and Y respectively. Then it connects the root element of one of the disjoint set to the another.
+$\text{Union}(X,Y)$ first finds the root elements of the disjoint sets containing $X$ and $Y$ respectively. Then it connects the root element of one of the sets to the other.

-Well, how do we decide which root will connet to which? If we do it randomly then it may increase the tree height up to $O(N)$, which means that the next $Find(x)$ operation will take $O(N)$ time. Can we do better?
+Well, how do we decide which root connects to which? If we choose arbitrarily, the tree height may grow up to $O(N)$, which means that the next $\text{Find}(x)$ operation may take $O(N)$ time. Can we do better?

 Yes, we have two standard techniques: **By size and By rank**.

@ -299,8 +325,8 @@ void union(int x,int y)

 ### Time Complexity of Union

- 1. Without path compression: $O(N)$
- 2. With path compression: $O(log^*N)$
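+ 1. Without path compression (in Find): $\mathcal{O}(N)$ in the worst case when roots are linked arbitrarily, $\mathcal{O}(\log N)$ with union by size or rank
+ 2. With path compression and union by size or rank: amortized $\mathcal{O}(\log^*(N))$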
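
Putting the three operations together, the following is a minimal sketch of a DSU that uses union by size and path compression to answer the path queries from the introduction. It is one possible arrangement, not necessarily the article's exact code: the struct `DSU`, the method name `unite` (chosen because `union` is a reserved keyword in C++), the helper `connected`, and the iterative two-pass `find` are all assumptions made for illustration.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Illustrative DSU over elements 0..n-1 (names are hypothetical).
struct DSU {
    std::vector<int> parent;  // parent[x] == x means x is a root
    std::vector<int> size;    // size[r] is meaningful only for roots

    // Make-Set for all n elements: each one is its own parent, size 1.
    explicit DSU(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    // Find with path compression, written iteratively:
    // first locate the root, then re-point every vertex on the path to it.
    int find(int x) {
        int root = x;
        while (parent[root] != root) root = parent[root];
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    // Union by size: attach the smaller tree under the larger one.
    // Returns false if x and y were already in the same set.
    bool unite(int x, int y) {
        int a = find(x), b = find(y);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);
        parent[b] = a;
        size[a] += size[b];
        return true;
    }

    // Path query: u and v are connected iff they share a root.
    bool connected(int x, int y) { return find(x) == find(y); }
};
```

In terms of the original problem, an "add edge (u, v)" query becomes `unite(u, v)` and a "is there a path from u to v" query becomes `connected(u, v)`.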

 ## Applications of DSU