Add sorting notes

This commit is contained in:
Pragy Agarwal 2019-10-14 15:04:31 +05:30
parent 3301445103
commit 25f9902805
2 changed files with 291 additions and 0 deletions

Sorting/1.md Normal file

@ -0,0 +1,152 @@
# Sorting
- define sorting: permuting a sequence so that its elements appear in a chosen order. todo
- brute force: try every permutation and check whether it is sorted, $O(n! \times n)$
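
The brute-force bound can be made concrete with a minimal sketch. The function name `permutationSort` is made up for illustration; it steps through permutations with `std::next_permutation` until one passes the $O(n)$ sortedness check, so it is only usable for tiny inputs.

```c++
#include <algorithm>
#include <vector>

// Brute-force sort (illustrative name): up to n! permutations,
// each checked for sortedness in O(n).
void permutationSort(std::vector<int>& a) {
    while (!std::is_sorted(a.begin(), a.end())) {
        // Rearranges a into the next lexicographic permutation and
        // wraps around to the smallest one after the largest.
        std::next_permutation(a.begin(), a.end());
    }
}
```
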
Stability
---------
- definition: if two objects compare equal, they must retain their original relative order after the sort
- importance:
- preserving order - the values could be orders (purchases), and their chronological order may be important
- sorting tuples - sort on first column, then stable-sort on second column; ties in the second column stay ordered by the first
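
A small sketch of the tuple case, assuming hypothetical `(name, score)` records: a stable sort on the score keeps rows with equal scores in whatever order they already had, which is exactly what makes multi-pass sorting work.

```c++
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// (name, score) record; an illustrative type, not from the notes.
using Record = std::pair<std::string, int>;

// Sort by score only. std::stable_sort keeps the existing relative
// order of rows whose scores are equal.
void sortByScoreStable(std::vector<Record>& rows) {
    std::stable_sort(rows.begin(), rows.end(),
                     [](const Record& x, const Record& y) {
                         return x.second < y.second;
                     });
}
```

If the rows were sorted by name beforehand, the result is ordered by score with ties broken by name.
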
-- --
Insertion Sort
--------------
- explain:
- 1st element is sorted
- invariant: when processing index i, the array up to index i-1 is already sorted
- take the element at index i, and insert it at its correct position in that sorted prefix
- pseudo code:
```c++
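// Sorts arr[0..length-1] in place in ascending order.
void insertionSort(int arr[], int length) {
    int i, j, key;
    for (i = 1; i < length; i++) {
        key = arr[i];   // element to insert into the sorted prefix arr[0..i-1]
        j = i-1;
        // shift every element strictly greater than key one slot to the right
        while (j >= 0 && arr[j] > key) {
            arr[j+1] = arr[j];
            j--;
        }
        arr[j + 1] = key;   // drop key into the gap
    }
}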
```
- **Stability:** Stable, because an element is shifted only when it is strictly > key. Had it been >=, equal elements would change order and the sort would be unstable
- **Complexity:** $O(n^2)$
-- --
Bubble Sort
-----------
- explain:
- invariant: after i passes, the last i elements are the i largest ones and are in their correct place.
- why "bubble": largest unsorted element bubbles up - just like bubbles
- pseudo code:
```c++
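#include <utility>  // std::swap

void bubbleSort(int arr[], int n) {
    for (int i = 0; i < n-1; i++)
        // after pass i, the last i+1 elements are in their final place
        for (int j = 0; j < n-i-1; j++)
            if (arr[j] > arr[j+1])   // strict >, so equal elements never swap
                std::swap(arr[j], arr[j+1]);
}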
```
- **Stability:** Stable, because adjacent elements are swapped only when strictly out of order
- **Complexity:** $O(n^2)$
-- --
Bubble Sort with window of size 3
---------------------------------
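- explain bubble sort as a window of size 2 sliding over the array
- propose a window of size 3 (compare and swap only the two ends of the window)
- does this work?
- no - elements at even indices and elements at odd indices are never compared with each other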
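
A sketch of the failure, under the assumption that the size-3 window compares and swaps only its two end elements (the notes do not spell the variant out, and `bubbleSortWindow3` is an illustrative name). Even-indexed and odd-indexed elements get sorted among themselves but never against each other.

```c++
#include <utility>
#include <vector>

// Assumed variant: the window covers arr[j..j+2] and only its two
// ends are compared. This is NOT a correct sorting algorithm.
void bubbleSortWindow3(std::vector<int>& arr) {
    const int n = static_cast<int>(arr.size());
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j + 2 < n; j++)
            if (arr[j] > arr[j + 2])
                std::swap(arr[j], arr[j + 2]);
}
```

On `{4, 3, 2, 1}` this ends at `{2, 1, 4, 3}`: the even positions hold `2, 4` and the odd positions hold `1, 3`, each sorted on its own, but the array as a whole is not.
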
-- --
Counting Sort
-------------
- explain:
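- given array, first find min and max in $O(n)$ time
- create count space of size $O(\max - \min)$
- count the number of occurrences of each value
- take prefix sum, so that each count becomes the number of elements <= that value, i.e. one past the value's last output position
- constraint: can only be used when the numbers are bounded.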
- pseudo code:
```c++
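// Sorts the n chars of arr into output (output must have room for n chars).
void counting_sort(const char arr[], int n, char output[]) {
    // find min, max: chars are bounded, so the range 0..255 is known up front
    // create count space, one slot per possible value
    int count[256] = {0};
    // count elements
    for (int i = 0; i < n; i++)
        ++count[(unsigned char)arr[i]];
    // take prefix sum: count[v] = number of elements <= v
    for (int v = 1; v < 256; v++)
        count[v] += count[v-1];
    // To make it stable we are operating in reverse order.
    for (int i = n-1; i >= 0; i--) {
        unsigned char v = arr[i];
        output[count[v] - 1] = arr[i];
        -- count[v];
    }
}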
```
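- **Stability:** Stable, if implemented correctly (fill the output in reverse order)
- **Complexity:** $O(n + (\max a[i] - \min a[i]))$
- why not just write each value out as many times as it was counted? if the elements are plain numbers/values, can do. Else, they could be objects sorted by a key, and each object itself must be moved to its slot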
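
The object case can be sketched as follows. The `Item` type, the function name and the `maxKey` parameter are assumptions for illustration; keys are assumed to lie in `[0, maxKey]`. The prefix sums tell each record where to go, and walking the input backwards keeps records with equal keys in their original order.

```c++
#include <string>
#include <utility>
#include <vector>

// (key, payload) record; illustrative type. Keys assumed in [0, maxKey].
using Item = std::pair<int, std::string>;

std::vector<Item> countingSortByKey(const std::vector<Item>& items, int maxKey) {
    std::vector<int> count(maxKey + 1, 0);
    for (const Item& it : items) ++count[it.first];               // count keys
    for (int k = 1; k <= maxKey; ++k) count[k] += count[k - 1];   // prefix sum
    std::vector<Item> output(items.size());
    // Reverse order: the last record with a given key takes the last
    // slot reserved for that key, which preserves stability.
    for (int i = static_cast<int>(items.size()) - 1; i >= 0; --i) {
        output[count[items[i].first] - 1] = items[i];
        --count[items[i].first];
    }
    return output;
}
```
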
-- --
Radix Sort
----------
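- sort elements from least significant digit to most significant digit
- explain: basically counting sort on each bit / digit
- **Stability:** the per-digit sort must be stable - radix sort won't work if it is unstable, because a later pass would scramble the order established by the earlier passes
- **Complexity:** $O(n \log\max a[i])$ - one $O(n)$ pass per digit, and the largest value has $O(\log\max a[i])$ digits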
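
A minimal sketch of least-significant-digit radix sort, assuming unsigned 32-bit values and a "digit" of one byte (base 256); both choices are illustrative, not from the notes. Each pass is the stable counting sort described earlier.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort for unsigned 32-bit values, one byte per pass.
void radixSort(std::vector<std::uint32_t>& a) {
    std::vector<std::uint32_t> output(a.size());
    for (int shift = 0; shift < 32; shift += 8) {
        // Stable counting sort on the byte at bit offset `shift`.
        std::size_t count[256] = {0};
        for (std::uint32_t v : a) ++count[(v >> shift) & 0xFF];
        for (int d = 1; d < 256; ++d) count[d] += count[d - 1];
        for (std::size_t i = a.size(); i-- > 0;) {   // reverse order
            const std::uint32_t d = (a[i] >> shift) & 0xFF;
            output[--count[d]] = a[i];
        }
        a.swap(output);   // this pass's result becomes the next input
    }
}
```
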
-- --
Partition Array
---------------
> Array of size $n$
> Given $k$, $k \le n$
> Partition array into two parts $A$ with $|A| = k$ elements and $B$ with $|B| = n-k$ elements, such that $|\sum A - \sum B|$ is maximized
- Sort and choose smallest k?
- Counterexample
```
1 2 3 4 5
k = 3
bad:  A = {1, 2, 3}, B = {4, 5}  ->  |6 - 9| = 3
good: A = {3, 4, 5}, B = {1, 2}  ->  |12 - 3| = 9
```
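- choose based on n/2 - if $k \le n/2$, A takes the k smallest elements, otherwise A takes the k largest. We want the smaller sum to be smaller, so it gets fewer elements, and the larger sum to be larger, so it gets more elements (this reasoning assumes non-negative values)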
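
A hedged sketch of the partition idea. Instead of branching on n/2, it evaluates both candidates (A = k smallest, A = k largest) and keeps the better one, which also covers inputs with negative values. The function name is illustrative, and $0 \le k \le n$ is assumed.

```c++
#include <algorithm>
#include <cstdlib>
#include <numeric>
#include <vector>

// Maximum |sum(A) - sum(B)| with |A| = k and |B| = n - k.
// Assumes 0 <= k <= a.size().
long long maxPartitionDifference(std::vector<int> a, int k) {
    std::sort(a.begin(), a.end());
    const long long total = std::accumulate(a.begin(), a.end(), 0LL);
    const long long smallestK = std::accumulate(a.begin(), a.begin() + k, 0LL);
    const long long largestK = std::accumulate(a.end() - k, a.end(), 0LL);
    // |sum(A) - sum(B)| = |2 * sum(A) - total|, so only the extreme
    // values of sum(A) can be optimal.
    return std::max(std::llabs(2 * smallestK - total),
                    std::llabs(2 * largestK - total));
}
```
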
-- --
Sextuples
---------
> Given A[n], all distinct
> find the count of sextuples $(a, b, c, d, e, f)$ of elements of A such that
> $$\frac{a b + c}{d} - e = f$$
> Note: numbers can repeat in the sextuple
- Naive: try all $n^6$ sextuples (values may repeat), $O(n^6)$
- Optimization. Rewrite the equation as $ab + c = d(e + f)$
- Now, each side depends on only 3 values, so we only need to enumerate $n^3$ triples per side, $O(n^3)$
- Caution: $d \neq 0$
- Once you have the array of RHS values (size $n^3$), sort it in $O(n^3 \log n^3) = O(n^3 \log n)$ time.
- Then for each value of LHS, count its occurrences using binary search in the sorted array in $O(\log n^3) = O(\log n)$ time.
- Total: $O(n^3 \log n)$
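
The steps above can be sketched as follows. The function name is illustrative; `std::equal_range` does the two binary searches that bound the run of matching RHS values, and 64-bit arithmetic is an assumption made to avoid overflow on the products.

```c++
#include <algorithm>
#include <vector>

// Counts sextuples (a, b, c, d, e, f) drawn from values, repeats
// allowed, with (a*b + c) / d - e = f, i.e. a*b + c = d*(e + f), d != 0.
long long countSextuples(const std::vector<int>& values) {
    std::vector<long long> rhs;   // every d * (e + f) with d != 0
    for (long long d : values) {
        if (d == 0) continue;
        for (long long e : values)
            for (long long f : values)
                rhs.push_back(d * (e + f));
    }
    std::sort(rhs.begin(), rhs.end());
    long long total = 0;
    for (long long a : values)
        for (long long b : values)
            for (long long c : values) {
                // Number of RHS entries equal to this LHS value.
                auto range = std::equal_range(rhs.begin(), rhs.end(), a * b + c);
                total += range.second - range.first;
            }
    return total;
}
```
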
-- --
Anagrams
--------

Sorting/2.md Normal file

@ -0,0 +1,139 @@
# Sorting 2
-- --
Merge Sort
----------
- Divide and Conquer
- divide into 2
- sort individually
- combine the solution
- Merging two sorted arrays of sizes $n$ and $m$ takes $O(n+m)$ time.
- needs extra space
- code for merging:
```c++
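#include <cstddef>
#include <vector>

// Merges sorted arr1 and arr2 into one sorted output of size n1 + n2.
std::vector<int> merge(const std::vector<int>& arr1, const std::vector<int>& arr2) {
    const std::size_t n1 = arr1.size(), n2 = arr2.size();
    std::vector<int> output(n1 + n2);
    std::size_t i = 0, j = 0, k = 0;
    while (i < n1 && j < n2) {
        if (arr1[i] <= arr2[j])  // if <, then unstable
            output[k++] = arr1[i++];
        else
            output[k++] = arr2[j++];
    }
    // at most one of the two arrays still has elements left
    while (i < n1)
        output[k++] = arr1[i++];
    while (j < n2)
        output[k++] = arr2[j++];
    return output;
}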
```
- stable? Yes
- in-place? No
- Time complexity recurrence: $T(n) = 2T(n/2) + O(n)$, which solves to $O(n \log n)$
- Solve by Master Theorem.
- Solve by algebra
- Solve by Tree height ($\log n$) * level complexity ($O(n)$)
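
For reference, a minimal sketch of the full merge sort whose recurrence is analysed above. The function names and the shared scratch buffer `tmp` are illustrative choices; the two recursive calls are the $2T(n/2)$ term and the merge loop is the $O(n)$ term.

```c++
#include <cstddef>
#include <vector>

// Sorts a[lo, hi) using tmp as scratch space (tmp.size() >= a.size()).
void mergeSortRange(std::vector<int>& a, std::vector<int>& tmp,
                    std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;                 // 0 or 1 element: sorted
    const std::size_t mid = lo + (hi - lo) / 2;
    mergeSortRange(a, tmp, lo, mid);         // T(n/2)
    mergeSortRange(a, tmp, mid, hi);         // T(n/2)
    std::size_t i = lo, j = mid, k = lo;     // O(n) merge
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];   // <= keeps it stable
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi) tmp[k++] = a[j++];
    for (k = lo; k < hi; ++k) a[k] = tmp[k]; // copy merged run back
}

void mergeSort(std::vector<int>& a) {
    std::vector<int> tmp(a.size());
    mergeSortRange(a, tmp, 0, a.size());
}
```
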
-- --
Intersection of sorted arrays
-----------------------------
> 2 sorted arrays
> ```
> 1 2 2 3 4 9
> 2 3 3 9 9
>
> intersection: 2 3 9
> ```
- calculate intersection. Report each element only once
- Naive:
- Binary search each element in the other array. $O(n \log m)$
- Optimized:
- Use merge.
- Ignore unequal: advance the pointer at the smaller element.
- Add equal.
- Move both pointers ahead till the next distinct element
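
The merge-based version can be sketched like this (the function name is illustrative). It runs in $O(n + m)$ and skips duplicates after each match so that every common value is reported once.

```c++
#include <cstddef>
#include <vector>

// Intersection of two sorted arrays, each common value reported once.
std::vector<int> intersectSorted(const std::vector<int>& a,
                                 const std::vector<int>& b) {
    std::vector<int> result;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;                      // unequal: advance the smaller side
        } else if (b[j] < a[i]) {
            ++j;
        } else {
            const int v = a[i];
            result.push_back(v);      // equal: report once
            while (i < a.size() && a[i] == v) ++i;   // skip duplicates
            while (j < b.size() && b[j] == v) ++j;
        }
    }
    return result;
}
```
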
-- --
Merging without extra space
---------------------------
- can use extra time
- if a[i] < b[j], i++
- else: swap, i.e. put b[j] in place of a[i], then sorted-insert the old a[i] into the b array (j stays at the front of b)
- each sorted insert is linear, so $O(n^2)$ time
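
A sketch of this in-place merge, assuming both inputs are already sorted and that the result should be the smallest elements in `a` followed by the rest in `b` (the function name is illustrative). Only swaps are used, so no extra array is needed.

```c++
#include <cstddef>
#include <utility>
#include <vector>

// After the call, a followed by b is sorted. Both inputs must be sorted.
void mergeInPlace(std::vector<int>& a, std::vector<int>& b) {
    if (b.empty()) return;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] <= b[0]) continue;   // a[i] is already in place
        std::swap(a[i], b[0]);        // b's smallest moves into a
        // Sorted insert: slide the displaced element right in b
        // until b is sorted again (linear in |b|).
        for (std::size_t j = 0; j + 1 < b.size() && b[j] > b[j + 1]; ++j)
            std::swap(b[j], b[j + 1]);
    }
}
```
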
-- --
Count inversions
---------------
> inversion:
> i < j, but a[i] > a[j] (strict inequalities)
- naive: $O(n^2)$
- Split array into 2 halves, A and B.
- Number of inversions = inversions within A + inversions within B + cross inversions (one element in A, the other in B)
- count the cross inversions by example
- does the number of cross inversions change when the sub-arrays are permuted internally?
- no
- can we permute so that it becomes easier to count cross inversions?
- sort both subarrays and count inversions in A, B recursively
- then, merge A and B and during the merge count the number of cross inversions
- compare A[i] with B[j]
- if A[i] > B[j], then there are inversions
- A is sorted, so every element from A[i] onwards is also > B[j]: num inversions for B[j] = |A| - i
- intra array inversions? Counted in recursive case.
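
Put together, the counting merge might look like the sketch below. Names and the scratch buffer `tmp` are illustrative; `mid - i` is the `|A| - i` term from the notes, written with absolute indices.

```c++
#include <cstddef>
#include <vector>

// Sorts a[lo, hi) and returns the number of inversions inside it.
long long countInversionsRange(std::vector<int>& a, std::vector<int>& tmp,
                               std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return 0;
    const std::size_t mid = lo + (hi - lo) / 2;
    long long count = countInversionsRange(a, tmp, lo, mid)    // within A
                    + countInversionsRange(a, tmp, mid, hi);   // within B
    std::size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) {
            tmp[k++] = a[i++];
        } else {
            // a[i..mid) are all > a[j]: that many cross inversions.
            count += static_cast<long long>(mid - i);
            tmp[k++] = a[j++];
        }
    }
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi) tmp[k++] = a[j++];
    for (k = lo; k < hi; ++k) a[k] = tmp[k];
    return count;
}

// Takes a copy so the caller's array is left untouched.
long long countInversions(std::vector<int> a) {
    std::vector<int> tmp(a.size());
    return countInversionsRange(a, tmp, 0, a.size());
}
```
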
-- --
Find doubled-inversions
-----------------------
> same as inversion
> just i < j, a[i] > 2 * a[j]
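- same as previous. Split and recursively count
- while merging, for some b[j], I need to find how many elements in A are greater than 2 * b[j]
- linear search for that, but keep the index between successive b[j]
- linear search is better than binary search here: b is sorted, so the index only ever moves forward, giving $O(n)$ total per merge instead of $O(n \log n)$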
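
A sketch of the doubled-inversion count under the same split-and-merge scheme. The counting pass runs before the merge with a pointer `i` that never moves backwards; for brevity the merge itself is delegated to `std::inplace_merge`, which is a swapped-in convenience rather than the hand-written merge from the notes. Function names are illustrative.

```c++
#include <algorithm>
#include <vector>

// Sorts a[lo, hi) and returns the number of pairs i < j in it
// with a[i] > 2 * a[j].
long long countDoubledRange(std::vector<int>& a, int lo, int hi) {
    if (hi - lo < 2) return 0;
    const int mid = lo + (hi - lo) / 2;
    long long count = countDoubledRange(a, lo, mid)
                    + countDoubledRange(a, mid, hi);
    // Both halves are sorted now. As a[j] grows, the first index i
    // with a[i] > 2 * a[j] can only move forward.
    int i = lo;
    for (int j = mid; j < hi; ++j) {
        while (i < mid && a[i] <= 2LL * a[j]) ++i;
        count += mid - i;
    }
    std::inplace_merge(a.begin() + lo, a.begin() + mid, a.begin() + hi);
    return count;
}

// Takes a copy so the caller's array is left untouched.
long long countDoubledInversions(std::vector<int> a) {
    return countDoubledRange(a, 0, static_cast<int>(a.size()));
}
```
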
-- --
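Sort n strings of length n each
-------------------------------
- $T(n) = 2T(n/2) + O(n^2) = O(n^2)$ is wrong, because the string length does not halve along with the number of strings
- $T(n) = 2T(n/2) + O(n) \times O(m) = O(nm\log n)$ is correct: a merge does $O(n)$ comparisons costing $O(m)$ each. Here m = the initial value of n, so the total is $O(n^2 \log n)$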
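
A short worked version of the recurrence, assuming the usual recursion-tree argument: level $i$ has $2^i$ subproblems of $n/2^i$ strings each, so every level costs $O(nm)$ in total and there are $\log_2 n$ levels.

$$
\begin{aligned}
T(n) &= 2\,T(n/2) + O(n \cdot m) \\
     &= \underbrace{O(nm) + O(nm) + \dots + O(nm)}_{\log_2 n \text{ levels}} \\
     &= O(nm \log n) = O(n^2 \log n) \quad \text{for } m = n
\end{aligned}
$$
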
-- --
> .
>
> I G N O R E
>
> .
Bounded Subarray Sum Count
--------------------------
> given A[N]
> can have -ve
> given lower <= upper
> find number of subarrays such that lower <= sum <= upper
- naive: $O(n^2)$ (keep prefix sum to calculate sum in O(1), n^2 loop)
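- if only +ve, $O(n\log n)$ using prefix sum: the prefix sums are increasing, so for each end point binary search the range of valid start points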
- but what if -ve?
-
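
A sketch of the positive-only case mentioned above, assuming all elements are positive (or at least non-negative) so that the prefix sums are sorted; it does not handle the negative-value case the notes leave open. The function name is illustrative. For an end point `r`, a start point `l` is valid exactly when `prefix[r] - upper <= prefix[l] <= prefix[r] - lower`.

```c++
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts subarrays with lower <= sum <= upper.
// Assumes every element of a is >= 0 and lower <= upper.
long long countBoundedSubarraysPositive(const std::vector<int>& a,
                                        long long lower, long long upper) {
    std::vector<long long> prefix(a.size() + 1, 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        prefix[i + 1] = prefix[i] + a[i];
    long long count = 0;
    for (std::size_t r = 1; r < prefix.size(); ++r) {
        auto first = prefix.begin();
        auto last = prefix.begin() + r;   // candidate starts: prefix[0..r)
        auto lo = std::lower_bound(first, last, prefix[r] - upper);
        auto hi = std::upper_bound(first, last, prefix[r] - lower);
        count += hi - lo;                 // starts inside the valid range
    }
    return count;
}
```
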
-- --