From 25f9902805ce0a79059f091868ac0e03a91fa6bc Mon Sep 17 00:00:00 2001
From: Pragy Agarwal
Date: Mon, 14 Oct 2019 15:04:31 +0530
Subject: [PATCH] Add sorting notes

---
 Sorting/1.md | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++
 Sorting/2.md | 139 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 291 insertions(+)
 create mode 100644 Sorting/1.md
 create mode 100644 Sorting/2.md

diff --git a/Sorting/1.md b/Sorting/1.md
new file mode 100644
index 0000000..24b0382
--- /dev/null
+++ b/Sorting/1.md
@@ -0,0 +1,152 @@
+# Sorting
+
+- define sorting: permuting the sequence to enforce order. todo
+- brute force: try all $n!$ permutations, checking each in $O(n)$: $O(n! \times n)$
+
+
+Stability
+---------
+- definition: if two objects have the same value, they must retain their original relative order after the sort
+- importance:
+    - preserving order - values could be orders, and their chronological order may be important
+    - sorting tuples - sort on one column, then stable-sort on another; ties in the second sort keep the order from the first
+
+-- --
+
+Insertion Sort
+--------------
+- explain:
+    - 1st element is trivially sorted
+    - invariant: for i, the array up to index i-1 is sorted
+    - take the element at index i, and insert it at its correct position in the sorted part
+- pseudo code:
+    ```c++
+    void insertionSort(int arr[], int length) {
+        int i, j, key;
+        for (i = 1; i < length; i++) {
+            key = arr[i];
+            j = i-1;
+            while (j >= 0 && arr[j] > key) {
+                arr[j+1] = arr[j];
+                j--;
+            }
+            arr[j + 1] = key;
+        }
+    }
+    ```
+- **Stability:** Stable, because an element is shifted only when it is strictly > key. Had it been >=, it would be unstable
+- **Complexity:** $O(n^2)$
+
+-- --
+
+
+Bubble Sort
+-----------
+- explain:
+    - invariant: after i passes, the last i elements are the largest ones and are in their correct places.
+- why "bubble": the largest unsorted element bubbles up to the end - just like bubbles
+- pseudo code:
+    ```c++
+    void bubbleSort(int arr[], int n) {
+        for (int i = 0; i < n-1; i++)
+            for (int j = 0; j < n-i-1; j++)
+                if (arr[j] > arr[j+1])
+                    std::swap(arr[j], arr[j+1]);
+    }
+    ```
+- **Stability:** Stable
+- **Complexity:** $O(n^2)$
+
+-- --
+
+
+Bubble Sort with window of size 3
+---------------------------------
+- explain bubble sort as window of size 2
+- propose window of size 3
+- does this work?
+- no - elements at even and odd indices are never compared
+
+
+-- --
+
+
+Counting Sort
+-------------
+- explain:
+    - given array, first find min and max in O(n) time
+    - create space of O(max-min)
+    - count the occurrences of each value
+    - take prefix sum
+- constraint: can only be used when the numbers are bounded.
+- pseudo code:
+    ```c++
+    void counting_sort(char arr[], int n) {
+        // find min, max of arr[0..n-1]
+        // create output[n] and count[max-min+1]
+        // count elements: count[arr[i] - min]++
+        // take prefix sum of count
+        // To make it stable we are operating in reverse order.
+        for (int i = n-1; i >= 0; i--) {
+            output[count[arr[i] - min] - 1] = arr[i];
+            -- count[arr[i] - min];
+        }
+    }
+    ```
+- **Stability:** Stable, if implemented correctly
+- **Complexity**: $O(n + \max(a[i]) - \min(a[i]))$
+- why not just write each value out count-many times? For plain numbers, that works. Otherwise, the elements could be objects with keys, which must be moved themselves
+
+-- --
+
+
+Radix Sort
+----------
+- sort elements from least significant to most significant digit
+- explain: basically counting sort on each bit / digit
+- **Stability:** the per-digit sort must be stable - won't work if unstable
+- **Complexity:** $O(n \log\max a[i])$
+
+-- --
+
+
+
+Partition Array
+---------------
+> Array of size $n$
+> Given $k$, $k <= n$
+> Partition array into two parts $A, |A| = k$ and $B, |B| = n-k$ elements, such that $|\sum A - \sum B|$ is maximized
+
+- Sort and choose smallest k?
+- Counterexample
+```
+1 2 3 4 5
+k = 3
+
+bad: {1, 2, 3}, {4, 5}    |6 - 9| = 3
+good: {1, 2}, {3, 4, 5}   |3 - 12| = 9
+```
+- choose based on n/2 (assuming non-negative values): give the smallest min(k, n-k) elements to one part - we want the small sum to be smaller, so it gets fewer elements, and the larger sum to be larger, so it gets more elements
+
+-- --
+
+
+Sex-Tuples
+----------
+> Given A[n], all distinct
+> find the count of sex-tuples such that
+> $$\frac{a b + c}{d} - e = f$$
+> Note: numbers can repeat in the sextuple
+
+- Naive: try all $n^6$ tuples, $O(n^6)$
+- Optimization. Rewrite the equation as $ab + c = d(e + f)$
+    - Now, we only need the $n^3$ triples on each side: $O(n^3)$
+    - Caution: $d \neq 0$, so skip RHS triples with $d = 0$
+    - Once you have the array of all RHS values, sort it in $O(n^3 \log n^3) = O(n^3 \log n)$ time.
+    - Then for each value of LHS, count its occurrences using binary search in the sorted array in $O(\log n)$ time.
+    - Total: $O(n^3 \log n)$
+
+-- --
+
+Anagrams
+--------
diff --git a/Sorting/2.md b/Sorting/2.md
new file mode 100644
index 0000000..3479b5b
--- /dev/null
+++ b/Sorting/2.md
@@ -0,0 +1,139 @@
+
+# Sorting 2
+-- --
+
+Merge Sort
+----------
+- Divide and Conquer
+    - divide into 2
+    - sort individually
+    - combine the solution
+- Merging takes $O(n+m)$ time.
+    - needs extra space
+- code for merging:
+    ```c++
+    // arr1[n1]
+    // arr2[n2]
+    int i = 0, j = 0, k = 0;
+
+    // output[n1+n2]
+
+    while (i < n1 && j < n2) {
+        // <= takes ties from arr1 first, which keeps the merge stable
+        if (arr1[i] <= arr2[j])
+            output[k++] = arr1[i++];
+        else
+            output[k++] = arr2[j++];
+    }
+
+    // copy whatever is left in either array
+    while (i < n1)
+        output[k++] = arr1[i++];
+    while (j < n2)
+        output[k++] = arr2[j++];
+    ```
+
+
+-- --
+
+
+Intersection of sorted arrays
+-----------------------------
+> 2 sorted arrays
+> ```
+> 1 2 2 3 4 9
+> 2 3 3 9 9
+>
+> intersection: 2 3 9
+> ```
+- calculate intersection. Report an element only once
+- Naive:
+    - Binary search each element in the other array. $O(n \log m)$
+- Optimized:
+    - Use the merge walk with two pointers. $O(n + m)$
+    - If the two elements are unequal, ignore the smaller one and advance its pointer.
+    - If they are equal, add the element to the answer.
+    - Then move both pointers ahead till the next distinct element
+
+
+-- --
+
+
+Merging without extra space
+---------------------------
+- can use extra time
+- if a[i] < b[j], i++
+- else: swap, so that b[j] is put in place of a[i]. Then sorted-insert the old a[i] into the b array
+- each sorted insert is linear, so $O(n^2)$ time overall
+
+
+-- --
+
+
+Count inversions
+----------------
+> inversion:
+> i < j, but a[i] > a[j] (strict inequalities)
+- naive: $O(n^2)$
+- Split array into 2 halves, A and B.
+- Number of inversions = inversions within A + inversions within B + cross terms
+- count the cross inversions by example
+- does the number of cross inversions change when the sub-arrays are permuted?
+- no
+- can we permute so that it becomes easier to count cross inversions?
+- sort both subarrays and count inversions in A, B recursively
+- then, merge A and B and during the merge count the number of cross inversions
+- compare A[i] with B[j], as in the usual merge
+- if A[i] > B[j], then A[i..] are all > B[j] (A is sorted), so they all form inversions with B[j]
+- num inversions for A[i], B[j] = |A| - i (0-indexed i)
+- intra array inversions? Counted in recursive case.
+
+
+-- --
+
+
+Find doubled-inversions
+-----------------------
+> same as inversion
+> just i < j, a[i] > 2 * a[j]
+
+- same as previous. Split and recursively count
+- while merging, for some b[j], I need to find how many elements in A are greater than 2 * b[j]
+- linear search for that, but keep the index across successive j - it only moves forward, since B is sorted
+- linear search is better than binary search: $O(n)$ total per merge instead of $O(n \log n)$
+
+
+-- --
+
+Sort n strings of length n each
+ - $T(n) = 2T(n/2) + O(n^2) = O(n^2)$ is wrong
+ - $T(n) = 2T(n/2) + O(n) \times O(m) = O(nm\log n)$ is correct. Here m = the initial value of n, since the string length does not shrink with the recursion
+
+
+-- --
+> .
+>
+> I G N O R E
+>
+> .
+
+
+Bounded Subarray Sum Count
+--------------------------
+> given A[N]
+> can have -ve
+> given lower <= upper
+> find number of subarrays such that lower <= sum <= upper
+
+- naive: $O(n^2)$ (keep prefix sum to calculate sum in O(1), n^2 loop)
+- if only +ve, $O(n\log n)$ using prefix sum
+- but what if -ve?
+-
+
+
+-- --
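
The naive approach from the Bounded Subarray Sum Count notes can be sketched as below. This is only an illustration of the $O(n^2)$ prefix-sum loop, not a faster solution; the function name `countBoundedSubarrays` and the use of `long long` are assumptions made for the example. It works with negative values too, because it checks every subarray.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (name assumed for this sketch): counts the subarrays
// of `a` whose sum lies in [lower, upper], using the naive O(n^2) loop.
long long countBoundedSubarrays(const std::vector<long long>& a,
                                long long lower, long long upper) {
    const std::size_t n = a.size();

    // prefix[i] = a[0] + ... + a[i-1], so sum of a[l..r-1] = prefix[r] - prefix[l]
    std::vector<long long> prefix(n + 1, 0);
    for (std::size_t i = 0; i < n; ++i)
        prefix[i + 1] = prefix[i] + a[i];

    long long count = 0;
    for (std::size_t l = 0; l < n; ++l) {
        for (std::size_t r = l + 1; r <= n; ++r) {
            const long long sum = prefix[r] - prefix[l];  // O(1) per subarray
            if (lower <= sum && sum <= upper)
                ++count;
        }
    }
    return count;
}
```

For example, with `A = {-2, 5, -1}`, `lower = -2` and `upper = 2`, the qualifying subarrays are `{-2}`, `{-1}` and `{-2, 5, -1}`.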