# Amazon and Microsoft Interviews ## Interview Tips 1. Always give Brute Force for every question. - Benefits: - It favours against worse odds. It is better to give a brute force then to be totally blank. - Predicts Optimal Solution 2. Data Structures/ Algorithms Time Complexity| Datastructure and Algorithms :--|:-- $O(n\log{n})$ | Sorting $O(n)$ | Two Pointers/ Smart Sorting/ dfs/ bfs $O(n^2)$ | Dynamic Programming $O(\log{n})$ | Binary Search $O(1)$ | HashMap 3. Writing a working code - Google/Facebook/Uber/Microsoft - They need working code - Take care of Edge cases - Candidates may get rejected for not taking care of Edge cases - Spend $10\%$ of your time on Edge cases - Give proper names to characters - Benefits - Less Confusion, Error free code - Good presentation and easier to explain ## Question 1 Binary Tree Maximum Path Sum Given a non-empty binary tree, find the maximum path sum. For this problem, a path is defined as any sequence of nodes from some starting node to any node in the tree along the parent-child connections. The path must contain at least one node and does not need to go through the root. Example 1: ```C++ Input: [1,2,3] 1 / \ 2 3 Output: 6 ``` Example 2: ```C++ Input: [-10,9,20,null,null,15,7] -10 / \ 9 20 / \ 15 7 Output: 42 ``` ----- __Brute Force:__ - For Every Pair of node find the path sum - Find maximum among all the paths - $O(n^3)$ __Hints:__ - Are all the pairs required? - For a tree, what are the possibilities of maximum path sum 1. Only root 2. Path from left subtree that ends at root 3. Path from right subtree that ends at root 4. Path from left to right passing through the root - Are these enough? 6. Path only in left 7. Path only in right - If we take care of all these paths, we can recursively find the maximum path sum. __Code__: ```C++ /** * Definition for a binary tree node. * struct TreeNode { * int val; * TreeNode *left; * TreeNode *right; * TreeNode(int x) : val(x), left(NULL), right(NULL) {} * }; */ class Solution { public: int global_max; int pathEndingAtRoot(TreeNode* root){ if(root == NULL) return 0; int left = max(pathEndingAtRoot(root->left),0); int right = max(pathEndingAtRoot(root->right),0); int path_with_root = left+right+root->val; global_max = max(path_with_root, global_max); return max(left+root->val , right+root->val); } int maxPathSum(TreeNode* root) { global_max = root->val; pathEndingAtRoot(root); return global_max; } }; ``` __Time Complexity__ - $O(n)$ as all the nodes are visited exactly once. __Dry Run__ ```C++ Input: [-10,9,20,null,null,15,7] -10 / \ 9 20 / \ 15 7 Output: 42 ``` |Root| Left|Right|Path_with_root| global_max| |:--|:--|:--|:--|:--| |-10|9|35|34|-10| |9|0|0|9|9| |20|15|7|42|42| |15|0|0|15|15| |7|0|0|7|15| ## Question 2 Find Median from Data Stream Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value. For example, \[2,3,4\], the median is 3 \[2,3\], the median is (2 + 3) / 2 = 2.5 Design a data structure that supports the following two operations: void addNum(int num) - Add a integer number from the data stream to the data structure. double findMedian() - Return the median of all elements so far. Example: ```C++ addNum(1) addNum(2) findMedian() -> 1.5 addNum(3) findMedian() -> 2 ``` ----- __Brute Force__ - Create a sorted array - Keep on adding nodes at there appropriate places just like insertion sort - addNum: $O(n)$ - findMedian: $O(1)$ __Hints:__ - In the above solution we do not care about the ordering in the lower half and upper half of the array. - All we need is that the maximum element of lower half is smaller than minimum element of the upper half. - Also, the sizes of the two parts are equal. - Can we use two heaps to solve this question? __Code:__ ```C++ class MedianFinder { public: /** initialize your data structure here. */ priority_queue max_heap_lesser_part; priority_queue, greater> min_heap_greater_part; MedianFinder() { } void addNum(int num) { if(max_heap_lesser_part.size() == 0){ max_heap_lesser_part.push(num); return; } if(max_heap_lesser_part.top() > num){ max_heap_lesser_part.push(num); }else{ min_heap_greater_part.push(num); } while(max_heap_lesser_part.size() > min_heap_greater_part.size() + 1){ int temp = max_heap_lesser_part.top(); max_heap_lesser_part.pop(); min_heap_greater_part.push(temp); } while(min_heap_greater_part.size() > max_heap_lesser_part.size()){ int temp = min_heap_greater_part.top(); min_heap_greater_part.pop(); max_heap_lesser_part.push(temp); } } double findMedian() { int n = max_heap_lesser_part.size() + min_heap_greater_part.size(); if(n%2 == 0){ return (max_heap_lesser_part.top() + min_heap_greater_part.top())/2.0; }else{ return max_heap_lesser_part.top()*1.0; } } }; /** * Your MedianFinder object will be instantiated and called as such: * MedianFinder* obj = new MedianFinder(); * obj->addNum(num); * double param_2 = obj->findMedian(); */ ``` __Time Complexity__ - AddNum : $O(1)$ as there can be only constant operations in the heaps. - FindMedian: $O(1)$ trivially. - Space: $O(n)$ we need to store every number. __Dry Run:__ ``` Array : [10,3,5,8,4] ``` |Max Heap(Lower Half)|Min Heap(Upper Half)|Median| |:--|:--|:--| |10| |10| |10,3| | | |3|10|6.5| |3,5|10|5| |3,5|8,10|6.5| |3,4,5|8,10|5| ## Question 3 Count Complete Tree Nodes Given a complete binary tree, count the number of nodes. __Note:__ Definition of a complete binary tree from __Wikipedia__: In a complete binary tree every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible. It can have between 1 and 2h nodes inclusive at the last level h. __Example:__ ```C++ Input: 1 / \ 2 3 / \ / 4 5 6 Output: 6 ``` ----- __Brute Force__ - By doing any tree traversal we can find the number of nodes in the tree. __Hints:__ - This is a complete tree. Can we use this property? - Can we find the height of a complete binary tree in $O(\log{n})$ time? - All the level except the last level are full in a Complete Binary Tree. - What will be the total number of nodes in all the levels except the last level? - If the height is h, it will be $2^{h-1} - 1$ - How many nodes will be there in the last level? - The nodes will belong to $[1,2^{h-1}]$. - All the nodes in the last level are towards left. - Let's number the nodes from $0$ to $2^{h-1}-1$. And look at their binary representation. ```C++ X / \ X X / \ / \ X X X X / \ / \ / \ / \ 0 1 2 3 4 5 6 7 ---------------- 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 ``` - In this binary representation we can see if $0$ means left and $1$ means right, we know the path to the node. - Now, we know how to check if a given node is present or not. We also know that all the nodes are towards the left. - Can we do a binary search to find the last node that is persent? __Code__: ```C++ /** * Definition for a binary tree node. * struct TreeNode { * int val; * TreeNode *left; * TreeNode *right; * TreeNode(int x) : val(x), left(NULL), right(NULL) {} * }; */ class Solution { public: int getHeight(TreeNode * root){ if(root == NULL) return 0; return getHeight(root->left) + 1; } bool present(TreeNode * root, int node, int h){ if(root == NULL) return false; if(h == 1){ return node == 0; } if((node>>(h-2))&1 == 1){ int new_node = node%((int)pow(2,h-2)); return present(root->right, new_node, h-1); }else{ int new_node = node%((int)pow(2,h-2)); return present(root->left, new_node, h-1); } } int countNodes(TreeNode* root) { if(root == NULL) return 0; int h = getHeight(root); int lo = 0, hi = pow(2,h-1) - 1; int ans = -1; while(lo<=hi){ int mid = lo+(hi-lo)/2; if(present(root,mid,h)){ ans = mid + 1; lo = mid+1; }else{ hi = mid-1; } } return ans + pow(2,h-1) - 1; } }; ``` __Dry Run__ ```C++ Input: 1 / \ 2 3 / \ / 4 5 6 -------- 0 1 2 3 ``` |Lo|Hi|Mid|Present|Answer| |:--|:--|:--|:--|:--| |0|3|1|Yes|1| |2|3|2|Yes|2| |3|3|3|No|2| |3|2|Break| | | __Time Complexity__ - For every check $O(\log{n})$ - Total number of checks $O(\log{n})$ - Time Complexity : $O(\log^2{n})$ ## Question 4 In a given integer array A, we must move every element of A to either list B or list C. (B and C initially start empty.) Return true if and only if after such a move, it is possible that the average value of B is equal to the average value of C, and B and C are both non-empty. ```C++ Example : Input: [1,2,3,4,5,6,7,8] Output: true Explanation: We can split the array into [1,4,5,8] and [2,3,6,7], and both of them have the average of 4.5. ``` Note: - The length of A will be in the range \[1, 30\]. - A\[i\] will be in the range of \[0, 10000\]. ----- __Another version__ - Find two subarrays such that their average is same? - Average of the whole array will be equal to the average of the two sub-arrays. - Run a pointer from 0 to n-1. Check if the average of nodes from zero to the pointer is equal to the average of the array. __Brute Force__ - Create all the possible subsets. Check their averages and report if found. - $O(2^n*n)$ __Hints__ - We will be doing something similar here too. We will only be optimising a little bit. - Average of the whole array will be equal to the average of the two sub-arrays. $$\frac{S}{n} = \text{Average of the whole array}$$ $$\frac{S_k}{k} = \text{Average of the first Sub Array}$$ $$\frac{S-S_k}{n-k} = \text{Average of the second Sub Array}$$ $$\text{Both the averagers are equal}$$ $$\frac{S_k}{k} = \frac{S-S_k}{n-k}$$ $$S_k*(n-k) = (S-S_k) * k$$ $$S_k* n - S_k * k = S * k - S_k * k$$ $$\frac{S_k}{k}= \frac{S}{n}$$ __Solution__ First, this problem is NP, and the worst case runtime is exponential. But the expected runtime for random input could be very fast. If the array of size n can be splitted into group A and B with same mean, assuming A is the smaller group, then ``` totalSum/n = Asum/k = Bsum/(n-k), where k = A.size() and 1 <= k <= n/2; Asum = totalSum*k/n, which is an integer. So we have totalSum*k%n == 0; In general, not many k are valid. ``` __Solution 2__: early pruning + knapsack DP, O(n^3 * M) 33 ms If there are still some k valid after early pruning by checking totalSum*k%n == 0, we can generate all possible combination sum of k numbers from the array using DP, like knapsack problem. (Note: 1 <= k <= n/2) Next, for each valid k, simply check whether the group sum, i.e. totalSum * k / n, exists in the kth combination sum hashset. ``` vector>> sums(n, vector>(n/2+1)); sums[i][j] is all possible combination sum of j numbers from the subarray A[0, i]; ``` __Goal__: sums$[n-1][k]$, for all k in range $[1, n/2]$ __Initial condition__: $sums[i][0] = {0}, 0 <= i <= n-1; sums[0][1] =${all numbers in the array}; __Deduction__: $sums[i+1][j] = sums[i][j]$ "join" $(sums[i][j-1] + A[i+1])$ The following code uses less space but the same DP formula. __Runtime analysis__: All numbers in the array are in range [0, 10000]. Let M = 10000. So the size of kth combination sum hashset, i.e. sums[...][k], is <= k * M; For each number in the array, the code need loop through all combination sum hashsets, so the total runtime is n * (1 * M + 2 * M + ... + (n/2) * M) = O(n^3 * M) __Code__ ```C++ class Solution { public: bool splitArraySameAverage(vector& A) { int n = A.size(), m = n/2, totalSum = accumulate(A.begin(), A.end(), 0); // early pruning bool isPossible = false; for (int i = 1; i <= m && !isPossible; ++i) if (totalSum*i%n == 0) isPossible = true; if (!isPossible) return false; // DP like knapsack vector> sums(m+1); sums[0].insert(0); for (int num: A) { for (int i = m; i >= 1; --i) for (const int t: sums[i-1]) sums[i].insert(t + num); } for (int i = 1; i <= m; ++i) if (totalSum*i%n == 0 && sums[i].find(totalSum*i/n) != sums[i].end()) return true; return false; } }; ```