Question
You have k
lists of sorted integers in non-decreasing order. Find the smallest range that includes at least one number from each of the k
lists.
We define the range [a, b]
is smaller than range [c, d]
if b - a < d - c
or a < c
if b - a == d - c
.
Example 1:
Input: nums = [[4,10,15,24,26],[0,9,12,20],[5,18,22,30]]
Output: [20,24]
Explanation:
List 1: [4, 10, 15, 24,26], 24 is in range [20,24].
List 2: [0, 9, 12, 20], 20 is in range [20,24].
List 3: [5, 18, 22, 30], 22 is in range [20,24].
Example 2:
Input: nums = [[1,2,3],[1,2,3],[1,2,3]]
Output: [1,1]
Brute-force
- The brute force method would be to generate all possible subarrays that include at least one element from each of the k arrays, and then find the minimum range among them. This would take O(n^k) time, where n is the maximum length of any array.
- This seems complicated as the arrays we will be given can vary so it can be hard to generate all the ranges. Since all the arrays are sorted maybe we can use that to our advantage.
Let’s Reframe
- To build understanding, we will start by only attempting to reduce the range between two arrays.
- Looking at the examples below, we have two elements in the range; 0 and 1. These are the starting elements in the sorted arrays. We use pointers to represent the range.
- The lower bound of the range is 0, the min between the two
- the upper bound is 1, the max between the two.
- The only correct way to reduce the difference between the upper bound and lower bound or the range is to increase the lower bound. Why?
- Since all arrays are sorted, the next number after the lower bound pointer will be larger.
- Increasing the lower bound reduces the difference between it and the upper bound, hence reducing the range.
- Lower Bound: from 0 to 1. The range across all the arrays in this case two becomes 0.
- What about decreasing the upper bound?
- We are starting from the left, so we can only move to the right. It will not be possible to decrease the upper bound while moving to the right. Since all the values in the arrays are increasing in ascending order.
- Why are we starting from the left?
- Since we are trying to reduce the lower bound. If we don’t start from the far left. We might miss an initial lower bound. In the case of 0. Starting from a higher lower bound than which we have available. 4 instead of zero will make us miss some possible smallest ranges.
- Can we start from the far right and decrease the upper bound. Seems like a possibility. But will make it more complicated than it is.
- As the value of k increases we are going to have many values in range. K to be exact. Is it possible to reduce the range of numbers between an upper and lower bound? Here is a proof by contradiction.
- If we try to reduce the range using any number other than the lower bound, the range will not be reduced. This will keep the range the same or even make it bigger. Even the arrays are sorted in ascending order. Choosing a number between the lower and upper bound will not move the lower bound as the difference between the upper bound and lower bound—range. Will never decrease.
- Once we start from the far left we can assume that we have the smallest range amongst all the first items( in all the arrays possible). Once we are sure we are not missing any possible ranges. We can go about decreasing the range by only selectively reducing the lower bound. This way we can be sure we only reduce the range and not miss any possible ranges that might not be smallest(across lower and upper bound). We keep considering more and more elements in the array. And with that only select possible ranges instead of all the possible ranges as we did in the brute force method. This is a greedy solution.
- Starting
- Elements considered vs current range
- First move
- Right before the last move, we will remove 20 and the range will not encounter all the arrays. We have found our smallest range some where in the middle.
- As the number of arrays increases there is going to be a need for more than two pointers. Since we need to keep track of only the minimum—lower bound from all the elements (the current range) in different arrays we can use a heap as an abstract pointer.
- The question is similar in ways and different in ways than Merge K sorted Arrays but having an understanding of Merge K sorted Arrays is fundamental.
- To start, we build a heap by adding the first element of every array to it. This heap represents all the minimum values from the arrays and will result in the shortest range across the arrays when considering a range among all the first elements in all the arrays. The minimum value amongst all the values will be the lower bound of the range.
- How can we obtain the upper bound when the heap only provides us with the lower bound, assuming we are using a min heap?
- We store the maximum value as a variable and initially set it to the first element from each array. While merging, we update this variable whenever we encounter a value greater than the previous maximum value.
- Keeping the invariance (The property of remaining unchanged regardless of changes)
- We are assuming an invariance:
- The difference between the upper bound and the lower bound will always have the shortest range for all the elements that that we have considered.
- Once we are able to prove that we can be sure about the correctness of our algorithm.
- We are starting from the far left. When we have considered only the first elements from all the arrays, we have the shortest range for all the elements we have considered.
- Since we are staring from the far left, we are not missing any possible ranges.
- We only reduce the lower bound. By adding a new elements to the elements we are sure to reduce reduce the range for the elements we have considered.
- The invariance holds!
- Side note: Greedy proofs are really challenging.I tried my best. If you have a better way of explaining it let me know I will add it to the document. LINK
- At the start, we have the shortest range among the first elements of all the arrays.
- But wait, does adding the next element increase the range? Is the variance still intact?
- The range didn't decrease but actually increased from 5 to 9. As the smallest minimum number in the range decreases, should the difference between the lower and upper bounds also decrease? Since the numbers are in increasing order, if we remove the smallest minimum to hopefully find a better lower bound, there might be a case where the next number is even larger than the current upper bound. When we removed 5 and added 18, the range increased. But is the invariance retained?
- What was the invariance: The difference between the upper bound and the lower bound will always have the shortest range for all the elements that we have considered.
- As we consider more elements and the arrays are in ascending order, we may encounter a larger range.
- Since we only increase the range when adding a number that exceeds the lower bound (18 in this case), we can be sure that we were decreasing the range for all elements before the range was increased. After that we start a new?
- The process involves, reducing the range until possible, and starting only when we can’t reduce it further. This way we can be sure we reduced it greedily untill a point after which it was not possible.
- After the range across all the elements increases we try to reduce it again.
- We can be certain that the range was always reduced until we started a new (given the proof of contradiction at the start). Once we establish new lower and upper bounds, we attempt to reduce them again. After processing all the ranges, we can be sure that the invariance is maintained across all combinations of possible ranges amongst all arrays. We then return the range when it was the shortest across all combinations.
List 1: [1,2]
^
List 2: [0,1,2]
^
Lets say we have these two arrays, We get the min value from all the pointers
min(1,0) = lower bound
List 1: [1,2]
^
List 2: [0,1,2]
^
Increasing the lower Bound
min(1,1) = lower bound
Trying to reduce range using 4 instead of 0(lower bound)
List 1: [4, 10, 15, 24,26]
List 2: [0, 9, 12, 20]
List 3: [5, 18, 22, 30]
min(0, 4, 5)
LB = 0
UB = 5
R = 5
#trying to reduce other than the lower bound
List 1: [4, 10, 15, 24,26]
List 2: [0, 9, 12, 20]
List 3: [5, 18, 22, 30]
min(0, 10, 5)
LB = 0
UB = 10
R = 10
List 1: [4, 10, 15, 24,26]
List 2: [0, 9, 12, 20]
List 3: [5, 18, 22, 30]
min = min(4,0,5)
maximum = max(array[0] for array in nums)
# bunch of other code
if newElementToAdd > maximum:
maximum = newElementToAdd
List 1: [4, 10, 15, 24,26]
List 2: [0, 9, 12, 20]
List 3: [5, 18, 22, 30]
lowerBound = min(4,0,5)
upperBound = max (4,0,5)
List 1: [4, 10, 15, 24,26]
List 2: [0, 9, 12, 20]
List 3: [5, 18, 22, 30]
lowerBound = min(4,0,5)
upperBound = max (4,0,5)
Let study the 4 elements merged after the first elements.
min(0, 4, 5)
LB = 0
UB = 5
R = 5
min(4, 5, 9)
LB = 4
UB = 9
R = 5
min(5, 9, 10)
LB = 5
UB = 10
R = 5
# Now interesting things happen
min(9, 10, 18)
LB = 9
UB = 18
R = 9
min(9, 10, 18)
LB = 9
UB = 18
R = 9
min(10, 18, 12)
LB = 10
UB = 18
R = 8
min(12, 18, 15)
LB = 12
UB = 18
R = 5
min(15, 18, 20)
LB = 15
UB = 20
R = 5
min(18, 20, 24)
LB = 18
UB = 24
R = 6
min(20, 24, 22)
LB = 20
UB = 24
R = 4
Algorithm
- Initializing the variables and heap as a collection of pointers.
- Pop the minimum, calculate the range. Reduce the minimum value to shorten the range, if possible. If a bigger number comes increasing the lower bound, update the maximum. If any of the arrays are exhausted, return the range.
lowerBound = float("inf")
upperBound = float("-inf")
heap = []
minRange = float("inf")
result = [lowerBound, upperBound]
#initilizing the heap
for array in nums:
firstIndex = 0
firstElement = array[0]
payload =(firstElement,array,firstIndex)
heappush(heap,payload)
while heap:
element,array,index = heappop(heap)
lowerBound = element
currentRange = upperBound - lowerBound
# update the range if needed
if minRange > currentRange:
minRange = currentRange
result = [lowerBound,upperBound]
nextIndex = index+1
# array exhausted?
if nextIndex == len(array):
return result
# adding to heap
newElementToAdd = array[nextIndex]
payload = (newElementToAdd,array,nextIndex)
heappush(heap,payload)
if newElementToAdd > upperBound:
upperBound = newElementToAdd
Implementation
from heapq import heappush,heappop
class Solution:
def smallestRange(self, nums: List[List[int]]) -> List[int]:
lowerBound = float("inf")
upperBound = float("-inf")
heap = []
minRange = float("inf")
result = [lowerBound, upperBound]
for array in nums:
firstIndex = 0
firstElement = array[0]
payload =(firstElement,array,firstIndex)
heappush(heap,payload)
upperBound = max(array[0] for array in nums)
while heap:
element,array,index = heappop(heap)
lowerBound = element
currentRange = upperBound - lowerBound
# update the range if needed
if minRange > currentRange:
minRange = currentRange
result = [lowerBound,upperBound]
nextIndex = index+1
# array exhausted?
if nextIndex == len(array):
return result
# adding to heap
newElementToAdd = array[nextIndex]
payload = (newElementToAdd,array,nextIndex)
heappush(heap,payload)
if newElementToAdd > upperBound:
upperBound = newElementToAdd
Complexities
Time complexity: O(n*log(k))
Where n
is the number of elements in the smallest arrays and k
is the number of arrays, insertion and popping n
elements from the min heap takes O(log(k))
time. This is because the maximum number of elements that can be in a heap is k
.
Space Complexity: O(k)