Prerequsites : kth smallest element in an array
Question
Given a list of points on a 2D plane. Find k closest points to origin (0, 0)
.
Input: points = [[1,3],[-2,2],[5,5]], k = 2
Output: [[-2,2],[1,3]]
Explanation:
The distance between (1, 3) and the origin is sqrt(10). The distance between (-2, 2) and the origin is sqrt(8). Since sqrt(8) is less than sqrt(10), (-2, 2) is closer to the origin. We only want the closest k = 2 points from the origin, so the answer is just [[-2,2],[1,3]].
Leetcode
Reframing the Question
- Before we dive into the solution, the .Distance between two points
(x1, y1)
and(x2, y2)
is. - For the point
(-2,-2)
. The coordinates of the first point,(x1, y1)
, are(0, 0)
and the coordinates of the second point,(x2, y2)
, are(-2, -2)
, so the distance from the origin to this point can be calculated as follows
- For the points (-2,-2), (5,5), and (1,3), the distances will be, respectively.
Intuition 1.0
- As we discussed in Similar Questions: Kth Element, when we encounter a problem involving "closest k", our first reaction should be to use heaps.
- The first part element in the payload to the heap is used to sort the data inside the heap. We can add different types of data in the second or even third element and they will not play any part in the sorting.
- Elements
- First element (sorting): Distance
- Second Element (information) : coordinates.
- Something like in the image below
- We can start with a heap of size k. These will be assumed to be the closest k points to the origin among the first k values from array points. (Don't forget to multiply -1 while adding to the heap in Python to simulate a max heap.)
- function for calculating distance
- To find the closest points from all points, we will iterate over the remaining points. As we iterate across the list of points, since we need to find the closest points we will use a max heap to discard larger values we already have in the heap in favor of smaller values coming in.
- The heap now has the closest distances and their corrdinates, the data looks like this. We need only the corrdinates.
- Iterate over the heap and get the coordinates and simply return the result.
for i in range(k):
x2 = points[i][0]
y2 = points[i][1]
distance = self.calculateDistance(x2,y2)
coordinates = (x2,y2)
heappush(heap,(-1 *distance,coordinates))
def calculateDistance(self,x2,y2):
#origin(coordinates)
x1 = 0
y1 = 0
#calculate
radicand = (x1-x2)**2 + (y1-y2)**2
distance = sqrt(radicand)
return distance
for i in range(k,len(points)):
x2 = points[i][0]
y2 = points[i][1]
distance = self.calculateDistance(x2,y2)
coordinates = (x2,y2)
kickOutALargerValue = distance < -1 * heap[0][0]
if kickOutALargerValue:
heappop(heap)
heappush(heap,(-1 * distance,coordinates))
heap = [(-4.47213595499958, (-2, 4)), (-4.242640687119285, (3, 3))]
We could have also simply added the radicand (the value underneath the square root), as taking the square root does not change the order of values.
for i in range(len(heap)):
result.append(heap[i][1])
return result
Implementation
from heapq import heappush, heappop
from math import sqrt
class Solution:
def kClosest(self, points: List[List[int]], k: int) -> List[List[int]]:
heap = [] # max heap
result = []
for i in range(k):
x2 = points[i][0]
y2 = points[i][1]
distance = self.calculateDistance(x2,y2)
coordinates = (x2,y2)
heappush(heap,(-1 *distance,coordinates))
for i in range(k,len(points)):
x2 = points[i][0]
y2 = points[i][1]
distance = self.calculateDistance(x2,y2)
coordinates = (x2,y2)
kickOutALargerValue = distance < -1 * heap[0][0]
if kickOutALargerValue:
heappop(heap)
heappush(heap,(-1 * distance,coordinates))
for i in range(len(heap)):
result.append(heap[i][1])
return result
def calculateDistance(self,x2,y2):
#origin(coordinates)
x1 = 0
y1 = 0
#calculate
radicand = (x1-x2)**2 + (y1-y2)**2
distance = sqrt(radicand)
return distance
Complexities
The time complexity is O(n*log(k))
Space Complexity: O(k)
We will pass at most n items (points) through the heap. Deletion is O(1) and insertion has a cost of at most log(k), as the size of the heap will never exceed that.
🥡 Takeaways
The first item in the heap is used to determine sortedness of the heap. We will use this property of heap to solve further questions. In Similar Questions: Heaps with a payload