97. K Closest Points to Origin
Problem Statement
Given an array of points where points[i] = [xi, yi] represents a point on the X-Y plane and an integer k, return the k closest points to the origin (0, 0).
The distance between two points on the X-Y plane is the Euclidean distance (i.e., sqrt(x^2 + y^2)).
You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).
Additional information
1 <= k <= points.length <= 10^4
-10^4 <= xi, yi <= 10^4
Example 1:
Input: points = [[1, 3], [-2, 2]], k = 1
Output: [[-2, 2]]
Explanation: * The distance between (1, 3) and the origin is sqrt(1^2 + 3^2) = sqrt(10).
- The distance between (-2, 2) and the origin is
sqrt((-2)^2 + 2^2) = sqrt(8).
- Since
sqrt(8) < sqrt(10), (-2, 2) is closer to the origin.
- We only want the closest
k = 1 points, so the answer is [[-2, 2]].
Example 2:
Input: points = [[3, 3], [5, -1], [-2, 4]], k = 2
Output: [[3, 3], [-2, 4]]
Explanation: The answer [[-2, 4], [3, 3]] would also be accepted.
Brute Force (Sorting)
Intuition
The simplest way to find the k closest points is to calculate the distance from the origin for every single point in the array. Once we have all the distances, we can just sort the array of points based on those distances in ascending order. The first k points in the sorted array will be our answer.
Since we only need to compare distances, we don't actually need to compute the square root (which is an expensive operation). We can just compare the squared distances (x^2 + y^2).
Algorithm
- Define a custom sorting key or lambda function that calculates
x^2 + y^2 for any given point [x, y].
- Sort the
points array using this custom key.
- Slice the first
k elements from the sorted array and return them.
def k_closest(points: list[list[int]], k: int) -> list[list[int]]:
# Sort points based on their squared distance from the origin: x^2 + y^2
points.sort(key=lambda P: P[0]**2 + P[1]**2)
# Return the first k points
return points[:k]
Time & Space Complexity
Time complexity: O(N log N)
Reason: Sorting an array of length N takes O(N log N) time. Calculating the distance takes O(1) for each point.
Space complexity: O(N)
Reason: Timsort (Python's built-in sorting algorithm) requires up to O(N) auxiliary space.
Max-Heap (Priority Queue) - Optimal
Intuition
Sorting the entire array does more work than necessary, especially if N is huge (e.g., 10,000) and k is very small (e.g., 5). We don't care about the relative order of the points that are far away.
Instead, we can maintain a Max-Heap of strictly size k. As we iterate through the points, we add them to the heap. If the heap size exceeds k, we pop the maximum element (the furthest point currently in our heap). By the end of the iteration, the heap will only contain the k smallest distances.
Since Python's heapq is naturally a Min-Heap, we simulate a Max-Heap by pushing negative distances. For example, a distance of 10 becomes -10. Since -10 is smaller than -8, it will sit at the top of the Min-Heap and be popped off first!
Algorithm
- Initialize an empty list
max_heap.
- Iterate through each point
[x, y] in points.
- Calculate the squared distance:
dist = x**2 + y**2.
- Push a tuple containing the negative distance and the point
(-dist, [x, y]) onto the max_heap.
- If the length of
max_heap becomes strictly greater than k, pop from the heap (this removes the point with the largest distance).
- Once the loop finishes, extract just the points from the remaining tuples in the heap and return them.
def k_closest(points: list[list[int]], k: int) -> list[list[int]]:
max_heap = []
for x, y in points:
dist = x**2 + y**2
# Push negative distance to simulate a Max-Heap
heapq.heappush(max_heap, (-dist, [x, y]))
# If heap exceeds size k, evict the furthest point
if len(max_heap) > k:
heapq.heappop(max_heap)
# Extract the points from the heap
res = []
for dist, point in max_heap:
res.append(point)
return res
Time & Space Complexity
Time complexity: O(N log k)
Reason: We iterate through all N points. Pushing and popping from a heap of size k takes O(log k) time. Thus, the total time is bounded by O(N log k). This is vastly superior to O(N log N) when k is much smaller than N.
Space complexity: O(k)
Reason: The heap only ever stores exactly k elements at any given time.