Problem Statement
Given an integer array nums and an integer k, return the k most frequent elements in the array.
The answer is guaranteed to be unique, and you may return the output in any order.
Additional information
1 <= nums.length <= 10^4
-1000 <= nums[i] <= 1000
1 <= k <= number of distinct elements in nums
Example 1:
Input: nums = [1, 1, 1, 2, 2, 3], k = 2
Output: [1, 2]
Explanation: The number 1 appears 3 times and 2 appears 2 times. These are the two most frequent elements.
Example 2:
Input: nums = [1], k = 1
Output: [1]
Example 3:
Input: nums = [1, 2, 3], k = 3
Output: [1, 2, 3]
Explanation: Every element appears exactly once, so all three are returned.
Sorting
Intuition
The most natural way to find the "top" items of any category is to count them and then rank them. We can count how many times each number appears using a dictionary. Once we have these counts, we can sort the distinct numbers based on their frequency count in descending order. The first k elements of this sorted list will be our answer.
Algorithm
- Traverse the input array
nums and count the frequency of every number using a Hash Map.
- Convert the Hash Map items into a list of pairs:
(count, number).
- Sort this list in descending order based on the
count.
- Extract the
number from the first k pairs in the sorted list.
- Return these
k numbers.
def top_k_frequent(nums: list[int], k: int) -> list[int]:
count = {}
for n in nums:
count[n] = 1 + count.get(n, 0)
# Sort by frequency (cnt) in descending order
arr = []
for num, cnt in count.items():
arr.append([cnt, num])
arr.sort(reverse=True)
res = []
for i in range(k):
res.append(arr[i][1])
return res
Time & Space Complexity
- Time complexity:
O(n log n)
- Reason: The sorting step dominates the runtime. Even though we only sort unique elements, in the worst case (all elements unique), this is
O(n log n).
- Space complexity:
O(n)
- Reason: We need to store the frequency map and the list of pairs.
Min-Heap
Intuition
Instead of sorting the entire list of frequencies, we can maintain a "Hall of Fame" list that only holds the top k elements seen so far. A Min-Heap is perfect for this. We add elements to the heap, and if the heap size exceeds k, we remove the element with the lowest frequency (the root of the Min-Heap). This ensures that after processing all numbers, only the k most frequent ones remain.
Algorithm
- Count the frequency of each number using a Hash Map.
- Initialize a Min-Heap.
- Iterate through the frequency map:
- Push the pair
(count, number) onto the heap.
- If the size of the heap is greater than
k, pop the smallest element (the one with the lowest count).
- After the loop, the heap will contain exactly
k elements. Return the numbers from these pairs.
def top_k_frequent(nums: list[int], k: int) -> list[int]:
count = {}
for n in nums:
count[n] = 1 + count.get(n, 0)
heap = []
for num in count.keys():
heapq.heappush(heap, (count[num], num))
if len(heap) > k:
heapq.heappop(heap)
res = []
for i in range(k):
res.append(heapq.heappop(heap)[1])
return res
Time & Space Complexity
- Time complexity:
O(n log k)
- Reason: We perform
n insertions into the heap. Since the heap size is capped at k, each insertion takes O(log k).
- Space complexity:
O(n + k)
- Reason:
O(n) to store the counts and O(k) for the heap.
Bucket Sort (Optimal)
Intuition
Since the frequency of any number cannot exceed the total size of the array n, we can use this constraint to our advantage. Instead of sorting, we can create "buckets" where the index represents the frequency. Bucket[5] will contain all numbers that appeared exactly 5 times. We then walk backwards from the largest bucket to the smallest, collecting numbers until we have found k of them.
Algorithm
- Count the frequency of each number using a Hash Map.
- Create a list of empty lists (buckets) called
freq, where the index i will store numbers that appear i times.
- Populate the buckets: for each number, calculate its count and add the number to the bucket at that index.
- Iterate through the buckets in reverse order (from highest frequency to lowest).
- Add numbers from the current bucket to the result list.
- Stop and return the result once the list contains
k elements.
def top_k_frequent(nums: list[int], k: int) -> list[int]:
count = {}
freq = [[] for i in range(len(nums) + 1)]
for n in nums:
count[n] = 1 + count.get(n, 0)
for n, c in count.items():
freq[c].append(n)
res = []
for i in range(len(freq) - 1, 0, -1):
for n in freq[i]:
res.append(n)
if len(res) == k:
return res
return res
Time & Space Complexity
- Time complexity:
O(n)
- Reason: We iterate through the array to count frequencies
O(n), then iterate through the buckets O(n). There is no sorting involved.
- Space complexity:
O(n)
- Reason: The
freq array creates a bucket for every possible frequency up to n.