156. Sliding Window Median
Problem Statement
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle values.
- For arrays of odd length (e.g.,
[2,3,4]), the median is 3.
- For arrays of even length (e.g.,
[2,3]), the median is (2 + 3) / 2 = 2.5.
You are given an integer array nums and an integer k. There is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position, return the median array for each window in the original array. Answers within 10^-5 of the actual value will be accepted.
Additional information
1 <= k <= nums.length <= 10^5
-2^31 <= nums[i] <= 2^31 - 1
Example 1:
Input: nums = [1,3,-1,-3,5,3,6,7], k = 3
Output: [1.00000,-1.00000,-1.00000,3.00000,5.00000,6.00000]
Explanation:
| Step |
Isolated Window |
Array Context |
Median |
| 1 |
[1, 3, -1] |
[1, 3, -1], -3, 5, 3, 6, 7 |
1 |
| 2 |
[3, -1, -3] |
1, [3, -1, -3], 5, 3, 6, 7 |
-1 |
| 3 |
[-1, -3, 5] |
1, 3, [-1, -3, 5], 3, 6, 7 |
-1 |
| 4 |
[-3, 5, 3] |
1, 3, -1, [-3, 5, 3], 6, 7 |
3 |
| 5 |
[5, 3, 6] |
1, 3, -1, -3, [5, 3, 6], 7 |
5 |
| 6 |
[3, 6, 7] |
1, 3, -1, -3, 5, [3, 6, 7] |
6 |
Example 2:
Input: nums = [1,2,3,4,2,3,1,4,2], k = 4
Output: [2.50000,2.50000,3.00000,2.50000,2.50000,2.50000]
Brute Force (Sort Every Window)
Intuition
The most literal interpretation of the problem is to simply simulate the window moving across the array.
For every position of the window, we can extract the k elements into a new sub-array. To find the median, the elements must be ordered, so we sort this sub-array. Once sorted, finding the median is trivial: if k is odd, we take the exact middle element. If k is even, we take the average of the two middle elements. We append this result to our answer array and slide the window forward by one.
Algorithm
- Initialize an empty
res array to store the medians.
- Loop
i from 0 up to len(nums) - k + 1. This i represents the starting index of our sliding window.
- Slice the array to get the current window:
window = nums[i : i + k].
- Sort the
window in ascending order.
- Find the median:
- Calculate the middle index:
mid = k // 2.
- If
k % 2 != 0 (odd length), the median is window[mid].
- If
k % 2 == 0 (even length), the median is (window[mid - 1] + window[mid]) / 2.0.
- Append the calculated median to
res.
- Return
res.
def median_sliding_window(nums: list[int], k: int) -> list[float]:
res = []
for i in range(len(nums) - k + 1):
# Extract and sort the current window
window = sorted(nums[i : i + k])
mid = k // 2
# Calculate median based on even or odd window size
if k % 2 == 0:
median = (window[mid - 1] + window[mid]) / 2.0
else:
median = float(window[mid])
res.append(median)
return res
Time & Space Complexity
- Time complexity: O(N * K log K)
- Reason: We iterate through the array roughly N times. Inside the loop, we slice an array of size K (which takes O(K) time) and then sort it (which takes O(K log K) time). For large values of N and K, this will trigger a Time Limit Exceeded (TLE) error.
- Space complexity: O(K)
- Reason: We create a new
window array of size K during every iteration.
Two Heaps with Lazy Deletion - Optimal
Intuition
Sorting the entire window from scratch every single time is massively inefficient because the window only changes by exactly two elements (one leaves, one enters). We can optimize this by maintaining two Heaps:
- A Max-Heap to store the smaller half of the numbers.
- A Min-Heap to store the larger half of the numbers.
If we keep these heaps perfectly balanced, the median is always instantly accessible at the top of the heaps!
The challenge is removing the "outgoing" element as the window slides. Python's heapq does not support O(1) removals of arbitrary elements. Instead, we use Lazy Deletion. We maintain a Hash Map to count the elements that have fallen out of the window but are still physically sitting inside our heaps. We only pop them off when they bubble up to the very top of the heaps, right when they would otherwise interfere with our median calculation.
Algorithm
- Initialize
min_heap (larger half) and max_heap (smaller half, store values as negative).
- Initialize a dictionary
invalidated to track elements that have exited the window.
- First, populate the heaps with the first
k elements and balance them.
- Define a helper function
get_median() that looks at the tops of the valid heaps.
- Iterate
i from k to the end of nums:
- Calculate and store the median of the current window.
- Identify the
outgoing_num (the one leaving the window) and the incoming_num.
- Record
outgoing_num in the invalidated hash map.
- Keep a
balance integer. If outgoing_num was in the max_heap (<= its top), decrement balance. Otherwise, increment balance.
- Push the
incoming_num into the appropriate heap and update the balance.
- If
balance < 0, move an element from min_heap to max_heap. If balance > 0, move an element from max_heap to min_heap.
- The Lazy Deletion Step: While the top of
max_heap is in invalidated (and count > 0), heappop it and decrement its count. Do the exact same for min_heap.
- Append the very last median and return the result.
def median_sliding_window(nums: list[int], k: int) -> list[float]:
small = []
large = []
lazy = {}
for i in range(k):
heapq.heappush(small, -nums[i])
for i in range(k // 2):
heapq.heappush(large, -heapq.heappop(small))
def get_median():
if k % 2 == 1:
return float(-small[0])
return (-small[0] + large[0]) / 2.0
res = [get_median()]
for i in range(k, len(nums)):
out_num = nums[i - k]
in_num = nums[i]
lazy[out_num] = lazy.get(out_num, 0) + 1
balance = 0
if out_num <= -small[0]:
balance -= 1
else:
balance += 1
if small and in_num <= -small[0]:
balance += 1
heapq.heappush(small, -in_num)
else:
balance -= 1
heapq.heappush(large, in_num)
if balance < 0:
heapq.heappush(small, -heapq.heappop(large))
elif balance > 0:
heapq.heappush(large, -heapq.heappop(small))
while small and lazy.get(-small[0], 0) > 0:
lazy[-small[0]] -= 1
heapq.heappop(small)
while large and lazy.get(large[0], 0) > 0:
lazy[large[0]] -= 1
heapq.heappop(large)
res.append(get_median())
return res
Time & Space Complexity
- Time complexity: O(N log K)
- Reason: Inserting an element into a heap and rebalancing takes O(log K) time. The lazy deletion ensures we only pop elements when necessary, keeping heap operations logarithmic. We do this for N elements.
- Space complexity: O(K)
- Reason: The two heaps will store a combined total of at most K valid elements (plus a few lazily deleted ones before pruning), and the
invalidated dictionary will hold at most K entries.