Problem Statement
You are given a string s. We want to partition the string into as many parts as possible so that each letter appears in at most one part.
Note that the partition is done so that after concatenating all the parts in order, the resultant string should be s.
Return a list of integers representing the size of these parts.
Additional information
1 <= s.length <= 500
s consists of lowercase English letters.
Example 1:
Input: s = "ababcbacadefegdehijhklij"
Output: [9,7,8]
Explanation: The partition is "ababcbaca", "defegde", "hijhklij".
This is a partition so that each letter appears in at most one part.
A partition like "ababcbacadefegde", "hijhklij" is incorrect, because it splits s into less parts.
Example 2:
Input: s = "eccbbbbdec"
Output: [10]
Interval Merging Approach
Intuition
We can think of each unique character in the string as an "interval". The interval starts at the character's very first appearance and ends at its very last appearance. Because all occurrences of a character must belong to the exact same partition, the partition must perfectly cover this interval.
If we find the [start, end] interval for every single unique character in the string, the problem transforms into the classic "Merge Intervals" problem!
We sort the intervals by their start times, and then merge any intervals that overlap. Once all overlapping intervals are merged, the lengths of these finalized, disjoint intervals will give us the exact sizes of our maximum possible partitions.
Algorithm
- Iterate through the string
s to record the first and last appearance of every unique character. Store this in a dictionary mapping char to [first_index, last_index].
- Extract all the intervals from the dictionary and sort them based on their
first_index.
- Initialize a
merged list to hold the combined intervals.
- Iterate through the sorted intervals:
- If
merged is empty, or if the current interval's start time is strictly greater than the previous interval's end time (no overlap), append the current interval to merged.
- Otherwise, there is an overlap! Update the end time of the last interval in
merged to be the maximum of its current end time and the new interval's end time.
- Initialize a
res list.
- Loop through the
merged intervals, calculate the length of each (end - start + 1), and append it to res.
- Return
res.
def partition_labels(s: str) -> list[int]:
intervals_map = {}
# Find the first and last occurrence of each character
for i, char in enumerate(s):
if char not in intervals_map:
intervals_map[char] = [i, i]
else:
intervals_map[char][1] = i
# Extract and sort the intervals by their start index
intervals = list(intervals_map.values())
intervals.sort(key=lambda x: x[0])
merged = []
# Merge overlapping intervals
for current_start, current_end in intervals:
if not merged or merged[-1][1] < current_start:
merged.append([current_start, current_end])
else:
merged[-1][1] = max(merged[-1][1], current_end)
# Calculate the size of each merged partition
res = []
for start, end in merged:
res.append(end - start + 1)
return res
Time & Space Complexity
Time complexity: O(N + C log C)
Reason: N is the length of the string and C is the number of unique characters. We iterate through the string in O(N) time. Sorting the intervals takes O(C log C) time. Because C is bounded by 26 (the English alphabet), the sorting step is effectively O(1), making the overall time O(N).
Space complexity: O(C)
Reason: The intervals_map and the merged list will store at most 26 intervals, using O(1) auxiliary space.
Greedy Two-Pass - Optimal
Intuition
We can solve this much faster without creating or sorting intervals by using a Greedy algorithm.
In our first pass, we only care about the absolute last time we see each character. We store this in a simple dictionary.
In our second pass, we iterate through the string while keeping a running tally of the current partition's size. We also maintain an end pointer that represents the furthest index our current partition must stretch to.
As we look at each character, we check its last known occurrence and push our end pointer out to that index. We keep walking forward. If our current index i ever exactly matches the end pointer, it means we have safely passed the last occurrences of every single character we've seen so far! We can instantly chop the partition here, save the size, and reset for the next one.
Algorithm
- Create a dictionary
last_occurrence to store the last index of each character in s.
- Do a first pass: iterate through
s using enumerate and update last_occurrence[char] = i.
- Initialize
size = 0, end = 0, and an empty res list.
- Do a second pass: iterate through
s again with i and char.
- Increment
size by 1 for the current character.
- Greedily update
end to be the maximum of end and last_occurrence[char].
- If our current index
i catches up to end (i == end), the partition is complete!
- Append
size to res.
- Reset
size = 0 for the next partition.
- Return
res.
def partition_labels(s: str) -> list[int]:
last_occurrence = {}
# Pass 1: Record the last index of every character
for i, char in enumerate(s):
last_occurrence[char] = i
res = []
size = 0
end = 0
# Pass 2: Greedily stretch the partition boundary
for i, char in enumerate(s):
size += 1
end = max(end, last_occurrence[char])
# If we reach the furthest required boundary, chop the partition
if i == end:
res.append(size)
size = 0
return res
Time & Space Complexity
Time complexity: O(N)
Reason: We iterate through the string of length N exactly twice. The dictionary lookups take O(1) time. This results in a highly optimal linear runtime.
Space complexity: O(1)
Reason: The last_occurrence dictionary will only ever store up to 26 key-value pairs (the lowercase English alphabet), resulting in constant O(1) extra space.