91. Group Anagrams
Problem Statement
Given an array of strings strs, group the anagrams together. You can return the answer in any order.
An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Additional information
1 <= strs.length <= 10^4
0 <= strs[i].length <= 100
strs[i] consists of lowercase English letters.
- The output groups can be in any order.
Example 1:
Input: strs = ["eat","tea","tan","ate","nat","bat"]
Output: [["bat"],["nat","tan"],["ate","eat","tea"]]
Example 2:
Input: strs = [""]
Output: [[""]]
Example 3:
Input: strs = ["a"]
Output: [["a"]]
Sorting
Intuition
Two strings are anagrams if and only if their sorted characters are identical. For example, "eat" and "tea" both sort to "aet". We can use this sorted version as a key in a Hash Map (dictionary) to group all corresponding strings together.
Algorithm
- Initialize an empty dictionary
anagram_groups where the key is the sorted string and the value is a list of original strings.
- Iterate through each string
s in the input array.
- Sort the string
s to create a key (e.g., "eat" -> "aet").
- Append the original string
s to the list corresponding to this key in the dictionary.
- Return the values of the dictionary as a list of lists.
def group_anagrams(strs: list[str]) -> list[list[str]]:
anagram_groups = {}
for s in strs:
# Sort string to use as key
key = "".join(sorted(s))
if key not in anagram_groups:
anagram_groups[key] = []
anagram_groups[key].append(s)
return list(anagram_groups.values())
Time & Space Complexity
- Time complexity:
O(m * n log n)
- Reason: Let
m be the number of strings and n be the average length of a string. We iterate m times, and for each string, we sort it which takes O(n log n).
- Space complexity:
O(m * n)
- Reason: We store all strings in the dictionary.
Hash Map (Character Count)
Intuition
Sorting each string can be slow if the strings are very long. Instead of sorting, we can use the character count as a key. Since there are only 26 lowercase English letters, we can represent each string by a tuple of 26 integers (the count of 'a', 'b', ..., 'z'). This tuple serves as a unique signature for all anagrams of that word.
Algorithm
- Initialize a
defaultdict where the value is a list.
- Iterate through each string
s in strs.
- Create a list
count of size 26 initialized to 0.
- For each character
c in s, increment the corresponding index in count (using ord(c) - ord('a')).
- Convert the
count list to a tuple (so it can be hashed) and use it as a key in the dictionary.
- Append
s to the list at that key.
- Return the values of the dictionary.
def group_anagrams(strs: list[str]) -> list[list[str]]:
res = {}
for s in strs:
count = [0] * 26
for c in s:
count[ord(c) - ord("a")] += 1
# Use tuple as key since lists are not hashable
key = tuple(count)
if key not in res:
res[key] = []
res[key].append(s)
return list(res.values())
Time & Space Complexity
- Time complexity:
O(m * n)
- Reason: We iterate through
m strings, and for each string, we iterate through its n characters to count them. This avoids the log n factor from sorting.
- Space complexity:
O(m * n)
- Reason: We store all strings in the output dictionary.