LeetCode 49. 字母异位词分组（中）

这道题的关键是，不同字母异位词怎么用同一个key表示。可以先对输入进行排序，然后将排序结果作为key，不管输入的字符的位置怎么排列，一旦排序之后，肯定是一样。

题目

给定一个字符串数组，将字母异位词组合在一起。字母异位词指字母相同，但排列不同的字符串。 说明： - 所有输入均为小写字母。 - 不考虑答案输出的顺序。

示例

示例:

输入: ["eat", "tea", "tan", "ate", "nat", "bat"],
输出:
[
  ["ate","eat","tea"],
  ["nat","tan"],
  ["bat"]
]

考察知识点

哈希表、字符串
哈希表相关题目的的精髓就是，在不同的输入之间找共同点，并以此为键，以输入为值。
题目模板

class Solution:
    def func(self, inputs):
        res = {}
        for input in inputs:
            if key in res:
                res[key].append(input)
            else:
                res[key] = (input)
        return res

    def hashMapFunc(self, value):
        # 构建value-key关系
        return key

核心思想

方法一、暴力对比法

思路很直接，一个一个遍历过去，如果字符出现种类和次数一样，就放到一起，否则就放到另一边。
时间复杂度为 \(O(n^2)\)。
不同的字母异位词怎么用同一个key表示？

1 2	`strs = ["eat", "tea", "tan", "ate", "nat", "bat"] map_dict = {'bat': ['bat'], 'eat': ['eat', 'tea', 'ate'], 'tan': ['tan', 'nat']}`

算法

声明保存已出现类型的 res 和保存映射关系的 map_dict
开始遍历输入字符串，for _str in strs:
- 获取 tmp_dict={"a": 1, "b": 1, "n": 1} 保存 _str="abn" 中出现的字母及其出现的次数。
- 判断 res 是否为空
  - 如果为空，说明这是第一次添加，把 tmp_dict 添加到 res 中，更新 map_dict。
  - 如果非空，说明这不是第一次添加，就遍历 res 寻找是否有字母异位词。设置存在判定标记变量 is_exist = False 。
  1
  2
  for _dic in res: if _dic == tmp_dict:
  - 如果已经存在，就设置 is_exist = True，然后更新 map_dict 中对应的位置。
  - 如果不存在，就更新 res 并把这个新的类型添加到 map_dict。
将 map_dict 转化为 list 形式

方法二、将输入排序之后再处理

算法

声明 tmp_dict 保存结果
遍历 strs
- 先调用 HashMapFunc 函数基于对 tmp_str 进行排序的方法获得其对应的 key 。
  - 先将 tmp_str 转成 list类型的数据
  - 再将排序结果转回成 str类型的数据再返回 sort_str = ''.join(sorted(list(temp_str)))
- 判断排序结果是否已经在 tmp_dict 中
  - 如果在，就添加。
  - 如果不在，就新建。
返回 tmp_dict

方法三、按计数分类

由于输入的全是小写字母，也可以将每个字符串 s 转换为字符数 count，由26个非负整数组成，表示 a，b，c 的数量等。我们使用这些计数作为哈希映射的基础。这样一来不管输入的字符的位置怎么排列，处理成26个非负数字的组合之后，肯定是独一无二的。

算法

与方法2的唯一的区别就是 hashMapFunc 函数

def HashMapFunc(self, temp_str):
    key = [0] * 26
    for char in temp_str:
    key[ord(char) - ord('a')] += 1
    return ''.join([str(i) for i in key])

LeetCode题解

Python版本

方法一的实现，自己完成的。

注意回顾，这个版本，十分的不简洁。

class Solution:
    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        # 声明hash map
        map_dict = {}
        res = []
        for _str in strs:
            tmp_dict = {}
            for _char in _str:
                if _char not in tmp_dict:
                    tmp_dict[_char] = 1
                else:
                    tmp_dict[_char] += 1
            if len(res)  == 0:
                res.append(tmp_dict)
                target_str = ''
                for _char in tmp_dict:
                    target_str += _char * tmp_dict[_char]
                map_dict[target_str] = [_str]
            else:
                is_exist = False
                for _dic in res:
                    if _dic == tmp_dict:  # 如果存在 就不新增了 直接添加
                        is_exist = True
                        target_str = ''
                        for _char in _dic:
                            target_str += _char * _dic[_char]
                        map_dict[target_str].append(_str)
                        break
                if not is_exist:  # 如果不存在 就继续新增 
                    res.append(tmp_dict)
                    target_str = ''
                    for _char in tmp_dict:
                        target_str += _char * tmp_dict[_char]
                    map_dict[target_str] = [_str]

        res = []
        for _dict in map_dict:
            res.append(map_dict[_dict])
        return res

print("leet code accept!!!")
Input = [
    ["paw","dad","bog","day","day","mig","len","rat"],
    ["cab","tin","pew","duh","may","ill","buy","bar","max","doc"], 
    ["eat", "tea", "tan", "ate", "nat", "bat"]]
Input1 = []
Answer = [
    [["rat"],["mig"],["paw"],["dad"],["len"],["bog"],["day","day"]],
    [["doc"],["bar"],["buy"],["ill"],["may"],["tin"],["cab"],["pew"],["max"],["duh"]],
    [["ate","eat","tea"],["nat","tan"],["bat"]],
]

if __name__ == "__main__":
    solution = Solution()
    for i in range(len(Input)):
        print("-"*50)
        reslut = solution.groupAnagrams(Input[i])
        print(reslut)
        print(Answer[i])

时间复杂度：\(O(n \times m)\), \(n\)是字符串个数，\(m\)是类别总数。
执行用时：2076 ms, 在所有 Python3 提交中击败了5.03%的用户
内存消耗：19.4 MB, 在所有 Python3 提交中击败了5.05%的用户

方法二的实现推荐

class Solution:
    def groupAnagrams(self,strs):
        temp_dict={}
        for temp_str in strs:
            sort_str = self.HashMapFunc(temp_str)
            if sort_str in temp_dict:
                temp_dict[sort_str].append(temp_str)
            else:
                temp_dict[sort_str]=[temp_str]
        return list(temp_dict.values())


    def HashMapFunc(self, temp_str):
        return ''.join(sorted(list(temp_str)))

时间复杂度：\(O(NK \log K)\)，其中 \(N\) 是 \(strs\) 的长度，而 \(K\) 是 \(strs\) 中字符串的最大长度。当我们遍历每个字符串时，外部循环具有的复杂度为 \(O(N)\)。然后，我们在 \(O(K \log K)\) 的时间内对每个字符串排序。
空间复杂度：\(O(NK)\)，排序存储在 \(ans\) 中的全部信息内容。
执行用时 :56 ms, 在所有 Python3 提交中击败了96.16%的用户
内存消耗 :16.2 MB, 在所有 Python3 提交中击败了61.68%的用户

方法三的实现

class Solution:
    def groupAnagrams(self,strs):
        temp_dict={}
        for temp_str in strs:
            sort_str = self.HashMapFunc(temp_str)
            if sort_str in temp_dict:
                temp_dict[sort_str].append(temp_str)
            else:
                temp_dict[sort_str]=[temp_str]
        return list(temp_dict.values())

    def HashMapFunc(self, temp_str):
        key = [0] * 26
        for char in temp_str:
            key[ord(char) - ord('a')] += 1
        return ''.join([str(i) for i in key])

时间复杂度：\(O(NK)\)，其中 \(N\) 是 \(strs\) 的长度，而 \(K\) 是 \(strs\) 中字符串的最大长度。计算每个字符串的字符串大小是线性的，我们统计每个字符串。空间复杂度：\(O(NK)\)，排序存储在 \(ans\) 中的全部信息内容。执行用时 :140 ms, 在所有 Python3 提交中击败了30.09%的用户内存消耗 :16.2 MB, 在所有 Python3 提交中击败了60.42%的用户

有效语法糖

1、将 Python dict 中的 Value 转成 list

1 2	`_dict = {'abt': ['bat'], 'aet': ['eat', 'tea', 'ate'], 'ant': ['tan', 'nat']} _list = list(_dict.values()) # [['ate', 'eat', 'tea'], ['nat', 'tan'], ['bat']]`

LeetCode

本博客所有文章除特别声明外，均采用 CC BY-SA 4.0 协议，转载请注明出处！

LeetCode 50. Pow(x, n)（中）& 剑指offer 面试题16. 数值的整数次方（中）上一篇

LeetCode 48. 旋转图像（中）下一篇