LeetCode-Python-820. Short Encoding of Words (Trie)

Post author: xfxia
Post published: September 19, 2023
Post category: python

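Given a list of words, we encode the list as an index string S and an index list A (written as indexes in the examples).

For example, if the list is ["time", "me", "bell"], we can represent it as S = "time#bell#" and indexes = [0, 2, 5].

For each index, we recover the corresponding word by reading S from that position up to (but not including) the next "#".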
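
To make the decoding rule concrete, here is a minimal sketch of it. The function name decode is hypothetical (it is not part of the problem statement or of the solution in this post); it only illustrates how each index maps back to a word.

```python
def decode(s, indexes):
    """Recover the word list from the index string and the start positions."""
    words = []
    for start in indexes:
        # Read from `start` up to (not including) the next '#'.
        end = s.index("#", start)
        words.append(s[start:end])
    return words


print(decode("time#bell#", [0, 2, 5]))  # ['time', 'me', 'bell']
```

Note that "me" costs no extra characters: index 2 simply points into the middle of "time#".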

What is the length of the shortest index string S that encodes the given word list?

Example:

Input: words = ["time", "me", "bell"]

Output: 10

Explanation: S = "time#bell#", indexes = [0, 2, 5].

Constraints:

1 <= words.length <= 2000

1 <= words[i].length <= 7

Every word consists of lowercase letters only.

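Source: LeetCode (力扣)

Link: https://leetcode-cn.com/problems/short-encoding-of-words

Copyright belongs to LeetCode (领扣网络). For commercial reprints, contact the official site for authorization; for non-commercial reprints, cite the source.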

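Approach:

The key to this problem is deciding whether each word is a suffix of some other word.

For example, with the input words = ["time", "me", "bell"], "me" is a suffix of "time", so it needs no characters of its own.

Suffix problems are handled much like prefix problems: reverse every word and a suffix becomes a prefix, so both can be solved with a Trie.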
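
As a small illustration of why reversing helps, the sketch below checks a suffix relation by comparing reversed strings as prefixes. The helper name is_suffix is hypothetical and exists only for this example; the solution in this post does the same comparison implicitly by walking a trie of reversed words.

```python
def is_suffix(short, long_word):
    """True if `short` is a suffix of `long_word`, tested as a reversed prefix."""
    return long_word[::-1].startswith(short[::-1])


print(is_suffix("me", "time"))  # True:  "emit" starts with "em"
print(is_suffix("ti", "time"))  # False: "emit" does not start with "it"
```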

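Sort the input array by word length in descending order, then insert each word into the trie in reverse (because this is a suffix problem).

Because longer words are inserted first, a word that is a suffix of an earlier word (or a duplicate of one) ends on a trie node that already has entries. If the node is still empty, the current word is not a suffix of any other word, so add its length plus 1 for the terminator "#" to the answer.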

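Time complexity: O(N log N + NK), where N is the number of words: O(N log N) for the sort and O(NK) for the insertions. Since K <= 7 here, the sort dominates.

Space complexity: O(NK), where K is the length of the longest word.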

class Trie(object):
    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = {}
        self.char_cnt = 0  # total number of a-z characters in the encoding
        self.word_cnt = 0  # total number of "#" terminators in the encoding

    def insert(self, word):
        """
        Inserts a word into the trie.
        :type word: str
        :rtype: None
        """
        node = self.root
        for char in word:  # walk or create the path for this word
            node = node.setdefault(char, {})

        if not node:  # empty node: word is not a suffix of any earlier word
            self.word_cnt += 1
            self.char_cnt += len(word)
        node["end"] = True  # mark the end of a word (also catches duplicates)

class Solution(object):
    def minimumLengthEncoding(self, words):
        """
        :type words: List[str]
        :rtype: int
        """
        ttree = Trie()

        for word in sorted(words, key=len, reverse=True):
            # longest words first; insert each word reversed
            ttree.insert(word[::-1])
        return ttree.char_cnt + ttree.word_cnt
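
For cross-checking the trie solution, here is a sketch of a different, set-based technique. It is not the method used in this post, and the function name minimum_length_encoding_by_set is made up for the example. It starts from the set of distinct words and discards every word that appears as a proper suffix of another.

```python
def minimum_length_encoding_by_set(words):
    """Set-based alternative: drop every word that is a proper suffix of another."""
    remaining = set(words)  # duplicates collapse into one entry
    for word in words:
        # word[k:] for k >= 1 enumerates all proper suffixes of `word`.
        for k in range(1, len(word)):
            remaining.discard(word[k:])
    # Each surviving word contributes its letters plus one '#'.
    return sum(len(word) + 1 for word in remaining)


print(minimum_length_encoding_by_set(["time", "me", "bell"]))  # 10
```

With words of length at most 7 each word has at most 6 proper suffixes, so this stays cheap, but it gives up the incremental structure that the trie provides.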

Original post: https://blog.csdn.net/qq_32424059/article/details/105154947
