Repeated DNA Sequences
https://leetcode.com/problems/repeated-dna-sequences/description/
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Thoughts
首先这个有歧义,开始一直以为是找No overlapping的重复子数组,看了别人的答案才知道不一定要no overlapping, 比如
If you enter your "AAAAAAAAAAA" as custom test and click "Run Code", you'll see that expected answer is
["AAAAAAAAAA"].
知道了这点很容易想出naive 解法。即每次移动一位,把遇到的都记录下来。但这么做浪费空间。
还可以用整数存储方法。
这个思路非常简单,这里一共有四个字母:A,C,G,T。我们转换整数的思路如下:
A = 00,C = 01,G = 10,T = 11。
int key = 0, key = key << 2 | code(A|C|G|T)。
Last updated
Was this helpful?