Is Subsequence

https://leetcode.com/problems/is-subsequence/description/

Given a string s and a string t, check if s is subsequence of t.

You may assume that there is only lower case English letters in both s and t. t is potentially a very long (length ~= 500,000) string, and s is a short string (<=100).

A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, "ace" is a subsequence of "abcde" while "aec" is not).

Example 1: s = "abc", t = "ahbgdc"

Return true.

Example 2: s = "axc", t = "ahbgdc"

Return false.

Thoughts

这道题有很多解法,最快的是贪心法,一般能想到动规解法,还有就是二分法。这里详细聊下二分法。

最先自然想到依次检查每个在s中的字符是否在t中存在, 并检查最先存在的位置是否在上个检查位置的后面:

for (Char c : s) {
  int startPos = search(start, c, t, startPos)
}

时间复杂度为O(m+n),不是O(mn), 因为t实际上只遍历了一次。想用二分法则在search做文章,但却无法直接套用。 因为虽然t自然被分界点c分成了两段,前段没有c,后段首字母为c(如存在),但是却无法找到满足第三个条件(即无论中点落在那点,都能利用某规则判断中点是否符合性质。这个规则须与序列长度无关)的规则,于是卡住。

后来看到他人的解法恍然大悟。难点在于不能O(1)知道中点是落在了c的前还是后,但如果我们提前把(所有可能的)c位置存起来,不就可以O(1)时间判断了。为什么之前没想到呢?因为大部分二分法的题都是直接O(lgn)解法,不会先花O(n)时间存下某些信息的思维定式。所以看到暴力解法时间复杂度超过了O(n)的题,就要想想是不是可以先做某preprocessing为二分法做铺垫。

Code

class Solution {
    private int firstValidPos(int target, List<Integer> location) {
            int start = 0;
            int end = location.size() - 1;
            while (start + 1 < end) {
                int mid = start + (end - start) / 2;
                if (location.get(mid) == target) {
                    end = mid;
                } else if (location.get(mid) < target) {
                    start = mid;
                } else {
                    end = mid;
                }

            }
        if (location.get(start) >= target) {
            return location.get(start);
        } else if (location.get(end) >= target) {
            return location.get(end);
        } else {
            return -2;
        }
    }


    public boolean isSubsequence(String s, String t) {
        Map<Character, List<Integer>> locations = new HashMap<>();
        for (int i = 0; i < t.length(); i++) {
            char c = t.charAt(i);
            if (!locations.keySet().contains(c)) {
                locations.put(c, new ArrayList<>());
            }
            locations.get(c).add(i);
        }

        int firstPos = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!locations.keySet().contains(c)) {
                return false;
            }
            firstPos = firstValidPos(firstPos, locations.get(c)) + 1;
            if (firstPos < 0) {
                break;
            }
        }

        if (firstPos < 0) {
            return false;
        } else {
            return true;
        }
    }
}

Analysis

时间复杂度为O(mlgn), 空间复杂度为存储了t中所有字符所在位置的hashmap, O(n).

Last updated