{"id":61,"date":"2015-03-23T19:29:31","date_gmt":"2015-03-23T19:29:31","guid":{"rendered":"http:\/\/emacslisp.com\/?p=61"},"modified":"2015-06-07T11:36:56","modified_gmt":"2015-06-07T11:36:56","slug":"repeated-dna-sequences","status":"publish","type":"post","link":"http:\/\/emacslisp.com\/?p=61","title":{"rendered":"Repeated DNA Sequences"},"content":{"rendered":"<p>Repeated DNA Sequences Total Accepted: 8816 Total Submissions: 47451 My Submissions Question Solution<br \/>\nAll DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &#8220;ACGAATTCCG&#8221;. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.<\/p>\n<p>Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.<\/p>\n<p>For example,<\/p>\n<p>Given s = &#8220;AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT&#8221;,<\/p>\n<p>Return:<br \/>\n[&#8220;AAAAACCCCC&#8221;, &#8220;CCCCCAAAAA&#8221;].<\/p>\n<pre lang=\"java\" line=\"1\">\r\n\r\nimport java.util.*;\r\npublic class RepeatedDNASequences {\r\n    public static List<String> findRepeatedDnaSequences(String s) {\r\n\r\n        ArrayList<String> result = new ArrayList<String>();\r\n\r\n        if(s== null || s.length() == 0) return result;\r\n\r\n        HashMap<Character,Integer> map = new HashMap<Character,Integer>();\r\n        map.put('A',0);\r\n        map.put('C',1);\r\n        map.put('G',2);\r\n        map.put('T',3);\r\n\r\n        HashSet<Integer> hashSet = new HashSet<Integer>();\r\n        Set<Integer> unique = new HashSet<Integer>();\r\n\r\n        int temp = 0;\r\n        for(int i=0;i<s.length();i++){\r\n            \/*temp = 0;\r\n              for(int j=0;j<10;j++){\r\n              temp = (temp<<2) + map.get(s.charAt(i+j));\r\n              }*\/\r\n            if(i<9){\r\n                temp = (temp<<2) + map.get(s.charAt(i));\r\n                \/\/ System.out.println(temp);\r\n            }\r\n            else{\r\n                temp = (temp<<2) + map.get(s.charAt(i));\r\n                temp &amp;= (1<<20) - 1;\r\n                System.out.println(temp);\r\n                if(hashSet.contains(temp) &amp;&amp; !unique.contains(temp)){\r\n                    result.add(s.substring(i-9,i+1));\r\n                    unique.add(temp);\r\n                }\r\n                else {\r\n                    hashSet.add(temp);\r\n                }\r\n            }\r\n        }\r\n\r\n        return result;\r\n    }\r\n\r\n    public static void main(String[] args) {\r\n        String s = \"AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT\";\r\n        System.out.println(findRepeatedDnaSequences(s));\r\n    }\r\n}\r\n\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Repeated DNA Sequences Total Accepted: 8816 Total Submissions: 47451 My Submissions Question Solution All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &#8220;ACGAATTCCG&#8221;. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA. Write a function to find all the 10-letter-long sequences [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-61","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"http:\/\/emacslisp.com\/index.php?rest_route=\/wp\/v2\/posts\/61","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/emacslisp.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/emacslisp.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/emacslisp.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/emacslisp.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=61"}],"version-history":[{"count":5,"href":"http:\/\/emacslisp.com\/index.php?rest_route=\/wp\/v2\/posts\/61\/revisions"}],"predecessor-version":[{"id":106,"href":"http:\/\/emacslisp.com\/index.php?rest_route=\/wp\/v2\/posts\/61\/revisions\/106"}],"wp:attachment":[{"href":"http:\/\/emacslisp.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=61"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/emacslisp.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=61"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/emacslisp.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=61"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}