The task is:
A non-empty zero-indexed string S is given. String S consists of N characters from the set of upper-case English letters A, C, G, T.
This string actually represents a DNA sequence, and the upper-case letters represent single nucleotides.
You are also given non-empty zero-indexed arrays P and Q consisting of M integers. These arrays represent queries about minimal nucleotides. We represent the letters of string S as integers 1, 2, 3, 4 in arrays P and Q, where A = 1, C = 2, G = 3, T = 4, and we assume that A < C < G < T.
Query K requires you to find the minimal nucleotide from the range (P[K], Q[K]), 0 ≤ P[i] ≤ Q[i] < N.
For example, consider string S = GACACCATA and arrays P, Q such that:
P[0] = 0 Q[0] = 8 P[1] = 0 Q[1] = 2 P[2] = 4 Q[2] = 5 P[3] = 7 Q[3] = 7
The minimal nucleotides from these ranges are as follows:
(0, 8) is A identified by 1, (0, 2) is A identified by 1, (4, 5) is C identified by 2, (7, 7) is T identified by 4.
Write a function:
class Solution { public int[] solution(String S, int[] P, int[] Q); }
that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M characters specifying the consecutive answers to all queries.
The sequence should be returned as:
a Results structure (in C), or a vector of integers (in C++), or a Results record (in Pascal), or an array of integers (in any other programming language).
For example, given the string S = GACACCATA and arrays P, Q such that:
P[0] = 0 Q[0] = 8 P[1] = 0 Q[1] = 2 P[2] = 4 Q[2] = 5 P[3] = 7 Q[3] = 7
the function should return the values [1, 1, 2, 4], as explained above.
Assume that:
N is an integer within the range [1..100,000]; M is an integer within the range [1..50,000]; each element of array P, Q is an integer within the range [0..N − 1]; P[i] ≤ Q[i]; string S consists only of upper-case English letters A, C, G, T.
Complexity:
expected worst-case time complexity is O(N+M); expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
Elements of input arrays can be modified.
My solution is:
class Solution { public int[] solution(String S, int[] P, int[] Q) { final char c[] = S.toCharArray(); final int answer[] = new int[P.length]; int tempAnswer; char tempC; for (int iii = 0; iii < P.length; iii++) { tempAnswer = 4; for (int zzz = P[iii]; zzz <= Q[iii]; zzz++) { tempC = c[zzz]; if (tempC == 'A') { tempAnswer = 1; break; } else if (tempC == 'C') { if (tempAnswer > 2) { tempAnswer = 2; } } else if (tempC == 'G') { if (tempAnswer > 3) { tempAnswer = 3; } } } answer[iii] = tempAnswer; } return answer; } }
It is not optimal, I believe it's supposed to be done within one loop, any hint how can I achieve it?
You can check quality of your solution here https://codility.com/train/ test name is Genomic-range-query.
Here is the solution that got 100 out of 100 in codility.com. Please read about prefix sums to understand the solution:
public static int[] solveGenomicRange(String S, int[] P, int[] Q) { //used jagged array to hold the prefix sums of each A, C and G genoms //we don't need to get prefix sums of T, you will see why. int[][] genoms = new int[3][S.length()+1]; //if the char is found in the index i, then we set it to be 1 else they are 0 //3 short values are needed for this reason short a, c, g; for (int i=0; i<S.length(); i++) { a = 0; c = 0; g = 0; if ('A' == (S.charAt(i))) { a=1; } if ('C' == (S.charAt(i))) { c=1; } if ('G' == (S.charAt(i))) { g=1; } //here we calculate prefix sums. To learn what's prefix sums look at here https://codility.com/media/train/3-PrefixSums.pdf genoms[0][i+1] = genoms[0][i] + a; genoms[1][i+1] = genoms[1][i] + c; genoms[2][i+1] = genoms[2][i] + g; } int[] result = new int[P.length]; //here we go through the provided P[] and Q[] arrays as intervals for (int i=0; i<P.length; i++) { int fromIndex = P[i]; //we need to add 1 to Q[i], //because our genoms[0][0], genoms[1][0] and genoms[2][0] //have 0 values by default, look above genoms[0][i+1] = genoms[0][i] + a; int toIndex = Q[i]+1; if (genoms[0][toIndex] - genoms[0][fromIndex] > 0) { result[i] = 1; } else if (genoms[1][toIndex] - genoms[1][fromIndex] > 0) { result[i] = 2; } else if (genoms[2][toIndex] - genoms[2][fromIndex] > 0) { result[i] = 3; } else { result[i] = 4; } } return result; }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With