TY - JOUR
T1 - Querying highly similar sequences
AU - Barton, Carl
AU - Giraud, Mathieu
AU - Iliopoulos, Costas S.
AU - Lecroq, Thierry
AU - Mouchard, Laurent
AU - Pissis, Solon P.
PY - 2013
Y1 - 2013
N2 - In this paper, we present a solution to the extreme similarity sequencing problem. The extreme similarity sequencing problem consists of finding occurrences of a pattern p in a set S(0), S(1), …, S(k), of sequences of equal length, where S(i), for all 1≤i≤k, differs from S(0) by a constant number of errors - around 10 in practice. We present an asymptotically fast O(n + occ log occ) time algorithm, as well as a practical O(nk/w) time algorithm for solving this problem, where n is the length of a sequence, occ is the number of candidate occurrences reported by our technique, w is the size of the machine word, and the total number of errors is bounded by k - the number of sequences.
AB - In this paper, we present a solution to the extreme similarity sequencing problem. The extreme similarity sequencing problem consists of finding occurrences of a pattern p in a set S(0), S(1), …, S(k), of sequences of equal length, where S(i), for all 1≤i≤k, differs from S(0) by a constant number of errors - around 10 in practice. We present an asymptotically fast O(n + occ log occ) time algorithm, as well as a practical O(nk/w) time algorithm for solving this problem, where n is the length of a sequence, occ is the number of candidate occurrences reported by our technique, w is the size of the machine word, and the total number of errors is bounded by k - the number of sequences.
U2 - 10.1504/IJCBDD.2013.052206
DO - 10.1504/IJCBDD.2013.052206
M3 - Article
VL - 6
SP - 119
EP - 130
JO - International Journal of Computational Biology and Drug Design
JF - International Journal of Computational Biology and Drug Design
IS - 1
ER -