Abstract
Strings with don't care symbols, also called partial words, and more general indeterminate strings are a natural representation of strings containing uncertain symbols. A considerable effort has been made to obtain efficient algorithms for pattern matching and periodicity detection in such strings. Among those, a number of algorithms have been proposed that behave well on random data, but still their worst-case running time is Θ(n2). We present the first truly subquadratic-time solutions for a number of such problems on partial words. We show that n longest common compatible prefix queries (which correspond to longest common extension queries in regular strings) can be answered on-line in O(n√n log n) time after O(n√n log n)-time preprocessing. We also present O(n√n log n)-time algorithms for computing the prefix array and two types of border array of a partial word. We show how our solutions can be adapted to indeterminate strings over a constant-sized alphabet and prove that, unless the Strong Exponential Time Hypothesis is false, the considered problems cannot be solved efficiently over a general alphabet.
Original language | English |
---|---|
Title of host publication | Leibniz International Proceedings in Informatics, LIPIcs |
Place of Publication | Germany |
Publisher | Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing |
Pages | 8.1-8.12 |
Number of pages | 12 |
Volume | 54 |
ISBN (Print) | 9783959770125 |
DOIs | |
Publication status | Published - 1 Jun 2016 |
Event | 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016 - Tel Aviv, Israel Duration: 27 Jun 2016 → 29 Jun 2016 |
Conference
Conference | 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016 |
---|---|
Country/Territory | Israel |
City | Tel Aviv |
Period | 27/06/2016 → 29/06/2016 |
Keywords
- Indeterminate string
- Longest common conservative prefix queries
- Partial word
- Prefix array
- String with don't cares