Charalampopoulos, Panagiotis and Kociumaka, T. and Wellnitz, P. (2022) Faster pattern matching under edit distance: a reduction to dynamic puzzle matching and the Seaweed Monoid of permutation matrices. 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, pp. 698-707. ISSN 2575-8454.
Text: focs_biron.pdf - Author's Accepted Manuscript (604kB)
Abstract
We consider the approximate pattern matching problem under the edit distance. Given a text T of length n, a pattern P of length m, and a threshold k, the task is to find the starting positions of all substrings of T that can be transformed to P with at most k edits. More than 20 years ago, Cole and Hariharan [SODA’98, J. Comput.’02] gave an O(n + k^4 · n/m)-time algorithm for this classic problem, and this runtime has not been improved since. Here, we present an algorithm that runs in time O(n + k^{3.5} √(log m log k) · n/m), thus breaking through this longstanding barrier. In the case where n^{1/4+ε} ≤ k ≤ n^{2/5−ε} for some arbitrarily small positive constant ε, our algorithm improves over the state of the art by polynomial factors: it is polynomially faster than both the algorithm of Cole and Hariharan and the classic O(kn)-time algorithm of Landau and Vishkin [STOC’86, J. Algorithms’89]. We observe that the bottleneck case of the alternative O(n + k^4 · n/m)-time algorithm of Charalampopoulos, Kociumaka, and Wellnitz [FOCS’20] is when the text and the pattern are (almost) periodic. Our new algorithm reduces this case to a new Dynamic Puzzle Matching problem, which we solve by building on tools developed by Tiskin [SODA’10, Algorithmica’15] for the so-called seaweed monoid of permutation matrices. Our algorithm relies only on a small set of primitive operations on strings and thus also applies to the fully-compressed setting (where text and pattern are given as straight-line programs) and to the dynamic setting (where we maintain a collection of strings under creation, splitting, and concatenation), improving over the state of the art.
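To make the problem statement concrete, the following is a minimal sketch of a textbook quadratic-time baseline. It is not the algorithm of this paper, nor those of Cole and Hariharan or Landau and Vishkin. It runs the standard O(nm) dynamic program for approximate matching on the reversed text and reversed pattern, so that the end positions it naturally produces correspond to starting positions in T. The function name `approx_match_starts` and the 0-based indexing are assumptions made for illustration only.

```python
def approx_match_starts(text, pattern, k):
    """Return the sorted 0-based positions i < len(text) such that some
    substring of text starting at i is within edit distance k of pattern.

    Illustrative O(n*m) baseline only; hypothetical helper, not from the paper.
    """
    n, m = len(text), len(pattern)
    # Reverse both strings: a substring of the reversed text ending after
    # j characters corresponds to a substring of text starting at n - j.
    rt, rp = text[::-1], pattern[::-1]

    # col[r] = minimum edit distance between rp[:r] and any substring of rt
    # that ends at the current position (initially: the empty prefix of rt).
    col = list(range(m + 1))
    starts = []
    for j in range(1, n + 1):
        prev_diag = col[0]  # value for (r - 1, j - 1); col[0] is always 0
        for r in range(1, m + 1):
            cur = col[r]  # value for (r, j - 1), saved before overwriting
            cost = 0 if rp[r - 1] == rt[j - 1] else 1
            col[r] = min(prev_diag + cost,  # match or substitution
                         cur + 1,           # extra character in the text
                         col[r - 1] + 1)    # extra character in the pattern
            prev_diag = cur
        if col[m] <= k:
            starts.append(n - j)
    return sorted(starts)


if __name__ == "__main__":
    print(approx_match_starts("abracadabra", "abra", 0))
    print(approx_match_starts("abcdefg", "cxe", 1))
```

Note that when m ≤ k every position is reported, since the empty substring is already within k edits of the pattern. The sketch uses O(m) working space by keeping a single column of the dynamic-programming table.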
Metadata
Item Type: | Article |
---|---|
Additional Information: | Date of Conference: 31 October 2022 - 03 November 2022. ISBN: 9781665455190 |
School: | Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences |
Depositing User: | Panagiotis Charalampopoulos |
Date Deposited: | 06 Jan 2023 05:48 |
Last Modified: | 09 Aug 2023 12:54 |
URI: | https://eprints.bbk.ac.uk/id/eprint/50362 |