Approximate string searching with fast fourier transforms and simplexes

Torrey Pines High School, San Diego, California, United States
DOI
10.7287/peerj.preprints.27615v1
Subject Areas
Algorithms and Analysis of Algorithms
Keywords
fast Fourier transform, Hamming distance, string searching, don't cares, simplex
Copyright
© 2019 Liu
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Liu D. 2019. Approximate string searching with fast fourier transforms and simplexes. PeerJ Preprints 7:e27615v1

Abstract

Previous algorithms for approximate string matching under the Hamming distance with wildcard ("don't care") characters have been shown to take \(O(|\Sigma| N \log M)\) time, where \(N\) is the length of the text, \(M\) is the length of the pattern, and \(|\Sigma|\) is the size of the alphabet. They use the Fast Fourier Transform to compute convolutions efficiently. We describe a novel approach to the problem that uses special encoding schemes based on \((|\Sigma| - 1)\)-simplexes in \((|\Sigma| - 1)\)-dimensional space.
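
For orientation, the convolution-based baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the general per-symbol technique, not the simplex encoding proposed in the preprint. The wildcard symbol `?` and the function names `fft`, `cross_correlate`, and `hamming_distances` are assumptions made for this example. For brevity the sketch transforms the whole text at once, which costs \(O(|\Sigma| N \log N)\); reaching \(O(|\Sigma| N \log M)\) would additionally require splitting the text into overlapping blocks of length about \(2M\).

```python
import cmath

WILDCARD = "?"  # assumed wildcard symbol, chosen for this example only


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    The inverse transform is left unnormalized (caller divides by n).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def cross_correlate(x, y):
    """Return c[i] = sum_j x[i + j] * y[j] for i = 0 .. len(x) - len(y)."""
    n, m = len(x), len(y)
    size = 1
    while size < n + m:
        size *= 2
    fx = fft([complex(v) for v in x] + [0j] * (size - n))
    # Reversing y turns the correlation into an ordinary convolution.
    fy = fft([complex(v) for v in reversed(y)] + [0j] * (size - m))
    prod = fft([p * q for p, q in zip(fx, fy)], invert=True)
    # Alignment i sits at index m - 1 + i of the linear convolution.
    return [round(prod[m - 1 + i].real / size) for i in range(n - m + 1)]


def hamming_distances(text, pattern):
    """Hamming distance at every alignment; wildcards match anything.

    Assumes len(text) >= len(pattern).
    """
    n, m = len(text), len(pattern)
    alphabet = (set(text) | set(pattern)) - {WILDCARD}
    # Number of aligned pairs in which neither character is a wildcard.
    compared = cross_correlate(
        [int(c != WILDCARD) for c in text],
        [int(c != WILDCARD) for c in pattern],
    )
    # One correlation of indicator vectors per symbol counts exact matches.
    matches = [0] * (n - m + 1)
    for s in sorted(alphabet):
        hits = cross_correlate(
            [int(c == s) for c in text],
            [int(c == s) for c in pattern],
        )
        matches = [a + b for a, b in zip(matches, hits)]
    return [c - k for c, k in zip(compared, matches)]


print(hamming_distances("abcab?", "a?c"))  # [0, 2, 2, 0]
```

The loop over the alphabet is where the \(|\Sigma|\) factor in the running time comes from: each symbol needs its own pair of indicator vectors and its own correlation.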

Author Comment

This is a preprint submission to PeerJ Preprints.