ACM International Conference Proceeding Series, Pages 107-111 , 27/12/2017

Approximate tandem repeats computation

Tida Butrak, Supaporn Chairungsee

Abstract

In this paper, we present a linear-time algorithm to compute approximate tandem repeat in genomic sequences with the Longest Previous non-overlapping Factor (LPnF) table. The LPnF table is related to a certain type of Ziv-Lempel factorization in text compression that is an essential element for the design of efficient algorithms on strings. The LPnF table is a technique to obtain a number of substrings including the longest factor occurring previously at a position i. We show the process for computing the LPnF table by using on-line construction of position heap that is a well-known data structure to efficiently and speed up the search. The proposed algorithm to compute approximate tandem repeat with the LPnF table using on-line construction of position heap runs in linear time, O (n), for a string of length n.

Document Type

Conference Paper

Source Type

Conference Proceeding

ISBN

[9781450363518]

ISSN

Keywords

Approximate tandem repeatLongest previous non-overlapping factorOn-line construction of position heapTandem repeat

ASJC Subject Area

Computer Science : SoftwareComputer Science : Computer Networks and CommunicationsComputer Science : Human-Computer InteractionComputer Science : Computer Vision and Pattern Recognition


Bibliography


Butrak, T., & Chairungsee, S. (2017). Approximate tandem repeats computation. ACM International Conference Proceeding Series107-111. doi:10.1145/3176653.3176660

Copy | Save