Maximal Motif discovery in a sliding window

Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Fatima Vayani*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference paperpeer-review

3 Citations (Scopus)

Abstract

Motifs are relatively short sequences that are biologically significant, and their discovery in molecular sequences is a well-researched subject. A don’t care is a special letter that matches every letter in the alphabet. Formally, a motif is a sequence of letters of the alphabet and don’t care letters. A motif (Formula presented) that occurs at least k times in a sequence is maximal if it cannot be extended (to the left or right) nor can it be specialised (that is, its d’≤d don’t cares cannot be replaced with letters from the alphabet) without reducing its number of occurrences. Here we present a new dynamic data structure, and the first on-line algorithm, to discover all maximal motifs in a sliding window of length l on a sequence x of length n in (Formula presented) time, where w is the size of the machine word and DIFFi i-1 is the symmetric difference of the sets of occurrences of maximal motifs at x[i-l..i-1] and at x[i-l+1..i].

Original languageEnglish
Title of host publicationString Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings
EditorsTravis Gagie, Alistair Moffat, Gonzalo Navarro, Ernesto Cuadros-Vargas
PublisherSpringer Verlag
Pages191-205
Number of pages15
ISBN (Print)9783030004781
DOIs
Publication statusPublished - 1 Jan 2018
Event25th International Symposium on String Processing and Information Retrieval, SPIRE 2018 - Lima, Peru
Duration: 9 Oct 201811 Oct 2018

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11147 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference25th International Symposium on String Processing and Information Retrieval, SPIRE 2018
Country/TerritoryPeru
CityLima
Period9/10/201811/10/2018

Keywords

  • Genome analysis
  • Motif discovery
  • Sequence motifs

Fingerprint

Dive into the research topics of 'Maximal Motif discovery in a sliding window'. Together they form a unique fingerprint.

Cite this