SEARCH BY SIGNAL

What is a Motif ?

Let D = {A, C, G, T} be the alphabet of nucleotide sequences. A motif (pattern, signal, ...) is an object denoting a set of sequences over this alphabet, either deterministically or probabilistically. Given a sequence S and a motif m, we say that m occurs in S if any of the sequences denoted by m occurs in S.
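
As a minimal sketch of this definition, a deterministic motif can be represented as an explicit set of words, and "m occurs in S" becomes a substring test over that set. The function name occurs and the example sequence are hypothetical, chosen only for illustration.

```python
def occurs(motif_sequences, s):
    """Return True if any sequence denoted by the motif is a substring of s."""
    return any(word in s for word in motif_sequences)


# Hypothetical deterministic motif: an explicit set of words over {A, C, G, T}.
motif = {"CTAAAAATAA", "TTAAAAATAA"}
sequence = "GGGCTAAAAATAACCC"

print(occurs(motif, sequence))    # True: CTAAAAATAA starts at position 3
print(occurs(motif, "ACGTACGT"))  # False
```

Listing every word explicitly only works for small motifs; the descriptors in the next section are more compact ways of denoting the same kind of set.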

A Hierarchy of Motif Descriptors

Sequence motifs can be described in a wide variety of ways.

  • Exact Word. The description is a specific sequence over the alphabet.

    CTTAAAATAA

  • Consensus Sequences. The description specifies the alternative nucleotides that may occur at a given position, using the IUPAC ambiguity codes (in the example below, Y = C or T, W = A or T, R = A or G).

    YTWWAAATAR   (Consensus MEF2 sequence, Yu et al., 1992)

    CTAAAAATAA
    TTAAAAATAA
    TTTAAAATAA
    CTATAAATAA
    TTATAAATAA
    CTTAAAATAG
    TTTAAAATAG
    ..........


  • Regular Expressions. The description is built on an extension of the original alphabet. The new symbols of this extended alphabet include symbols denoting the alternative occurrence of a number of nucleotides at a given position, and symbols denoting that a given position may be absent.

    C..?[STA]..C[STA][^P]C

    (ferredoxin, iron-sulfur binding region signature, PROSITE database, Bairoch, 1991)

  • Position Weight Matrices. The description includes a weight (score, probability, likelihood) for each symbol occurring at each position along the motif.

    Follow the link for An Introduction to Position Weight Matrices
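
The descriptors above can be compared on the same data. The sketch below is an illustration under stated assumptions, not the course's own implementation: it expands the consensus YTWWAAATAR into a regular expression using the standard IUPAC codes, and it builds a toy weight matrix from only the seven MEF2 sites listed above (the original list is truncated). The log-odds scoring, the pseudocount of 1, the uniform background of 0.25, and all function names are assumptions made for the example.

```python
import math
import re

# Standard IUPAC nucleotide ambiguity codes, written as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# The seven sites shown in the text; the source list continues beyond these.
SITES = [
    "CTAAAAATAA", "TTAAAAATAA", "TTTAAAATAA", "CTATAAATAA",
    "TTATAAATAA", "CTTAAAATAG", "TTTAAAATAG",
]


def find_consensus(consensus, s):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    body = "".join(IUPAC[c] for c in consensus)
    # A lookahead matches without consuming, so overlapping hits are reported.
    return [m.start() for m in re.finditer("(?=" + body + ")", s)]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix: one dict of base -> weight per motif position."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, window):
    """Sum the weights of the observed base at each position of the window."""
    return sum(col[base] for col, base in zip(pwm, window))


def best_hit(pwm, s):
    """Return (score, start) of the highest-scoring window in s."""
    width = len(pwm)
    return max((score(pwm, s[i:i + width]), i) for i in range(len(s) - width + 1))


seq = "GGGCTAAAAATAACCC"
print(find_consensus("YTWWAAATAR", seq))  # [3]
pwm = build_pwm(SITES)
print(best_hit(pwm, seq)[1])              # 3
```

The consensus gives a yes/no answer for each window, whereas the matrix assigns every window a score, so near-matches can be ranked instead of discarded.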

PRACTICAL
