********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 3.5.4 (Release date: )

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= ../data/oreganno_data/processed_data/regulons_for_one_factor/vvl_factor_binding_sites_sequences.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
vvl_chr3L_6758913_675893 1.0000     21  vvl_chr3L_14043830_14043 1.0000     30
vvl_chr3L_14043358_14043 1.0000     15  vvl_chr3L_14044373_14044 1.0000     13
vvl_chr2R_8412593_841261 1.0000     22  vvl_chr3L_14043929_14043 1.0000     17
vvl_chr3L_14043971_14043 1.0000     12  vvl_chr3L_6759141_675915 1.0000     17
vvl_chr3L_14044313_14044 1.0000     17  vvl_chr3L_14043381_14043 1.0000     23
vvl_chr2R_11403677_11403 1.0000      8  vvl_chr3L_14043906_14043 1.0000     12
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme ../data/oreganno_data/processed_data/regulons_for_one_factor/vvl_factor_binding_sites_sequences.fa -dna -mod zoops -nmotifs 1 -revcomp -minw 6 -maxw 25 -dir /Users/jturatsi

model:  mod=         zoops    nmotifs=         1    evt=           inf
object function=  E-value of product of p-values
width:  minw=            6    maxw=           25    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       12    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=             207    N=              12
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.353 C 0.147 G 0.147 T 0.353
Background letter frequencies (from dataset with add-one prior applied):
A 0.351 C 0.149 G 0.149 T 0.351
********************************************************************************


********************************************************************************
MOTIF  1    width =    8   sites =  12   llr = 61   E-value = 4.9e+000
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  6:12:48:
pos.-specific     C  :2:::6::
probability       G  323:6::1
matrix            T  17684:29

         bits    2.7
                 2.5
                 2.2
                 1.9
Information      1.6
content          1.4     **
(7.4 bits)       1.1     ** *
                 0.8    *****
                 0.5 ********
                 0.3 ********
                 0.0 --------

Multilevel           ATTTGCAT
consensus            G G TA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand Start   P-value               Site
-------------            ------ ----- ---------             --------
vvl_chr3L_14043381_14043     +      1  7.51e-06           . GTGTGCAT ACATACATAA
vvl_chr2R_11403677_11403     +      1  9.07e-05           . ATTTGCAT
vvl_chr3L_6759141_675915     -      5  9.07e-05       GGAAC ATTTGCAT AATG
vvl_chr3L_6758913_675893     -     12  9.07e-05          AG GGGTGCAT ATCATTAGCG
vvl_chr3L_14044373_14044     +      3  1.34e-03          AT TTTTGCAT ATC
vvl_chr3L_14043971_14043     +      3  1.62e-03          GA ATGTTAAT TA
vvl_chr3L_14043358_14043     -      7  1.62e-03           G GTTTTAAT TTATTC
vvl_chr2R_8412593_841261     +     11  1.67e-03  ATCGTCATCA GCATGCAT GGCA
vvl_chr3L_14043929_14043     -      8  2.00e-03          GC ATTTTAAT CAATTAA
vvl_chr3L_14043906_14043     +      3  1.00e-02          TA ACTTTATT AT
vvl_chr3L_14044313_14044     -      2  1.00e-02    CCTTCGTA ATGAGCTG G
vvl_chr3L_14043830_14043     -      1  1.00e-02  ATTAATTTTT AGTATAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
vvl_chr3L_14043381_14043          7.5e-06  [+1]_15
vvl_chr2R_11403677_11403          9.1e-05  [+1]
vvl_chr3L_6759141_675915          9.1e-05  4_[-1]_5
vvl_chr3L_6758913_675893          9.1e-05  11_[-1]_2
vvl_chr3L_14044373_14044           0.0013  2_[+1]_3
vvl_chr3L_14043971_14043           0.0016  2_[+1]_2
vvl_chr3L_14043358_14043           0.0016  6_[-1]_1
vvl_chr2R_8412593_841261           0.0017  10_[+1]_4
vvl_chr3L_14043929_14043            0.002  7_[-1]_2
vvl_chr3L_14043906_14043             0.01  2_[+1]_2
vvl_chr3L_14044313_14044             0.01  1_[-1]_8
vvl_chr3L_14043830_14043             0.01  [-1]_22
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=8 seqs=12
vvl_chr3L_14043381_14043 (    1) GTGTGCAT  1
vvl_chr2R_11403677_11403 (    1) ATTTGCAT  1
vvl_chr3L_6759141_675915 (    5) ATTTGCAT  1
vvl_chr3L_6758913_675893 (   12) GGGTGCAT  1
vvl_chr3L_14044373_14044 (    3) TTTTGCAT  1
vvl_chr3L_14043971_14043 (    3) ATGTTAAT  1
vvl_chr3L_14043358_14043 (    7) GTTTTAAT  1
vvl_chr2R_8412593_841261 (   11) GCATGCAT  1
vvl_chr3L_14043929_14043 (    8) ATTTTAAT  1
vvl_chr3L_14043906_14043 (    3) ACTTTATT  1
vvl_chr3L_14044313_14044 (    2) ATGAGCTG  1
vvl_chr3L_14043830_14043 (    1) AGTATAAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 123 bayes= 3.20945 E= 4.9e+000
    73  -1023    116   -207
 -1023     16     16     93
  -207  -1023    116     73
  -107  -1023  -1023    125
 -1023  -1023    197     25
    25    197  -1023  -1023
   125  -1023  -1023   -107
 -1023  -1023    -84    139
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 12 E= 4.9e+000
 0.583333  0.000000  0.333333  0.083333
 0.000000  0.166667  0.166667  0.666667
 0.083333  0.000000  0.333333  0.583333
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.583333  0.416667
 0.416667  0.583333  0.000000  0.000000
 0.833333  0.000000  0.000000  0.166667
 0.000000  0.000000  0.083333  0.916667
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AG]T[TG]T[GT][CA]AT
--------------------------------------------------------------------------------




Time  0.07 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
vvl_chr3L_6758913_675893         2.54e-03  11_[-1(9.07e-05)]_2
vvl_chr3L_14043830_14043         3.70e-01  30
vvl_chr3L_14043358_14043         2.56e-02  15
vvl_chr3L_14044373_14044         1.60e-02  13
vvl_chr2R_8412593_841261         4.88e-02  22
vvl_chr3L_14043929_14043         3.93e-02  17
vvl_chr3L_14043971_14043         1.61e-02  12
vvl_chr3L_6759141_675915         1.81e-03  4_[-1(9.07e-05)]_5
vvl_chr3L_14044313_14044         1.82e-01  17
vvl_chr3L_14043381_14043         2.40e-04  [+1(7.51e-06)]_15
vvl_chr2R_11403677_11403         1.81e-04  [+1(9.07e-05)]
vvl_chr3L_14043906_14043         9.56e-02  12
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 1 reached.
********************************************************************************

CPU: jturatsi.scmbb.ulb.ac.be

********************************************************************************