********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 3.5.4 (Release date: )

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= ../data/oreganno_data/processed_data/regulons_for_one_factor/en_factor_binding_sites_sequences.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
en_chrX_1996383_1996439_ 1.0000     57  en_chr2R_7044683_7044718 1.0000     36
en_chr3R_12599167_125991 1.0000     14  en_chr2R_7044520_7044545 1.0000     26
en_chrX_1981511_1981523_ 1.0000     13  en_chrX_1996747_1996757_ 1.0000     11
en_chrX_1996767_1996793_ 1.0000     27  en_chr4_80105_80117_D    1.0000     13
en_chr2R_7041421_7041430 1.0000     10  en_chr4_80673_80681_D    1.0000      9
en_chr2R_7041482_7041501 1.0000     20  en_chrX_1980287_1980378_ 1.0000     92
en_chrX_1981537_1981622_ 1.0000     86  en_chr3R_12599235_125992 1.0000     19
en_chrX_1981649_1981669_ 1.0000     21  en_chr3R_12526804_125268 1.0000     12
en_chr2L_2471225_2471232 1.0000      8  en_chr3R_12526988_125269 1.0000     10
en_chrX_1996644_1996665_ 1.0000     22  en_chr2L_2471447_2471454 1.0000      8
en_chrX_1996473_1996534_ 1.0000     62  en_chrX_1980403_1980432_ 1.0000     30
en_chr3R_12599326_125993 1.0000     20  en_chrX_1981450_1981495_ 1.0000     46
en_chr2R_7044643_7044664 1.0000     22  en_chr3R_2692105_2692143 1.0000     39
en_chr2L_2471427_2471434 1.0000      8  en_chr2L_2471457_2471466 1.0000     10
en_chrX_1981689_1981725_ 1.0000     37  en_chr2R_7041389_7041401 1.0000     13
en_chrX_1996582_1996641_ 1.0000     60  en_chr2L_2471212_2471220 1.0000      9
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme ../data/oreganno_data/processed_data/regulons_for_one_factor/en_factor_binding_sites_sequences.fa -dna -mod zoops -nmotifs 1 -revcomp -minw 6 -maxw 25 -dir /Users/jturatsi

model:  mod=         zoops    nmotifs=         1    evt=           inf
object function=  E-value of product of p-values
width:  minw=            6    maxw=           25    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       32    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=             870    N=              32
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.370 C 0.130 G 0.130 T 0.370
Background letter frequencies (from dataset with add-one prior applied):
A 0.369 C 0.131 G 0.131 T 0.369
********************************************************************************


********************************************************************************
MOTIF  1    width =    6   sites =  31   llr = 144   E-value = 6.9e-004
********************************************************************************
--------------------------------------------------------------------------------
    Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  179::3
pos.-specific     C  ::1:::
probability       G  :2::27
matrix            T  8::a8:

         bits    2.9
                 2.6
                 2.3
                 2.1
Information      1.8
content          1.5    * *
(6.7 bits)       1.2   ** *
                 0.9  *****
                 0.6 ******
                 0.3 ******
                 0.0 ------

Multilevel           TAATTG
consensus             G   A
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value              Site
-------------            ------  ----- ---------            ------
en_chr2R_7044643_7044664     -      3  8.96e-04 TTTAACTGGT TAATTG AA
en_chr2L_2471447_2471454     +      2  8.96e-04          T TAATTG A
en_chrX_1996644_1996665_     -     15  8.96e-04         AA TAATTG CTTCTTGGGT
en_chr2L_2471225_2471232     +      3  8.96e-04         TA TAATTG
en_chr2R_7041482_7041501     +     12  8.96e-04 AATTATAATT TAATTG ACC
en_chr2R_7041421_7041430     +      4  8.96e-04        GGC TAATTG G
en_chr4_80105_80117_D        -      5  8.96e-04        ATG TAATTG AGCA
en_chrX_1996767_1996793_     +      8  8.96e-04    TCGCTGC TAATTG GCAGTACAAA
en_chr2R_7044520_7044545     +      5  8.96e-04       AAGC TAATTG TTTTAATTTA
en_chr2R_7044683_7044718     +     17  8.96e-04 GATTGATATT TAATTG ACATTTAATT
en_chrX_1996582_1996641_     +     11  1.21e-03 TTGAACTCCA TGATTG TAAGCTATAA
en_chr3R_12599326_125993     +     11  1.21e-03 CAAAAAATTA TGATTG TTTC
en_chrX_1996747_1996757_     +      4  1.21e-03        AGT TGATTG AA
en_chrX_1980287_1980378_     +     16  1.53e-03 AACAAAAAAG TAATGG TGTAAAATAA
en_chr3R_12599167_125991     -      2  1.53e-03    CCTTTTT TAATGG C
en_chrX_1981649_1981669_     +      5  2.08e-03       GTAA TGCTTG TAAATACTAC
en_chrX_1996383_1996439_     +     52  2.19e-03 TGAACTAAAT TACTGG
en_chr2R_7041389_7041401     +      7  4.75e-03     ACTCTC TAATTA G
en_chr2L_2471457_2471466     +      4  4.75e-03        GAT TAATTA T
en_chr2L_2471427_2471434     +      2  4.75e-03          T TAATTA C
en_chrX_1981450_1981495_     +      8  4.75e-03    AGTGTGG TAATTA TTTTCTTAAT
en_chrX_1980403_1980432_     +     22  4.75e-03 ATATATTTAT TAATTA ATT
en_chr4_80673_80681_D        +      3  4.75e-03         CT TAATTA G
en_chrX_1981537_1981622_     +     29  5.65e-03 ACTCTACTAA TGATTA TACTATTATT
en_chr3R_2692105_2692143     -     17  7.18e-03 TGAGTAGTTA TCATTG AAAGGCATTT
en_chrX_1981511_1981523_     +      7  7.18e-03     TGCAAT AGATTG T
en_chr3R_12599235_125992     -     10  8.08e-03       CTGC TAATGA GCGATCTTT
en_chrX_1981689_1981725_     -     19  9.15e-03 AGGCAAATCA AAATGG AGAACGTTAA
en_chr3R_12526988_125269     -      3  9.53e-03         GC CGATGG TG
en_chr3R_12526804_125268     +      3  1.46e-02         CC AAATTA GCAG
en_chrX_1996473_1996534_     +     37  2.11e-02 ATAGACATTT AACTTA ATACCTTTCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
en_chr2R_7044643_7044664           0.0009  2_[-1]_14
en_chr2L_2471447_2471454           0.0009  1_[+1]_1
en_chrX_1996644_1996665_           0.0009  14_[-1]_2
en_chr2L_2471225_2471232           0.0009  2_[+1]
en_chr2R_7041482_7041501           0.0009  11_[+1]_3
en_chr2R_7041421_7041430           0.0009  3_[+1]_1
en_chr4_80105_80117_D              0.0009  4_[-1]_3
en_chrX_1996767_1996793_           0.0009  7_[+1]_14
en_chr2R_7044520_7044545           0.0009  4_[+1]_16
en_chr2R_7044683_7044718           0.0009  16_[+1]_14
en_chrX_1996582_1996641_           0.0012  10_[+1]_44
en_chr3R_12599326_125993           0.0012  10_[+1]_4
en_chrX_1996747_1996757_           0.0012  3_[+1]_2
en_chrX_1980287_1980378_           0.0015  15_[+1]_71
en_chr3R_12599167_125991           0.0015  1_[-1]_7
en_chrX_1981649_1981669_           0.0021  4_[+1]_11
en_chrX_1996383_1996439_           0.0022  51_[+1]
en_chr2R_7041389_7041401           0.0048  6_[+1]_1
en_chr2L_2471457_2471466           0.0048  3_[+1]_1
en_chr2L_2471427_2471434           0.0048  1_[+1]_1
en_chrX_1981450_1981495_           0.0048  7_[+1]_33
en_chrX_1980403_1980432_           0.0048  21_[+1]_3
en_chr4_80673_80681_D              0.0048  2_[+1]_1
en_chrX_1981537_1981622_           0.0056  28_[+1]_52
en_chr3R_2692105_2692143           0.0072  16_[-1]_17
en_chrX_1981511_1981523_           0.0072  6_[+1]_1
en_chr3R_12599235_125992           0.0081  9_[-1]_4
en_chrX_1981689_1981725_           0.0091  18_[-1]_13
en_chr3R_12526988_125269           0.0095  2_[-1]_2
en_chr3R_12526804_125268            0.015  2_[+1]_4
en_chrX_1996473_1996534_            0.021  36_[+1]_20
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=6 seqs=31
en_chr2R_7044643_7044664 (    3) TAATTG  1
en_chr2L_2471447_2471454 (    2) TAATTG  1
en_chrX_1996644_1996665_ (   15) TAATTG  1
en_chr2L_2471225_2471232 (    3) TAATTG  1
en_chr2R_7041482_7041501 (   12) TAATTG  1
en_chr2R_7041421_7041430 (    4) TAATTG  1
en_chr4_80105_80117_D    (    5) TAATTG  1
en_chrX_1996767_1996793_ (    8) TAATTG  1
en_chr2R_7044520_7044545 (    5) TAATTG  1
en_chr2R_7044683_7044718 (   17) TAATTG  1
en_chrX_1996582_1996641_ (   11) TGATTG  1
en_chr3R_12599326_125993 (   11) TGATTG  1
en_chrX_1996747_1996757_ (    4) TGATTG  1
en_chrX_1980287_1980378_ (   16) TAATGG  1
en_chr3R_12599167_125991 (    2) TAATGG  1
en_chrX_1981649_1981669_ (    5) TGCTTG  1
en_chrX_1996383_1996439_ (   52) TACTGG  1
en_chr2R_7041389_7041401 (    7) TAATTA  1
en_chr2L_2471457_2471466 (    4) TAATTA  1
en_chr2L_2471427_2471434 (    2) TAATTA  1
en_chrX_1981450_1981495_ (    8) TAATTA  1
en_chrX_1980403_1980432_ (   22) TAATTA  1
en_chr4_80673_80681_D    (    3) TAATTA  1
en_chrX_1981537_1981622_ (   29) TGATTA  1
en_chr3R_2692105_2692143 (   17) TCATTG  1
en_chrX_1981511_1981523_ (    7) AGATTG  1
en_chr3R_12599235_125992 (   10) TAATGA  1
en_chrX_1981689_1981725_ (   19) AAATGG  1
en_chr3R_12526988_125269 (    3) CGATGG  1
en_chr3R_12526804_125268 (    3) AAATTA  1
en_chrX_1996473_1996534_ (   37) AACTTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 6 n= 710 bayes= 5.18461 E= 6.9e-004
  -151   -202  -1160    118
   101   -202     79  -1160
   129    -44  -1160  -1160
 -1160  -1160  -1160    144
 -1160  -1160     56    113
   -19  -1160    237  -1160
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 6 nsites= 31 E= 6.9e-004
 0.129032  0.032258  0.000000  0.838710
 0.741935  0.032258  0.225806  0.000000
 0.903226  0.096774  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.193548  0.806452
 0.322581  0.000000  0.677419  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 regular expression
--------------------------------------------------------------------------------
T[AG]ATT[GA]
--------------------------------------------------------------------------------




Time  0.56 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
    Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
en_chrX_1996383_1996439_         2.04e-01  57
en_chr2R_7044683_7044718         5.41e-02  36
en_chr3R_12599167_125991         2.72e-02  14
en_chr2R_7044520_7044545         3.70e-02  26
en_chrX_1981511_1981523_         1.09e-01  13
en_chrX_1996747_1996757_         1.45e-02  11
en_chrX_1996767_1996793_         3.87e-02  27
en_chr4_80105_80117_D            1.42e-02  13
en_chr2R_7041421_7041430         8.93e-03  10
en_chr4_80673_80681_D            3.74e-02  9
en_chr2R_7041482_7041501         2.65e-02  20
en_chrX_1980287_1980378_         2.34e-01  92
en_chrX_1981537_1981622_         6.01e-01  86
en_chr3R_12599235_125992         2.03e-01  19
en_chrX_1981649_1981669_         6.44e-02  21
en_chr3R_12526804_125268         1.86e-01  12
en_chr2L_2471225_2471232         5.36e-03  8
en_chr3R_12526988_125269         9.13e-02  10
en_chrX_1996644_1996665_         3.00e-02  22
en_chr2L_2471447_2471454         5.36e-03  8
en_chrX_1996473_1996534_         9.12e-01  62
en_chrX_1980403_1980432_         2.12e-01  30
en_chr3R_12599326_125993         3.58e-02  20
en_chrX_1981450_1981495_         3.23e-01  46
en_chr2R_7044643_7044664         3.00e-02  22
en_chr3R_2692105_2692143         3.87e-01  39
en_chr2L_2471427_2471434         2.82e-02  8
en_chr2L_2471457_2471466         4.65e-02  10
en_chrX_1981689_1981725_         4.45e-01  37
en_chr2R_7041389_7041401         7.34e-02  13
en_chrX_1996582_1996641_         1.25e-01  60
en_chr2L_2471212_2471220         7.31e-01  9
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 1 reached.
********************************************************************************

CPU: jturatsi.scmbb.ulb.ac.be

********************************************************************************
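
NOTE (not part of the MEME output): the sketch below illustrates how the
three Motif 1 summaries above appear to relate to one another. It assumes
that each log-odds entry is 100 * log2(p / background), rounded to an
integer, and that the "(6.7 bits)" figure is the relative entropy of the
letter-probability matrix against the background frequencies. Both are
assumptions checked only against the numbers printed in this report; the
background values used here are the 3-decimal figures shown above, so an
entry may differ from the report by one unit. The names BACKGROUND, PSPM,
log_odds_row and information_content are hypothetical helpers, not MEME
identifiers.

```python
import math

LETTERS = "ACGT"

# Background letter frequencies as printed in the report (3 decimals).
BACKGROUND = {"A": 0.369, "C": 0.131, "G": 0.131, "T": 0.369}

# Letter-probability matrix for Motif 1, one row per motif position,
# columns in ALPHABET order (A, C, G, T).
PSPM = [
    [0.129032, 0.032258, 0.000000, 0.838710],
    [0.741935, 0.032258, 0.225806, 0.000000],
    [0.903226, 0.096774, 0.000000, 0.000000],
    [0.000000, 0.000000, 0.000000, 1.000000],
    [0.000000, 0.000000, 0.193548, 0.806452],
    [0.322581, 0.000000, 0.677419, 0.000000],
]


def log_odds_row(row):
    """Return round(100 * log2(p / background)) per letter.

    Zero probabilities have no finite log-odds; they are returned as None
    (the report prints -1160 in those cells).
    """
    scores = []
    for letter, p in zip(LETTERS, row):
        if p == 0.0:
            scores.append(None)
        else:
            scores.append(round(100 * math.log2(p / BACKGROUND[letter])))
    return scores


def information_content(row):
    """Relative entropy of one motif position versus the background, in bits."""
    return sum(
        p * math.log2(p / BACKGROUND[letter])
        for letter, p in zip(LETTERS, row)
        if p > 0.0
    )


log_odds = [log_odds_row(row) for row in PSPM]
total_bits = sum(information_content(row) for row in PSPM)

for position, scores in enumerate(log_odds, start=1):
    print(position, scores)
print(f"total information content: {total_bits:.1f} bits")
```

Under these assumptions the total comes out at about 6.7 bits, in line with
the "Information content" figure in the Motif 1 description, and the
finite log-odds entries agree with the position-specific scoring matrix to
within one unit.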