Gene Information

Name : ECO111_p3-23 (ECO111_p3-23)
Accession : YP_003237769.1
Strain :
Genome accession: NC_013366
Putative virulence/resistance : Virulence
Product : serine protease EspP
Function : -
COG functional category : S : Function unknown
COG ID : COG4625
EC number : -
Position : 18330 - 22178 bp
Length : 3849 bp
Strand : -
Note : contains autotransporter domain
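
The Position and Length fields above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (the usual GenBank convention, and an assumption here, not something the record states). The function name `feature_length` is illustrative only.

```python
def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


# Coordinates taken from the Position field of this record.
start, end = 18330, 22178

length_bp = feature_length(start, end)   # should match the Length field
codons = length_bp // 3                  # number of complete codons
protein_len = codons - 1                 # last codon is the stop codon

print(length_bp, codons, protein_len)
```

Under that assumption the result agrees with the listed length of 3849 bp, which corresponds to 1283 codons, or 1282 residues once the stop codon is excluded.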

DNA sequence :
ATGCTTATATTATGTTGTTTAGGAATATTAAGCCCAACATACACCTTTGCTTCTCGCATGGATGCATCTAATTTTTACAT
CCGCGATTACCTGGATTTTGCGCAAAACAAAGGCATATTTCAGGCTGGCGCAACAAATATTGAAATAATAAAAAAAGATG
GAAGTACACTAAAACTTCCGGAAGTACCGTTTCCTGACTTCTCTCCTGTAGCAAATAAAGGCTCAACCACATCTATTGGT
GGAGCATACAGTATCACTGCTACACATAATACAAAAACCCACCACTCCGTTGCAGCACAAAACTGGGGAAACTCTACATA
CAAACAGGTAGACTGGTACACTTCACAACCAGATTTTGCTGTATCCCGACTTGATAAGTACGTTGTTGAAACCAGAGGGG
CTACAGAAGGCGCAGATACGTCGTTATCAACACAGCAGGCTCTCGAACGTTATGGCGTTAATTATAAAGGCGAGAAGAAA
CTTATTGCATTCAGAGCCGGCTCAGGAAAAATTGGGATTAGAAAAGATGGAAATTTAACACCTCTGGATGATATCTCGTA
TAAGCCTGAAATGTTAAATGGCTCTTTCGTTCATATTGATAACTGGAGTGGATGGTTAGTGTTAACCAACAACCAGTTTG
ATGAGTTTAATAACCTTGCAACTCAGGGTGACAGTGGTTCAGCACTTTTTGTATATGACAACCAAAAGAAAAAGTGGGTT
GTCGTTGGAACCGCATGGGGCGTTTATAACTATTCAAATGGGAAAAACCATACAGCATATAGTAGATGGGATCAGAAAGC
CATTGATAATCTAAAAAAGAACTTTTCTTACAATGTGGATATGTCAGGAGTACAGCAAGTTACCATTGAAAATGGAAAAC
TGACAGGCACTGGCTCAGACACCACCGATATAAAAAATAAGGACTTAATATTTACAGGCGGTGGAAACATCCTCCTGAAA
TCCTCTTTTGATAATGGTGCTGGCGGTCTTGTCTTTAATGATAAAAAGACCTATCAAGTAAACGGGGAGGATTTCACCTT
TAAAGGTGCCGGTGTTGACACAAGAAACGGCAGCACCGTTGAGTGGAATATCCGGTATGATAATAAAGACAACCTTCACA
AGATTGGTAATGGTACATTAGATGTCCGAAAATCCCAGAACACCAACCTCAAAATAGGTGAAGGCCTGGTCATTCTTGGT
GCTGAAAAAACCTTCAACAATATCTATATGGCCAGTGGCGATGGTACCGTTAAGCTGAATGCCAATAACGCACTGAACGG
TGATGATTATGCCGGAGTTTTCTTTACTGAAAATGGAGGGACCCTTGATCTAAATGGTTATGACCAGGTATTTAAGTATA
TAGCCGCAACAGATTCAGGAACAACGATTACAAACACGAATAAAGACAAGCCAGCCACTTTATCTATAAATAATACAGAT
AGCTATATTTACCATGGCAATATCACGGGCAATACCAAAATTACACATTCATTCGACCAGAAACAACAGAATGATCGTCT
GATTCTGGATGGTAATATCAATACGACAAATGATATAAGCATAAAAAATGCTCACTTAGTGATGCAAGGGCATGCTACAA
ATCATGCTGTATTCAGGGATGGCGGATTTTCCTGTTCATTTCCCGCCCAGTTAAAATGGTTATGTGGCACAGATTATGTC
GCAGTAATCCAGAATCAGGAAAAATCGGTTAATCAAAAACAAAATACAGACTACAAAACTAATAACCTGGTCTCTGATTT
GTCTCAGCCTGACTGGGAAAGGAGAACATTTAACTTCGGCACCCTTCATCTTGAAAATGCGGATTTTGCTGTTGCACGCA
ATGCTGACGTGAAAGGTAATATTTATGCAAAAAACTCCAGCGTAATGCTTGGTAGTGATATCGCTTATATTGACCTTCAT
TCAGGAAAAAACATCATTAATGATGGATTTTCCTTCCGTAAGGATATTCGATCCGGAATATCAGAAAGCACAGCAGGGGA
TTTAAGTTCATTTACCGGACGTGTTGTTGCTGATAATTCAACGCTCGCTATTAACAATAAGTTCCTGGGAGAGTTTACGG
CAGAGAATAAGAGTAAGGTTTCGGTTAAAAGCCGTGATGTGGTCCTTAATACTGGTGCCACAATTTCAAATGACAGTACT
CTGACTCTGGAAAAAGACTCACGACTGACGGCTAATATGTGGCTTATGAACTCAGGCACAATTAACGTTGGTGAAAATGC
GGAGTTAAATATTCAGGGATATCCGATTGCTGATAAATTCATTCCGTCAATACATGACCTGGGGAATGTGAAAATGACGG
CGTCAAATGCCACCCTGACGGCAGGTAATTATGCAATGTTCAGCGGTGAACTCACTGCCGATAACGCCACTGCTGTCAGA
GTTAACCTGGGCTCAGAGACATCAACACTCTCTGAATTTAATCCTAATCCGGAACTGACCGACCTGATGTTTGATAAATA
TAATACATCATGGACCGGCAAAATCTCTGCCCTGAAAGGTGATGCATCAATGGTTAATACCGTATGGCGTATGACGGGAG
ATTCCGGGCTTAATACCCTCAAAACCAGCAAATCCCTGACGGTATTCAGCAGTGATAATAAATTCTCCACGCTGACAGTT
AATGACCTGACAACCAGTGACAGCACCTTTGTGCTGCGTTCTGACTCTACAGGCTCTGATAAGGTGGTAGTAAAAAACAA
ACTTGAGGGTAAAAACAATAATCTTCTTGTTGATTATGTTGCAAATGATGGAAAACATAATTCGCTAAATTTTGAACTGG
TCAGTGCACCGAAAGGAACTGCTGCCGATGTCTTTAATTCACAGACACAAAATGTGGGCTTCAGTGATGTAACACCGGTG
ATTGAACAGAAAGATAGCGGAGAGAAAACCACATGGACCCTGAAGGGATTCAATGCTGTAGCCAATCAGCAGTCAGCTGA
AAAAGCGGAAAATTTCATGTCAGCAGGCTATAAAAATTTTCTTGCTGAAGTCAACAACCTGAACAAACGTATGGGTGACC
TGCGTGACATCAACGGCGAAGCCGGTGCATGGGCACGCATCATGAGCGGTACCGGCTCTGCCAGTGGTGGTTTCAGTGAC
AACTACACGCACGTTCAGGTCGGGGTCGACAAAAAACATGAGCTGGACGGACTGGATTTGTTTACCGGTTTCACTGTCAC
ACACACTGACAGCAGTGCCTCCGCCGATGTTTTCAGCGGTAAAACGAAGTCTGTGGGGGCTGGCCTGTATGCTTCCGCCA
TGTTTGATTCCGGTGCCTATATCGACCTGATTGGCAAGTATGTTCACCATGATAATGAGTACACAGCAACCTTTGCCGGA
CTCGGAACCCGTGATTACAGCACGCATTCATGGTATGCCGGTGCAGAAGCGGGCTACCGCTATCATGTCACTGAGGATGC
CTGGATTGAGCCACAGGCTGAGCTGGTTTACGGTTCTGTATCCGGTAAACAGTTTGCATGGAAGGACCAGGGAATGCATC
TGTCCATGAAGGACAAGGACTACAATCCGCTGATTGGCCGAACCGGTGTAGATGTGGGTAAATCCTTCTCTGGTAAGGAC
TGGAAAGTGACAGCCCGGGCCGGCCTGGGCTACCAGTTCGACCTGCTGGCTAACGGCGAAACCGTATTGCGGGATGCATC
TGGTGAAAAACGCATCAAAGGTGAAAAGGACAGCCGTATGCTGATGTCCGTTGGCCTGAATGCAGAAATCAGGGACAACG
TCCGCTTTGGACTGGAGTTTGAGAAATCCGCCTTTGGTAAGTACAACGTTGATAATGCAGTCAACGCTAACTTCCGTTAC
TCGTTCTGA

Protein sequence :
MLILCCLGILSPTYTFASRMDASNFYIRDYLDFAQNKGIFQAGATNIEIIKKDGSTLKLPEVPFPDFSPVANKGSTTSIG
GAYSITATHNTKTHHSVAAQNWGNSTYKQVDWYTSQPDFAVSRLDKYVVETRGATEGADTSLSTQQALERYGVNYKGEKK
LIAFRAGSGKIGIRKDGNLTPLDDISYKPEMLNGSFVHIDNWSGWLVLTNNQFDEFNNLATQGDSGSALFVYDNQKKKWV
VVGTAWGVYNYSNGKNHTAYSRWDQKAIDNLKKNFSYNVDMSGVQQVTIENGKLTGTGSDTTDIKNKDLIFTGGGNILLK
SSFDNGAGGLVFNDKKTYQVNGEDFTFKGAGVDTRNGSTVEWNIRYDNKDNLHKIGNGTLDVRKSQNTNLKIGEGLVILG
AEKTFNNIYMASGDGTVKLNANNALNGDDYAGVFFTENGGTLDLNGYDQVFKYIAATDSGTTITNTNKDKPATLSINNTD
SYIYHGNITGNTKITHSFDQKQQNDRLILDGNINTTNDISIKNAHLVMQGHATNHAVFRDGGFSCSFPAQLKWLCGTDYV
AVIQNQEKSVNQKQNTDYKTNNLVSDLSQPDWERRTFNFGTLHLENADFAVARNADVKGNIYAKNSSVMLGSDIAYIDLH
SGKNIINDGFSFRKDIRSGISESTAGDLSSFTGRVVADNSTLAINNKFLGEFTAENKSKVSVKSRDVVLNTGATISNDST
LTLEKDSRLTANMWLMNSGTINVGENAELNIQGYPIADKFIPSIHDLGNVKMTASNATLTAGNYAMFSGELTADNATAVR
VNLGSETSTLSEFNPNPELTDLMFDKYNTSWTGKISALKGDASMVNTVWRMTGDSGLNTLKTSKSLTVFSSDNKFSTLTV
NDLTTSDSTFVLRSDSTGSDKVVVKNKLEGKNNNLLVDYVANDGKHNSLNFELVSAPKGTAADVFNSQTQNVGFSDVTPV
IEQKDSGEKTTWTLKGFNAVANQQSAEKAENFMSAGYKNFLAEVNNLNKRMGDLRDINGEAGAWARIMSGTGSASGGFSD
NYTHVQVGVDKKHELDGLDLFTGFTVTHTDSSASADVFSGKTKSVGAGLYASAMFDSGAYIDLIGKYVHHDNEYTATFAG
LGTRDYSTHSWYAGAEAGYRYHVTEDAWIEPQAELVYGSVSGKQFAWKDQGMHLSMKDKDYNPLIGRTGVDVGKSFSGKD
WKVTARAGLGYQFDLLANGETVLRDASGEKRIKGEKDSRMLMSVGLNAEIRDNVRFGLEFEKSAFGKYNVDNAVNANFRY
SF
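
The relationship between the DNA and protein sequences above can be illustrated with a minimal translation sketch. It assumes the standard genetic code and assumes the DNA is listed in coding orientation, which is inferred from the sequence starting with ATG and ending with TGA even though the Strand field is "-". The names `CODON_TABLE` and `translate` are illustrative and do not come from the database.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop line breaks and whitespace
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":                   # stop codon ends the protein
            break
        residues.append(aa)
    return "".join(residues)


# First 30 bases and last 9 bases of the DNA sequence in this record.
print(translate("ATGCTTATATTATGTTGTTTAGGAATATTA"))
print(translate("TCGTTCTGA"))
```

Applied to the opening and closing bases of the listed gene, the sketch reproduces the first ten residues (MLILCCLGIL) and the final two residues (SF) of the protein sequence.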

• Homologs from PAI DB

Gene  GenBank Accn    Product                        Virulence or Resistance  PAI or REI  Alignment Type  E-val  Identity (%)
sigA  NP_838462.1     serine protease                Virulence                SHI-1       Protein         0.0    53
sigA  NP_708742.1     serine protease                Virulence                SHI-1       Protein         0.0    53
sigA  AAF67320.1      exported serine protease SigA  Virulence                SHI-1       Protein         0.0    53
sat   YP_002414040.1  Serine protease                Not tested               Not named   Protein         0.0    53
espC  AAG37043.1      enterotoxin EspC               Virulence                espC PAI    Protein         0.0    52

• Homologs from VFDB (virulence genes)

Gene          GenBank Accn    Product               ID of source DB  Alignment Type  E-val  Identity (%)
ECO111_p3-23  YP_003237769.1  serine protease EspP  VFG0844          Protein         0.0    70
ECO111_p3-23  YP_003237769.1  serine protease EspP  VFG0862          Protein         0.0    55
ECO111_p3-23  YP_003237769.1  serine protease EspP  VFG0630          Protein         0.0    53
ECO111_p3-23  YP_003237769.1  serine protease EspP  VFG0902          Protein         0.0    53
ECO111_p3-23  YP_003237769.1  serine protease EspP  VFG0772          Protein         0.0    52