Gene Information

Name : ECO103_3805 (ECO103_3805)
Accession : YP_003223650.1
Strain : Escherichia coli 12009
Genome accession: NC_013353
Putative virulence/resistance : Virulence
Product : secreted autotransporter serine protease
Function : -
COG functional category : M : Cell wall/membrane/envelope biogenesis
COG ID : COG3468
EC number : -
Position : 3873591 - 3877682 bp
Length : 4092 bp
Strand : +
Note : Integrative element ECO103_IE04
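
The Position and Length fields above can be cross-checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (an assumption, though it agrees with the stated length) and that the final codon is a stop codon that is not translated. The function name `feature_length` is purely illustrative.

```python
def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the Position field of this record.
start, end = 3873591, 3877682

length_bp = feature_length(start, end)   # should agree with the Length field
codons = length_bp // 3                  # complete codons, including the stop
residues = codons - 1                    # assumes the last codon is a stop

print(length_bp, codons, residues)
```

Under these assumptions the coding region spans 4092 bp, or 1364 codons, which corresponds to a 1363-residue protein.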

DNA sequence :
ATGAATAAAATATACGCGCTTAAATATAGCTCCCTTACTGGTGGGCTTATAGCTGTGTCAGAATTAAGTAAGAAGGTCAC
AGGAAAAACCGGCAGAAGATTAATGACGGTTTCCCTGGTATTATCAGTGACTCTTTCTACTTTACCGGGTAAAGCATCAA
CGGTCAGCGCAGAAATACCATATCAGACTTTTCGTGACTTTGCTGAAAATAAAGGTGTGTTTACCCCCGGAGTGACAGGA
ATTGAAATAAACGACAACAATGGAAATAAAGTTGGGGTTCTTGATGTTCCCATGCTTGATTTTTCCAGTCTTTCTCGTGA
TGGTCATACCACATTGATTCATCCTGGTTATGTTGTATCGGCTAAACATGGTGGTTTACAAAGTGTTTCATCAGCAACTT
TTGGTTATGACCAGATATATAAAATAGTTGATAATAACCTTGCTGGTATAGATTTTTCTGCCCCACGATTAAATAAGCTT
GTTACAGAAGTAATTCCCGCAGATATACAGGGAAAGGATAAATTTAATAATAACCGGTATACGGCTTTTTACCGTGCGGG
CGTTGGCTCTCAATATATCCGTTATGCAAATGGCACAGATAAACTACTGCAGGCTTACACTCCAGAAAAGGCTTATCTGA
CCGGCGGAACAGTGGGGAAACCTTATTATACTCACTATAACGGTATGAAGATGATTTCAGCAAACCCGGGAAATACCTTT
GATAAAAACCAGGGACCTCTTGCCAGTTATGGACAGAGTGGAGACAGTGGTTCGCCGTTATATGCCTGGGATAACATTGA
CAAAAAATGGGTATTAGCTGGAGTTACTCTGCATAATTATGGAGTAAAAGGTGCACGAAATGACTGGCTCCTGATACCTC
ATGACTTTATCAGTCAAAAATTACAGGATGACCTTAAACCAATTATTGTTGCTTCTCCGGAGGAGAATATCTTACGCTGG
GAATTTGATCGTTCCAGAGGTACAGGTACTCTCAGTCAGGGAGAGAAGATTTTTTCCATGACTGGTAGTGTAAACGGAAA
TGCAAATACCGGGAACAATCTTGTGTTCTCAGGTAATGAAGGGAAAATCGAACTGGTATCCAGTGTGGAACAGGGAGCCG
GATATCTTCAGTTTGATAAAGACTACACTGTACTGACAAATAATAACAGTACATGGACTGGTGCCGGAATTATTGTCGGT
GACGAGGCAAATGTCAAATGGGGAGTTAATGGCATTGCCGGTGATAATCTGCATAAAGTTGGTTCCGGGACATTAACTGT
TAATGGTCATGGTGAGAATAAAGGTGGCCTTAAAGTTGGGGACGGCGTTGTTGTTCTTGAGCAACAGCCAGATGCAAACC
AGAAACAACAGGCGTTTAGCCATATCAATATAGCCAGTGGTCGGGCAACAGTTAAACTTAACGGGGCAAACCAGGTAGAT
GCAGATAATATCAGCTGGGGATATCGGGGTGGGAAACTGGATTTAAATGGGTATGATTTTACCTTTTCCCGTCTTCAGGC
TGCAGATTATGGTGCTGAAATCAGCAACGATAATCAGACAGATAAATCCATAGTCACACTTTCGTTATCTCCTCTGAAAG
CAGAAGAAATAAATGTGGTTGTTAATAATATAAATATAATGGGGGGGACAGGCAAACCAGGTGATCTGTATTATACGACC
TTTGACGGAAATTATTATCTGTTGAAAAGTAACCGATATGGCAGCGCTTTGTTTGGCGCGCTGAATAATCAGAGCGAATG
GCAAAGGCTGGGTAAGGATAAAGAAAAAGCAATTGGGTTATATACTCAGATGAAAATGCAGGAAAGCGCTCCTTTATCAT
ATATATATCATGGAAAAATAACCGGTAATACCAGTGTGGAAATCCCCAAACTGGCAGGCAATGATATTTTAACGCTTGAT
GGCTCTGTCAGTATATCAGGAGATATGTCAAAACAGGACGGTGCTCTTATCTTCCAGGGACACCCTGTTATTCATGCAGG
GCAAACTGTTTCTGCATCGCAGAGTGACTGGGAGAACAGGGAGTTCTCACTCAACAATCTGAATCTTAATAATGTGGACT
TCAGTCTGTCCCGTAATGCATTTATGAACGGGAATATCAGGGCCGTTAACCAGAGCACTGTTATTATCGGCGGAGATACA
GTCTTTACTGATAAAAATGACGGAACAGGTAATGATGTCATCAGTGTTGAAGGGAAATCTGCTGCCTCAGGAACATCCTC
CTATACAGGGCATATCACTCTGGAGCAAAAATCAGCACTGGATATCCGCGATAATTTTCGTGGCGGGGTTACGTCTGAAG
ACAGTCATATCAATGTTTCTTCATCTTCAGTCCTGTTCTCAGATGCATCGTCATTTATAAACAGCTCCCTGAATATTCAT
AAAGGAGGTGCGCTGACCGCTCAGGGAGGGCTGTTTACAAGTGGAAGCATTGATATTGGTGACGCTTCCCTTCTGCTTAC
CGGTACACCAGTGAATTCAGATGATGCTGCTTTTTTACCGACCATCAATATGGCTGATGGCGGATTTAAACTGATGTCTG
ATTCATCAGTACTGAAAGCCAGAGACCAGGCATCTGTTGTTGGTGATATTATTTCTGATAAACAGGCCACAATCAGCTTC
GGAACTGAATCAGGTAAAGAGGGCATATTATCTGAGAAGGCATCCCGGGGACTCGCGGTAGGATTACTGAGTGGTTTTAA
TACGGCATACCGCGGTGCAATTCATGCCCCGTCAGCATCTGCCACTATGAACAACACCTGGTGGCAACTGACAGGAGACT
CCTCACTTCGCTCGTTAAAAAATACCGGAAGCATGACATATTTTACAGGAAGTGCAGCGAATAAAGCATTCCATACACTG
ACGGTTGATGAGCTGACGACGAATGGCACTGCGTATGCCATGCGTACGGACCTGAAAAATGCGGATAAGCTGGTAGTAAA
CCAAAAGCTGTCAGGTAAGGATAATATTCTGCTGGTTGATTTTCTGAACAAACCCACCGGAGAAAAACTGGATATTGAAC
TGGTGAGTGCACCGGGGAACAGCAGTAAGGATGTTTTTAAAGGAAGTGAACAGGAAATAGGTTTCAGCAATGTCACACCT
GTTATTACAGCTATAGACGCCGGAGATAAAACAACATGGAACCTGACCGGGTACAGGATGGCAGAAAATCCTGCCGCAAC
CCAAAGTGCCTCAGGCCTTGCATCTGTGGGGTACAAATCATTTTTGAGTGAGGTCAACAACCTGAATAAACGTATGGGTG
ACCTGCGTGACATCAATGGTGAAGCTGGCGCATGGGCACGTATCATGAGCGGAACCGGCTCTGCCGGTGGTGGTTTCAGT
GACAACCACACACATGTTCAGGTCGGTGTCGACAAAAAACATGAGCTGGACGGACTGGATTTGTTTACCGGCTTCACTGT
CACACACACTGACAGCAGTGCCTCCGCCGATGCTTTCAAAGGTAAAACAAAATCTGTGGGGGCCGGACTCTATGCTTCCG
CCATGTTTGATTCCGGTGCCTATATCGACCTGATTGGTAAGTATGTTCATCATGATAATGAGTACACCGCAACCTTTGCC
GGACTCGGAACCCGTGATTACAGTACGCATTCATGGTATGCCGGTGCTGAAGCAGGCTACCGCTGTCATGTCACTGAGGA
TACCTGGATTGAGCCACAGGCAGAACTGGTTTACGGTGCTGTATCCGGTAAACAGTTTGCATGGAAGGACCAGGGGATGC
ATCTGTCCATGAAGGACAGGGACTACAATCCGCTGATTGGTCGTACCGGTGTGGATGTGGGTAAATCCTTCTCAGGTAAG
GACTGGAAAGTGACAGCCCGTGCCGGTCTGGGCTACCAGTTCGACCTGCTGGCTAACGGCGAAACCGTATTGCGGGACGC
ATCAGGTGAAAAACGTATCAAAGGTGAAAAAGACAGCCGTATGCTGATGTCCGTTGGCCTGAATGCAGAAATCAGGGACA
ACGTCCGCTTTGGACTGGAGTTTGAGAAATCCGCCTTTGGTAAGTACAACGTTGATAATGCTGTCAACGCTAACTTCCGT
TACTCGTTCTGA

Protein sequence :
MNKIYALKYSSLTGGLIAVSELSKKVTGKTGRRLMTVSLVLSVTLSTLPGKASTVSAEIPYQTFRDFAENKGVFTPGVTG
IEINDNNGNKVGVLDVPMLDFSSLSRDGHTTLIHPGYVVSAKHGGLQSVSSATFGYDQIYKIVDNNLAGIDFSAPRLNKL
VTEVIPADIQGKDKFNNNRYTAFYRAGVGSQYIRYANGTDKLLQAYTPEKAYLTGGTVGKPYYTHYNGMKMISANPGNTF
DKNQGPLASYGQSGDSGSPLYAWDNIDKKWVLAGVTLHNYGVKGARNDWLLIPHDFISQKLQDDLKPIIVASPEENILRW
EFDRSRGTGTLSQGEKIFSMTGSVNGNANTGNNLVFSGNEGKIELVSSVEQGAGYLQFDKDYTVLTNNNSTWTGAGIIVG
DEANVKWGVNGIAGDNLHKVGSGTLTVNGHGENKGGLKVGDGVVVLEQQPDANQKQQAFSHINIASGRATVKLNGANQVD
ADNISWGYRGGKLDLNGYDFTFSRLQAADYGAEISNDNQTDKSIVTLSLSPLKAEEINVVVNNINIMGGTGKPGDLYYTT
FDGNYYLLKSNRYGSALFGALNNQSEWQRLGKDKEKAIGLYTQMKMQESAPLSYIYHGKITGNTSVEIPKLAGNDILTLD
GSVSISGDMSKQDGALIFQGHPVIHAGQTVSASQSDWENREFSLNNLNLNNVDFSLSRNAFMNGNIRAVNQSTVIIGGDT
VFTDKNDGTGNDVISVEGKSAASGTSSYTGHITLEQKSALDIRDNFRGGVTSEDSHINVSSSSVLFSDASSFINSSLNIH
KGGALTAQGGLFTSGSIDIGDASLLLTGTPVNSDDAAFLPTINMADGGFKLMSDSSVLKARDQASVVGDIISDKQATISF
GTESGKEGILSEKASRGLAVGLLSGFNTAYRGAIHAPSASATMNNTWWQLTGDSSLRSLKNTGSMTYFTGSAANKAFHTL
TVDELTTNGTAYAMRTDLKNADKLVVNQKLSGKDNILLVDFLNKPTGEKLDIELVSAPGNSSKDVFKGSEQEIGFSNVTP
VITAIDAGDKTTWNLTGYRMAENPAATQSASGLASVGYKSFLSEVNNLNKRMGDLRDINGEAGAWARIMSGTGSAGGGFS
DNHTHVQVGVDKKHELDGLDLFTGFTVTHTDSSASADAFKGKTKSVGAGLYASAMFDSGAYIDLIGKYVHHDNEYTATFA
GLGTRDYSTHSWYAGAEAGYRCHVTEDTWIEPQAELVYGAVSGKQFAWKDQGMHLSMKDRDYNPLIGRTGVDVGKSFSGK
DWKVTARAGLGYQFDLLANGETVLRDASGEKRIKGEKDSRMLMSVGLNAEIRDNVRFGLEFEKSAFGKYNVDNAVNANFR
YSF
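
As a rough consistency check between the two sequences above, the start and end of the DNA sequence can be translated and compared with the start and end of the protein sequence. This is a minimal sketch that assumes the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and not part of any database tooling. Only the first 30 and last 12 nucleotides of the DNA sequence are used here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


first_30 = "ATGAATAAAATATACGCGCTTAAATATAGC"  # start of the DNA sequence
last_12 = "TACTCGTTCTGA"                     # end of the DNA sequence

print(translate(first_30))  # compare with the start of the protein sequence
print(translate(last_12))   # compare with the end of the protein sequence
```

The first fragment translates to MNKIYALKYS and the last to YSF followed by a stop, matching the beginning and end of the protein sequence listed above.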

• Homologs from PAI DB

Gene | GenBank Accn | Product | Virulence or Resistance | PAI or REI | Alignment Type | E-value | Identity (%)
unnamed | CAC39286.1 | hypothetical protein | Not tested | LPA | Protein | 0.0 | 99
pic | NP_838464.1 | serine protease precursor | Virulence | SHI-1 | Protein | 0.0 | 54
she | AAB58244.1 | mucinase | Virulence | SHI-1 | Protein | 0.0 | 52
pic | NP_708747.3 | serine protease | Not tested | SHI-1 | Protein | 0.0 | 52
pic | AAK00464.1 | Pic | Virulence | SHI-1 | Protein | 0.0 | 52
unnamed | CAD66214.1 | putative hemoglobin protease | Not tested | PAI III 536 | Protein | 0.0 | 44
vat | YP_851472.1 | vacuolating autotransporter | Not tested | PAI III APEC-O1 | Protein | 0.0 | 44
vat | AAO21903.1 | vacuolating autotransporter toxin | Virulence | Not named | Protein | 0.0 | 43

• Homologs from VFDB (virulence genes)

Gene | GenBank Accn | Product | ID of source DB | Alignment Type | E-value | Identity (%)
ECO103_3805 | YP_003223650.1 | secreted autotransporter serine protease | VFG0903 | Protein | 0.0 | 53
ECO103_3805 | YP_003223650.1 | secreted autotransporter serine protease | VFG0861 | Protein | 0.0 | 52
ECO103_3805 | YP_003223650.1 | secreted autotransporter serine protease | VFG0635 | Protein | 0.0 | 52
ECO103_3805 | YP_003223650.1 | secreted autotransporter serine protease | VFG0844 | Protein | 0.0 | 47
ECO103_3805 | YP_003223650.1 | secreted autotransporter serine protease | VFG1689 | Protein | 0.0 | 44
ECO103_3805 | YP_003223650.1 | secreted autotransporter serine protease | VFG0904 | Protein | 0.0 | 44