Gene Information

Name : SbBS512_E4650 (SbBS512_E4650)
Accession : YP_001882815.1
Strain : Shigella boydii CDC 3083-94
Genome accession: NC_010658
Putative virulence/resistance : Virulence
Product : serine protease EatA
Function : -
COG functional category : S : Function unknown
COG ID : COG4625
EC number : 3.4.21.-
Position : 4343979 - 4347836 bp
Length : 3858 bp
Strand : +
Note : identified by match to protein family HMM PF02395; match to protein family HMM PF03797; match to protein family HMM TIGR01414
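
The Position and Length fields above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (which is consistent with the reported length) and that the gene is a complete coding sequence ending in one stop codon; the function names are illustrative and not part of the database.

```python
# Values copied from the Position and Length fields of this record.
start, end = 4343979, 4347836
reported_length_bp = 3858


def span_length(first: int, last: int) -> int:
    """Length of a 1-based, inclusive interval (assumed convention)."""
    return last - first + 1


def protein_length(cds_bp: int) -> int:
    """Residues encoded by a complete CDS, not counting the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


gene_bp = span_length(start, end)
residues = protein_length(gene_bp)

print(gene_bp == reported_length_bp)  # True
print(residues)                       # 1285
```

The 1285 residues agree with the protein sequence listed further down (16 lines of 80 residues plus a final line of 5).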

DNA sequence :
ATGAATAAAATTTATTCACTGAAATATAGTCATATTACAGGTGGATTAGTTGCTGTTTCTGAACTGACCCGGAAAGTTAG
TGTCGGTACATCAAGAAAGAAAGTTATCCTCGGTATTATTTTATCCTCAATATATGGAAGTTATGGCGAAACAGCATTTG
CAGCAATGCTGGATATAAATAATATATGGACCCGCGATTATCTTGACCTTGCTCAAAACAGAGGAGAGTTCAGACCGGGT
GCAACAAATGTTCAATTAATGATGAAAGATGGAAAGATATTTCATTTTCCAGAACTACCTGTACCTGATTTTTCTGCTGT
TTCCAACAAAGGTGCAACAACATCAATTGGAGGTGCGTACAGTGTTACTGCGACTCATAACGGTACACAGCATCATGCAA
TAAAAACACAGTCATGGGATCAGACAGCATATAAAGCAAGTAACAGAGTATCATCTGGCGACTTTTCGGTTCATCGTCTG
AATAAATTCGTCGTGGAAACAACAGGGGTTACGGAGAGTGCCGACTTCTCACTTTCTCCCGAAGATGCGATGAAAAGATA
TGGCGTAAACTACAACGGTAAGGAACAAATAATTGGCTTCAGAGCAGGTGCCGGAACAACCTCAACGATATTAAACGGCA
AACAATATCTGTTTGGACAAAACTATAATCCCGACTTGTTAAGCGCAAGTCTTTTTAATCTGGACTGGAAAAACAAGAGT
TACATTTATACCAACAGAACCCCTTTTAAAAACTCACCAATTTTTGGCGATAGTGGTTCTGGTTCTTATCTATATGATAA
AGAACAACAAAAATGGGTTTTCCATGGTGTTACCAGTACAGTTGGTTTTATCAGTAGTACCAATATAGCCTGGACAAACT
ACTCGTTATTTAATAATATTCTGGTAAACAATTTAAAAAAGAATTTCACAAACACTATGCAGCTGGATGGTAAAAAACAA
GAGTTATCATCGATTATAAAAGATAAGGACCTGTCTGTCTCAGGAGGAGGGGAATTAACGCTCAAGCAGGATACCGATCT
TGGCATTGGCGGGCTTATATTCGATAAGAACCAGACATATAAAGTGTACGGAAAAGATAAGTCTTATAAAGGTGCCGGGA
TAGATATTGATAATAATACCACCGTTGAATGGAATGTTAAGGGCGTTGCCGGAGATAATCTGCATAAAATAGGTAGTGGT
ACTCTGGATGTAAAAATAGCACAGGGAAATAACCTTAAAATAGGTAATGGGACTGTCATCCTTAGTGCTGAAAAAGCCTT
CAATAAAATTTACATGGCCGGAGGTAAAGGTACGGTAAAAATAAATGCCAAAGACGCTTTAAGCGAAAGCGGTAATGGCG
AAATCTATTTTACCAGAAATGGCGGAACACTGGATCTAAACGGCTATGACCAGTCATTTCAGAAAATCGCAGCAACAGAT
GCGGGAACAACCGTAACGAACTCAAACGTGAAGCAATCAACATTATCACTTACTAATACTGATGCATATATGTACCATGG
GAATGTATCAGGTAATATAAGCATAAATCATATTATCAATACTACCCAGCAACATAACAATAATGCCAATCTGATCTTTG
ATGGCTCAGTCGATATCAAAAACGATATCTCTGTCCGGAATGCACAGTTAACATTACAAGGACATGCGACAGAACATGCC
ATATTTAAAGAAGGCAATAACAACTGTCCAATTCCTTTTTTATGTCAAAAAGATTATTCTGCTGCCATAAAGGACCAGGA
AAGCACTGTAAATAAACGTTACAATACGGAATATAAGTCCAACAATCAGATAGCCTCTTTTTCCCAGCCCGACTGGGAAA
GTCGTAAATTTAATTTCCGGAAATTAAATTTAGAAAACGCAACCCTGAGTATAGGCCGGGATGCTAATGTAAAAGGACAC
ATAGAGGCTAAAAACTCTCAAATTGTTCTGGGAAATAAAACTGCATACATTGACATGTTCTCAGGAAGAAACATTACTGG
CGAAGGTTTTGGATTCAGACAACAGCTTCGCTCCGGGGATTCAGCAGGCGAAAGTAGTTTCAACGGCAGTCTGAGTGCTC
AAAACAGCAAAATAACTGTTGGTGATAAATCAACTGTTACTATGACTGGTGCATTATCCTTAATTAATACAGACCTGATT
ATCAACAAAGGAGCTACTGTTACCGCCCAGGGAAAAATGTATGTAGATAAAGCTATTGAACTGGCCGGAACCCTGACATT
AACAGGCACCCCTACAGAAAATAATAAATACAGCCCGGCAATCTATATGTCAGATGGATATAATATGACAGAAGATGGTG
CCACGTTAAAGGCTCAAAATTATGCCTGGGTCAATGGTAATATAAAATCAGACAAAAAAGCATCTATTCTGTTTGGTGTT
GACCAGTATAAAGAAGATAACCTGGACAAAACCACACACACACCGCTGGCTACAGGTTTGCTGGGTGGCTTTGATACTTC
TTATACCGGAGGTATTGATGCTCCTGCTGCCTCAGCCAGCATGTATAACACCTTATGGAGAGTAAACGGACAGTCAGCCC
TGCAATCATTAAAAACCCGCGACAGTCTTTTGTTGTTTAGTAACATAGAGAATTCGGGTTTCCATACTGTGACTGTAAAC
ACACTGGATGCCACTAATACTGCTGTGATTATGCGGGCTGATCTGAGCCAGTCTGTAAATCAATCGGATAAACTCATTGT
TAAAAATCAGTTAACCGGACGCAATAACAGTCTGTCGGTCGATATACAGAAAGTGGGAAATAATAACTCAGGATTAAACG
TTGACCTGATAACAGCCCCAAAAGGAAGCAATAAAGAGATATTTAAAGCCAGTACTCAGGCCATAGGTTTCAGCAACATA
TCTCCTGTGATCAGCACGAAAGAGGATCAGGAACATACCACGTGGACCCTGACCGGATATAAGGTGGCTGAAAATACAGC
ATCTTCCAGTGCAGCAAAATCGTATATGTCCGGTAATTACAAAGCCTTCCTGACAGAAGTCAACAACCTGAATAAACGAA
TGGGGGATCTGCGTGACACCAATGGCGAGGCCGGTGCATGGGCCCGCATCATGAGCGGAGCAGGTTCAGCTTCTGGTGGA
TACAGTGACAACTACACCCATGTGCAGATTGGTGTGGATAAAAAACATGAGCTGGATGGACTTGACCTTTTCACTGGTCT
GACTATGACGTATACCGACAGTCATGCCAGCAGTAATGCATTCAGTGGCAAGACGAAGTCCGTCGGGGCAGGTCTGTATG
CTTCCGCTATATTTGACTCTGGTGCCTATATCGACCTGATTAGTAAGTATGTTCACCATGATAATGAGTACTCGGCGACC
TTTGCTGGGCTCGGAACAAAAGACTACAGTTCTCATTCCTTGTATGTGGGTGCTGAAGCAGGCTACCGCTATCATGTAAC
AGAAGACTCCTGGATTGAGCCGCAGGCAGAACTGGTTTATGGGGCCGTATCAGGTAAACGGTTCGACTGGCAGGATCGCG
GAATGAGCGTGACCATGAAGGATAAGGACTTTAATCCGCTGATTGGGCGTACCGGTGTTGATGTGGGTAAATCCTTCTCC
GGTAAGGACTGGAAAGTCACAGCCCGCGCCGGCCTTGGCTACCAGTTTGACCTGTTTGCCAACGGTGAAACCGTACTGCG
TGATGCGTCCGGTGAGAAACGTATCAAAGGTGAAAAAGACGGTCGTATTCTCATGAATGTTGGTCTCAACGCCGAAATTC
GCGATAATCTTCGCTTCGGTCTTGAGTTTGAGAAATCGGCATTTGGTAAATACAACGTGGATAACGCGATCAACGCCAAC
TTCCGTTACTCTTTCTGA

Protein sequence :
MNKIYSLKYSHITGGLVAVSELTRKVSVGTSRKKVILGIILSSIYGSYGETAFAAMLDINNIWTRDYLDLAQNRGEFRPG
ATNVQLMMKDGKIFHFPELPVPDFSAVSNKGATTSIGGAYSVTATHNGTQHHAIKTQSWDQTAYKASNRVSSGDFSVHRL
NKFVVETTGVTESADFSLSPEDAMKRYGVNYNGKEQIIGFRAGAGTTSTILNGKQYLFGQNYNPDLLSASLFNLDWKNKS
YIYTNRTPFKNSPIFGDSGSGSYLYDKEQQKWVFHGVTSTVGFISSTNIAWTNYSLFNNILVNNLKKNFTNTMQLDGKKQ
ELSSIIKDKDLSVSGGGELTLKQDTDLGIGGLIFDKNQTYKVYGKDKSYKGAGIDIDNNTTVEWNVKGVAGDNLHKIGSG
TLDVKIAQGNNLKIGNGTVILSAEKAFNKIYMAGGKGTVKINAKDALSESGNGEIYFTRNGGTLDLNGYDQSFQKIAATD
AGTTVTNSNVKQSTLSLTNTDAYMYHGNVSGNISINHIINTTQQHNNNANLIFDGSVDIKNDISVRNAQLTLQGHATEHA
IFKEGNNNCPIPFLCQKDYSAAIKDQESTVNKRYNTEYKSNNQIASFSQPDWESRKFNFRKLNLENATLSIGRDANVKGH
IEAKNSQIVLGNKTAYIDMFSGRNITGEGFGFRQQLRSGDSAGESSFNGSLSAQNSKITVGDKSTVTMTGALSLINTDLI
INKGATVTAQGKMYVDKAIELAGTLTLTGTPTENNKYSPAIYMSDGYNMTEDGATLKAQNYAWVNGNIKSDKKASILFGV
DQYKEDNLDKTTHTPLATGLLGGFDTSYTGGIDAPAASASMYNTLWRVNGQSALQSLKTRDSLLLFSNIENSGFHTVTVN
TLDATNTAVIMRADLSQSVNQSDKLIVKNQLTGRNNSLSVDIQKVGNNNSGLNVDLITAPKGSNKEIFKASTQAIGFSNI
SPVISTKEDQEHTTWTLTGYKVAENTASSSAAKSYMSGNYKAFLTEVNNLNKRMGDLRDTNGEAGAWARIMSGAGSASGG
YSDNYTHVQIGVDKKHELDGLDLFTGLTMTYTDSHASSNAFSGKTKSVGAGLYASAIFDSGAYIDLISKYVHHDNEYSAT
FAGLGTKDYSSHSLYVGAEAGYRYHVTEDSWIEPQAELVYGAVSGKRFDWQDRGMSVTMKDKDFNPLIGRTGVDVGKSFS
GKDWKVTARAGLGYQFDLFANGETVLRDASGEKRIKGEKDGRILMNVGLNAEIRDNLRFGLEFEKSAFGKYNVDNAINAN
FRYSF
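
The protein sequence above corresponds to the DNA sequence read in codons from the first base. A minimal sketch of that translation follows, assuming the standard codon assignments; the helper name `translate` is illustrative. Only the first 21 and last 18 bases of the DNA sequence are used here to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()      # drop line breaks and spaces
    usable = len(dna) - len(dna) % 3        # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


first_residues = translate("ATGAATAAAATTTATTCACTG")  # start of the DNA sequence
last_residues = translate("TTCCGTTACTCTTTCTGA")      # final line of the DNA sequence

print(first_residues)  # MNKIYSL
print(last_residues)   # FRYSF*
```

The two results match the beginning (MNKIYSL) and the end (FRYSF) of the protein sequence, with the trailing `*` standing for the TGA stop codon.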

• Homologs from PAI DB

Gene  GenBank Accn    Product                        Virulence or Resistance  PAI or REI  Alignment Type  E-val   Identity
sigA  NP_838462.1     serine protease                Virulence                SHI-1       Protein         0.0     99
sigA  NP_708742.1     serine protease                Virulence                SHI-1       Protein         0.0     99
sigA  AAF67320.1      exported serine protease SigA  Virulence                SHI-1       Protein         0.0     99
sat   YP_002414040.1  Serine protease                Not tested               Not named   Protein         0.0     55
espC  AAG37043.1      enterotoxin EspC               Virulence                espC PAI    Protein         0.0     54
pic   NP_838464.1     serine protease precursor      Virulence                SHI-1       Protein         2e-172  43
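
The homolog rows above can be read programmatically even though several fields contain spaces, because the gene and accession are always the first two tokens and the E-value and identity are always the last two. The sketch below relies on that observation only; the function name `parse_hit` and the identity cutoff of 90 are hypothetical choices for illustration, and the Identity column is presumed to be a percentage.

```python
# Three rows copied from the PAI DB homolog table in this record.
ROWS = """\
sigA NP_838462.1 serine protease Virulence SHI-1 Protein 0.0 99
sat YP_002414040.1 Serine protease Not tested Not named Protein 0.0 55
espC AAG37043.1 enterotoxin EspC Virulence espC PAI Protein 0.0 54
"""


def parse_hit(line: str) -> dict:
    """Pull the unambiguous fields out of one whitespace-separated row."""
    tokens = line.split()
    return {
        "gene": tokens[0],
        "accession": tokens[1],
        "evalue": float(tokens[-2]),   # also handles values such as 2e-172
        "identity": int(tokens[-1]),
    }


hits = [parse_hit(line) for line in ROWS.splitlines() if line.strip()]

# Hypothetical cutoff: keep only near-identical homologs.
close_homologs = [h["gene"] for h in hits if h["identity"] >= 90]
print(close_homologs)  # ['sigA']
```

The middle fields (product, virulence or resistance status, PAI name) are deliberately left unparsed, since their boundaries cannot be recovered from whitespace alone.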

• Homologs from VFDB (virulence genes)

Gene           GenBank Accn    Product               ID of source DB  Alignment Type  E-val  Identity
SbBS512_E4650  YP_001882815.1  serine protease EatA  VFG0630          Protein         0.0    99
SbBS512_E4650  YP_001882815.1  serine protease EatA  VFG0862          Protein         0.0    57
SbBS512_E4650  YP_001882815.1  serine protease EatA  VFG0844          Protein         0.0    56
SbBS512_E4650  YP_001882815.1  serine protease EatA  VFG0902          Protein         0.0    55
SbBS512_E4650  YP_001882815.1  serine protease EatA  VFG0772          Protein         0.0    54