Gene Information

Name : ECSF_4014 (ECSF_4014)
Accession : YP_003352004.1
Strain : Escherichia coli SE15
Genome accession: NC_013654
Putative virulence/resistance : Virulence
Product : serine protease
Function : -
COG functional category : -
COG ID : -
EC number : -
Position : 4379740 - 4383669 bp
Length : 3930 bp
Strand : +
Note : -
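
The Position and Length fields above agree if the coordinates are read as 1-based and inclusive. A minimal sketch of that check follows, assuming this coordinate convention and assuming the CDS ends in a single stop codon; the function names are hypothetical and not part of the database.

```python
def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residue count for a CDS, assuming one terminal stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


# Coordinates taken from the Position field of this record.
length_bp = feature_length(4379740, 4383669)
print(length_bp)                           # 3930
print(expected_protein_length(length_bp))  # 1309
```

Both values match the record: the DNA sequence is 3930 bp and the protein sequence is 1309 residues.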

DNA sequence :
GTGAACAGAATATATTCACTTAAGTATTGTCACCTTACTCAAGGGCTTATAGCTGTATCAGAATTAGCATCCCGAGTATC
CTCTAAAACCGGTCGAAAACTTCTCGCTATATTTATAGTCTCTGCCTTGTCTTATGGTGCGGCTGAATATGCCTGTGCAG
CACAGATGGACACCCGAAATTTCTGGATCCGTGATTATCTTGATCTGGCACAGAATAAAGGTGCATTTCAGCCCGGAGCA
TACGGGGTAAAAACGCCATTAAAAAATGGGGGAGAATTCAGTTTTCCTGAAGTAACAATCCCTGATTTTTCTCCTGTATC
CGCTAAAGGTGCAACAACTGCTATTGGTAATGCCTACAGTGTTACAGCAAGTCATAATGGCACTATTCACCATGCCATTA
AAACTCAGACATGGGGACAGTCAGATTATCATTATGTTGATCGTGTGACCAAAGGTGACTTTGCGGTCCAGCGTCTGGAT
AAGTTTGTTGTTGAAACAGCGGGTGCAACAGAGCATGCTGATTTCAACTTATCAGCAGCAGAAGCACTAGAGCGTTATGG
TATTGAATTCAATGGAAAGAAACAAATAATCGGTTTCCGCGTTGGAGCCGGAGCAACTGGTGTTACATCTTATGGGGTGG
GACAGGCATATAATCCATTATTACGCAGTGCCTCTATGTTTCAGTTAAACTGGAACAATATGTTAGCCACGAATAATACA
GGTGGATTTTATAATGAAGTGACAGGAGGAGACAGCGGTTCCGGATTTTATCTTTATGACAATCAGAGAAAAAAATGGGT
CATTCTTGGAACAACATATGGTAAAGCATTCTCCAGTAAGGATACCTGGGCCTTTTTTTCTCGATATGATCAGAACACTG
TCGATACCTTGAAAAATACTTTTACCCAAGAGGTGAACCTCAATGGTCAGAAAATGACAGTTAATAATAAAAATATTGCC
ATTAACGATAAAATAACTGCTATTGAACTGACCAAGAGTAATAAAAATAAAGATTTGAAATTTCATGGTGGCGGGAGCAT
TGAGCTCACCGATAACCTGAACTCAGGAACCGGAGGATTGATTTTTGATGAGGGACAACATTATTCGGTTATTGGGAAAG
ATAAAGCCTATAAAGGGGCGGGTGTTGAGATCGGAAAAGATACGGTTGTCGACTGGTCGGTAAAAGGGGAGGCAAACGAT
AACCTGCACAAAACAGGGGCCGGGACACTGAATGTCAATGTGGCCCAGGGGAATAACCTGAAAACAGGTGACGGTACCGT
TTTTCTTAATGCAGAAAAGGCTTTCAATGCTATCTATGTTGCCAGTGGCCGTGGAACGGTCAAACTGGGGCAGGCCGATG
CGCTGGATAAAAATAGTGATTACAGAGGTATTTATTTTACTAGTCGTGGGGGGACTCTGGATTTAAACGGGTTCAGCCAG
TCGTTCAAGAAGATCGCGGCAACTGATGTTGGTACCATTATCACCAATACTTCTGATAAAACAGCGACCCTTTCCCTACA
AAACCCCTCCCGCTATGTCTATCACGGTAGTATCACGGGAAATACGAATATCGAACACACTGGGACACAGAAAAGTGCTG
ACAGCAGTCTGATTATTGATGGAAACATTAATACACGCAATGACATTACTGTGCGGAATTCCCAGCTCAGACTTCAGGGG
CATGCCACATCACATGCAATATTCCGCGAGGGGCCTCGGCACTGCTATGTCCCCGGAGTTCTTTGTGACAAAGATTATGT
TACTGATTTTGCCAGACTGGAAAGTGAGGCAAATAAGAAAAATAACAGTGCCTATAAAACAAATAATCAGGTGGCTTCTT
TTGACCAACCTGACTGGGAAACCCGGCATTTCCGATTTAAGACTCTGAATCTGGAAAACTCAGAATTCACAACTGCACGT
AACTCAGTTGTTGAGGGTGATATTGTCGCATCGAATTCAACGCTGAAACTGGGGGGCGACGTTCCGGTGTTCATTGATAT
GTATGATGGCATCAATATCACCGGTAATGGTTTTGGCTTCCGCCAGGACGTTCGTGAAGGACGCTCAGCAGATGATGGCA
GTAGCAGCTATACAGGCAAAATTACACTGCAAAAAGGCTCCACGCTGGACATCAACAACCGGTTCATTGGCGGTATTGAA
GCCCATGACAGTAAGGTAAACGTCACCTCACCGGATGCCCTGCTGCAGAACAGTGGTGTTTTCGTGAATTCCACCCTTTC
TGTCCGTGACGGCGGTCATCTGACGGCACAAAAAGGGCTCTACAGTGACGGCCGGGTTCAGATTGGAAAGAACGGTACGC
TTTCCCTGAGCGGCACGCCGGAAAATGGCGCTGATAATACCTGGATGCCCGTACTGACATACATGACAGAAGGCTATGAT
TTAACCGGTGATAACGCCACGCTGAACATCAGCCAGCAGGCGCATGTTTCCGGGGATGTTCATGCAACCAGTTCATCCAG
CATTCGTATTGGCTCCGAAAACCCTGGCTCAGTTTCCTCTTCTGTCTCCCCTGTTCTGGCTGCCGGGTTGTTCAGCGGAT
ATAACGCGGCGTACTACGGTGCCATCACCGGCGGTAAGGGGAACGTCAGTATGAATAATGGCCTGTGGCAGCTGACCGGA
GATTCCGACATCAACAGTCTGACGACCCGTAACAGCCGGGTACAGTCTGAAGAAAACGGTGCCTTCCGTACCCTGACGGT
TAAGACACTTGATGCCACGGGCAGTGATTTTGTCCTGCGCACTGATCTGAAGGACGCTGATAAAATCAGTATTACGGAGA
AAGCCAGCGGTTCAGACAACACCCTGAATGTCAGCTTTATGAAGAACCCGTCTCCGGGACAGTCCCTGAATATCCCGCTG
GTCAGTGCACCGGCCGGAACATCAGGGGATATCTTTAAGGCCGGCACCCGGGTGACAGGCTTCAGTCGTGTGACGCCGAC
GCTGCGTGTTGACACCACTGGCGGCAGTACGAAGTGGATTCTGGATGGTTTCAGGACGGAAGCTGATAAAGCGGCTGCAG
CGAAGGCGAACAGTTTCATGAATGCCGGCTACAAAAGCTTTATGACGGAAGTAAACAATCTGAACAAACGTATGGGGGAG
CTGCGTGACACGAACGGTGATGCCGGTGCCTGGGCCCGTATTATGAACGGCGCCGGTTCAGCCGATGGCGGATACAGTGA
TAACTACACCCACGTTCAGGTCGGTTTTGACAAAAAACATGCGCTGGACGGTGTTGACCTGTTCACCGGTGTCACGATGA
CCTATACCGACAGCAGTGCAGACAGTGATGCGTTCAGCGGGAAGACAAAATCCGTGGGGGGAGGTCTGTATGCTTCAGCA
CTGTTTAACTCCGGTGCCTACATTGATTTGATTGGTAAATATATTCACCATGACAATGACTACACAGGTAACTTTGCCGG
TCTGGGTACGAAGCACTACGGAACCCACTCCTGGTATGCCGGAGCGGAAACGGGTTACCGTTATCACCTGACGGAAGACA
CATTTATTGAGCCTCAGGCCGAACTGGTTTACGGCGCAGTGTCCGGGAAAACATTCCGCTGGAAAGACGGTGAGATGGAC
CTGAGTATGAAGAACAAGGATTTCAGCCCGTTGATTGGCAGAACAGGGATTGAGCTGGGCAAAACCTTCAGGGGTAAGGA
CTGGAGTGTGACAGCCCGTGCCGGAACCAGCTGGCAGTTTGACCTGCTGAATAATGGCGAGACAGTTCTTCGTGATGCAT
CCGGAGAGAAACGGATCAAAGGGGAGAAAGACAGCAGGATGTTGTTCAATGTTGGCATGAACGCACAGATAAAGGACAAC
ATGCGCTTTGGTCTGGAGTTTGAGAAATCCGCCTTTGGTAAATACAACGTGGATAACGCAATAAACGCGAATTTCCGGTA
TATGTTCTGA

Protein sequence :
MNRIYSLKYCHLTQGLIAVSELASRVSSKTGRKLLAIFIVSALSYGAAEYACAAQMDTRNFWIRDYLDLAQNKGAFQPGA
YGVKTPLKNGGEFSFPEVTIPDFSPVSAKGATTAIGNAYSVTASHNGTIHHAIKTQTWGQSDYHYVDRVTKGDFAVQRLD
KFVVETAGATEHADFNLSAAEALERYGIEFNGKKQIIGFRVGAGATGVTSYGVGQAYNPLLRSASMFQLNWNNMLATNNT
GGFYNEVTGGDSGSGFYLYDNQRKKWVILGTTYGKAFSSKDTWAFFSRYDQNTVDTLKNTFTQEVNLNGQKMTVNNKNIA
INDKITAIELTKSNKNKDLKFHGGGSIELTDNLNSGTGGLIFDEGQHYSVIGKDKAYKGAGVEIGKDTVVDWSVKGEAND
NLHKTGAGTLNVNVAQGNNLKTGDGTVFLNAEKAFNAIYVASGRGTVKLGQADALDKNSDYRGIYFTSRGGTLDLNGFSQ
SFKKIAATDVGTIITNTSDKTATLSLQNPSRYVYHGSITGNTNIEHTGTQKSADSSLIIDGNINTRNDITVRNSQLRLQG
HATSHAIFREGPRHCYVPGVLCDKDYVTDFARLESEANKKNNSAYKTNNQVASFDQPDWETRHFRFKTLNLENSEFTTAR
NSVVEGDIVASNSTLKLGGDVPVFIDMYDGINITGNGFGFRQDVREGRSADDGSSSYTGKITLQKGSTLDINNRFIGGIE
AHDSKVNVTSPDALLQNSGVFVNSTLSVRDGGHLTAQKGLYSDGRVQIGKNGTLSLSGTPENGADNTWMPVLTYMTEGYD
LTGDNATLNISQQAHVSGDVHATSSSSIRIGSENPGSVSSSVSPVLAAGLFSGYNAAYYGAITGGKGNVSMNNGLWQLTG
DSDINSLTTRNSRVQSEENGAFRTLTVKTLDATGSDFVLRTDLKDADKISITEKASGSDNTLNVSFMKNPSPGQSLNIPL
VSAPAGTSGDIFKAGTRVTGFSRVTPTLRVDTTGGSTKWILDGFRTEADKAAAAKANSFMNAGYKSFMTEVNNLNKRMGE
LRDTNGDAGAWARIMNGAGSADGGYSDNYTHVQVGFDKKHALDGVDLFTGVTMTYTDSSADSDAFSGKTKSVGGGLYASA
LFNSGAYIDLIGKYIHHDNDYTGNFAGLGTKHYGTHSWYAGAETGYRYHLTEDTFIEPQAELVYGAVSGKTFRWKDGEMD
LSMKNKDFSPLIGRTGIELGKTFRGKDWSVTARAGTSWQFDLLNNGETVLRDASGEKRIKGEKDSRMLFNVGMNAQIKDN
MRFGLEFEKSAFGKYNVDNAINANFRYMF
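
The DNA sequence starts with GTG rather than ATG, yet the protein starts with M. This is consistent with GTG serving as an alternative bacterial start codon that is read as methionine. The sketch below illustrates this on the first 30 nucleotides only, assuming the standard codon assignments (as used in the bacterial translation table); the helper name `translate` is hypothetical.

```python
from itertools import product

# Standard codon table built in TCAG order (assumed to apply to this CDS).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a CDS, reading the first codon as Met and stopping at '*'."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if i == 0:
            aa = "M"  # initiator codon (here GTG) is read as methionine
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 30 nt of the DNA sequence in this record.
print(translate("GTGAACAGAATATATTCACTTAAGTATTGT"))  # MNRIYSLKYC
```

The output matches the first ten residues of the protein sequence above. The same function applied to the last 18 nt (TTCCGGTATATGTTCTGA) would need the first-codon rule disabled, since it forces the initial residue to M.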

• Homologs from PAI DB

Gene  GenBank Accn    Product                        Virulence or Resistance  PAI or REI  Alignment Type  E-value  Identity (%)
espC  AAG37043.1      enterotoxin EspC               Virulence                espC PAI    Protein         0.0      60
pic   NP_838464.1     serine protease precursor      Virulence                SHI-1       Protein         0.0      59
sigA  AAF67320.1      exported serine protease SigA  Virulence                SHI-1       Protein         0.0      53
sigA  NP_838462.1     serine protease                Virulence                SHI-1       Protein         0.0      53
sigA  NP_708742.1     serine protease                Virulence                SHI-1       Protein         0.0      53
sat   YP_002414040.1  Serine protease                Not tested               Not named   Protein         0.0      52

• Homologs from VFDB (virulence genes)

Gene       GenBank Accn    Product          ID of source DB  Alignment Type  E-value  Identity (%)
ECSF_4014  YP_003352004.1  serine protease  VFG0772          Protein         0.0      60
ECSF_4014  YP_003352004.1  serine protease  VFG0630          Protein         0.0      53
ECSF_4014  YP_003352004.1  serine protease  VFG0902          Protein         0.0      52
ECSF_4014  YP_003352004.1  serine protease  VFG0844          Protein         0.0      51
ECSF_4014  YP_003352004.1  serine protease  VFG0862          Protein         0.0      51