Gene Information

Name : sat (ECUMN_3367)
Accession : YP_002414040.1
Strain : Escherichia coli UMN026
Genome accession: NC_011751
Putative virulence/resistance : Virulence
Product : Serine protease
Function : -
COG functional category : S : Function unknown
COG ID : COG4625
EC number : 3.4.21.72
Position : 3479461 - 3483360 bp
Length : 3900 bp
Strand : -
Note : Evidence 2b : Function of strongly homologous gene; Product type e : enzyme
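
The Position and Length fields can be cross-checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, which is the usual convention for genome annotations but is not stated in this record. The names `feature_length`, `START`, and `END` are illustrative only.

```python
# Coordinates copied from the Position field above.
# Assumption: 1-based, inclusive coordinates.
START, END = 3479461, 3483360


def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature with 1-based inclusive coordinates."""
    return end - start + 1


length_bp = feature_length(START, END)

# The last codon is the stop codon, so the protein has one residue
# fewer than the number of codons.
protein_residues = length_bp // 3 - 1

print(length_bp, protein_residues)  # 3900 1299
```

The result agrees with the Length field (3900 bp) and with the 1299 residues listed under "Protein sequence".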

DNA sequence :
TTGAGAGAATATATGAATAAAATATACTCCCTTAAATATAGTGCTGCCACTGGCGGACTCATTGCTGTTTCTGAATTAGC
GAAAAGAGTTTCTGGTAAAACAAACCGAAAACTTGTAGCAACAATGTTGTCTCTGGCTGTTGCCGGTACAGTAAATGCAG
CAAATATTGATATATCAAATGTATGGGCGAGAGACTATCTTGATCTTGCACAAAATAAAGGTATTTTCCAGCCCGGAGCA
ACAGACGTAACAATCACTTTAAAAAACGGAGATAAATTCTCTTTCCATAATCTCTCAATTCCGGATTTTTCTGGTGCAGC
AGCGAGTGGCGCAGCTACCGCAATAGGAGGTTCTTATAGTGTTACTGTTGCACATAACAAAAAGAACCCTCAGGCCGCAG
AAACCCAGGTTTACGCTCAGTCTTCTTACAGGGTTGTTGACAGAAGAAATTCCAATGATTTTGAGATTCAGAGGTTAAAT
AAATTTGTTGTGGAAACAGTAGGTGCCACCCCGGCAGAGACCAACCCTACAACATATTCTGATGCATTAGAACGCTACGG
TATAGTCACTTCTGACGGTTCAAAAAAAATCATAGGTTTTCGTGCTGGCTCTGGAGGAACATCATTTATTAATGGTGAAT
CCAAAATCTCAACAAATTCAGCATATAGCCATGATCTGTTAAGTGCTAGTCTATTTGAGGTCACCCAATGGGACTCATAC
GGCATGATGATTTATAAAAATGATAAAACATTTCGTAATCTTGAAATATTCGGAGACAGCGGCTCTGGAGCATACTTATA
TGATAACAAACTAGAAAAATGGGTATTAGTCGGAACAACCCATGGTATTGCCAGCGTTAATGGTGACCAACTGACATGGA
TAACAAAATACAATGATAAACTGGTTAGTGAGTTAAAAGATACCTATAGTCATAAAATAAATCTGAATGGCAATAATGTA
ACCATTAAAAACACAGATATAACATTACACCAAAACAATGCAGATACCACTGGTACTCAAGAAAAAATAACTAAAGACAA
AGATATTGTGTTCACAAATGGGGGAGATGTCCTGTTTAAGGATAATTTGGATTTTGGTAGCGGTGGTATTATCTTTGACG
AAGGCCATGAATATAACATAAACGGTCAGGGATTTACATTTAAAGGAGCAGGAATTGATATCGGAAAAGAAAGCATTGTA
AACTGGAATGCATTGTATTCCAGTGATGATGTTTTACACAAAATAGGCCCTGGTACTCTGAATGTTCAAAAAAAACAGGG
GGCAAATATAAAGATAGGTGAAGGAAATGTTATTCTTAATGAAGAAGGAACATTTAACAATATATACCTTGCAAGCGGAA
ATGGTAAGGTAATACTAAATAAAGATAATTCCCTTGGCAATGATCAATATGCGGGGATATTTTTTACTAAACGTGGTGGT
ACGCTAGATTTAAATGGACACAATCAGACTTTTACTAGAATTGCCGCCACTGACGATGGAACAACAATAACTAACTCAGA
TACAACGAAAGAAGCCGTTCTGGCAATCAATAACGAAGACTCCTACATATATCATGGGAACATAAATGGCAATATAAAAC
TAACACACAATATTAATTCTCAGGATAAGAAAACTAATGCAAAATTAATTCTGGATGGTAGTGTCAACACAAAAAATGAT
GTTGAAGTCAGTAATGCCAGTCTTACCATGCAAGGCCATGCAACAGAGCATGCAATATTCAGAAGCTCAGCGAATCATTG
CTCCCTGGTATTTCTTTGTGGAACGGACTGGGTCACCGTTTTGAAAGAAACAGAGAGTTCATATAATAAAAAGTTCAATT
CTGATTACAAAAGTAATAATCAGCAGACCTCATTTGATCAGCCTGACTGGAAAACCGGGGTGTTTAAATTTGATACATTA
CACCTGAACAATGCTGACTTTTCAATATCACGCAATGCCAATGTTGAAGGAAATATATCAGCAAATAAATCAGCTATCAC
AATCGGCGATAAAAATGTTTACATTGATAATCTTGCAGGGAAAAATATTACTAATAATGGTTTTGACTTCAAACAAACTA
TCAGTACTAATCTATCCATAGGAGAAACTAAATTTACAGGTGGCATCACTGCACATAACAGCCAAATAGCCATAGGTGAT
CAAGCTGTAGTTACACTTAATGGTGCAACCTTTCTGGATAATACTCCTATAAGTATAGATAAAGGAGCAAAAGTTATAGC
ACAAAATTCCATGTTCACAACAAAAGGTATTGATATCTCCGGTGAACTGACTATGATGGGAATCCCTGAACAGAATAGTA
AAACTGTAACGCCGGGTCTCCACTACGCTGCTGATGGATTCAGGCTGAGTGGTGGAAATGCAAATTTCATTGCCAGAAAT
ATGGCATCTGTCACCGGAAATATTTATGCTGATGATGCAGCAACCATTACTCTGGGACAGCCTGAAACTGAAACACCGAC
TATATCGTCTGCTTATCAGGCATGGGCAGAGACTCTTTTGTATGGCTTTGATACCGCTTATCGAGGCGCAATAACAGCCC
CCAAAGCTACAGTTAGCATGAATAATGCGATCTGGCATCTAAATAGCCAGTCATCAATTAATCGTCTAGAAACAAAAGAC
AGTATGGTGCGTTTTACTGGTGATAATGGGAAGTTTACAACCCTTACAGTGAACAACCTTACTATAGATGACAGTGCATT
TGTGCTGCGTGCAAATCTGGCCCAAGCAGATCAGCTTGTTGTCAATAAATCGTTGTCTGGTAAAAACAACCTTCTGTTAG
TCGACTTCATTGAGAAAAATGGAAACAGCAACGGACTGAATATCGATCTGGTCAGCGCACCAAAAGGAACTGCAGTAGAT
GTCTTTAAAGCTACGACTCGGAGTATTGGCTTCAGTGATGTAACACCGGTTATCGAGCAAAAGAACGATACAGACAAAGC
AACATGGACTCTGATCGGCTATAAATCTGTGGCCAACGCCGATGCGGCTAAAAAGGCAACATTACTGATGTCAGGCGGCT
ATAAAGCCTTCCTTGCTGAGGTCAACAACCTTAACAAACGTATGGGTGATCTGCGTGACATTAACGGTGAGTCCGGTGCA
TGGGCCCGAATCATTAGCGGAACCGGGTCTGCCGGCGGTGGATTCAGTGACAACTACACCCACGTTCAGGTCGGTGCGGA
TAACAAACATGAACTCGATGGCCTTGACCTCTTCACCGGGGTGACCATGACCTATACCGACAGCCATGCAGGCAGTGATG
CCTTCAGTGGTGAAACGAAGTCTGTGGGTGCCGGTCTCTATGCCTCTGCCATGTTTGAGTCCGGAGCATATATCGACCTC
ATCGGTAAGTACGTTCACCATGACAACGAGTATACCGCAACTTTCGCCGGCCTTGGCACCAGAGACTACAGCTCCCACTC
CTGGTATGCCGGTGCGGAAGTCGGTTACCGTTACCATGTAACTGACTCTGCATGGATTGAGCCGCAGGCGGAACTTGTTT
ACGGTGCTGTATCCGGGAAACAGTTCTCCTGGAAGGACCAGGGAATGAACCTCACCATGAAGGATAAGGACTTTAATCCG
CTGATTGGGCGTACCGGTGTTGATGTGGGTAAATCCTTCTCCGGTAAGGACTGGAAAGTCACAGCCCGCGCCGGCCTTGG
CTACCAGTTTGACCTGTTTGCCAACGGTGAAACCGTACTGCGTGATGCGTCCGGTGAGAAACGTATCAAAGGTGAAAAAG
ACGGTCGTATGCTCATGAATGTTGGTCTCAACGCCGAAATTCGCGATAATCTTCGCTTCGGTCTTGAGTTTGAGAAATCG
GCATTTGGTAAATACAACGTGGATAACGCGATCAACGCCAACTTCCGTTACTCTTTCTGA

Protein sequence :
MREYMNKIYSLKYSAATGGLIAVSELAKRVSGKTNRKLVATMLSLAVAGTVNAANIDISNVWARDYLDLAQNKGIFQPGA
TDVTITLKNGDKFSFHNLSIPDFSGAAASGAATAIGGSYSVTVAHNKKNPQAAETQVYAQSSYRVVDRRNSNDFEIQRLN
KFVVETVGATPAETNPTTYSDALERYGIVTSDGSKKIIGFRAGSGGTSFINGESKISTNSAYSHDLLSASLFEVTQWDSY
GMMIYKNDKTFRNLEIFGDSGSGAYLYDNKLEKWVLVGTTHGIASVNGDQLTWITKYNDKLVSELKDTYSHKINLNGNNV
TIKNTDITLHQNNADTTGTQEKITKDKDIVFTNGGDVLFKDNLDFGSGGIIFDEGHEYNINGQGFTFKGAGIDIGKESIV
NWNALYSSDDVLHKIGPGTLNVQKKQGANIKIGEGNVILNEEGTFNNIYLASGNGKVILNKDNSLGNDQYAGIFFTKRGG
TLDLNGHNQTFTRIAATDDGTTITNSDTTKEAVLAINNEDSYIYHGNINGNIKLTHNINSQDKKTNAKLILDGSVNTKND
VEVSNASLTMQGHATEHAIFRSSANHCSLVFLCGTDWVTVLKETESSYNKKFNSDYKSNNQQTSFDQPDWKTGVFKFDTL
HLNNADFSISRNANVEGNISANKSAITIGDKNVYIDNLAGKNITNNGFDFKQTISTNLSIGETKFTGGITAHNSQIAIGD
QAVVTLNGATFLDNTPISIDKGAKVIAQNSMFTTKGIDISGELTMMGIPEQNSKTVTPGLHYAADGFRLSGGNANFIARN
MASVTGNIYADDAATITLGQPETETPTISSAYQAWAETLLYGFDTAYRGAITAPKATVSMNNAIWHLNSQSSINRLETKD
SMVRFTGDNGKFTTLTVNNLTIDDSAFVLRANLAQADQLVVNKSLSGKNNLLLVDFIEKNGNSNGLNIDLVSAPKGTAVD
VFKATTRSIGFSDVTPVIEQKNDTDKATWTLIGYKSVANADAAKKATLLMSGGYKAFLAEVNNLNKRMGDLRDINGESGA
WARIISGTGSAGGGFSDNYTHVQVGADNKHELDGLDLFTGVTMTYTDSHAGSDAFSGETKSVGAGLYASAMFESGAYIDL
IGKYVHHDNEYTATFAGLGTRDYSSHSWYAGAEVGYRYHVTDSAWIEPQAELVYGAVSGKQFSWKDQGMNLTMKDKDFNP
LIGRTGVDVGKSFSGKDWKVTARAGLGYQFDLFANGETVLRDASGEKRIKGEKDGRMLMNVGLNAEIRDNLRFGLEFEKS
AFGKYNVDNAINANFRYSF
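
The protein sequence can be reproduced from the DNA sequence by codon-wise translation. The following is a minimal sketch, not the tool used to build this record. It assumes the standard codon table, assumes the DNA shown is already the coding strand (the gene lies on the "-" strand of the genome), and assumes an alternative start codon such as TTG is read as methionine when it is the first codon, as is usual for bacterial genes. The function name `translate` and its parameters are hypothetical.

```python
# Standard codon table built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)

# Assumption: these codons are read as Met when they initiate translation.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate(dna: str, first_is_start: bool = True) -> str:
    """Translate a coding-strand DNA string, stopping at the first stop codon.

    A trailing partial codon (1 or 2 bases) is ignored.
    """
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        if i == 0 and first_is_start and codon in START_CODONS:
            residues.append("M")
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":  # stop codon ends the protein
            break
        residues.append(aa)
    return "".join(residues)


# First line of the DNA sequence above (80 nt; the last 2 nt are ignored).
first_line = (
    "TTGAGAGAATATATGAATAAAATATACTCCCTTAAATATAGTGCTGCCACTGGCGGACTCATTGCTGTTTCTGAATTAGC"
)
print(translate(first_line))  # MREYMNKIYSLKYSAATGGLIAVSEL
```

The output matches the first 26 residues of the protein sequence. Passing the full 3900 bp sequence, with line breaks removed, should give the complete protein.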

• Homologs from PAI DB

Gene  GenBank Accn    Product                        Virulence or Resistance  PAI or REI  Alignment Type  E-val  Identity (%)
sat   YP_002414040.1  Serine protease                Not tested               Not named   Protein         0.0    100
sigA  NP_838462.1     serine protease                Virulence                SHI-1       Protein         0.0    55
sigA  NP_708742.1     serine protease                Virulence                SHI-1       Protein         0.0    55
sigA  AAF67320.1      exported serine protease SigA  Virulence                SHI-1       Protein         0.0    55
espC  AAG37043.1      enterotoxin EspC               Virulence                espC PAI    Protein         0.0    52

• Homologs from VFDB (virulence genes)

Gene  GenBank Accn    Product          ID of source DB  Alignment Type  E-val  Identity (%)
sat   YP_002414040.1  Serine protease  VFG0902          Protein         0.0    100
sat   YP_002414040.1  Serine protease  VFG0862          Protein         0.0    64
sat   YP_002414040.1  Serine protease  VFG0844          Protein         0.0    56
sat   YP_002414040.1  Serine protease  VFG0630          Protein         0.0    55
sat   YP_002414040.1  Serine protease  VFG0772          Protein         0.0    52