Gene Information

Name : UTI89_C4880 (UTI89_C4880)
Accession : YP_543814.1
Strain : Escherichia coli UTI89
Genome accession: NC_007946
Putative virulence/resistance : Virulence
Product : DNA helicase superfamily protein I
Function : -
COG functional category : L : Replication, recombination and repair
COG ID : COG1112
EC number : -
Position : 4780097 - 4783612 bp
Length : 3516 bp
Strand : +
Note : -
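
The Position and Length fields above agree if the coordinates are read as 1-based and inclusive. That reading is an assumption made here because it reproduces the stated 3516 bp; the record itself does not state the convention. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature given 1-based, inclusive coordinates (assumed)."""
    return end - start + 1


# Coordinates taken from the Position field of this record.
start, end = 4780097, 4783612

length_bp = feature_length(start, end)   # should match the Length field
codons = length_bp // 3                  # includes the terminal stop codon
protein_len = codons - 1                 # residues in the translated product

print(length_bp, codons, protein_len)    # 3516 1172 1171
```

The 1171 residues computed this way match the length of the protein sequence listed further down in the record.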

DNA sequence :
ATGGATGAAAATGCTTTAGGGTTTACCTCATACTGGCGCAACTCGCTTGCGGATGCTGAGTCAGGAAAGGGCAGTTTTGA
ACGGAAAGACGCCAAAAATTTCACTCACTGGCATGGGATAGCGGCGGGACGTCTTGACGAAGCGATTGTCAGTAAATTTT
TTAAGGGAGAAAAAGACGATGTCGAAACGGTCGATGTCATCTTGCGCCCAAAAGTTTATTTCCGGTTATTGCAGCATGGT
AAGGACCGTTCTGCAGGTGCGCCTGATATTGTTACCCCGATAGTGACGCCAGCCTTGCTAAGCCGTGAGGGTTTTTTATA
TCCGACGCCAGCGACCTCCATTCCCAGAGACCTGCTTGAACCTTTGCCAAAAGGAGCATTTTCGATTGGTGAGATTGGGC
AGTATGACAAATACAAGACGACCCATACCACGTTCTCTATCAACTTTGATGACAGCGTTGATAAGACTGCCGAAACGGAT
GAAGAACGGGAAGCACGATATGCCGCCTTGCAGCAGGAGTGGCGTCAATATCTGTATGACTCAGAGAGGCTACTGAAGAG
CGTTGCCGGCGACTGGATTGAAAAACCTGAGCAATATGAACTCGCTGAGCACGGTTATATTGTTAAAACGGCTCAATCTG
GCGGTGCCAGTTTCCATATCCTTTCTCTTTATGATCACCTGCTTGTTTGCAATAAGGATGTGCCGCTCTTCAATCGCTTC
GCCTCGCGAGAGGTTCATGCTGCAGAGTCTTTGCTGGCCCCTGGAGCAAAATTCAGCGACAGGCTTGGACACTCCGGAGA
TAAGTTTCCGCTGGCAAAGGCTCAGCGCGATGCCTTAAGCCATTTTCTGGATGCTAGACATGGCGATATCCTTGCTGTTA
ATGGCCCTCCGGGAACCGGAAAAACCACGCTGGTGCTTTCTATCATCGCCACGCAGTGGGCCAGAGCGGCTCTCGAAAAA
TCTGAGCCTCCGGTTATTATCGCGACTTCAACGAATAACCAGGCTGTAACGAACATTATTGAGGCATTCGGGAAAGACTT
TTCGCAAGGTTCAGGTGCGATGGCCGGGCGATGGTTGCCAGAGCTGAAAAGCTTCGGCGCTTATTTTCCCTCAAGCAGCC
GTAAAGCTGAAGCAGCCAAAAAATATCAAACTGAAGATTTCTTCAACCAGGTTGAGTCAAAAGAGTATGTAGAAGATGCA
CTGCTGTTTTATCTCGAGAAAGCTAAGGCAGCTTTTCCTGAAAAAGAGTGTTCATCCCCTGAAAAGGTCATTGAACTCCT
GCATGGTCAGTTGGCAGCAAAATCCGAGCAACTGATAAGACTGAACGCAACATGGCAAACGTTAAGCCAGGTATGGGCTG
CGCGTGAGCTTATTGCTAACGACATTGAGCAATATCTCGATAATTTAAATAAATTACTTTCCGGACAAGAACAAAAAATC
ACTCTACTGAAAAGTGCTAAAACGGAATGGAAAAAATATCGCGCCGGTGAATCACTGATCTATTCATTATTTTCCTGGCT
CCCGGCGGTTCGCAGTAAGCGACAGTACCAAATACAACTGTTTCTCGAAGATAAATTAGGGGCGCTGATTGCAGGAAATC
AGTGGTCTGATCCTGAAACTATCGAACGTAATATTGATGGGCTGCTCAATTCCGCTGAGCGCGAGCAAACAACATACCGG
CAGCAGATTGACTCCGCCCATGAAATCATTCTTAAAGAACAGCAGGCGGTTCAGGAGTGGCAGAGACTGGCTCTTGATTT
AGGGTATGAGGGCGACGAGGAACTGAGCTTCTCACAGGCCGATGAACTGGCTGATACGCAGATTCGCTTCCCTGCATTTT
TACTGACGACTCACTACTGGGAAGGTCGTTGGCTGATGGATATGGCCAGCATTGATGATCTGCAGGAAGAGAAGAAGAAA
AAAGGCGCTAAAGGGGTAACCGCCCGTTGGCAACGTCGAATGAAACTCACTCCATGTGTGGTGATGACATGCTATATGCT
GCCCGGCAATATGCAGATAAGTGAGCACAAAGGACAGCGTAAATTCGAGAAAAGTTATTTGTATGATTTTGCCGATTTAC
TCATTGTCGATGAAGCCGGGCAGGTGCTTCCTGAAGTGGCTGCTGCCTCGTTTGCATTAGCTAAGAAGGCATTAGTGATT
GGCGATACGGAGCAGCTCCCGCCAATATGGAGTATTGCTCCTGCGATTGATGTCGGTAACATGCTGGCGGAAAAAATTCT
GTCTGGCAGTACGCAAGAAGAGATTACCGAGAAATATACGGCAATCGCAGACCTTGGTAAAAGTGCCGCATCTGGCAGCG
TTATGAAAATAGCGCAGTTTGCTTCACGCTATCAATATGATCCCGAACTGGCTCGTGGTATGTACCTATATGAACACCGC
CGGTGCTACGACAATATTATTGGATACTGTAATACGCTCTGCTATCACGGTAAGTTGTTGCCTAAAAGAGGGCGTGAAGA
GAGCAATTTAATGCCCGAAATGGGGTATCTCCATATTGATGGTAAAGGTGAGCTGGCAAGTAGTGGAAGTCGATATAATT
TGCTTGAGGCTGAAACGATAGCGGTCTGGTTGGCAGAGAACCAGCAAAATATTGAAGCGCATTACGGTAAATCGCTTCAT
GAAGTTGTCGGTATCGTGACGCCTTTTAGCGCTCAGGTATCCACTATCAAACAGGTGCTGGGCAAACAAGGTATCAGTAC
AGGCGCGAATGAAAAATCGCTCACAGTGGGCACCGTGCACTCTCTTCAGGGAGCGGAAAGAGCGATTGTGATATTCTCGC
CAGTCTATTCAAAACATGAAGACGGCGGGTTTATTGATAGCGATAACAGCATGCTGAATGTTGCAGTCTCCCGTGCGAAG
GACAGTTTTCTGGTCTTCGGCGATATGGACCTGTTTGAGGTCCAGCCAGCCTCATCTCCACGGGGATTACTGGCAAAATA
CCTCTTTGAGTCAGAGAAGAATGCGCTCTCTTTTGATTATAAAGAGCGTAAGGATTTAAAAACCGCCGGGACCAAAATCT
ACACACTTCATGGTGTGGAGCAACATGATAATTTCCTGAATCAGACATTTGAAAATACCAGTAAACACATCATGATAGTT
TCTCCATGGCTGACCTGGCAAAGGCTGGAGCAAACCGGTTTTCTTGATTCCATGATTGCGGCGTGTTCACGTGGAGTTAA
CGTCACGATAGTCACTGACAGAAGCTACAACACTGAACATAATGATTTTGAGAAGCGAAAAGAGAAGCAGCAAAACTTTA
AAGCGGCGCTGGAGAAACTGAATGCGCTGGGTATTGCTACAAAGCTGGTAAACCGTGTTCATAGCAAAATTGTTATTGGT
GATGATGGTTTGCTGTGCGTGGGATCGTTCAACTGGTTTAGTGCGACACGGGAAGCGCGATATGAACGATACGATACCTC
AATGGTTTATTGCGGTGATAACCTGAAGGGTGAGGTTGAGGCTATTTATAATAGTCTTGAGAGGCGTCAGGTTTAG

Protein sequence :
MDENALGFTSYWRNSLADAESGKGSFERKDAKNFTHWHGIAAGRLDEAIVSKFFKGEKDDVETVDVILRPKVYFRLLQHG
KDRSAGAPDIVTPIVTPALLSREGFLYPTPATSIPRDLLEPLPKGAFSIGEIGQYDKYKTTHTTFSINFDDSVDKTAETD
EEREARYAALQQEWRQYLYDSERLLKSVAGDWIEKPEQYELAEHGYIVKTAQSGGASFHILSLYDHLLVCNKDVPLFNRF
ASREVHAAESLLAPGAKFSDRLGHSGDKFPLAKAQRDALSHFLDARHGDILAVNGPPGTGKTTLVLSIIATQWARAALEK
SEPPVIIATSTNNQAVTNIIEAFGKDFSQGSGAMAGRWLPELKSFGAYFPSSSRKAEAAKKYQTEDFFNQVESKEYVEDA
LLFYLEKAKAAFPEKECSSPEKVIELLHGQLAAKSEQLIRLNATWQTLSQVWAARELIANDIEQYLDNLNKLLSGQEQKI
TLLKSAKTEWKKYRAGESLIYSLFSWLPAVRSKRQYQIQLFLEDKLGALIAGNQWSDPETIERNIDGLLNSAEREQTTYR
QQIDSAHEIILKEQQAVQEWQRLALDLGYEGDEELSFSQADELADTQIRFPAFLLTTHYWEGRWLMDMASIDDLQEEKKK
KGAKGVTARWQRRMKLTPCVVMTCYMLPGNMQISEHKGQRKFEKSYLYDFADLLIVDEAGQVLPEVAAASFALAKKALVI
GDTEQLPPIWSIAPAIDVGNMLAEKILSGSTQEEITEKYTAIADLGKSAASGSVMKIAQFASRYQYDPELARGMYLYEHR
RCYDNIIGYCNTLCYHGKLLPKRGREESNLMPEMGYLHIDGKGELASSGSRYNLLEAETIAVWLAENQQNIEAHYGKSLH
EVVGIVTPFSAQVSTIKQVLGKQGISTGANEKSLTVGTVHSLQGAERAIVIFSPVYSKHEDGGFIDSDNSMLNVAVSRAK
DSFLVFGDMDLFEVQPASSPRGLLAKYLFESEKNALSFDYKERKDLKTAGTKIYTLHGVEQHDNFLNQTFENTSKHIMIV
SPWLTWQRLEQTGFLDSMIAACSRGVNVTIVTDRSYNTEHNDFEKRKEKQQNFKAALEKLNALGIATKLVNRVHSKIVIG
DDGLLCVGSFNWFSATREARYERYDTSMVYCGDNLKGEVEAIYNSLERRQV
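
The protein sequence above can be cross-checked against the DNA sequence by translating the coding strand in frame. The sketch below assumes the standard codon assignments, which bacterial translation table 11 shares. The names `CODON_TABLE` and `translate` are illustrative and are not part of any database tool. Only the first 80-nt row of the DNA sequence is used as input.

```python
# Codon table built in the conventional TCAG ordering (standard assignments).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string in frame 1; '*' marks a stop codon.

    Trailing bases that do not complete a codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First row of the DNA sequence in this record (80 nt, so 26 complete codons).
first_row = (
    "ATGGATGAAAATGCTTTAGGGTTTACCTCATACTGGCGCAACTCGCTTGCGGATGCTGAGTCAGGAAAGGGCAGTTTTGA"
)
print(translate(first_row))  # MDENALGFTSYWRNSLADAESGKGSF
```

The printed peptide matches the first 26 residues of the protein sequence. Concatenating all DNA rows and passing them to `translate` should give the full product followed by `*` for the terminal TAG codon. Ambiguity codes such as N are not handled and would raise a `KeyError`.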

• Homologs from PAI DB

Gene         GenBank Accn  Product                     Virulence or Resistance  PAI or REI     Alignment Type  E-val  Identity (%)
S3169        NP_838460.1   superfamily I DNA helicase  Not tested               SHI-1          Protein         0.0    99
SF2965       NP_708739.1   superfamily I DNA helicase  Not tested               SHI-1          Protein         0.0    99
unnamed      CAD42018.1    hypothetical protein        Not tested               PAI II 536     Protein         0.0    99
APECO1_3532  YP_854230.1   superfamily I DNA helicase  Not tested               PAI I APEC-O1  Protein         0.0    97
ORF_2        AAZ04413.1    superfamily I DNA helicase  Not tested               PAI I APEC-O1  Protein         0.0    97

• Homologs from VFDB (virulence genes)

Gene         GenBank Accn  Product                             ID of source DB  Alignment Type  E-val  Identity (%)
UTI89_C4880  YP_543814.1   DNA helicase superfamily protein I  VFG1537          Protein         0.0    99
UTI89_C4880  YP_543814.1   DNA helicase superfamily protein I  VFG0627          Protein         0.0    99