Gene Information

Name : c0393 (c0393)
Accession : NP_752330.1
Strain : Escherichia coli CFT073
Genome accession: NC_004431
Putative virulence/resistance : Virulence
Product : hemoglobin protease
Function : -
COG functional category : M : Cell wall/membrane/envelope biogenesis
COG ID : COG3468
EC number : -
Position : 371877 - 376007 bp
Length : 4131 bp
Strand : -
Note : Residues 1 to 1376 of 1376 are 79.04 pct identical to residues 1 to 1377 of 1377 from GenPept.129 : >emb|CAA11507.1| (AJ223631) haemoglobin protease [Escherichia coli]
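
The Position, Length, and Strand fields above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, assuming the coordinates are 1-based and inclusive (the usual GenBank convention). The function name `cds_summary` is hypothetical and not part of any database API.

```python
def cds_summary(start, end):
    """Return (length_bp, codons) for a coding feature.

    Assumes 1-based, inclusive coordinates, so the length is end - start + 1.
    """
    length_bp = end - start + 1
    if length_bp % 3 != 0:
        raise ValueError("feature length is not a multiple of 3")
    return length_bp, length_bp // 3


# Coordinates taken from the Position field of this record.
length_bp, codons = cds_summary(371877, 376007)
print(length_bp, codons)  # 4131 1377
```

The 1377 codons correspond to the 1376 residues of the protein plus one stop codon, which matches the Length field and the residue count given in the Note.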

DNA sequence :
ATGAATAAAATATACGCTCTAAAATATTGTTATATTACTAACACAGTAAAGGTTGTCTCTGAACTAGCCCGAAGGGTATG
TAAAGGGAGTACCCGCAGAGGAAAAAGACTTTCAGTACTTACCTCTCTGGCACTATCTGCATTACTCCCAACCGTTGCTG
GTGCATCAACGGTTGGTGGCAACAATCCTTACCAGACATACCGCGACTTTGCAGAAAACAAAGGGCAGTTTCAGGCTGGC
GCAACAAACATTCCTATTTTTAATAATAAAGGGGAATTAGTAGGACATCTTGATAAAGCGCCCATGGTTGATTTTAGCAG
TGTGAATGTAAGCTCAAATCCCGGCGTTGCAACATTAATTAACCCGCAATATATAGCCAGTGTAAAACATAATAAAGGAT
ATCAGAGCGTCAGCTTCGGTGATGGTCAGAACAGTTACCATATTGTGGATCGTAATGAACACAGTTCATCTGATCTCCAC
ACACCAAGACTTGATAAGCTCGTAACTGAGGTTGCTCCGGCTACCGTAACCAGCTCATCAACAGCTGATATATTGAACCC
TTCAAAATACTCGGCATTCTACAGGGCTGGTTCGGGAAGTCAGTATATTCAGGATAGTCAGGGTAAGCGACATTGGGTAA
CAGGTGGGTATGGTTATCTGACAGGAGGAATACTCCCGACATCATTCTTTTATCACGGCTCAGACGGCATTCAGCTGTAT
ATGGGGGGCAACATACATGATCATAGCATCCTGCCCTCTTTTGGAGAGGCCGGCGACAGTGGTTCTCCATTATTTGGCTG
GAATACGGCCAAAGGGCAGTGGGAACTGGTCGGTGTTTACTCGGGAGTAGGAGGGGGGACCAATTTGATATATTCTCTTA
TTCCTCAGAGTTTTCTCTCACAGATCTATTCAGAGGATAATGACGCTCCCGTCTTTTTTAATGCCTCATCCGGCGCCCCC
CTGCAATGGAAATTTGACAGCAGCACCGGCACTGGCTCTCTGAAACAGGGTTCCGATGAATATGCCATGCACGGGCAAAA
AGGTTCTGACCTGAACGCAGGTAAAAATCTGACATTCCTGGGACATAATGGTCAGATTGACCTGGAAAACTCTGTCACGC
AGGGTGCCGGTTCACTGACATTTACTGATGACTACACTGTCACCACTTCAAACGGAAGTACCTGGACCGGGGCCGGTATT
ATTGTGGACAAGGATGCCTCCGTAAACTGGCAGGTTAATGGTGTGAAAGGTGACAACCTGCATAAAATCGGCGAAGGAAC
CCTGGTTGTACAGGGAACCGGTGTTAATGAGGGCGGCCTGAAAGTCGGGGATGGGACCGTTGTCCTCAATCAGCAGGCTG
ACAGTTCAGGACACGTTCAGGCATTCAGTAGCGTGAATATTGCCAGCGGCCGCCCGACAGTCGTGCTGGCAGACAACCAG
CAGGTTAATCCGGACAATATATCCTGGGGCTACCGGGGGGGGGTTCTGGATGTTAACGGGAATGACCTGACATTTCATAA
GCTGAATGCCGCCGATTATGGCGCAACTCTCGGTAACAGCAGTGATAAAACGGCTAATATCACTCTGGATTATCAGACGC
GTCCGGCAGACGTAAAAGTTAATGAATGGTCATCATCAAACAGGGGAACAGTAGGTTCATTATATATTTATAATAATCCC
TATACTCATACCGTCGATTATTTTATCCTGAAAACAAGTAGTTATGGCTGGTTCCCTACCGGTCAGGTCAGTAACGAGCA
CTGGGAATATGTCGGACATGACCAGAACAGTGCACAGGCACTGCTTGCAAACAGAATTAATAATAAAGGGTATCTGTATC
ATGGCAAGTTGCTGGGAAATATTAATTTCTCAAATAAAGCAACCCCGGGTACAACCGGCGCATTGGTTATGGACGGCTCA
GCGAATATGTCCGGTACATTTACTCAGGAAAACGGTCGTCTGACCATTCAGGGCCACCCGGTTATCCATGCTTCAACGTC
TCAGAGTATTGCAAATACAGTCTCGTCTCTGGGCGACAATTCCGTTCTGACACAGCCCACCTCATTTACACAGGATGACT
GGGAGAACAGGACGTTCAGCTTTGGTTCGCTCGTGTTAAAAGATACAGACTTTGGTCTGGGCCGCAATGCCACACTGAAC
ACAACCATCCAGGCAGATAACTCCAGCGTCACGCTGGGCGACAGTCGGGTATTTATCGACAAAAAAGATGGCCAGGGAAC
AGCATTTACCCTTGAAGAAGGCACATCTGTTGCAACTAAAGATGCAGATAAAAGCGTCTTCAACGGCACCGTCAACCTGG
ATAATCAGTCAGTGCTGAATATCAATGAGATATTCAATGGCGGAATACAGGCGAACAACAGTACCGTGAATATCTCCTCA
GACAGTGCCGTTCTGGAGAACTCAACGCTGACCAGTACCGCCCTGAATCTGAACAAGGGAGCAAATGTTCTGGCCAGTCA
GAGTTTTGTTTCTGACGGTCCGGTGAATATTTCTGATGCCACCCTGAGTCTGAACAGCCGTCCTGATGAGGTATCTCACA
CACTTTTACCTGTATACGATTATGCCGGTTCATGGAACCTGAAGGGAGACGATGCCCGCCTGAACGTGGGGCCGTACAGT
ATGTTGTCAGGTAATATCAATGTTCAGGATAAAGGGACTGTCACCCTCGGAGGGGAAGGGGAACTGAGTCCTGACCTGAC
TCTTCAGAATCAGATGTTGTACAGCCTGTTTAACGGGTACCGCAATACCTGGAGCGGGAGCCTGAATGCACCGGATGCCA
CCGTCAGCATGACAGACACCCAGTGGTCGATGAACGGAAACTCCACGGCAGGAAATATGAAACTTAACCGGACAATAGTC
GGTTTTAACGGGGGAACATCATCGTTCACGACACTGACAACAGATAATCTGGACGCGGTTCAGTCAGCATTTGTCATGCG
TACAGACCTTAACAAGGCAGACAAACTGGTGATAAACAAGTCGGCAACAGGTCATGACAACAGCATCTGGGTTAACTTCC
TGAAAAAACCCTCTGACAAGGACACGCTTGATATTCCACTGGTCAGCGCACCTGAAGCGACAGCTGATAATCTGTTCAGG
GCATCAACACGGGTTGTGGGATTCAGTGATGTCACCCCCACCCTTAGTGTCAGAAAAGAGGACGGGAAAAAAGAGTGGGT
CCTCGATGGTTACCAGGTTGCACGTAACGACGGCCAGGGTAAGGCTGCCGCCACATTCATGCACATCAGCTATAACAACT
TCATCACTGAAGTTAACAACCTGAACAAACGCATGGGCGATTTGAGGGATATTAACGGCGAAGCCGGTACGTGGGTGCGT
CTGCTGAACGGTTCCGGCTCTGCTGATGGCGGTTTCACTGACCACTATACCCTGCTGCAGATGGGGGCTGACCGTAAGCA
CGAACTGGGAAGTATGGACCTGTTTACCGGCGTGATGGCCACCTACACTGACACAGATGCGTCAGCAGGCCTGTACAGCG
GTAAAACAAAATCATGGGGTGGTGGTTTCTATGCCAGTGGTCTGTTCCGGTCCGGCGCTTACTTTGATTTGATTGCCAAA
TATATTCACAATGAAAACAAATATGACCTGAACTTTGCCGGAGCTGGTAAACAGAACTTCCGCAGCCATTCACTGTATGC
AGGTGCAGAAGTCGGATACCGTTATCATCTGACAGATACGACGTTTGTTGAACCTCAGGCGGAACTGGTCTGGGGAAGAC
TGCAGGGCCAAACATTTAACTGGAACGACAGTGGAATGGATGTCTCAATGCGTCGTAACAGCGTTAATCCTCTGGTAGGC
AGAACCGGCGTTGTTTCCGGTAAAACCTTCAGTGGTAAGGACTGGAGTCTGACAGCCCGTGCCGGCCTGCATTATGAGTT
CGATCTGACGGACAGTGCTGACGTTCACCTGAAGGATGCAGCGGGAGAACATCAGATTAATGGCAGAAAAGACGGTCGTA
TGCTTTACGGTGTGGGGTTAAATGCCCGGTTTGGCGACAATACGCGTCTGGGGCTGGAAGTTGAACGCTCTGCATTCGGT
AAATACAACACAGATGATGCGATAAACGCTAATATTCGTTATTCATTCTGA

Protein sequence :
MNKIYALKYCYITNTVKVVSELARRVCKGSTRRGKRLSVLTSLALSALLPTVAGASTVGGNNPYQTYRDFAENKGQFQAG
ATNIPIFNNKGELVGHLDKAPMVDFSSVNVSSNPGVATLINPQYIASVKHNKGYQSVSFGDGQNSYHIVDRNEHSSSDLH
TPRLDKLVTEVAPATVTSSSTADILNPSKYSAFYRAGSGSQYIQDSQGKRHWVTGGYGYLTGGILPTSFFYHGSDGIQLY
MGGNIHDHSILPSFGEAGDSGSPLFGWNTAKGQWELVGVYSGVGGGTNLIYSLIPQSFLSQIYSEDNDAPVFFNASSGAP
LQWKFDSSTGTGSLKQGSDEYAMHGQKGSDLNAGKNLTFLGHNGQIDLENSVTQGAGSLTFTDDYTVTTSNGSTWTGAGI
IVDKDASVNWQVNGVKGDNLHKIGEGTLVVQGTGVNEGGLKVGDGTVVLNQQADSSGHVQAFSSVNIASGRPTVVLADNQ
QVNPDNISWGYRGGVLDVNGNDLTFHKLNAADYGATLGNSSDKTANITLDYQTRPADVKVNEWSSSNRGTVGSLYIYNNP
YTHTVDYFILKTSSYGWFPTGQVSNEHWEYVGHDQNSAQALLANRINNKGYLYHGKLLGNINFSNKATPGTTGALVMDGS
ANMSGTFTQENGRLTIQGHPVIHASTSQSIANTVSSLGDNSVLTQPTSFTQDDWENRTFSFGSLVLKDTDFGLGRNATLN
TTIQADNSSVTLGDSRVFIDKKDGQGTAFTLEEGTSVATKDADKSVFNGTVNLDNQSVLNINEIFNGGIQANNSTVNISS
DSAVLENSTLTSTALNLNKGANVLASQSFVSDGPVNISDATLSLNSRPDEVSHTLLPVYDYAGSWNLKGDDARLNVGPYS
MLSGNINVQDKGTVTLGGEGELSPDLTLQNQMLYSLFNGYRNTWSGSLNAPDATVSMTDTQWSMNGNSTAGNMKLNRTIV
GFNGGTSSFTTLTTDNLDAVQSAFVMRTDLNKADKLVINKSATGHDNSIWVNFLKKPSDKDTLDIPLVSAPEATADNLFR
ASTRVVGFSDVTPTLSVRKEDGKKEWVLDGYQVARNDGQGKAAATFMHISYNNFITEVNNLNKRMGDLRDINGEAGTWVR
LLNGSGSADGGFTDHYTLLQMGADRKHELGSMDLFTGVMATYTDTDASAGLYSGKTKSWGGGFYASGLFRSGAYFDLIAK
YIHNENKYDLNFAGAGKQNFRSHSLYAGAEVGYRYHLTDTTFVEPQAELVWGRLQGQTFNWNDSGMDVSMRRNSVNPLVG
RTGVVSGKTFSGKDWSLTARAGLHYEFDLTDSADVHLKDAAGEHQINGRKDGRMLYGVGLNARFGDNTRLGLEVERSAFG
KYNTDDAINANIRYSF
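
The protein sequence above is the conceptual translation of the DNA sequence, which is listed in the coding orientation even though the gene lies on the minus strand. The following is a minimal sketch in Python of that translation, assuming the standard codon assignments (which the bacterial translation table 11 shares for sense and stop codons). The names `CODON_TABLE` and `translate` are illustrative only.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon: end of the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


# First 30 nucleotides and last 12 nucleotides of the DNA sequence in this record.
print(translate("ATGAATAAAATATACGCTCTAAAATATTGT"))  # MNKIYALKYC
print(translate("TATTCATTCTGA"))                    # YSF
```

Both outputs agree with the start (MNKIYALKYC) and end (YSF) of the protein sequence listed in this record.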

• Homologs from PAI DB

Gene | GenBank Accn | Product | Virulence or Resistance | PAI or REI | Alignment Type | E-value | Identity (%)
vat | YP_851472.1 | vacuolating autotransporter | Not tested | PAI III APEC-O1 | Protein | 0.0 | 99
unnamed | CAD66214.1 | putative hemoglobin protease | Not tested | PAI III 536 | Protein | 0.0 | 99
vat | AAO21903.1 | vacuolating autotransporter toxin | Virulence | Not named | Protein | 0.0 | 98
pic | NP_838464.1 | serine protease precursor | Virulence | SHI-1 | Protein | 0.0 | 49
pic | NP_708747.3 | serine protease | Not tested | SHI-1 | Protein | 0.0 | 48
she | AAB58244.1 | mucinase | Virulence | SHI-1 | Protein | 0.0 | 48
pic | AAK00464.1 | Pic | Virulence | SHI-1 | Protein | 0.0 | 48
unnamed | CAC39286.1 | hypothetical protein | Not tested | LPA | Protein | 0.0 | 44

• Homologs from VFDB (virulence genes)

Gene | GenBank Accn | Product | ID of source DB | Alignment Type | E-value | Identity (%)
c0393 | NP_752330.1 | hemoglobin protease | VFG0904 | Protein | 0.0 | 100
c0393 | NP_752330.1 | hemoglobin protease | VFG1689 | Protein | 0.0 | 99
c0393 | NP_752330.1 | hemoglobin protease | VFG0635 | Protein | 0.0 | 48
c0393 | NP_752330.1 | hemoglobin protease | VFG0861 | Protein | 0.0 | 48
c0393 | NP_752330.1 | hemoglobin protease | VFG0903 | Protein | 0.0 | 48