Gene Information

Name : DIP2227 (DIP2227)
Accession : NP_940534.1
Strain : Corynebacterium diphtheriae NCTC13129
Genome accession: NC_002935
Putative virulence/resistance : Virulence
Product : surface-anchored fimbrial subunit
Function : -
COG functional category : M : Cell wall/membrane/envelope biogenesis
COG ID : COG4932
EC number : -
Position : 2315777 - 2319904 bp
Length : 4128 bp
Strand : -
Note : Similar to Actinomyces naeslundii hypothetical 103.4 kDa protein TR:Q9EV17 (EMBL:AJ401093) (976 aa) fasta scores: E(): 0.0003, 29.146% id in 597 aa. Note: Contains a potential sortase anchor site (LPLTG) upstream of the C-terminal region transmembrane dom
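
The Position and Length fields can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (an assumption about this database's convention, though it is consistent with the stated length) and that the gene ends in a single stop codon. The names feature_length, length_bp and protein_aa are illustrative only.

```python
# Coordinates copied from the Position field (assumed 1-based, inclusive).
START, END = 2315777, 2319904

def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature spanning start..end inclusive."""
    return end - start + 1

length_bp = feature_length(START, END)   # should agree with the Length field
codons = length_bp // 3                  # whole codons, including the stop
protein_aa = codons - 1                  # assumes one terminal stop codon

print(length_bp, length_bp % 3, protein_aa)  # prints: 4128 0 1375
```

The length is a multiple of three, and the implied protein length of 1375 residues matches the protein sequence listed in this record.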

DNA sequence :
ATGACTGCATGGTCCTTAAAGGGGCGGAATGACTTCTTGCAACGTCTTGCCGCCCTAGCTGCCGTACCCATCGTGGCCGT
ATCTTTGATGGCAGGCAGCGTTAGTGCTGCGGCCCAAGATGAAGTCGGACTTGATGCAGCGCTTGCTGACTCGGATTCTC
CACCACTAGAACCCAAGCCACAGGACTTAGAGTCTGTGGAGGGGGAAGAGGAGGCGCCCGCGCCGGAGGAGCTGGAATCT
ATTGCTGGTACCTTCGCCACCGCCGCTGCGATGGGCACAACCACGGGATCGGGCAACCTGAAATGCGATGAAGGTTACTA
TTACGCTATCCATAAGTGGGGCGAGGTACTGCAATTGCAGGAGCAGAACGGCGCATGGGGGAGGAACCCAATTAACCGGC
AGTCGTACAGCTTCCGCAGTGGTTCTTCACCGAGCGGGGATTGGAACGGTCTTGGTGTCTCTAAGGATGGCTCGACCGTA
TACGCCTATAACCGCTATAGATATGAGAACTCCTCGACGTACGGCTGGGTTGTAACTCAAGTCCAGATCGGGGACATTGT
TAATGGGCAGCGAAACGTATCGAATGTCTATAGGTTGGAAACTCCGTCTGGTCTTATTGCTGGTGCCGTAGCGCCGGATG
GCACATACTATTTCGGTGGCTACCGCTTTTCTGGGGATGTGAAAGCTGCCAAGAAAGCTGTTTATACGAACAATCCTCCG
GTGACGAACTGGACCTATTCTTACGACTCTAGTCTCGGCTATTACTACTATAACGGCACCGACGGACGCCGCTATTATTA
CAGTAGCGCTGGGTTTTGGAGCAATTCAGGCGAATTTCTGGGACGCTACTGGAGCCCCGCTGTGGCTGAAAGGTCGCGAG
AGTTTGTGCTCTATAAATATTCCGGTGGACAGATAACTACCGTCGGTATTGTTCCAATATCGACTCCCGTCCAGAATTCC
ACTTCCGCAGTCAATGGTGACTTTGCCTTTGACTCATCGGGAAATATGTACTTGCTGTTCCACGAAGGTGGCACGACCAA
CGTACAGCTCATTCCAGTGCTGGCTGACCAGCTTGAGGGAACGGGTACGACGCGGCAGATTATTCCGCAGCCGACGACAA
CCATTCAGGTGAAGCCCAATGATCTTGGTCCCTCCTATTTTAACGGTGTGTCGTTTACTGTGGATGGCTACCTCATGATC
GAAGACGGAACCGTAAATGCTCGCCAGAGTACAGCGTATTTCTACAAGGTCGATCCGACCACGGGTAATGTGGTCGCCCA
AGAAACTTTTAACAATGATGGCACCCATGAAGTCTATATGGATAGAGATGGAGGCTATGCGGCGTGGCTCTACGGCGGCC
AAAGCGACCTTGCATCCTGTGCAACTTTCTCGACGCTGGAGCTCAAGAAGACGTTCCCAACCGGCCGTGCTGCCAAGGCT
GATCAGGTGGAGCTGCAGATTTCGCGCGGCACAGGCACCAACACGAACGAAAAGCTGGCGTGGACTTCTACTACGGGCGA
AGTGATTGGCGAACAGCCAGTGGTGGCCGGCCCCGTGATCGCTAGCCCGGACGAGGTCTTCCGCCTCCGCGAGGTCCCCA
GCGTCTCAGAAGGGACCGGCAAGACTCGACTCAGCTACTACAAATCCAACTTGACCTGTAAAGACACACTCACGGGGGCA
TTCTTACCTTCGACGTATATTAAGTCGGTGGACAATACGTCCTCCACGAAGCGTGAATGGGATGTCACGATCCCGTCATA
TTCTGCATGGCAGCTCCAATGCATGTACGAAAACACCCCAGTTAAGGGTGATTTGGCGTGGACCAAGGTTGGTAAGGAGT
CCCCAGAGGACACCTCCGTCATTGAGCTTGGTGGTTCCGTGTGGTCTCTGCTTGACAGCAACAAGCAGGTGATTACCAGT
CCGATTAGCTACGAGAAGATTAAAGACTGCGACGCCAATACCGATGCAGGCTGCCTGAAAGTTGATAAGAACAAATCTTC
TGCAAAGTTCCTGGTTACCGACTTGCCATATGGCACCTACTATCTGCGTGAGGATCAAGCTCCCGATGGATATAAGCCGC
TGTCGGATCCGATTAAGTTCTCCTTCGACGTCAAGGGCATGACTATCCTGGAGAGCCCTGCCGAGCTGGACCAACGGTGG
AAGGAGTTGGCCAAGAAGAACACCGACACGAATGTCGTTGACCTGGGACAGATTCTCAACCTCAAGCCCCAGGGAACAGT
CACCTGGAAGAAGACAGACGCGGCGACCGGTGCGATTCTGCACGACTCGACGTGGAAACTGACGCGTTTGACGGATGCCT
CCGGACAGAAGATTGCTGCCGAGGCATTGGTAGACGGCCGTGACCAGTTTGAGATTAAGGACAAGACTTCGAAGACCGAA
GATGCGACTCGCGATACGAATGAGGCTACCGGTGAATTTAAGGTAGAGGACCTTCCCTATGGAAAGTGGCGTCTGGAAGA
GGTGGAGGCGCCTGATGGATACCGCATCGTGGGCGCGAACGCATATGAGTTTACGGTTAATAAAGCCAAGAAGGACGTCG
AAGCAGTTCCTGGCGGAGCTATCAAGAACTACAAGGCTCGTATTGAATGGTCCAAGGTCGATGCTGCAGATGAGAAGAAC
GCCCTTGCAGGATCGGAATGGCAGCTCACTGGACCAGACGGCAAGGAGCTTTCTGTAAAGGATTGCGTTGCATCCGAATG
TGTGCCAGAAGTGAATGACATTGACCCTGGCAAAGGTAAATTCAGCGTTGCGGAGCTCGGCAACGGCGAGTACACCCTCA
AGGAGACGAAGGCTCCAGAGGGGTACCTCGTAACCGAAAAGACGTACACGTTCAAGATCACCGATACCGGCTACTCCATC
GATGGAAACACTGAGAAGAAGTTCCCGGTTCCTTCTTCCGGCACCAACTACATCCCGTTGAGCATCGGCGGGATCACCAA
CGATAAGGACGCCGCTAAGGTCCACTGGTCCAAGGTCGACGCGGCCGCGACCTCGACGTTGCTTGGTGGCTCGGAGTGGT
CCATCGTCCCTAAAACGTCCACTGGTGCTCTTGATGAGGTAAACGCAATCAAGGTCACTGACCTAGTTGCCGGTGCGAGA
GCCGAGAAGGATACCAATACTGCGGCGGGCGAGTTCACCGTGGAGCTGCCACTCGGCACCTATGTTCTGAAGGAGACGAA
AGCACCTGCCGGATACCTCGTTAGTGATGAGGCTAAGAACGGCAAGGAGTTCACTCTGACTCGCGACAACCTCACTAAGG
TCTTGGAGTTTGGTTCTGTTACCAACAAGAAGATCGAAGCGGGGGTCACTTGGAAGAAGGTTGATGCAACCGACGCTACG
AAACTCCTAGGCGGATCCGAGTGGACCATCACACCTTATAAGGCAGACGGCACCTTAGATACCGCCAACGCCAAGAAGGT
GGTCGATAAGACTCCGAACGCGGCTGGTGTCCTCGACGAGGATCCTGTGGCTGGTCAGTTCAAGATCAATGCTCTGCCGG
CGGGCAAGTACCGCTTGGAGGAGACCAAGGCACCCGAGGGCTACGTGCTTGCCGATGCCGCCTCCGCCGGCGTGGACTTT
ACCGTCGACGATTCCAAGGCGGGCGCCGTTGTTGACTTGGACGCCATCAAGAACGAGCGCATGGACGGCGTCGTGACGTG
GACCAAGTTGGACCAGTCCGGCAAGGCTTTGCATAGCTCCGAATGGACGATCGTCAAAGTTACGGCCGACCGCAAGCCCA
TCGACGGCGCCGTGGCCATCCAGGTGACTGACTGCCAAGCAGACGCGGCATCGAAATGCACTGAGCCTGACATTGACCCG
GCGGGTGGTGCGTTCAAGGTCGACGGCTTGGAGTTTGGCGAGTACAAGCTCGTCGAGTCCAAGGCACCTGCAGGCTTCGT
CGTCGATACCACAGAACACTATTTCACAATTTCTAAGAACGGTGAGGAGATCGTGGCCGGAGCGTTTAAGAACGAACTTG
GTAAAGGCGTAAAGCTGCCACTGACCGGCGGCCTGGGAGCATTTAAGTTCCTGATCGCCGGTGGTCTCCTCGGAGTTCTA
TCGGCAGCGATGGGAGGCGCGCATGTAATGCGACGTCGCAACAGCTAA
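
The DNA sequence above begins with ATG and ends with TAA, so it appears to be given in coding orientation even though the gene lies on the minus strand. A minimal sketch of translating it with the standard genetic code follows. Only the first 30 and last 48 nucleotides are embedded here for brevity. The translate helper is hypothetical, and using the standard codon table (rather than the bacterial table, which differs only in permitted start codons) is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate a coding-strand DNA string in frame 0; '*' marks a stop."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3          # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First 30 nt and last 48 nt of the DNA sequence in this record.
head = "ATGACTGCATGGTCCTTAAAGGGGCGGAAT"
tail = "TCGGCAGCGATGGGAGGCGCGCATGTAATGCGACGTCGCAACAGCTAA"

print(translate(head))  # prints: MTAWSLKGRN
print(translate(tail))  # prints: SAAMGGAHVMRRRNS*
```

Both fragments agree with the start and end of the protein sequence given in the record.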

Protein sequence :
MTAWSLKGRNDFLQRLAALAAVPIVAVSLMAGSVSAAAQDEVGLDAALADSDSPPLEPKPQDLESVEGEEEAPAPEELES
IAGTFATAAAMGTTTGSGNLKCDEGYYYAIHKWGEVLQLQEQNGAWGRNPINRQSYSFRSGSSPSGDWNGLGVSKDGSTV
YAYNRYRYENSSTYGWVVTQVQIGDIVNGQRNVSNVYRLETPSGLIAGAVAPDGTYYFGGYRFSGDVKAAKKAVYTNNPP
VTNWTYSYDSSLGYYYYNGTDGRRYYYSSAGFWSNSGEFLGRYWSPAVAERSREFVLYKYSGGQITTVGIVPISTPVQNS
TSAVNGDFAFDSSGNMYLLFHEGGTTNVQLIPVLADQLEGTGTTRQIIPQPTTTIQVKPNDLGPSYFNGVSFTVDGYLMI
EDGTVNARQSTAYFYKVDPTTGNVVAQETFNNDGTHEVYMDRDGGYAAWLYGGQSDLASCATFSTLELKKTFPTGRAAKA
DQVELQISRGTGTNTNEKLAWTSTTGEVIGEQPVVAGPVIASPDEVFRLREVPSVSEGTGKTRLSYYKSNLTCKDTLTGA
FLPSTYIKSVDNTSSTKREWDVTIPSYSAWQLQCMYENTPVKGDLAWTKVGKESPEDTSVIELGGSVWSLLDSNKQVITS
PISYEKIKDCDANTDAGCLKVDKNKSSAKFLVTDLPYGTYYLREDQAPDGYKPLSDPIKFSFDVKGMTILESPAELDQRW
KELAKKNTDTNVVDLGQILNLKPQGTVTWKKTDAATGAILHDSTWKLTRLTDASGQKIAAEALVDGRDQFEIKDKTSKTE
DATRDTNEATGEFKVEDLPYGKWRLEEVEAPDGYRIVGANAYEFTVNKAKKDVEAVPGGAIKNYKARIEWSKVDAADEKN
ALAGSEWQLTGPDGKELSVKDCVASECVPEVNDIDPGKGKFSVAELGNGEYTLKETKAPEGYLVTEKTYTFKITDTGYSI
DGNTEKKFPVPSSGTNYIPLSIGGITNDKDAAKVHWSKVDAAATSTLLGGSEWSIVPKTSTGALDEVNAIKVTDLVAGAR
AEKDTNTAAGEFTVELPLGTYVLKETKAPAGYLVSDEAKNGKEFTLTRDNLTKVLEFGSVTNKKIEAGVTWKKVDATDAT
KLLGGSEWTITPYKADGTLDTANAKKVVDKTPNAAGVLDEDPVAGQFKINALPAGKYRLEETKAPEGYVLADAASAGVDF
TVDDSKAGAVVDLDAIKNERMDGVVTWTKLDQSGKALHSSEWTIVKVTADRKPIDGAVAIQVTDCQADAASKCTEPDIDP
AGGAFKVDGLEFGEYKLVESKAPAGFVVDTTEHYFTISKNGEEIVAGAFKNELGKGVKLPLTGGLGAFKFLIAGGLLGVL
SAAMGGAHVMRRRNS
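
The Note field mentions a potential sortase anchor site (LPLTG) near the C-terminus. A minimal sketch of locating it is shown below, using only the last 95 residues of the protein sequence above. The pattern LP.TG is the commonly cited general form of the sortase recognition motif (LPXTG). The variable names are illustrative, and a real analysis would also need to confirm the downstream hydrophobic segment and charged tail, which this sketch does not attempt.

```python
import re

# Last 95 residues of the protein sequence in this record.
c_terminal = (
    "AGGAFKVDGLEFGEYKLVESKAPAGFVVDTTEHYFTISKNGEEIVAGAFKNELGKGVKLPLTGGLGAFKFLIAGGLLGVL"
    "SAAMGGAHVMRRRNS"
)

# LPXTG: Leu-Pro-any residue-Thr-Gly.
SORTASE_MOTIF = re.compile(r"LP.TG")

match = SORTASE_MOTIF.search(c_terminal)
motif = match.group() if match else None
downstream = c_terminal[match.end():] if match else ""

print(motif)       # prints: LPLTG
print(downstream)  # residues between the motif and the C-terminus
```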

• Homologs from PAI DB

Gene  GenBank Accn    Product                                     Virulence or Resistance  PAI or REI  Alignment Type  E-value  Identity (%)
spaG  YP_005139153.1  putative surface-anchored fimbrial subunit  Virulence                Not named   Protein         0.0      99

• Homologs from VFDB (virulence genes)

Gene     GenBank Accn  Product                            ID of source DB  Alignment Type  E-value  Identity (%)
DIP2227  NP_940534.1   surface-anchored fimbrial subunit  VFG2207          Protein         0.0      100