Gene Information

Name : C695_02720 (C695_02720)
Accession : YP_006893015.1
Strain : Helicobacter pylori Rif1
Genome accession: NC_018937
Putative virulence/resistance : Virulence
Product : cag pathogenicity island protein (cag7)
Function : -
COG functional category : -
COG ID : -
EC number : -
Position : 554213 - 559996 bp
Length : 5784 bp
Strand : -
Note : COG2948 Type IV secretory pathway, VirB10 components
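
The Position and Length fields above agree with each other if the coordinates are read as 1-based and inclusive. That reading is an assumption about the database's convention, not something the record states. A minimal sketch of the arithmetic follows; the variable names are illustrative only.

```python
# Coordinates copied from the Position field (assumed 1-based, inclusive).
start, end = 554213, 559996

gene_length_bp = end - start + 1       # inclusive interval length
codon_count = gene_length_bp // 3      # total codons, stop codon included
protein_length_aa = codon_count - 1    # the stop codon is not translated

print(gene_length_bp, gene_length_bp % 3, protein_length_aa)
# prints: 5784 0 1927
```

The result matches the stated Length of 5784 bp, and the length is a whole number of codons.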

DNA sequence :
ATGAATGAAGAAAACGATAAACTTGAAACTTCTAAAAAAGCCCAACAAGATTCACCCCAAGATTTATCCAATGAAGAAGC
AACAGAAGCCAATCATTTTGAAAATCTTTTAAAAGAATCCAAAGAAAGCTCAGATCATCATCTTGACAACCCCACAGAAA
CTCAAACCCATTTTGATGGAGACAAGTCAGAAGAAACCCAAACTCAAATGGATTCTGAAGGTAATGAAACTTCAGAATCT
AGCAATGGCAGTCTAGCAGACAAGTTATTCAAAAAAGCCAGAAAATTAGTTGATAATAAAAAACCTTTCACTCAGCAAAA
GAATTTAGATGAAGAAACCCAAGAACTGAACGAAGAAGACGATCAAGAAAATAATGAGTATCAAGAAGAAACTCAAACGG
ACTTAATTGATGATGAAACTTCTAAAAAAACCCAACAACATTCACCCCAAGATTTATCCAATGAAGAAGCAACAGAAGCC
AATCATTTTGAAAATCTTTTAAAAGAATCCAAAGAAAGCTCAGATCATCATCTTGACAACCCCACAGAAACTCAAACCAA
TTTTGATGGAGACAAGTCAGAAGAAACCCAAACTCAAATGGATTCTGAAGGTAATGAAACTTCAGAATCTAGCAATGGCA
GTCTAGCAGACAAGTTATTCAAAAAAGCCAGAAAATTAGTTGATAATAAAAAACCTTTCACTCAGCAAAAGAATTTAGAT
GAAGAAACCCAAGAACTGAACGAAGAAGACGATCAAGAAAATAATGAGTATCAAGAAGAAACTCAAACGGACTTAATTGA
TGATGAAACTTCTAAAAAAACCCAACAACATTCACCCCAAGATTTATCCAATGAAGAAGCAACAGAAGCCAATCATTTTG
AAAATCTTTTAAAAGAATCCAAAGAAAGCTCAGATCATCATCTTGACAACCCCACAGAAACTCAAACCAATTTTGATGGA
GACAAGTCAGAAGAAATAACTGACGACTCTAACGATCAAGAGATTATCAAAGGAAGCAAAAAGAAATATATTATTGGTGG
CATTGTAGTCGCTGTTCTTATCGTGATTATTTTATTTTCTAGAAGCATTTTTCACTACTTCATGCCTTTGGAAGATAAAA
GCTCTCGTTTTAGCAAAGACAGGAATCTTTATGTCAATGATGAAATCCAAATAAGGCAAGAGTATAACCGATTGCTGAAA
GAACGGAATGAAAAAGGCAATATGATCGATAAGAATCTTTTCTTCAATGACGATCCCAATAGAACCTTATACAACTATTT
GAATATTGCAGAAATTGAGGACAAAAACCCGTTGAGAGCCTTTTATGAATGTATTAGTAATGGTGGCAACTATGAAGAAT
GTTTGAAGCTTATCAAAGACAAAAAACTTCAAGATCAGATGAAAAAGACTCTAGAGGCTTATAACGACTGCATCAAAAAT
GCCAAAACTGAAGAAGAAAGGATCAAGTGTTTAGATTTAATCAAAGATGAAAACCTAAAAAAAAGCTTACTGAACCAACA
AAAAGTTCAAGTGGCGCTAGATTGTTTGAAAAACGCTAAAACCGATGAAGAACGAAACGAGTGCCTAAAACTCATAAATG
ACCCTGAGATTAGAGAGAAATTCCGTAAGGAATTAGAGCTTCAAAAAGAGCTTCAAGAGTATAAGGATTGTATCAAAAAC
GCCAAAACAGAAGCTGAGAAAAACAAATGCTTGAAAGGCTTGTCTAAAGAAGCTATAGAGAGATTGAAACAGCAAGCGCT
AGATTGTTTGAAAAACGCTAAAACCGATGAAGAACGAAACGAGTGCTTGAAAAATATTCCCCAAGACTTGCAAAAAGAAC
TATTAGCTGATATGAGCGTCAAGGCTTACAAGGATTGCGTATCAAAAGCTAGAAATGAAAAAGAGAAACAAGAATGCGAG
AAATTGCTCACGCCTGAAGCGAGGAAAAAGTTAGAACAACAGGTTCTAGATTGTTTGAAAAACGCTAAAACCGATGAAGA
ACGAAAAAAGTGTTTGAAAGATCTCCCTAAAGACTTACAAAGCGATATTCTAGCCAAAGAGAGCCTGAAAGCTTATAAAG
ACTGCGTATCTCAAGCCAAAACCGAAGCTGAGAAAAAAGAATGCGAGAAATTACTCACCCCTGAAGCGAAAAAACTTTTA
GAAGAAGAAGCCAAAGAGAGCGTTAAGGCTTATTTGGATTGCGTATCTCAAGCCAAAACCGAAGCTGAGAAAAAAGAATG
CGAGAAATTGCTCACCCCTGAAGCGAAAAAAAAGTTAGAAGAAGCTAAAAAAAGCGTTAAAGCTTACTTGGATTGCGTAT
CAAGAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGAAATTGCTCACCCCTGAAGCGAAAAAACTTTTAGAGCAACAA
GCACTAGATTGTTTGAAAAACGCTAAAACCGATAAAGAACGAAAAAAGTGTTTGAAAGATCTCCCTAAAGACTTGCAGAA
AAAGGTTTTAGCTAAAGAAAGCGTTAAAGCTTACTTGGATTGCGTATCTCAAGCCAAAACTGAAGCTGAGAAAAAAGAAT
GCGAGAAATTACTCACCCCTGAAGCGAGAAAACTTTTAGAAGAAGCTAAAAAAAGCGTTAAGGCTTATTTGGATTGCGTA
TCTCAAGCCAAAACTGAAGCTGAGAAAAAAGAATGCGAGAAATTACTCACCCCTGAAGCGAGAAAACTTTTAGAAGAAGA
KGCCAAAGAGAGCGTTAAAGCTTACTTGGATTGCGTATCTCAAGCCAAAAACGAAGCTGAGAAAAAAGAATGCGAGAAAT
TGCTCACCCTTGAATCGAAAAAAAAGTTAGAAGAAGCTAAAAAAAGCGTTAAGGCTTATTTGGATTGCGTATCTCAAGCC
AAAACCGAAGCTGAGAAAAAAGAATGCGAAAAATTGCTCACGCCTGAAGCGAAAAAACTTTTAGAGCAACAAGCGCTAGA
TTGTTTGAAAAACGCTAAAACCGAAGCTGATAAAAAAAGGTGTGTCAAAGATCTCCCTAAAGACTTGCAGAAAAAGGTTT
TAGCCAAAGAGAGCCTGAAAGCTTATAAAGACTGCGTATCAAAAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGAAA
TTACTCACCCCTGAAGCGAAAAAACTTTTAGAAGAAGCTAAAAAAAGCGTTAAGGCTTACTTGGATTGCGTATCTCAAGC
CAAAACTGAAGCTGAGAAAAAAGAATGCGAGAAATTACTCACCCCTGAAGCGAGAAAACTCTTAGAAGAAGCTAAAGAGA
GCGTTAAAGCTTATAAAGACTGCGTATCAAAAGCTAGGAATGAAAAAGAGAAAAAAGAATGCGAGAAATTACTCACGCCT
GAAGCGAAAAAACTTTTAGAGCAACAAGTGCTAGATTGTTTGAAAAACGCTAAAACCGAAGCTGATAAAAAAAGGTGTGT
CAAAGATCTCCCTAAAGACTTGCAGAAAAAGGTTTTAGCTAAAGAGAGCGTTAAGGCTTATTTGGACTGCGTATCAAGAG
CTAGGAATGAAAAAGAGAAAAAAGAATGCGAGAAATTGCTCACCCCTGAAGCGAAAAAACTTTTAGAAGAAGCCAAAGAG
AGTCTTAAAGCTTATAAAGACTGCCTCTCTCAAGCTAGAAATGAAGAAGAAAGGAGAGCTTGCGAGAAACTACTCACGCC
TGAAGCGAGAAAACTCTTAGAGCAAGAAGTTAAGAAAAGCATTAAGGCTTATTTGGACTGCGTATCAAGAGCTAGGAATG
AAAAAGAGAAAAAAGAATGCGAGAAATTACTCACGCCTGAAGCGAGAAAATTTTTAGCGAAGCAAGTGCTAAATTGTTTG
GAAAAAGCTGGAAATGAAGAAGAAAGAAAAGCATGTCTTAAAAATCTCCCTAAAGACTTACAGGAAAATATTTTAGCTAA
AGAGAGTCTTAAAGCTTATAAAGACTGCCTCTCTCAAGCTAGAAATGAAGAAGAAAGGAGAGCTTGCGAGAAACTACTCA
CGCCTGAAGCGAGAAAACTCTTAGAGCAAGAAGTTAAGAAAAGCGTTAAGGCTTATTTGGACTGCGTATCAAGAGCTAGG
AATGAAAAAGAGAAAAAAGAATGCGAGAAATTACTCACGCCTGAAGCGAGAAAATTTTTAGCGAAAGAACTCCAACAAAA
AGATAAAGCGATCAAAGATTGCTTGAAAAACGCCGATCCTAACGACAGAGCGGCTATCATGAAGTGTTTGGATGGTTTGA
GCGATGAAGAGAAGCTCAAATACCTGCAAGAAGCTAGAGAAAAGGCTGTTGCGGATTGTTTGGCTATGGCTAAAACCGAT
GAAGAAAAAAGGAAATGCCAAAACCTTTATAGCGATTTGATCCAAGAAATCCAAAATAAAAGGACACAAAACAAACAAAA
TCAATTGAGTAAAACAGAAAGGTTGCATCAAGCAAGCGAGTGCTTGGATAACTTAGATGACCCTACTGATCAAGAGGCCA
TAGAGCAATGTTTAGAGGGCTTGAGCGATAGTGAAAGGGCGCTAATTCTAGGAATTAAACGACAAGCTGATGAAGTGGAT
CTGATTTATAGCGATCTAAGAAACCGTAAAACCTTTGATAACATGGCGGCTAAAGGTTATCCATTGTTACCAATGGATTT
CAAAAATGGCGGCGATATTGCCACTATTAACGCCACTAATGTTGATGCGGACAAAATAGCTAGCGATAATCCTATTTATG
CTTCCATAGAGCCTGATATTGCCAAGCAATACGAAACAGAAAAAACCATTAAGGATAAGAATTTAGAAGCTAAATTAGCT
AAGGCTTTAGGTGGCAATAAAAAAGATGACGATAAAGAAAAAAGTAAAAAATCCACAGCAGAAGCTAAAGCAGAAAACAA
TAAGATAGACAAAGATGTCGCAGAAACTGCCAAGAATATCAGTGAAATCGCTCTTAAGAACAAAAAAGAAAAGAGTGGGG
AATTTGTAGATGAAAATGGTAATCCCATTGATGACAAAAAGAAAGCAGAAAAACAAGATGAAACAAGCCCTGTCAAACAG
GCCTTTATAGGCAAGAGTGATCCCACATTTGTTTTAGCGCAATACACCCCCATTGAAATCACTCTGACTTCTAAAGTAGA
TGCCACTCTCACAGGTATAGTGAGTGGGGTTGTAGCCAAAGATGTATGGAACATGAACGGCACTATGATCTTATTAGACA
AAGGCACTAAGGTGTATGGGAATTATCAAAGCGTGAAAGGTGGCACACCCATTATGACACGCTTAATGATAGTCTTTACT
AAAGCCATTACGCCTGATGGTGTGATAATACCTCTAGCAAACGCTCAAGCAGCAGGCATGTTGGGTGAAGCAGGGGTAGA
TGGCTATGTGAATAATCACTTTATGAAGCGCATAGGCTTTGCTGTGATAGCAAGCGTGGTTAATAGCTTCTTGCAAACTG
CGCCTATCATAGCTCTAGATAAACTCATAGGCCTTGGCAAAGGTAGAAGTGAAAGGACACCTGAATTTAATTACGCTTTG
GGTCAAGCTATCAATGGTAGCATGCAAAGTTCAGCTCAGATGTCTAATCAAATTCTAGGGCAACTGATGAATATCCCCCC
AAGTTTTTACAAAAACGAGGGCGATAGTATTAAGATTCTCACAATGGACGATATTGATTTTAGCGGTGTGTATGATGTTA
AAATTACTAACAAATCTGTGGTAGATGAAATTATCAAACAAAGCACCAAAACTTTGTCTAGAGAACATGAAGAAATCACC
ACAAGCCCCAAAGGTGGCAATTAA

Protein sequence :
MNEENDKLETSKKAQQDSPQDLSNEEATEANHFENLLKESKESSDHHLDNPTETQTHFDGDKSEETQTQMDSEGNETSES
SNGSLADKLFKKARKLVDNKKPFTQQKNLDEETQELNEEDDQENNEYQEETQTDLIDDETSKKTQQHSPQDLSNEEATEA
NHFENLLKESKESSDHHLDNPTETQTNFDGDKSEETQTQMDSEGNETSESSNGSLADKLFKKARKLVDNKKPFTQQKNLD
EETQELNEEDDQENNEYQEETQTDLIDDETSKKTQQHSPQDLSNEEATEANHFENLLKESKESSDHHLDNPTETQTNFDG
DKSEEITDDSNDQEIIKGSKKKYIIGGIVVAVLIVIILFSRSIFHYFMPLEDKSSRFSKDRNLYVNDEIQIRQEYNRLLK
ERNEKGNMIDKNLFFNDDPNRTLYNYLNIAEIEDKNPLRAFYECISNGGNYEECLKLIKDKKLQDQMKKTLEAYNDCIKN
AKTEEERIKCLDLIKDENLKKSLLNQQKVQVALDCLKNAKTDEERNECLKLINDPEIREKFRKELELQKELQEYKDCIKN
AKTEAEKNKCLKGLSKEAIERLKQQALDCLKNAKTDEERNECLKNIPQDLQKELLADMSVKAYKDCVSKARNEKEKQECE
KLLTPEARKKLEQQVLDCLKNAKTDEERKKCLKDLPKDLQSDILAKESLKAYKDCVSQAKTEAEKKECEKLLTPEAKKLL
EEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSRARNEKEKKECEKLLTPEAKKLLEQQ
ALDCLKNAKTDKERKKCLKDLPKDLQKKVLAKESVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKKSVKAYLDCV
SQAKTEAEKKECEKLLTPEARKLLEEXAKESVKAYLDCVSQAKNEAEKKECEKLLTLESKKKLEEAKKSVKAYLDCVSQA
KTEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESLKAYKDCVSKARNEKEKKECEK
LLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKESVKAYKDCVSKARNEKEKKECEKLLTP
EAKKLLEQQVLDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSRARNEKEKKECEKLLTPEAKKLLEEAKE
SLKAYKDCLSQARNEEERRACEKLLTPEARKLLEQEVKKSIKAYLDCVSRARNEKEKKECEKLLTPEARKFLAKQVLNCL
EKAGNEEERKACLKNLPKDLQENILAKESLKAYKDCLSQARNEEERRACEKLLTPEARKLLEQEVKKSVKAYLDCVSRAR
NEKEKKECEKLLTPEARKFLAKELQQKDKAIKDCLKNADPNDRAAIMKCLDGLSDEEKLKYLQEAREKAVADCLAMAKTD
EEKRKCQNLYSDLIQEIQNKRTQNKQNQLSKTERLHQASECLDNLDDPTDQEAIEQCLEGLSDSERALILGIKRQADEVD
LIYSDLRNRKTFDNMAAKGYPLLPMDFKNGGDIATINATNVDADKIASDNPIYASIEPDIAKQYETEKTIKDKNLEAKLA
KALGGNKKDDDKEKSKKSTAEAKAENNKIDKDVAETAKNISEIALKNKKEKSGEFVDENGNPIDDKKKAEKQDETSPVKQ
AFIGKSDPTFVLAQYTPIEITLTSKVDATLTGIVSGVVAKDVWNMNGTMILLDKGTKVYGNYQSVKGGTPIMTRLMIVFT
KAITPDGVIIPLANAQAAGMLGEAGVDGYVNNHFMKRIGFAVIASVVNSFLQTAPIIALDKLIGLGKGRSERTPEFNYAL
GQAINGSMQSSAQMSNQILGQLMNIPPSFYKNEGDSIKILTMDDIDFSGVYDVKITNKSVVDEIIKQSTKTLSREHEEIT
TSPKGGN
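
The protein sequence can be related to the DNA sequence by reading codons from the first base. Although the gene lies on the minus strand, the DNA above starts with ATG and ends with TAA, so it appears to be given in coding orientation already. The sketch below is not the database's own pipeline. It assumes the standard bacterial codon assignments (NCBI translation table 11) and reports any codon containing a non-ACGT symbol as X, which would account for the single X in the protein where the DNA carries the IUPAC ambiguity code K (G or T). The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (assumed: NCBI table 11,
# whose codon assignments are the same as the standard code).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()   # drop line breaks and spaces
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Codons with ambiguity codes (e.g. K) are not in the table: emit X.
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Short fragments copied from the record above.
print(translate("ATGAATGAAGAAAACGATAAACTTGAAACT"))  # start of the gene
print(translate("GAAGAAGAKGCCAAAGAG"))              # region containing K
print(translate("ACAAGCCCCAAAGGTGGCAATTAA"))        # end of the gene
```

A fuller translator could resolve an ambiguous codon when every expansion encodes the same residue. Here GAK expands to GAG (E) or GAT (D), so X is the expected outcome either way.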

• Homologs from PAI DB

Gene GenBank Accn Product Virulence or Resistance PAI or REI Alignment Type E-val Identity
HP0527 NP_207323.1 cag pathogenicity island protein (cag7) Virulence cag PAI Protein 0.0 100
cagY AGC69792.1 cag pathogenicity island protein Y Virulence cag PAI Protein 0.0 96
cagY YP_005774542.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 95
cagY YP_005777271.1 cag pathogenicity island protein Y VirB10-like protein Virulence cag PAI Protein 0.0 95
HP0527 BAD14026.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 95
HP0527 BAD13970.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 95
HP0527 BAD14052.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 94
HP0527 BAD13833.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 94
cagY AGC69786.1 cag pathogenicity island protein Y Virulence cag PAI Protein 0.0 94
cagY AGC69789.1 cag pathogenicity island protein Y Virulence cag PAI Protein 0.0 94
HP0527 BAD13888.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 93
cagY AGC69787.1 cag pathogenicity island protein Y Virulence cag PAI Protein 0.0 92
HP0527 BAD13998.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 91
HP0527 BAD13860.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 86
cagY YP_005775730.1 cag pathogenicity island protein Y VirB10-like protein Virulence cag PAI Protein 0.0 86
orf13/14 NP_223194.1 cag island protein Virulence cag PAI Protein 0.0 85
HP0527 BAD13806.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 85
HP0527 BAD13915.1 cag pathogenicity island protein Virulence cag PAI Protein 0.0 82

• Homologs from VFDB (virulence genes)

Gene GenBank Accn Product ID of source DB Alignment Type E-val Identity
C695_02720 YP_006893015.1 cag pathogenicity island protein (cag7) VFG0287 Protein 0.0 100