Gene Information

Name : clpA (sce4989)
Accession : YP_001615632.1
Strain : Sorangium cellulosum So ce 56
Genome accession : NC_010162
Putative virulence/resistance : Virulence
Product : endopeptidase Clp
Function : -
COG functional category : O : Posttranslational modification, protein turnover, chaperones
COG ID : COG0542
EC number : 3.4.21.92
Position : 7009076 - 7011781 bp
Length : 2706 bp
Strand : -
Note : Family membership
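
The Position and Length fields can be cross-checked with a little arithmetic. The sketch below assumes the coordinates are 1-based with inclusive ends, which the record does not state; the variable names are illustrative only.

```python
# Values copied from the record; inclusive 1-based coordinates are an assumption.
start_bp = 7009076
end_bp = 7011781
listed_length_bp = 2706

span_bp = end_bp - start_bp + 1   # inclusive span of the gene
codons = span_bp // 3             # includes the terminal stop codon
residues = codons - 1             # the stop codon is not translated

print(span_bp == listed_length_bp, span_bp % 3 == 0, residues)
```

Under that assumption the span equals the listed 2706 bp, is a whole number of codons, and implies a 901-residue product.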

DNA sequence :
ATGCCGCGCATGCAGCTCGTCGACACCAAAGCCATCGTCAAGCGACTCACCAAGAACTGCACCTCTGCCCTCGAGGCGGC
GATTGGCCAGTGTGTCAACGCGCGCCACTACGAGGTCACGCTCGAGCACCTGCTCCTCGCGCTGCTCGACGACGCGAACT
CCGACATCGCGTTTCTCGTATCCCACTATGATCTCGACCCGTCGCACCTCCGCGCCGCGCTCCAGCGCAGCCTGGAAGAG
CTGAGGAAGGGCAACGCCGGGAGGCCCGTGCTCTCGCCGACGATGCTCGAGTGGATGCAGGACGCCTATCTCGTCGGATC
CATGGAGTACGGCTACCAGCGGGTCCGGAGCGGCGTGCTCTTCCAGCGCCTCGTCCAGCAGCCGACCCGGTACTCGGTGA
GCAGCATCGGCGCGTATCTCGAGGGCATCTCGAAAGACGATCTCAAGAACAACCTCCCCAAGATCGTCTCCGGCTCCAAG
GAGGAGGCGGAGGCCGCAGCCGCTGCGGCGGGCGCAGGAGCGCCCGGGGCCGGGAAGCTGCCGGCCGGGATGGCGAGCGC
CGGGCCCGATTCGGCGCTGGCGAAGTTCTGCGTCGACTACACGGGGAAGGCGCGCGCGGGGCAGATCGATCCGATCTTCG
GCCGCGAGCGGGAGATCCGGCAGGTCATCGATATCCTCGCGCGGCGGCGCAAGAACAACCCGATCATCGTCGGCGACGCG
GGCGTCGGGAAGACGGCGCTCGTCGAGGGGCTCGCGCTGCTCATCGTCGAGAGCACGCCGGAGAACCCGAAGGTGCCGCC
GCTCCTGCAGGGCGTCGACATCCTCGGGCTCGACATGGGGCTGCTCCAGGCAGGGGCCGGCGTGAAGGGCGAGTTCGAGA
ACCGGATGAAGCAGGTGATCGCCGAGGTGAAGGGGTCGTCGAAGCCGATCATCCTCTTCATCGACGAGGCGCACACGCTC
ATCGGCGCGGGCGGCCAGCAAGGCGGCGGCGATGCGGCCAACCTGCTCAAGCCGGCGCTCGCCCGCGGCGAGCTCCGGAC
GATCGCAGCCACGACGTGGAGCGAGTACAAGAAATACTTCGAGAAGGACGCGGCCCTGGAGCGGCGCTTCCAGCCGGTGA
AGGTCGACGAGCCGAGCGAGCCGGCGGCCGTCGTCATGCTCCGGGGGCTCCGGCCGAAGTTCGAGGTGGCCCACAACGTC
ATCATCCAGGATGAGGCGGTGACCGCGGCCGTGCGGCTGTCGGCGCGCTACCTGACCGGCAGGCAGCTGCCGGACAAGGC
GGTCGACCTGCTCGACACCTGCGCGGCCCGCGTGAAGGTCGCGCTGCAGCAGCGCCCCTCCGCGGTCGAGGACGCCGAGA
TCCTCATCGTGAACACCGAGACCGAGCTCAAGGCACTCGAGCGGGATCGCGACAGGGGCGTGCACATCGATGAAGAGCGG
GTCAGCGCGCTGAAGGACAGGCTCGCGAAGGCGAAGACCGAGCTCGAGACGGTGCGCGCCGCGTACGCGAAGGAGACGGC
GGGCACGCAGCGGGTCATCGACGCCCGCAAGAAGATGGACGAGGCCAAGACGAGCGAGGAGCGCGACGCCGCGCGCCGTG
AGGTGGTCAAGGCCCTCGACGAGCTCAAGCAGAGCCAGGGCGAGGTGCCGCTGGTCCGGCCGGACGTCGACGAGGCCATG
GTCGCGAGCGTGGTCGCGGCGTGGACGGGCATCCCCGTCGGCAAGATGGTCCAGGACGACGTGAAGGCGCTCCTCGAGAT
GGAGGACCGGCTGACCCGGCGGATCAAGGGGCAGACGCACGGTATCGTGACGATCTCGAAGGAGCTCCGGAGCGCGCGCG
CGGGCCTGAAGCCGCTGAGCACGCCGCAGGGCGTGTTCCTGCTCGTGGGGCCCAGCGGCGTCGGCAAGACGGAGACCGCG
CTCGGGATCGCCGACCTCATGTTCGGCGGAGAGCGGATGATGACGGTCATCAACATGTCCGAGTTCCAGGAGAAGCACAC
CGTCTCCCGGCTCATCGGCTCGCCCCCCGGCTACGTCGGCTACGGCGAGGGCGGCATGCTCACCGAGGCCGTGCGCCAGC
GGCCCTACACGGTGGTGCTCCTCGACGAGGTCGAGAAGGCCGACCCCGACGTGATGAACCTGTTCTACCAGGTGTTCGAC
AAGGGCATGCTGAGCGACGGCGAGGGGCGGCTCGTCGACTTCAAGAACACGGTCATCATCCTGACGAGCAACCTCGCGAC
CGACAAGATCACGAACATGACGGTCGCGGCGCGGGAGGAAGACCCGTCGCGCCGCCTCGACGCGGACGGCGAGTTCATCA
AGGAGATCGTCGAGTCGATCAAGCCGACGCTCTCCGCGCACTTCAAGCCGGCGCTGCTCGCGCGCATGACGACGGTGCCG
TACCTGCCGATCTCGCCCGACGCGCTCGGCGAGATCACGCGGCTCAAGCTCGACGCGCTCGTCGACCGCCTGCGCAAGAG
CCAGCGGATCGAGGCGTCGTACTCGGACGCGCTCGTCGAGACGATCGCGGCGCGCTGCACCGAGGTCGACACGGGGGCGC
GCAACATCGACCATATCCTCCGGGCGTCGCTCCTGCCGCTGCTCTCGGTCGCCGTCCTCGAGAAGATGGCGGAGGGGCCG
CTCCCGAAGCGGCTCCAGATCGGGGTCGACGCCGAGAAGAACTTCACGATCTCGTTCTCGGACTGA

Protein sequence :
MPRMQLVDTKAIVKRLTKNCTSALEAAIGQCVNARHYEVTLEHLLLALLDDANSDIAFLVSHYDLDPSHLRAALQRSLEE
LRKGNAGRPVLSPTMLEWMQDAYLVGSMEYGYQRVRSGVLFQRLVQQPTRYSVSSIGAYLEGISKDDLKNNLPKIVSGSK
EEAEAAAAAAGAGAPGAGKLPAGMASAGPDSALAKFCVDYTGKARAGQIDPIFGREREIRQVIDILARRRKNNPIIVGDA
GVGKTALVEGLALLIVESTPENPKVPPLLQGVDILGLDMGLLQAGAGVKGEFENRMKQVIAEVKGSSKPIILFIDEAHTL
IGAGGQQGGGDAANLLKPALARGELRTIAATTWSEYKKYFEKDAALERRFQPVKVDEPSEPAAVVMLRGLRPKFEVAHNV
IIQDEAVTAAVRLSARYLTGRQLPDKAVDLLDTCAARVKVALQQRPSAVEDAEILIVNTETELKALERDRDRGVHIDEER
VSALKDRLAKAKTELETVRAAYAKETAGTQRVIDARKKMDEAKTSEERDAARREVVKALDELKQSQGEVPLVRPDVDEAM
VASVVAAWTGIPVGKMVQDDVKALLEMEDRLTRRIKGQTHGIVTISKELRSARAGLKPLSTPQGVFLLVGPSGVGKTETA
LGIADLMFGGERMMTVINMSEFQEKHTVSRLIGSPPGYVGYGEGGMLTEAVRQRPYTVVLLDEVEKADPDVMNLFYQVFD
KGMLSDGEGRLVDFKNTVIILTSNLATDKITNMTVAAREEDPSRRLDADGEFIKEIVESIKPTLSAHFKPALLARMTTVP
YLPISPDALGEITRLKLDALVDRLRKSQRIEASYSDALVETIAARCTEVDTGARNIDHILRASLLPLLSVAVLEKMAEGP
LPKRLQIGVDAEKNFTISFSD
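
Although the Strand field is "-", the DNA sequence as listed begins with ATG and ends with TGA, so it appears to be given in coding orientation. A minimal sketch of checking that the DNA and protein sequences agree is shown here. It assumes the standard codon assignments, and `translate` and `CODON_TABLE` are illustrative names, not part of any database tool. Only the first 30 nucleotides are used as input.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop line breaks and spaces
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nucleotides of the DNA sequence listed in the record.
dna_start = "ATGCCGCGCATGCAGCTCGTCGACACCAAA"
print(translate(dna_start))
```

The result can be compared with the first ten residues of the protein sequence (MPRMQLVDTK). Passing the full DNA sequence to the same function would allow a check against all 901 residues.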

• Homologs from PAI DB

Gene | GenBank Accn | Product | Virulence or Resistance | PAI or REI | Alignment Type | E-val | Identity (%)
STY0294 | NP_454876.1 | ClpB-like protein | Not tested | SPI-6 | Protein | 4e-142 | 47
aec27 | YP_851418.1 | ATPase | Not tested | PAI II APEC-O1 | Protein | 3e-146 | 43
aec27 | AAQ96721.1 | Aec27 | Not tested | AGI-1 | Protein | 2e-146 | 43
clpC | YP_005163377.1 | ATP-dependent Clp protease ATP-binding subunit | Not tested | Not named | Protein | 9e-110 | 41

• Homologs from VFDB (virulence genes)

Gene | GenBank Accn | Product | ID of source DB | Alignment Type | E-val | Identity (%)
clpA | YP_001615632.1 | endopeptidase Clp | VFG2076 | Protein | 2e-157 | 46
clpA | YP_001615632.1 | endopeptidase Clp | VFG2084 | Protein | 4e-151 | 44
clpA | YP_001615632.1 | endopeptidase Clp | VFG0079 | Protein | 4e-112 | 42