Gene Information

Name : YpsIP31758_3692 (YpsIP31758_3692)
Accession : YP_001402646.1
Strain : Yersinia pseudotuberculosis IP 31758
Genome accession : NC_009708
Putative virulence/resistance : Unknown
Product : RHS/YD repeat-containing protein
Function : -
COG functional category : M : Cell wall/membrane/envelope biogenesis
COG ID : COG3209
EC number : -
Position : 4163688 - 4167947 bp
Length : 4260 bp
Strand : +
Note : identified by match to protein family HMM PF03527; match to protein family HMM PF05488; match to protein family HMM PF05593; match to protein family HMM TIGR01643
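The Position and Length fields can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (an assumption, although it agrees with the 4260 bp stated in this record); the helper name `feature_length` is hypothetical.

```python
def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Coordinates taken from the Position field of this record.
start, end = 4163688, 4167947

length_bp = feature_length(start, end)
codons = length_bp // 3   # includes the terminal stop codon
residues = codons - 1     # amino acids expected in the translated product

print(length_bp, codons, residues)  # 4260 1420 1419
```

A length divisible by 3 is what a complete coding sequence on the + strand should have, and 1419 residues matches the length of the protein sequence listed in this record.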

DNA sequence :
ATGCTGGAAGCCGCTGCGCGTGTTGGCGATGCCATTGGTCACTCTCAGGCATTAGCCGGTCTTATCGGTGGAACGATTTT
AGGCGGGTTAATTAATGTCGCGGGGGGCATTCTTGGCGGAATGCTATTTGCCGCAGGGTGTGCCTCGGCGTGTCTGGGGG
TGGGTATCCTGCTGATTGGCGCATCGATTGCCGTGGGTATGGCAGCGAATGCTCTGGGGGAAAAGGCACGGGATGCCTGT
GTTGATGCAGGAAAAAACTCGCTCAGCCCCAGCGGGGCCATAGTCACCGGCTCCGCTAATGTTCGCACCAACAGTAAAGC
CGCCGCCGTAGCGACCCTCAGTGCCGTCACCTGCGATAAAGATAAAGCACAGCAGGTGGCACAAGGCTCCTCCTCAGTAT
TTATCAATGGCTTGCCAGCCGCCCGTCGTAACGATAAAACCACCTGTGATGCCAGTATCATGGTCGGTTCCACGAATGTA
TTCATTGGGGGAGGAACGGAAACTACGCTCCCCATTACATCTGAAATTCCTGATTGGGCCTATACCGTTTCAGACCTGAC
GATGTTTGCTGCCGGGTTGATCAGTTTTGGCGGCGCAGTCAGTCGGGGGCCAGGGGCGGTACAGAAACTGTTTGCCAAAC
TGCCTGGTGCCGACAAAATTGCCAAAATAGCCTGTCGGTTGGCACCACTGGCGATTATTTTACCGGTGGTCGGTATTCTG
ACTAACCCGGTGGATGTGACCAGTGGACAGAAGTTCCTTAACGACGAGGATGAGCGAGATTTTACGCTTGATGGTGAATT
GCCCCTATTCTGGCAGCGCCGATATCTGAGCAGCTATGTCTATGAGGGGGTATTGGGACGTGGCTGGAATTTGTTTTGGG
AAAGTGCGTTGAGTCGTGTAGACGACGGTATATTGTGGCGCAATACCTATGGAGATTATATTCCCTTCCCAGACATACCG
GCAGGTCACCAAACCTTCTGTCCCGAACAGCAGTGCTGGTTGATACATCTGGAGGACGGGCGCTGGTGTATCCGCGATGC
GGGTGAATGGGTTTACCATTACGGAAAATTTGATGCACAAGGTCTCGCCCCGCTGGCGAATATTACCGATAACGTCGGTA
ACCGCCAGTCGTTTCATTACAATGACCAACAGCAGATGGTATCCATTACGGGCACTGGCGGGCTGTCACTGCGCTGTGAC
TACCACCCAGAACGGCATCGTCTGACGGCGGTCTGGCAACAGACGCCTGACGGGGACATTATTCGCGCCCGCTATCAGTA
TAATGAGTCAGGGCAACTGGCAGCAGTACAACACCGGGATGATACGGTAGTACGCCGTTTTGGCTGGGGTGAAGATCACG
GTTTATTGCTCTGGCATGAAAATGTGGCTGGATTGCGTTGTGACTACCAGTGGCAGGAGATTGACGACATCTGGTGTGTG
GCAGAACAGCATACCTGCGAGGGTGACGGCTACCGACTGGCCTACGATGAAGAGCGGCACCAACGCACTGCGACCTATCA
GGACGGCAGTCAGGCGGTATGGATACTGGATGAACAGCACCGGGTCAGCCGTTACACCGACCGCACTGCCCATGAATGTC
AGCTCCAGTGGGACACGGCCGGGCAACCGACGGGTTACCGTTCACCGCGGGGACATCAACGCCAATGTCAGTGGGATGAA
CTGGGCCGTCTGGTCAGTGTGACGGATGCCAATGGCGCAGAAACACGTTGGCAGTATGAGCGCAATACTGACCGGCAGAC
CTTTGTGTTCTGGCCGGACGGCACGCAAGAGCGCCAGCAGTGGGATGCCCAGGGGCGACTCCTGCAGGAAACTGACCGTC
TGGGGCAATCTACGCACTATACCTACCCGCATCCACGTACCTTACTGCCGGACAGCATCACTGACGCGCTGGGAGGGCAA
AGTCAGTTGCTCTGGAGTCAGCAAGGTCAACTCACCGGCTACACCGACTGTTCCGGTCAGCCCACGCAGTGGCGCTATGA
TGCGCTGGGGCAGTTGCTCTTGCGCCGCGATGCCTTACAACAGGAAATTCGCTACCACTGGGATCCCGTGGGCAGGTTAA
CCAAAGTGACCCTGCCAGACGGTTCAACGGAACAGTTTGACTGGTCTCCGGCGGGTCAGTTAGTGCGACACCAGCAAGGC
CATAACCAGCCCCGCCATTGGCATTACAGTGTGCGCGGACAAATCCTGAGTACCACAGACCGCCTGAGCCGGGTTATCCG
CTATCGCTATGATGCCGAGGGACGCCTGGTCCATCTGGACAATGACAACGGCGGCCAGTATCACTTTAACCGGGATGCGG
AAGGTCGTTTGCTGGAAGAACAGCGTCCGGATGATACTCGTTATTCCTACACCTATAATGCCGATGGGCAGGCAACGGAT
ATCACACAACGTGGCCTGTCAGAGAACCACGCCTCACCGCCAGAGAAACCTACCCGACTGACGTATGACGCCGTTGGCCG
ACTGATAGCGCGCCATACTCTCACGGAACAGACGTGCTATCAGTGGGACAAGATGGGCAATCTACTCAGTGCCATCCGCA
CCCCGACCGAACAGGGTGAAAAGCTGGGTATTCTGACCAATACCGTGACCTTTGAACGGGATGCGTTGGGCCGCATTACT
CAGGAGCATAATGGGGCGGAGGCGCTGGCTTACCACTATGATGCGCTGGGCAACCTGACCCGACTGGAATTACCGAATGA
TGACCATTTCCAGTGGCTGCATTATGGCTCTGGGCATGTGAGTGCCATCCGCTTTAATGACCAGTTGGTCAGCGAATTCG
AGCGTGATGCATTACACCGCGAGACCCGCCGCACACAAGGTATCCTGACGCAACAACGTCAGTATGATGTACTGGGTCGC
CGCCGCTGGCAAAGTAGTATCAGCAGTCGTCTCACCGAGGCGCTCACCACGCCAGAGCAGGGAATACTGTGGCGGGCCTA
TCATTACGATGAACTGCATGAACTGGCTGCGGTAGAGGACAGCAACCGGGGCATGTTGAGCTACGGCTATGATGAAGAGG
GACGACTACGCAGTACCGTCTCGCCACACAGCGGTCAGACGACGGTGCATTATGACCGGGCCGATAACGCGTTGATGTTA
CCGTTACAGACGCCGGAGAGTTCGCCATATGCGCGGAGCAGTCAACCTTATTGTGATAATCGGCTGACACGTTGGGAACA
GTGGCAGTATCACTATGATGCCTTTGGTAACCTGTCGGAGCGACTGGAAGGCTACCGTACTCAACGCTACCGCTATGATG
GGGATAACCGGTTGGTCGGGGCCAAAGGGGATGGGCAGAAAGGTCTCTTTGAGGCGCAATATCATTATGATGCGCTGGGT
CGGCGGCTCAGCAAAGTGGTCAGGACGCCACAGGGTAACCAAGAGACTCACTTTTTATGGCAGGGTCTGCGTCTGCTGCA
AAGCCGCACGGACGAGAGCCAGCAAACCTACTGCTATGACCCGAATGAGGCATACACACCACTCGCCTGCATTGAGCGGC
GCTACGGTGAAGACACCTTGTACTGGTATCACACTGATCTGAATGGCTCGCCACAGGAAGTGACCAACGCGCAGGGTGAA
ATGGTATGGTCGGGGCAATATGGGGTGTTTGGGCAAGTTACACGCCAGACCGATGCGATGTGGCGTAACGTCAGTAAACC
GCTGGGCCAATTCAGGCAGCCATTGCGTTATGCCGGGCAATATCTGGATGACGAAACGGGGTTGCACTACACTACCTACC
GGTATTACGCGCCGGAGGTGGGAAGGTTTATCACACCCGATCCGATTGGCTTGGCGGGGGGTCTAAATCTTTATCAGTAT
GCGCCAAATCCGTTGGGGTGGATTGATCCATGGGGGTTGGCAGGTAGTCCAACGACAGCAACACACATCACTTATCAGGG
TATTGATGCTATTACAGGTAAGCCTTATGTTGGGTATGCAAGTATGCAAGGCAATCAAATAGCACAAGATGTGTTGAAAT
ATCGCTATGCTAATGACTTTAGTCGTTTTGGTGGAACTCCTCCTGAAATTTTATACGATGGGTATGGTCAGGCAGGTAAA
TATGTCACTCGTGGATTAGAGCAGCGGACATTTGAAAATCTTGGTGGACTTGACGGTACTGCGAATAAACAAAATCCAGT
AGGGCAGGGAAATGCTAGAAGAACAGAATACCTTAATGCGGCAGATGAACATCTGAGTAATAAAAATGGTAGTAGAAAAG
GAGGAGGCGGTCGATGTTAA

Protein sequence :
MLEAAARVGDAIGHSQALAGLIGGTILGGLINVAGGILGGMLFAAGCASACLGVGILLIGASIAVGMAANALGEKARDAC
VDAGKNSLSPSGAIVTGSANVRTNSKAAAVATLSAVTCDKDKAQQVAQGSSSVFINGLPAARRNDKTTCDASIMVGSTNV
FIGGGTETTLPITSEIPDWAYTVSDLTMFAAGLISFGGAVSRGPGAVQKLFAKLPGADKIAKIACRLAPLAIILPVVGIL
TNPVDVTSGQKFLNDEDERDFTLDGELPLFWQRRYLSSYVYEGVLGRGWNLFWESALSRVDDGILWRNTYGDYIPFPDIP
AGHQTFCPEQQCWLIHLEDGRWCIRDAGEWVYHYGKFDAQGLAPLANITDNVGNRQSFHYNDQQQMVSITGTGGLSLRCD
YHPERHRLTAVWQQTPDGDIIRARYQYNESGQLAAVQHRDDTVVRRFGWGEDHGLLLWHENVAGLRCDYQWQEIDDIWCV
AEQHTCEGDGYRLAYDEERHQRTATYQDGSQAVWILDEQHRVSRYTDRTAHECQLQWDTAGQPTGYRSPRGHQRQCQWDE
LGRLVSVTDANGAETRWQYERNTDRQTFVFWPDGTQERQQWDAQGRLLQETDRLGQSTHYTYPHPRTLLPDSITDALGGQ
SQLLWSQQGQLTGYTDCSGQPTQWRYDALGQLLLRRDALQQEIRYHWDPVGRLTKVTLPDGSTEQFDWSPAGQLVRHQQG
HNQPRHWHYSVRGQILSTTDRLSRVIRYRYDAEGRLVHLDNDNGGQYHFNRDAEGRLLEEQRPDDTRYSYTYNADGQATD
ITQRGLSENHASPPEKPTRLTYDAVGRLIARHTLTEQTCYQWDKMGNLLSAIRTPTEQGEKLGILTNTVTFERDALGRIT
QEHNGAEALAYHYDALGNLTRLELPNDDHFQWLHYGSGHVSAIRFNDQLVSEFERDALHRETRRTQGILTQQRQYDVLGR
RRWQSSISSRLTEALTTPEQGILWRAYHYDELHELAAVEDSNRGMLSYGYDEEGRLRSTVSPHSGQTTVHYDRADNALML
PLQTPESSPYARSSQPYCDNRLTRWEQWQYHYDAFGNLSERLEGYRTQRYRYDGDNRLVGAKGDGQKGLFEAQYHYDALG
RRLSKVVRTPQGNQETHFLWQGLRLLQSRTDESQQTYCYDPNEAYTPLACIERRYGEDTLYWYHTDLNGSPQEVTNAQGE
MVWSGQYGVFGQVTRQTDAMWRNVSKPLGQFRQPLRYAGQYLDDETGLHYTTYRYYAPEVGRFITPDPIGLAGGLNLYQY
APNPLGWIDPWGLAGSPTTATHITYQGIDAITGKPYVGYASMQGNQIAQDVLKYRYANDFSRFGGTPPEILYDGYGQAGK
YVTRGLEQRTFENLGGLDGTANKQNPVGQGNARRTEYLNAADEHLSNKNGSRKGGGGRC
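
The relationship between the DNA and protein sequences above can be illustrated with a short translation routine. This is a sketch only, not the method used to build this record: it assumes the standard codon assignments (which the bacterial genetic code shares for the codons used here), and the names `CODON_TABLE` and `translate` are hypothetical. It is applied to the first 30 and last 12 nucleotides of the DNA sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nt and last 12 nt of the DNA sequence in this record.
print(translate("ATGCTGGAAGCCGCTGCGCGTGTTGGCGAT"))  # MLEAAARVGD
print(translate("GGTCGATGTTAA"))                    # GRC
```

Both fragments agree with the start (MLEAAARVGD) and end (GRC) of the listed protein sequence; the final TAA is the stop codon.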

• Homologs from PAI DB

Gene | GenBank Accn | Product | Virulence or Resistance | PAI or REI | Alignment Type | E-value | Identity
YpsIP31758_3692 | YP_001402646.1 | RHS/YD repeat-containing protein | Not tested | YAPI | Protein | 0.0 | 100
api89 | CAF28563.1 | putative membrane-bound sugar-binding protein | Not tested | YAPI | Protein | 0.0 | 98