PAI Gene Information


Name : EC042_4089 (EC042_4089)
Accession : YP_006098372.1
PAI name : Tn2411
PAI accession : NC_017626_R1
Strain : Escherichia coli 042
Virulence or Resistance: Not determined
Product : transposase
Function : -
Note : -
Homologs in the searched genomes : 165 hits (159 protein-level, 6 DNA-level)
Publication :
    -Aslett,M.A., "Direct Submission", Submitted (25-SEP-2009) Aslett M.A., Wellcome Trust Sanger Institute, Pathogen Sequencing Unit, Wellcome Trust Genome Campus, Hinxton, Cambridge, Cambridgeshire. CB10 1SA, UNITED KINGDOM.

    -Chaudhuri,R.R., Sebaihia,M., Hobman,J.L., Webber,M.A., Leyton,D.L., Goldberg,M.D., Cunningham,A.F., Scott-Tucker,A., Ferguson,P.R., Thomas,C.M., Frankel,G., Tang,C.M., Dudley,E.G., Roberts,I.S., Rasko,D.A., Pallen,M.J., Parkhill,J., Nataro,J.P., Thomson,N., "Complete genome sequence and comparative metabolic profiling of the prototypical enteroaggregative Escherichia coli strain 042", PLoS ONE 5 (1), E8801 (2010) PUBMED 20098708 REMARK Publication Status: Online-Only.

    -Chaudhuri,R.R., Sebaihia,M., Hobman,J.L., Webber,M.A., Leyton,D.L., Goldberg,M.D., Cunningham,A.F., Scott-Tucker,A., Ferguson,P.R., Thomas,C.M., Frankel,G., Tang,C.M., Dudley,E.G., Roberts,I.S., Rasko,D.A., Pallen,M.J., Parkhill,J., Nataro,J.P., Thomson,N., "Direct Submission", Submitted (11-APR-2012) National Center for Biotechnology Information, NIH, Bethesda, MD 20894, USA.


DNA sequence :
ATGCCACGTCGTTCCATCCTGTCCGCCGCCGAGCGGGAAAGCCTGCTGGCGTTGCCGGACTCCAAGGACGACCTGATCCG
ACATTACACATTCAACGATACCGACCTCTCGATCATCCGACAGCGGCGCGGGCCAGCCAATCGGCTGGGCTTCGCGGTGC
AGCTCTGTTACCTGCGCTTTCCCGGCGTCATCCTGGGCGTCGATGAACTACCGTTCCCGCCCTTGTTGAAGCTGGTCGCC
GACCAGCTCAAGGTCGGCGTCGAAAGCTGGAACGAGTACGGCCAGCGGGAGCAGACCCGGCGCGAGCACCTGAGCGAGCT
GCAAACCGTGTTCGGTTTCCGGCCCTTCACCATGAGCCATTACCGGCAGGCCGTCCAGATGCTGACCGAGCTGGCGATGC
AAACCGACAAAGGCATCGTGCTGGCCAGCGCCTTGATCGGGCACCTGCGGCGGCAGTCGGTCATTCTGCCCGCCCTCAAC
GCCGTCGAGCGGGCGAGTGCCGAGGCGATCACCCGTGCTAACCGGCGCATCTACGACGCCTTGGCCGAACCACTGGCGGA
CGCGCATCGCCGCCGCCTCGACGATCTGCTCAAGCGCCGGGACAACGGCAAGACGACCTGGTTGGCTTGGTTGCGCCAGT
CTCCGGCCAAGCCAAATTCGCGGCATATGCTGGAACACATCGAACGCCTCAAGGCATGGCAGGCACTCGATCTGCCTACC
GGCATCGAGCGGCTGGTTCACCAGAACCGCCTGCTCAAGATTGCCCGCGAGGGCGGCCAGATGACACCCGCCGACCTGGC
CAAATTCGAGCCGCAACGGCGCTACGCCACTCTCGTGGCGCTGGCCACCGAGGGCATGGCCACCGTCACCGACGAAATCA
TCGACCTGCACGACCGCATCCTGGGTAAGCTGTTTAACGCTGCCAAGAATAAGCATCAGCAGCAGTTCCAGGCGTCAGGC
AAGGCCATCAACGCCAAGGTACGTCTGTACGGGCGCATCGGTCAGGCGCTGATCGACGCCAAGCAATCAGGCCGCGATGC
GTTTGCCGCCATCGAGGCCGTCATGTCCTGGGATTCCTTTGCCGAGAGCGTCACCGAGGCGCAGAAGCTCGCGCAACCCG
ATGACTTCGATTTCCTGCATCGCATCGGCGAGAGCTACGCCACCCTGCGCCGCTATGCACCGGAATTCCTTGCCGTGCTC
AAGCTGCGGGCCGCGCCCGCCGCCAAAAACGTGCTTGATGCCATTGAGGTGCTGCGCGGCATGAACACCGACAACGCCCG
CAAGCTGCCAGCCGATGCACCGACCGGCTTCATCAAGCCGCGCTGGCAGAAACTGGTGATGACCGACGCCGGCATCGACC
GGCGCTACTACGAACTGTGCGCGCTGTCCGAGTTGAAGAACTCCCTGCGCTCGGGCGACATCTGGGTGCAGGGTTCACGC
CAGTTCAAGGACTTCGAGGACTACCTGGTACCGCCCGAGAAGTTCACCAGCCTCAAGCAGTCCAGCGAATTGCCGCTGGC
CGTGGCCACCGACTGCGAACAATATCTGCATGAGCGGCTGACGCTGCTGGAAGCACAACTTGCCACCGTCAACCGCATGG
CGGCAGCCAACGACCTGCCGGATGCCATCATCACCGAGTCGGGCTTGAAGATCACGCCGCTGGATGCGGCGGTGCCCGAC
ACCGCGCAGGCGCTGATAGACCAGACAGCCATGGTCCTGCCGCACGTCAAGATCACCGAACTGCTGCTCGAAGTCGATGA
GTGGACGGGCTTCACCCGGCACTTCACGCACTTGAAATCGGGCGATCTGGCCAAGGACAAGAACCTGTTGTTGACCACGA
TCCTGGCCGACGCGATCAACCTGGGCCTGACCAAGATGGCCGAGTCCTGCCCCGGCACGACCTACGCGAAGCTCGCTTGG
CTGCAAGCCTGGCATACCCGCGACGAAACGTACTCGACAGCGTTGGCTGAACTGGTCAACGCTCAGTTTCGGCATCCCTT
TGCCGGGCACTGGGGCGATGGCACCACATCATCATCGGACGGACAGAATTTCCGAACCGCTAGCAAGGCAAAGAGCACGG
GGCACATCAACCCAAAATATGGCAGCAGCCCAGGACGGACTTTCTACACCCACATCTCCGACCAATACGCGCCATTCCAC
ACCAAGGTGGTCAATGTCGGCCTGCGCGACTCAACCTACGTGCTCGACGGCCTGCTGTACCACGAATCCGACCTGCGGAT
CGAGGAGCACTACACCGACACGGCGGGCTTCACCGATCACGTCTTCGCCCTGATGCACCTCTTGGGCTTCCGCTTCGCGC
CGCGCATCCGCGACCTGGGCGACACCAAGCTCTACATCCCGAAGGGCGATGCCGCCTATGACGCGCTCAAGCCGATGATC
GGCGGCACGCTCAACATCAAGCACGTCCGCGCCCATTGGGACGAAATCCTGCGGCTGGCCACCTCGATCAAGCAGGGCAC
GGTGACGGCCTCGCTGATGCTCAGGAAACTCGGCAGCTACCCGCGCCAGAACGGCTTGGCCGTCGCGCTGCGCGAGTTGG
GCCGCATCGAGCGCACGCTGTTCATCCTCGACTGGCTGCAAAGCGTCGAGCTACGCCGCCGCGTGCATGCCGGGCTGAAC
AAGGGCGAGGCGCGCAATGCGCTGGCCCGTGCCGTGTTCTTCAACCGCCTTGGTGAAATCCGTGACCGCAGTTTCGAGCA
GCAGCGCTACCGGGCCAGCGGCCTCAACCTGGTGACGGCGGCCATCGTGCTGTGGAACACGGTCTACCTGGAGCGTGCGG
CGCATGCGTTGCGCGGCAATGGTCATGCCGTCGATGACTCGCTATTGCAGTACCTGTCGCCACTCGGCTGGGAGCACATC
AACCTGACCGGTGATTACCTATGGCGCAGCAGCGCCAAGATCGGCGCGGGGAAGTTCAGGCCGCTACGGCCTCTGCAACC
GGCTTAG

Protein sequence :
MPRRSILSAAERESLLALPDSKDDLIRHYTFNDTDLSIIRQRRGPANRLGFAVQLCYLRFPGVILGVDELPFPPLLKLVA
DQLKVGVESWNEYGQREQTRREHLSELQTVFGFRPFTMSHYRQAVQMLTELAMQTDKGIVLASALIGHLRRQSVILPALN
AVERASAEAITRANRRIYDALAEPLADAHRRRLDDLLKRRDNGKTTWLAWLRQSPAKPNSRHMLEHIERLKAWQALDLPT
GIERLVHQNRLLKIAREGGQMTPADLAKFEPQRRYATLVALATEGMATVTDEIIDLHDRILGKLFNAAKNKHQQQFQASG
KAINAKVRLYGRIGQALIDAKQSGRDAFAAIEAVMSWDSFAESVTEAQKLAQPDDFDFLHRIGESYATLRRYAPEFLAVL
KLRAAPAAKNVLDAIEVLRGMNTDNARKLPADAPTGFIKPRWQKLVMTDAGIDRRYYELCALSELKNSLRSGDIWVQGSR
QFKDFEDYLVPPEKFTSLKQSSELPLAVATDCEQYLHERLTLLEAQLATVNRMAAANDLPDAIITESGLKITPLDAAVPD
TAQALIDQTAMVLPHVKITELLLEVDEWTGFTRHFTHLKSGDLAKDKNLLLTTILADAINLGLTKMAESCPGTTYAKLAW
LQAWHTRDETYSTALAELVNAQFRHPFAGHWGDGTTSSSDGQNFRTASKAKSTGHINPKYGSSPGRTFYTHISDQYAPFH
TKVVNVGLRDSTYVLDGLLYHESDLRIEEHYTDTAGFTDHVFALMHLLGFRFAPRIRDLGDTKLYIPKGDAAYDALKPMI
GGTLNIKHVRAHWDEILRLATSIKQGTVTASLMLRKLGSYPRQNGLAVALRELGRIERTLFILDWLQSVELRRRVHAGLN
KGEARNALARAVFFNRLGEIRDRSFEQQRYRASGLNLVTAAIVLWNTVYLERAAHALRGNGHAVDDSLLQYLSPLGWEHI
NLTGDYLWRSSAKIGAGKFRPLRPLQPA
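
Consistency check (illustrative sketch) :
The protein sequence above should be the translation of the DNA sequence in reading frame 1. The Python sketch below shows one way to check this. It is not part of the database record: the function name translate is hypothetical, and it assumes the standard genetic code (whose codon assignments match the bacterial table) and input with no ambiguous bases.

```python
# Minimal sketch: translate a coding DNA sequence with the standard genetic code.
# Assumption: frame 1, no ambiguous bases; whitespace and line breaks are ignored.

BASES = "TCAG"
# Amino acids in the conventional TCAG codon order; "*" marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO,
))


def translate(dna: str) -> str:
    """Translate DNA from its first base, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    # First six codons of the DNA sequence listed in this record.
    print(translate("ATGCCACGTCGTTCCATC"))
```

To check the whole record, paste the full DNA sequence (line breaks included) into translate and compare the result with the protein sequence after removing its line breaks.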