Predicting Transcription Units

These are operon predictions based on intergenic distance. They come from the work:

Prediction of Transcription Units by Intergenic Distances

These files contain distance-based predictions. We expect high confidence in most cases, but exceptions are possible. The predictions will be provided in a better, more informative format in a future multigenomic version of RegulonDB.

The format:

  • 1st column: Replicon
  • 2nd column: Strand (forward or reverse)
  • 3rd column: The genes in the transcription unit, identified by their protein IDs as given in the RefSeq genome database.
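
As a minimal sketch of reading this format, one line of a prediction file could be parsed as shown here. It assumes the three columns are tab-separated and that the protein IDs in the third column are separated by commas or whitespace; neither detail is stated above, so adjust the delimiters to the actual files. The names `TranscriptionUnit` and `parse_tu_line` are hypothetical.

```python
import re
from typing import List, NamedTuple


class TranscriptionUnit(NamedTuple):
    replicon: str           # 1st column
    strand: str             # 2nd column (forward or reverse)
    protein_ids: List[str]  # 3rd column, split into individual IDs


def parse_tu_line(line: str) -> TranscriptionUnit:
    # Assumption: columns are separated by tabs.
    replicon, strand, genes = line.rstrip("\n").split("\t", 2)
    # Assumption: protein IDs are separated by commas and/or whitespace.
    protein_ids = [g for g in re.split(r"[,\s]+", genes.strip()) if g]
    return TranscriptionUnit(replicon.strip(), strand.strip(), protein_ids)


# Illustrative input only; the IDs are placeholders, not taken from a real file.
example = "NC_000913\tforward\tNP_000001 NP_000002\n"
tu = parse_tu_line(example)
print(tu.replicon, tu.strand, len(tu.protein_ids))
```

A transcription unit with a single ID in the third column would then correspond to a gene predicted to be transcribed on its own.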

Distance Graphs

These files are graphs of the intergenic distances between all adjacent genes on the same strand in all available prokaryotic genomes. They might indicate the abundance or proportion of genes in operons within each genome. The genomes were downloaded from the RefSeq genome database.
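
The quantity plotted in these graphs can be sketched as follows. This is an illustration under stated assumptions, not the code used to produce the files: genes are given as hypothetical `(start, end, strand)` tuples with 1-based inclusive coordinates on a single replicon, and the distance is taken as the number of bases between the end of one gene and the start of the next, so overlapping genes give negative values.

```python
def intergenic_distances(genes):
    """Distances between adjacent genes that lie on the same strand.

    genes: iterable of (start, end, strand) tuples, 1-based inclusive
    coordinates, all from the same replicon (an assumed input layout).
    """
    ordered = sorted(genes, key=lambda g: g[0])  # order along the replicon
    distances = []
    for (_, end1, strand1), (start2, _, strand2) in zip(ordered, ordered[1:]):
        if strand1 == strand2:
            # Assumed definition: bases strictly between the two genes;
            # a negative value means the genes overlap.
            distances.append(start2 - end1 - 1)
    return distances


# Made-up coordinates for illustration.
genes = [(1, 100, "+"), (111, 200, "+"), (250, 300, "-"), (290, 400, "-")]
print(intergenic_distances(genes))  # [10, -11]
```

A histogram of these values per genome would give a graph of the kind described above.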

Table of Genome Redundancy

This table corresponds to the published work. Currently, a table can be built at different similarity thresholds here.

Redundant organisms are indicated by an asterisk (*).

Organism | Abbreviation | GenBank Accessions
Crenarchaeota
Aeropyrum pernix | A. pernix | NC_000854
Pyrobaculum aerophilum | P. aerophilum | NC_003364
Sulfolobus solfataricus | S. solfataricus | NC_002754
Sulfolobus tokodaii | S. tokodaii | NC_003106
Euryarchaeota
Archaeoglobus fulgidus | A. fulgidus | NC_000917
Halobacterium sp. NRC-1 | Halobacterium sp | NC_002607, NC_001869, NC_002608
Methanococcus jannaschii | M. jannaschii | NC_000909, NC_001732, NC_001733
Methanothermobacter thermautotrophicus | M. thermoautotrophicum | NC_000916
Pyrococcus abyssi | P. abyssi | NC_000868
Pyrococcus furiosus DSM 3638* | P. furiosus | NC_003413
Pyrococcus horikoshii* | P. horikoshii | NC_000961
Thermoplasma acidophilum | T. acidophilum | NC_002578
Thermoplasma volcanium | T. volcanium | NC_002689
Aquificae
Aquifex aeolicus | A. aeolicus | NC_000918, NC_001880
Chlamydiales
Chlamydia muridarum | C. muridarum | NC_002620, NC_002182
Chlamydophila pneumoniae AR39 | C. pneumoniae AR39 | NC_002179
Chlamydophila pneumoniae CWL029* | C. pneumoniae CWL029 | NC_000922
Chlamydophila pneumoniae J138* | C. pneumoniae J138 | NC_002491
Chlamydia trachomatis* | C. trachomatis | NC_000117
Cyanobacteria
Nostoc sp. PCC 7120 | Nostoc sp | NC_003272, NC_003276, NC_003240, NC_003273, NC_003270, NC_003267, NC_003241
Synechocystis sp. PCC 6803 | Synechocystis PCC6803 | NC_000911
Firmicutes
Bacillus halodurans | B. halodurans | NC_002570
Bacillus subtilis | B. subtilis | NC_000964
Clostridium acetobutylicum | C. acetobutylicum | NC_003030, NC_001988
Clostridium perfringens | C. perfringens | NC_003366, NC_003042
Listeria innocua | L. innocua | NC_003212, NC_003383
Lactococcus lactis subsp. lactis | L. lactis | NC_002662
Listeria monocytogenes EGD-e* | L. monocytogenes | NC_003210
Mycoplasma genitalium | M. genitalium | NC_000908
Mycobacterium leprae | M. leprae | NC_002677
Mycoplasma pneumoniae | M. pneumoniae | NC_000912
Mycoplasma pulmonis | M. pulmonis | NC_002771
Mycobacterium tuberculosis CDC1551* | M. tuberculosis CDC1551 | NC_002755
Mycobacterium tuberculosis H37Rv* | M. tuberculosis H37Rv | NC_000962
Staphylococcus aureus subsp. aureus Mu50 | S. aureus Mu50 | NC_002758, NC_002774
Staphylococcus aureus subsp. aureus N315* | S. aureus N315 | NC_002745, NC_003140
Streptococcus pneumoniae R6 | S. pneumoniae R6 | NC_003098
Streptococcus pneumoniae TIGR4* | S. pneumoniae TIGR4 | NC_003028
Streptococcus pyogenes M1 GAS | S. pyogenes | NC_002737
Ureaplasma urealyticum | U. urealyticum | NC_002162
Proteobacteria
Agrobacterium tumefaciens str. C58 (Cereon) | A. tumefaciens C58 | NC_003062, NC_003063, NC_003064, NC_003065
Agrobacterium tumefaciens str. C58 (U. Washington)* | A. tumefaciens C58 UWash | NC_003304, NC_003305, NC_003306, NC_003308
Brucella melitensis | B. melitensis | NC_003317, NC_003318
Buchnera sp. APS | Buchnera sp | NC_002528, NC_002253, NC_002252
Caulobacter vibrioides | C. crescentus | NC_002696
Campylobacter jejuni | C. jejuni | NC_002163
Escherichia coli K12 | E. coli K12 | NC_000913
Escherichia coli O157:H7* | E. coli O157H7 | NC_002695
Escherichia coli O157:H7 EDL933* | E. coli O157H7 EDL933 | NC_002655
Haemophilus influenzae Rd | H. influenzae | NC_000907
Helicobacter pylori 26695 | H. pylori 26695 | NC_000915
Helicobacter pylori J99* | H. pylori J99 | NC_000921
Mesorhizobium loti | M. loti | NC_002678, NC_002679, NC_002682
Neisseria meningitidis MC58 | N. meningitidis MC58 | NC_003112
Neisseria meningitidis Z2491* | N. meningitidis Z2491 | NC_003116
Pseudomonas aeruginosa | P. aeruginosa | NC_002516
Pasteurella multocida* | P. multocida | NC_002663
Rickettsia conorii | R. conorii | NC_003103
Rickettsia prowazekii* | R. prowazekii | NC_000963
Ralstonia solanacearum | R. solanacearum | NC_003295, NC_003296
Sinorhizobium meliloti | S. meliloti | NC_003047, NC_003037, NC_003078
Salmonella enterica subsp. enterica serovar Typhi* | S. typhi | NC_003198, NC_003384, NC_003385
Salmonella typhimurium LT2* | S. typhimurium LT2 | NC_003197, NC_003277
Vibrio cholerae | V. cholerae | NC_002505, NC_002506
Xylella fastidiosa 9a5c | X. fastidiosa | NC_002488, NC_002489, NC_002490
Yersinia pestis* | Y. pestis | NC_003143, NC_003131, NC_003134, NC_003132
Spirochaetales
Borrelia burgdorferi | B. burgdorferi | NC_001318, NC_001903, NC_000948, NC_000949, NC_000950, NC_000951, NC_000952, NC_000953, NC_000954, NC_001904, NC_001849, NC_000955, NC_001850, NC_001851, NC_001852, NC_001853, NC_001854, NC_001855, NC_001856, NC_000957, NC_001857, NC_000956
Treponema pallidum | T. pallidum | NC_000919
Thermotogales
Thermotoga maritima | T. maritima | NC_000853
Thermus/Deinococcus
Deinococcus radiodurans | D. radiodurans | NC_001263, NC_001264, NC_000959, NC_000958

(*) Redundant organisms that should be left out to get the non-redundant prokaryotic genomes data set
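
Leaving out the asterisked organisms can be sketched in a few lines. The function name `nonredundant_accessions` is hypothetical, and the sketch assumes the table is available as plain text lines in which redundant organisms carry an asterisk and accessions look like `NC_` followed by six digits, as in the rows above.

```python
import re


def nonredundant_accessions(table_lines):
    """Collect accessions from rows that are not marked as redundant."""
    keep = []
    for line in table_lines:
        accessions = re.findall(r"NC_\d{6}", line)
        if not accessions:
            continue  # group heading, column header, or blank line
        if "*" in line:
            continue  # organism marked as redundant
        keep.extend(accessions)
    return keep


rows = [
    "Euryarchaeota",
    "Pyrococcus abyssi P. abyssi NC_000868",
    "Pyrococcus furiosus DSM 3638* P. furiosus NC_003413",
    "Aquifex aeolicus A. aeolicus NC_000918, NC_001880",
]
print(nonredundant_accessions(rows))
```

Here the Pyrococcus furiosus row is skipped because of its asterisk, and the heading line is skipped because it contains no accession.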

