Genome Sizes

The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, diploid organisms (like ourselves) contain two genomes, one inherited from the mother and the other from the father.

Table of Genome Sizes (haploid)
Organism | Base pairs | Genes | Notes
Phi-X 174 | 5,386 | 10 | virus of E. coli
Human mitochondrion | 16,569 | 37 |
Epstein-Barr virus (EBV) | 172,282 | 80 | causes mononucleosis
Nanoarchaeum equitans | 490,885 | 552 | This parasitic archaean has the smallest genome of a true organism yet found.
nucleomorph of Guillardia theta | 551,264 | 511 | all that remains of the nuclear genome of a red alga (eukaryote) engulfed long ago by another eukaryote
Mycoplasma genitalium | 580,073 | 483 | three of the smallest true organisms (this row and the two below)
Ureaplasma urealyticum | 751,719 | 652 |
Mycoplasma pneumoniae | 816,394 | 680 |
Chlamydia trachomatis | 1,042,519 | 936 | most common sexually-transmitted disease (STD) bacterium in the U.S.
Rickettsia prowazekii | 1,111,523 | 834 | bacterium that causes epidemic typhus
Treponema pallidum | 1,138,011 | 1,039 | bacterium that causes syphilis
Mimivirus | 1,181,404 | 1,262 | A virus (of an amoeba) with a genome larger than the seven cellular organisms above
Rickettsia conorii | 1,268,755 | 1,374 | causes Mediterranean spotted fever
Pelagibacter ubique | 1,308,759 | 1,354 | smallest genome yet found in a free-living organism (marine α-proteobacterium)
Borrelia burgdorferi | 1.44 x 10^6 | 1,738 | bacterium that causes Lyme disease (see note below)
Aquifex aeolicus | 1,551,335 | 1,749 | bacterium isolated from a hot spring in Yellowstone National Park
Thermoplasma acidophilum | 1,564,905 | 1,509 | an archaean that lacks a cell wall
Campylobacter jejuni | 1,641,481 | 1,708 | frequent cause of food poisoning
Helicobacter pylori | 1,667,867 | 1,589 | chief cause of stomach ulcers (not stress and diet)
Methanococcus jannaschii | 1,664,970 | 1,783 | These unicellular prokaryotes (this row and the three below) look like typical bacteria, but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third domain: Archaea.
Aeropyrum pernix | 1,669,695 | 1,885 |
Pyrococcus horikoshii | 1,738,505 | 1,994 |
Methanobacterium thermoautotrophicum | 1,751,377 | 2,008 |
Haemophilus influenzae | 1,830,138 | 1,738 | bacterium that causes middle ear infections
Thermotoga maritima | 1,860,725 | 1,879 | marine bacterium
Streptococcus pneumoniae | 2,160,837 | 2,236 | the pneumococcus
Archaeoglobus fulgidus | 2,178,400 | 2,437 | another member of the Archaea
Neisseria meningitidis | 2,184,406 | 2,185 | Group A; causes occasional epidemics of meningitis in less developed countries.
Neisseria meningitidis | 2,272,351 | 2,221 | Group B; the most frequent cause of meningitis in the U.S.
Encephalitozoon cuniculi | 2,507,519 | 1,997 | (plus 69 RNA genes); a parasitic eukaryote.
Propionibacterium acnes | 2,560,265 | 2,333 | causes acne
Listeria monocytogenes | 2,944,528 | 2,926 | 2,853 of these encode proteins; the rest RNAs
Deinococcus radiodurans | 3,284,156 | 3,187 | on 2 chromosomes and 2 plasmids; bacterium noted for its resistance to radiation damage
Synechocystis | 3,573,470 | 4,003 | a marine prokaryote, one of the cyanobacteria ("blue-green algae")
Vibrio cholerae | 4,033,460 | 3,890 | in 2 chromosomes; causes cholera
Mycobacterium tuberculosis | 4,411,532 | 3,959 | causes tuberculosis
Mycobacterium leprae | 3,268,203 | 1,604 | causes leprosy
Bacillus subtilis | 4,214,814 | 4,779 | another bacterium
E. coli | 4,639,221 | 4,377 | 4,290 of these genes encode proteins; the rest RNAs
Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti
Salmonella enterica var Typhi | 4,809,037 | 4,395 | + 2 plasmids with 372 active genes; causes typhoid fever
Salmonella enterica var Typhimurium | 4,857,432 | 4,450 | + 1 plasmid with 102 active genes
Yersinia pestis | 4,826,100 | 4,052 | on 1 chromosome + 3 plasmids; causes plague
Schizosaccharomyces pombe | 12,462,637 | 4,929 | Fission yeast. A eukaryote with fewer genes than the five prokaryotes below.
E. coli O157:H7 | 5.44 x 10^6 | 5,416 | strain that is pathogenic for humans
Ralstonia solanacearum | 5,810,922 | 5,129 | soil bacterium pathogenic for many plants; 1681 of its genes on a huge plasmid
Pseudomonas aeruginosa | 6.3 x 10^6 | 5,570 | Increasingly common cause of opportunistic infections in humans.
Streptomyces coelicolor | 6,667,507 | 7,842 | An actinomycete whose relatives provide us with many antibiotics
Sinorhizobium meliloti | 6,691,694 | 6,204 | The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids.
Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast. A eukaryote.
Cyanidioschyzon merolae | 16,520,305 | 5,331 | A unicellular red alga.
Plasmodium falciparum | 22,853,764 | 5,268 | Plus 53 RNA genes. Causes the most dangerous form of malaria.
Thalassiosira pseudonana | 34.5 x 10^6 | 11,242 | A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins
Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes.
Caenorhabditis elegans | 100,258,171 | 19,427 | The first multicellular eukaryote to be sequenced.
Arabidopsis thaliana | 115,409,949 | ~28,000 | a flowering plant (angiosperm); see note below
Drosophila melanogaster | 122,653,977 | 13,379 | the "fruit fly"
Anopheles gambiae | 278,244,063 | 13,683 | Mosquito vector of malaria.
Dogs | 2.4 x 10^9 | 19,300 |
Humans | 3.3 x 10^9 | 20,000–25,000 |
Tetraodon nigroviridis (a pufferfish) | 3.42 x 10^8 | 27,918 | Although Tetraodon seems to have about the same number of genes as we do, it has much less "junk" DNA so its total genome is about a tenth the size of ours.
Rice | 3.9 x 10^8 | 37,544 |
Amphibians | 10^9 - 10^11 | ? |
Psilotum nudum | 2.5 x 10^11 | ? | see note below

Note: The gene total for Borrelia burgdorferi is based on 853 genes on its single chromosome (of 910,724 base pairs) plus 430 genes on 11 of the 17 plasmids it contains.

Arabidopsis thaliana is a plant (in the mustard family) that has the smallest genome known in the plant kingdom and for this reason has become a favorite of plant molecular biologists. The sequences of two of its five chromosomes (#2 and #4) were published in December 1999. The others were reported in December 2000.

Even though Psilotum nudum (sometimes called the "whisk fern") is a far simpler plant than Arabidopsis (it has no true leaves, flowers, or fruit), it has 3000 times as much DNA. No one knows why, but 80% or more of it is repetitive DNA containing no genetic information. This is also the case for some amphibians, which contain 30 times as much DNA as we do but certainly are not 30 times as complex.
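The size comparisons in these notes can be checked against the table with a few lines of Python. This is only a minimal sketch: the dictionary keys and the `fold_difference` helper are illustrative names, and the amphibian entry assumes the upper end of the table's range (10^11 base pairs).

```python
# Haploid genome sizes in base pairs, copied from the table above.
# "Amphibians (upper bound)" is an assumption: the top of the table's 10^9 - 10^11 range.
GENOME_BP = {
    "Arabidopsis thaliana": 115_409_949,
    "Psilotum nudum": 2.5e11,
    "Humans": 3.3e9,
    "Tetraodon nigroviridis": 3.42e8,
    "Amphibians (upper bound)": 1e11,
}


def fold_difference(larger, smaller):
    """Return how many times bigger one haploid genome is than another."""
    return GENOME_BP[larger] / GENOME_BP[smaller]


comparisons = [
    ("Psilotum nudum", "Arabidopsis thaliana"),
    ("Amphibians (upper bound)", "Humans"),
    ("Humans", "Tetraodon nigroviridis"),
]
for larger, smaller in comparisons:
    ratio = fold_difference(larger, smaller)
    print(f"{larger} vs. {smaller}: about {ratio:,.0f}-fold")
```

With the table's figures, the Psilotum to Arabidopsis ratio comes out at roughly 2,200-fold, the same order of magnitude as the "3000 times" quoted above, while the amphibian and Tetraodon comparisons land close to the 30-fold and tenfold figures in the text.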

The total amount of DNA in the haploid genome is called its C value. The lack of a consistent relationship between the C value and the complexity of an organism (e.g., amphibians vs. mammals) is called the C value paradox.
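One way to see the paradox in the table's own numbers is to compute gene density, the number of genes per million base pairs of haploid genome. The sketch below uses a handful of rows from the table; the names `TABLE_ROWS`, `genes_per_megabase`, and `density_report` are hypothetical, chosen only for this illustration.

```python
# (organism, haploid genome size in base pairs, gene count), taken from the table above.
TABLE_ROWS = [
    ("Mycoplasma genitalium", 580_073, 483),
    ("E. coli", 4_639_221, 4_377),
    ("Saccharomyces cerevisiae", 12_495_682, 5_770),
    ("Caenorhabditis elegans", 100_258_171, 19_427),
    ("Drosophila melanogaster", 122_653_977, 13_379),
    ("Tetraodon nigroviridis", 342_000_000, 27_918),
    ("Rice", 390_000_000, 37_544),
    ("Dogs", 2_400_000_000, 19_300),
]


def genes_per_megabase(base_pairs, genes):
    """Gene density: genes per million base pairs of haploid genome."""
    return genes / (base_pairs / 1_000_000)


def density_report(rows):
    """Return (organism, density) pairs ordered from largest to smallest genome."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(name, genes_per_megabase(bp, genes)) for name, bp, genes in ordered]


for name, density in density_report(TABLE_ROWS):
    print(f"{name:28s} {density:8.1f} genes per Mb")
```

In this sample, dogs carry roughly six times as much DNA as rice yet about half as many genes, and gene density falls from several hundred genes per megabase in the bacteria to fewer than ten in dogs. Genome size alone is therefore a poor predictor of gene count.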


13 December 2005