The Transcriptome
Only a very small percentage of the DNA in vertebrate genomes encodes proteins (the "proteome") because
- the exons of most genes are separated by much-longer introns
- the genome contains vast amounts of noncoding "junk" DNA
So even when the complete sequence of a genome is known, it is often difficult to spot particular genes (open reading frames or ORFs).
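To make the difficulty concrete, here is a minimal sketch of a naive ORF scan run over a toy gene. Everything in it is an assumption for illustration: the function name `longest_orf`, the invented exon and intron sequences, and the choice to scan only the three forward reading frames. It shows that a scan of the genomic sequence runs from the first exon into the intron and stops at a stop codon there, so the true coding sequence is never recovered.

```python
# Toy illustration only: all sequences are invented, and real introns
# are usually far longer than the exons they separate.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna):
    """Return the longest ATG...stop stretch in the three forward frames."""
    best = ""
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = dna[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


# Hypothetical two-exon gene; the intron happens to carry an in-frame stop.
exon1 = "ATGGCCAAG"
intron = "GTAAGTTAGCAG"
exon2 = "GGTCTGTAA"

genomic = exon1 + intron + exon2   # what the genome sequence contains
spliced = exon1 + exon2            # what the mature mRNA encodes

print("ORF found in spliced sequence:", longest_orf(spliced))
print("ORF found in genomic sequence:", longest_orf(genomic))
```

In the spliced sequence the scan returns the whole coding sequence; in the genomic sequence it returns a stretch that ends inside the intron and never reaches the second exon.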
One approach to solving the problem is to examine a transcriptome of the organism; that is, all the RNA transcripts (mostly messenger RNAs, or mRNAs) being produced from its DNA.
It is "a" transcriptome, not "the" transcriptome, because what genes are transcribed in a cell depends on
- the kind of cell (e.g., liver cell vs. lymphocyte)
- what the cell is doing at that time, e.g.,
  - getting ready to divide by mitosis;
  - responding to the arrival of a hormone or cytokine;
  - getting ready to secrete a protein product.
Expressed sequence tags (ESTs) are short (200–500 nucleotides) DNA sequences that can be used to identify a gene that is being expressed in a cell at a particular time.
The Procedure:
- Isolate the mRNA from a particular tissue (e.g., liver)
- Treat it with reverse transcriptase. Reverse transcriptase is a DNA polymerase that uses RNA as its template. Thus it makes genetic information flow in the reverse (RNA -> DNA) of the normal direction (DNA -> RNA).
- This produces complementary DNA (cDNA). Note that cDNA differs from the normal gene in lacking the intron sequences.
- Sequence 200–500 nucleotides at both the 5′ and 3′ ends of each cDNA.
- Examine the database of the organism's genome to find a matching sequence.
- That is the gene that was expressed.
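The steps above can be sketched in a few lines of code. This is a toy model under stated assumptions, not a real analysis pipeline: the gene names, the sequences, and the helper names (`reverse_transcribe`, `end_tags`, `find_gene`) are all hypothetical, the tags are 8 nucleotides long rather than 200–500, and the genome "database" is a small dictionary searched by exact substring match.

```python
# Toy sketch of the EST procedure. All sequences and names are invented.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """Return first-strand cDNA (written 5'->3') copied from an mRNA."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


def reverse_complement(dna):
    """Return the opposite DNA strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def end_tags(cdna, length):
    """Return the tags read from the 5' and 3' ends of a cDNA."""
    return cdna[:length], cdna[-length:]


def find_gene(tag, genome_db):
    """Return names of genes that contain the tag on either strand."""
    rc = reverse_complement(tag)
    return [name for name, seq in genome_db.items() if tag in seq or rc in seq]


# Hypothetical genome "database": each gene is exon + intron + exon.
gene_a_exons = ("ATGGCCAAGCTG", "GGTCTGACTTAA")
genome_db = {
    "geneA": gene_a_exons[0] + "GTAAGTTTCAG" + gene_a_exons[1],
    "geneB": "ATGTTTCCCGGA" + "GTGAGTCCTAG" + "CATCGTAACTGA",
}

# Step 1: the mRNA isolated from the tissue (spliced, so no intron).
mrna = "".join(gene_a_exons).replace("T", "U")

# Steps 2-3: reverse transcriptase copies the mRNA into cDNA.
cdna = reverse_transcribe(mrna)

# Step 4: read a short tag from each end (8 nt here; 200-500 nt in practice).
tag_5, tag_3 = end_tags(cdna, 8)

# Steps 5-6: look each tag up in the genome to identify the expressed gene.
print(tag_5, find_gene(tag_5, genome_db))
print(tag_3, find_gene(tag_3, genome_db))
```

One design limitation is worth noting. Because cDNA lacks introns, a tag that spans an exon boundary will not match the genomic sequence as an exact substring; the toy tags here are short enough to fall within a single exon. Real searches rely on alignment tools such as BLAST, which tolerate gaps and mismatches.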
8 August 2003