Gene Structure and Expression

How is information stored in DNA sequences converted into proteins?

  • Activation
  • Transcription
  • Processing
  • Translation

Gene transcription

  • Transcription: the process of copying a gene sequence into a sequence of RNA
  • RNA polymerase: an enzyme that synthesizes RNA from a DNA template
    • RNA polymerase I: transcribes rRNA genes
    • RNA polymerase II: transcribes protein-coding genes, snRNA genes, and others
    • RNA polymerase III: transcribes tRNA genes, 5S rRNA genes, and other small RNAs

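rRNAs are named by their S values (Svedberg units, a measure of sedimentation rate), e.g. 18S, 28S, 5.8S, 5S.

As a rule of thumb: larger S value = larger molecular weight = larger size = faster sedimentation in a centrifuge and slower migration in a gel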

Transcription Initiation

RNA polymerase binds to the promoter region of the gene, which is located upstream of the transcription start site (TSS). The promoter region of many genes contains a TATA box, a DNA sequence that indicates where transcription should begin. Proteins called transcription factors bind to the TATA box and help position RNA polymerase.

  • Upstream (上游) means towards the 5' end of the coding strand; downstream (下游) means towards the 3' end
  • Promoter (启动子): a region of DNA where transcription of a particular gene is initiated
  • TATA box: a conserved sequence (TATAA or TATAAA), located 25-35 base pairs upstream of the transcription start site
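The positional rule above can be sketched in a few lines of Python. This is only an illustration: the function name `find_tata_boxes`, the toy promoter sequence, and the choice of an exact-match search for `TATAA` are assumptions made for the example, not a real promoter-prediction method.

```python
def find_tata_boxes(dna, tss_index, motif="TATAA", window=(25, 35)):
    """Return the upstream distances (in bp) at which `motif` starts
    within `window` base pairs of the transcription start site."""
    nearest, farthest = window
    hits = []
    # Scan only the start positions that lie 25-35 bp upstream of the TSS.
    for start in range(max(0, tss_index - farthest), tss_index - nearest + 1):
        if dna[start:start + len(motif)] == motif:
            hits.append(tss_index - start)
    return hits


# Made-up coding-strand sequence (5' -> 3'); the TSS is assumed to be index 40.
promoter = "GCGCGCGCGC" + "TATAAA" + "G" * 24 + "ACTG"
tss_index = 40
print(find_tata_boxes(promoter, tss_index))
```

Real promoters tolerate variation in the motif, so an exact string match like this one would miss many genuine TATA boxes.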

Transcription Elongation

RNA polymerase moves along the DNA template strand in the 3' to 5' direction, synthesizing RNA in the 5' to 3' direction. The RNA transcript is complementary to the DNA template strand and identical to the DNA coding strand, except that the RNA transcript contains uracil (U) instead of thymine (T).

  • Template strand: the strand of DNA that is copied during the synthesis of mRNA
  • Coding strand: the strand of DNA that is not used as a template during the synthesis of mRNA
  • CTD (C-terminal domain): the C-terminal domain of RNA polymerase II. It helps recruit RNA-processing factors, including those that add the poly(A) tail to the 3' end of the mRNA transcript.
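The relationship between the template strand, the coding strand, and the RNA transcript can be checked with a short Python sketch. The function names and the nine-base example sequence are hypothetical, chosen only for illustration.

```python
# Base-pairing rules used when RNA is synthesized on a DNA template.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_from_template(template_3_to_5):
    """Build the RNA (5' -> 3') from a template strand written 3' -> 5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5.upper())


def transcribe_from_coding(coding_5_to_3):
    """The RNA matches the coding strand, with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")


coding = "ATGCCCGAT"     # coding strand, 5' -> 3'
template = "TACGGGCTA"   # template strand, 3' -> 5' (complement of coding)

print(transcribe_from_template(template))
print(transcribe_from_coding(coding))
```

Both calls give the same transcript, which is the point of the coding-strand shortcut: the coding strand can be read directly as the RNA sequence once T is replaced by U.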

Transcription Termination

RNA polymerase reaches the end of the gene and detaches from the DNA template strand. The RNA transcript is released and the DNA double helix reforms.

mRNAs are always associated with proteins as ribonucleoprotein (RNP) complexes.
Heterogeneous nuclear ribonucleoprotein particles (hnRNPs) prevent pre-mRNAs from forming secondary structures, regulate the splicing of pre-mRNA, and facilitate mRNA transport.

RNA Processing

DNA -> transcription -> primary transcript (pre-mRNA) -> add m7G cap to 5' end -> add poly(A) tail to 3' end -> remove introns, splice exons -> mature mRNA

  • 5' cap: a modified guanine nucleotide (m7G) added to the 5' end of the mRNA transcript to protect it from exonuclease attack (degradation) and to help the ribosome bind to the mRNA
  • Poly(A) tail: a string of adenine nucleotides added to the 3' end of the mRNA transcript; it stabilizes the mRNA and facilitates binding to ribosomes

Pre-mRNA splicing

  • Splice donor site: the 5' end of an intron
  • Splice acceptor site: the 3' end of an intron
  • Branch site: an adenosine within the intron, located close to the splice acceptor site; its attack on the splice donor site initiates the splicing reaction
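As a minimal sketch of what splicing does to the sequence, the Python below removes introns given their coordinates. The function name `splice`, the toy sequences, and the coordinates are assumptions for illustration. The check that each intron starts with GU (donor site) and ends with AG (acceptor site) reflects the common GU-AG rule, which these notes do not state explicitly.

```python
def splice(pre_mrna, introns):
    """Remove introns, given as (start, end) 0-based half-open intervals,
    and join the remaining exons in their original order."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Most introns begin with GU (donor site) and end with AG (acceptor site).
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} does not follow the GU-AG rule")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


exon1, intron1, exon2 = "AUGGCC", "GUAAGUCUAACAG", "GAUUGA"   # toy sequences
pre_mrna = exon1 + intron1 + exon2
mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron1))])
print(mature)
```

In the cell the spliceosome recognizes these sites itself; here the coordinates are simply supplied by hand.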

Simple vs. Complex Transcription Units

  • Simple transcription unit: a DNA sequence that contains protein-encoding exons separated by introns, plus upstream control regions. The primary transcript produced from a simple transcription unit is processed to yield a single type of mRNA, encoding a single protein. Simple transcription units are rare in humans.
  • Complex transcription unit: produces primary RNA transcripts that can be processed in more than one way, leading to the formation of mRNAs containing different combinations of exons. Each alternative mRNA is translated into a single polypeptide, with translation usually initiating at the first AUG in the mRNA.

Alternative Splicing

  • Alternative splicing: the process of generating different mature mRNAs from the same primary transcript by splicing together different combinations of exons
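To make the idea of "different combinations of exons" concrete, the sketch below enumerates the mature mRNAs obtainable when internal exons may be either included or skipped. It assumes, purely for illustration, that the first and last exons are always kept and that every combination is possible; in real genes, splicing choices are regulated and only some isoforms occur. The function name and exon sequences are hypothetical.

```python
from itertools import combinations


def isoforms(exons):
    """List candidate mature mRNAs: the first and last exons are always kept,
    and each internal exon is either included or skipped (order preserved)."""
    first, *internal, last = exons
    results = []
    for count in range(len(internal), -1, -1):
        for chosen in combinations(internal, count):
            results.append("".join([first, *chosen, last]))
    return results


exons = ["AUGGCC", "GAU", "GGA", "UGA"]   # toy exons from one primary transcript
for mrna in isoforms(exons):
    print(mrna)
```

With two optional internal exons this gives four candidate mRNAs from a single primary transcript.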

Gene Translation

Decoding the mRNA sequence to produce a polypeptide chain

RNA side

The mRNA is read in sets of three nucleotides called codons.

  • Each codon specifies a particular amino acid
  • Most amino acids are specified by more than one codon
  • Only two amino acids are specified by a single codon (Met and Trp)

e.g. An mRNA sequence of AUGCCCGAUGGAGUAGCAGCUGA would be read as AUG-CCC-GAU-GGA-GUA-GCA-GCU-GA (the last two nucleotides do not form a complete codon)

  • start codon: AUG (In most cases, the first AUG is used as the start codon)

  • Stop codons: UAA, UAG, UGA
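The codon facts above can be verified against the standard genetic code with a short Python sketch. The compact table layout (bases ordered U, C, A, G, one-letter amino acid codes, `*` for stop) is a common convention; the variable names are assumptions for the example.

```python
from collections import Counter

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Count how many codons specify each amino acid (or stop).
codons_per_amino_acid = Counter(CODON_TABLE.values())

single_codon = sorted(aa for aa, n in codons_per_amino_acid.items() if n == 1)
stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")

print(single_codon)   # one-letter codes: M = Met, W = Trp
print(stop_codons)
```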

Three reading frames

  • Reading frame: the way a cell's mRNA-translating machinery groups the mRNA nucleotides into codons

e.g. Theoretically, the mRNA sequence AUGCCCGAUGGAGUAGCAGCUGA could be read in three different reading frames:

  • AUG-CCC-GAU-GGA-GUA-GCA-GCU-GA
  • A-UGC-CCG-AUG-GAG-UAG-CAG-CUG-A
  • AU-GCC-CGA-UGG-AGU-AGC-AGC-UGA
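The three groupings can be generated mechanically. The following Python sketch, with a hypothetical helper named `reading_frames`, shifts the starting offset by 0, 1, and 2 nucleotides and splits the rest into triplets.

```python
def reading_frames(mrna):
    """Return the three ways of grouping an mRNA into codons,
    written with '-' between groups."""
    frames = []
    for offset in range(3):
        # Nucleotides before the first full codon of this frame, if any.
        leading = [mrna[:offset]] if offset else []
        codons = [mrna[i:i + 3] for i in range(offset, len(mrna), 3)]
        frames.append("-".join(leading + codons))
    return frames


for frame in reading_frames("AUGCCCGAUGGAGUAGCAGCUGA"):
    print(frame)
```

Note that the same nucleotides yield different stop codons in different frames: UAG appears in the second frame and UGA in the third, while the first frame contains none.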

Ribosome

  • Ribosome: a complex of RNA and protein that catalyzes the synthesis of polypeptides
    • Large and small subunits

tRNA (transfer RNA)

tRNAs are small RNA molecules that carry amino acids to the ribosome for polymerization into a polypeptide. Each tRNA has a three-nucleotide sequence called an anticodon that is complementary to a codon in the mRNA. The tRNA also carries the amino acid that is specified by the codon.

  • the anticodon on the tRNA base-pairs with the codon on the mRNA

e.g. The tRNA with the anticodon 5'-GCC-3' would bind to the mRNA codon 3'-CGG-5' (written 5' to 3': GGC), because the two strands pair in antiparallel orientation
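Because codon and anticodon pair antiparallel, the codon written 5' to 3' is the reverse complement of the anticodon. A minimal Python sketch follows, assuming strict Watson-Crick pairing only (real tRNAs also allow wobble pairing at the third codon position, which is ignored here); the function name is hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon_5_to_3):
    """Return the mRNA codon (5' -> 3') that pairs with the given
    anticodon (5' -> 3'), i.e. its reverse complement."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(anticodon_5_to_3.upper()))


print(codon_for_anticodon("GCC"))
```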

Translation

Ribosomes bind to the mRNA and move along it in the 5' to 3' direction. As the ribosome moves along the mRNA, each tRNA binds to the mRNA and delivers its amino acid to the ribosome. The ribosome catalyzes the formation of a peptide bond between the amino acids, forming a polypeptide chain. The ribosome continues to move along the mRNA, adding amino acids to the polypeptide chain until it reaches a stop codon.

  • Peptide bond: a covalent bond between the carboxyl group of one amino acid and the amino group of another amino acid
  • Polypeptide: a chain of amino acids linked by peptide bonds
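The decoding process described above can be sketched as a small Python function: find the first AUG, then read codon by codon until a stop codon or the end of the sequence. The compact standard-code table and the function name `translate` are assumptions for this example, and the output uses one-letter amino acid codes.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG, 5' -> 3', until a stop codon
    or until no complete codon remains."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # stop codon: release the polypeptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("AUGCCCGAUGGAGUAGCAGCUGA"))   # Met-Pro-Asp-Gly-Val-Ala-Ala
print(translate("GGAUGCCCUAAGG"))             # stops at UAA
```

The sketch leaves out everything the ribosome and tRNAs actually do (initiation factors, the A, P and E sites, wobble pairing); it only mirrors the codon-to-amino-acid lookup.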

Protein

Protein structure

  • Amino acids: the building blocks of proteins
    • Can polymerize (by eliminating water)
    • CO-NH bond (peptide bond)
  • Numbering
    • Dipeptide: 2 amino acids
    • Tripeptide: 3 amino acids
    • Oligopeptide: 4-10 amino acids
    • Polypeptide: >10 amino acids
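The naming scheme and the water-elimination rule can be summarized in a short Python sketch. The function names are hypothetical, and the length thresholds are simply the ones listed above (conventions vary between textbooks).

```python
def classify_peptide(n_amino_acids):
    """Name a chain by its number of amino acids, using the cut-offs above."""
    if n_amino_acids < 1:
        raise ValueError("a chain needs at least one amino acid")
    if n_amino_acids == 1:
        return "amino acid"
    if n_amino_acids == 2:
        return "dipeptide"
    if n_amino_acids == 3:
        return "tripeptide"
    if n_amino_acids <= 10:
        return "oligopeptide"
    return "polypeptide"


def peptide_bonds(n_amino_acids):
    """Each peptide bond joins two neighbors and releases one water molecule,
    so a linear chain of n amino acids has n - 1 bonds."""
    return max(n_amino_acids - 1, 0)


for n in (2, 3, 7, 150):
    print(n, classify_peptide(n), peptide_bonds(n))
```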

Proteins consist of one or more polypeptide chains

Types of amino acids

  • Nonpolar amino acids: hydrophobic
  • Polar amino acids: hydrophilic
  • Acidic amino acids: negatively charged
  • Basic amino acids: positively charged