Genetic Regulation of Gene Expression

Classic view of gene regulation

Organism complexity generally correlates with the proportion of the genome that is non-coding

  • 2% of human genome encodes proteins
  • 98% of human genome is non-coding

Genome Structure

Chromosomal territories > Chromatin > TADs > Chromosome loops > DNA sequence

TADs (Topologically Associating Domains): self-interacting genomic regions

Recall:

  • Euchromatin: loosely packed, transcriptionally active

  • Heterochromatin: tightly packed, transcriptionally inactive

One genome gives rise to different cell types

  • All cells in an organism have the same genome (Normally)
  • Specialized tissues need different proteins which need different genes

House-keeping genes vs. Tissue-specific genes

                              House-keeping genes   Tissue-specific genes
Distribution                  All cells             Specific cells
Percentage of genes           40%                   50%
Specialized gene regulation   No                    Yes
Examples                      ACTB, GAPDH           Hemoglobin

Contemporary view of gene regulation

  • Gene locus: a regulatory unit that contains the gene and its regulatory elements

How to map regulatory elements to target genes?

  • Conservation
    • Conservation studies: compare across species
  • Biochemical
    • Chromatin accessibility (DNase-seq)
    • Histone modification (ChIP-seq)
    • Transcription factor binding (ChIP-seq)
    • Chromosome interaction (chromosome conformation capture, e.g. Hi-C)
  • Genetic dissection and functional probing

Cis-regulatory elements and Trans-acting regulators

  • Trans-acting regulators:
    • DNA-binding proteins (Direct binding)
    • Co-factors (Indirect, bind via other proteins)
  • Cis-regulatory elements:
    • Promoters - Regions upstream of genes where transcription is initiated. Binding site for RNA polymerase and general transcription factors.
    • Enhancers - Distal elements that activate gene transcription by interacting with promoters. Binding sites for tissue-specific transcription factors.
    • Insulators - Boundary elements that constrain enhancer-promoter interactions. Prevent spread of heterochromatin.

Promoters:

  • Located upstream of genes, initiation site of transcription
  • Binding site for RNA polymerase II and general transcription factors
  • Have open chromatin structure revealed by ATAC-seq
  • Histone modification H3K4me3 enriched at promoters
  • Spiky peaks in chromatin accessibility profiles

Enhancers:

  • Relay information to the promoter
  • Distal regulatory elements that activate gene transcription
  • Have open chromatin structure but not unique to enhancers
  • Histone modifications H3K27ac and H3K4me1 enriched at active enhancers
  • Binding sites for multiple tissue-specific transcription factors
  • Can be located far from promoters but interact through 3D proximity
  • Often form clusters and work cooperatively

Insulators:

  • Boundary elements that constrain enhancer-promoter interactions
  • Binding sites for CTCF and cohesin proteins
  • Help form chromatin loops to delimit domains
  • Prevent spread of heterochromatin
  • Strength of insulation varies by element and context
  • Can block enhancer-promoter contacts or prevent spread of enhancer activity

Central Dogma of Molecular Biology

DNA -> Transcription -> RNA -> Translation -> Protein
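The flow from DNA to protein can be sketched in a few lines of code. This is a simplified illustration under stated assumptions: the input is taken to be the coding (sense) strand, so transcription reduces to replacing T with U, and translation starts at the first AUG and stops at the first in-frame stop codon. The names `transcribe`, `translate`, and `CODON_TABLE` are illustrative, and RNA processing is ignored entirely.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (same sequence, T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna)             # AUGGCCUAA
print(translate(mrna))  # MA
```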

  • Transcription control:
    • Trans-acting regulators (TFs)
    • Cis-regulatory elements (promoters, enhancers, insulators)
  • Post-transcriptional control:
    • RNA processing
    • RNA transport
    • mRNA stability

Alternative splicing

RNA splicing removes the introns from pre-mRNA and joins the exons so that the mRNA can be translated into a protein. In alternative splicing, different combinations of exons are joined, so a single gene can produce different forms of a protein. Over 90% of human protein-coding genes undergo some kind of alternative splicing. Splicing is mediated by large ribonucleoprotein complexes known as spliceosomes.

mRNA splicing:
  • Splice donor site: the 5' end of an intron (GU)
  • Splice acceptor site: the 3' end of an intron (AG)
  • Branch site: located close to the splice acceptor site, initiates the splicing reaction (A)
  • Splice enhancer sequences: located in exons, promote splicing
  • Splice suppressor sequences: located in exons, inhibit splicing
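The donor and acceptor rules above, and the idea that skipping an exon yields a different transcript, can be illustrated with a toy example. This is a sketch under assumptions: the exon and intron sequences are made up, the helper names `splice` and `introns_follow_gu_ag` are hypothetical, and the branch site and splice enhancer/suppressor sequences are not modelled.

```python
def splice(pre_mrna, exons, skip=()):
    """Join exons given as (start, end) half-open intervals.

    `skip` holds indices of exons to leave out (exon skipping).
    """
    kept = [exon for i, exon in enumerate(exons) if i not in skip]
    return "".join(pre_mrna[start:end] for start, end in kept)


def introns_follow_gu_ag(pre_mrna, exons):
    """Check that every intron starts with GU (donor) and ends with AG (acceptor)."""
    for (_, end_prev), (start_next, _) in zip(exons, exons[1:]):
        intron = pre_mrna[end_prev:start_next]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            return False
    return True


# Made-up sequences for illustration only
exon_seqs = ["AUGGCU", "GAACCA", "UUGUAA"]
intron_seqs = ["GUAAGUCUAACUUUCAG", "GUGAGUACUAAUCCUAG"]

# Build the pre-mRNA and record exon coordinates while concatenating
pre_mrna, exons = "", []
for i, exon in enumerate(exon_seqs):
    start = len(pre_mrna)
    pre_mrna += exon
    exons.append((start, len(pre_mrna)))
    if i < len(intron_seqs):
        pre_mrna += intron_seqs[i]

print(introns_follow_gu_ag(pre_mrna, exons))  # True
print(splice(pre_mrna, exons))                # AUGGCUGAACCAUUGUAA
print(splice(pre_mrna, exons, skip={1}))      # AUGGCUUUGUAA
```

Both outputs come from the same pre-mRNA; only the choice of exons differs, which is the essence of alternative splicing.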

miRNA (non-coding RNA)

  • translational repression, mRNA degradation
  • regulate about 30% of human genes
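One common way miRNAs recognize their targets is through base pairing between the miRNA "seed" (nucleotides 2 to 8) and a complementary site in the mRNA. The sketch below looks for perfect seed matches only. It is a simplification: the function names are hypothetical, the sequences are chosen purely for illustration, and real target prediction also considers site context and conservation.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def seed_sites(mirna, target):
    """Return start positions in `target` that pair perfectly with the miRNA seed.

    The seed is taken as miRNA nucleotides 2-8 (1-based), i.e. mirna[1:8].
    """
    site = reverse_complement_rna(mirna[1:8])
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]


# Example miRNA-like sequence and a made-up target fragment
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
target = "AAACUACCUCAGGG"

print(reverse_complement_rna(mirna[1:8]))  # CUACCUC
print(seed_sites(mirna, target))           # [3]
```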

Cis-regulatory elements (on alpha-globin gene regulation)

Enhancer Experiments:

  • Deletion of enhancers R1/R2 nearly abolishes alpha globin expression, showing they are critical
  • Deletion of R3/R4/R5 reduces expression by 40%, suggesting a facilitator role; R3, R4, and R5 work as a unit with redundancy between elements
  • Inversion of the enhancer cluster substantially reduces expression, indicating orientation is important

Facilitator concept:

  • R3/R4/R5 proposed as facilitator elements that support enhancer activity
  • They lack intrinsic enhancer activity but are important for R1/R2 function
  • Highlights enhancer clusters can work cooperatively, not just additively

Insulator Experiments:

  • Deletion of the 5' insulator does not affect alpha globin expression, but allows enhancer activity to spread to neighboring genes
  • Deletion of the 3' insulator also does not affect alpha globin expression; active transcription of genes may block/insulate enhancer activity
  • Insulator strength varies with position and context; insulators constrain but do not confer specificity to enhancer-promoter interactions

DNA variants and genetic diseases

DNA variants

  • Single nucleotide variant (SNV): a single nucleotide change
  • Structural variant (SV): a large DNA sequence change
    • Deletion: a segment of DNA is missing
    • Insertion: a segment of DNA is inserted
    • Tandem duplication: a segment of DNA is duplicated and inserted next to the original segment
    • Interspersed duplication: a segment of DNA is duplicated and inserted at a different location
    • Inversion: a segment of DNA is reversed
    • Translocation: a segment of DNA is moved to a different chromosome
    • Copy number variant (CNV): a segment of DNA is duplicated or deleted
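The variant types above can be made concrete by applying each one to a short reference string. This is a minimal sketch with illustrative function names and a made-up reference sequence; positions are 0-based, half-open intervals. One assumption to note: an inversion is written here as the reverse complement of the segment, which is how a flipped double-stranded segment reads on the reference strand.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def snv(seq, pos, base):
    """Single nucleotide variant: change one base."""
    return seq[:pos] + base + seq[pos + 1:]


def deletion(seq, start, end):
    """Remove the segment seq[start:end]."""
    return seq[:start] + seq[end:]


def insertion(seq, pos, segment):
    """Insert `segment` before position `pos`."""
    return seq[:pos] + segment + seq[pos:]


def tandem_duplication(seq, start, end):
    """Copy seq[start:end] and place the copy right after the original."""
    return seq[:end] + seq[start:end] + seq[end:]


def inversion(seq, start, end):
    """Flip seq[start:end]; on the reference strand it reads as the reverse complement."""
    flipped = "".join(DNA_COMPLEMENT[b] for b in reversed(seq[start:end]))
    return seq[:start] + flipped + seq[end:]


ref = "ACGTTGCA"  # made-up reference
print(snv(ref, 0, "G"))               # GCGTTGCA
print(deletion(ref, 2, 4))            # ACTGCA
print(insertion(ref, 2, "TT"))        # ACTTGTTGCA
print(tandem_duplication(ref, 2, 4))  # ACGTGTTGCA
print(inversion(ref, 0, 3))           # CGTTTGCA
```

The deletion and tandem duplication change the number of copies of a segment, so at a larger scale they would also count as copy number variants.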

Mutations in the genome

  • Mutations are easier to identify in the exome
  • But the highest percentage of mutations lies in non-coding regions
  • Their significance depends on their location and function

Disruption of an enhancer, promoter, insulator, splice site, etc. can cause disease

Loss of miRNA (e.g. through degradation) can cause disease