nucleosome = unit of chromatin: about 200bp of DNA wrapped around a histone octamer of 2 molecules each of H2A, H2B, H3 and H4. More specifically, two H2A-H2B dimers and one H3-H4 tetramer. One molecule of H1 sits on the outside of the nucleosome and provides tighter packing -> H1 present in half the amount of the other four. Role is to compact DNA and provide another level of regulation (euchromatin vs heterochromatin). Histone variant H2A.Z replaces H2A in some nucleosomes -> suggested that proximity between an H2A.Z-containing nucleosome and the transcription start site (TSS) is an indication of gene expression, i.e. if H2A.Z is close to the TSS the gene has a higher expression level! The nucleosome is displaced from the TSS for transcription to begin, and blocks the TSS in inactive genes. So H2A.Z may destabilise the nucleosome and may also recruit RNA polymerase II.
R. Bargaje 2012 “Proximity of H2A.Z containing nucleosome to the transcription start site influences gene expression levels in the mammalian liver and brain”
In a partial digest use a nuclease that only cuts the DNA between the nucleosomes (i.e. the exposed linker DNA), giving fragments about 200bp long (and multiples of this) -> “beads on a string” model [halt the digest before all the DNA is digested to nucleotides, hence partial]
Beads on a string are not the tightest form possible. H1, on the nucleosome, allows tighter packing than this into a 30nm fibre -> could either be a single helix (one-start model, solenoid structure) or a helical ribbon (two-start model, made by zig-zagging the DNA).
(histone octamer: x2 H2A, H2B, H3 and H4, with 146bp of DNA wrapped around the histones to shorten it, plus 54bp that serve as a “linker” to link nucleosomes together -> histone H1 is associated with the nucleosome, but once the linker is digested H1 is released, so H1 does the further packing of the beads on a string into a 30nm fibre)
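The arithmetic behind the partial digest can be sketched as below. This is only an illustration using the 146bp + 54bp figures from these notes; the function names are made up for the example, and real repeat lengths vary between tissues and species.

```python
# Values taken from the notes above; real repeat lengths vary.
CORE_BP = 146      # DNA wrapped around the histone octamer
LINKER_BP = 54     # linker DNA joining neighbouring nucleosomes
REPEAT_BP = CORE_BP + LINKER_BP  # 200 bp per nucleosome repeat


def ladder_sizes(max_nucleosomes):
    """Fragment lengths expected when cuts fall only in linker DNA.

    A partial digest leaves some linkers uncut, so fragments come out
    as whole-number multiples of the repeat length.
    """
    return [n * REPEAT_BP for n in range(1, max_nucleosomes + 1)]


def nucleosome_count(dna_length_bp):
    """Approximate number of nucleosomes along a stretch of DNA."""
    return dna_length_bp // REPEAT_BP


print(ladder_sizes(4))            # [200, 400, 600, 800]
print(nucleosome_count(10_000))   # 50
```

The ladder of multiples is what shows up as regularly spaced bands on a gel.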
After 30nm fibre -> looped to get a 300nm structure, heterochromatin [euchromatin goes into the heterochromatin structure in mitosis and meiosis since chromosomes must go to the right place, easier to segregate if they’re tightly packed]
Typical eukaryotic gene: the box is transcribed into RNA from the initiation site; the first transcribed base is +1.
RNA polymerase III = tRNA, 5S rRNA and other small RNAs; these are end products and are not translated into protein
RNA polymerase II = mRNA from protein coding genes
RNA polymerase I = ribosomal RNAs
Function was found by using alpha-amanitin to inhibit the RNA polymerases -> II is inhibited first (at low concentration), so mRNA production is lost; III is inhibited only at higher concentration, losing transcription of tRNA and others; I is insensitive, so rRNA is still made.
Promoter = TATA box at about -30 (upstream), found in many eukaryotic genes, defines the start site (if you move the TATA box you move the start site; the start site is about 30 base pairs downstream of its location) & transcription factor binding sites e.g. for Sp1 -> delete the TATA box to stop transcription, showing that it is needed for initiation.
TATA box = non-coding DNA sequence (cis-regulatory element)
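The “move the TATA box, move the start site” idea can be sketched as a toy function. This is a simplification for illustration: it assumes an exact TATA match and a fixed 30bp spacing, whereas real promoters use a looser consensus and variable spacing. The function name and sequences are invented for the example.

```python
def predicted_start_sites(dna, offset=30, motif="TATA"):
    """Return 0-based positions lying `offset` bp downstream of each motif.

    Toy model only: exact motif match, fixed spacing.
    """
    sites = []
    pos = dna.find(motif)
    while pos != -1:
        tss = pos + offset
        if tss < len(dna):          # ignore predictions past the end
            sites.append(tss)
        pos = dna.find(motif, pos + 1)
    return sites


# Made-up sequence: 10 bp of filler, a TATA box, then 40 bp downstream.
promoter = "GC" * 5 + "TATAAA" + "G" * 40
print(predicted_start_sites(promoter))      # [40]

# Shifting the TATA box 5 bp downstream shifts the predicted start by 5 bp.
shifted = "G" * 5 + promoter
print(predicted_start_sites(shifted))       # [45]
```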
TATA box recognised by TFIID and TFIIA, the “recognition event” (the II in TFII stands for RNA polymerase II; written TF2 below) -> then TF2B also binds -> RNA polymerase II recognises and binds to TF2D, A & B -> TF2F is associated with the polymerase and binds with it -> TF2H binds and phosphorylates the polymerase (it is a kinase), so transcription may begin -> TF2D and A are left on the DNA for the next transcription event
Note: RNA polymerases never recognise DNA sequences, but recognise transcription factors bound to the DNA!
^ “basal transcriptional complex”: gives only a slow rate of transcription, hence “basal”, so needs additional transcription factors (activators) to raise the rate
summary: TF2D & A —> TF2B —> RNA polymerase & TF2F —> TF2H —> all dissociate but TF2D&A
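The stepwise order in the summary can be written as a small check: each step can only bind if every earlier step has bound. This is a sketch of the order given in these notes, not a model of the real kinetics; the names and the set-based interface are invented for the example.

```python
# Assembly order as summarised in the notes (each tuple binds together).
ASSEMBLY_ORDER = [
    ("TFIID", "TFIIA"),        # recognise the TATA box
    ("TFIIB",),
    ("RNA Pol II", "TFIIF"),   # polymerase arrives with TFIIF
    ("TFIIH",),                # kinase: phosphorylates the polymerase
]


def assemble(available):
    """Return the factors that manage to bind, in order.

    Assembly stops at the first step with a missing factor.
    """
    bound = []
    for step in ASSEMBLY_ORDER:
        if not all(factor in available for factor in step):
            break
        bound.extend(step)
    return bound


def can_initiate(available):
    """Transcription can begin only once TFIIH has joined the complex."""
    return "TFIIH" in assemble(available)


everything = {"TFIID", "TFIIA", "TFIIB", "RNA Pol II", "TFIIF", "TFIIH"}
print(can_initiate(everything))                    # True
print(assemble(everything - {"RNA Pol II"}))       # ['TFIID', 'TFIIA', 'TFIIB']
```

Without the polymerase, TFIIH is present but never binds, which matches the point that polymerase recruitment depends on the factors already on the DNA.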
Enhancer = interacts with the promoter to increase transcription rate; can be anywhere relative to the gene, in either orientation: action is independent of distance, orientation and position. Binding site of a transcription factor. e.g. SV40 eukaryotic virus has:
Enhancer action can open the chromatin to allow TF2D to bind more easily. How does an enhancer affect the promoter? Idea could be “tracking”, where a protein binds the enhancer and then moves along the DNA until it finds the promoter, or “looping”, where the DNA in between loops out so that the enhancer and promoter are close together in three-dimensional space.
Note: in humans many promoters contain CpG islands (i.e. CG-rich regions where the cytosine of the CpG dinucleotide is mostly unmethylated, unlike CpGs in the rest of the genome) -> act as sites of transcription initiation (about 70% of human gene promoters, including all housekeeping ones), usually won’t coincide with a TATA box, and allow transcription to start over a range of sites (not as focused). BUT e.g. -globin gene has a TATA box and a CpG island.
A. Deaton CpG islands and the regulation of transcription 2011
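A rough way to see what “CpG-rich” means is the observed/expected CpG ratio. The sketch below is an illustration, not part of the lecture material: the thresholds (length at least 200bp, GC content above 50%, observed/expected above 0.6) are the commonly quoted Gardiner-Garden and Frommer criteria, and the function names are made up.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string.

    observed/expected = (number of CpG * length) / (number of C * number of G)
    """
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length / GC / observed-expected thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


print(cpg_stats("CG" * 100))               # (1.0, 2.0)
print(looks_like_cpg_island("CG" * 100))   # True
print(looks_like_cpg_island("AT" * 100))   # False
```

Bulk genomic DNA scores well below 1 on this ratio because methylated C in CpG tends to mutate to T over time; islands escape this because they are unmethylated.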
RNA polymerase I: transcribes one gene for ribosomal RNA, which is repeated in tandem in the DNA with non-transcribed spacers in between. Makes the 28S, 18S and 5.8S rRNAs = a 45S precursor is made then cleaved. No splicing in humans; ribosomes are needed constantly so little control is needed.
self-splicing occurs in protozoa (Tetrahymena), not humans -> the intron in the primary rRNA transcript splices itself out, no proteins needed so it can be done in vitro with just ion solutions -> Nobel prize for the discovery of catalytic RNA (self-splicing). All splicing info is held within the RNA. [origin of splicing!]
RNA polymerase III: 5S rRNA of the ribosome and tRNA
5S ribosomal RNA: genes are tandemly repeated in the DNA and have internal promoters -> TF3A and TF3C bind to the internal site at about +50, TF3B binds near the +1 site, RNA polymerase III is recruited. Found by removing the upstream sequence: transcription still occurred. Deleting sequence far downstream also still allowed transcription, so the control region lies inside the gene.
as 5S rRNA concentration goes up, TF3A starts to bind to the 5S RNA itself as well as the DNA sequence -> removes TF3A from the DNA, so acts as negative feedback to lower 5S production.
Addition of 5’ cap: a methylated guanosine (7-methylguanosine) is added to the first transcribed base via a 5’-5’ triphosphate bridge [i.e. the C5’ of the first nucleotide is linked through phosphates to the C5’ of the guanosine], so it is different to the normal 3’-5’ phosphodiester bonds. The N7 position of the guanine is methylated. Post transcriptional modification. Not encoded by DNA, added de novo. Ribosome binds at the 5’ cap (guide feature) and initiates translation at the AUG codon. Poliovirus doesn’t cap its RNA -> a protein bound to the end of the RNA stands in for the cap and allows it to be translated, + polio destroys the cap-dependent translation system of the host so only virus RNA is translated.
Poly A tail: mRNA cleaved about 20-30 nt downstream of the AAUAAA sequence (the poly A signal). The AAUAAA is preserved (we cut after it) and then about 200 A residues are added for stability (protects the end from degradation, rather like a telomere). The downstream GU-rich sequence is cleaved off.
CPSF factor binds to the AAUAAA region, and CstF binds to the G/U sequence downstream of the poly adenylation site.
The complexes bind together to fold the mRNA over, exposing the poly A site for cleavage. The downstream sequence with CstF is degraded, whilst the main mRNA sequence with CPSF has a poly A tail added by poly A polymerase. 3’ ends are usually vulnerable to exonucleases, protected by the tail. Doesn’t occur on rRNA or tRNA, so a good way to select protein coding mRNA is oligo(dT) affinity chromatography. Histone mRNAs don’t have poly A tails -> have a stem loop structure at the 3’ end instead, via base pairing. During DNA replication (S phase) the stem loop is present and stabilises the mRNA, so histones are made from it; in other phases the stem loop doesn’t form -> mRNA is not kept, so no translation as we don’t need histones -> example of control by mRNA stability.
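The cleavage and tailing steps can be mimicked on a string. This is a toy sketch only: the fixed 25 nt gap is an assumed value inside the 20-30 nt range quoted above, the tail is set to 200 A residues, and the function name and sequence are invented.

```python
def cleave_and_polyadenylate(pre_mrna, gap=25, tail_length=200):
    """Cut `gap` nt downstream of the first AAUAAA and add a poly A tail.

    Returns None if there is no signal or not enough downstream sequence.
    Everything after the cut (e.g. the GU-rich region) is discarded.
    """
    signal = "AAUAAA"
    pos = pre_mrna.find(signal)
    if pos == -1:
        return None
    cut = pos + len(signal) + gap
    if cut > len(pre_mrna):
        return None
    return pre_mrna[:cut] + "A" * tail_length


# Made-up transcript: body, signal, 25 nt spacer, GU-rich region.
pre = "GGG" + "AAUAAA" + "C" * 25 + "GUGUGUGU"
mature = cleave_and_polyadenylate(pre)
print(len(mature))                   # 234
print(mature.endswith("A" * 200))    # True
print("GUGU" in mature)              # False
```

The signal itself survives in the mature message, while the GU-rich region downstream of the cut is lost, as described above.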
nuclear export of mRNAs: mRNA with bound proteins = messenger ribonucleoprotein; hnRNP C tetramer measures transcript length, Tap-p15 mediates mRNA translocation by associating with FG repeats on nucleoporins in the nuclear envelope. “Nuclear Export of Messenger RNA” 2015 J. Katahira.
Splicing: must remove the introns from the pre-mRNA (non-coding sequence present in the gene and in the primary transcript). By the time mRNA leaves the nucleus it must be fully processed, i.e. only exons. Exons exit the nucleus, introns stay.
GU —-> intron <—— AG
Thus 5’ splice site = GU, 3’ splice site = AG
5’ splice site is cut, leaving exon 1 and intron&exon2.
Exon 1 could react back to remake the previous structure, as it is held in place by the spliceosome, but a lariat is formed by the intron: the 5’ G of the intron (the G of GU) reacts with an A residue (branch point) in the intron via a 2’-5’ bond (since the 3’-OH of the A is already being used in the RNA backbone). The 5’ G cannot now back-react. Once the system has defined the branch point, the next AG (3’ splice site) is cut to release exon 2, which is joined to exon 1.
Guiding molecules: the spliceosome has U snRNAs (small nuclear RNAs, rich in uridine) to catalyse the splicing process.
Exon 1 is held in place by the spliceosome complex so cannot diffuse away. [Autoimmune disease SLE -> patients make antibodies against their own spliceosome proteins, which inhibit splicing. Used to study the splicing reaction.] Sm proteins are associated with the U snRNAs.
U1 binds to the 5’ splice site at GU, and U2 binds to the A residue in the intron, aka the branch point. U5 binds to exon 1. U4/U6 arrives and U6 replaces U1 at the GU 5’ splice site, resulting in U6 and U2 interaction -> this bends the RNA and allows the lariat to form. Now the 5’ splice site can be cut -> exon 1 released, lariat and exon 2 remain -> exon 2 released by cutting at the 3’ splice site and joined to exon 1 -> intron with associated proteins is free
All proteins are left on the lariat, to leave the RNA “clean” for translation.
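The GU...AG rule can be checked on a toy transcript. This sketch assumes the intron coordinates are already known (in the cell the snRNAs find them) and only checks the three features named above: GU at the 5’ end, AG at the 3’ end and an A inside to act as branch point. The names and sequences are made up.

```python
def splice(pre_mrna, intron_start, intron_end):
    """Remove pre_mrna[intron_start:intron_end] and join the flanking exons.

    Returns (mature_mrna, intron). Raises ValueError if the intron does not
    follow the GU...A...AG pattern.
    """
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must start with GU and end with AG")
    if "A" not in intron[2:-2]:
        raise ValueError("no branch-point A inside the intron")
    exon1 = pre_mrna[:intron_start]
    exon2 = pre_mrna[intron_end:]
    return exon1 + exon2, intron


exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACUUUCAG", "GGAUAA"
pre = exon1 + intron + exon2
mature, removed = splice(pre, len(exon1), len(exon1) + len(intron))
print(mature)    # AUGGCCGGAUAA
print(removed)   # GUAAGUCUAACUUUCAG
```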
Are all proteins made: Use Western Blotting to compare protein production in different tissues -> shows that some proteins are only translated in some tissues, hence protein control must be tissue specific.
Are all mRNAs made: Use northern blotting [RNA electrophoresis then probe the membrane] or microarrays to see whether the mRNA of every protein coding gene is made in each tissue type or not. Answer: no! Use a microarray with all genes and hybridise it with labelled mRNA -> shows that not all genes make mRNA in every tissue, so we have transcriptional control. Northern blotting shows mRNA differences for one gene between two tissues, whereas microarrays look at many genes at once.
Transcriptional control is the main control -> all DNA is present in every cell but only some mRNAs are made, hence transcription is regulated (use Northern blotting to show this.)
Is all DNA present in all tissues: DNA is usually present in all cells, although:
Therefore = DNA expression is regulated! Differential expression = gene regulation!
Is control at transcription or after? mRNA is made only from active genes: this is shown through experiments showing that mRNA from inactive genes is simply not made, rather than made and quickly degraded:
Inactive genes might be made: make a Northern blot targeted to nuclear mRNA (homogenise nucleus to get snapshot) which would show if inactive genes are transcribed-> no evidence of this, only get active genes made into mRNA and pre spliced sequences. Steady state measure so cannot prove transcriptional control only support it.
Inactive genes might get degraded quickly: measure actual rates of gene transcription by pulse labelling:
pulse labelling -> add in radioactive UTP and see what type of mRNA is synthesised (only works for RNA with a high synthesis level since the radioactivity can only be pulsed briefly. Method is very insensitive, only works for about 20 genes that are highly transcribed.)
nuclear run-on assay -> gives higher efficiency than pulse labelling: remove the cytoplasm and add radioactive UTP directly to the isolated nuclei, so only radioactive nucleotides are available (less dilute). Steps: remove cytoplasm = RNA polymerases stall on the gene -> add radioactive nucleotides -> polymerases start again and transcription “runs on” -> hybridise the extracted labelled RNA to gene-specific probes on a blot to see which mRNA is made [i.e. basically like northern blotting; see if the mRNA is present in e.g. two different tissue types]
Salivary glands of Drosophila have polytene / giant chromosomes [multiple rounds of replication produce many sister chromatids which stay fused together] due to global DNA amplification (unknown why this occurs.) Treat with steroid hormone and parts of the chromosome puff outwards (with heat, different parts puff outwards) -> puffs represent sites of transcription, puffing caused by new transcription -> prove this by giving steroid hormone and radioactive UTP to make radio-labelled RNA -> hybridise the radioactive RNA back to the chromosomes = see where it was transcribed from, see that the RNA hybridises with the puffs. If the puff disappears so does the RNA hybridisation site since they are one and the same. Visualisation of transcriptional control!
^ all proves that control is at transcription. Most expensive step of expression is translation so best to have transcriptional control to minimise lost ATP.
Precursor-RNA control: RNA processing control. In Drosophila “incorrect” splicing that fails to remove an intron can mark the precursor-mRNA for destruction; the P element transposase gene is controlled by this discard, only fully processed and made into protein in germline cells.
Alternative splicing in eukaryotes -> RNA processing control = different exon arrangements from one gene, can splice the exons differently e.g. can make either calcitonin in the thyroid or CGRP (calcitonin gene-related peptide) in the brain from the same gene. Gene has two poly A sites so that either transcript will still have a poly A tail. Calcitonin = exons 1-4, CGRP = exons 1-3 and 5&6.
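The calcitonin/CGRP choice is a lookup from tissue to exon list. The sketch below uses the exon numbers given above; the exon labels (“E1” etc.) are placeholders rather than real sequence, and the function name is invented.

```python
# Exon usage as given in the notes.
ISOFORM_EXONS = {
    "calcitonin": [1, 2, 3, 4],
    "CGRP": [1, 2, 3, 5, 6],
}

# Which product each tissue makes from the same primary transcript.
TISSUE_ISOFORM = {
    "thyroid": "calcitonin",
    "brain": "CGRP",
}


def mature_transcript(tissue):
    """Return the exon order of the mRNA made in a given tissue."""
    isoform = TISSUE_ISOFORM[tissue]
    return "-".join(f"E{n}" for n in ISOFORM_EXONS[isoform])


print(mature_transcript("thyroid"))   # E1-E2-E3-E4
print(mature_transcript("brain"))     # E1-E2-E3-E5-E6
```

Exons 1-3 are shared, so the two products differ only at the 3’ end, which is why each needs its own poly A site.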
e.g. antibodies are released from the cell by alternative splicing -> usually antibodies are made with a hydrophobic tail (to sit in the membrane) but when under attack the splicing pattern is changed to remove this exon from the C-terminal end of the immunoglobulin primary transcript, so that the antibodies can be released from the cell. So stage 1: membrane bound antibodies made in early immune response, stage 2: antigen binds, stimulates cells with the correct antibody to proliferate, change in splicing to allow secreted antibodies.
e.g. troponin always made in muscles, but spliced alternatively in different muscles
RNA stability —> casein, protein in milk -> prolactin hormone stimulates the gene and stabilises the mRNA of casein.
e.g. histone production, stabilised by the stem loop structure rather than poly A tail, occurs during S phase when histone production is needed, stem loop not made outside of S phase.
e.g. mRNA of globin gene kept stable in enucleated RBCs. Globin mRNA half-lives vary from 10 to 24 hours. Thalassemia = disease caused by a DNA mutation that inhibits globin mRNA transcription/processing, e.g. blocks polyadenylation.
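To get a feel for what a 10 versus 24 hour half-life means, the fraction of mRNA left can be worked out. This assumes simple first-order (exponential) decay with no new synthesis, which is a modelling assumption rather than something stated in these notes; the function name is made up.

```python
def remaining_fraction(hours, half_life_hours):
    """Fraction of an mRNA pool left after `hours`, assuming first-order
    decay and no new transcription (e.g. in an enucleated cell)."""
    return 0.5 ** (hours / half_life_hours)


# After one day, with the two half-lives quoted in the notes:
print(remaining_fraction(24, 24))             # 0.5
print(round(remaining_fraction(24, 10), 3))   # 0.189
```

So over a day the more stable message keeps half of its pool while the less stable one is down to under a fifth.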
e.g. iron-responsive element (IRE): iron regulatory proteins (IRPs) bind the IREs in the transferrin receptor mRNA or the DMT1 mRNA and stabilise the mRNA (when iron is low).
Translational control —> in humans occurs after fertilisation. All egg cells have mRNA for the necessary proteins in them, but they are only translated in the fertilised egg since it is a large energy cost.
miRNA: regulatory RNA, can: (1) induce degradation of an mRNA by binding (cleavage of the dsRNA); (2) inhibit translation by binding to block the machinery; (3) bind to DNA to turn euchromatin -> heterochromatin
Differentiation intention kept over generations due to chromatin structure: cartilage cells cultured in different growth medium, lose differentiated state to become stem cells —> gene expression changes, collagen expression lost —> will return to being cartilage cells in normal growth medium [memory of differentiation held in cells somehow.]
cells know what they are going to become —> Drosophila, larval stage then pupa then adult. In larval stage tissues have differentiated, and have undifferentiated “imaginal” discs along the length: these cells form the adult body, larval differentiated tissue falls away. If you move imaginal discs around it retains its differentiation intention despite a new environment, so differentiation is an innate cell property and not environmentally based: “commitment.” Commitment maintained over generations and will differentiate on signal.
Lineage differences stem from chromatin structures!
Gene unwinds before transcription back into the beads on a string structure
^^Proof: isolate chromatin from cells -> digest with DNase I, isolate the DNA, cut with restriction enzymes to see changes, gene bands on Southern blot disappear over time due to digestion. Digestion patterns are tissue specific! Euchromatin is digested before heterochromatin due to its looser packing. So active genes are held in a looser form. e.g. albumin gene in liver is digested before vitellogenin.
DNA methylation: methylation of C residues in eukaryotes (A residues in prokaryotes) = gene silencing. In a CG sequence the CG on the opposite strand usually has the same methylation pattern. Methyl group is put on C5 of cytosine, the same carbon that carries the methyl in thymine. Lack of methylation = active genes.
acetylate the histones -> add an acetyl group (CH3CO-) at lysine residues. Activates transcription, either by recruiting TFs or by blocking the inhibition of transcription [i.e. blocks the molecule that keeps heterochromatin structure in place.] (probably a bit of both.) Use sodium butyrate to inhibit deacetylation = increase of histone acetylation.
histone methylation -> can either increase or decrease transcription; arginine and lysine are the residues modified. Negative effects of histone methylation (blocking transcription): siRNA targeted to a locus recruits histone methylation at inhibitory sites; HP1 protein binds the methylated histone, and the de novo DNA methylase, Dnmt, is recruited too. H3 is the most often modified histone, and most of its modifications occur in its N terminus -> lysine is modified by covalent attachment of one acetyl or one, two or three methyl groups to the nitrogen in its R group.
Is all the DNA equally sensitive to these changes? No -> DNase 1 endonuclease experiment reveals hypersensitive sites within the DNA. More likely to be cut here. Hypersensitive sites = naked DNA, no histones. Very open structure therefore.
Beta globin locus: sits on chromosome 11 and contains the adult beta globin gene, 2 gamma globin genes (one with glycine, one with alanine), an epsilon globin gene and a pseudogene. Alpha globin genes (3 types) are on chromosome 16. Adult Hb made from two alpha and two beta, fetal Hb made from two alpha and two gamma, early embryo makes Hb from two alpha and two epsilon. No alpha in this locus -> present on another chromosome.
Hypersensitive sites are featured in the locus control region (LCR). Hypersensitive sites are introduced before each gene as it becomes used during development. If you add a globin gene into an animal it won’t get expressed unless the LCR is present. LCR binds regulatory proteins for transcription. We need epsilon made first, so introduce a hypersensitive site there; then when we need gamma introduce one there and remove the other one, etc. [The locus control region (LCR) is a long-range cis-regulatory element that enhances expression of linked genes at distal chromatin sites. It functions in a copy number-dependent manner and is tissue-specific, as seen in the selective expression of β-globin genes in erythroid cells.]
Locus control regions, Blood 2002
Drosophila: SGS4 gene mutation prevents expression of SGS4 due to loss of upstream hypersensitivity sites (deletion.)
How is alpha globin coexpressed with other globins to make haemoglobin at different developmental stages? Integrator gene switched on by signal —> makes a factor to switch on more genes which have the target sequence e.g. switch on alpha and beta vs alpha and gamma
e.g. heat shock element: HSE only in genes that respond to heat, like IRE only in genes responding to iron. If we move the HSE onto a new gene, e.g. thymidine kinase, that gene is now heat inducible. HSF (heat shock factor) made by integrator gene to switch on all HSE-containing genes.
TF structure: one domain for DNA binding, and one for transcriptional activation. Some TFs (e.g. steroid hormone receptors) also have a hormone binding domain. TF binds to stimulate assembly of the transcription complex and stimulates activity of the complex to upregulate it.
Negative transcription factors: interfere with positive TF mostly.
Direct repression of transcription:
TFs can be made only in some tissues e.g. MyoD only made in muscle cells to aid differentiation [i.e. is what makes a cell a muscle cell] OR TF can only be activated in the necessary tissue e.g.
e.g. glucocorticoid binds to its receptor to throw off the HSP, the hormone-receptor complex enters the nucleus to bind to DNA [ligand binding and activation.]
e.g. HSF bound to Hsp90 to be inactive, once temperature raised Hsp90 needs to deal with unfolded protein so unbinds, activates HSF by allowing it to form a trimer + needs to be phosphorylated = can bind to DNA
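e.g. IRP usually bound to the IRE at the 5’ end of ferritin mRNA to block translation. Iron binds to IRP, conformational change, unbinds mRNA so translation can occur! So at high iron conc iron storage proteins (ferritin) increase. On transferrin receptor mRNA the IRE is at the 3’ end, where IRP binding stabilises the mRNA instead, so at high iron the mRNA is degraded and iron uptake falls.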
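The IRP/IRE switch can be summarised as a tiny truth table. This is a sketch of the logic only (iron treated as simply “high” or “low”); the function and key names are invented for the example.

```python
def iron_response(iron_high):
    """Summarise the IRP/IRE switch for a given iron level.

    Low iron: IRP is free of iron and sits on the IRE.
    High iron: iron binds IRP, which then lets go of the mRNA.
    """
    irp_on_ire = not iron_high
    return {
        "irp_bound_to_ire": irp_on_ire,
        # IRP on the ferritin mRNA blocks its translation.
        "ferritin_translated": not irp_on_ire,
        # IRP on the transferrin receptor mRNA protects it from degradation.
        "transferrin_receptor_mrna_stable": irp_on_ire,
    }


print(iron_response(iron_high=True))
print(iron_response(iron_high=False))
```

One sensor gives opposite outcomes for storage and uptake: high iron switches ferritin on and lets the transferrin receptor message decay, and low iron does the reverse.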