MUTATION AND REPAIR OF DNA
Most
biological molecules have a limited lifetime. Many proteins, lipids and RNAs
are degraded when they are no longer needed or damaged, and smaller molecules
such as sugars are metabolized to other compounds to produce or store energy. In
contrast, DNA is the most stable biological molecule known, befitting its role
in storage of genetic information. The DNA is passed from one generation to
another, and it is degraded only when cells die. However, it can change, i.e.
it is mutable. Mutations, or changes
in the nucleotide sequence, can result from errors during DNA replication, from
covalent changes in structure because of reaction with chemical or physical
agents in the environment, or from transposition. Most of the sequence alterations
are repaired in cells. Some of the
major avenues for changing DNA sequences and repairing those mutations will be
discussed in this chapter.
Sequence alteration in the
genomic DNA is the fuel driving the course of evolution. Without such mutations,
no changes would occur in populations of species to allow them to adapt to
changes in the environment. Mutations in the DNA of germline cells fall into
three categories with respect to their impact on evolution. Most have no effect
on phenotype; these include sequence changes in the large portion of the genome
that neither codes for protein nor is involved in gene regulation or any other
process. Some of these neutral
mutations will become prevalent in a population of organisms (or fixed) over long periods of time by
stochastic processes. Other mutations do have a phenotype, one that is
advantageous to the individuals carrying it. These mutations are fixed in
populations rapidly (i.e. they are subject to positive selection). Other mutations have a detrimental phenotype,
and these are cleared from the population quickly. They are subject to negative or purifying selection.
Whether a mutation is neutral,
disadvantageous or useful is determined by where it is in the genome, what the
type of change is, and the particulars of the environmental forces operating on
the locus. For our purposes, it is important to realize that sequence changes
are a natural part of DNA metabolism. However, the amount and types of
mutations that accumulate in a genome are determined by the types and
concentrations of mutagens to which a cell or organism is exposed, the
efficiency of relevant repair processes, and the effect on phenotype in the
organism.
Mutations and mutagens
Types of mutations
Mutations
commonly are substitutions, in which
a single nucleotide is changed into a different nucleotide. Other mutations
result in the loss (deletion) or
addition (insertion) of one or more
nucleotides. These insertions or deletions can range from one to tens of
thousands of nucleotides. Often an insertion or deletion is inferred from
comparison of two homologous sequences, and it may be impossible to ascertain
from the data given whether the presence of a segment in one sequence but not
another resulted from an insertion or a deletion. In this case, it can be
referred to as an indel. One
mechanism for large insertions is the transposition
of a sequence from one place in a genome to another.
Nucleotide substitutions are one
of two classes. In a transition, a
purine nucleotide is replaced with a purine nucleotide, or a pyrimidine
nucleotide is replaced with a pyrimidine nucleotide. In other words, the base
in the new nucleotide is in the same chemical class as that of the original
nucleotide. In a transversion, the
chemical class of the base changes, i.e. a purine nucleotide is replaced with a
pyrimidine nucleotide, or a pyrimidine nucleotide is replaced with a purine
nucleotide.
Comparison of the sequences of
homologous genes between species reveals a pronounced preference for
transitions over transversions (about 10-fold), indicating that transitions
occur much more frequently than transversions.
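The two classes of substitution can be told apart mechanically from the chemical class of the bases involved. The following is a minimal sketch, not part of the original chapter; the function names `classify_substitution` and `count_substitutions` are illustrative choices, and the second function assumes two already aligned, gap-free sequences of equal length.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original: str, mutant: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if original == mutant:
        raise ValueError("no substitution: the bases are identical")
    # A transition keeps the chemical class (purine or pyrimidine) unchanged.
    same_class = (original in PURINES) == (mutant in PURINES)
    return "transition" if same_class else "transversion"


def count_substitutions(seq_a: str, seq_b: str) -> dict:
    """Tally transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a != b:
            counts[classify_substitution(a, b)] += 1
    return counts
```

For example, `classify_substitution("A", "G")` returns `"transition"` and `classify_substitution("A", "C")` returns `"transversion"`. Applying `count_substitutions` to homologous genes from two species is one way to observe the excess of transitions described above.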
Errors in Replication
Despite effective proofreading
functions in many DNA polymerases, occasionally the wrong nucleotide is
incorporated. It is estimated that E.
coli DNA polymerase III holoenzyme (with a fully functional proofreading
activity) uses the wrong nucleotide during elongation about 1 in 10^8
times. It is more likely for an incorrect pyrimidine nucleotide to be
incorporated opposite a purine nucleotide in the template strand, and for a
purine nucleotide to be incorporated opposite a pyrimidine nucleotide. Thus
these misincorporations resulting in a transition substitution are more common.
However, incorporation of a pyrimidine nucleotide opposite another pyrimidine
nucleotide, or a purine nucleotide opposite another purine nucleotide, can
occur, albeit at progressively lower frequencies. These rarer misincorporations
lead to transversions.
Question 7.1. If a dCTP is
incorporated into a growing DNA strand opposite an A
in the template strand, what mutation will result? Is it a transition or a
transversion?
Question 7.2. If a dCTP is incorporated into a growing DNA strand opposite
a T in the template strand, what mutation will
result? Is it a transition or a transversion?
A change in the isomeric form of
a purine or pyrimidine base in a nucleotide can result in a mutation. The
base-pairing rules are based on the hydrogen-bonding capacity of nucleotides
with their bases in the keto
tautomer. A nucleotide whose base is in
the enol tautomer can pair with the
"wrong" base in another nucleotide. For example, a T in the rare enol isomer will pair with a keto G, and an enol G will pair with a keto T.
The enol tautomers of the normal deoxynucleotides guanylate and
thymidylate are rare, meaning that a single molecule is in the keto form most of the time, or within a
population of molecules, most of them are in the keto form. However, certain nucleoside and base analogs adopt these
alternative isomers more readily. For instance 5-bromo-deoxyuridine (or 5-BrdU)
is an analog of deoxythymidine (dT) that is in the enol tautomer more frequently than dT is (although most of the time
it is in the keto tautomer).
Thus the frequency of
misincorporation can be increased by growth in the presence of base and
nucleoside analogs. For example, growth in the presence of 5-BrdU results in an
increase in the incorporation of G opposite a T in the DNA, as illustrated in
Fig. 7.3. After cells take up the nucleoside 5-BrdU, it is converted to
5-BrdUTP by nucleotide salvage enzymes that add phosphates to its 5’ end.
During replication, 5-BrdUTP (in the keto
tautomer) will incorporate opposite an A in DNA. The 5-BrdU can shift into the enol form while in DNA, so that when it
serves as a template during the next round of replication (arrow 1 in the
diagram below), it will direct incorporation of a G in the complementary strand. This G will in turn direct incorporation of a
C in the top strand in the next round of replication (arrow 2). This leaves a C:G base pair where there was a
T:A base pair in the parental DNA. Once the pyrimidine shifts back to the
favored keto tautomer, it can direct
incorporation of an A, to give the second product in the diagram below (with a
BrU-A base pair).
Likewise, misincorporation of A
and C can occur when they are in the rare imino
tautomers rather than the favored amino
tautomers. In particular, imino C
will pair with amino A, and imino A will
pair with amino C.
Misincorporation during
replication is the major pathway for introducing transversions into DNA. Normally, DNA is a series of
purine:pyrimidine base pairs, but in order to have a transversion, a pyrimidine
has to be paired with another pyrimidine, or a purine with a purine. The DNA
has to undergo local structural changes to accommodate these unusual base
pairs. One way this can happen for a purine-purine base pair is for one of the
purine nucleotides to shift from the preferred anti conformation to the syn
conformation. Atoms on the "back side" of the purine nucleotide in
the syn-isomer can form hydrogen
bonds with atoms in the rare tautomer of the purine nucleotide, still in the
preferred anti conformation. For
example, an A nucleotide in the syn-,
amino- isomer can pair with an A
nucleotide in the anti-, imino- form. Thus the transversion
required a shift in the tautomeric form of the base in one nucleotide as well
as a change in the base-sugar conformation (anti
to syn) of the other nucleotide.
Errors in replication are not
limited to substitutions. Slippage
errors during replication will add or delete nucleotides. A DNA polymerase
can insert additional nucleotides, more commonly when tandem short repeats are
the template (e.g. repeating CA dinucleotides). Sometimes the template strand
can loop out and form a secondary structure that the DNA polymerase does not
read. In this case, a deletion in the nascent strand will result. Intercalating
agents can increase the frequency of such deletions.
Reaction with mutagens
Many
mutations do not result from errors in replication. Chemical reagents can
oxidize and alkylate the bases in DNA, sometimes changing their base-pairing
properties. Radiation can also damage DNA. Examples of these mutagenic
reactions will be discussed in this section.
Chemical modification by oxidation
When the
amino bases, adenine and cytosine, are oxidized, they also lose an amino group.
Thus the amine is replaced by a keto group in the product of this oxidative
deamination reaction. For instance, oxidation of cytosine produces uracil,
which base pairs with adenine. Likewise,
oxidation of adenine yields hypoxanthine, which base pairs with cytosine. Thus the products of these chemical reactions will be mutations in the
DNA, if not repaired. Oxidation of guanine yields xanthine. In
DNA, xanthine will pair with cytosine, as does the original guanine, so this
particular alteration is not mutagenic.
Oxidation
of C to U occurs spontaneously at a high rate. The frequency is such that 1 in
1000 Cs in the human genome would become Us during a lifetime, if they were not
repaired. As will be discussed later, repair mechanisms have evolved to remove
a U from DNA and restore the original C.
Methylation of C prior to its
oxidative deamination will effectively mask it from the repair processes to
remove U’s from DNA. This has a substantial impact on the genomes of organisms
that methylate C. In many eukaryotes, including vertebrates and plants (but not
yeast or Drosophila), the principal
DNA methyl transferase recognizes the dinucleotide CpG in DNA as the substrate,
forming 5-methyl-CpG (Fig. 7.8). When 5-methyl cytosine undergoes oxidative
deamination, the result is 5-methyl uracil, which is the same as thymine. The
surveillance system that recognizes U’s in DNA does nothing to the T, since it
is a normal component of DNA. Hence the oxidation of 5-methyl CpG to TpG,
followed by a round of replication, results in a C:G to T:A transition at
former CpG sites (Fig. 7.8). This spontaneous deamination is quite frequent;
indeed, C to T transitions at CpG dinucleotides are the most common mutations
in humans. Since this transition is not repaired, over time the number of CpG
dinucleotides is greatly diminished in the genomes of vertebrates and plants.
               Me
--CG--         --CG--         --TG-- + NH3       --TG--
||||||  --->   ||||||  --->   ||o|||  --->       ||||||  + wt
--GC--         --GC--         --GC--             --AC--
     Methyl-          [O]           Replicate    (mutation)
     transferase
Figure 7.8. Methylation of CpG dinucleotides followed
by oxidative deamination results in TpG dinucleotides.
Some
regions of plant and vertebrate genomes do not show the usual depletion of CpG
dinucleotides. Instead, the frequency of CpG approaches that of GpC or the
frequency expected from the individual frequency of G and C in the genome. One
working definition of these CpG islands
is that they are segments of genomic DNA at least 100 bp long with a CpG to GpC
ratio of at least 0.6. These islands can be even longer and have a CpG/GpC >
0.75. They are distinctive regions of these genomes and are often found in
promoters and other regulatory regions of genes. Examination of several of
these CpG islands has shown that they are not methylated in any tissue, unlike
most of the other CpGs in the genome. Current areas of research include
investigating how the CpG islands escape methylation and their role in
regulation of gene expression.
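The working definition of a CpG island given above (at least 100 bp long, with a CpG to GpC ratio of at least 0.6) can be applied to a sequence directly. The sketch below is only an illustration under that definition; the function names are hypothetical, the thresholds are exposed as parameters because other definitions are in use, and a real genome scan would slide this test along windows of the sequence.

```python
def count_dinucleotide(seq: str, dinuc: str) -> int:
    """Count occurrences of a dinucleotide (read 5' to 3') on one strand."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def cpg_to_gpc_ratio(seq: str) -> float:
    """Ratio of CpG to GpC dinucleotides in the sequence."""
    cpg = count_dinucleotide(seq, "CG")
    gpc = count_dinucleotide(seq, "GC")
    if gpc == 0:
        # No GpC to compare against: the ratio is undefined unless CpG is also absent.
        return float("inf") if cpg else 0.0
    return cpg / gpc


def looks_like_cpg_island(seq: str, min_length: int = 100, min_ratio: float = 0.6) -> bool:
    """Apply the working definition from the text to a single segment."""
    return len(seq) >= min_length and cpg_to_gpc_ratio(seq) >= min_ratio
```

A CpG-depleted segment, typical of the bulk of a vertebrate genome, gives a ratio well below 0.6, whereas a segment that has escaped methylation and deamination gives a ratio near 1.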
The rate of
oxidation of bases in DNA can be increased by treating with appropriate
reagents, such as nitrous acid (HNO2). Thus treatment with nitrous acid will
increase the oxidation of C to U, and hence lead to C:G to T:A transitions in
DNA. It will also increase the oxidation of adenine to hypoxanthine, leading to
A:T to G:C transitions in DNA.
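The consequences of oxidative deamination follow from two lookups: the product formed from each base, and the base that product pairs with at the next round of replication. The sketch below tabulates only the pairings stated in this section; the abbreviations `Hx` (hypoxanthine) and `X` (xanthine) and the function name `deamination_outcome` are illustrative assumptions.

```python
# Product of oxidative deamination for each amino-containing base (from the text).
DEAMINATION_PRODUCT = {"C": "U", "A": "Hx", "G": "X"}
# Base that each deaminated product pairs with during replication (from the text).
PAIRS_WITH = {"U": "A", "Hx": "C", "X": "C"}
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def deamination_outcome(base: str) -> tuple:
    """Return (original base pair, base pair fixed after replication)."""
    base = base.upper()
    if base not in DEAMINATION_PRODUCT:
        raise ValueError("only C, A and G carry an amino group to lose")
    product = DEAMINATION_PRODUCT[base]
    new_partner = PAIRS_WITH[product]
    # In the following round, the new partner templates its normal complement.
    new_base = WATSON_CRICK[new_partner]
    original_pair = f"{base}:{WATSON_CRICK[base]}"
    fixed_pair = f"{new_base}:{new_partner}"
    return original_pair, fixed_pair
```

Under these rules `deamination_outcome("C")` gives `("C:G", "T:A")` and `deamination_outcome("A")` gives `("A:T", "G:C")`, both transitions, while `deamination_outcome("G")` returns an unchanged `G:C` pair, consistent with xanthine not being mutagenic.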
Chemical modification by alkylation
Many mutagens are alkylating agents. This means that they
will add an alkyl group, such as methyl or ethyl, to a base in DNA. Examples of
commonly used alkylating agents in laboratory work are N-methyl-nitrosoguanidine
and N-methyl-N'-nitro-N-nitrosoguanidine (MNNG). The chemical
warfare agents sulfur mustard and nitrogen mustard are also alkylating agents.
N-methyl-nitrosoguanidine
and MNNG transfer a methyl group to guanine (e.g. to the O6
position) and other bases (e.g. forming 3-methyladenine from adenine). The additional methyl (or other alkyl group)
causes a distortion in the helix. The distorted helix can alter the base
pairing properties. For instance, O6-methylguanine will sometimes base pair with
thymine.
The order
of reactivity of nucleophilic centers in purines follows roughly this series:
N7-G >> N3-A > N1-A ≅ N3-G ≅ O6-G.
A common laboratory reagent for methylating purines in DNA is
dimethylsulfate, or DMS. The product of this reaction is primarily N7-methylguanine,
but N3-methyladenine is also detectable. This reaction is used to identify
protein-binding sites in DNA, since interaction with a protein can cause
decreased reactivity to DMS of guanines within the binding site but enhanced
reactivity adjacent to the site. Methylation to form N7-methyl-guanine does not
cause miscoding in the DNA, since this modified purine still pairs with C.
Chemicals that cause deletions
Some
compounds cause a loss of nucleotides from DNA. If these deletions occur in a
protein-coding region of the genomic DNA, and their length is not an integral multiple of
3, they result in a frameshift mutation. These are commonly more severe
loss-of-function mutations than are simple substitutions. Frameshift mutagens
such as proflavin or ethidium bromide have flat, polycyclic ring structures. They may bind to and intercalate
within the DNA, i.e. they can insert between stacked base pairs. If a segment
of the template DNA is looped out, DNA polymerase can replicate past it,
thereby generating a deletion. Intercalating agents can stabilize secondary
structures in the loop, thereby
increasing the chance that this segment stays in the loop and is not
copied during replication. Thus growth of cells in the presence
of such intercalating agents increases the probability of generating a deletion.
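The rule that a deletion shifts the reading frame unless its length is a multiple of 3 can be checked with a few lines of code. This is a minimal sketch with hypothetical helper names and a made-up 15-nucleotide coding sequence; it only splits a sequence into codons and does not translate them.

```python
def is_frameshift(indel_length: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    if indel_length <= 0:
        raise ValueError("indel length must be positive")
    return indel_length % 3 != 0


def codons(seq: str) -> list:
    """Split a coding sequence into complete codons, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at 0-based position `start`."""
    return seq[:start] + seq[start + length:]


# Hypothetical coding sequence used only for illustration.
orf = "ATGGCTGAAACCTGA"
one_nt_deletion = codons(delete(orf, 3, 1))    # every codon after the first changes
three_nt_deletion = codons(delete(orf, 3, 3))  # one codon is lost, the rest are intact
```

Comparing `one_nt_deletion` with `three_nt_deletion` shows why frameshifts are usually more severe: after a 1-nucleotide deletion every downstream codon is altered, whereas an in-frame 3-nucleotide deletion removes a single codon and leaves the others unchanged.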
Ionizing radiation
High energy
radiation, such as X-rays, γ-rays, and β particles (or electrons), is a powerful
mutagen. Since these forms of radiation can change the number of electrons on an atom, converting
a compound to an ionized form, they are referred to as ionizing radiation. They can cause a number of chemical changes in
DNA, including direct breakage of the phosphodiester backbone of DNA, leading to
deletions. Ionizing radiation can also break open the imidazole ring of
purines. Subsequent removal of the damaged purine from DNA by a glycosylase
generates an apurinic site.
Ultraviolet radiation
Ultraviolet
radiation with a wavelength of 260 nm will form pyrimidine dimers
between adjacent pyrimidines in the DNA. The dimers can be one of two types. The major product is a cyclobutane-containing thymine dimer
(between C5 and C6 of adjacent T's). The other product has a covalent bond
between position 6 on one pyrimidine and
position 4 on the adjacent pyrimidine, hence it is called the "6-4"
photoproduct.
The pyrimidine dimers cause a
distortion in the DNA double helix. This distortion blocks replication and
transcription.
Table
7.1 lists several causes of mutations in DNA, including mutagens as well as
mutator strains in bacteria. Note that some of these mutations lead to
mispairing (substitutions), others lead to distortions of the helix, and some
lead to both.
Transitions
can be generated both by damage to the DNA and by misincorporation during
replication. Transversions occur primarily by misincorporation during
replication. The frequency of such errors is greatly increased in mutator
strains, e.g. lacking a proofreading function in the replicative DNA
polymerase. Also, after a bacterial cell has sustained sufficient damage to
induce the SOS response, the DNA polymerase shifts into an error-prone mode
of replication. This can also be a source of mutant alleles.
Table 7.1. Summary of effects of various agents that alter DNA sequences (mutagens and mutator genes)
Agent (mutagen, etc.) | Example | Result
Nucleotide analogs | BrdUTP | transitions, e.g. A:T to G:C
Oxidizing agents | nitrous acid | transitions, e.g. C:G to T:A
Alkylating agents | nitrosoguanidine | transitions, e.g. G:C to A:T
Frameshift mutagens | Benz(a)pyrene | deletions (short)
Ionizing radiation | X-rays, γ-rays | breaks and deletions (large)
UV | UV, 260 nm | pyrimidine (Y) dimers, block replication
Misincorporation: | |
Altered DNA Pol III | mutD=dnaQ; ε subunit of DNA PolIII | transitions, transversions and frameshifts in mutant strains
Error-prone repair | Need UmuC, UmuD, DNA PolIII | transitions and transversions in wild-type during SOS
Other mutator genes | mutM, mutT, mutY | transversions in the mutant strains
Repair mechanisms
The second
part of this chapter examines the major classes of DNA repair processes. These
are:
reversal of damage,
nucleotide excision repair,
base excision repair,
mismatch repair,
recombinational repair, and
error-prone repair.
Many of these processes were first studied in bacteria such
as E. coli; however, only a few are limited
to this species. For instance, nucleotide excision repair and base excision
repair are found in virtually all organisms, and they have been well
characterized in bacteria, yeast, and
mammals. Like DNA replication itself, repair of damage and misincorporation is
a very old process.
Reversal of damage
Some kinds
of covalent alteration to bases in DNA can be directly reversed. This occurs by
specific enzyme systems recognizing the altered base and breaking bonds to
remove the adduct or change the base back to its normal structure.
Photoreactivation is a light-dependent
process used by bacteria to reverse pyrimidine dimers formed by UV radiation.
The enzyme photolyase binds to a pyrimidine dimer and catalyzes a second
photochemical reaction (this time using visible light) that breaks the
cyclobutane ring and reforms the two adjacent thymidylates in DNA. Note that
this is not formally the reverse of the reaction that formed the pyrimidine
dimers, since energy from visible light is used to break the bonds between the
pyrimidines, and no UV radiation is released. However, the result is that the
DNA structure has been returned to its state prior to damage by UV. The
photolyase enzyme has two subunits, which are encoded by the phrA and phrB genes in E. coli.
A second
example of the reversal of damage is the removal
of methyl groups. For instance, the enzyme O6‑methylguanine
methyltransferase, encoded by the ada
gene in E. coli, recognizes O6‑methylguanine in duplex DNA. It then removes
the methyl group, transferring it to an amino acid of the enzyme. The
methylated enzyme is no longer active, hence this has been referred to as a
suicide mechanism for the enzyme.
Excision repair
The most common means of repairing damage or a
mismatch is to cut it out of the duplex
DNA and recopy the remaining complementary strand of DNA, as outlined in Fig.
7.12. Three different types of excision repair have been characterized:
nucleotide excision repair, base excision repair, and mismatch repair. All
utilize a cut, copy, and paste mechanism.
In the cutting stage, an enzyme or
complex removes a damaged base or a string of nucleotides from the DNA. For the
copying, a DNA polymerase (DNA polymerase I in E. coli) will copy the template to replace the excised, damaged
strand. The DNA polymerase can initiate synthesis from the 3' OH at the
single-strand break (nick) or gap in the DNA remaining at the site of damage
after excision. Finally, in the pasting
stage, DNA ligase seals the remaining nick to give an intact, repaired DNA.
Nucleotide excision repair
In nucleotide excision repair (NER), damaged bases are cut out within
a string of nucleotides, and replaced with DNA as directed by the undamaged
template strand. This repair system is used to remove pyrimidine dimers formed
by UV radiation as well as nucleotides modified by bulky chemical adducts. The
common feature of damage that is repaired by nucleotide excision is that the
modified nucleotides cause a significant distortion in the DNA helix. NER
occurs in almost all organisms examined.
Some of the
best-characterized enzymes catalyzing
this process are the UvrABC excinuclease and the UvrD helicase in E. coli. The genes encoding this repair
function were discovered as mutants that are highly sensitive to UV damage,
indicating that the mutants are defective in UV repair. Wild type E. coli
cells are killed only at higher doses of UV radiation. Mutant strains
can be identified that are substantially more sensitive to UV radiation; these
are defective in the functions needed for UV-resistance,
abbreviated uvr. By collecting large
numbers of such mutants and testing them for their ability to restore
resistance to UV radiation in combination, complementation groups were
identified. Four of the complementation groups, or genes, encode proteins that
play major roles in NER; they are uvrA,
uvrB, uvrC and uvrD.
The enzymes
encoded by the uvr genes have been
studied in detail. The polypeptide products of the uvrA, uvrB, and uvrC genes are subunits of a
multisubunit enzyme called the UvrABC
excinuclease. UvrA is the protein encoded by uvrA, UvrB is encoded by uvrB,
and so on. The UvrABC complex recognizes damage-induced structural distortions
in the DNA, such as pyrimidine dimers. It then cleaves on both sides of the
damage. Then UvrD (also called helicase II), the product of the uvrD gene, unwinds the DNA, releasing
the damaged segment. Thus for this system, the UvrABC and UvrD proteins carry
out a series of steps in the cutting phase of excision repair. This leaves a
gapped substrate for copying by DNA polymerase and pasting by DNA ligase.
The UvrABC
proteins form a dynamic complex that recognizes damage and makes
endonucleolytic cuts on both sides. The two cuts around the damage allow the
single-stranded segment containing the damage to be excised by the helicase
activity of UvrD. Thus the UvrABC dynamic complex and the UvrBC complex can be
called excinucleases. After the
damaged segment has been excised, a gap of 12 to 13 nucleotides remains in the
DNA. This can be filled in by DNA polymerase and the remaining nick sealed by
DNA ligase. Since the undamaged template directs the synthesis by DNA polymerase,
the resulting duplex DNA is no longer damaged.
In more
detail, the process goes as follows. UvrA2 (a dimer) and
UvrB recognize the damaged site as a (UvrA)2UvrB complex. UvrA2 then
dissociates, in a step that requires ATP hydrolysis. This is an autocatalytic
reaction, since it is catalyzed by UvrA, which is itself an ATPase. After UvrA has dissociated, UvrB (at the damaged
site) forms a complex with UvrC. The UvrBC complex is the active nuclease. It makes the incisions on each side of the
damage, in another step that requires ATP. The phosphodiester backbone is
cleaved 8 nucleotides to the 5' side of the damage and 4-5 nucleotides on the
3' side. Finally, the UvrD helicase then unwinds DNA so the damaged segment is
removed. The damaged DNA segment dissociates attached to the UvrBC complex.
Like all helicase reactions, the unwinding requires ATP hydrolysis to disrupt
the base pairs. Thus ATP hydrolysis is required at three steps of this series
of reactions.
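The geometry of the two incisions can be made concrete with a small coordinate model. The sketch below is a simplification under stated assumptions, not a description of the enzyme: the strand is a 5' to 3' string, the lesion is taken to be a two-nucleotide pyrimidine dimer, and phosphodiester bonds are counted outward from the edges of the lesion (this counting convention and the function name `uvrabc_excision` are assumptions made for illustration). With those assumptions the excised segment comes out at 12 or 13 nucleotides, matching the gap size given in the text.

```python
def uvrabc_excision(strand: str, lesion_start: int, lesion_len: int = 2,
                    bonds_5prime: int = 8, bonds_3prime: int = 4) -> tuple:
    """Return (excised segment, gapped strand) for a lesion on a 5'->3' strand.

    lesion_start is the 0-based index of the first damaged nucleotide.
    The 5' cut is placed at the 8th phosphodiester bond 5' of the lesion and
    the 3' cut at the 4th (or 5th) bond 3' of it, counting from the lesion edge.
    """
    lesion_end = lesion_start + lesion_len - 1
    first = lesion_start - (bonds_5prime - 1)  # first excised nucleotide
    last = lesion_end + (bonds_3prime - 1)     # last excised nucleotide
    if first < 0 or last >= len(strand):
        raise ValueError("lesion is too close to the end of the strand")
    excised = strand[first:last + 1]
    # Dashes mark the gap left for DNA polymerase to fill and ligase to seal.
    gapped = strand[:first] + "-" * len(excised) + strand[last + 1:]
    return excised, gapped


# Hypothetical strand with a TT dimer at positions 10-11 (0-based).
strand = "ACGTACGTACTTACGTACGTAC"
excised_12, gapped_12 = uvrabc_excision(strand, 10)
excised_13, gapped_13 = uvrabc_excision(strand, 10, bonds_3prime=5)
```

The gapped strand returned by the function corresponds to the substrate that DNA polymerase fills in, using the undamaged complementary strand as the template.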
Nucleotide
excision repair is very active in mammalian cells, as well as cells from many
other organisms. The DNA of a normal skin cell exposed to sunlight would
accumulate thousands of dimers per day if this repair process did not remove
them! One human genetic disease, called xeroderma pigmentosum (XP), is a skin
disease caused by defects in enzymes that remove UV lesions. Fibroblasts
isolated from individual XP patients are markedly sensitive to UV radiation
when grown in culture, similar to the phenotype shown by E. coli uvr mutants.
These XP cell lines can be fused in culture and tested for the ability to
restore resistance to UV damage. XP cell lines that do so fall into different
complementation groups. Several complementation groups, or genes, have been
defined in this way. Considerable progress has been made recently in
identifying the proteins encoded by each XP gene (Table 7.2). Note the tight
analogy to bacterial functions needed for NER. Similar functions are also found
in yeast (Table 7.2). Additional proteins utilized in eukaryotic NER include
hHR23B (which forms a complex with the DNA-damage sensor XPC), ERCC1 (which
forms a complex with XPF to catalyze incision 5’ to the site of damage),
the several other subunits of TFIIH and the single-strand
binding protein RPA.
Table 7.2. Genes affected in XP patients, and encoded proteins
Human Gene | Protein Function | Homologous to S. cerevisiae | Analogous to E. coli
XPA | Binds damaged DNA | Rad14 | UvrA/UvrB
XPB | 3’ to 5’ helicase, component of TFIIH | Rad25 | UvrD
XPC | DNA-damage sensor (in complex with hHR23B) | Rad4 |
XPD | 5’ to 3’ helicase, component of TFIIH | Rad3 | UvrD
XPE | Binds damaged DNA | | UvrA/UvrB
XPF | Works with ERCC1 to cut DNA on 5’ side of damage | Rad1 | UvrB/UvrC
XPG | Cuts DNA on 3’ side of damage | Rad2 | UvrB/UvrC
NER occurs in two modes in many
organisms, including bacteria, yeast and mammals. One is the global repair that
acts throughout the genome, and the second is a specialized activity that is
coupled to transcription. Most of the XP gene products listed in Table 7.2
function in both modes of NER in mammalian cells. However, XPC (acting in a
complex with another protein called hHR23B) is a DNA-damage sensor that is
specific for global genome NER. In transcription coupled NER, the elongating
RNA polymerase stalls at a lesion on the template strand; perhaps this is the
damage recognition activity for this mode of NER. One of the basal transcription
factors that associates with RNA polymerase II, TFIIH (see Chapter 10), also
plays a role in both types of NER. A rare genetic disorder in humans, Cockayne
syndrome (CS), is associated with a defect specific to transcription coupled
repair. Two complementation groups have been identified, CSA and CSB.
Determination of the nature and activity of the proteins encoded by them will
provide additional insight into the efficient repair of transcribed DNA
strands. The phenotype of CS patients is pleiotropic, showing both
photosensitivity and severe neurological and other developmental disorders,
including premature aging. These symptoms are more severe than those seen for
XP patients with no detectable NER, indicating that transcription-coupled
repair or the CS proteins have functions in addition to those for NER.
Other
genetic diseases also result from a deficiency in a DNA repair function, such
as Bloom's syndrome and Fanconi's anemia. These are intensive areas of current
research. A good resource for updated information on these and other inherited
diseases, as well as human genes in general, is the Online Mendelian
Inheritance in Man, or OMIM, accessible at http://www.ncbi.nlm.nih.gov.
Ataxia
telangiectasia, or AT, illustrates the effect of alterations in a protein not
directly involved in repair, but perhaps signaling that is necessary for proper
repair of DNA. AT is a recessive, rare genetic disease marked by uneven gait
(ataxia), dilation of blood vessels (telangiectasia) in the eyes and face,
cerebellar degeneration, progressive mental retardation, immune deficiencies,
premature aging and about a 100-fold increase in susceptibility to
cancers. That latter phenotype is
driving much of the interest in this locus, since heterozygotes, which comprise
about 1% of the population, also have an increased risk of cancer, and may
account for as much as 9% of breast cancers in the United States. The gene that is mutated in AT
(hence called "ATM") was isolated in 1995 and localized to chromosome
11q22-23.
The ATM
gene does not appear to encode a protein that participates directly in DNA
repair (unlike the genes that cause XP upon mutation). Rather, AT is caused by a defect in a
cellular signaling pathway. Based on
homologies to other proteins, the ATM gene product may be involved in the
regulation of telomere length and cell cycle progression. The C-terminal domain is homologous to
phosphatidylinositol-3-kinase (which is also a Ser/Thr protein kinase) - hence
the connection to signaling pathways.
The ATM protein also has regions of homology to DNA-dependent protein
kinases, which require breaks, nicks or gaps to bind DNA (via subunit Ku);
binding to DNA is required for the protein kinase activity. This suggests that ATM protein could be
involved in targeting the repair machinery to such damage.
Base excision repair
Base
excision repair differs from nucleotide excision repair in the types of substrates
recognized and in the initial cleavage event. Unlike NER, the base excision
machinery recognizes damaged bases that do not cause a significant distortion
to the DNA helix, such as the products of oxidizing agents. For example, base
excision can remove uridines from DNA, even though a G:U base pair does not
distort the DNA. Base excision repair is versatile, and this process also can
remove some damaged bases that do distort the DNA, such as methylated purines.
In general, the initial recognition is a specific damaged base, not a helical
distortion in the DNA. A second major difference is that the initial cleavage is
directed at the glycosidic bond connecting the purine or pyrimidine base to a
deoxyribose in DNA. This contrasts with the initial cleavage of a
phosphodiester bond in NER.
Cells
contain a large number of specific glycosylases
that recognize and remove damaged or inappropriate bases, such as uracil, from the DNA.
The glycosylase removes the damaged or inappropriate base by catalyzing
cleavage of the N-glycosidic bond that attaches the base to the sugar-phosphate
backbone. For instance, uracil-N-glycosylase, the product of the ung gene, recognizes uracil in DNA and
cuts the N-glycosidic bond between the base and deoxyribose. Other
glycosylases recognize and cleave damaged bases. For instance methylpurine
glycosylase removes methylated G and A from DNA. The result of the activity of
these glycosylases is an apurinic/apyrimidinic site (AP site). At
an AP site, the DNA is still an intact duplex, i.e. there are no breaks in the
phosphodiester backbone, but one base is gone.
Next, an AP endonuclease nicks the DNA just 5’
to the AP site, thereby providing a primer for DNA polymerase. In E. coli, the 5' to 3' exonuclease
function of DNA polymerase I removes the damaged region, and fills in with
correct DNA (using the 5' to 3' polymerase, directed by the sequence of the
undamaged complementary strand).
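The sequence of events in base excision repair of uracil can be followed in a toy model. This is a deliberately simplified sketch: the function names are hypothetical, the partner strand is written 3' to 5' so that position i pairs with position i of the damaged strand, and only the single AP site is refilled, whereas DNA polymerase I actually replaces a short stretch of nucleotides from the nick.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
AP_SITE = "-"  # placeholder for a sugar whose base has been removed


def uracil_glycosylase(strand: str) -> str:
    """Cut the N-glycosidic bond at every U, leaving an AP site in its place."""
    return strand.replace("U", AP_SITE)


def fill_ap_sites(strand: str, partner: str) -> str:
    """Replace each AP site with the base complementary to the partner strand.

    Stands in for nicking by AP endonuclease, synthesis by DNA polymerase
    and sealing by DNA ligase.
    """
    if len(strand) != len(partner):
        raise ValueError("strands must be the same length")
    return "".join(
        COMPLEMENT[p] if s == AP_SITE else s
        for s, p in zip(strand, partner)
    )


def base_excision_repair(damaged: str, partner: str) -> str:
    """Glycosylase step followed by template-directed refilling."""
    return fill_ap_sites(uracil_glycosylase(damaged), partner)
```

In this model a U that arose by deamination of C, and therefore sits opposite G, is replaced by C, so `base_excision_repair("ATGUCA", "TACGGT")` returns `"ATGCCA"`. A U opposite A is replaced by T.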
Additional
mechanisms have evolved for keeping U’s out of DNA. E. coli also has a dUTPase,
encoded by the dut gene, which
catalyzes the hydrolysis of dUTP to dUMP.
The product dUMP is the substrate for thymidylate synthetase, which
catalyzes conversion of dUMP to dTMP. This keeps the concentration of dUTP in
the cell low, reducing the chance that it will be used in DNA synthesis. Thus
the combined action of the products of the dut
+ ung genes helps prevent the
accumulation of U's in DNA.
Mismatch repair
The third
type of excision repair we will consider is mismatch repair, which is used to repair errors that occur during
DNA synthesis. Proofreading during replication is good but not perfect. Even
with a functional ε subunit, DNA
polymerase III allows the wrong nucleotide to be incorporated about once in
every 10^8 bp synthesized in E. coli. However, the measured mutation rate in bacteria is as low
as one mistake per 10^10 or 10^11 bp. The enzymes that catalyze mismatch repair are responsible for
this final degree of accuracy. They recognize misincorporated nucleotides,
excise them and replace them with the correct nucleotides. In contrast to
nucleotide excision repair, mismatch repair does not operate on bulky adducts
or major distortions to the DNA helix. Most of the mismatches are substitutions
within a chemical class, e.g. a C incorporated instead of a T. This causes only
subtle helical distortions in the DNA, and the misincorporated nucleotide is
a normal component of DNA. The ability of a cell to recognize a mismatch
reflects the exquisite specificity of MutS,
which can distinguish normal base pairs from those resulting from
misincorporation. Of course, the repair machinery needs to know which of the
nucleotides at a mismatch pair is the correct one and which was
misincorporated. It does this by determining which strand was more recently
synthesized, and repairing the mismatch on the nascent strand.
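The contribution of mismatch repair to overall fidelity can be estimated from the two rates quoted above: about one error per 10^8 bp after proofreading, and a final mutation rate of one per 10^10 to 10^11 bp. The short Python calculation below is a worked example, not a measurement. It assumes a genome size of about 4.6 million bp for E. coli, a figure that is not given in this chapter.

```python
# Worked arithmetic: how much does mismatch repair improve fidelity?
# Rates are errors per base pair synthesized, as quoted in the text.
error_after_proofreading = 1e-8
final_rate_high = 1e-10   # one mistake per 10^10 bp
final_rate_low = 1e-11    # one mistake per 10^11 bp

# Assumption (not from the text): approximate E. coli genome size in bp.
GENOME_BP = 4.6e6

fold_min = error_after_proofreading / final_rate_high
fold_max = error_after_proofreading / final_rate_low
print(f"Mismatch repair improves fidelity {fold_min:.0f}- to {fold_max:.0f}-fold")

# Expected number of errors each time the genome is copied once.
errors_without = error_after_proofreading * GENOME_BP
errors_with = final_rate_high * GENOME_BP
print(f"Errors per genome replication without mismatch repair: {errors_without:.3f}")
print(f"Errors per genome replication with mismatch repair: {errors_with:.5f}")
```

Under these assumptions, mismatch repair accounts for a further 100- to 1000-fold reduction in errors beyond what proofreading achieves.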
In E. coli, the methylation of A in a GATC
motif provides a covalent marker for the parental strand; thus methylation of
DNA is used to discriminate parental from progeny strands. Recall that the dam
methylase catalyzes the transfer of a methyl group to the A of the
pseudopalindromic sequence GATC in duplex DNA. Methylation is delayed for
several minutes after replication. In this interval before methylation of the
new DNA strand, the mismatch repair system can find mismatches and direct its
repair activity to nucleotides on the unmethylated, newly replicated strand.
Thus replication errors are removed preferentially.
The enzyme
complex MutH-MutL-MutS, or MutHLS, catalyzes mismatch repair in E. coli. The genes that encode these
enzymes, mutH, mutL and mutS, were
discovered because strains carrying mutations in them have a high frequency of
new mutations. This is called a mutator
phenotype, and hence the name mut
was given to these genes. Not all mutator genes are involved in mismatch
repair; e.g., mutations in the gene encoding the proofreading enzyme of DNA
polymerase III also have a mutator phenotype. This gene was independently
discovered in screens for defects in DNA replication (dnaQ) and mutator genes (mutD).
Three complementation groups within the set of mutator alleles have been
implicated primarily in mismatch repair; these are mutH, mutL and mutS.
MutS will recognize seven of the eight
possible mismatched base pairs (except for C:C) and bind at that site in the
duplex DNA. MutH and MutL (with ATP bound) then join the
complex, which then moves along the DNA in either direction until it finds a
hemimethylated GATC motif, which can be as far as a few thousand base pairs away.
Until this point, the nuclease function of MutH has been dormant, but it is
activated in the presence of ATP at a hemimethylated GATC. It cleaves the
unmethylated DNA strand, leaving a nick 5' to the G on the strand containing
the unmethylated GATC (i.e. the new DNA strand). The same strand is nicked on
the other side of the mismatch. Enzymes involved in other processes of repair
and replication catalyze the remaining steps. The segment of single-stranded
DNA containing the incorrect nucleotide is excised with the help of UvrD, also known as
helicase II and MutU. SSB and exonuclease I are also involved in the excision.
As the excision process forms the gap, it is filled in by the action
of DNA polymerase III.
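The search for a GATC site near the mismatch can be illustrated with a few lines of Python. This is a minimal sketch with hypothetical function names (`gatc_sites`, `nearest_gatc`). It only locates the motif in a sequence string, treats "nearest" as a simplification of the complex moving in either direction, and does not model methylation, the proteins, or ATP. The example sequence is made up for illustration.

```python
# Locate GATC motifs and find the one closest to a mismatch.
# Positions are 0-based indices into the sequence string.

def gatc_sites(seq):
    """Return the start index of every GATC motif in seq."""
    sites = []
    start = seq.find("GATC")
    while start != -1:
        sites.append(start)
        start = seq.find("GATC", start + 1)
    return sites


def nearest_gatc(seq, mismatch_pos):
    """Return the start index of the GATC closest to mismatch_pos, or None."""
    sites = gatc_sites(seq)
    if not sites:
        return None
    return min(sites, key=lambda s: abs(s - mismatch_pos))


# Hypothetical newly synthesized strand with a mismatch at index 15.
new_strand = "TTGATCAAGCTGGTGGTGGGACGGATCC"
site = nearest_gatc(new_strand, mismatch_pos=15)
# MutH nicks 5' to the G, i.e. just before index `site` on this strand.
print(gatc_sites(new_strand), site)
```

In this model the nick would fall immediately before the returned index, which corresponds to cleavage 5' to the G of the unmethylated GATC.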
Mismatch
repair is highly conserved, and investigation of this process in mice and
humans is providing new clues about mutations that cause cancer. Homologs to the E. coli genes mutL and mutS have been identified in many other
species, including mammals. The key breakthrough came from analysis of
mutations that cause one of the most common hereditary cancers, hereditary nonpolyposis colon cancer
(HNPCC). Some of the genes that, when mutated, cause this disease encode
proteins whose amino acid sequences are significantly similar to those of two
of the E. coli mismatch repair
enzymes. The human genes are called hMLH1
(for human mutL homolog 1), hMSH1, and hMSH2 (for human mutS homolog 1 and 2, respectively). Subsequent
work has shown that these enzymes in humans are involved in mismatch repair.
Presumably the increased frequency of mutation in cells deficient in mismatch
repair leads to the accumulation of mutations in proto-oncogenes, resulting in
dysregulation of the cell cycle and loss of normal control over the rate of
cell division.
Recombination repair (Retrieval system)
In the
three types of excision repair, the damaged or misincorporated nucleotides are
cut out of DNA, and the remaining strand of DNA is used for synthesis of the
correct DNA sequence. However, this complementary strand is not always
available. Sometimes DNA polymerase has to synthesize past a lesion, such as a
pyrimidine dimer or an AP site. One way it can do this is to stop on one side
of the lesion and then resume synthesis about 1000 nucleotides further down.
This leaves a gap in the strand opposite the lesion.
The
information needed at the gap is retrieved from the normal daughter molecule by
bringing in a single strand of DNA, using RecA-mediated recombination (see Chapter VIII). This fills the gap opposite the dimer, and
the dimer can now be replaced by excision repair. The resulting gap
in the (previously) normal daughter can be filled in by DNA polymerase, using
the good template.
Translesion synthesis
As just
described, DNA polymerase can skip past a lesion on the template strand,
leaving behind a gap. It has another option when such a lesion is encountered,
which is to synthesize DNA in a non-template directed manner. This is called translesion synthesis, bypass
synthesis, or error-prone repair. This is the last resort for DNA repair, e.g.
when repair has not occurred prior to replication. In translesion replication, the DNA
polymerase shifts from template directed synthesis to catalyzing the
incorporation of random nucleotides. These random nucleotides usually differ from
the correct one (i.e. three out of four times) and so introduce mutations; hence this process is also
designated error-prone repair.
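The figure of three out of four follows if each of the four nucleotides is equally likely to be inserted opposite the lesion, which is the simplifying assumption behind that statement. The Python sketch below makes the arithmetic explicit and checks it with a small simulation. The function name `simulate_bypass` and the fixed random seed are illustrative choices, not part of any real library for modeling repair.

```python
import random

BASES = "ACGT"

# Exact value under the assumption of uniform random insertion:
# 3 of the 4 possible nucleotides differ from the correct one.
p_mutation = (len(BASES) - 1) / len(BASES)
print(p_mutation)  # 0.75


def simulate_bypass(correct_base, trials, seed=0):
    """Fraction of random insertions that differ from correct_base."""
    rng = random.Random(seed)
    wrong = sum(rng.choice(BASES) != correct_base for _ in range(trials))
    return wrong / trials


estimate = simulate_bypass("A", trials=100_000)
print(round(estimate, 2))
```

The simulated fraction should lie close to the exact value of 0.75. Real translesion polymerases are not perfectly random, so the actual mutation frequency depends on the lesion and the enzyme.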
Translesion
synthesis uses the products of the umuC
and umuD genes. These genes are named
for the UV nonmutable phenotype of mutants defective in these
genes.
UmuD forms
a homodimer that also complexes with UmuC. When the concentrations of
single-stranded DNA and RecA are increased (by DNA damage, see next section),
RecA stimulates an autoprotease activity in UmuD2 to form UmuD’2.
This cleaved form is now active in translesional synthesis. UmuC itself is a
DNA polymerase. A multisubunit complex containing UmuC, the activated UmuD’2
and the α subunit of DNA polymerase III
catalyzes translesional synthesis. Homologs of the UmuC polymerase are found in
yeast (RAD30) and humans (XP-V).
The SOS response
A coordinated battery of responses to DNA damage in E. coli is referred to as the SOS response.
This name is derived from the maritime distress call, “SOS” for "Save Our
Ship".
Accumulating
damage to DNA, e.g. from high doses of radiation that break the DNA backbone,
will generate single-stranded regions in DNA. The increasing amounts of single-stranded
DNA induce SOS functions, which stimulate both the recombination repair and the
translesional synthesis just discussed.
Key
proteins in the SOS response are RecA and
LexA. RecA binds to single-stranded
regions in DNA, which activates new functions in the protein. One of these is a
capacity to further activate a latent proteolytic activity found in several
proteins, including the LexA repressor, the UmuD protein and the repressor encoded by bacteriophage lambda. RecA activated by binding to single-stranded DNA is not itself a
protease, but rather it serves as a co-protease, activating the latent
proteolytic function in LexA, UmuD and some other proteins.
In the
absence of appreciable DNA damage, the LexA protein represses many operons,
including several genes needed for DNA repair: recA, lexA, uvrA, uvrB, and umuC. When activated RecA
stimulates the latent proteolytic activity of LexA, LexA cleaves itself (as do other proteins such as UmuD),
leading to coordinate induction of the SOS-regulated operons.
Restriction/Modification systems
The DNA
repair systems discussed above operate by surveillance of the genome for damage
or misincorporation and then bring in enzymatic
machines to repair the defects. Other systems of surveillance in
bacterial genomes are restriction/modification
systems. These look for foreign DNA that has invaded the cell, and then
destroy it. In effect, this is another means of protecting the genome from the
damage that could result from the integration of foreign DNA.
These
systems for safeguarding the bacterial cell from invasion by foreign DNA use a
combination of covalent modification and restriction by an endonuclease. Each
species of bacteria modifies its DNA by methylation
at specific sites (Fig. 7.19). This protects the DNA from cleavage by the
corresponding restriction endonuclease.
However, any foreign DNA (e.g. from an infecting bacteriophage or from a
different species of bacteria) will not be methylated at that site, and the
restriction endonuclease will cleave there.
The result is that invading DNA will be cut up and inactivated, while
not damaging the host DNA.
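The logic of protection by methylation can be sketched in Python. This toy model makes several assumptions: it uses a single recognition site (GAATTC, the EcoRI site, chosen only as a familiar example), it represents methylation as a set of protected site positions, and it cuts one strand of a string instead of a duplex. The function names `find_sites` and `restrict` are hypothetical.

```python
# Toy restriction/modification model.
# Assumption: one recognition site; methylation is tracked per site position.
SITE = "GAATTC"   # EcoRI recognition sequence, cut between G and A
CUT_OFFSET = 1    # cut falls one nucleotide into the site


def find_sites(seq):
    """Return the start index of every occurrence of SITE in seq."""
    return [i for i in range(len(seq) - len(SITE) + 1) if seq.startswith(SITE, i)]


def restrict(seq, methylated=frozenset()):
    """Cut seq at every unmethylated site; return the list of fragments."""
    cuts = [i + CUT_OFFSET for i in find_sites(seq) if i not in methylated]
    fragments, previous = [], 0
    for cut in cuts:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


dna = "AAGAATTCGGGAATTCTT"
host = restrict(dna, methylated=set(find_sites(dna)))  # all sites modified
foreign = restrict(dna)                                 # no sites modified
print(host)     # ['AAGAATTCGGGAATTCTT']
print(foreign)  # ['AAG', 'AATTCGGG', 'AATTCTT']
```

The same sequence survives intact when its sites are methylated and is cut into fragments when they are not, which is the distinction the cell draws between its own DNA and invading DNA.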
Any DNA
that escapes the restriction endonuclease will be a substrate for the
methylase. Once the DNA is methylated, the
bacterium treats it like its own DNA, i.e. does not cleave it. This process
can be controlled genetically and biochemically to aid in recombinant DNA work.
Generally, the restriction endonuclease is encoded at the r locus and the methyl transferase is encoded at the m locus. Thus passing a plasmid DNA
through an r‑m+ strain (defective in restriction but competent
for modification) will make it resistant to restriction by strains with a
wildtype r+ gene. For some restriction/modification
systems, both the endonuclease and the methyl transferase are available
commercially. In these cases, one can modify the foreign DNA (e.g. from humans)
prior to ligating into cloning vectors to protect it from cleavage by the
restriction endonucleases it may encounter after transformation into bacteria.
For the
type II restriction/modification systems, the methylation and restriction
occur at the same, pseudopalindromic site.
These are the most common systems, with a different sequence specificity
for each bacterial species. This has
provided the large variety of restriction endonucleases that are so commonly
used in molecular biology.
Question 7.12 If the top strand of the segment of DNA
GGTCGTT were targeted for reaction with nitrous acid, and then it underwent two
rounds of replication, what are the likely products?
Question 7.13 Are the following statements about nucleotide
excision repair in E. coli true or
false?
a) UvrA and UvrB recognize structural
distortions resulting from damage in the DNA helix.
b) In a complex with UvrB, UvrC cleaves
the damaged strand on each side of the lesion.
c) The helicase UvrD unwinds the DNA,
thereby dissociating the damaged patch.
Question 7.14 Are the
following statements about mismatch repair in E. coli true or false?
a) MutS will recognize a mismatch.
b) MutL, in a complex with ATP, will bind
to the MutS (bound to the mismatched region) and activate MutH.
c) MutH will cleave 5' to the G of the
nearest methylated GATC motif (GmeATC).
d) The mismatch repair system can
discriminate between old versus newly synthesized strands of DNA.
For the next 6
problems, consider the following DNA sequence, from the first exon of the HRAS gene. A transversion of G to T at position 24
confers anchorage independence and tumorigenicity to NIH 3T3 cells
(fibroblasts). This mutation is one step
in tumorigenic transformation of bladder cells, and it likely plays a role in
other cancers.
           10         20         30
5' TAAGCTGGTG GTGGTGGGCG CCGGCGGTGT
3' ATTCGACCAC CACCACCCGC GGCCGCCACA
Question 7.15 What would the sequence be if the G at position
14 (top strand) were alkylated at the O6 position by MNNG and then went through 2
rounds of replication?
Question 7.16 What would the sequence be if the C at
position 24 (bottom strand) were oxidized by HNO2 and then went through 2 rounds of replication?
Question 7.17 What would happen if this sequence were
irradiated with UV at a wavelength of 260 nm?
Question 7.18 If you were in charge of maintaining this DNA
sequence, and you had the enzymatic tools known in E. coli, how would you repair the damage from question 7.15? Consider what would happen if the damage were
corrected before or after replication.
Question 7.19. How could (a) the oxidative damage in problem 7.16 or
(b) the UV products in problem 7.17 be repaired?
Question 7.20 Let's say that a C to A transversion occurred
at position 24 on the bottom strand of the segment below, and that a segment
with a GATC is located about 300 bp away.
           10         20         30 ...       m
5' TAAGCTGGTG GTGGTGGGCG CCGGCGGTGT ... GGACGGATCC
3' ATTCGACCAC CACCACCCGC GGCAGCCACA ... CCTGCCTAGG
If this DNA is marked by the dam methylase system similarly to E. coli, how would the mismatch at position 24 be repaired? How does the cell decide which is the correct
nucleotide, and what enzymes would be used?
Explain how the enzymes work in this specific example.
Question 7.21. The
following is paraphrased from a presentation at the year 2000 meeting of the
American Society of Human Genetics.
Fanconi anemia (FA) is an autosomal recessive disease
associated with cancer predisposition. Cultured cells from FA patients have
high levels of spontaneous chromosome breaks, suggesting that FA cells may have
a defect in DNA repair. To test this hypothesis, DNA end-joining activity was
measured in nuclear extracts from diploid fibroblasts belonging to FA
complementation groups A and D, and from several normal donors. Extracts from
normal donors (controls) efficiently joined linear plasmid substrates, but
extracts from FA fibroblasts had only 10% the activity of the normal controls.
Addition of FA extract to normal cell extract had no effect on the activity of
the latter. However, when extracts from fibroblasts of FA complementation group
A were combined with those of complementation group D, normal levels of DNA
end-joining activity were reconstituted.