Marker as Br

  • 8/8/2019 Marker as Br

    1/50

    Terms Commonly Used in Genomics Research

    A B C D E F G H I J L M N O P R S T

    A

    accession number: an alphanumerical code which identifies a DNA sequence in a database.

    algorithm: a procedure embedded in a computer program.

    alignment: the process of comparing two or more DNA sequences to assess their degree of identity.

    alternative splicing: the mechanism by which different combinations of exons are joined, and the introns (intervening sequences found within a gene) between them removed, during processing of the RNA transcript. It results in the formation of variant mRNA messages from a single gene.

    amino acid: a simple class of organic compounds, 20 of which are used as the building blocks of proteins. Each amino acid bears both a carboxyl (COOH) and an amino (NH2) group. The codon sequence determines the sequence of amino acids in the gene product. Four of the 20 biologically significant amino acids are alanine, glycine, arginine and leucine.

    B

    base/base pair: the four nitrogenous subunits (nucleotides) of DNA: adenine (abbreviated as A), guanine (G), cytosine (C), and thymine (T). In the DNA molecule, they are linked to one another in pairs of long chains, where each member of the pair is complementary to the other. This double-stranded chain is itself twisted into a double helix. The complementarity between the strands is brought about by the interaction between A and T, and between G and C. Since the identity of a base on one strand can be used to infer the identity of the corresponding base on the other strand, the terms base and base pair are often used interchangeably. The number of bases (or base pairs) is used as a measurement of the size of a genome. For example, the length of the human genome is approximately 3 billion base pairs (abbreviated bp).
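
    The pairing rule in this entry (A with T, G with C) is what lets one strand be inferred from the other. The following is a minimal Python sketch of that inference, not part of the original glossary; the function names are illustrative, and the input is assumed to contain only the letters A, C, G and T.

```python
# Watson-Crick pairing as described in the entry: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return complement(seq)[::-1]


example = "GCATATTGCT"  # example sequence from the "DNA sequence" entry
print(complement(example))          # CGTATAACGA
print(reverse_complement(example))  # AGCAATATGC
```

    Because each base determines its partner, the length of either strand in bases equals the length of the duplex in base pairs, which is why the two units are used interchangeably.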

    BAC: an abbreviation for bacterial artificial chromosome. These are vectors designed to carry large pieces of inserted DNA. They can be propagated in E. coli, and so are used for cloning and other molecular biology purposes.

    bioinformatics: a research discipline combining computer science, biology, and information technology, targeting the storage, management and analysis of large amounts of biological data.

    BLAST: an abbreviation for basic local alignment search tool. This is a sequence comparison algorithm much used for DNA alignment. It is available online through NCBI.

    C

    cDNA: an abbreviation for complementary DNA. This is the in vitro reverse transcription product of mRNA. cDNA molecules usually lack intron sequence.

    chromosome: the structure in the eukaryotic nucleus and in the prokaryotic cell which carries

    most of the DNA. Prokaryotes have a single chromosome, but in eukaryotes, the diploid number

    varies from two pairs to hundreds. The variation is particularly notable in the plant kingdom.

    codon: a set of 3 nucleotides in a DNA sequence, which encodes a specific amino acid.
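
    As an illustration of how codons map to amino acids, here is a minimal Python sketch. It is an addition, not part of the original glossary: the table is deliberately partial (it covers only the four amino acids named in the amino acid entry, using standard genetic code assignments written as DNA coding-strand codons), and the function name and test sequence are hypothetical.

```python
# Partial codon table (DNA coding-strand codons) covering only the four
# amino acids named in this glossary; the full genetic code has 64 codons.
CODON_TABLE = {
    **{c: "Ala" for c in ("GCT", "GCC", "GCA", "GCG")},
    **{c: "Gly" for c in ("GGT", "GGC", "GGA", "GGG")},
    **{c: "Arg" for c in ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")},
    **{c: "Leu" for c in ("CTT", "CTC", "CTA", "CTG", "TTA", "TTG")},
}


def translate(dna: str) -> list[str]:
    """Split a coding sequence into codons and look each one up.

    Codons missing from the partial table are reported as '???';
    trailing bases that do not fill a whole codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE.get(codon, "???") for codon in codons]


print(translate("GCTGGACGTTTG"))  # ['Ala', 'Gly', 'Arg', 'Leu']
```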

    comparative genomics: an approach which sets out to compare the sequences of two or more related organisms. It is frequently used as a means of identifying gene functions and for forming evolutionary hypotheses.

    computational biology: the analysis and interpretation of biological data.

    C0T analysis: a method to distinguish between highly repetitive and low copy DNA sequences, which uses the principle of DNA renaturation kinetics, in which the rate at which a particular single-stranded sequence returns to the double-stranded state depends on the number of times it is found in the genome. In particular, the method is used to enrich a preparation of genomic DNA for low copy sequences (more likely to be genes).

    D

    database: a collection of data. See also relational database.

    DNA: an abbreviation for deoxyribonucleic acid, the carrier molecule of genetic information. The chain of nucleotides is held together by a polymer backbone formed by a sugar (deoxyribose) and a phosphate group (see also base).

    DNA chip: see microarray.

    DNA fingerprinting: the creation of a unique genetic profile of an individual based on its DNA.

    DNA sequence: the sequence of bases forming the DNA molecule. It is always expressed as a sequence of the four letters, each of which represents one of the four bases, for example GCATATTGCT.

    E

    EST: an abbreviation for expressed sequence tag. These represent fragments of gene sequences, and are obtained by single-pass sequencing of cDNA. They have been heavily used for gene discovery, particularly in organisms that have not yet been sequenced, and also as a source of sequence to design genic molecular markers.

    exon: the part of a DNA sequence which encodes a protein (usually in conjunction with other exons).

    F

    FastA: the first widely used search algorithm for database similarity searching; now sometimes used simply to denote the file format in which sequences are commonly expressed.

    functional genomics: the study of the structure, organization and function of a genome during developmental and other life processes of an organism.

    G

    gap: a space introduced into a DNA alignment to compensate for insertions and deletions in one

    sequence relative to another.

    GenBank: the most frequently accessed public domain database for DNA sequence data and related information. Managed by NCBI, supported by the National Library of Medicine and NIH, available at http://www.ncbi.nlm.nih.gov .

    gene: the unit of heredity, transmitted from generation to generation during reproduction. Each gene consists of a sequence of nucleotides, occupying a specific position along a chromosome. Most genes encode a specific functional product.

    gene expression: the process in which a gene is actively transcribed or "turned on".

    gene family: a group of closely related sequences which probably encode functionally similar products.

    genetic engineering: the technique of cloning a gene from one organism, and then adding it to another. Also refers to methods for altering gene expression, without necessarily introducing genes from another species. The rationale is most commonly to introduce or enhance a trait, to the benefit of the recipient, the producer, the environment or the consumer.

    genome: the entire genetic content of an organism. Genome size varies widely among organisms.

    genotype: the genetic constitution of an organism; see also phenotype.

    GMO: an abbreviation for genetically modified organism. Although technically this could refer to genetic modification through conventional breeding and selection, typically the term is applied specifically to organisms modified by genetic engineering. Also called transgenics.

    H

    haplotype: the specific allelic constitution within a sequence which is always inherited as a unit. For example, within a 1,000bp sequence, there may be four bases which vary in a population (the other 996 being identical for every member of the population). The haplotype of each individual is defined by the combination of the four variable bases present in the target sequence.
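
    The idea in this entry, reducing each sequence to the bases it carries at the variable positions, can be sketched in a few lines of Python. This is an illustrative addition rather than part of the original glossary: the sequences are hypothetical and much shorter than the 1,000bp example, and each individual is assumed to be represented by one already aligned, equal-length sequence.

```python
from collections import Counter


def variable_sites(sequences: list[str]) -> list[int]:
    """Return the 0-based positions at which the aligned sequences differ."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


def haplotypes(sequences: list[str]) -> list[str]:
    """Reduce each sequence to the bases it carries at the variable sites."""
    sites = variable_sites(sequences)
    return ["".join(seq[i] for i in sites) for seq in sequences]


# Hypothetical 12 bp sequences from four individuals.
population = [
    "ACGTACGTACGT",
    "ACGTACGTACGT",
    "ACCTACGTATGT",
    "ACCTACGAATGT",
]
print(variable_sites(population))       # [2, 7, 9]
print(Counter(haplotypes(population)))  # Counter({'GTC': 2, 'CTT': 1, 'CAT': 1})
```

    Here three positions vary, so each haplotype is a three-letter string; the nine invariant positions carry no information for telling the individuals apart.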

    heuristic: a procedure which derives an approximate solution in a more economical or faster way than can the more mathematically "strict" algorithm. In computer science, heuristics are applied when an exact solution is computationally impractical.

    homology: the degree of identity between two DNA or amino acid sequences. Originally homology referred to the degree of identity between two individuals, which followed from their having a common evolutionary origin.

    I

    imprinting: the phenomenon whereby a gene is expressed differently in an offspring depending

    on whether it was inherited from its father or its mother.

    intron: a DNA sequence within a gene which interrupts the exons. Introns are transcribed but removed from the RNA by splicing, and so are not usually translated.

    J

    junk DNA: describes non-coding DNA, although much of it probably has a function, such as

    to stabilize the structure of the genome or to control gene expression.

    L

    library: a set of DNA sequences or clones.

    M

    mapping: the process of identifying the location of a gene or DNA segment along a chromosome. In genetic mapping, this is done by analyzing patterns of inheritance in segregating populations (measured in recombinational units, commonly centiMorgans). In physical mapping, this describes the actual location of a sequence in a particular genomic region (measured in bp).

    metabolomics: the study of the global small molecule metabolite output of a specific cellular process or set of processes.

    microarray (or DNA chip, gene chip): a device in which a minute amount of each of many thousands of genic and/or other DNA sequences is immobilized on a glass or plastic support. When hybridized with a preparation of labeled cDNA, they are used to simultaneously measure the expression levels of all the sequences present on the chip.

    minimal tiling path: the smallest number of overlapping clones (usually BACs) needed to generate a larger sequence. Overlaps are defined by the ability of two clones to hybridize successfully with one another.

    molecular marker: a gene or DNA fragment with a known location on a chromosome. (For a good tutorial on the uses of markers, see the downloadable training materials available from the International Plant Genetic Resources Institute, http://www.ipgri.cgiar.org/ .)

    mutation: an abrupt change in the genotype of an organism which is not the result of recombination.

    N

    NCBI: abbreviation for National Center for Biotechnology Information, the organization which manages GenBank, PubMed (a database of publications), and other databases (available at http://www.ncbi.nlm.nih.gov ).

    nucleic acid: see base/base pair and DNA.

    nucleotide: the unit of DNA, consisting of one base, one phosphate molecule, and the sugar deoxyribose. See also base/base pair.

    O

    ortholog: a copy of a gene present in more than one related species. Orthologs are assumed to

    have derived from a common ancestral gene at the time of the last common ancestor.

    P

    paralog: a copy of a gene present in the same species. Paralogs arose from gene duplication.

    PCR: abbreviation for polymerase chain reaction, the process by which a defined fragment of DNA is replicated in vitro in a so-called thermocycler or PCR machine. These devices are designed to control the temperature and the time over which a particular temperature is held.

    phenotype: the visible appearance of an organism (with respect to one or several traits). The phenotype reflects the combined action of the genotype and the environment where the individual exists.

    phylogenetics: the field of biology which attempts to identify and understand relationships between the various life forms.

    phylogenomics: a method of assigning a function to a gene based on its evolutionary history in a phylogenetic tree; phylogenomics uses information related to the evolution of a gene to improve the prediction of gene function.

    polyploidy: a state in which multiple copies of a complete genome are present. Polyploidy is rare in animals, but common in plants. In animals (and also plants) some tissues within a diploid organism can be polyploid. The polyploid series is haploid (1 copy), diploid (2 copies), triploid (3 copies), tetraploid (4 copies), pentaploid (5 copies), hexaploid (6 copies) etc.

    promoter: the part of a gene which is used to control the gene's expression.

    proteins: large molecules composed of amino acids. Proteins are involved in many cellular structures, and are key to the catalysis of most reactions within the living cell.

    proteome: the set of all proteins in a cell. Unlike the relatively static genome, the dynamic

    proteome changes from minute to minute in response to many intra- and extracellular

    environmental signals.

    proteomics: the large-scale analysis of an organism's proteins to reveal expression and

    functions.

    R

    recombination: the formation among the offspring of a mating of genetic combinations not present in either parent, achieved via the physical exchange of genetic material during meiosis.

    regulatory DNA: DNA which controls the activity of genes. These DNA sequences tend to be short and are usually located close to the genes they control.

    relational database: a database which cross-references the different types of data it contains, and allows queries of any type (a sequence, the sequence name, etc.) to retrieve data.

    RNA: an abbreviation for ribonucleic acid, the molecule responsible for translating DNA into proteins. Made up of a single chain of nucleotides (the same bases as in DNA, except that uracil replaces thymine). There are three main types of RNA: messenger RNA, transfer RNA, and ribosomal RNA.

    RNA interference (RNAi): a natural process used by the cell to turn off, or silence, a particular gene or gene family. Scientists can now use a transgenic approach which mimics this process, and therefore can manipulate gene expression. In research it is currently being heavily used to identify the function of various genes, by studying the phenotypic effect of turning these genes off.

    S

    sequencing: determining the order (sequence) of bases in DNA, or amino acids in a protein.

    SNP: an abbreviation for single nucleotide polymorphism, pronounced "snip". A SNP which

    distinguishes two sequences can be used as a genetic marker.

    structural genomics: an approach to identifying the 3-D structure of proteins, which will help identify their functions and provide targets for drug design.

    synteny: the occurrence of two or more orthologs on the same chromosome in different species, without regard to gene order. Increasingly used to include conservation of gene order as well, although this is better described by the term collinearity.

    T

    transgenic: an organism containing genetic material from another organism transferred by genetic engineering. See also GMO.

    transcription: the process in which RNA is formed from DNA.

    transcriptome: the parts of the genome which are transcribed.

    transcriptomics: a means of depicting the expression level of many genes, typically based

    on microarray technology.

    transformation: the process of adding a gene from one organism into another.

    transposon: a genetic element which is able to move within the genome.

    U

    unigene: a representation of a gene family, used to avoid the appearance of highly redundant

    sequences in EST libraries.

    universal primers: a PCR primer pair which can amplify a set of orthologs.

    UTR: an abbreviation for untranslated region, that part of a gene sequence which is not translated into a protein.

    Main sources and other glossaries

    Chemis Interactive Molecular Library: nucleic acids http://www.geneticengineering.org/chemis/Chemis-NucleicAcid/DNA.htm , 2000, Dr Didier Collomb 2/13/02

    Friend, S.H. and Stoughton, R.B. (2002, February). The magic of microarrays. Scientific American, pp. 44-53

    Hartwell, L.H., Hood, L., Goldberg, M., Reynolds, A.E., Silver, L.M., & Veres, R.C. (2000).

    Genetics: from genes to genomes. New York: McGraw-Hill Companies, Inc.

    Interagency Working Group on Plant Genomes (2000). National Plant Genome Initiative. Washington, D.C.: National Science and Technology Council

    Genomics Initiative, a supplement to the Cornell Chronicle. (1999, January). Cornell University

    Glossary of Biotechnology for Food and Agriculture. FAO Research and Technology Paper #9.

    Human Genome Management Information System (HGMIS) (2001). Genomics and its impact on medicine and society: a primer, [pdf]. HGMIS at Oak Ridge National Laboratory, Oak Ridge, TN, for the U.S. Department of Energy Human Genome Program. Available at http://www.ornl.gov/hgmis

    National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/

    National Institutes of Health, National Institute of General Medical Sciences (2001) Genetics Basics. NIH Publication No. 01-662. Also available at: http://publications.nigms.nih.gov/genetics/

    Genome News Network glossary http://www.genomenewsnetwork.org/

    Wikipedia, the free encyclopedia http://en.wikipedia.org/

    For reviews of some online glossaries in genomics and biotechnology, see

    http://www.sciencegenomics.org

    What is Marker-Assisted Breeding?

    Classical plant breeding is the intentional interbreeding and selection of plant varieties with the goal of producing new varieties with improved properties (e.g. higher yield, bigger fruit, disease resistance, etc.).

    Marker-Assisted Breeding (MAB) combines classical plant breeding with the tools and discoveries of molecular biology and genetics, most specifically the use of molecular markers.

    Use of markers in the breeding cycle

    Plant breeding is a reiterating cycle of crossing and selection. Markers can be used to increase the effectiveness of each of the various steps involved in breeding, as we will see in the next sections.

    Terminology

    The terms Marker-Assisted Breeding (MAB), Marker-Assisted Selection (MAS), and Molecular Breeding are often used interchangeably. In this module, we will use Marker-Assisted Breeding (MAB) for the general process, and Marker-Assisted Selection (MAS) for the more specific usage of molecular markers to select for particular traits or genotypes.

    See the Wikipedia definition of Marker-Assisted Selection for more:

    http://en.wikipedia.org/wiki/Marker_assisted_selection

    What are molecular markers?

    A marker, in this context, is an identifier (sometimes called a tag) of a particular aspect of phenotype and/or genotype; its inheritance can easily be followed from generation to generation.

    Markers can be:

    Morphological: phenotypic variation which is scorable on the basis of a single plant (e.g. flowering time)

    Biochemical: variants in the size or net charge of a protein (e.g. isozymes) or in the chemical composition of a metabolite (e.g. sugar)

    Molecular: variants in the DNA sequence (e.g. microsatellites)

    Marker assisted selection or marker aided selection (MAS) is a process whereby a marker (morphological, biochemical or one based on DNA/RNA variation) is used for indirect selection of a genetic determinant or determinants of a trait of interest (e.g. productivity, disease resistance, abiotic stress tolerance, and/or quality). This process is used in plant and animal breeding.

    Overview

    Considerable developments in biotechnology have led plant breeders to develop more efficient selection systems to replace traditional phenotypic-pedigree-based selection systems.

    Marker assisted selection (MAS) is an indirect selection process where a trait of interest is selected not based on the trait itself but on a marker linked to it.[1][2][3][4] For example, if MAS is being used to select individuals with a disease, the level of disease is not quantified but rather a marker allele which is linked with disease is used to determine disease presence. The assumption is that the linked allele associates with the gene and/or quantitative trait locus (QTL) of interest. MAS can be useful for traits that are difficult to measure, exhibit low heritability, and/or are expressed late in development.

    Marker types

    A marker may be:

    • Biological - Different pathogen races or insect biotypes, based on host-pathogen or host-parasite interaction, can be used as a marker since the genetic constitution of an organism can affect its susceptibility to pathogens or parasites.

    • Morphological - The first marker loci available were those with an obvious impact on the morphology of the plant. Genes that affect form, coloration, male sterility or resistance, among others, have been analyzed in many plant species. Examples of this type of marker include the presence or absence of an awn, leaf sheath coloration, height, grain color, aroma of rice, etc. In well-characterized crops like maize, tomato, pea, barley or wheat, tens or even hundreds of such genes have been assigned to different chromosomes.

    • Biochemical - A gene that encodes a protein that can be extracted and observed; for example, isozymes and storage proteins.

    • Cytological - The chromosomal banding produced by different stains; for example, G banding.

    • DNA-based and/or molecular - A unique DNA sequence, occurring in proximity to the gene or locus of interest, can be identified by a range of molecular techniques such as RFLPs, RAPDs, AFLP, DAF, SCARs, microsatellites, etc.

    Sax in 1923 first reported association of a simply inherited genetic marker with a quantitative trait in plants when he observed segregation of seed size associated with segregation for a seed coat color marker in beans (Phaseolus vulgaris L.). Rasmusson in 1935 demonstrated linkage of flowering time (a quantitative trait) in peas with a simply inherited gene for flower color.

    Gene vs marker

    The gene of interest is directly related to the production of protein(s) that produce certain phenotypes, whereas markers should not influence the trait of interest but are genetically linked to it (and so go together during segregation of gametes, due to the concomitant reduction in homologous recombination between the marker and gene of interest). For many traits, genes have been discovered and can be directly assayed for their presence with a high level of confidence. However, if a gene has not been isolated, a marker is used to tag the gene of interest. In such cases there may be some false positive results due to recombination between the marker and the gene (or QTL) of interest. A perfect marker would elicit no false positive results.

    Important properties of ideal markers for MAS

    An ideal marker:

    • Allows easy recognition of all possible phenotypes (homo- and heterozygotes) from all different alleles

    • Demonstrates measurable differences in expression between trait types and/or gene of interest alleles, early in the development of the organism

    • Has no effect on the trait of interest that varies depending on the allele at the marker loci

    • Shows low or null interaction among the markers, allowing the use of many at the same time in a segregating population

    • Is abundant in number

    • Is polymorphic

    Demerits of morphological markers

    Morphological markers are associated with several general deficits that reduce their usefulness, including:

    • the delay of marker expression until late into the development of the organism

    • dominance

    • deleterious effects

    • pleiotropy

    • confounding effects of genes unrelated to the gene or trait of interest but which also affect the morphological marker (epistasis)

    • rare polymorphism

    • frequent confounding effects of environmental factors which affect the morphological characteristics of the organism

    To avoid problems specific to morphological markers, DNA-based markers have been developed. They are highly polymorphic, show simple inheritance (often codominant), occur abundantly throughout the genome, are easy and fast to detect, exhibit minimal pleiotropic effects, and their detection is not dependent on the developmental stage of the organism. Numerous markers have been mapped to different chromosomes in several crops including rice, wheat, maize, soybean and several others. Those markers have been used in diversity analysis, parentage detection, DNA fingerprinting, and prediction of hybrid performance. Molecular markers are useful in indirect selection processes, enabling manual selection of individuals for further propagation.

    Selection for major genes linked to markers

    Major genes which are responsible for economically important characteristics are frequent in the plant kingdom. Such characteristics include disease resistance, male sterility, self-incompatibility, and others related to the shape, color, and architecture of whole plants, and are often mono- or oligogenic in nature. Marker loci which are tightly linked to major genes can be used for selection and are sometimes more efficient than direct selection for the target gene. Such advantages in efficiency may be due, for example, to higher expression of the marker mRNA in cases where the marker is itself a gene. Alternatively, in cases where the target gene of interest differs between two alleles by a difficult-to-detect single nucleotide polymorphism, an external marker (be it another gene or a polymorphism that is easier to detect, such as a short tandem repeat) may present as the most realistic option.

    Situations that are favorable for molecular marker selection

    There are several indications for the use of molecular markers in the selection of a genetic trait, in situations where:

    • the selected character is expressed late in plant development, like fruit and flower features or adult characters with a juvenile period (so that it is not necessary to wait for the organism to become fully developed before arrangements can be made for propagation)

    • the expression of the target gene is recessive (so that individuals which are heterozygous for the recessive allele can be crossed to produce some homozygous offspring with the desired trait)

    • special conditions are required in order to invoke expression of the target gene(s), as in the case of breeding for disease and pest resistance (where inoculation with the disease or subjection to pests would otherwise be required). This advantage derives from the errors due to unreliable inoculation methods and the fact that field inoculation with the pathogen is not allowed in many areas for safety reasons. Moreover, problems in the recognition of environmentally unstable genes can be avoided.

    • the phenotype is affected by two or more unlinked genes (epistasis). For example, selection for multiple genes which provide resistance against diseases or insect pests for gene pyramiding.

    The cost of genotyping (an example of a molecular marker assay) is falling while the cost of phenotyping is increasing, particularly in developed countries, thus increasing the attractiveness of MAS as the development of the technology continues.

    Steps for MAS

    Generally, the first step is to map the gene or quantitative trait locus (QTL) of interest using different techniques, and then to use this information for marker assisted selection. Generally, the markers to be used should be close to the gene of interest (

    linked to the trait of interest are identified by QTL mapping and later the same information is used in the same population. In this approach, pedigree structures are created from families that are created by crossing a number of parents (in three-way or four-way crosses). Both phenotyping and genotyping are done, using molecular markers mapped to the possible location of the QTL of interest. This will identify markers and their favorable alleles. Once these favorable marker alleles are identified, the frequency of such alleles will be increased and the response to marker assisted selection is estimated. Marker allele(s) with desirable effect will be further used in the next selection cycle or other experiments.

    High-throughput genotyping techniques

    Recently, high-throughput genotyping techniques have been developed which allow marker aided screening of many genotypes. This will help breeders in shifting from traditional breeding to marker aided selection. One example of such automation is the use of DNA isolation robots, capillary electrophoresis and pipetting robots.

    One recent example of a capillary system is the Applied Biosystems 3130 Genetic Analyzer. This is the latest generation of 4-capillary electrophoresis instruments for low to medium throughput laboratories.

    Use of MAS for backcross breeding

    A minimum of five or six backcross generations are required to transfer a gene of interest from a donor (which may not be adapted) to a recipient (the recurrent adapted cultivar). The recovery of the recurrent genotype can be accelerated with the use of molecular markers. If the F1 is heterozygous for the marker locus, individuals with the recurrent parent allele(s) at the marker locus in the first or subsequent backcross generations will also carry a chromosome tagged by the marker.
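
    The figure of five or six backcross generations can be related to a standard textbook expectation that is not stated in the source: without any selection, and assuming unlinked loci, each backcross halves the remaining donor genome on average, so the expected recurrent-parent fraction after n backcrosses is 1 - (1/2)^(n+1). A minimal Python sketch under those assumptions (the function name is illustrative):

```python
def expected_recurrent_fraction(backcrosses: int) -> float:
    """Expected fraction of recurrent-parent genome after n backcrosses.

    Assumes no selection and unlinked loci: 1 - (1/2) ** (n + 1).
    n = 0 corresponds to the F1, which is half donor and half recurrent.
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Print the expected recovery for the F1 and the first six backcrosses.
for n in range(7):
    label = "F1" if n == 0 else f"BC{n}"
    print(f"{label}: {expected_recurrent_fraction(n):.4f}")
```

    Under this expectation the average recovery only passes 98% at the fifth backcross, which is consistent with the five or six generations quoted above. These are population averages; individual plants vary around them, and that variation is what marker-based selection for the recurrent genotype exploits.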

    Marker assisted gene pyramiding

    Gene pyramiding has been proposed and applied to enhance resistance to disease and insects by selecting for two or more genes at a time. For example, in rice such pyramids have been developed against bacterial blight and blast. The advantage of using markers in this case is that they allow selection for QTL-allele-linked markers that have the same phenotypic effect.

    ABI 3130 genetic analyzer.

    MAS has also proved useful for livestock improvement.[6]

    A coordinated effort to implement wheat (Triticum turgidum and Triticum aestivum) marker assisted selection in the U.S., as well as a resource for marker assisted selection, exists at the Wheat CAP (Coordinated Agricultural Project) website. Farhad Kahani

    References

    1. review MAS in plant breeding

    2. Ribaut, J.-M. et al., Genetic basis of physiological traits. In Application of Physiology in Wheat Breeding, CIMMYT, Mexico, 2001.

    3. Ribaut, J.-M. and Hoisington, D. A., Marker assisted selection: new tools and strategies. Trends Plant Sci., 1998, 3, 236-239.

    4. Rosyara, U.R. 2006. Requirement of robust molecular marker technology for plant breeding applications. Journal of Plant Breed. Gr. 1: 67-72.

    5. Rosyara, U. R.; K.L. Maxson-Stein; K.D. Glover; J.M. Stein; J.L. Gonzalez-Hernandez. 2007. Family-based mapping of FHB resistance QTLs in hexaploid wheat. Proceedings of National Fusarium head blight forum, 2007, Dec 2-4, Kansas City, MO.

    6. Dekkers, J. C. M. 2004. Commercial application of marker- and gene-assisted selection in livestock: Strategies and lessons. J. Anim. Sci. 82:E313-E328.

    4. review application of MAS in crop improvement

    7. Collard B.C., D.J. Mackill. 2007. Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Philos Trans R Soc Lond B Biol Sci. 2007 (in press)

    9. Dubcovsky, J. 2004. Marker-Assisted Selection in Public Breeding Programs: The Wheat Experience. Crop Sci. 44:6.

    10. Goodman, M.M. 2004. Plant Breeding Requirements for Applied Molecular Biology. Crop Sci. 44:6.

    11. MAS, what is it?

    Key concept: genetic linkage

    When 2 genetic loci or alleles of genes are physically near each other on a chromosome, they are more likely to be inherited together. By looking at how frequently they are inherited together rather than separately among a set of offspring from a cross, we can calculate how closely linked they are to one another.

    In marker-assisted breeding, keep in mind that the marker and the gene for the trait of interest may not be at exactly the same locus, but tightly linked (very close together). We will discuss this further in the mapping and QTL sections.
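
As a minimal sketch of the linkage calculation described above, the recombination frequency between two loci can be estimated as the share of recombinant offspring among all offspring scored. The function name and the offspring counts below are hypothetical and serve only as an illustration.

```python
def recombination_frequency(parental, recombinant):
    """Fraction of offspring in which the two loci were NOT inherited together."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no offspring scored")
    return recombinant / total

# Hypothetical counts from a cross: 180 parental-type and 20 recombinant offspring
rf = recombination_frequency(parental=180, recombinant=20)
print(f"Recombination frequency: {rf:.2%}")
```

The smaller this value, the more tightly the marker and the gene are linked, and the more reliably the marker predicts the presence of the gene.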

    What are the advantages of molecular markers?

    In this module we will only discuss molecular markers, which have several advantages over the

    other types:

    • They are not subject to environmental influence
    • They are unlimited in number
    • They are usually more objective
    • They can be easier to analyse
    • They may be less expensive than some types of markers (especially when they can be done in high-throughput)

    (see also de Vicente and Fulton 2004)

    What are the advantages of using markers in breeding?

    • They can save a lot of time in the breeding process
    • They may aid in discovering more information about the function of the gene of interest
    • They have many uses, including genetic diversity assessment, quality control (e.g. in variety development), marker-assisted breeding (the focus of this module) and others

    (Peleman and van der Voort 2003)

    Let's look at a few of the advantages in more detail, as well as some disadvantages.

    Advantages of MAB: Time

    When a marker is genetically linked to a trait, its use can speed up the identification of genetically superior plants.

    DNA can be extracted from very young plants and the marker assay carried out long before the plant expresses the actual trait.

    DNA can be extracted from plants at a very early stage, much sooner than most traits can be seen or measured. This person is harvesting leaves for DNA extraction from tomato seedlings just a few days after germination.

    Time savings add up

    Advantages of MAB: Cost

    Depending on the trait, the use of MAB can also reduce costs. Maintaining field plots and greenhouse space, and employing labour to measure traits, can be expensive and sometimes (for example with certain diseases) impossible. The ability to test for the presence of a certain allele when the plant is still small, rather than waiting until the associated trait can be seen, can decrease the amount of phenotyping that is necessary.

    Products such as the FTA cards (http://www.whatman.com) shown at left can make DNA extractions, and therefore marker work, easier. This person is extracting DNA using a very simple procedure and a very young plant.

    Of course, some phenotyping will always be required to confirm results, but MAB can decrease

    the amount of phenotyping in many situations.

    Advantages of MAB: not subject to environmental effects

    Because they are not subject to environmental effects, markers ensure that a trait can be selected regardless of the conditions (location and climate) where the plants are grown. Furthermore, new varieties developed can be identified and tracked with their unique genetic fingerprint (see example in the applications section).

    Advantages of MAB: knowledge

    Using markers can also give us a deeper understanding of the traits we are selecting for and HOW they work. This could allow for more efficient selection in the future.

    For example, once a marker-trait correlation is established, the marker can be used to clone the gene and more thoroughly study its action. In tomato, a major QTL affecting fruit weight was cloned and found to control carpel cell number early in fruit development (Frary et al. 2000).

    Disadvantages of MAB: Costs

    Using molecular markers requires the use of specific laboratory equipment, at the very least a PCR (polymerase chain reaction) thermal cycler and electrophoresis and visualization equipment.

    So start-up costs can be high, although these may be compensated for by later savings (and prices of the necessary equipment and reagents have been decreasing over time).

    Disadvantages of MAB: technical skills needed

    Along with the equipment required for molecular marker work comes the need for the technical

    skills and knowledge of how to do the work and understand the results.

    These are not difficult skills to learn, but are not always part of a classical plant breeder's education.

    Plant breeders learn how to load electrophoresis gels at a training course in Ghana.

    Outsourcing

    With the proliferation of molecular marker laboratories and companies it is now possible, and indeed often cost-effective, to have the marker work done off-site for a fee rather than at the home institution.

    A recent service developed to address this need is the GCP Genotyping Support Service.

    Choosing markers for MAB

    There are many types of molecular markers suited to MAB. Each has its own advantages and disadvantages. We will learn more about markers in later slides.

    Resources

    For a good analysis on weighing the costs of MAB and conventional breeding:

    Dreher K, Khairallah M, Ribaut J-M, Morris M (2003) Money matters (I): costs of field and laboratory procedures associated with conventional and marker-assisted breeding at CIMMYT. Mol Breeding 11: 221-234

    For a good overview of MAB:

    Peleman JD and van der Voort JR (2003) The challenges in Marker Assisted Breeding. In: Eucarpia leafy vegetables. Van Hintum et al (eds). The Netherlands: Center for Genetic Resources

    Collard BCY and Mackill DJ (2008) Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Phil Trans R Soc B 363: 557-572

    For a comprehensive book on marker-assisted breeding:

    Newbury HJ (ed) (2003) Plant Molecular Breeding. Blackwell Publishing, CRC Press, Birmingham, UK

    Guimarães EP, Ruane J, Scherf BD, Sonnino A, Dargie JD (eds) (2007) Marker-assisted selection: current status and future perspectives in crops, livestock, forestry and fish. Food and Agriculture Organization of the United Nations, Rome. Freely downloadable from http://www.fao.org/docrep/010/a1120e/a1120e00.htm

    Considerations in selection of marker type

    There are many types of molecular markers available. Which type you select to use for your project will depend on:

    • What the goals of the project are
    • How variable the germplasm is
    • What sort of population is being analyzed
    • What level of resolution is needed
    • Whether or not there is previous work you can take advantage of (i.e. marker development)

    Desirable properties of markers

    Different marker types have variable characteristics. Desirable qualities of molecular markers

    include the following:

    • Polymorphic
    • Reproducible
    • Evenly distributed across the whole genome (not clustered in particular regions)
    • Inexpensive
    • Easy to analyse
    • Co-dominant (so that heterozygotes can be distinguished from homozygotes)

    Other criteria

    Other criteria may need to be considered in selecting a marker type:

    • What marker platforms you have access to or can manage in your laboratory. For example, are agarose, polyacrylamide or sequencing gels available? Various marker types require specific equipment (see references below for more details).
    • How many markers you will need. For example, in an initial QTL study (see later section) you may only need a small number of markers distributed across the genome, but for separating linkage drag or fine-mapping a gene, you will need many concentrated in a small part of the genome. The cost per marker may be decreased if many are used.
    • If outsourcing is an option. This may be more efficient than doing the work in-house.

    For more details, see: De Vicente and Fulton 2004, Dreher et al. 2003, Farooq and Azam 2002a, 2002b

    Extraction of DNA

    Obtaining DNA of sufficient quality and quantity is an important consideration for MAB. Some markers need relatively large quantities of high grade DNA, others work well with small amounts of low grade DNA. The required quality and quantity of DNA need to be considered when deciding which markers, and which DNA extraction protocol, to use.

    For example, AFLPs require higher amounts of high quality DNA, but an SSR assay works with 10 ng of lesser quality DNA (more on types of markers later).

    A laboratory technician carefully adds chloroform to a large DNA extraction prep.

    Understanding markers

    To be able to appropriately select and use markers in breeding, it is important to understand how

    these markers are designed and how they are able to identify specific areas of the genome.

    The key concepts to understanding this include:

    • the basic structure of DNA
    • the Polymerase Chain Reaction (PCR)
    • the organization of the DNA sequence

    Genome organization

    It is important to remember that only part (sometimes a very small part!) of the DNA sequence is composed of genes. The rest is non-coding sequence, including lots of repetitive sequences,

    microsatellites and transposons. In some species, the genic fraction of the genome may be

    Most molecular marker platforms include a visualization system such as gel electrophoresis that requires a minimal amount of DNA in order for it to be seen. Therefore PCR is important not only in identifying specific regions of DNA (as with specific primers, discussed next slide) but also to generate enough copies (amplify) of the segment of interest such that it can be seen on a gel system. Thus the importance of PCR in MAB cannot be overemphasized.
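
To give a feel for why amplification makes a fragment visible, the sketch below assumes an idealized reaction in which the target doubles in every cycle. Real reactions are less efficient and eventually plateau. The function name, the starting copy number and the `efficiency` parameter are illustrative assumptions, not part of any particular protocol.

```python
def pcr_copies(start_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency=1.0 means perfect doubling in each cycle (an assumption;
    real reactions fall short of this and eventually plateau).
    """
    return start_copies * (1 + efficiency) ** cycles

# Hypothetical starting amount of 100 template copies
for n in (10, 20, 30):
    print(n, "cycles:", f"{pcr_copies(100, n):.3g}", "copies")
```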

    At the far left a person loads a small horizontal agarose gel, while the next photo shows several types of vertical gel apparatus. For more information on PCR and gel electrophoresis equipment, see the Resources at the end of this section.

    Primers

    An important component of PCR is the primer(s), which are short sequences of DNA (typically 10-30 base pairs long) that help initiate the synthesis process and also determine exactly which region(s) of DNA will be amplified.

    The design of primer sequences exploits the complementarity property of the DNA molecule. In the example below, the sequence to be amplified (the "template") is shown in blue, and a possible primer sequence (18 bases in this case) is shown in red.

    Keep in mind that the primer sequences can be located anywhere along the template sequence, but must flank the key area of interest.

    DNA template: GCACTTAGCGTAATCGATCTAATGGCATGTGTACGATGCCGTA

    Primer sequence: CGTGAATCGCATTAGCTA
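
The base pairing in this example can be checked with a few lines of Python. This is only an illustrative sketch: the `complement` helper is a name introduced here, and the primer is compared base by base with the first 18 bases of the template, in the same left-to-right order in which the example is written.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in seq)

template = "GCACTTAGCGTAATCGATCTAATGGCATGTGTACGATGCCGTA"
primer = "CGTGAATCGCATTAGCTA"

# The primer should pair with the first len(primer) bases of the template
print(complement(template[:len(primer)]) == primer)
```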

    Primer Design

    Designing good primers involves understanding some other concepts of DNA synthesis:

    • the melting temperature of the double-stranded DNA (to know what annealing temperature to use for your PCR)
    • the stability and relative GC content (GC bonds are more stable than AT bonds, which affects the melting temperature)
    • avoiding complementarity within the primer sequence, as this inhibits proper annealing

    But there are a number of software programs to help you design your primers, many of which are freely available on the internet (see Resources at the end of this section). Primers can be purchased from any DNA synthesis facility.
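
As a rough illustration of how GC content feeds into melting temperature, the sketch below uses the simple rule of thumb Tm ≈ 2 °C per A/T plus 4 °C per G/C, which is only an approximation for short primers. Primer design software uses more accurate models, and the function names here are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def rule_of_thumb_tm(seq):
    """Approximate melting temperature (deg C): 2 per A/T plus 4 per G/C.

    A rough estimate for short primers only, not a substitute for
    dedicated primer design software.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

primer = "CGTGAATCGCATTAGCTA"  # the 18-base primer from the earlier example
print(f"GC content: {gc_content(primer):.1%}")
print(f"Approximate Tm: {rule_of_thumb_tm(primer)} C")
```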

    Molecular markers

    As mentioned previously, there are many types of possible markers. Here we will only discuss molecular markers, due to the advantages they have over other types.

    And of molecular markers, we will include only PCR-based markers, by far the most widely used types.

    As detailed information is available elsewhere, we will just briefly summarize types of markers here, and go on to discuss their use in MAB.

    More resources about molecular markers can be found at the end of this chapter.

    This is a general term for markers that are developed with sequence-specific primers, usually for a particular genome region or type of region. Some knowledge of DNA sequence is necessary, obviously, but in many cases this is already publicly available. If you find the sequence of a gene you are interested in online or in a publication, you could design primers specific for that sequence, and try to amplify the sequence in your own germplasm.

    For example, a search of the NCBI* database for the term "rice blast resistance" results in many sequences, one of which is a gene that is 3458 base pairs (bp) long, the first ~500 of which are shown below.

    Either manually or using a primer design software program, primers could be developed which would amplify a fragment of this gene from rice genomic DNA.

    *National Center for Biotechnology Information

    Expressed Sequence Tags (ESTs)

    An EST is a short (200-500 bp) DNA sequence obtained from one or both ends of a cDNA molecule. Because cDNA is obtained from mRNA, we know that EST sequences are genic. In contrast, sequences obtained from genomic DNA are highly likely to be non-genic. The development of cDNA libraries is technically challenging, but ESTs are highly useful markers. Large numbers of ESTs are publicly available, and primers can easily be designed from these sequences.

    Crop       # of EST sequences
    Maize      1,464,859
    Soybean    1,317,957
    Rice       1,220,876
    Wheat      1,051,300
    Barley     478,734
    Cowpea     183,658
    Cassava    76,566

    As of October 3, 2008, NCBI had more than 57,000,000 EST sequences. This figure shows the number available for a few crops. The updated numbers can be found at http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html

    An informative tutorial on ESTs can also be found at http://www.ncbi.nlm.nih.gov/About/primer/est.html

    Microsatellites

    Also called simple sequence repeats (SSRs), these are tandemly arranged blocks of short nucleotide sequences, usually 1-10 nucleotides long (though more typically 2 or 3), repeated up to 50 times. The number of repeat units in the block can vary noticeably between individuals within a species. This variation can be targeted by PCR, by placing the primers either side of the block. This leads to highly reproducible, co-dominant, easily analyzed and polymorphic markers.

    As a result, SSRs represent one of the most widely used markers in MAB.

    The di-nucleotide motif AG is repeated 6 times in this example of a microsatellite. Primers would be designed from the sequence of the red flanking sequences.

    (See Powell et al. 1996)

    Microsatellites, cont'd.

    Note that the differences seen (the polymorphisms) between organisms are due to the number of repeats, which leads to a difference in size of the amplified products (i.e. the length of the DNA segment between the 2 primers), not a difference in the DNA sequence per se.

    In this example, the amplified fragment from A is shorter than that from B, because the AG motif is repeated fewer times. As a result, the A amplicon runs faster through the gel than the B amplicon, and the polymorphism is recognized by the different positions of the bands.
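
The size difference between two SSR alleles can be sketched as simple arithmetic: the amplicon length is the flanking sequence (including the primer sites) plus the motif length times the number of repeats. The flanking sequences and repeat numbers below are hypothetical and chosen only for illustration.

```python
def amplicon_length(left_flank, motif, repeats, right_flank):
    """Length in bp of a PCR product spanning a microsatellite block."""
    return len(left_flank) + len(motif) * repeats + len(right_flank)

# Hypothetical flanking sequences; only the repeat number differs between lines
left, right = "GCACTTAGCG", "CATGTGTACG"
size_a = amplicon_length(left, "AG", 6, right)  # line A: 6 repeats
size_b = amplicon_length(left, "AG", 9, right)  # line B: 9 repeats
print(size_a, size_b, size_b - size_a)
```

Under these assumptions the two amplicons differ by three repeats of a 2 bp motif, so line A gives the shorter, faster-running band.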

    Cleaved Amplified Polymorphic Sequences (CAPS)

    A CAPS marker represents a refinement of an STS marker. Where an STS assay shows no allelic variation in amplicon size, it may still be informative if the amplicon varies in sequence between individuals. If such sequence variation can be identified by treatment with a restriction enzyme after the PCR, the STS becomes a CAPS marker (note that the "C" in CAPS stands for "cleaved" to reflect the need for restriction digestion to identify the polymorphism). Since each restriction enzyme has its unique recognition site, a CAPS marker needs to specify both the primers and the specific restriction enzyme used.
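
The idea behind a CAPS assay can be sketched as follows: two amplicons of the same length give different fragment patterns if only one of them contains the enzyme's recognition site. The amplicon sequences and the `digest` helper are hypothetical. EcoRI (recognition site GAATTC, cutting after the first G) is used here simply as a familiar example enzyme.

```python
def digest(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is the position within the recognition site at which the
    enzyme cuts (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments, start = [], 0
    pos = amplicon.find(site)
    while pos != -1:
        fragments.append(pos + cut_offset - start)
        start = pos + cut_offset
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments

# Hypothetical amplicons of equal length; allele_b carries a single-base
# change that destroys the EcoRI site (GAATTC -> GAGTTC)
allele_a = "ATGCCGTAGAATTCGGTACCTA"
allele_b = "ATGCCGTAGAGTTCGGTACCTA"
print(digest(allele_a, "GAATTC", 1))  # two fragments: the site is cut
print(digest(allele_b, "GAATTC", 1))  # one fragment: the amplicon stays intact
```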

    AFLPs

    The AFLP technique is a rather complicated combination of restriction digestion and selective PCR, which can quickly generate a large number of markers (Vos et al. 1995). These markers are highly reproducible and require no a priori sequence knowledge, but they require high-resolution visualization platforms and can be difficult to analyze due to their complex banding patterns.

    Therefore AFLPs are used more in genetic mapping and fingerprinting than MAB; however, if linkage between a gene of interest and a particular AFLP fragment can be established, then it is possible to convert the AFLP into a simple marker, called a SCAR (see later slide).

    An example of an AFLP gel (from Vos et al. 1995) showing the resultant complex banding patterns.

    RAPDs

    Random Amplified Polymorphic DNA markers use a single, short (usually around 10 bases) primer. This amplifies anonymous sequence(s) throughout the genome (Williams et al. 1990).

    Although they are simple to perform, inexpensive, and easy to analyze, their reproducibility is poor and the assay is not codominant (heterozygotes cannot be distinguished from homozygotes). Therefore, RAPDs per se are not well suited for MAB. However, if an informative RAPD marker is identified, it is possible to convert it (as for an AFLP) into a SCAR (see next slide).

    SCARs

    If a RAPD or an AFLP fragment appears to be correlated with the presence of a favourable allele at an important gene, it can be converted to a more reliable marker called a Sequence Characterized Amplified Region (Paran and Michelmore 1993) (or simply a Sequence-tagged site, STS).

    The idea is to first purify the DNA fragment (by cutting it out of the gel), then clone it. The DNA sequence of the clone will allow for specific primers to be designed. Obviously this is not doable on a large scale, but it is a useful way of exploiting individual RAPD or AFLP fragments in MAB.

    SNPs: Single Nucleotide Polymorphisms

    SNPs (pronounced "snips") are differences in DNA sequence of just one (or sometimes a small number of) nucleotides. Where these differences occur within a genic sequence, they are more often than not phenotypically neutral, but sometimes they can be associated with a change in the amino acid sequence of the gene product. They are very common, and are distributed throughout the genome.

    SNP genotyping can be relatively simple, but SNP discovery generally requires extensive DNA sequencing. Although not yet widely used in MAB, in future SNPs are likely to dominate the field, due to the increase in automation possible*.

    An example of a SNP between 2 small DNA sequences.
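
A minimal sketch of how SNPs can be listed between two sequences, assuming the sequences are already aligned and of equal length. The function name and the two short sequences are hypothetical.

```python
def find_snps(seq1, seq2):
    """List (position, base1, base2) for every mismatch between two
    already-aligned sequences of equal length (positions are 1-based)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq1, seq2), start=1)
        if a != b
    ]

# Hypothetical sequences that differ at a single position
print(find_snps("GCACTTAGCGTA", "GCACTTGGCGTA"))
```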

    Other marker types

    New marker types are always being developed; just a few additional types are noted here, with references for further information:

    TRAP (Targeted Region Amplified Polymorphism) (Miklas et al. 2006)

    SRAP (Sequence Related Amplified Polymorphism, targeting open reading frames) (Li and Quiros 2001)

    DArT (Diversity Array Technology): http://www.diversityarrays.com/

    Associated or complementary technologies

    There are a number of technologies that can be used in addition to or instead of molecular markers in special cases, for example in polyploids where alleles may have dosage effects.

    A few examples are given below, all of which are explained well in Wikipedia (http://en.wikipedia.org):

    Single strand conformation polymorphism (SSCP)

    High resolution melting technique (HRM)

    Taqman assay

    Markers in MAB

    The most important requirement for MAB is to identify a convenient marker(s) closely linked to a gene(s) of interest. Later sections explain how this is done, via mapping and/or QTL

    identification. The next section will discuss the selection of germplasm and genetic diversity.

    Resources for Primer Design

    Primer-Blast from NCBI (free software with helpful hints): http://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC=BlastHome

    Primer3 (free software): http://frodo.wi.mit.edu/

    Tutorial on primer design by the Board of Regents of the University of Wisconsin system: http://bioweb.uwlax.edu/GenWeb/Molecular/Seq_Anal/Primer_Design/primer_design.htm

    Resources for PCR

    Mullis KB, Ferre F, Gibbs RA (eds) (1994) PCR: the polymerase chain reaction. Birkhauser, Boston, MA.

    University of Nebraska has a nice animation depicting the PCR process (and many others): http://croptechnology.unl.edu/download.cgi

    Resources for selecting markers

    De Vicente MC and Fulton T (2004) Using molecular marker technology effectively in plant diversity studies. Vol 1. Learning module. CD-ROM. International Plant Genetic Resources Institute, Rome, Italy and Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA.

    Farooq S & Azam F (2002a) Molecular Markers in Plant Breeding I: Concepts and Characterization. Pakistan Journal of Biological Sciences 5 (10): 1135-1140. A good background paper.

    Farooq S & Azam F (2002b) Molecular Markers in Plant Breeding-II. Some Pre-requisites for Use. Pakistan Journal of Biological Sciences 5 (10): 1141-1147. A good overview of markers, including a helpful comparison table.

    Genetic diversity

    Genetic diversity usually refers to the variation, or differences, between organisms at the DNA sequence level. This can be affected by natural or artificial (i.e. human) selection, mutation, recombination and other mechanisms.

    Genetic diversity can be considered in many different ways, as discussed in later slides. This image shows the electrophoretic separation of a number of maize SSR markers all in one gel. Each lane represents a different maize line and the banding patterns identify allelic differences at each of the SSR loci.

    Importance of diversity in plant breeding

    Crop improvement is predicated on identifying new alleles and introgressing them into breeding lines, particularly in those crops that have low levels of genetic diversity. Shuffling alleles by crossing among very genetically similar lines cannot produce continued improvements over the long term.

    Many disease resistance genes now in our current cultivars were introgressed from wild relatives. For example, most cultivated tomato varieties contain nematode resistance genes crossed in from wild relatives of tomato.

    Solanum peruvianum, a wild relative of tomato that has been used as a source of disease resistance genes.

    Importance of diversity for molecular marker work

    In addition to the importance of genetic diversity to crop improvement in general, it is very important to the use of molecular markers: we need to see differences between alleles to be able to map genes or do any studies involving markers.

    This diagram shows the marker profile of 5 plant lines. It is only because there is some difference (polymorphism) between the lines that any analysis based on markers (mapping, diversity studies, etc.) is possible. The differences in band mobility through the gels reflect differences in the DNA sequence of the amplicons produced from the five templates.

    Definition: polymorphism

    • The word polymorphism technically means "presence of many forms"
    • In genetic terms, it refers to the coexistence of two or more alternative phenotypes in a population or among populations. In general, these diverse phenotypes are caused by alternative alleles of one gene/locus
    • At the molecular level, polymorphism refers to the coexistence of alternative banding patterns or DNA variants when revealed by a given detection method, such as that shown on the gel diagram in the previous slide

    Crop Domestication:

    From plants in the wild to our kitchen

    Over time, humans have selected those plants that exhibited traits that are in OUR (humans') interests: larger fruit, more kernels.

    Crop Domestication

    Crop domestication by its nature decreases genetic variation, since it deliberately selects only a small number of plants out of the many present in the wild population (those whose phenotype is considered desirable by the selectors, i.e. humans).

    Traits selected for by humans

    Traits that have been selected for by humans include:

    • Determinate growth habit (flowering occurs at the top of the plant, preventing further growth)
    • Retention of mature seed on the plant (loss of grain shattering)
    • Synchronous ripening, shorter maturity
    • Lower content of bitter tasting and harmful compounds
    • Reduced sprouting (higher seed dormancy)
    • Improved harvest index (the proportion of the plant which is used); larger seed or fruit size
    • Elimination of seeds, such as in banana

    Many of these trait changes reduce the ability of the plant to compete in the wild, and also

    decrease the genetic variability remaining in the crop.

    Consequences of loss of genetic diversity:

    One result of less diversity is that consumers and farmers are now accustomed to, and demand,

    uniformity such as: round red apples, plants all the same height in the field.

    But the loss of genetic diversity can have devastating consequences, such as the Irish potato blight of 1850, the Southern corn leaf blight of 1970, and the current crisis in banana, Black Sigatoka disease, shown above.

    Banana image Copyright 2001 by The American Phytopathological Society, http://www.apsnet.org/education/feature/banana/; apple photo courtesy of New York Apple Association

    Germplasm banks

    Most crops have many accessions stored in genebanks, or germplasm banks, that are available

    free of charge or with a shipping and handling fee, for example, the

    USDA-ARS National Plant Germplasm System (http://www.ars-grin.gov/npgs/).

    The CGIAR system has a number of genebanks around the

    world: http://www.cgiar.org/impact/accessions.htm.

    The International Rice Research Institute (IRRI) genebank in Los Banos, Philippines, stores over 80,000

    accessions of rice.

    Wild relatives are also a good source of diversity

    Many wild relatives of our crops have also been saved in genebanks around the world.

    Alleles which can be transferred from a wild relative to a crop plant not only increase the crop's genetic diversity, but also can lead to an improvement in a valuable trait, even when this could never be predicted from the wild relative's phenotype (Tanksley and McCouch 1997).

    As of 2006, the CGIAR centers (Consultative Group on International Agricultural Research) together curate more than 650,000 accessions of crop, forage and agroforestry species (2006 Bioversity International).

    Phenotype does not equal genotype

    Although a wild relative may appear not to have many desirable characteristics, we now know that there may be many hidden alleles that could affect the trait in the direction that we want. For example, genes for increasing yield can be found in a low-yielding plant (Eshed and Zamir 1995, Mallikarjuna Swamy and Sarla 2008). This is especially true for quantitative traits, which we will discuss later in the module.

    In this example of a genome with 6 chromosomes, there are 2 genes (or QTL) (red) that are associated with high yield, but 4 (blue) that are associated with lower yield, so overall the individual may be a poor yielder. MAB techniques, such as those in the upcoming chapters, can help identify those genes with the positive effects.

    Selecting the germplasm to use

    This is the most important step as it will affect every result of your work from this point forward.

    Available germplasm resources can include cultivated varieties and landraces (traditional varieties), wild species or relatives, commercial cultivars and other breeding lines.

    The potato collection at the IPK genebank, Germany

    Criteria for selecting germplasm

    Improved cultivars (and even landraces) are often associated with low genetic diversity, especially in the self-pollinated species.

    However, increasing diversity by introgression from wild relatives can be complicated by the

    existence of crossing barriers or poor hybrid fertility. Sometimes, these can be overcome by the

    use of techniques such as embryo rescue, etc.

    Criteria for selecting parents for a MAB project may include the need for a particular trait that

    appears in an accession, or just a general need for more diversity.

    Genetic Diversity Assessment

    Before selecting the parents you wish to use for your MAB work, you may need to assess the genetic diversity that is available in your germplasm set (or that which you have acquired from genebanks or other sources), unless this work has been done previously.

    The goal is to select parents that are genetically diverse enough that you can identify differences (polymorphisms) in the progeny. There is no set level of genetic diversity required, and each crop is different.

    Methods of genetic diversity assessment

    We will not go over the many methods of genetic diversity assessment here (see the resources at the end of this section). These can include in-depth calculations of allele frequencies, genetic distance calculations, etc. but can be just a simple measure of what percentage of molecular markers assessed show polymorphism between two parents. This could give you enough information to be able to select parents for a new breeding population.
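
The simple measure mentioned above, the percentage of markers that are polymorphic between two parents, could be computed along these lines. This is only a sketch: the marker names and allele sizes are hypothetical, and each parent is assumed to carry one scored allele per marker.

```python
def percent_polymorphic(parent1, parent2):
    """Percentage of markers scored in both parents whose alleles differ."""
    shared = [marker for marker in parent1 if marker in parent2]
    if not shared:
        raise ValueError("no markers scored in both parents")
    differing = sum(1 for marker in shared if parent1[marker] != parent2[marker])
    return 100 * differing / len(shared)

# Hypothetical SSR allele sizes (bp) for two candidate parents
p1 = {"ssr1": 152, "ssr2": 210, "ssr3": 98, "ssr4": 175}
p2 = {"ssr1": 152, "ssr2": 216, "ssr3": 104, "ssr4": 175}
print(percent_polymorphic(p1, p2))
```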

    Measures of genetic diversity

    In brief, here are just a few of the measures of genetic diversity:

    • Based on the number of variants
        o Polymorphism or rate of polymorphism (Pj)
        o Proportion of polymorphic loci
        o Number of alleles (A) and allelic richness (As)
        o Average number of alleles per locus
    • Based on the frequency of variants
        o Average expected heterozygosity (He; Nei's genetic diversity)
    • The genetic distance between two samples is described as the proportion of genetic elements (alleles, genes, gametes, genotypes) that the two samples do not share

    See de Vicente, Lopez and Fulton 2004 for more details, in particular Chapter 3
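
As an illustration of a frequency-based measure, expected heterozygosity at a locus is commonly computed as He = 1 - Σ p_i², where p_i is the frequency of the i-th allele, and then averaged over loci. The sketch below uses hypothetical allele counts and omits any small-sample correction.

```python
def expected_heterozygosity(allele_counts):
    """He = 1 - sum(p_i^2) for one locus, given a list of allele counts."""
    total = sum(allele_counts)
    return 1 - sum((count / total) ** 2 for count in allele_counts)

# Hypothetical allele counts at three loci
loci = {"locus1": [10, 10], "locus2": [20], "locus3": [5, 5, 10]}
per_locus = {name: expected_heterozygosity(counts) for name, counts in loci.items()}
average_he = sum(per_locus.values()) / len(per_locus)
print(per_locus)
print(round(average_he, 3))
```

A locus with a single allele contributes He = 0, while loci with several alleles at similar frequencies contribute the most.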

    Using markers to assess diversity

    Clearly the use of markers is needed for these measures of genetic diversity. Many different types of markers can be used. The Resources at the end of this chapter include examples and

    comparisons, as well as software programs available for the calculations.

    As with most statistics in MAB, there are no specific cut-offs for what levels of diversity are "good"; this is something you must decide, with your goals and germplasm specifics in mind.

    Results of diversity analyses

    The results of genetic diversity analyses can be a simple measure of genetic distance, or the

    commonly seen phenograms/dendrograms (trees) or cluster diagrams.


    Selecting parents

    In general, one crossing parent in a MAB project is a cultivated variety that needs improvement in one or more traits, and the other parent is selected either because it exhibits some particular desired trait or because it is only distantly related to the first parent (and therefore has the potential to carry new alleles for a number of traits).

    Next, we will look at tips for phenotyping in MAB.

    Software resources for genetic diversity assessment

    Here are just a few of the free software programs available (see de Vicente, Lopez, and Fulton 2004 for a more exhaustive list). Note: you should always get statistical assistance when analyzing your data!

    Arlequin http://lgb.unige.ch/arlequin/

    PowerMarker http://statgen.ncsu.edu/powermarker/

    PHYLIP http://evolution.genetics.washington.edu/phylip.html

    DnaSP http://www.ub.edu/dnasp/

    MEGA http://www.megasoftware.net/

    Structure http://pritch.bsd.uchicago.edu/structure.html


    Ontology

    An ontology is a structured controlled vocabulary, where term definitions are agreed upon and used consistently by a community.

    For example, the grain yield of a wheat crop is commonly expressed as kilograms per hectare of mature grain at 14% moisture content, and grain size as the weight of 1,000 dehusked grains.

    Ontologies Available

    A number of publicly available ontologies have been developed.

    A good example is given at Gramene


    Here you can browse the terms used for a large number of traits, letting you name each trait in the same way as other colleagues within the plant research community.

    Experimental Design: Example

    The tree in the corner of the field would tend to shade the plants growing close to it, which would reduce their performance compared to plants far from the corner. Without measuring plants of the same line in another plot, these plants would be classified as low yielding.

    Likewise, those planted near the edge of the plot may grow better because their roots have more space to exploit. Without randomisation, they would be planted on the field's edge in every replicate, and would therefore appear to have better yield. That apparent advantage is an accident of where they were planted, and has nothing to do with their genetic make-up.
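One common way to avoid the position effects described above is to shuffle the planting order of the lines independently in each replicate. The sketch below shows that idea only; the line names, the number of replicates and the function name `randomized_blocks` are hypothetical, and a real trial design should be agreed with a statistician.

```python
import random


def randomized_blocks(lines, n_replicates, seed=None):
    """Return one independently shuffled planting order per replicate."""
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    layout = []
    for _ in range(n_replicates):
        order = list(lines)    # copy so the input list is left untouched
        rng.shuffle(order)
        layout.append(order)
    return layout


# Hypothetical breeding lines and three replicates
lines = ["L01", "L02", "L03", "L04", "L05", "L06"]
layout = randomized_blocks(lines, n_replicates=3, seed=42)
for rep, order in enumerate(layout, start=1):
    print(f"Replicate {rep}: {' '.join(order)}")
```

Every line appears exactly once in each replicate, but a given line is no longer tied to the same position (for example the field edge) in every replicate.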

    Kinds of data: Categorical

    Categorical data are, as you might guess, data that fall into categories, or classes. There are 2 main types of categorical data.

    1.) Nominal data have no natural order or relationship. In the fruit shape example below, a score of 2 does not imply "more than" or "better than" a score of 1. The classes are just different.

    To find a marker correlated with this type of data, you need only a test of independence (e.g. chi-square).
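A chi-square test of independence between marker genotype and a nominal trait can be sketched as follows. The counts are invented for illustration, and the code only computes the test statistic and degrees of freedom; the shortcut used for the p-value is valid only when there are exactly 2 degrees of freedom, as in this 3 x 2 table.

```python
import math


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    table is a list of rows of observed counts; every row and column
    total is assumed to be greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts. Rows: marker genotype 1, 2, 3.
# Columns: fruit shape class (round, elongated).
table = [[18, 4], [30, 16], [6, 18]]
stat, df = chi_square_independence(table)

# For exactly 2 degrees of freedom the chi-square p-value is exp(-stat / 2);
# for other df, use a statistics package or a chi-square table.
p_value = math.exp(-stat / 2) if df == 2 else None
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value}")
```

A statistic above the tabulated critical value (5.99 for 2 degrees of freedom at the 5% level) suggests that shape class is not independent of the marker genotype.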


    Categorical data, contd.

    2.) Another type of categorical data is ordinal; that is, there is some natural order (the scores are not unrelated).

    For example, if you score fruit size on a scale of 1-5, where 1 = smallest and 5 = biggest, a score of 2 means the fruit are larger than fruit scored 1. There is a relationship between the scores, a natural order.

    For this kind of data you need an association test (e.g. Kendall's tau statistic; Kendall 1938).
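As an illustration of an association measure for ordinal scores, the sketch below computes Kendall's tau from concordant and discordant pairs. It uses the tau-b variant, which adjusts for ties (marker genotypes are heavily tied); this choice is an assumption and is not necessarily the form intended in the text. The scores are invented, and only the statistic is computed, not its significance.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length lists of ordinal scores."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue              # tied on both variables: ignored
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1       # pair ordered the same way in x and y
            else:
                discordant += 1       # pair ordered in opposite ways
    untied = concordant + discordant
    denominator = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        return float("nan")           # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Hypothetical data: marker genotype (1, 2, 3) and fruit size score (1-5)
genotype = [1, 1, 2, 2, 3, 3]
fruit_size = [1, 2, 2, 3, 4, 5]
print(kendall_tau_b(genotype, fruit_size))
```

Values near +1 or -1 indicate a strong monotonic association between genotype and score; values near 0 indicate none.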

    Continuous data

    Continuous data do not fall into discrete categories. Instead they produce a continuous distribution. Some of these distributions can be modeled algebraically. The most common model is the Gaussian (or normal) distribution; the Poisson distribution is also widely used, although strictly speaking it describes counts.

    Examples of continuous traits include yield, size, nutrients, etc. Certainly you could score these traits on a scale, but if you measure the exact quantities, they will not fall into clear classes.

    A perfect normal distribution (top) as compared to a histogram of yield data from a QTL experiment (bottom). Data points fall into a continuous range, not discrete classes.

    Identifying correlations for this type of data requires more complex calculations such as regressions.


    Top image from http://mathworld.wolfram.com/PoissonDistribution.html, lower image from T. Fulton using QGene software (http://www.qgene.org).
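For a continuous trait, the regression approach mentioned above can be sketched as a simple least-squares fit of the trait value on the marker score. This is a minimal illustration that assumes an additive model with genotypes coded 1, 2 and 3; the data are invented, and dedicated QTL software does considerably more than this.

```python
def linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x; returns slope, intercept, R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # share of trait variance explained
    return slope, intercept, r_squared


# Hypothetical marker scores (1, 2, 3) and yield per plant (kg)
marker = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3]
yield_kg = [2.1, 2.4, 1.9, 2.8, 3.0, 2.6, 2.9, 3.5, 3.3, 3.8]
slope, intercept, r2 = linear_regression(marker, yield_kg)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r2:.2f}")
```

The slope estimates the change in the trait per step in marker score, and R^2 gives the proportion of trait variation associated with the marker.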

    Data checking functions

    There are a number of things you should do to check your data for clear errors. One is looking at the minimum and maximum data points (either by eye, by sorting the data, or using a function in a program like Excel). If these are very different from what you expected, or there is a data point that is greatly different from the rest of the range, you should double-check in case this is an error (typographical, measuring error, etc.).

    Trait histograms like the examples below are a good quick way to look for outliers. In the first histogram, note that there is one, and only one, plant that scores 8.5 for brix (a measure of soluble solids) (red arrow). This is highly unusual and could be an error. It could be better to remove this from the data set, or at least keep it in mind when you analyse results.
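The checks described above (minimum, maximum and unusual values) can also be automated. The sketch below flags values lying more than 1.5 times the interquartile range beyond the quartiles; that rule is a common convention chosen here as an assumption, not something prescribed by the text, and the brix values are invented. A flagged value is a prompt to re-check the record, not proof of an error.

```python
import statistics


def check_trait(values, k=1.5):
    """Return min, max and (index, value) pairs flagged by the k * IQR rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [(i, v) for i, v in enumerate(values) if v < low or v > high]
    return {"min": min(values), "max": max(values), "outliers": outliers}


# Hypothetical brix measurements for ten plants
brix = [4.8, 5.1, 5.0, 5.3, 4.9, 5.2, 5.4, 5.0, 8.5, 5.1]
report = check_trait(brix)
print(report)
```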

    What are genetic maps?

    A genetic map is a representation of the position of the genes and/or markers along a chromosome, as determined by linkage analysis. To understand this we first need to understand the concepts of inheritance, independent assortment and recombination.


    Using markers to assess recombination

    Recall that a segregating population is required for our linkage analysis, so that we can observe which parental gamete combinations have been inherited in the offspring.

    Another requirement is markers that can identify differences (polymorphisms) between the parental alleles in our populations. We may need to try a large number of markers to identify those that are polymorphic between our 2 parents. This is typically called a parental survey. Markers that are polymorphic then need to be retested on each line in our mapping population. Let's look at a real-life example.

    Using markers to assess recombination

    This figure shows the profile of parents 1 and 2 assayed with CAPS marker X. The first lane is a size standard. There is a clear polymorphism between the two parents. If both parents showed the same band (i.e. they were monomorphic for this marker), this would not be a useful marker.


    Note: CAPS markers identify polymorphisms by digesting the PCR products with a restriction enzyme. The parent 2 profile consists of two restriction fragments, but the parent 1 amplified segment contains no restriction site, and thus is not cut by this particular enzyme.

    Marker example, contd.

    This figure shows the same marker assayed on 27 F2 progeny. Each carries either one of the two parental alleles, or has inherited one allele from each parent.

    Remember, each plant does have 2 alleles. But when they are both the same size, as in a homozygote, they run to the same position on the gel and so appear as one band.

    Conventionally a "1" is assigned to one parental type, a "3" is assigned to the other parental type (typically the wild parent if there was one) and a "2" to the heterozygotes. Some software programs use A, B, and H instead.


    Marker scoring for mapping

    Therefore the data for the marker in this particular case would look like this:

    322221232233232331132312121
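As a small illustration, the score string above can be tallied and converted to the letter coding used by some programs. The mapping 1 = A, 2 = H, 3 = B is an assumption made for this sketch; check the documentation of your mapping software for the coding it actually expects.

```python
from collections import Counter

# Scores for the 27 F2 progeny, copied from the text above
scores = "322221232233232331132312121"

# Assumed letter coding: 1 -> A, 3 -> B, 2 -> H (heterozygote)
TO_LETTERS = {"1": "A", "2": "H", "3": "B"}

counts = Counter(scores)
letters = "".join(TO_LETTERS[s] for s in scores)

print(len(scores), "plants scored")
print({genotype: counts[genotype] for genotype in "123"})
print(letters)
```

Counting the classes is also a quick sanity check that the number of scores matches the number of plants on the gel.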

    Of course, to be able to estimate linkage between each pair of markers and eventually to generate a genetic map, you need to analyse many markers. Let's look at an example.

    Mapping data

    The partial data file below shows allele calls for 83 F2 progeny at five loci.

    It is important that the scores for each plant are in exactly the same order for each marker so that recombinations can be identified.

    Note the presence of scores that are 4 or 5. What are these? Some mapping programs allow for ambiguity. For example, there may be a band pattern on your gel that you are sure is not a "3", but you can't decide whether it is a "1" or a "2". In this situation you might score this as a "5" (and a pattern that is either a "2" or a "3" might be scored as "4"). This is highly dependent on the software program you use. In this case missing data is scored as 0.

    *TG230
    23320221253323232232233253232322232222213242242225301233322223222232223123322222223

    *CT276
    21120133211322021332231112121332221235240122223213212211313222122321322013321113221

    *TG23
    22320212121312233111222212222231222122323323132322223322201321312222122223232122222

    *CT50
    21220333212323321122222322322222132111315221323212123332132330333133323322222322200

    *TG360
    31320222322133322223231315532221222223232121222223322121321222231222213323222322223
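A data file in this style can be read and sanity-checked with a short script before it goes into a mapping program. The sketch below assumes a layout in which each marker name is on its own line starting with `*` and is followed by one or more lines of scores; the sample data and marker names are invented, and real mapping programs each define their own file format.

```python
def parse_scores(text):
    """Return {marker name: score string} from '*NAME' lines followed by scores."""
    markers = {}
    name = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("*"):
            name = line[1:]
            markers[name] = ""
        elif name is None:
            raise ValueError("score line found before any marker name")
        else:
            markers[name] += line  # scores may wrap over several lines
    return markers


def check_scores(markers, allowed="012345"):
    """Check every marker has the same number of scores and only allowed codes."""
    lengths = {name: len(s) for name, s in markers.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"markers differ in number of scores: {lengths}")
    for name, s in markers.items():
        unexpected = set(s) - set(allowed)
        if unexpected:
            raise ValueError(f"{name}: unexpected codes {sorted(unexpected)}")
    return next(iter(lengths.values()))


# Hypothetical miniature data file (10 plants, 2 markers)
sample = """
*M1
2332022125
*M2
21120133
21
"""
markers = parse_scores(sample)
n_plants = check_scores(markers)
print(n_plants, "plants,", len(markers), "markers")
```

A length mismatch between markers usually means a score was dropped or duplicated, which would shift every later plant out of order.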

    LOD scores


    The LOD (logarithm of odds) score is the statistical test most used in genetic linkage calculations (indicating the likelihood of linkage vs. non-linkage).

    Specifically:

    If two markers are unlinked, then the odds ($O_U$) of getting exactly $x$ recombinant progeny and $y$ parental-type progeny are $(0.5)^x (0.5)^y$.

    If the markers are linked, with a recombination frequency of $r$, then the odds ($O_L$) are given by $r^x (1-r)^y$.

    The odds ratio is $O_L/O_U$ and the LOD score is $\log_{10}(O_L/O_U)$.
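The LOD formula above translates directly into code. The sketch below works with logarithms to avoid very small numbers; the function name and the example counts are hypothetical, and it assumes, as the formula does, that recombinant and parental-type progeny can be counted directly.

```python
import math


def lod_score(x, y, r):
    """LOD score for x recombinant and y parental-type progeny.

    r is the recombination frequency assumed under linkage (0 < r <= 0.5).
    """
    if not 0 < r <= 0.5:
        raise ValueError("r must be greater than 0 and at most 0.5")
    log_linked = x * math.log10(r) + y * math.log10(1 - r)  # log10 of O_L
    log_unlinked = (x + y) * math.log10(0.5)                # log10 of O_U
    return log_linked - log_unlinked


# Hypothetical example: 2 recombinants among 20 progeny, so r = 2 / 20
x, y = 2, 18
r = x / (x + y)
lod = lod_score(x, y, r)
print(f"LOD = {lod:.2f}")
```

With r = 0.5 the two odds are identical, so the LOD score is 0, which is a convenient check on the calculation.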

    Use of LOD scores

    By convention, a LOD score of 3 or higher is accepted as evidence of linkage.

    A score of 3.0 means that the observed data are 1,000 times more likely if the two loci are linked than if they are not linked (a score of 2 means 100 times more likely, and so on).

    Available software programs

    iMAS (Integrated Marker Assisted Selection) http://www.icrisat.org/gt-bt/Imas.htm

    *Mapmaker http://www.broad.mit.edu/ftp/distribution/software/mapmaker3/

    MapDisto http://mapdisto.free.fr/

    **JoinMap http://www.kyazma.nl/index.php/mc.JoinMap/

    *Mapmaker is no longer being updated for new operating systems, but is still widely in use

    **Not a free program

    Software available

    QGene (http://www.qgene.org/)

    R/QTL (http://www.rqtl.org)


    QTL Cartographer (http://statgen.ncsu.edu/qtlcart/index.php)

    TetraploidMap (http://www.bioss.ac.uk/knowledge/tetraploidmap/)

    Steps in MAB

    In summary, the following steps are a general simplified outline of how MAB might be applied in a breeding program. Keep in mind that these must be adapted to the specific crop and goals of each project.

    • Assess genetic diversity of germplasm collection with markers
    • Select parents
    • Make crosses, generate segregating populations
    • Apply markers to generate a genetic map
    • Phenotype a segregating population for traits of interest
    • Apply markers to use with phenotype data for QTL discovery/marker-trait correlations (may be done simultaneously with the previous 2 steps)
    • Select key marker-trait correlations of interest
    • Select plants from the segregating population that contain desired alleles for the traits of interest
    • Make new crosses (either to introgress new desirable alleles into commercial lines, make hybrids with other desired properties, get rid of linkage drag, etc.)
    • Continue until new lines are ready for commercial production