
Bioinformatics Exercise List



    Bioinformatics Exercises
    Prof. Maria Lucila Hernandez Macedo and Prof. Leandro E. C. Diniz

    Exercise 1

    Analyze the sequence below and answer the questions that follow:

    1 acatccgcgg caacgcctcc ttggtgtcgt ccgcttccaa taacccagct tgcgtcctgc

    61 acacttgtgg cttccgtgca cacattaaca actcatggtt ctagctccca gtcgccaagc

    121 gttgccaagg cgttgagaga tcatctggga agtcttttac ccagaattgc tttgattcag

    181 gccagctggt ttttcctgcg gtgattcgga aattcgcgaa ttcctctggt cctcatccag

    241 gtgcgcggga agcaggtgcc caggagagag gggataatga agattccatg ctgatgatcc

    301 caaagattga acctgcagac caagcgcaaa gtagaaactg aaagtacact gctggcggat

    361 cctacggaag ttatggaaaa ggcaaagcgc agagccacgc cgtagtgtgt gccgcccccc

    421 ttgggatgga tgaaactgca gtcgcggcgt gggtaagagg aaccagctgc agagatcacc

    481 ctgcccaaca cagactcggc aactccgcgg aagaccaggg tcctgggagt gactatgggc

    541 ggtgagagct tgctcctgct ccagttgcgg tcatcatgac tacgcccgcc tcccgcagac

    601 catgttccat gtttctttta ggtatatctt tggacttcct cccctgatcc ttgttctgtt

    661 gccagtagca tcatctgatt gtgatattga aggtaaagat ggcaaacaat atgagagtgt

    721 tctaatggtc agcatcgatc aattattgga cagcatgaaa gaaattggta gcaattgcct

    781 gaataatgaa tttaactttt ttaaaagaca tatctgtgat gctaataagg aaggtatgtt

    841 tttattccgt gctgctcgca agttgaggca atttcttaaa atgaatagca ctggtgattt

    901 tgatctccac ttattaaaag tttcagaagg cacaacaata ctgttgaact gcactggcca

    961 ggttaaagga agaaaaccag ctgccctggg tgaagcccaa ccaacaaaga gtttggaaga

    1021 aaataaatct ttaaaggaac agaaaaaact gaatgacttg tgtttcctaa agagactatt

    1081 acaagagata aaaacttgtt ggaataaaat tttgatgggc actaaagaac actgaaaaat

    1141 atggagtggc aatatagaaa cacgaacttt agctgcatcc tccaagaatc tatctgctta

    1201 tgcagttttt cagagtggaa tgcttcctag aagttactga atgcaccatg gtcaaaacgg

    1261 attagggcat ttgagaaatg catattgtat tactagaaga tgaatacaaa caatggaaac

    1321 tgaatgctcc agtcaacaaa ctatttctta tatatgtgaa catttatcaa tcagtataat

    1381 tctgtactga tttttgtaag acaatccatg taaggtatca gttgcaataa tacttctcaa

    1441 acctgtttaa atatttcaag acattaaatc tatgaagtat ataatggttt caaagattca

    1501 aaattgacat tgctttactg tcaaaataat tttatggctc actatgaatc tattatactg

    1561 tattaagagt gaaaattgtc ttcttctgtg ctggagatgt tttagagtta acaatgatat

    1621 atggataatg ccggtgagaa taagagagtc ataaacctta agtaagcaac agcataacaa

    1681 ggtccaagat acctaaaaga gatttcaaga gatttaatta atcatgaatg tgtaacacag

    1741 tgccttcaat aaatggtata gcaaatgttt tgacatgaaa aaaggacaat ttcaaaaaaa

    1801 taaaataaaa taaaaataaa ttcacctagt ctaaggatgc taaaccttag tactgagtta

    1861 cattgtcatt tatatagatt ataacttgtc taaataagtt tgcaatttgg gagatatatt

    1921 tttaagataa taatatatgt ttacctttta attaatgaaa tatctgtatt taattttgac

    1981 actatatctg tatataaaat attttcatac agcattacaa attgcttact ttggaataca

    2041 tttctccttt gataaaataa atgagctatg tattaacaaa aaaaaaaaaa aaaaaaaaaa

    2101 aaaaaaaaaa aaaaaa

    What to do

    You have been given the sequence, written 5′→3′, of an unknown cDNA (a DNA copy of a messenger RNA). The questions are:

    1. Which gene does the sequence correspond to?
    2. Which species is it from?
    3. Which 12 nucleotides mark the start of translation?
    4. What amino acid sequence does the RNA encode? (Present it as in the example below.)
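The listing above carries position numbers and spaces, which some tools reject when the text is pasted in. The sketch below is one way to turn such a listing into a plain sequence and a FASTA record before using it. The function names and the record name `unknown_cDNA` are illustrative choices, not part of the handout, and only the first two lines of the listing are used here to keep the example short.

```python
import re

def clean_sequence(raw):
    """Drop position numbers, spaces and line breaks; return uppercase letters only."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()

def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"

# First two lines of the Exercise 1 listing (the rest is handled the same way).
listing = """
    1 acatccgcgg caacgcctcc ttggtgtcgt ccgcttccaa taacccagct tgcgtcctgc
   61 acacttgtgg cttccgtgca cacattaaca actcatggtt ctagctccca gtcgccaagc
"""

seq = clean_sequence(listing)
print(len(seq))
print(to_fasta("unknown_cDNA", seq))
```

Most web forms, including BLAST, also accept the raw numbered listing, so this step matters mainly when the sequence is handed to a script.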


    Hints:

    1. Run a BLAST search with the sequence at: http://www.ncbi.nlm.nih.gov/BLAST/

    Choose: Nucleotide-nucleotide BLAST (blastn)

    Paste the sequence into the top box.

    Click the on-screen button:

    On the next page, click the button and wait (do not click repeatedly, as that only slows the search further)...

    The results list the database sequences most similar to your query, best match first. For each hit you will see its name, the species, and an E value. The E value is the number of hits with an equal or better score expected by chance, so it falls as the match improves: the lower the E value, the more significant the similarity (long identical sequences give an E value reported as ZERO).
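To make the inverse relationship concrete, the sketch below uses the standard approximation E = m × n × 2^(−S′), where S′ is the bit score, m the query length and n the database length. The database size is a made-up figure for illustration, and BLAST itself uses effective lengths, so real reports will show different values.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)

QUERY_LEN = 2116      # length implied by the position numbers of the Exercise 1 listing
DB_LEN = 10 ** 11     # hypothetical database size in bases (assumption)

for bits in (30, 50, 100, 200):
    print(bits, expect_value(bits, QUERY_LEN, DB_LEN))
```

Each extra bit of score halves the E value, which is why a long, near-identical hit ends up with a value small enough to be displayed as zero.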

    2. To obtain the amino acid sequence, go to: http://us.expasy.org/tools/dna.html

    The program analyzes every possible ORF (open reading frame). To explain: the ribosome reads the RNA three letters at a time, and reading can start at the first, second, or third letter of the sequence. The amino acid sequence obtained differs depending on the starting position. Watch for the Met (methionine) and Stop (translation termination codon) markers. If the sequence you are analyzing is the complete cDNA, end to end, the protein sequence always begins with a methionine and ends at a Stop. The longer the run of amino acids from a Met to a Stop, the more likely it is the correct ORF (the polypeptide, or protein, encoded by the sequence in question).

    A further note: in some cases it is not known whether the cDNA sequence is the + strand or the - strand, so the program reports translations in both directions of the given sequence.
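The reading-frame idea described above can be sketched in a few lines. This is a minimal illustration, not what the ExPASy tool runs: it assumes the standard genetic code, reads the three frames of each strand, and keeps the longest stretch that starts at a Met and ends at a Stop. All function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def longest_orf(seq):
    """Return (strand, frame, start, protein) for the longest Met...Stop ORF.

    frame is 1, 2 or 3; start is the 0-based index of the ATG on the strand read.
    ORFs that run off the end without a Stop are ignored.
    """
    seq = seq.upper()
    best = ("+", 0, 0, "")
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start, protein = None, []
            for i in range(frame, len(s) - 2, 3):
                aa = CODON_TABLE.get(s[i:i + 3], "X")  # X for ambiguous codons
                if start is None:
                    if aa == "M":
                        start, protein = i, ["M"]
                elif aa == "*":
                    if len(protein) > len(best[3]):
                        best = (strand, frame + 1, start, "".join(protein))
                    start = None
                else:
                    protein.append(aa)
    return best

toy = "CCATGGCCAAATGATT"
strand, frame, start, protein = longest_orf(toy)
print(strand, frame, protein, toy[start:start + 12])
```

For a sequence read on the + strand, `seq[start:start + 12]` gives the 12 nucleotides at the start of translation for the chosen ORF. The longest ORF is only a heuristic, so it is worth checking the result against the BLAST annotation.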

    Another way to see which ORF is the correct one is available at the site below (the ORF is the green box; click it to see the sequence):

    http://www.ncbi.nlm.nih.gov/gorf/gorf.html

    There is also a web page with a very interesting exercise in which you carry out transcription and then translation. Try it!

    http://learn.genetics.utah.edu/content/begin/dna/transcribe/


    Exercise 2

    The sequence below was obtained by sequencing DNA from a microorganism isolated from a soil sample. Identify the organism the sequence belongs to. If the sequence is coding, what type of protein would it be related to?

    ATGGCCATACTTTGCAGTACTGCATTGGCTCTGGGCGCATGCGGAAGTATGGGGAAAGCGGGCGGCCCGACAATGCGTTGTCTATAGAACAAACAACGCAGATAGACAGCGCAGACGGGATTGATGCCTCCAAACTGCTCTTTTCCTC

    TTCCCAAGCCGTTGTTATCGCCGGTGATTCTGTGGGGCAGAAGTGGGAGGGCGCGAAAGCAGCGGTGAAGCGGGGCGCGCCGCTGCTGGTGCGCACTGCCGATAACGCGTCGGCCATTGATTCGGAGATAAAGCGCCTCGGGGCTCA

    AGACGTTATTAAGATTGACGAGCCTCAGGCCCCGGACCCGGAAATTTCCGAGGCAACATTGCCGATAAAATCTC

    GCAGCTCACGCCCGAATCCCCGCTTTTTAACGGCGGCGCGTCCATCCTGGTCTCCGGGCACACCACGGCCGCTGATGTAGCCACCGCACGCGCGTCGGGGGCCAATGTGGAGTACCTGTCTTCGGGCGATGCGCGTGAAAGCTCTGCG

    CTATCCGCTGATCCCGACGCTCATGTGGTTGCCCTGGGTCCAAGTTTTGCCAACAAAGAACGCTTTAATCGCCAG

    GTAGAGATGATTAGCCATGGTGAGGTCCCCGGTGGCGGGCATCTCATTTTCCCCTCGCATCGCGTGGTAGCTCTCTACGGTCATCCTTCCGGCGGGGCGCTGGGAGTGCTTGGCGAGCAACCTGCTGAGGAAGCCGTAAACAGGGTGA

    ATGATTTAGTGGGTAAGTATCAGGCCATTGCACCGGAAGAGAGCATGATCCCCGCCTTTGAGATCATTGCTACC

    GTGGCGAGCTCGTCAGCAGGGCCGGATGGCAATTATTCCAATGAGGGGAACGTTGATGAGCTGCGCCCGTGGGT

    TGAAGCGATTGGTGATGCTGGAGGCATAGCGATTCTTGATCTACAACCTGGCAGCGCAAGCTTCCTTGAACAGG

    CACAACAATTTGAGGAATTGCTGAAACTACCGCACGTCGGACTGGCGATAGATCCCGAGTGGCGGCTTAAGCCG

    GGGGAGAAACCCATGGAGAGGGTCGGCAGTGTTGGGGCGGGGGAAGTGAACCAGACTGCTGCGTGGCTGCGGGACCTGGTAAAAGATAACGAGCTCCCGCAGAAAGTCTTTGTTGTGCACCAATTTCAGCATCAGATGGTGCAGAAC

    AGGGAAACCTTGGACACCACGGCACCGGAACTTTCGTGGGTTCTTCACGCAGATGGCCACGGAACCGCGGGCG

    ATAAGTTTGCCACGTGGGATATGGTGCGGAAGAATCTGCAGCCCGAGTTCTACCTTGCGTGGAAGAACTTTATCGATGAGGATCAGCCGATGTTCACCCCCGAGCAGACGTTTAAGATCGAGCCTCGGCCTTGGTTTGTGTCCTATCA

    ATAA

    Exercise 3:

    Perform an in silico digestion of the fragment amplified by the primers Eub-8f and 1492r from the genome of the organism identified in the exercise above, using the restriction enzymes HaeIII, HhaI, MseI, MspI, and RsaI.
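An in silico digestion amounts to finding each recognition site in the amplicon and splitting the sequence at the cut position. The sketch below does this for a linear fragment with the five enzymes named above. The recognition sites and cut positions are the commonly documented ones (GG^CC, GCG^C, T^TAA, C^CGG, GT^AC), and the `amplicon` string is a made-up toy, not the fragment the exercise asks for (primers of this kind are typically used on bacterial 16S rRNA genes, but that fragment has to be retrieved by the reader).

```python
import re

# Recognition site and cut offset on the top strand (0-based within the site).
ENZYMES = {
    "HaeIII": ("GGCC", 2),  # GG^CC
    "HhaI":   ("GCGC", 3),  # GCG^C
    "MseI":   ("TTAA", 1),  # T^TAA
    "MspI":   ("CCGG", 1),  # C^CGG
    "RsaI":   ("GTAC", 2),  # GT^AC
}

def cut_positions(seq, site, offset):
    """Indices where the top strand is cut; the lookahead also finds overlapping sites."""
    return [m.start() + offset for m in re.finditer("(?=" + site + ")", seq)]

def digest(seq, enzyme):
    """Return the fragment lengths, 5' to 3', of a linear sequence cut by one enzyme."""
    seq = seq.upper()
    site, offset = ENZYMES[enzyme]
    cuts = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]

amplicon = "AAGGCCTTTTAACCGGGTACGCGCAA"  # toy sequence for illustration only
for name in ENZYMES:
    print(name, digest(amplicon, name))
```

All five sites are palindromic, so scanning one strand is enough. Fragment lengths here are measured on the top strand.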

    Exercise 4:

    Which organisms do the sequences below correspond to?

    If each corresponded to a gene, what would be the function and structure of the protein it encodes?

    Sequence 1

    CGAAAAATAAGCCATAGTCGGCACCATAAGCATAACCTAGCTCTGCGATTATCTCTAACATAATTAACTT

    AAGCAGCCGTATTTATAAAGAAATTTCCAAAATAAAGCGAATATTCTAGAATCCCAAAACAAACTGGTTG

    TTGCGGTAGGTCATTTGTTTGGCAGAAAGAAAACTCGAGAAATTTCTCTGGCCGTTATTCTCTATTCGTT

    TTGTGACTCTCCCTCTTTGTACTATTGCTCTCTCACTCTGTCACACAGTAAACGGCGCACTGTTCTCGTT

    GCTTCGAGAGAGCGCGCCTCGAATGTTCGCGAAAAGAGCGCCGGAGTATAAATAGAGGAGCTTCGTCGAC

    GGAGAGTCAATTCTATTCAAACAAGCAAAGTGAACACATCGCTAAGCGAAAGCTAAGCAAACAAACAAGC

    GCAGCTGAACAAGCTAAACAATCTGCAATAAAGTGCAAGTTAAAGTGAATCAATTAAAAGTAACCAACAA

    CCAAGTAATTAAACTAAAAACTGCAACTACTGAAATCAACCAAGAAGTAATTATTGAAGACAAGAAGAGA

    ACTCTGAATACTTTCAACAAGTCGTTACCGAGGAAGAAGAACTCACACACAATGCCTGCTATTGGAATCG

    ATCTGGGCACCACCTACTCCTGCGTGGGTGTCTACCAACATGGCAAGGTGGAGATTATCGCCAACGACCA

    GGGCAACCGCACCACGCCGTCCTACGTGGCTTTCACAGATTCGGAACGCCTCATCGGCGATCCGGCTAAG

    AACCAGGTGGCCATGAACCCCAGAAACACAGTGTTTGACGCCAAGCGACTGATCGGCCGAAAATACGACG

    ACCCCAAGATCGCAGAGGACATGAAGCACTGGCCTTTCAAGGTTGTAAGCGACGGCGGAAAGCCCAAGAT

    CGGGGTGGAGTATAAGGGTGAGTCCAAGAGATTTGCCCCCGAGGAGATCAGCTCGATGGTACTGACCAAG


    ATGAAGGAGACGGCGGAGGCATATCTGGGCGAGAGCATCACAGACGCAGTCATCACAGTTCCAGCCTACT

    TCAACGACTCCCAGCGCCAGGCTACCAAAGACGCCGGTCACATCGCCGGCCTGAATGTGCTCCGCATCAT

    CAATGAGCCCACGGCGGCAGCACTGGCCTACGGACTGGACAAGAACCTCAAGGGTGAGCGCAATGTGCTT

    ATCTTCGACTTGGGCGGCGGCACCTTCGATGTCTCCATCCTGACCATCGACGAGGGATCACTGTTCGAGG

    TGCGCTCCACCGCCGGAGACACACACTTGGGCGGCGAGGACTTTGACAACCGGCTAGTCACTCATCTGGC

    GGACGAGTTCAAGCGCAAGTACAAGAAGGATCTGCGCTCCAACCCTCGCGCCCTACGACGCCTCAGAACA

    GCAGCTGAACGGGCCAAGCGCACACTCTCCTCCAGCACGGAGGCCACCATCGAGATTGACGCACTGTTTG

    AGGGCCAAGACTTCTACACCAAAGTGAGCCGCGCCAGGTTTGAGGAGCTGTGCGCGGACCTCTTCCGCAA

    CACCCTGCAGCCTGTGGAGAAGGCCCTCAACGATGCCAAGATGGATAAGGGTCAGATCCACGACATCGTG

    CTCGTCGGCGGATCCACTCGCATTCCCAAGGTGCAAAGTCTGCTGCAGGACTTCTTCCACGGCAAGAACC

    TCAACCTATCCATCAACCCAGACGAGGCAGTTGCATACGGAGCTGCTGTGCAGGCCGCTATCCTCAGCGG

    AGACCAGAGCGGCAAGATCCAGGACGTGCTGCTGGTGGACGTGGCCCCACTTTCATTGGGAATTGAGACC

    GCTGGAGGTGTAATGACCAAGCTGATCGAGCGCAACTGCCGCATTCCGTGCAAGCAGACTAAGACGTTCT

    CCACATACGCGGACAACCAGCCCGGAGTCTCCATTCAGGTGTATGAGGGCGAACGTGCGATGACGAAGGA

    CAACAATGCATTGGGCACCTTCGATCTGTCCGGCATTCCACCTGCACCAAGGGGTGTGCCCCAGATAGAA

    GTTACCTTCGACTTGGACGCCAATGGAATCCTGAACGTCAGCGCCAAGGAGATGAGCACGGGCAAGGCCA

    AGAACATCACGATCAAGAACGACAAGGGACGGCTCTCGCAGGCCGAGATTGATCGCATGGTGAACGAGGC

    TGAAAAGTACGCCGACGAGGACGAGAAGCATCGCCAGCGAATAACCTCTAGAAATGCCCTGGAGAGCTAC

    GTCTTCAATGTGAAGCAGGCCGTGGAACAGGCACCTGCTGGCAAATTGGACGAGGCTGACAAGAACTCCG

    TCTTGGACAAGTGCAACGACACTATCCGGTGGCTGGACAGCAACACCACTGCCGAGAAGGAGGAGTTCGA

    CCACAAGCTGGAGGAGCTCACCCGCCACTGCTCCCCCATCATGACCAAGATGCATCAGCAGGGTGCGGGA

    GCTGGAGCTGGTGGTCCGGGAGCAAACTGCGGCCAGCAGGCGGGAGGATTTGGAGGCTACTCTGGACCCA

    CGGTCGAGGAGGTCGACTAAGGCCAAAGAGTCTAATTTTTGTTCATCAATGGGTTATAACATATGGGTTA

    TATTATAAGTTTGTTTTAAGTTTTTGAGACTGATAAGAATGTTTCGATCGAATATTCCATAGAACAACAA

    TAGTATTACCTAATTACCAAGTCTTAATTTAGCAAAAATGTTATTGCTTATAGAAAAAATAAATTATTTA

    TTTGAAATTTAAAGTCAACTTGTCATTTAATGTTTTGTAGACTTTTGAAAGTCTTACGATACAATTAGTA

    TCTAATATACATGGGTTCATTCTACATTCTATATTAGTGATGATTTCTTTAGCTAGTAATACATTTTAAT

    TATATTCGGCTTTGATGATTTTCTGATTTTTTCCGAACGGATTTTCGTAGACCCTTTCGATCTCATAATG

    GCTCATTTTATTGCGATGGACGGTCAGGAGAGCTCCACTTTTGAATTTCTGTTCGCAGACACCGCATTTG

    TAGCACATAGCCGGGACATCCGGTTTGGGGAGATTTTCCAGTCTCTGTTGCAATTGGTTTTCGGGAATGC

    GTTGCAG

    Sequence 2

    GGTTCCAATCCTGCCTCTGCCACTTCTCAGTTGTATGCCCCAACCCAACCTGTCTGGCTCTGTCCTCCTT

    AACAGAAGGACGGCCCTGGCCACGGGCCACAGCCAGCAACGCTTAAGCACCAGGGCCGGCGAGTGCCCTG

    CCGTGGCACGGCTCCAGCGTCGCGCTCTCGAATTCATTTGCTTTCCTTAACGAGAGAAGGTTCCAGATGA

    GGGCTGAACCCTCTTCGCCCCGCCCACGGCCCCTGAACGCTGGGGGAGGAGTGCATGGGGAGGGGCGGCC

    CTCAAACGGGTCATTGCCATTAATAGAGACCTCAAACACCGCCTGCTAAAAATACCCGACTGGAGGAGCA

    TAAAAGCGCAGCCGAGCCCAGCGCCCCGCACTTTTCTGAGCAGACGTCCAGAGCAGAGTCAGCCAGCATG

    ACCGAGCGCCGCGTCCCCTTCTCGCTCCTGCGGGGCCCCAGCTGGGACCCCTTCCGCGACTGGTACCCGC

    ATAGCCGCCTCTTCGACCAGGCCTTCGGGCTGCCCCGGCTGCCGGAGGAGTGGTCGCAGTGGTTAGGCGG

    CAGCAGCTGGCCAGGCTACGTGCGCCCCCTGCCCCCCGCCGCCATCGAGAGCCCCGCAGTGGCCGCGCCC

    GCCTACAGCCGCGCGCTCAGCCGGCAACTCAGCAGCGGGGTCTCGGAGATCCGGCACACTGCGGACCGCT

    GGCGCGTGTCCCTGGATGTCAACCACTTCGCCCCGGACGAGCTGACGGTCAAGACCAAGGATGGCGTGGT

    GGAGATCACCGGTGAGCCCCCCTGCTCCTGCAGGGGAGAGGAGGAGGCTAGCAGGGCGGGCAGGGCCGGG

    GGCGTGCGGTTGAAACGGGGGTCCCGGGGGCCTGGGGAGTTAAACGTTGGCCCAGCACCGGGAAAAACAG

    GACTCCTGATTCCCTTGCTCAGGAATTGGGAGTGCGGGTCGCTTCTAAGGGCGCTTTCTGCTCTGTAATC

    CCAGCGCTTTGGGAGGCCGAGACGGGAGGATCGCTTGAGGCCAGGAGTTCAAGACTAGCCTGGGCAACAT

    AGCGAGACGCGCCCCCCCGCCCCGACCCCGCGCCATTACAAAAAAAAAGCAAACAAAAATTTTTTTAAAG

    ATCATCGATGAAGAGAGAAAATGCGCTTTTCTACAGAGTCCCCTTCCCACCCACAGCCCCATCCCCAGAT

    AAGCGGGGAGTTCCCTGGCGCGGTGCCAGTTTCTAGCCGCTGAGTGGGCGTGTGCGCGGCTCCAAGTGCG

    CCTGCGTACTGCTCACTCCCCAGCTCCGCGCCCTGCTCCGTTCCTCCCAAAACTCTGAATCGAAGAACTT

    TCCGGAAGTTTCTGAGAGCCCAGACCGGCGGGCACGCCCCCATCCCCAACCCCCTCTGTTAATCCCTACC

    AGCCTGCAGTCCTGGCTGCTTCCAAGCAGGAGGTGGGGCCTCTGGCCTAGCGGGGCCGAAAGGCAGTCCC

    CTCCCCCGCAGTCTGATTTCCCTCTTCCCCCCAAAGGCAAGCACGAGGAGCGGCAGGACGAGCATGGCTA

    CATCTCCCGGTGCTTCACGCGGAAATACACGTGAGTCCTGGCGCCAGGTCGGGGTGGGTGGGTGGCGTGG

    GGGTGGGGTCAGGGAAGAGGGCACAGGGACCCACCCGGTGTGTAATGTAACGCTTGCCTTTCCTCTCTGC

    ACGTCCAGGCTGCCCCCCGGTGTGGACCCCACCCAAGTTTCCTCCTCCCTGTCCCCTGAGGGCACACTGA


    CCGTGGAGGCCCCCATGCCCAAGCTAGCCACGCAGTCCAACGAGATCACCATCCCAGTCACCTTCGAGTC

    GCGGGCCCAGCTTGGGGGCCCAGAAGCTGCAAAATCCGATGAGACTGCCGCCAAGTAAAGCCTTAGCCCG

    GATGCCCACCCCTGCTGCCGCCACTGGCTGTGCCTCCCCCGCCACCTGTGTGTTCTTTTGATACATTTAT

    CTTCTGTTTTTCTCAAATAAAGTTCAAAGCAACCACCTGTCACTGGCCCAGGCCCTGGTGTTTGTGGAAG

    GAAGCCTCAGGCACCTGCCATTTGCTGGCTTTCAGGAGTCATCTTTGCTCAGGCCCGTGCTGGGCCATGT

    GGGTACACTGGTGTAGGTTGCTGGACACAGGCTGACTCACATCCATAAAGACAGAGGTCTTAGGGCCGGG

    CGCAGTGGCTCATACCTACAATCCCAGCACTTTGGGGGGTTGAAGCAGGAGGAGTGCTTGAAGCCAAGAG

    TTCTAGACCAGCCTGGACAACA

    Sequence 3

    AAACTTTCTGCGTCCGCCATCCTGTAGGAAGGATTTGTACACTTTAAACTCCCTCCCTGGTCTGAGTCCC

    ACACTCTCACCACCCAGCACCTTCAGGAGCTGACCCTTAACAGCTTCACCCACAGGGACCCCGAAGTTGC

    GTCGCCTCCGCAACAGTGTCAATAGCAGCACCAGCACTTCCCCACACCCTCCCCCTCAGGAATCCGTACT

    CTCTAGCGAACCCCAGAAACCTCTGGAGAGTTCTGGACAAGGGCGGAACCCACAACTCCGATTACTCAAG

    GGAGGCGGGGAAGCTCCACCAGACGCGAAACTGCTGGAAGATTCCTGGCCCCAAGGCCTCCTCCGGCTCG

    CTGATTGGCCCAGCGGAGAGTGGGCGGGGCCGGTGAAGACTCCTTAAAGGCGCAGGGCGGCGAGCAGGGC

    ACCAGACGCTGACAGCTACTCAGAATCAAATCTGGTTCCATCCAGAGACAAGCGAAGACAAGAGAAGCAG

    AGCGAGCGGCGCGTTCCCGATCCTCGGCCAGGACCAGCCTTCCCCAGAGCATCCACGCCGCGGAGCGCAA

    CCTTCCCAGGAGCATCCCTGCCGCGGAGCGCAACTTTCCCCGGAGCATCCACGCCGCGGAGCGCAGCCTT

    CCAGAAGCAGAGCGCGGCGCCATGGCCAAGAACACGGCGATCGGCATCGACCTGGGCACCACCTACTCGT

    GCGTGGGCGTGTTCCAGCACGGCAAGGTGGAGATCATCGCCAACGACCAGGGCAACCGCACGACCCCCAG

    CTACGTGGCCTTCACCGACACCGAGCGCCTCATCGGGGACGCCGCCAAGAACCAGGTGGCGCTGAACCCG

    CAGAACACCGTGTTCGACGCGAAGCGGCTGATCGGCCGCAAGTTCGGCGATGCGGTGGTGCAGTCCGACA

    TGAAGCACTGGCCCTTCCAGGTGGTGAACGACGGCGACAAGCCCAAGGTGCAGGTGAACTACAAGGGCGA

    GAGCCGGTCGTTCTTCCCGGAGGAGATCTCGTCCATGGTGCTGACGAAGATGAAGGAGATCGCTGAGGCG

    TACCTGGGCCACCCGGTGACCAACGCGGTGATCACGGTGCCCGCCTACTTCAACGACTCTCAGCGGCAGG

    CCACCAAGGACGCGGGCGTGATCGCCGGTCTAAACGTGCTGCGGATCATCAACGAGCCCACGGCGGCCGC

    CATCGCCTACGGGCTGGACCGGACCGGCAAGGGCGAGCGCAACGTGCTCATCTTCGACCTGGGGGGCGGC

    ACGTTCGACGTGTCCATCCTGACGATCGACGACGGCATCTTCGAGGTGAAGGCCACGGCGGGCGACACGC

    ACCTGGGAGGGGAGGACTTCGACAACCGGCTGGTGAGCCACTTCGTGGAGGAGTTCAAGAGGAAGCACAA

    GAAGGACATCAGCCAGAACAAGCGCGCGGTGCGGCGGCTGCGCACGGCGTGTGAGAGGGCCAAGAGGACG

    CTGTCGTCCAGCACCCAGGCCAGCCTGGAGATCGACTCTCTGTTCGAGGGCATCGACTTCTACACATCCA

    TCACGCGGGCGCGGTTCGAAGAGCTGTGCTCGGACCTGTTCCGCGGCACGCTGGAGCCCGTGGAGAAGGC

    CCTGCGCGACGCCAAGATGGACAAGGCGCAGATCCACGACCTGGTGCTGGTGGGCGGCTCGACGCGCATC

    CCCAAGGTGCAGAAGCTGCTGCAGGACTTCTTCAACGGGCGCGACCTGAACAAGAGCATCAACCCGGACG

    AGGCGGTGGCCTACGGGGCGGCGGTGCAGGCGGCCATCCTGATGGGGGACAAGTCGGAGAACGTGCAGGA

    CCTGCTGCTGCTGGACGTGGCGCCGCTGTCGCTGGGCCTGGAGACTGCGGGCGGCGTGATGACGGCGCTC

    ATCAAGCGCAACTCCACCATCCCCACCAAGCAGACGCAGACCTTCACCACCTACTCGGACAACCAGCCCG

    GGGTGCTGATCCAGGTGTACGAGGGCGAGAGGGCCATGACGCGCGACAACAACCTGCTGGGGCGCTTCGA

    GCTGAGCGGCATCCCGCCGGCGCCCAGGGGCGTGCCGCAGATCGAGGTGACCTTCGACATCGACGCCAAC

    GGCATCCTGAACGTCACGGCCACCGACAAGAGCACCGGCAAGGCCAACAAGATCACCATCACCAACGACA

    AGGGCCGCCTGAGCAAGGAGGAGATCGAGCGCATGGTGCAGGAGGCCGAGCGCTACAAGGCCGAGGACGA

    GGTGCAGCGCGACAGGGTGGCCGCCAAGAACGCGCTCGAGTCCTATGCCTTCAACATGAAGAGCGCCGTG

    GAGGACGAGGGTCTCAAGGGCAAGCTCAGCGAGGCTGACAAGAAGAAGGTGCTGGACAAGTGCCAGGAGG

    TCATCTCCTGGCTGGACTCCAACACGCTGGCCGACAAGGAGGAGTTCGTGCACAAGCGGGAGGAGCTGGA

    GCGGGTGTGCAGCCCCATCATCAGTGGGCTGTACCAGGGTGCGGGTGCTCCTGGGGCTGGGGGCTTCGGG

    GCCCAGGCGCCGCCGAAAGGAGCCTCTGGCTCAGGACCCACCATCGAGGAGGTGGATTAGAGGCCTCTGC

    TGGCTCTCCCGGTGTGGTCTAGAAAACAGACTCTTTGCACTTGATAGCTGCTTGGGCACCGATTACTGTC

    AAGGTTATTTAAAGTCTTCTTCATGGTTCAGTTTAAAGTTACAGTCTTTCTTAAGGTAATTGCGTTGACT

    GTTAAATTTTGTATGCATATATATATATATATATATATATATATATATATATTCAAATATATTCAAAGTA

    ATGTTGGGAGCAGCACTGTGCACTGTACCAGGGGATTATGTTTTATAGCTAATGATGTGTAAAGTCTAAA

    GATTTTTTTGTAATTTTTATATCAGTGTTCCAGTAGCCTGGGAAGACATATAGTCTAGCTGCCCAGTTCC

    CTGGAGATGGTCATCTCTAAGACAAAGTGTCTTAAACAAACGTCTTGGCACTGTGTACTACATAACTTTA

    CTCTTTTGTACTTAAAACTTTATCTGCTTGTCCATGTTAAGGTTTTGTGGTATAACCAGTATGTTCTTTG

    CATTTAATCTAAGTAGGTTAAAGATGGTGTATCCTTCCTGCATACATGTCTACACTGCCACCCTGTGTAC

    ATTTTTTTCTTTGCATCACTACAAACTAATGAAAAAAACTTTTATGACTTAAATATTCAAAATAAAAGGT

    TACAAGTATATTTTGTCTGTTTGTATGTTGGAAGGGCTAATGGATTCTGGGCTTCTGTGGATTTCTTAAG

    TTTTTTTTAAGATTTATTATTATATGTGAACACATTGTAGCTATCTTCAGACACACCAGAAAAGGGCATC

    AGATCTCATTACAGATGGGTGTGAGCCACCATGTGGTTCCTGGGATTTGAACTCAGGACCTTCGGAAGAG


    TAGTCAGTGCTCTTAACTGCTGAGCTGTCTCTCCAGCCCCCGGATTTCTTAGTTTTGTGATAACTGGAAA

    AGGGATTTTTTTTTGGTGGATTTTCAGTGCAGTTATGCAGGGAGTACAGGTATTTACTTTGAGGGTCGGG

    CTCATTCATGGGAAAAAGTAGGGTGGTGCTGTTGTTTGGGGTCAGTGGAAGAGGGACTCAAGGGGCTATG

    AGAGCTCAGCTC

    Exercise 5: alignment exercises

    Using the sequences below, answer the questions that follow:

    >seq 1

    1 tgtgttcact agcaacctca aacagacacc atggtgcacc tgactcctga ggagaagtct

    61 gccgttactg ccctgtgggg caaggtgaac gtggatgaag ttggtggtga ggccctgggc

    121 aggctgctgg tggtctaccc ttggacccag aggttccttg agtcctttgg ggatctgtcc

    181 actcctgatg ctgttatggg caaccctaag gtgaaggctc atggcaagaa agtgctcggt

    241 gcctttagtg atggcctggc tcacctggac aacctcaagg gcacctttgc cacactgagt

    301 gagctgcact gtgacaagct gcacgtggat cctgagaact tcaggctcct gggcaacgtg

    361 ctggtctgtg tgctggccca tcactttggc aaagaattca ccccaccagt gcaggctgcc

    421 tatcagaaag tggtggctgg tgtggctaat gccctggccc acaagtatca ctaagctcgc

    481 tttcttgctg tccaatttct attaaaggtt cctttgttcc ctaagtccaa ctactaaact

    541 gggggatatt atgaagggcc tt

    >seq 2

    1 aacgtggatg aagttggtgg tgaggccctg ggcaggctgc tggtggtcta cccttggacc

    61 cagaggttct ttgagtcctt tggggatctg tccactcctg atgctgttat gggcaaccct

    121 aaggtgaagg ctcatggcaa gaaagtgctc ggtgccttta gtgatggcct ggctcacctg

    181 gacaacctca agggcacctt tgccacactg agtgagctgc actgtgacaa gctgcacgtg

    241 gatcctgaga acttcaggct cctgggcaac gtgctggtct gtgtgctgga ccatcacttt

    301 ggcaaagaat tcaccccacc agtgcaggct gcctatcaga aagtggtggc tggtgtggct

    361 aatgccctgg cccacaagta tcactaagct cgctttcttg ctgtccaatt tctattaaag

    421 gttcctttgt tccctaagtc caactactaa actgggggat attatgaagg gccttgagca

    481 tctggatt

    1. Align the two sequences above.
    2. What type of alignment did you perform?
    3. At which positions do the two sequences differ?
    4. Obtain the protein sequence encoded by each gene.
    5. Are there differences in the protein-coding region?
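Questions 1 to 3 can also be explored with a small script. The sketch below is a plain Smith-Waterman local alignment with a linear gap penalty, followed by a helper that lists the columns where the aligned sequences differ. The scoring values are arbitrary assumptions, the sequences are short toys standing in for seq 1 and seq 2, and a local alignment is only one reasonable choice for sequences that overlap in part.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b, start_a, start_b); the starts are
    0-based offsets of the aligned region in a and b.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, bi, bj = 0, 0, 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j
    # Trace back from the best cell until the score drops to zero.
    i, j = bi, bj
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b)), i, j

def differences(aligned_a, aligned_b, start_a, start_b):
    """List (pos_a, pos_b, base_a, base_b) for mismatch and gap columns.

    Positions are 1-based; in a gap column the gapped sequence reports the
    position of its last base before the gap.
    """
    diffs, pa, pb = [], start_a, start_b
    for x, y in zip(aligned_a, aligned_b):
        if x != "-":
            pa += 1
        if y != "-":
            pb += 1
        if x != y:
            diffs.append((pa, pb, x, y))
    return diffs

seq_a = "GGGACGTACGTGGG"   # toy stand-in for seq 1
seq_b = "ACGTTCGT"         # toy stand-in for seq 2
score, al_a, al_b, sa, sb = local_align(seq_a, seq_b)
print(score)
print(al_a)
print(al_b)
print(differences(al_a, al_b, sa, sb))
```

The matrix is filled in pure Python, so sequences of a few hundred bases, such as the two above, are about the practical limit before a dedicated tool becomes the better option.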

    Exercise 6:

    Using the sequence below:

    ctactggtacttcgatctctggggccgtggcaccctggtcactgtctcctcagagtcttctctgtccaggcacc

    1. Identify the gene and the organism it belongs to.
    2. Obtain the complete sequence of the gene from NCBI.
    3. Align the two sequences with CLUSTALW and check the alignment for mismatches and gaps.
    4. Is there any change in the protein-coding region?


    Exercise 7:

    Define each of the following and give examples of its application:

    a) Sequence comparison;
    b) Multiple alignment;
    c) Primer;
    d) Boolean search;
    e) Metabolic pathway;
    f) Patent;
    g) Domain.

    Exercise 8:

    Carry out the following steps and record (print screen) the results of each step marked with *:

    a) Choose a gene whose protein has 100 or more amino acids and include its sequence in FASTA format in your answer;
    b) Run BlastP (NCBI BLAST) and record the search configuration screen, the list of results (below the graphic), and the second and third alignments;
    c) Briefly interpret the result (ignoring the first alignment), confirming or not the identity of the protein and justifying your answer;
    d) Run a multiple alignment of seven (7) closely related proteins from different organisms and record the resulting alignment. Comment on the result;
    e) Design a primer pair to amplify a region of 190 to 238 bp in the gene for this protein, and record the tool configuration and the details of the chosen primers;
    f) Check whether there are patents related to this protein, and describe:
    g) Details of the search performed;
    h) Number of patents found;
    i) Briefly comment on one of these patents.
    j) Identify the EC number of this protein;
    k) Classify the protein according to the Gene Ontology ontologies.
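For item e), a candidate primer pair can be sanity-checked with a few lines of code before it is recorded. The sketch below reports GC content, a rough melting temperature from the Wallace rule (2 °C per A/T plus 4 °C per G/C, a rule of thumb meant for short oligonucleotides), and the product size on a template. The template and primers are toy sequences, the function names are illustrative, and a primer design tool applies more thorough criteria.

```python
def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def gc_percent(primer):
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)

def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees Celsius."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

def amplicon_length(template, fwd, rev):
    """Product size for primers written 5'->3', or None if either does not anneal.

    fwd must match the template as written; rev must match its reverse complement
    downstream of fwd. Only exact matches are considered.
    """
    t = template.upper()
    start = t.find(fwd.upper())
    if start < 0:
        return None
    site = reverse_complement(rev)
    pos = t.find(site, start + len(fwd))
    if pos < 0:
        return None
    return pos + len(site) - start

template = "TTACGGATCCAAAAAAAAAAGTCTGCATTT"   # toy template
fwd, rev = "ACGGATCC", "ATGCAGAC"             # toy primers
size = amplicon_length(template, fwd, rev)
print(size, gc_percent(fwd), wallace_tm(fwd), wallace_tm(rev))
print("within 190-238 bp:", size is not None and 190 <= size <= 238)
```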

    Exercise 9:

    The sequence below is given in two versions: the first unedited (containing introns and exons) and the second edited (spliced). Using one of these sequences, indicate:

    a) Where the gene begins and ends.
    b) Having identified this in item a), also give the position of the complete gene in the attached 79,629 bp BAC clone (e.g. +50 bp to +350 bp, or -1800 bp to -1350 bp).
    c) The orientation of the gene: + strand or - strand?
    d) Mark the introns and exons in the complete gene.
    e) Using the BAC clone and the gene location from item b), identify and select the 2 kb region upstream of the gene. What is the function of this 2 kb upstream region?


    f) From the selected upstream region, assemble the full sequence spanning the 2 kb region and the gene. Do this with both the complete gene and the spliced gene.
    g) Determine whether there are promoter regions in each of the sequences (gene + upstream). If so, indicate their positions in the sequence (use colors).
    h) Given the two gene sequences, if we had to design one primer pair for real-time PCR, what would be the best strategy for designing these primers? Justify your answer. And how can we tell whether the cDNA used in the real-time PCR is contaminated with gDNA?

    > Complete gene

    CGTGGCTGATGTTGCCAGTTGAAATCGGTAAAGCAACTGCATCGCTGTATCCTGACGAGAATAATGATAC

    ATCTCAAAATAATGTTATAAGTGCATTAGCCCCTCCGAAGGATGTGGATGATCTGCGGCTGATATCTGGA

    TATGGAAATGTCAATATATTCACGTATAGTGAATTGAGAGCTGCTACCAAGAATTTCCGGCCAGATCAGG

    TTCTTGGAGAGGGTGGCTTTGGGGTTGTATATAAAGGTGTTATTGATGAGAGTGTCAGGCCAGGTTCTGA

    AACCATCCAAGTTGCTGTGAAAGAGCTAAAGTCAGATGGCTTGCAGGGAGACAAAGAGTGGCTGGTAATT

    TCTCCCTTGTACTTTTTCAATATTTCTAAGCCAAATTTGTGATATTTTTTCTATCAGCATGAAATTTTCT

    TCACTGCACCATGTATGAGTTGGCAATTGTAGTGAAAAACAAGTTGTCTGAAAAATGTTGTGGAGATTAC

    ACCTAAGGTCTTATCTATTTCATCCTCTTGTATGTAGTTGAGTTAATGCTGTAAACACCCTCAACTTATA

    TAAAATGAAGTCACCTACACAACAAACGGTTGCATTCTGCAGATCATGATTATGTTATTTAATGGCAATA

    TCACTGATTGTATAAGTTGGAAACACTTTTATTGTTCTTTGGGAAATATATATTACATATTTGCTCTTCT

    TTGACTTTAGGCTTAAAGGGATGTTCATCCCTAAAAGCAGATATCTAAAGATGTATGGGAATATGAAATT

    TCTTCAAGTAACCACGGTATCTTCCAAGATGTAATTTATGTTACCTTATTTAGATATCTTATGGAAATCC

    AGAGAGGCATTACTGTCAAGAGAAATCGTTGTCGGTCCCCCTTTGGCTCCATCTCAACTTAAAATTCAGG

    AGTACATGACATGTGTAGTAAACTAATACCAAGTTTTGTCTTGGTCAGCTCCTCTATTATAAGATATATC

    CATATATCAGTTTATTTGGAAGTCCAAGATGCTTAGTAACCTGTGACAAAACCCTATAATAGAAATTTGG

    ACAGCCATATGATTACCAAAGATGGATTCCTTTTCCAATACCTCTGTGATCCATCTAGAATACTAATTAA

    ATGAGTTACTTTTTGGGTTTTTCAGGCAGAAGTAAACTATCTGGGGCAACTTAGTCATCCCAATCTTGTC

    AAACTTATTGGTTACTGCTGTGAAGGTGATCACAGGCTGCTAGTTTATGAGTATATGGCTTCTGGCAGCC

    TTGACAAGCACCTATTCCGACGTAAGTACTTCTCTGACACGAATATCATTACAATACTTAACTAGTTATT

    GTATAAAAATATTAATATCTGAGTTTGTGGAATTTAGTAAATGGTCATTTGTGGCCATCTATCCATTATA

    ACTTCATTTTTCGTTTTTGACAACAGATAAATAATTTTTTAATCTACTATGAAGAAGGAAGACTTTATTA

    CTAGAAACTAGATATTGCTGGGATATATATACCTTTTGTATATATGCATGGATTTGTTTATTATGCCTTT

    GTATAACAAGAGCAATTAATTGTAACTACTAACTATCTTCATCATATCATTCCTTCCTGCCACTGGGCTC

    TGTTAAGAGCAATGTTAGTAGTTTATAGTTGTTTATGGACTCTTTCCCCTGGAACCACTAATCTGGAATA

    ACCTACTTATACATCTAGATATCTTTATGTATGAAAGAATTATGAAAAAGATCTCATCTATCCTCAAGTT

    TTTTCATATATTCAAATAAATTTTATCTTATATTTTTTAATAATAACATAAATCTATCTTGTTATCCTCA

    TATGTGGCTAACAATTAATAATAATTAATCCTTTATCCTTCCTAGTTACTAAGTGACAAGTAAACCGGTG

    GGTAAGTAGTAAGCTAGGGGCAAAATTAGTAAGTATTGGGAATTGTACATGATGAGTTGTTGCCTTTTGC

    TTATGAGCACAATTGCCCCTATCTAATTTCCCATATAATAAACATAAGTTTCTAAAATATACAATGATTA

    AGTATAAGACAATACACTAAAAACAACTTTGCAACTTGGTAACTTTCTATATTCATTGTGTGTGTCTGAA

    ATATTTCTACCTTCTTGACAGGGGTTTGTCTTACAATGCCATGGTCTACTCGAATGAAAATTGCCCTTGG

    TGCTGCAAAAGGACTAGCCTTTCTTCATGCAGCTGAAAGATCAATCATCTATCGTGACTTCAAGACATCA

    AATATCTTACTGGATGAAGTTTGTATCTCTTCCTGCACCATTGGACTGGATATTTAAATTCCCTAGAACT

    ATCTCCATTATCATTATCATATTTAGCAAATACAACTCGGTTACAGATTATTAAGTCCCTATATATATTT

    GTAGTCCATAAATTTTATGGCCATAACATTGGCAAAATTAGATAAATTTGTTATATGGTTAAGAGCACAA

    TTACTTACATAAAAAAGATTTCATTCAGTGTGAACTCATCAATTTCTAAATAATGAGTCTGTTCCATTAA

    AAAAAAATGCATACTTATTATTTGAAAAAGAAAATTGCAAATGTCCAGTATGTGAGCAACAAAAGTGGTT

    ACTGAATCAATGAAAACAAGTAACTAAGGAACTCCATCGTATAATAATATTAAGGATACCCTTTTGAAGC

    ATGCCCATACTGTGAAAGGTCATTTATATGTTTTTCATAACCTGAAAATATAGAAATCATGAAAACATAG

    TTGTTTCTCATCAATTGTCTAATTCTTGGATTCACTTTGCAGGATTACAATGCAAAGCTCTCAGACTTTG

    GCCTTGCAAAAGAGGGGCCTACAGGTGACCAAACTCACGTTTCCACTCGGGTCGTTGGTACATATGGATA

    TGCAGCTCCCGAGTATATAATGACTGGTGAGCTTCTCAAACCACAATCCCAACTTTATGCAAAAGGGAGT

    GCAGATAATTAAGCGCTACTAGCCACTTTCTATCGAGTGTCTCAAATGTGTAGCGATTACATGTCGTTTT

    GTATATTTCTCTAGATGCCTTGCAGTAGGTGACTCTTTCTGGTTCTCTATTCCTTTTCTCTAAAGAAAAC

    CCATGTTTCTGAGCAACTTTACCTCTATTTTAGGGCGTATAGCTTAAATTAATAACTTCTTTCATTTATT

    CCAGGCCATTTAACTGCAAGGAGTGATGTTTACGGATTTGGAGTTGTATTGCTGGAGATGCTTTTAGGGA

    GAAGGGCAATGGACAAGAGCAGGCCCAGCAGACACCAGAACCTCGTTGAGTGGGCTCGACCACTCCTGAT

    CAATGGTCGGAAGTTGCTAAAGATCTTGGATCCAAGAATGGAAGGGCAATATTCTAACAGAGTTGCAACA

    GATGTGGCTAGTTTAGCATATCGATGCCTGAGCCAGAACCCGAAAGGGAGGCCAACAATGAACCAAGTAG


    TCGAGTCGCTTGAGAGCCTTCAAGACCTGCCTGAGAACTGGGAAGGCATCCTGTTTCAGAGCAGTGAAGC

    TGCTGTGACTCTCTATGAGGCTCCAAAAGAGATTGCGAGTGACCATTTAGAAAAGAACTCCAGCGAGAAT

    GGAGAGAATGGATCAAATGTGCATGCCAAGGGAAGAAAGAAGCTTGGAAATGGCAGAAGCAACAGCGAGC

    CACCGCCGGTGGAGTTCAGTCAGTACAGTCCTTCACCTGAGTCAGAGAGACATGAGCCAAGTAGAAGATC

    AATCGATCATGACAGAATTCCAAGGCCACCTGCCTATTGACGTGGCTG

    > Spliced gene

    CGTGGCTGATGTTGCCAGTTGAAATCGGTAAAGCAACTGCATCGCTGTATCCTGACGAGAATAATGATAC

    ATCTCAAAATAATGTTATAAGTGCATTAGCCCCTCCGAAGGATGTGGATGATCTGCGGCTGATATCTGGA

    TATGGAAATGTCAATATATTCACGTATAGTGAATTGAGAGCTGCTACCAAGAATTTCCGGCCAGATCAGG

    TTCTTGGAGAGGGTGGCTTTGGGGTTGTATATAAAGGTGTTATTGATGAGAGTGTCAGGCCAGGTTCTGA

    AACCATCCAAGTTGCTGTGAAAGAGCTAAAGTCAGATGGCTTGCAGGGAGACAAAGAGTGGCTGGCAGAA

    GTAAACTATCTGGGGCAACTTAGTCATCCCAATCTTGTCAAACTTATTGGTTACTGCTGTGAAGGTGATC

    ACAGGCTGCTAGTTTATGAGTATATGGCTTCTGGCAGCCTTGACAAGCACCTATTCCGACGGGTTTGTCT

    TACAATGCCATGGTCTACTCGAATGAAAATTGCCCTTGGTGCTGCAAAAGGACTAGCCTTTCTTCATGCA

    GCTGAAAGATCAATCATCTATCGTGACTTCAAGACATCAAATATCTTACTGGATGAAGATTACAATGCAA

    AGCTCTCAGACTTTGGCCTTGCAAAAGAGGGGCCTACAGGTGACCAAACTCACGTTTCCACTCGGGTCGT

    TGGTACATATGGATATGCAGCTCCCGAGTATATAATGACTGGCCATTTAACTGCAAGGAGTGATGTTTAC

    GGATTTGGAGTTGTATTGCTGGAGATGCTTTTAGGGAGAAGGGCAATGGACAAGAGCAGGCCCAGCAGAC

    ACCAGAACCTCGTTGAGTGGGCTCGACCACTCCTGATCAATGGTCGGAAGTTGCTAAAGATCTTGGATCC

    AAGAATGGAAGGGCAATATTCTAACAGAGTTGCAACAGATGTGGCTAGTTTAGCATATCGATGCCTGAGC

    CAGAACCCGAAAGGGAGGCCAACAATGAACCAAGTAGTCGAGTCGCTTGAGAGCCTTCAAGACCTGCCTG

    AGAACTGGGAAGGCATCCTGTTTCAGAGCAGTGAAGCTGCTGTGACTCTCTATGAGGCTCCAAAAGAGAT

    TGCGAGTGACCATTTAGAAAAGAACTCCAGCGAGAATGGAGAGAATGGATCAAATGTGCATGCCAAGGGA

    AGAAAGAAGCTTGGAAATGGCAGAAGCAACAGCGAGCCACCGCCGGTGGAGTTCAGTCAGTACAGTCCTT

    CACCTGAGTCAGAGAGACATGAGCCAAGTAGAAGATCAATCGATCATGACAGAATTCCAAGGCCACCTGC

    CTATTGACGTGGCTG
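One way to approach item d) of Exercise 9 is to map the spliced sequence back onto the complete gene: every stretch that matches is an exon and every skipped stretch is an intron. The sketch below is a greedy heuristic under strong assumptions (exact matches only, exons in order, each seed found first at its true location), with illustrative names and toy sequences. When the first bases of an intron happen to equal the first bases of the next exon, the reported boundary can shift by a few bases, so the result should be checked against the usual GT...AG intron ends.

```python
def map_exons(genomic, spliced, seed=12):
    """Greedily map a spliced sequence onto its genomic sequence.

    Returns exon coordinates on the genomic sequence as 1-based inclusive
    (start, end) pairs. Raises ValueError if the spliced sequence cannot be placed.
    """
    exons = []
    g = s = 0
    while s < len(spliced):
        probe = spliced[s:s + seed]          # next bases of the spliced sequence
        g = genomic.find(probe, g)           # where they resume in the genomic one
        if g < 0:
            raise ValueError("spliced sequence does not map at position %d" % (s + 1))
        start = g
        while g < len(genomic) and s < len(spliced) and genomic[g] == spliced[s]:
            g += 1
            s += 1
        exons.append((start + 1, g))
    return exons

def introns_between(exons):
    """Intron coordinates implied by consecutive exons (1-based inclusive)."""
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]

# Toy example: two exons separated by a 14-base intron.
genomic = "ATGGCCAAA" + "GTAAGTCTCTCTAG" + "CCATCGTGATT"
spliced = "ATGGCCAAA" + "CCATCGTGATT"
exons = map_exons(genomic, spliced, seed=6)
print(exons)
print(introns_between(exons))
```

Applied to the two sequences above, `genomic` would be the complete gene and `spliced` the spliced gene, each joined into a single string with line breaks removed.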