“Space, the final frontier...”

How to explore the scoring function space for drug design.

Prof. Dr. Walter F. de Azevedo Jr.

azevedolab.net

Ph.D. in Sciences (Applied Physics), Universidade de São Paulo (USP)

Visiting Researcher at the University of California, Berkeley, USA, 1993-1996

Participant in the space crystallization research with NASA (STS-95), 1998

Livre-Docente in Physics, Universidade Estadual Paulista (UNESP)

Regional Editor of the journal Current Drug Targets

Section Editor (Bioinformatics in Drug Design and Discovery) of the journal Current Medicinal Chemistry

Editorial Board Member of the journal Current Bioinformatics

CNPq Level 1B Researcher

What is chemical space?

~10^60 organic molecules

Bohacek RS, McMartin C, Guida WC. The art and practice of structure-based drug design: a molecular modeling perspective. Med Res Rev. 1996; 16(1):3–50.

Advantages...

Abstraction of systems...

Definition of subspaces of interest

Drug subspace

Natural product subspace

CDK inhibitor subspace

What is protein space?

Representation of protein space. Source: http://www.pnas.org/content/102/10/3651/tab-figures-data

Hou J, Jun SR, Zhang C, Kim SH. Global mapping of the protein structure space and application in structure-based inference of protein function. Proc Natl Acad Sci U S A. 2005; 102(10):3651-6.

Protein space

Smith JM. Natural selection and the concept of a protein space. Nature. 1970; 225(5232): 563–564.

What is the scoring function space?

How can the scoring function space be searched?

[Figure: protein space and chemical space, related through binding affinity data: Ki, IC50, Kd, or ΔG]

Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23): 2459–2470.

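$$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + \sum_{j=0}^{M} \beta_j y_j^2$$

$$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + \sum_{j=0}^{M} \sum_{k=0}^{M} \gamma_{j,k} z_{j,k} + \sum_{j=0}^{M} \beta_j y_j^2$$

$$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + \sum_{j=0}^{M} \sum_{k=0}^{M} \gamma_{j,k} z_{j,k} + A \exp(w) \sum_{j=0}^{M} \beta_j y_j^2$$

$$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + A \sum_{j=0}^{N} \beta_j y_j + B \sum_{k=0}^{N} \gamma_k z_k + C \sum_{k=0}^{N} \omega_k z_k^2 + D \sum_{k=0}^{N} \omega_k z_k^3 + E \sum_{k=0}^{N} \omega_k z_k^4$$

$$\log(K_i) = \sum_{i=0}^{N} \sum_{j=0}^{M} \sum_{k=0}^{M} \gamma_{i,j,k} z_{i,j,k} + B \sin(w) \sum_{j=0}^{M} \beta_j y_j^2$$

$$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + A \exp(\omega) \sum_{j=0}^{M} \beta_j y_j^2$$

$$\ldots$$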

Protein space

Scoring function space

Chemical space

Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23): 2459–2470.


Tools for exploring the scoring function space

Taba: Tool to Analyze the Binding Affinity

Mass-spring system for simulating protein-ligand interactions (stylized view)

Mass-spring system for simulating protein-ligand interactions (basic physics)

[Figure: mass-spring system in simple harmonic motion; d0 = equilibrium distance]

$$V(x) = V(d_0) + V'(d_0)(x - d_0) + \frac{1}{2!} V''(d_0)(x - d_0)^2 + \frac{1}{3!} V'''(d_0)(x - d_0)^3 + \cdots \quad (1)$$

At the equilibrium distance d0 the first derivative V'(d0) vanishes, so for small displacements the expansion reduces to the harmonic term:

$$V(x) \approx V(d_0) + \frac{1}{2!} V''(d_0)(x - d_0)^2 \quad (2)$$
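As a rough numerical illustration of Eq. (2), the sketch below compares a pair potential with its second-order Taylor approximation around the equilibrium distance d0. The 12-6 (Lennard-Jones-type) form, the parameter values, and the function names are illustrative assumptions and do not come from the slides.

```python
def lennard_jones(x, epsilon=1.0, d0=3.8):
    """12-6 potential with its minimum (-epsilon) at x = d0 (illustrative choice)."""
    r = d0 / x
    return epsilon * (r ** 12 - 2.0 * r ** 6)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


def harmonic(x, f, d0):
    """Eq. (2): V(x) ~ V(d0) + (1/2!) V''(d0) (x - d0)^2."""
    return f(d0) + 0.5 * second_derivative(f, d0) * (x - d0) ** 2


if __name__ == "__main__":
    d0 = 3.8  # hypothetical equilibrium distance, in angstroms
    for x in (3.7, 3.8, 3.9, 4.3):
        exact = lennard_jones(x)
        approx = harmonic(x, lennard_jones, d0)
        print(f"x = {x:.1f}  V = {exact:+.4f}  harmonic = {approx:+.4f}")
```

Close to d0 the two columns should agree well; farther away the harmonic term increasingly departs from the full potential, which is the price of truncating Eq. (1).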

Mass-spring system for simulation (examples)

Homogeneous isotropic materials (Kot et al., 2015; Kot & Nagahashi, 2017)

Graphene sheets (Kim et al., 2014)

Electron tunneling in transistors (Pasupathy et al., 2005)

Kim MH et al. (2014) Vibrational characteristics of graphene sheets elucidated using an elastic network model. Phys. Chem. Chem. Phys., 16, 15263–15271.

Kot M et al. (2015) Elastic moduli of simple mass spring models. Vis. Comput., 31(10), 1339–1350.

Kot M and Nagahashi H (2017) Mass spring models with adjustable Poisson’s ratio. Vis. Comput., 33(3), 283–291.

Pasupathy AN et al. (2005) Vibration-assisted electron tunneling in C140 transistors. Nano Lett., 5, 203–207.

Mass-spring system for simulating protein-ligand interactions (equilibrium distance)

Mass-spring system for simulating protein-ligand interactions (flowchart)

Mass-spring system for simulating protein-ligand interactions (atom pairs)

Mass-spring system for simulating protein-ligand interactions (binding affinity)

$$PBA = \alpha_0 + \sum_i \sum_j \alpha_{i,j} (d_{i,j} - d_{0,i,j})^2 \quad (3)$$

Mass-spring system for simulating protein-ligand interactions (machine learning)

$$PBA = \alpha_0 + \sum_i \sum_j \alpha_{i,j} (d_{i,j} - d_{0,i,j})^2 \quad (3)$$

$$RSS = \sum_{i=1}^{M} (y_i - PBA_i)^2 + \lambda_1 \sum_{j=1}^{N} |\omega_j| + \lambda_2 \sum_{j=1}^{N} \omega_j^2 \quad (4)$$

| Method | λ1 | λ2 | Reference |
|---|---|---|---|
| Ordinary linear regression | 0 | 0 | [1] |
| Least absolute shrinkage and selection operator (Lasso) | > 0 | 0 | [2] |
| Ridge | 0 | > 0 | [3] |
| Elastic Net | > 0 | > 0 | [4] |

[1] Legendre AM. Nouvelles méthodes pour la détermination des orbites des comètes, Courcier, Paris, 1805.

[2] Tibshirani R. Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Series B Stat. Methodol. 1996, 58, 267–288. https://doi.org/10.1111/j.1467-9868.2011.00771.x.

[3] Tikhonov AN. On the regularization of ill-posed problems, Dokl. Akad. Nauk SSSR. 1963, 153, 49–52.

[4] Zou H, Hastie T. Regularization and variable selection via the elastic net, J. R. Stat. Soc. Series B Stat. Methodol. 2005, 67, 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x.
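The relation between Eq. (4) and the four regression methods can be sketched in a few lines of plain Python. The function below evaluates the penalized residual sum of squares for given λ1 and λ2; the toy numbers, the function name, and the assumption that the ω values are the regression weights being penalized are illustrative, not taken from the slides.

```python
def regularized_rss(y, pba, weights, lambda1=0.0, lambda2=0.0):
    """Eq. (4): squared residuals plus L1 and L2 penalties on the weights."""
    residual = sum((yi - pi) ** 2 for yi, pi in zip(y, pba))
    l1_penalty = sum(abs(w) for w in weights)
    l2_penalty = sum(w * w for w in weights)
    return residual + lambda1 * l1_penalty + lambda2 * l2_penalty


# Each method is a choice of (lambda1, lambda2); 0.1 is an arbitrary positive value.
METHODS = {
    "ordinary linear regression": (0.0, 0.0),
    "lasso": (0.1, 0.0),
    "ridge": (0.0, 0.1),
    "elastic net": (0.1, 0.1),
}

if __name__ == "__main__":
    y = [-6.0, -7.0, -5.0]          # hypothetical experimental values
    pba = [-6.5, -6.5, -5.5]        # hypothetical model predictions
    weights = [-0.1, -0.4, -0.35]   # hypothetical regression weights
    for name, (l1, l2) in METHODS.items():
        print(f"{name:28s} {regularized_rss(y, pba, weights, l1, l2):.5f}")
```

With both λ values at zero the expression is the ordinary least-squares objective; turning on λ1, λ2, or both adds the Lasso, Ridge, or Elastic Net penalty respectively.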

Mass-spring system for simulating protein-ligand interactions (biological system)

[Figure: cell cycle (phases G1, S, G2, M) with the associated complexes CDK4/Cyclin D, CDK6/Cyclin D, CDK2/Cyclin E, CDK2/Cyclin A, CDK1/Cyclin A, CDK1/Cyclin B]

Scoring function space

Protein space

Chemical space

Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23): 2459–2470.

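Mass-spring system for simulating protein-ligand interactions (dataset)

| PDB access code | Ligand identification | Ligand chain | Ligand number | Ki (nM) | Test set |
|---|---|---|---|---|---|
| 1E1V | CMG | A | 401 | 12000 | 1 |
| 1E1X | NW1 | A | 401 | 1300 | 0 |
| 1H1S | 4SP | A | 1298 | 6 | 0 |
| 1JSV | U55 | A | 400 | 2000 | 1 |
| 1OGU | ST8 | A | 1298 | 2400 | 0 |
| 1PXM | CK5 | A | 500 | 60 | 1 |
| 1PXN | CK6 | A | 500 | 195 | 0 |
| 1PXO | CK7 | A | 500 | 2 | 1 |
| 1PXP | CK8 | A | 500 | 220 | 0 |
| 1PYE | PM1 | A | 700 | 386 | 1 |
| 1V1K | 3FP | A | 299 | 35000 | 1 |
| 2CLX | F18 | A | 1299 | 13300 | 0 |
| 2EXM | ZIP | A | 400 | 78000 | 0 |
| 2FVD | LIA | A | 299 | 3 | 0 |
| 2XMY | CDK | A | 500 | 0.11 | 1 |
| 2XNB | Y8L | A | 1299 | 149 | 1 |
| 3BLR | CPB | A | 940 | 3 | 0 |
| 3DDQ | RRC | A | 299 | 250 | 0 |
| 3LFN | A27 | A | 299 | 3160 | 0 |
| 3LFS | A07 | A | 299 | 2500 | 1 |
| 3MY5 | RFZ | A | 300 | 65000 | 0 |
| 4ACM | 7YG | A | 1302 | 210 | 0 |
| 4BCK | T3E | A | 1298 | 4 | 0 |
| 4BCM | T7Z | A | 1297 | 123 | 0 |
| 4BCN | T9N | A | 1299 | 12 | 0 |
| 4BCO | T6Q | A | 1299 | 131 | 0 |
| 4BCP | T3C | A | 1299 | 568 | 0 |
| 4BCQ | TJF | A | 1296 | 147 | 0 |
| 4EOP | 1RO | A | 301 | 890 | 0 |
| 4NJ3 | 2KD | A | 301 | 140 | 0 |
| 5D1J | 56H | A | 4000 | 38 | 0 |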

Mass-spring system for simulating protein-ligand interactions (model)

$$PBA = \alpha_0 + \sum_i \sum_j \alpha_{i,j} (d_{i,j} - d_{0,i,j})^2$$

The regression coefficients are the following:

α0 = -6.581356;

αC,N = -0.111232;

αC,O = -0.406456;

αN,F = -0.353717.

The equilibrium distances (in Å) are the following:

d0,C,N = 3.99463;

d0,C,O = 3.88190;

d0,N,F = 4.21672.

Taba obtained these results using the elastic net with cross-validation (CV) as the regression method.
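To make the fitted model concrete, the sketch below plugs the coefficients and equilibrium distances listed above into the PBA expression. The input distances are hypothetical, and how d_ij is measured from a structure (and which atom of each pair belongs to the protein or the ligand) is not specified on the slide, so the function simply takes one distance per atom-pair type as given.

```python
ALPHA_0 = -6.581356

# (alpha_ij, d0_ij in angstroms) for each atom-pair type, as listed on the slide.
PAIR_TERMS = {
    ("C", "N"): (-0.111232, 3.99463),
    ("C", "O"): (-0.406456, 3.88190),
    ("N", "F"): (-0.353717, 4.21672),
}


def predicted_binding_affinity(distances):
    """Evaluate PBA = alpha_0 + sum_ij alpha_ij * (d_ij - d0_ij)^2.

    `distances` maps an atom-pair type to a distance d_ij in angstroms;
    pair types absent from the mapping contribute nothing.
    """
    pba = ALPHA_0
    for pair, d in distances.items():
        alpha, d0 = PAIR_TERMS[pair]
        pba += alpha * (d - d0) ** 2
    return pba


if __name__ == "__main__":
    # Hypothetical distances for one protein-ligand complex.
    example = {("C", "N"): 4.20, ("C", "O"): 3.60, ("N", "F"): 4.30}
    print(f"PBA = {predicted_binding_affinity(example):.4f}")
```

When every distance equals its equilibrium value the quadratic terms vanish and the prediction reduces to α0; because all three α coefficients are negative, any deviation from equilibrium lowers the predicted value.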

Mass-spring system for simulating protein-ligand interactions (statistical analysis)

| Scoring function | ρ | p-value1 | R² | p-value2 |
|---|---|---|---|---|
| Free Energy (a) | -0.133 | 0.7324 | 0.204 | 0.2227 |
| Final Intermolecular Energy (a) | 0.133 | 0.7324 | 0.204 | 0.2228 |
| vdW+Hbond+desolv Energy (a) | 0.133 | 0.7324 | 0.204 | 0.2228 |
| Electrostatic Energy (a) | 0.533 | 0.1392 | 0.376 | 0.0789 |
| Final Total Internal Energy (a) | -0.133 | 0.7324 | 0.089 | 0.4365 |
| Torsional Free Energy (a) | 0.068 | 0.8630 | 0.000 | 0.9792 |
| Plants Score (b) | 0.183 | 0.6368 | 0.001 | 0.9348 |
| MolDock Score (b) | 0.217 | 0.5755 | 0.010 | 0.7950 |
| Rerank Score (b) | 0.333 | 0.3807 | 0.007 | 0.8336 |
| Interaction Score (b) | 0.367 | 0.3317 | 0.013 | 0.7698 |
| Protein Score (b) | 0.367 | 0.3317 | 0.025 | 0.6839 |
| Water Score (b) | -0.569 | 0.1098 | 0.395 | 0.0699 |
| Internal Score (b) | 0.033 | 0.9322 | 0.001 | 0.9369 |
| Electrostatic Score (b) | 0.548 | 0.1269 | 0.204 | 0.2218 |
| Electrostatic Long Score (b) | -0.548 | 0.1269 | 0.204 | 0.2218 |
| H-Bond Score (b) | 0.650 | 0.0581 | 0.512 | 0.0301 |
| Ligand Efficiency 1 Score (b) | 0.150 | 0.7001 | 0.024 | 0.6935 |
| Ligand Efficiency 3 Score (b) | 0.283 | 0.4600 | 0.023 | 0.6968 |
| Affinity Score (c) | -0.067 | 0.8647 | 0.117 | 0.3669 |
| Gauss1 Score (c) | -0.367 | 0.3317 | 0.120 | 0.3603 |
| Gauss2 Score (c) | -0.283 | 0.4600 | 0.018 | 0.7297 |
| Repulsion Score (c) | -0.700 | 0.0358 | 0.240 | 0.1804 |
| Hydrophobic Score (c) | 0.100 | 0.7980 | 0.002 | 0.9157 |
| Hydrogen Score (c) | -0.583 | 0.0992 | 0.340 | 0.0993 |
| Taba (3 variables, 4.5 Å distance cutoff) | 0.783 | 0.01252 | 0.794 | 0.0107 |

Predictive performance of scoring functions (test set). (a) AutoDock 4; (b) Molegro Virtual Docker (MVD); (c) AutoDock Vina. p-value1 and p-value2 are related to ρ and R², respectively.

Hernandes MZ, Cavalcanti SM, Moreira DR, de Azevedo Junior WF, Leite AC. Halogen atoms in the modern medicinal chemistry: hints for the drug design. Curr Drug Targets. 2010; 11(3):303–314.

SAnDReS

Statistical Analysis of Docking Results and Scoring Functions

sandres.net

Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10): 801–812.

Polynomial scoring function

$$PBA = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_1 x_2 + \alpha_5 x_1 x_3 + \alpha_6 x_2 x_3 + \alpha_7 x_1^2 + \alpha_8 x_2^2 + \alpha_9 x_3^2$$

where x1, x2, and x3 are energy terms taken from programs such as AutoDock 4, AutoDock Vina, and Molegro Virtual Docker.

Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10): 801–812.
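One way to picture a walk through the scoring function space is to treat every subset of the nine non-constant polynomial terms as a candidate model. The sketch below evaluates the polynomial for a given coefficient vector and enumerates those subsets. Whether SAnDReS searches the space in exactly this way is not stated here, so the enumeration strategy and all names are assumptions made for illustration.

```python
from itertools import combinations

# The nine non-constant terms of the polynomial, in the order alpha_1 .. alpha_9.
TERMS = [
    ("x1", lambda x1, x2, x3: x1),
    ("x2", lambda x1, x2, x3: x2),
    ("x3", lambda x1, x2, x3: x3),
    ("x1*x2", lambda x1, x2, x3: x1 * x2),
    ("x1*x3", lambda x1, x2, x3: x1 * x3),
    ("x2*x3", lambda x1, x2, x3: x2 * x3),
    ("x1^2", lambda x1, x2, x3: x1 ** 2),
    ("x2^2", lambda x1, x2, x3: x2 ** 2),
    ("x3^2", lambda x1, x2, x3: x3 ** 2),
]


def polynomial_score(alphas, x1, x2, x3):
    """alphas[0] is alpha_0; alphas[1:] follow the order of TERMS."""
    return alphas[0] + sum(
        a * term(x1, x2, x3) for a, (_, term) in zip(alphas[1:], TERMS)
    )


def candidate_models():
    """Yield every non-empty subset of term names (an assumed search strategy)."""
    names = [name for name, _ in TERMS]
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            yield subset


if __name__ == "__main__":
    models = list(candidate_models())
    print(len(models), "candidate term subsets; first:", models[0])
```

Each subset would then be fitted by regression and ranked by its correlation with experimental affinity; terms left out of a subset correspond to coefficients fixed at zero.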

| Scoring functions and energy terms | Description |
|---|---|
| MolDock Score | Protein-ligand scoring function. This scoring function is the sum of internal ligand energies, protein interaction energy and soft penalties |
| PLANTS Score | Protein-ligand scoring function |
| Re-rank Score | Protein-ligand scoring function |
| Energy Term 1 | Interaction energy between the ligand and the target molecule(s) (Interaction) |
| Energy Term 2 | Interaction energy between the ligand and the co-factor (Cofactor) |
| Energy Term 3 | Interaction energy between the ligand and the protein (Protein) |
| Energy Term 4 | Interaction energy between the ligand and the water molecules (Water) |
| Energy Term 5 | Internal energy of the ligand (Internal) |
| Energy Term 6 | Short-range electrostatic protein-ligand interactions (r < 4.5 Å) (Electro) |
| Energy Term 7 | Long-range electrostatic protein-ligand interactions (r > 4.5 Å) (ElectroLong) |
| Energy Term 8 | Hydrogen bonding energy (HBond) |
| LE1 Score | Ligand Efficiency 1: MolDock Score divided by heavy atom count |
| LE3 Score | Ligand Efficiency 3: Rerank Score divided by heavy atom count |
| Docking Score | Score evaluated before post-processing (either PLANTS or MolDock). Only used for re-docking. |
| Displaced Water Score | Energy contributions from non-displaced and displaced water interactions. |
| AutoDock4 Scoring Function | This scoring function makes use of five energetic terms: the torsional term, the hydrogen bonding interactions, the electrostatic potential, the desolvation energy, and the van der Waals interactions |
| AutoDock Vina Scoring Function | Vina makes use of the following energy terms: Gauss1, Gauss2, repulsion, hydrophobic, hydrogen bond, and torsion. They are defined elsewhere |

List of all scoring functions used in this study.

Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10): 801–812.


Datasets and PDB access codes

HRIC50: a dataset composed of an ensemble of high-resolution crystallographic structures solved to a resolution better than 1.5 Å, for which experimental half-maximal inhibitory concentration (IC50) data are available for the active ligands.

2GG3, 2GG7, 2GG9, 2HU6, 2I5F, 2IKG, 2NMZ, 2NNG, 2OW6, 2PDG, 2PIY, 2PZN, 2QCF, 2R3I, 2W14, 2W3B, 2W9H, 2WUU, 2WZX, 2X5O, 2XPC, 2XU3, 2XU4, 2Y1O, 2Y68, 2YC3, 2YEX, 2YJ2, 2YJ8, 2YJ9, 2YJC, 2YK9, 2YKE, 2YKJ, 3B28, 3B7E, 3B8Z, 3BCJ, 3BLB, 3CBP, 3DCR, 3DD0, 3DN5, 3EJS, 3EJT, 3EJU, 3ESS, 3EWZ, 3EX3, 3F66, 3FCI, 3FS6, 3GHV, 3GHW, 3H5B, 3HHA, 3HJ0, 3HNB, 3HS4, 3HYG, 3I06, 3I33, 3I6C, 3I6O, 3IOG, 3IU7, 3KFA, 3KIG, 3KKU, 3KL6, 3KWZ, 3L14, 3M0I, 3M4H, 3NKK, 3NTZ, 3NU0, 3NU3, 3NWB, 3NXO, 3NXX, 3NZB, 3OND, 3OT3, 3OVX, 3OZS, 3OZT, 3PA3, 3PKA, 3PKB, 3PX8, 3R6T, 3RL4, 3S1Y, 3S71, 3SPK, 3TEM, 3U2C, 3UHM, 3VF3, 3VHV, 3VW9, 3WFG, 3ZSJ, 3ZXH, 4A6V, 4A6W, 4BW1, 4DHR, 4DRI, 4DRN, 4DRO, 4DRQ, 4E4A, 4F3I, 4FH2, 4FLK, 4FYO, 4GCJ, 4GQR, 4GV1, 4HCT, 4HCU, 4HCV, 4HWW, 4HXQ, 4HXS, 4HY4, 4HYI, 4IGH, 4IKU, 4JHT, 4KEB, 4L7G, 4M5R

CDK2IC50: 1GII, 1OIR, 2B53, 2B54, 2R3H, 3IGG, 3LE6, 3PXZ, 3PY0, 3RZB, 4RJ3

List of PDB access codes used for both datasets.

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494: 305–310.

$$\log(IC_{50}) = (\pm)\,5.763674 + 0.000069\,(x \cdot y) + 0.000185\,(x \cdot z) - 0.001090\,(y \cdot z) \;(\pm)\; 0.000040\,x^2$$

where x is the interaction energy between the pose and the protein, y is the internal energy of the ligand (Internal), and z is the hydrogen bonding energy (HBond). The two signs marked (±) are not legible in the source.

| Scoring functions and energy terms | ρ (training set) | p-value (training set) | ρ (test set) | p-value (test set) |
|---|---|---|---|---|
| PLANTS Score | 0.266 | 3.797 × 10^-3 | 0.167 | 2.185 × 10^-1 |
| MolDock Score | 0.284 | 1.939 × 10^-3 | 0.224 | 9.678 × 10^-2 |
| Rerank Score | 0.227 | 1.371 × 10^-2 | 0.109 | 4.219 × 10^-1 |
| Term 1 | 0.334 | 2.305 × 10^-4 | 0.215 | 1.109 × 10^-1 |
| Term 2 | 0.130 | 1.623 × 10^-1 | 0.211 | 1.192 × 10^-1 |
| Term 3 | 0.340 | 1.795 × 10^-4 | 0.147 | 2.810 × 10^-1 |
| Term 4 | 0.214 | 2.032 × 10^-2 | 0.083 | 5.455 × 10^-1 |
| Term 5 | -0.077 | 4.104 × 10^-1 | 0.155 | 2.541 × 10^-1 |
| Term 6 | -0.107 | 2.514 × 10^-1 | -0.179 | 1.871 × 10^-1 |
| Term 7 | 0.134 | 1.511 × 10^-1 | 0.101 | 4.568 × 10^-1 |
| Term 8 | -0.067 | 4.746 × 10^-1 | -0.237 | 7.889 × 10^-2 |
| Polscore0000060 | 0.401 | 7.243 × 10^-6 | 0.328 | 1.363 × 10^-2 |

Results for training and test sets for HRIC50 dataset.

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494: 305–310.


| PDB access code | Active ligand code | Resolution (Å) | IC50 (nM) | log(IC50) | Predicted log(IC50) |
|---|---|---|---|---|---|
| 1GII | 1PU | 2.00 | 260 | -6.585 | -6.463 |
| 1OIR | HDY | 1.91 | 32 | -7.495 | -7.495 |
| 2B53 | D23 | 2.00 | 600 | -6.222 | -6.221 |
| 2B54 | D05 | 1.85 | 20 | -7.699 | -7.699 |
| 2R3H | SCE | 1.50 | 20000 | -4.699 | -3.839 |
| 3IGG | EFQ | 1.80 | 80.75 | -7.093 | -6.196 |
| 3LE6 | 2BZ | 2.00 | 35 | -7.456 | -6.277 |
| 3PXZ | JWS | 1.70 | 5900 | -5.229 | -5.277 |
| 3PY0 | SU9 | 1.75 | 79.25 | -7.101 | -6.678 |
| 3RZB | 02Z | 1.90 | 100000 | -4.000 | -5.779 |
| 4RJ3 | 3QS | 1.63 | 93 | -7.032 | -6.544 |

Experimental and predicted log(IC50) for all structures in the CDK2IC50 dataset.

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494: 305–310.
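The numbers in this table can be checked with a short script. The sketch below converts each IC50 from nM to mol/L before taking the base-10 logarithm (the convention that reproduces the tabulated log(IC50) values) and then computes Spearman's rank correlation between experimental and predicted values. The helper names are illustrative and the implementation is a plain-Python sketch, not the code used in the study.

```python
import math

# (PDB code, IC50 in nM, predicted log(IC50)) copied from the CDK2IC50 table.
DATA = [
    ("1GII", 260.0, -6.463), ("1OIR", 32.0, -7.495), ("2B53", 600.0, -6.221),
    ("2B54", 20.0, -7.699), ("2R3H", 20000.0, -3.839), ("3IGG", 80.75, -6.196),
    ("3LE6", 35.0, -6.277), ("3PXZ", 5900.0, -5.277), ("3PY0", 79.25, -6.678),
    ("3RZB", 100000.0, -5.779), ("4RJ3", 93.0, -6.544),
]


def log_ic50(ic50_nm):
    """log10 of IC50 expressed in mol/L (1 nM = 1e-9 mol/L)."""
    return math.log10(ic50_nm * 1e-9)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


if __name__ == "__main__":
    experimental = [log_ic50(ic50) for _, ic50, _ in DATA]
    predicted = [pred for _, _, pred in DATA]
    print(f"Spearman rho = {spearman(experimental, predicted):.3f}")
```

For these eleven structures the rank correlation works out to about 0.845, the same Spearman value the deck lists for Polscore 60 on the CDK2IC50 dataset, which suggests the predicted column comes from that model.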

| Scoring functions and energy terms (a) | ρ | p-value | R² | p-value |
|---|---|---|---|---|
| Affinity (b) | 0.418 | 2.006 × 10^-1 | 0.237 | 1.289 × 10^-1 |
| Gauss1 (b) | -0.773 | 5.299 × 10^-3 | 0.393 | 3.889 × 10^-2 |
| Gauss2 (b) | -0.645 | 3.196 × 10^-2 | 0.386 | 4.125 × 10^-2 |
| Repulsion (b) | -0.618 | 4.265 × 10^-2 | 0.276 | 9.715 × 10^-2 |
| Hydrophobic (b) | -0.391 | 2.345 × 10^-1 | 0.223 | 1.424 × 10^-1 |
| Hydrogen (b) | -0.730 | 1.069 × 10^-2 | 0.280 | 9.386 × 10^-2 |
| Free Energy (c) | 0.445 | 1.697 × 10^-1 | 0.082 | 3.923 × 10^-1 |
| Final Intermolecular Energy (c) | 0.400 | 2.229 × 10^-1 | 0.082 | 3.923 × 10^-1 |
| vdW+Hbond+desolv Energy (c) | 0.409 | 2.115 × 10^-1 | 0.082 | 3.923 × 10^-1 |
| Electrostatic Energy (c) | -0.209 | 5.372 × 10^-1 | 0.082 | 3.922 × 10^-1 |
| Final Total Internal Energy (c) | 0.588 | 5.725 × 10^-2 | 0.345 | 5.730 × 10^-2 |
| Torsional Free Energy (c) | -0.304 | 3.637 × 10^-1 | 0.106 | 3.298 × 10^-1 |
| MolDock Score (d) | 0.391 | 2.345 × 10^-1 | 0.173 | 2.028 × 10^-1 |
| PLANTS Score (d) | 0.682 | 2.084 × 10^-2 | 0.507 | 1.401 × 10^-2 |
| Rerank Score (d) | -0.591 | 5.558 × 10^-2 | 0.768 | 4.044 × 10^-4 |
| Ligand Efficiency 1 (d) | -0.391 | 2.345 × 10^-1 | 0.183 | 1.888 × 10^-1 |
| Ligand Efficiency 3 (d) | -0.345 | 2.981 × 10^-1 | 0.294 | 8.516 × 10^-2 |
| Polscore 60 (e) | 0.845 | 1.045 × 10^-3 | 0.608 | 4.650 × 10^-3 |

Statistical analysis of predictive power for all structures in the CDK2IC50 dataset.

(a) Energy term and scoring function values were calculated using the crystallographic position for the ligands. (b) Scoring function and energy terms calculated using AutoDock Vina. (c) Scoring function and energy terms calculated using AutoDock 4. (d) Scoring function and energy terms calculated using MVD. (e) Machine learning model generated using SAnDReS.

de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494: 305–310.

Conclusions

Taba and SAnDReS can generate binding affinity prediction models that outperform classical scoring functions (Molegro Virtual Docker, AutoDock 4, AutoDock Vina).

The concept of a scoring function space is an elegant way to develop models targeted to biological systems of interest.

Future Work

SAnDReS 2.0

Release of the stable version of Taba

Application of both approaches to biological systems of interest

Code development

Availability of code and models

https://github.com/azevedolab/

Amauri Duarte da Silva

https://www.facebook.com/azevedolab.net/

Thank you!