Practical Applications of Deep Learning to Impute
Heterogeneous Drug Discovery Data
Benedict W. J. Irwin*†§, Julian Levell‖, Thomas M. Whitehead‡, Matthew D. Segall*†, Gareth J. Conduit‡§
† Optibrium Limited, Cambridge Innovation Park, Denny End Rd, Cambridge, CB25 9PB, UK
‡ Intellegens Limited, Eagle Labs, 28 Chesterton Road, Cambridge, CB4 3AZ, UK
‖ Constellation Pharmaceuticals Inc., 215 First St Suite 200, Cambridge, MA 02142, USA
§ University of Cambridge, Cavendish Laboratory, 19 JJ Thomson Ave, Cambridge, CB3 0HE, UK
Abstract
Contemporary deep learning approaches still struggle to bring a useful improvement in the field of drug
discovery due to the challenges of sparse, noisy and heterogeneous data that are typically encountered in this
context. We use a state-of-the-art deep learning method, Alchemite™, to impute data from drug discovery
projects, including multi-target biochemical activities, phenotypic activities in cell-based assays, and a variety of
absorption, distribution, metabolism, and excretion (ADME) endpoints. The resulting model gives excellent
predictions for activity and ADME endpoints, offering an average increase in 𝑅2 of 0.22 versus quantitative
structure-activity relationship methods. The model accuracy is robust to combining data across uncorrelated
endpoints and projects with different chemical spaces, enabling a single model to be trained for all compounds
and endpoints. We demonstrate improvements in accuracy on the latest chemistry and data when updating
models with new data as an ongoing medicinal chemistry project progresses.
Introduction
Machine learning and, more recently, deep learning methods, are becoming well established and have been
successful in a variety of scientific and commercial applications 1,2. However, in the field of drug discovery,
training on sparse and often noisy data requires extensive modification to existing algorithms to deliver useful
results 3–5. Recent advances are showing promise using deep learning to predict properties including solubility
6,7, drug induced liver injury 8, target activities 9,10, and many other endpoints 11,12. While each of these models
may be individually good, they are tailored to predict only one specific endpoint, or group of closely related
endpoints. A great deal of human time is also invested to optimize the hyperparameters 13 and architecture 4 of
each model to prevent problems such as overfitting 11,14 and instability with different sizes of dataset 15.
Additionally, the training of deep neural networks can be slow 11,13 and may require significant investment in
hardware 9.
Many modern applications of deep learning in drug discovery are exploring new areas such as compound
generation 16–18 and compound synthesis 19. Meanwhile, realizing the goal of a fully generalized deep learning
quantitative structure-activity relationship (QSAR) model that can be applied to general pharmaceutical project
data, on both large and small scales, with minimal human intervention, has not received the same degree of
attention. There are many pre-deep learning QSAR methods 20 including decision trees and random forests 21–
23, radial basis functions 24, support vector machines 25,26 and Gaussian processes 27–29. Intermediate neural
network methods have a long history, including artificial neural networks (ANN) 11,30 and general regression
neural networks (GRNN) 31.
So far, despite all this effort, attempts to apply traditional deep learning methods such as deep neural networks
9,10 and deep belief networks 7,32 to prediction of experimental drug discovery endpoints, in a practical way that
helps a project progress, have resulted in only small improvement over traditional QSAR modelling methods 33
such as random forests, with an average increase in 𝑅2 coefficient of determination of only 0.043 − 0.051 9 .
Most recently, increases have been seen in the case of graph convolutional networks 34 which can add average
increases of 0.14 to 𝑅2 values 35. Significant improvements over ‘conventional’ machine learning are generally
only seen in large datasets, or in the case of multitask learning where there are strong correlations between the
endpoints 5. The reason this increase is not larger is likely due to challenges that arise when using pharmaceutical
data in conventional approaches. These are problems arising from sparse, noisy, heterogeneous and dynamic
data, that prohibit deep methods from adding their full value.
In this paper, we describe an application of a deep learning method for data imputation, Alchemite™, to an
ongoing drug discovery project. While originally developed and proved in the context of materials discovery 36–
39, success has been seen in an example application of this method to a challenging, public domain benchmark
data set of kinase activity data 40,41. In this benchmark, Alchemite was shown to outperform a range of QSAR
methods, including a multi-task deep neural network trained using TensorFlow 42, and collective matrix
factorization 43. Furthermore, this benchmark demonstrated Alchemite’s ability to focus on the most confident
predictions with a commensurate improvement in accuracy.
While applications to benchmarking data provide proof of concept and a robust comparison with other methods,
these data sets are not representative of the full range of data encountered in the context of drug discovery
projects. In particular, the aforementioned kinase data set comprises only target activity data (expressed as
pIC50 values). In this work, we extend our previous work to apply the Alchemite algorithm to heterogeneous
drug discovery data in a project-based context and explore the temporal evolution of data throughout the
project to solve the challenges outlined above. We will briefly discuss the challenges in solving the practical
issues encountered when modelling drug discovery data using other methods.
Prediction and Imputation
There are distinct differences between the problems of predicting an endpoint based on a complete set of
inputs, e.g. a QSAR regression model, and imputing an endpoint with sparse data, e.g. filling in the gaps in data
for an experimental endpoint. Figure 1 shows a comparison of these two methods. A QSAR regression model is
a function of a full set of complete inputs, i.e. molecular descriptors that can be calculated for every compound.
The sparsity of drug discovery data prevents assay and other experimental values, which may not always be
present, from all being used as inputs for this kind of model. The subset of compounds that has all experimental values present
is generally quite small, and even if a model were to be trained on these data, new measurements must be made
for all inputs in order to make a new prediction. On the contrary, an imputation model can take all existing data
(both molecular descriptors and target experimental endpoints) as inputs to the model and fill in the missing
values using whatever data may be present. If the model is correctly designed, it does not suffer the same
limitation from missing values as the prediction model. If data are present, they can be used, and if they are
missing, they can be predicted.
Figure 1. Comparison of a QSAR model (here a random forest) with the deep imputation process (Alchemite), which
takes both complete descriptor columns and incomplete assay columns as input. These are used by the deep
learning network to fill in the missing values in the assay data columns with an error bar for each data point.
The Challenges of Modelling Drug Discovery Data
For an algorithm or method to get the most out of drug discovery data, it should address a few challenges with
which common methods often struggle:
Missing Data
If one considers all of the compounds and assays in a large pharmaceutical company's corporate collection,
typically only a small fraction (< 1%) of the possible compound-assay endpoint combinations have been
measured in practice. Public domain databases are also sparsely populated; for example, the ChEMBL 44 data
set is just 0.05% complete. Even in the context of an ongoing project, only a small proportion of compounds will
be progressed for more detailed studies, such as measurement of absorption, distribution, metabolism, and
excretion (ADME) properties. We have seen above that the design of an imputation model can use sparse
experimental columns as inputs to a deep algorithm. One limiting factor for the application of deep learning is
the lack of support for this kind of missing data in contemporary methods 45,46. If inputs are not always present,
simple implementations of common algorithms such as neural networks cannot give sensible answers without
significant alteration 46,47. Recent developments, such as the method presented in this study, have taken deep
imputation a step further, working comfortably on datasets with <1% of data present 40.
Uncertainty and Confidence
Experimental data are inherently noisy. Even good-quality pharmaceutical data may have up to one log unit of
variability 26, and some values could be incorrect due to experimental errors or artefacts 48. Furthermore, a
failure to take uncertainty from noisy predictions into account can lead to wasted time and missed opportunities
through misdirection. Conversely, using uncertainties correctly can lead to optimized decisions and a mitigation
of risk 49. A practically useful algorithm should handle explicit uncertainties in the input experimental data and
also give a measure of uncertainty in the predictions it outputs.
Heterogeneous Data
In the course of drug discovery projects, datasets will be generated using a wide variety of assays which cover
target and phenotypic activities, ADME properties, toxicity and physicochemical properties of compounds of
interest. Endpoints may be correlated if they are for the same target under different conditions, related targets
or measurements of the same property in different tissues. More complex assay endpoints, such as phenotypic
responses in cell-based assays, may be correlated with multiple, simpler endpoints such as target activities,
membrane permeability, solubility and protein binding. When these mixed results are separated out into
separate endpoints, the columns in the data matrix become increasingly sparse, making correlations harder to
use without special techniques built for extremely sparse data, for example by Whitehead et al. 40. Another
method that has attempted this is the pQSAR 2.0 method of Martin et al. 41,50. However, previous methods such
as pQSAR have focused on combining similar types of endpoint only, for example all pIC50 values. Few, if any,
methods have yet attempted to make use of correlations from heterogeneous data with a variety of different scales
and distributions, but this is solved automatically in an imputation model as described in Figure 1.
The Temporal Evolution of a Project
Drug design projects evolve with time as the hit- and lead-optimization processes result in an exploration of
chemical space beyond the compounds for which data was previously available. The chemical space of interest
may jump as series are discarded or focus during late lead optimization. Compound activity and other properties
will improve as the project nears its goal, increasing the range of values. Specific assays may become
concentrated and data rich when an issue is being focused on, while other assays become sparser when an issue
is presumed to have been addressed or is no longer relevant. If a model is to be deployed across an entire project
data set, or even across multiple projects, it should be able to handle a multi-scale approach and seamlessly
transition from early hit-based screening to lead development, retraining as more data become available.
The majority of machine learning methods are based around interpolation of training values. A successful method
should continue to add value after the chemistry has evolved. Many models cannot handle temporally split test
data 51 and this is an important validation for whether a method can add real value to an ongoing project.
In the following Methods section, we will describe the Alchemite method and the data sets to which it was
applied in this study. In the Results and Discussion section we will present the results of applications in the
context of an ongoing drug discovery project and the four challenges outlined above. Finally, we will draw some
conclusions and discuss potential future work.
Methods
The Alchemite method is a deep and iterative multiple imputation method that is a novel adaptation of a neural
network in which all inputs are also outputs 36–40. A detailed description of the underlying algorithm is given by
Verpoort et al. 38 and more recently by Whitehead et al. 40. Additional information and description of the
algorithm is given in the supplementary information.
The goal is to solve for the weights and biases of a neural network where some outputs of the neural network
in the first iteration(s) are potentially used as the inputs of subsequent iterations. This is solved iteratively in the
context of a fixed-point equation 𝑓(𝒙) = 𝒙. For the inputs to the first iteration, missing values are replaced by
the mean of the available values of the corresponding endpoint. An iterative expectation maximization
algorithm is applied 52 to converge the weights of the network.
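The fixed-point idea can be illustrated with a deliberately small sketch. It is not the Alchemite algorithm: a one-variable least-squares line stands in for the neural network, and the function names, the `mixing` parameter and the toy data are assumptions made for illustration. Missing assay values start at the column mean and are repeatedly moved towards the model's own predictions until they stop changing, i.e. until 𝑓(𝒙) = 𝒙 for the imputed entries.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def impute_column(descriptor, assay, n_iter=100, mixing=0.5):
    """Fill None entries of `assay` by iterating towards a fixed point.

    `descriptor` is a complete column; `assay` is sparse (None = missing).
    The linear model is a stand-in for the network (an assumption).
    """
    observed = [v for v in assay if v is not None]
    start = mean(observed)  # first iteration: mean of the available values
    current = [start if v is None else v for v in assay]
    missing = [i for i, v in enumerate(assay) if v is None]
    for _ in range(n_iter):
        slope, intercept = fit_line(descriptor, current)
        for i in missing:
            predicted = slope * descriptor[i] + intercept
            # Mix the old estimate with the new prediction.
            current[i] = (1 - mixing) * current[i] + mixing * predicted
    return current


filled = impute_column([1, 2, 3, 4, 5, 6], [3, 5, None, 9, 11, None])
print([round(v, 3) for v in filled])
```

In this toy case the observed values follow assay = 2 × descriptor + 1, so the two missing entries converge to values on that line while the measured entries are left unchanged.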
In the applications described herein, the model will have 𝑁 = 𝑁𝑑 + 𝑁𝑒 inputs and outputs, where
𝑁𝑑 is a number of molecular descriptors and 𝑁𝑒 is the number of experimental assay endpoints. The matrix
columns corresponding to the descriptor inputs will be complete because these can be computed in advance for
any molecular structure. However, the assay endpoint columns may be sparsely occupied; some, or even most,
of the potential experimental data may be missing. The output is a complete matrix of assay endpoints in which
the missing values have been imputed (the process illustrated in Figure 1).
In this work, 200 networks are trained, with the data rows carrying different weights. This is substantially more
than in previous work 36–38, and leads to an ensemble of predictions for each missing value in the dataset. The mean
of these 200 predictions can be used as the predicted value. The standard deviation of the 200 predictions is used
as a measure of uncertainty in that value, giving an error bar for each predicted cell in the imputed matrix.
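As a minimal sketch of this summary step, the ensemble output can be reduced to a mean and a standard deviation per missing cell. The function name and data layout are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the text does not specify which is used.

```python
from statistics import mean, pstdev


def summarize_ensemble(per_network_predictions):
    """Return (prediction, error bar) for each missing cell.

    per_network_predictions[k][j] is the value predicted by network k
    for missing cell j (hypothetical layout for illustration).
    """
    n_cells = len(per_network_predictions[0])
    summary = []
    for j in range(n_cells):
        values = [network[j] for network in per_network_predictions]
        summary.append((mean(values), pstdev(values)))
    return summary


# Three networks (200 in the study), two missing cells.
predictions = [[6.0, 5.0], [7.0, 5.0], [8.0, 5.0]]
for value, error_bar in summarize_ensemble(predictions):
    print(f"{value:.2f} +/- {error_bar:.2f}")
```

A cell on which the networks disagree receives a wide error bar, while a cell on which they all agree receives an error bar close to zero.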
The hyperparameters of the network were optimized using a five-fold cross validation within the training set
data only 53. The tree-structured Parzen estimator 54 from the Python library hyperopt 55 was used. The algorithm
uses a combination of Bayesian inference and non-parametric density estimation to optimize the so-called
expected improvement 54,56. Hyperparameter optimization was applied to the number of inputs for each
endpoint, the number of iteration layers (convergence loop in Figure S2), and the iterative mixing ratio alongside
the hyperparameters of the neural network (Figure S1).
Molecular Descriptors
In this work the number of molecular descriptors was 𝑁𝑑 = 330. The descriptors used included whole-molecule
properties such as molecular weight, lipophilicity, and polar surface area; and structural fragments defined by
SMARTS 57. These descriptors were calculated with the Auto-Modeller™ module of the StarDrop™ software 58
and have previously been used to train successful QSAR models 59. However, any set of numerical descriptors
can be used as input.
QSAR Methods for Comparison
In this work, the Alchemite models will be compared against QSAR models generated with the Auto-Modeller
module in StarDrop 58. For each endpoint, individual models were trained using four common QSAR methods:
Partial least squares (PLS), which describes the target property as a linear combination of latent variables 60 ;
Radial basis functions (RBF), a simple but effective data driven method which approximates the target quantity
as a linear combination of basis functions centered around the training points 61; Random forests (RF), which
trains the split criteria for a collection of 100 randomized decision trees to minimize the variance in predictions
62; and Gaussian process (GP) with fixed hyperparameters, a Bayesian method that draws models using the
posterior distribution of a multivariate Gaussian with a parametric correlation matrix over the training set 28.
Data Sets
Data cleaning was required. Qualified data (i.e. values containing the symbols > or <) were removed from the data
set because preliminary investigations demonstrated that simple inclusion of these data with no qualifier symbol
produced less-stable models. Some of the raw data were transformed onto scales and distributions more
amenable to modelling: IC50 values were transformed by taking the negative log of the IC50 in molar
concentration (pIC50); percentage columns underwent a logit transform such that logit(𝑥) = ln(𝑥/(1 − 𝑥));
and the base 10 logarithm was taken of other ADME endpoints that varied over multiple orders of magnitude.
Summary tables and series information are provided in the Supplementary Materials for compounds in all
datasets. Distributions of experimental data and molecular characteristics are also provided along with
experimental protocols for ADME endpoints.
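The cleaning and transformation steps described above can be sketched as follows. The function names are hypothetical, and two details are assumptions rather than statements from the text: percentages are divided by 100 before the logit is taken, and qualified values are recognised by a leading > or < character.

```python
import math


def is_qualified(raw_value):
    """True for entries such as '>10' or '<0.5', which were removed."""
    return raw_value.strip().startswith((">", "<"))


def pic50_from_ic50(ic50_molar):
    """pIC50 = -log10(IC50), with the IC50 in molar concentration."""
    return -math.log10(ic50_molar)


def logit_from_percent(percent):
    """logit(x) = ln(x / (1 - x)), with x = percent / 100 (assumption)."""
    x = percent / 100.0
    return math.log(x / (1.0 - x))


def percent_from_logit(value):
    """Inverse of logit_from_percent, useful for reading plots."""
    return 100.0 / (1.0 + math.exp(-value))


raw_column = ["1e-9", ">1e-5", "2.5e-7"]
clean = [float(v) for v in raw_column if not is_qualified(v)]
print([round(pic50_from_ic50(v), 2) for v in clean])
print(round(percent_from_logit(2.0), 1))
```

Values of exactly 0% or 100% have no finite logit, so a real pipeline would need to clip or exclude them; how this was handled in the study is not stated here.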
Initial data
Two real project data sets, Project A and Project B, were provided by Constellation Pharmaceuticals 63, with
rows corresponding to anonymized compounds and columns containing sparse experimental data for a
heterogeneous mixture of activity, cell, and ADME endpoints. Project A had already finished; no new data would
be added. Project B was an ongoing project; the data were provided in batches and models iteratively trained
as the project evolved. The targets for the two projects were unrelated, but some of the types of ADME data were
present in both projects. After the modelling work was completed, more details were published about
Project A, which developed inhibitors for EP300/CBP Histone Acetyltransferase (HAT); further details can be
found in the references 64–66.
Table 1. A summary of the initial data received for Projects A and B. The ADME assays were shared between the
datasets. The number of endpoints of each type is shown for each project. The data for each endpoint were
sparse, so the percentage of possible data points of each type that had been measured (% Filled) is also shown.
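| | Number of Compounds | Bioactivity Assays: Number | Bioactivity Assays: % Filled | Cell Assays: Number | Cell Assays: % Filled | ADME Assays: Number | ADME Assays: % Filled |
|---|---|---|---|---|---|---|---|
| Project A | 1241 | 3 | 45 | 2 | 15 | 8 | 16 |
| Project B | 338 | 5 | 55 | 0 | N/A | 8 | 3 |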
The initial data are summarized in Table 1. The activity endpoints included 3 target bioactivity columns over 2 target
isoforms, and 2 cell-based assay columns for Project A; and 5 bioactivity columns over three isoforms for Project B.
The targets of Projects A and B were enzymes from unrelated protein families, and there should be no correlation
between target activities or cross-target activity for compounds designed for each target. The ADME endpoints
included kinetic solubility, permeability measured in a parallel artificial membrane permeability assay (PAMPA),
human and mouse plasma protein binding (PPB), human and mouse liver microsome intrinsic clearance (HLM Clint,
MLM Clint), and reversible cytochrome P450 (CYP) 2D6 and 3A4 inhibition.
The data were split into an 80% training set and a 20% independent test set. The split was stratified randomly
over rows to find the set of training/test rows that had approximately equal data sparsity for all columns
simultaneously. This was required because the ADME columns were so sparse that many purely random splits
would leave an empty test column.
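The exact stratification procedure is not described in detail, so the following is only one plausible sketch of the idea: draw many random row splits and keep the one in which every column's share of measured values in the test set is closest to the target fraction. The function names, the number of trials and the scoring rule are all assumptions.

```python
import random


def filled_counts(rows, indices, n_cols):
    """Number of measured (non-None) values per column over `indices`."""
    return [sum(rows[i][c] is not None for i in indices) for c in range(n_cols)]


def sparsity_balanced_split(rows, test_fraction=0.2, n_trials=1000, seed=0):
    """Pick the random row split whose test set best matches the target
    share of measured values in every column simultaneously."""
    rng = random.Random(seed)
    n_rows, n_cols = len(rows), len(rows[0])
    n_test = max(1, round(test_fraction * n_rows))
    total = filled_counts(rows, range(n_rows), n_cols)
    best_score, best_test = None, None
    for _ in range(n_trials):
        test = rng.sample(range(n_rows), n_test)
        counts = filled_counts(rows, test, n_cols)
        # Worst deviation of any column's test share from the target.
        score = max(
            (abs(counts[c] / total[c] - test_fraction)
             for c in range(n_cols) if total[c] > 0),
            default=0.0,
        )
        if best_score is None or score < best_score:
            best_score, best_test = score, test
    test_rows = sorted(best_test)
    train_rows = [i for i in range(n_rows) if i not in set(test_rows)]
    return train_rows, test_rows


# 20 compounds: column 0 is complete, column 1 is measured for only 5 rows.
data = [[float(i), float(i) if i < 5 else None] for i in range(20)]
train_rows, test_rows = sparsity_balanced_split(data)
print(test_rows)
```

With a purely random split, the sparse column would often contribute no test values at all; selecting on the per-column share avoids that outcome in this toy example.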
Unified vs. Individual Models
To compare the stability of models under different partitioning of the data the following additional models were
trained for comparison with a single, unified model of all data across both projects:
1) Only activity data from Project A
2) Only the activity data from Project B
3) All of the Project A data
4) All of the ADME data from Project A and Project B
5) All of the data from both Project A and Project B
Temporal Data
At the start of the study, the Project B data set contained 338 compounds. As the study progressed, another 874
compounds were added to Project B, sorted by the date on which they were synthesized and registered in the
database, which correlates with the measurement time of assay results. This allowed a temporal split to be made
51. The new compounds were split into three blocks of ~300 compounds, with block 1 being the oldest and block
3 being the newest compounds in the project. The final block often had higher activities and more relevant
ADME data.
Three data splits were generated to allow the construction of three temporal models: Model 1 used all of
the initial data (from Table 1) as a training set; Model 2 used all of the initial data and the first block of
temporally split compounds; and Model 3 used all of the initial data and the first two blocks of temporally
split compounds. All three models were validated against the final unseen block of compounds so that an
independent comparison could be made.
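The temporal split can be sketched as below. The record layout (a dictionary with `id` and an ISO-format `registered` date) and the function names are hypothetical; only the logic of sorting by registration date, cutting into three blocks and holding out the newest block follows the description above.

```python
def temporal_blocks(compounds, n_blocks=3):
    """Sort compounds by registration date and cut into equal-sized blocks,
    oldest block first. ISO dates sort correctly as plain strings."""
    ordered = sorted(compounds, key=lambda c: c["registered"])
    n = len(ordered)
    return [ordered[i * n // n_blocks:(i + 1) * n // n_blocks]
            for i in range(n_blocks)]


def temporal_models(initial, blocks):
    """Training sets for Models 1..n; every model is tested on the last block."""
    test_set = blocks[-1]
    training_sets = []
    for k in range(len(blocks)):
        added = [c for block in blocks[:k] for c in block]
        training_sets.append(initial + added)
    return training_sets, test_set


initial = [{"id": "init-1", "registered": "2018-01-01"}]
new = [{"id": f"c{i}", "registered": f"2019-01-{i:02d}"}
       for i in (6, 1, 4, 2, 5, 3)]
training_sets, test_set = temporal_models(initial, temporal_blocks(new))
print([len(t) for t in training_sets], [c["id"] for c in test_set])
```

Because the newest block never enters any training set, all three models are scored on compounds that were made after everything they were trained on.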
Model Assessment
The quality of the models was assessed using the coefficient of determination (R2), in the range (−∞, 1], (N.B.
This should not be confused with the Pearson correlation coefficient which is in the range [−1,1]). The
coefficient of determination is defined as
$$R^2 = 1 - \frac{\sum_{i=1}^{N} (f_i - y_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2},$$
where 𝑦̅ is the mean of the observed data points, 𝑦𝑖 , and 𝑓𝑖 is the model prediction of data point 𝑦𝑖 . In addition,
the root mean squared error (RMSE) of the results for each endpoint is considered:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (f_i - y_i)^2}.$$
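Both metrics follow directly from the definitions above and can be written in a few lines. The function names and the example numbers are illustrative only.

```python
import math
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = mean(observed)
    ss_res = sum((f - y) ** 2 for f, y in zip(predicted, observed))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((f - y) ** 2 for f, y in zip(predicted, observed)) / n)


observed = [1.0, 2.0, 3.0, 4.0]
print(round(r_squared(observed, [1.1, 1.9, 3.2, 3.8]), 2))  # close predictions
print(round(r_squared(observed, [2.5, 2.5, 2.5, 2.5]), 2))  # always the mean
print(round(r_squared(observed, [4.0, 3.0, 2.0, 1.0]), 2))  # reversed order
print(round(rmse(observed, [1.1, 1.9, 3.2, 3.8]), 3))
```

The three cases show the range of the metric: predictions close to the observations give a value near 1, predicting the observed mean for every point gives exactly 0, and predictions that are worse than the mean give a negative value.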
Results and Discussion
Initial Comparison with QSAR Methods
We compared the multi-target Alchemite method with conventional QSAR models of single endpoints. The QSAR
models are based only on molecular descriptors because they cannot use incomplete experimental data as input.
Table 2. Comparison of Alchemite model performance against single-endpoint machine learning methods for
QSAR on the independent test set for the initial data received from Constellation Pharmaceuticals. The bold
result is the best method in the row.
| Endpoint Name (Merged Data Set) | RF (𝑅2) | RBF (𝑅2) | GP (𝑅2) | PLS (𝑅2) | Alchemite (𝑅2) | 𝑅2 Boost Over Second-best Method |
|---|---|---|---|---|---|---|
| CYP2D6 % Inhibition | 0.26 | 0.37 | 0.40 | 0.08 | 0.63 | +0.23 |
| CYP3A4 % Inhibition | 0.26 | 0.24 | 0.21 | 0.15 | 0.3 | +0.04 |
| HLM Clint | 0.11 | 0.07 | -0.18 | -0.08 | 0.43 | +0.32 |
| Kinetic Solubility | 0.44 | 0.54 | 0.54 | 0.40 | 0.50 | -0.04 |
| MLM Clint | 0.37 | 0.51 | 0.49 | 0.31 | 0.54 | +0.03 |
| PAMPA Permeability | 0.24 | 0.18 | 0.28 | 0.19 | 0.21 | -0.07 |
| ADME PPB% Human | 0.60 | 0.56 | 0.58 | 0.48 | 0.72 | +0.12 |
| ADME PPB% Mouse | 0.47 | 0.49 | 0.53 | 0.56 | 0.63 | +0.07 |
| Project A Bio. 1 | 0.50 | 0.46 | 0.48 | 0.53 | 0.94 | +0.41 |
| Project A Bio. 2 | 0.63 | 0.56 | 0.67 | 0.64 | 0.79 | +0.12 |
| Project A Bio. 3 | 0.50 | 0.25 | 0.46 | 0.54 | 0.92 | +0.38 |
| Project A Cell 1 | 0.62 | 0.72 | 0.71 | 0.73 | 0.84 | +0.11 |
| Project A Cell 2 | -0.29 | -1.2 | -0.48 | -0.27 | 0.57 | +0.84 |
| Project B Bio. 1 | 0.44 | 0.43 | 0.38 | 0.30 | 0.65 | +0.21 |
| Project B Bio. 2 | 0.46 | 0.52 | 0.40 | 0.28 | 0.82 | +0.30 |
| Project B Bio. 3 | 0.53 | 0.45 | 0.44 | 0.37 | 0.82 | +0.29 |
| Project B Bio. 4 | 0.46 | 0.44 | 0.44 | 0.30 | 0.62 | +0.16 |
| Project B Bio. 5 | 0.56 | 0.57 | 0.53 | 0.47 | 0.71 | +0.14 |
From the results in Table 2 we can see that Alchemite adds significant predictive value over single-endpoint
QSAR methods, when comparing the results on the 20% held out test set for the initial data. On average for an
individual endpoint Alchemite adds 0.2 to the 𝑅2 value of the next leading method (range -0.07 to 0.84) and
outperforms the best QSAR model on 16 out of 18 endpoints. Where there is not an improvement, the
performance is effectively equivalent to the best QSAR result.
Figure 2 compares Alchemite with the best QSAR model from the four types shown in Table 2. N.B. it is, strictly
speaking, unfair to compare the best of the test set results against Alchemite, as it would not be known a priori
which model was best. Despite this, Alchemite is still significantly better than this result in almost all endpoints across both
activities and ADME varieties. On average the 𝑅2 for QSAR models is 0.44, and on average the 𝑅2 for Alchemite
models is 0.65.
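The summary figures quoted above can be checked against Table 2 with a few lines of arithmetic. The values below are copied row by row from the Alchemite and boost columns of the table; the best QSAR value for each endpoint is recovered as the Alchemite value minus the boost, and the variable names are illustrative.

```python
from statistics import mean

# Alchemite R2 and boost over the best QSAR method, in Table 2 row order.
alchemite_r2 = [0.63, 0.3, 0.43, 0.50, 0.54, 0.21, 0.72, 0.63, 0.94,
                0.79, 0.92, 0.84, 0.57, 0.65, 0.82, 0.82, 0.62, 0.71]
boost = [0.23, 0.04, 0.32, -0.04, 0.03, -0.07, 0.12, 0.07, 0.41,
         0.12, 0.38, 0.11, 0.84, 0.21, 0.30, 0.29, 0.16, 0.14]

best_qsar_r2 = [a - b for a, b in zip(alchemite_r2, boost)]
mean_alchemite = mean(alchemite_r2)
mean_best_qsar = mean(best_qsar_r2)
mean_boost = mean(boost)
endpoints_improved = sum(b > 0 for b in boost)

print(f"mean Alchemite R2: {mean_alchemite:.2f}")   # 0.65
print(f"mean best-QSAR R2: {mean_best_qsar:.2f}")   # 0.44
print(f"mean boost:        {mean_boost:.2f}")       # 0.20
print(f"endpoints improved: {endpoints_improved} of {len(boost)}")  # 16 of 18
```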
In particular, we can see that the Project A Cell 2 (cell proliferation) results cannot be predicted with
conventional QSAR methods; a negative 𝑅2 indicates a performance that is worse than simply predicting the mean
of the observed test values. This is likely because cell activity not only depends on target protein activity, but also on the
compound reaching the target which will be strongly influenced by physicochemical and ADME properties.
However, assay-assay correlations are strong so when the biochemical assay and ADME results, such as solubility
and permeability, can be used as inputs to the model with Alchemite, there is a significant improvement in the
ability to predict cell-based activity, even though the majority of data are not available for most compounds.
Figure 2. Comparison of the results on the independent test set for the best of four QSAR methods (blue) with
an Alchemite model (orange) built with all of the training data from the initial data set.
Comparison of a Single, Unified Model with Individual Models
Table 3 shows a breakdown of the 𝑅2 performances of models constructed with different subsets of the initial
data, as described under “Unified vs. Individual Models” in the Data Sets section above. There is excellent
agreement between models generated with different combinations of project data sets and endpoints, showing
that it is not necessary to train individual models for different projects or objectives; the single model of both
projects and all data performs equivalently to models built on the individual subsets.
The average coefficient of determination is particularly high on activity models with 𝑅2 = 0.81 for the project
which has completed lead optimization (Project A), and 𝑅2 = 0.73 for the new project which is in hit-to-lead (Project
B). The ADME 𝑅2 values are good, considering the data sparsity (only 16% present) and complexity of the
endpoints. The summary statistics for the model with all of the data are similar to the average of the two models.
Table 3. Summary of five model types to check how robust the algorithm is to data partitioning. Cells with N/A
represent combinations which cannot be measured because of the data split definition.
| Model | ADME Average 𝑅2 | Activity Average 𝑅2 | All Average 𝑅2 |
|---|---|---|---|
| Project A Activity | N/A | 0.81 | 0.81 |
| Project B Activity | N/A | 0.73 | 0.73 |
| Project A All | 0.52 | 0.82 | 0.63 |
| All ADME Data | 0.50 | N/A | 0.50 |
| All Data | 0.50 | 0.77 | 0.65 |
We further drill down into the relative performance in Figure 3 where we compare the models built on individual
data sets (i.e. only Project A or only Project B) versus a model constructed on both data sets simultaneously. We
can see for cell and bioactivity assays that the predictive power of both types of model is virtually identical. On
average, the quality of the models is also the same for ADME endpoints, although there is increased variability.
It should be noted that the individual project model for ADME properties was only built and tested on Project A
because there were insufficient ADME data for Project B with which to build and test an individual model, while
the model built on All Data is built and tested on both Projects A and B. Therefore, these models are compared
on different test sets.
Figure 3. A breakdown of independent test 𝑅2 values across endpoints in the initial dataset. For endpoints
marked with *, the individual project model for ADME properties was built and tested on Project A only.
Selecting the Most Confident Predictions
An ensemble of predictions is generated for each missing element of the data matrix and the distribution of this
ensemble can take many shapes. The mean and the standard deviation of this distribution give a unique
prediction and error bar for each missing value, where the error bar represents one standard deviation about
the mean. In the case where descriptor values or sparse experimental inputs for a new compound extrapolate
beyond the training data, the error bar will grow to show that the algorithm has limited knowledge of that region
of chemical space. Figure 4 shows an example scatter plot of the predicted versus observed activity, Project B
Bioactivity 2 pIC50, for the independent test set of the initial data. We can see the uncertainty estimates as error
bars in the y-axis, which intersect with the identity line in almost all cases. The only significant outlier (red point)
has correctly been assigned a large uncertainty, indicating that the model has determined this to be a low-confidence prediction.
Figure 4. A plot of predicted versus observed Project B – Bioactivity 2 values for the independent test set of the
initial data predictions. The error bars show one standard deviation in the predicted value and the dotted line
shows the identity line of perfect fit. One clear outlier is highlighted in red, which is correctly assigned the highest
uncertainty in prediction.
We can exploit our knowledge of the uncertainties in the predicted values by disregarding those with the highest
uncertainty. We would expect the remaining, more confident, values to have a higher accuracy. In Figure 5 we
analyze the impact of discarding the predictions in increasing order of confidence (i.e. the predictions with the
largest error bars will be discarded first). The RMSE is plotted on the y-axis of the graph, such that low values
indicate more accurate predictions. The orange line shows that, as the least confident predictions are removed,
the RMSE falls sharply, confirming the expected behavior. For this model we can predict around 80% of results
with an RMSE of approximately 0.1 log units.
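The analysis behind this kind of curve can be sketched as follows: sort the test predictions by their error bars, drop the least confident ones, and recompute the RMSE on what remains. The function name and the toy numbers are assumptions for illustration; they are not the project data.

```python
import math


def rmse_of_most_confident(observed, predicted, uncertainty, n_discard):
    """RMSE after discarding the `n_discard` predictions with the
    largest error bars."""
    order = sorted(range(len(observed)), key=lambda i: uncertainty[i])
    kept = order[:len(order) - n_discard]
    if not kept:
        raise ValueError("cannot discard every prediction")
    squared = [(predicted[i] - observed[i]) ** 2 for i in kept]
    return math.sqrt(sum(squared) / len(squared))


observed = [7.0, 6.0, 8.0, 5.0]
predicted = [7.1, 6.1, 7.9, 6.5]
uncertainty = [0.1, 0.2, 0.1, 1.0]  # the poor prediction has the widest bar

for n_discard in range(3):
    value = rmse_of_most_confident(observed, predicted, uncertainty, n_discard)
    print(n_discard, round(value, 3))
```

When the error bars are informative, as in this toy example, removing a single low-confidence prediction reduces the RMSE sharply; if the error bars were uninformative, the curve would instead track the result of removing predictions at random.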
Figure 5. Plot of RMSE of predicted test results when predictions with lowest confidence are removed. The
orange line shows the performance of the Alchemite model. For comparison, the black dotted lines show the
minimum and maximum RMSE achievable as the least- and most-accurate results are removed, i.e. the order
which minimizes or maximizes the RMSE (N.B. in practice this order is not known without measuring against the
test set). The blue shaded region and dashed line indicate the expected results from randomly removing results.
For this endpoint, Alchemite accurately identifies the least confident results, leading to a large improvement in
RMSE when only discarding a few of the predictions.
Temporal Learning and Validation
We will now focus on the additional compounds provided from Constellation Pharmaceuticals as Project B
progressed. Results in this section correspond to the models trained on blocks of data as described under
“Temporal Data” in the Data Sets section.
Figure 6 shows the average 𝑅2 of Models 1, 2, and 3 (bold, black line) and the individual endpoint 𝑅2 values for
the same models (fine, colored lines), for predictions on an independent test set corresponding to the most
recent block of compounds and associated data. The average 𝑅2 increases linearly, showing constant
improvement with additional project data. The breakdown shows a reduction in the variance of model
performances, and a general tendency for models to pass above the 𝑅2 = 0.7 line (a threshold for a very good
model). Initially only activity models are above this line; by the third model even ADME properties are being
predicted with this high level of accuracy. A small number of endpoints do not increase in performance, notably
the CYP inhibition endpoints that are some of the sparsest and most complex ADME endpoints in this dataset.
Figure 6. The coefficient of determination (𝑅2 ) of Models 1, 2 and 3 on an independent test set corresponding
to the most recent block of compounds and associated data (Block 3), as more data are added temporally across
the project. Bold, black: the average coefficient across all endpoints. Fine, colors: The coefficient for each
endpoint with some examples given.
To provide further insight, we now focus on the model predictions for human plasma protein binding (Figure
7). There are two classes of compounds in the test set: 1) many moderate binders and 2) four strong binders.
Model 1 has limited ability to distinguish between these two classes, with a great deal of overlap in the error
bars. With only 19 more training points in Model 2, the predictions for the strong binders improve and the error
bars allow the compounds to be more confidently distinguished. By the third model, with 42 further training
points, the 𝑅2 has increased significantly and the model can distinguish all four strong binders.
Figure 7. Plots of predictions with error bars by Models 1, 2 and 3 (left to right) for human plasma protein binding
on the independent test set corresponding to the most recent compounds and associated data (Block 3). 𝑅2
values, training set sizes and the identity (black) and best fit (grey) lines are shown on each plot. The logit
transform was applied to the percent bound data. 12 compounds have 𝑙𝑜𝑔𝑖𝑡(𝑃𝑃𝐵) ≤ 2, which
corresponds to 𝑃𝑃𝐵 < 88%, and 4 compounds have 𝑙𝑜𝑔𝑖𝑡(𝑃𝑃𝐵) > 4, which corresponds to 𝑃𝑃𝐵 > 98%. The
highest 2 compounds have 𝑙𝑜𝑔𝑖𝑡(𝑃𝑃𝐵) ≥ 5.5, which corresponds to 𝑃𝑃𝐵 > 99.6%.
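The correspondence between logit values and percent bound quoted in this caption can be checked with a short sketch. It assumes the logit is taken with the natural logarithm of the bound fraction, logit(p) = ln(p / (1 - p)) with p = PPB / 100; the function names are illustrative only.

```python
import math

def logit_percent(ppb_percent):
    """Logit of a percent-bound value, using the natural logarithm (assumed)."""
    p = ppb_percent / 100.0
    return math.log(p / (1.0 - p))

def inverse_logit_percent(z):
    """Map a logit value back to percent bound."""
    return 100.0 / (1.0 + math.exp(-z))

# Thresholds quoted in the Figure 7 caption.
for z in (2.0, 4.0, 5.5):
    print(f"logit = {z}: PPB = {inverse_logit_percent(z):.1f}%")
```

Under this assumption, logit values of 2, 4 and 5.5 map to roughly 88.1%, 98.2% and 99.6% bound, consistent with the thresholds in the caption. The transform stretches the scale near 100%, which is what separates the strong binders from the moderate ones.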
We now focus on the data-rich Project B bioactivity 2 endpoint, shown in Figure 8. There are more training points
for this activity column, and Models 1, 2, and 3 progressively improve from 𝑅2 = 0.73 through to an excellent
model with 𝑅2 = 0.93. The uncertainties in the predictions for actives reduce greatly by the third model due to
the large amount of training data. There were very few training examples with activity greater than 8, thus the
model begins to extrapolate effectively on the far right-hand side of the plot.
Figure 8. Plots of predictions with error bars by Models 1, 2 and 3 (left to right) for the Project B Bioactivity 2
endpoint on the independent test set corresponding to the most recent compounds and associated data (Block
3). 𝑅2 values, training set sizes and identity (black) and best fit (grey) lines are shown on each plot.
Figure 9 shows the breakdown of the accuracy of model predictions on an independent test set for models
generated and tested with all of the data received. For a consistent comparison with the initial model, an 80:20
stratified split was applied, as for the initial data set. The average 𝑅2 from the best of four QSAR methods for
each of the endpoints was now 0.50, which had improved from the previous value of 0.44. This shows that the
QSAR methods had used the additional information to improve the model quality. The final Alchemite average
𝑅2 was 0.72, which had improved from 0.65 for the initial set, providing an average improvement of 0.22 over
QSAR models on this final data set.
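The comparison metric used here (best of four QSAR methods per endpoint, then averaged over endpoints) can be written out explicitly. The sketch below is an assumption about how the aggregation is performed, based only on the description in the text; the function names are hypothetical and no values from the study are embedded in the code.

```python
import numpy as np

def average_best_of(r2_by_method):
    """Mean over endpoints of the best R^2 achieved by any baseline method.

    r2_by_method: array-like of shape (n_methods, n_endpoints).
    """
    r2 = np.asarray(r2_by_method, dtype=float)
    return float(r2.max(axis=0).mean())

def average_gain(r2_imputation, r2_by_method):
    """Average R^2 of the imputation model minus the best-of-baselines average."""
    return float(np.mean(r2_imputation)) - average_best_of(r2_by_method)
```

With the averages reported above, the gain on the final data set is 0.72 - 0.50 = 0.22.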
Notably, there are now five bioactivity models at or above the excellent 𝑅2 = 0.9 threshold. Alchemite has
retained strong models for Project A endpoints as more data are added for Project B.
Figure 9. Comparison of the results on the independent test set for best of four QSAR methods (blue) with an
Alchemite model (orange) built with all of the training data using an 80:20 stratified random split on the final
data set. This plot can be compared to Figure 2 to inspect the improvement in models with more data.
Conclusions
We have demonstrated a flexible deep learning algorithm that can be used for wide-scale and general-purpose
data imputation in the context of an ongoing drug discovery project. It can handle multiple, potentially unrelated
inputs and build stable models that outperform conventional QSAR methods by using incomplete experimental
data as input to learn transferable assay-assay correlations. It is also notable that this method still outperforms
QSAR in the limit of a smaller data set, representative of a medicinal chemistry project. This contrasts with other
deep learning methods, which have seen more marginal improvements and generally require much larger data sets.
We considered the application of this method in relation to the challenges of dealing with sparse, noisy and
heterogeneous data in the context of an evolving drug discovery project.
We have seen that an Alchemite model can be trained for data spanning multiple projects and a variety of diverse
endpoints and the quality of predictions was very similar when compared to separate models. This shows promise
in its ability to capture information at multiple levels of resolution in a single model. The most notable examples
where imputation added much greater value over QSAR were for complex endpoints, such as cell-based assays,
that likely required a combination of experimental and descriptor inputs to make a meaningful model.
Furthermore, we showed that the confidence estimates in individual predictions enable the most accurate
predictions to be identified for individual endpoints. This outcome has now been seen for both homogeneous
data 40 and heterogeneous data in this study.
Finally, we illustrated the application of Alchemite to evolving project data, demonstrating that as more data
become available the model can be retrained, resulting in rapidly improving accuracy on the most recent
chemistry and experimental data. This enables these models to be applied to augment an ongoing project
and to guide the choice of the next most valuable experiment to perform.
Supporting Information:
The supporting information includes a description of the data set in terms of chemical diversity, chemical series,
distributions and tables of common chemical properties and assay values. Although the code for Alchemite
is not in the public domain due to IP restrictions, readers are encouraged to use the email addresses below if they
would like assistance or further information about understanding or reproducing the method.
Corresponding Author Information:
Benedict W. J. Irwin, ben@optibrium.com
Matthew D. Segall, matt@optibrium.com
Notes:
BWJI and MDS are employees of Optibrium Ltd., which produces the StarDrop software. TMW and GJC are
employees of Intellegens Ltd. JRL is an employee of Constellation Pharmaceuticals Inc.
References
(1)
LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521 (7553), 436–444. https://doi.org/10.1038/nature14539.
(2)
Schmidhuber, J. Deep Learning in Neural Networks: An Overview. Neural Networks 2015, 61, 85–117.
https://doi.org/10.1016/j.neunet.2014.09.003.
(3)
Chen, H.; Engkvist, O.; Wang, Y.; Olivecrona, M.; Blaschke, T. The Rise of Deep Learning in Drug
Discovery. Drug Discov. Today 2018, 23 (6), 1241–1250. https://doi.org/10.1016/j.drudis.2018.01.039.
(4)
Ramsundar, B.; Liu, B.; Wu, Z.; Verras, A.; Tudor, M.; Sheridan, R. P.; Pande, V. Is Multitask Deep
Learning Practical for Pharma? J. Chem. Inf. Model. 2017, 57 (8), 2068–2076.
https://doi.org/10.1021/acs.jcim.7b00146.
(5)
Mayr, A.; Klambauer, G.; Unterthiner, T.; Steijaert, M.; Wegner, J. K.; Ceulemans, H.; Clevert, D.-A.;
Hochreiter, S. Large-Scale Comparison of Machine Learning Methods for Drug Target Prediction on
ChEMBL. Chem. Sci. 2018, 9 (24), 5441–5451. https://doi.org/10.1039/C8SC00148K.
(6)
Lusci, A.; Pollastri, G.; Baldi, P. Deep Architectures and Deep Learning in Chemoinformatics: The
Prediction of Aqueous Solubility for Drug-like Molecules. J. Chem. Inf. Model. 2013, 53 (7), 1563–1575.
https://doi.org/10.1021/ci400187y.
(7)
Li, H.; Yu, L.; Tian, S.; Li, L.; Wang, M.; Lu, X. Deep Learning in Pharmacy: The Prediction of Aqueous
Solubility Based on Deep Belief Network. Autom. Control Comput. Sci. 2017, 51 (2), 97–107.
https://doi.org/10.3103/s0146411617020043.
(8)
Xu, Y.; Dai, Z.; Chen, F.; Gao, S.; Pei, J.; Lai, L. Deep Learning for Drug-Induced Liver Injury. J. Chem. Inf.
Model. 2015, 55 (10), 2085–2093. https://doi.org/10.1021/acs.jcim.5b00238.
(9)
Ma, J.; Sheridan, R. P.; Liaw, A.; Dahl, G. E.; Svetnik, V. Deep Neural Nets as a Method for Quantitative
Structure-Activity Relationships. J. Chem. Inf. Model. 2015, 55 (2), 263–274.
https://doi.org/10.1021/ci500747n.
(10)
Xu, Y.; Ma, J.; Liaw, A.; Sheridan, R. P.; Svetnik, V. Demystifying Multitask Deep Neural Networks for
Quantitative Structure-Activity Relationships. J. Chem. Inf. Model. 2017, 57 (10), 2490–2504.
https://doi.org/10.1021/acs.jcim.7b00087.
(11)
Baskin, I. I.; Winkler, D.; Tetko, I. V. A Renaissance of Neural Networks in Drug Discovery. Expert Opin.
Drug Discov. 2016, 11 (8), 785–795. https://doi.org/10.1080/17460441.2016.1201262.
(12)
Halberstam, N. M.; Baskin, I. I.; Palyulin, V. A.; Zefirov, N. S. Neural Networks as a Method for
Elucidating Structure–Property Relationships for Organic Compounds. Russ. Chem. Rev. 2003, 72 (7),
629–649. https://doi.org/10.1070/RC2003v072n07ABEH000754.
(13)
Hessler, G.; Baringhaus, K.-H. Artificial Intelligence in Drug Design. Molecules 2018, 23 (10), 2520.
https://doi.org/10.3390/molecules23102520.
(14)
Tetko, I. V; Livingstone, D. J.; Luik, A. I. Neural Network Studies. 1. Comparison of Overfitting and
Overtraining. J. Chem. Inf. Model. 1995, 35 (5), 826–833. https://doi.org/10.1021/ci00027a006.
(15)
Varnek, A.; Marcou, G.; Baskin, I.; Pandey, A. K. Inductive Transfer of Knowledge: Application of Multi-Task
Learning and Feature Net Approaches to Model Tissue-Air Partition Coefficients. 2009, 133–144.
(16)
Gómez-Bombarelli, R.; Wei, J. N.; Duvenaud, D.; Hernández-Lobato, J. M.; Sánchez-Lengeling, B.;
Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T. D.; Adams, R. P.; Aspuru-Guzik, A. Automatic Chemical
Design Using a Data-Driven Continuous Representation of Molecules. ACS Cent. Sci. 2018, 4 (2), 268–
276. https://doi.org/10.1021/acscentsci.7b00572.
(17)
Segler, M. H. S.; Kogej, T.; Tyrchan, C.; Waller, M. P. Generating Focused Molecule Libraries for Drug
Discovery with Recurrent Neural Networks. ACS Cent. Sci. 2018, 4 (1), 120–131.
https://doi.org/10.1021/acscentsci.7b00512.
(18)
De Cao, N.; Kipf, T. MolGAN: An Implicit Generative Model for Small Molecular Graphs. 2018.
(19)
Segler, M. H. S.; Preuss, M.; Waller, M. P. Planning Chemical Syntheses with Deep Neural Networks and
Symbolic AI. Nature 2018, 555 (7698), 604–610. https://doi.org/10.1038/nature25978.
(20)
Lo, Y.; Rensi, S. E.; Torng, W.; Altman, R. B. Machine Learning in Chemoinformatics and Drug Discovery.
Drug Discov. Today 2018, 23 (8), 1538–1546. https://doi.org/10.1016/j.drudis.2018.05.010.
(21)
Gao, C.; Cahya, S.; Nicolaou, C. A.; Wang, J.; Watson, I. A.; Cummins, D. J.; Iversen, P. W.; Vieth, M.
Selectivity Data: Assessment, Predictions, Concordance, and Implications. J. Med. Chem. 2013, 56 (17),
6991–7002. https://doi.org/10.1021/jm400798j.
(22)
Schürer, S. C.; Muskal, S. M. Kinome-Wide Activity Modeling from Diverse Public High-Quality Data
Sets. J. Chem. Inf. Model. 2013, 53 (1), 27–38. https://doi.org/10.1021/ci300403k.
(23)
Christmann-Franck, S.; van Westen, G. J. P.; Papadatos, G.; Beltran Escudie, F.; Roberts, A.; Overington,
J. P.; Domine, D. Unprecedently Large-Scale Kinase Inhibitor Set Enabling the Accurate Prediction of
Compound–Kinase Activities: A Way toward Selective Promiscuity by Design? J. Chem. Inf. Model.
2016, 56 (9), 1654–1675. https://doi.org/10.1021/acs.jcim.6b00122.
(24)
Zakharov, A. V.; Peach, M. L.; Sitzmann, M.; Nicklaus, M. C. A New Approach to Radial Basis Function
Approximation and Its Application to QSAR. J. Chem. Inf. Model. 2014, 54 (3), 713–719.
https://doi.org/10.1021/ci400704f.
(25)
Shahlaei, M.; Fassihi, A. QSAR Analysis of Some 1-(3,3-Diphenylpropyl)-Piperidinyl
Amides and Ureas as CCR5 Inhibitors Using Genetic Algorithm-Least Square Support Vector Machine.
2013, 4384–4400. https://doi.org/10.1007/s00044-012-0430-2.
(26)
Barrett, S. J.; Langdon, W. B. Advances in the Application of Machine Learning Techniques in Drug
Discovery , Design and Development SVM Applications in Pharmaceuticals Research. 2004.
(27)
Burden, F. R. Quantitative Structure-Activity Relationship Studies Using Gaussian Processes. 2001,
830–835. https://doi.org/10.1021/ci000459c.
(28)
Obrezanova, O.; Csányi, G.; Gola, J. M. R.; Segall, M. D. Gaussian Processes: A Method for Automatic
QSAR Modeling of ADME Properties. J. Chem. Inf. Model. 2007, 47 (5), 1847–1857.
https://doi.org/10.1021/ci7000633.
(29)
Obrezanova, O.; Segall, M. D. Gaussian Processes for Classification: QSAR Modeling of ADMET and
Target Activity. J. Chem. Inf. Model. 2010, 50 (6), 1053–1061. https://doi.org/10.1021/ci900406x.
(30)
Myint, K.; Wang, L.; Tong, Q.; Xie, X. Molecular Fingerprint-Based Artificial Neural Networks QSAR for
Ligand Biological Activity Predictions. Mol. Pharm. 2012, 9 (10), 2912–2923.
https://doi.org/10.1021/mp300237z.
(31)
Shahlaei, M.; Sabet, R.; Ziari, M. B.; Moeinifard, B.; Fassihi, A.; Karbakhsh, R. QSAR Study of Anthranilic
Acid Sulfonamides as Inhibitors of Methionine Aminopeptidase-2 Using LS-SVM and GRNN Based on
Principal Components. Eur. J. Med. Chem. 2010, 45 (10), 4499–4508.
https://doi.org/10.1016/j.ejmech.2010.07.010.
(32)
Ghasemi, F.; Mehridehnavi, A.; Fassihi, A.; Pérez-Sánchez, H. Deep Neural Network in QSAR Studies
Using Deep Belief Network. Appl. Soft Comput. 2018, 62 (October), 251–258.
https://doi.org/10.1016/j.asoc.2017.09.040.
(33)
Dearden, J. C. The History and Development of Quantitative Structure-Activity Relationships (QSARs).
Int. J. Quant. Struct. Relationships 2017, 2 (2), 36–46. https://doi.org/10.4018/IJQSPR.2017070104.
(34)
Feinberg, E. N.; Sur, D.; Wu, Z.; Husic, B. E.; Mai, H.; Li, Y.; Sun, S.; Yang, J.; Ramsundar, B.; Pande, V. S.
PotentialNet for Molecular Property Prediction. ACS Cent. Sci. 2018, 4 (11), 1520–1530.
https://doi.org/10.1021/acscentsci.8b00507.
(35)
Feinberg, E. N.; Sheridan, R.; Joshi, E.; Pande, V. S.; Cheng, A. C. Step Change Improvement in ADMET
Prediction with PotentialNet Deep Featurization. 2019.
(36)
Conduit, B. D.; Jones, N. G.; Stone, H. J.; Conduit, G. J. Design of a Nickel-Base Superalloy Using a
Neural Network. Mater. Des. 2017, 131, 358–365. https://doi.org/10.1016/j.matdes.2017.06.007.
(37)
Conduit, B. D.; Jones, N. G.; Stone, H. J.; Conduit, G. J. Probabilistic Design of a Molybdenum-Base Alloy
Using a Neural Network. Scr. Mater. 2018, 146, 82–86.
https://doi.org/10.1016/j.scriptamat.2017.11.008.
(38)
Verpoort, P. C.; MacDonald, P.; Conduit, G. J. Materials Data Validation and Imputation with an
Artificial Neural Network. Comput. Mater. Sci. 2018, 147, 176–185.
https://doi.org/10.1016/j.commatsci.2018.02.002.
(39)
Santak, P.; Conduit, G. Predicting Physical Properties of Alkanes with Neural Networks. Fluid Phase
Equilib. 2019, 112259. https://doi.org/10.1016/j.fluid.2019.112259.
(40)
Whitehead, T. M.; Irwin, B. W. J.; Hunt, P.; Segall, M. D.; Conduit, G. J. Imputation of Assay Bioactivity
Data Using Deep Learning. J. Chem. Inf. Model. 2019, 59 (3), 1197–1204.
https://doi.org/10.1021/acs.jcim.8b00768.
(41)
Martin, E. J.; Polyakov, V. R.; Tian, L.; Perez, R. C. Profile-QSAR 2.0: Kinase Virtual Screening Accuracy
Comparable to Four-Concentration IC50s for Realistically Novel Compounds. J. Chem. Inf. Model. 2017,
57 (8), 2077–2088. https://doi.org/10.1021/acs.jcim.7b00166.
(42)
Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard,
M.; et al. TensorFlow: A System for Large-Scale Machine Learning. 2016.
(43)
Singh, A. P.; Gordon, G. J. Relational Learning via Collective Matrix Factorization. 2008.
(44)
Bento, A. P.; Gaulton, A.; Hersey, A.; Bellis, L. J.; Chambers, J.; Davies, M.; Krüger, F. A.; Light, Y.; Mak,
L.; McGlinchey, S.; et al. The ChEMBL Bioactivity Database: An Update. Nucleic Acids Res. 2014, 42 (D1),
D1083–D1090. https://doi.org/10.1093/nar/gkt1031.
(45)
Rubin, D. B. Inference and Missing Data. Biometrika 1976, 63 (3), 581–592.
(46)
Smieja, M.; Struski, Ł.; Tabor, J.; Zieliński, B.; Spurek, P. Processing of Missing Data by Neural Networks.
Adv. Neural Inf. Process. Syst. 2018, 2018-Decem (Section 4), 2719–2729.
(47)
Tresp, V.; Ahmad, S.; Neuneier, R. Training Neural Networks with Deficient Data. Adv. Neural Inf.
Process. Syst. 1994, 6. https://doi.org/10.1.1.23.6971.
(48)
Yang, J. J.; Ursu, O.; Lipinski, C. A.; Sklar, L. A.; Oprea, T. I.; Bologa, C. G. Badapple: Promiscuity Patterns
from Noisy Evidence. J. Cheminform. 2016, 1–14. https://doi.org/10.1186/s13321-016-0137-3.
(49)
Segall, M. D.; Champness, E. J. The Challenges of Making Decisions Using Uncertain Data. J. Comput.
Aided. Mol. Des. 2015, 29 (9), 809–816. https://doi.org/10.1007/s10822-015-9855-2.
(50)
Martin, E. J.; Polyakov, V. R.; Zhu, X.-W.; Mukherjee, P.; Tian, L.; Liu, X. All-Assay-Max2 PQSAR: Activity
Predictions as Accurate as 4-Concentration IC50s for 8,558 Novartis Assays. bioRxiv 2019, No. 4218,
620864. https://doi.org/10.1101/620864.
(51)
Sheridan, R. P. Time-Split Cross-Validation as a Method for Estimating the Goodness of Prospective
Prediction. J. Chem. Inf. Model. 2013, 53 (4), 783–790. https://doi.org/10.1021/ci400084k.
(52)
McLachlan, G.; Krishnan, T. The EM Algorithm and Extensions, 2nd Edition. 2008.
(53)
Marron, J. S. A Comparison of Cross-Validation Techniques in Density Estimation. Ann. Stat. 1987, 15
(1), 152–162.
(54)
Bergstra, J.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for Hyper-Parameter Optimization. Adv.
Neural Inf. Process. Syst. 2011, 2546–2554. https://doi.org/2012arXiv1206.2944S.
(55)
Bergstra, J.; Komer, B.; Eliasmith, C.; Yamins, D.; Cox, D. D. Hyperopt: A Python Library for Model
Selection and Hyperparameter Optimization. Comput. Sci. Discov. 2015, 8 (1).
https://doi.org/10.1088/1749-4699/8/1/014008.
(56)
Jones, D. R. A Taxonomy of Global Optimization Methods Based on Response Surfaces. 2001, 345–383.
(57)
Daylight SMARTS https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html (accessed Dec
16, 2019).
(58)
StarDrop™ (accessed Dec 16, 2019).
(59)
Hunt, P. A.; Segall, M. D.; Tyzack, J. D. WhichP450: A Multi-Class Categorical Model to Predict the
Major Metabolising CYP450 Isoform for a Compound. J. Comput. Aided. Mol. Des. 2018, 32 (4), 537–
546. https://doi.org/10.1007/s10822-018-0107-0.
(60)
Wold, S.; Sjostrom, M.; Eriksson, L. PLS Method. In The Encyclopedia of Computational Chemistry;
Schleyer, P., Allinger, N., Clark, T., Gasteiger, J., Kollman, P., S., Ed.; John Wiley and Sons: Chichester,
UK, 1999; pp 1–16.
(61)
Introduction. In Radial Basis Functions: Theory and Implementations; Buhmann, M. D., Ed.; Cambridge
Monographs on Applied and Computational Mathematics; Cambridge University Press: Cambridge,
2003; pp 1–10. https://doi.org/10.1017/CBO9780511543241.002.
(62)
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
(63)
Constellation Pharmaceuticals https://www.constellationpharma.com/ (accessed Dec 16, 2019).
(64)
Gardberg, A. S.; Huhn, A. J.; Cummings, R.; Bommi-Reddy, A.; Poy, F.; Setser, J.; Vivat, V.; Brucelle, F.;
Wilson, J. Make the Right Measurement: Discovery of an Allosteric Inhibition Site for P300-HAT. Struct.
Dyn. 2019, 6 (5), 054702. https://doi.org/10.1063/1.5119336.
(65)
Wilson, J. E.; Huhn, A.; Gardberg, A. S.; Poy, F.; Brucelle, F.; Vivat, V.; Patel, G.; Patel, C.; Cummings, R.;
Sims, R.; et al. Early Drug Discovery Efforts Towards the Identification of EP300/CBP Histone
Acetyltransferase (HAT) Inhibitors. ChemMedChem 2020, cmdc.202000007.
https://doi.org/10.1002/cmdc.202000007.
(66)
Wilson, J. E.; Patel, G.; Patel, C.; Brucelle, F.; Huhn, A.; Gardberg, A. S.; Poy, F.; Cantone, N.; Bommi-Reddy,
A.; Sims, R. J.; et al. Discovery of CPI-1612: A Potent, Selective, and Orally Bioavailable
EP300/CBP Histone Acetyltransferase Inhibitor. ACS Med. Chem. Lett. 2020, acsmedchemlett.0c00155.
https://doi.org/10.1021/acsmedchemlett.0c00155.