Technology Overview    
Prediction Models for Human PK

The use of in silico prediction of ADME/Tox properties is gaining acceptance as a useful assessment tool for early identification of likely drug candidate failures. However, until now, it has been difficult to locate reliable models for the prediction of human pharmacokinetics in silico.

Strand Genomics Pvt. Ltd. has developed five human PK models and offers them either directly as trupk or, in partnership with Bio-Rad Informatics, as part of the KnowItAll Informatics System, ADME/Tox Edition.

Several machine-learning methods, including neural networks, decision trees, and support vector machines, were employed to identify, from 1054 molecular descriptors, a small set that correlated with each pharmacokinetic parameter. The input to the predictors is the 2D structure of a molecule, from which the descriptors used by the models are computed.
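As a rough illustration of what selecting a small set of descriptors can look like, the sketch below ranks descriptors by the absolute value of their Pearson correlation with a PK parameter and keeps the top few. This is a simple correlation filter, not the neural network, decision tree, or support vector machine procedures used to build the Strand models; the function names, descriptor names, and data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_x * var_y)


def select_descriptors(descriptor_table, target, k):
    """Return the names of the k descriptors most correlated with the target.

    descriptor_table maps a descriptor name to its values, one per molecule.
    """
    ranked = sorted(
        descriptor_table,
        key=lambda name: abs(pearson(descriptor_table[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: three descriptors computed for five molecules.
descriptors = {
    "mol_weight": [180.0, 250.0, 310.0, 400.0, 455.0],
    "h_bond_donors": [1.0, 3.0, 2.0, 5.0, 4.0],
    "ring_count": [2.0, 1.0, 2.0, 1.0, 2.0],
}
half_life_hours = [1.5, 2.4, 3.1, 4.2, 4.4]

selected = select_descriptors(descriptors, half_life_hours, k=2)
print(selected)
```

In practice the descriptor values would be computed from the 2D structure by a cheminformatics toolkit, and a filter of this kind would at most be a first step before model fitting.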

Human PK Models
Model: Plasma Protein Binding
Units: Fraction of total drug
Comments: Fraction of the total drug in the plasma that is bound to the plasma proteins.

Model: Bioavailability
Units: Fraction of total oral dose
Comments: Fraction of the total oral dose of a drug that reaches the plasma after absorption in the gut and first-pass metabolism.

Model: Volume of Distribution
Units: L/kg
Comments: Measure of the total distribution volume available to a drug in the body. Low volumes of distribution imply that the drug remains in the plasma, while very high volumes indicate that the drug distributes widely into the various tissue compartments in the body.

Model: Elimination Half-life
Units: Hours
Comments: Half-life of a drug as observed in the plasma.

Model: Rate of Absorption
Units: 1/hour
Comments: Rate at which an orally administered drug is absorbed from the gut into the plasma. Quickly absorbed drugs reach their maximum plasma concentration very soon after dosing.
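To show how these quantities relate to one another, the sketch below combines four of them in the standard one-compartment model with first-order absorption and elimination. This textbook model is used here only for illustration and is not stated to be part of the Strand models; the function names and all parameter values are hypothetical.

```python
from math import exp, log


def elimination_rate(half_life_h):
    """First-order elimination rate constant (1/hour) from half-life (hours)."""
    return log(2) / half_life_h


def time_to_peak(ka, ke):
    """Time (hours) of maximum plasma concentration; requires ka != ke."""
    return log(ka / ke) / (ka - ke)


def plasma_concentration(t, dose_mg, bioavailability, vd_l_per_kg, weight_kg, ka, ke):
    """Plasma concentration (mg/L) at t hours after a single oral dose."""
    volume_l = vd_l_per_kg * weight_kg
    scale = bioavailability * dose_mg * ka / (volume_l * (ka - ke))
    return scale * (exp(-ke * t) - exp(-ka * t))


# Hypothetical parameter values, for illustration only.
ke = elimination_rate(half_life_h=6.0)
t_peak_slow = time_to_peak(ka=0.5, ke=ke)  # slowly absorbed drug
t_peak_fast = time_to_peak(ka=2.0, ke=ke)  # quickly absorbed drug
c_peak_fast = plasma_concentration(
    t_peak_fast, dose_mg=100.0, bioavailability=0.6,
    vd_l_per_kg=1.0, weight_kg=70.0, ka=2.0, ke=ke,
)

print(f"time to peak: slow {t_peak_slow:.2f} h, fast {t_peak_fast:.2f} h")
print(f"peak concentration (fast absorption): {c_peak_fast:.3f} mg/L")
```

With the same half-life, the larger absorption rate gives the earlier peak, which is the behaviour described for quickly absorbed drugs.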

Training Set Profile

The activity values used in the generation of the Strand human PK models were taken from the following sources:

  • FDA CDER Human Drugs Database
  • RxList Drug Information Guide
  • Thummel K.E., Shen D.D., in "Goodman & Gilman's The Pharmacological Basis of Therapeutics", 10/e, Ed. by Hardman J.G., Limbird L.E., Gilman A.G., McGraw-Hill, pp. 1924-2023

Number of Lipinski’s Rule Violations    Number of Molecules
0                                       378
1                                       49
2                                       30
3                                       17
4                                       0

All the molecules used in building the models are commercially available drugs. Based on Lipinski’s classification, 90% of them are considered drug-like. The breakdown by number of rule violations is listed above.

  • Average number of H-bond donors per compound: 2.26
  • Average number of H-bond acceptors per compound: 6.2
  • Average molecular weight: 351
  • Percent of drug-like compounds: 90%
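The 90% figure can be reproduced from the violation counts if "drug-like" is taken to mean at most one violation of Lipinski's rule of five; that reading is an assumption, chosen because it matches the reported percentage. The sketch below also shows how violations are usually counted for one molecule. The function name and the sample descriptor values are hypothetical.

```python
def lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors):
    """Count how many of the four rule-of-five criteria a molecule violates."""
    return sum([
        mol_weight > 500,
        log_p > 5,
        h_bond_donors > 5,
        h_bond_acceptors > 10,
    ])


# Violation counts reported for the training set: {violations: molecules}.
violation_histogram = {0: 378, 1: 49, 2: 30, 3: 17, 4: 0}

total = sum(violation_histogram.values())
# Assumption: a molecule is "drug-like" if it has at most one violation.
drug_like = violation_histogram[0] + violation_histogram[1]
percent_drug_like = 100 * drug_like / total

print(total, round(percent_drug_like, 1))  # 474 90.1

# A hypothetical molecule near the training-set averages has no violations.
print(lipinski_violations(mol_weight=351, log_p=2.0, h_bond_donors=2, h_bond_acceptors=6))  # 0
```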

A self-dissimilarity test was carried out on the training set using 2D structural fingerprints; the self-dissimilarity was determined to be 52%. The maximal dissimilarity observed was 93%.
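The exact definition of self-dissimilarity used here is not given. One common choice, assumed in the sketch below, is the Tanimoto dissimilarity (1 minus the Tanimoto similarity) between fingerprints, summarised over all pairs of molecules in the set. The fingerprints and function names are hypothetical.

```python
from itertools import combinations


def tanimoto_dissimilarity(fp_a, fp_b):
    """1 - Tanimoto similarity for fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union


def dissimilarity_summary(fingerprints):
    """Return (mean, max) dissimilarity over all pairs in the set."""
    values = [
        tanimoto_dissimilarity(a, b) for a, b in combinations(fingerprints, 2)
    ]
    return sum(values) / len(values), max(values)


# Hypothetical fingerprints: each set holds the indices of the bits that are on.
fingerprints = [
    {1, 2, 3, 4},
    {3, 4, 5, 6},
    {1, 2, 3, 7},
]
mean_d, max_d = dissimilarity_summary(fingerprints)
print(f"mean {100 * mean_d:.0f}%, max {100 * max_d:.0f}%")
```

Other summaries, such as the mean nearest-neighbour dissimilarity, are also in use and would give different numbers for the same set.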

Model Characteristics: Training and External Validation


Training Statistics:

 
                                  Classification Accuracy   Regression Accuracy
Model                     N       Low %     High %          % Accurately Predicted    R-squared
Protein Binding           306     99        100             85                        0.93
Bioavailability           185     --        --              89                        0.80
Volume of Distribution    206     99        87              97                        0.86
Elimination Half-life     341     100       100             82                        0.82
Rate of Absorption        193     89        86              97                        0.86
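For readers who want to compute comparable statistics for their own predictions, the sketch below shows the usual definition of R-squared and a per-class accuracy. It assumes that "Low %" and "High %" are the percentages of low-class and high-class molecules that were classified correctly; the source does not define these columns, so that reading, the function names, and the data are all hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def per_class_accuracy(true_labels, predicted_labels, label):
    """Percent of molecules whose true class is `label` that were predicted as `label`."""
    hits = [p == label for t, p in zip(true_labels, predicted_labels) if t == label]
    return 100 * sum(hits) / len(hits)


# Hypothetical regression values (fraction bound) and class labels.
observed = [0.2, 0.5, 0.9, 0.7]
predicted = [0.25, 0.45, 0.85, 0.75]
true_classes = ["low", "low", "high", "high", "high"]
predicted_classes = ["low", "high", "high", "high", "low"]

r2 = r_squared(observed, predicted)
low_pct = per_class_accuracy(true_classes, predicted_classes, "low")
high_pct = per_class_accuracy(true_classes, predicted_classes, "high")
print(f"R-squared {r2:.2f}, low {low_pct:.0f}%, high {high_pct:.0f}%")
```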

Cross Validation Statistics:

 
                                            Classification Accuracy   Regression Accuracy
Model                     Validation Type   Low %     High %          % Accurately Predicted    Robust Q-squared
Protein Binding           N-Fold            75        75              78                        0.87
Bioavailability           N-Fold            --        --              75                        0.65
Volume of Distribution    N-Fold            69        67              89                        0.73
Elimination Half-life     N-Fold            76        85              73                        0.67
Rate of Absorption        N-Fold            78        76              93                        0.78
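The general idea behind an N-fold cross-validated Q-squared can be sketched as follows: each fold is held out in turn, a model is fitted to the remaining data, and the squared errors on the held-out molecules are accumulated. The sketch uses a one-descriptor least-squares line as a stand-in model and the plain definition Q-squared = 1 - PRESS / SS_total. How the "robust" variant reported here is computed is not stated, so this is not a reproduction of it; all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def n_fold_q_squared(xs, ys, n_folds):
    """Cross-validated Q-squared: 1 - PRESS / SS_total over n_folds folds."""
    indices = range(len(xs))
    press = 0.0  # predictive residual sum of squares
    for fold in range(n_folds):
        # Assign every n_folds-th molecule to the held-out fold.
        train = [i for i in indices if i % n_folds != fold]
        held_out = [i for i in indices if i % n_folds == fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        press += sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out)
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot


# Hypothetical descriptor values and observed PK values for six molecules.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

q2 = n_fold_q_squared(descriptor, observed, n_folds=3)
print(f"Q-squared: {q2:.2f}")
```

Because every prediction is made for a molecule the model did not see during fitting, Q-squared is typically lower than the training R-squared, as in the tables here.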

External Validation Statistics:

 
                                  Classification Accuracy   Regression Accuracy
Model                     N       Low %     High %          % Accurately Predicted    R-squared
Protein Binding           74      82        97              75                        0.90
Bioavailability           42      --        --              86                        0.70
Volume of Distribution    53      92        86              79                        0.73
Elimination Half-life     66      85        65              71                        0.72
Rate of Absorption        30      50        90              63                        0.67

 
 
© 2004 Strand Genomics. All Rights Reserved. | trupk@strandgenomics.com