RS-Predictor models augmented with SMARTCyp reactivities: robust metabolic regioselectivity predictions for nine CYP isozymes

Publication: Contribution to journal, Journal article, Research, peer-reviewed

  • Jed Zaretzki
  • Patrik Rydberg
  • Charles Bergeron
  • Kristin P Bennett
  • Lars Olsen
  • Curt Mark Breneman
RS-Predictor is a tool for creating pathway-independent, isozyme-specific site of metabolism (SOM) prediction models using any set of known cytochrome P450 substrates and metabolites. Until now, the RS-Predictor method was only trained and validated on CYP 3A4 data, but in the present study we report on the versatility of the RS-Predictor modeling paradigm by creating and testing regioselectivity models for substrates of the nine most important CYP isozymes. Through curation of source literature, we have assembled 680 substrates distributed among CYPs 1A2, 2A6, 2B6, 2C19, 2C8, 2C9, 2D6, 2E1 and 3A4, which we believe is the largest publicly accessible collection of P450 ligands and metabolites ever released. A comprehensive investigation into the importance of different descriptor classes for predicting the regioselectivity of each isozyme is made through the generation of multiple independent RS-Predictor models for each set of isozyme substrates. Two of these models include a DFT reactivity descriptor derived from SMARTCyp. Optimal combinations of RS-Predictor and SMARTCyp are shown to have stronger performance than either method alone, while also exceeding the accuracy of the commercial regioselectivity prediction methods distributed by StarDrop and Schrödinger, correctly identifying a large proportion of the metabolites in each substrate set within the top two rank-positions: 1A2(83.0%), 2A6(85.7%), 2B6(82.1%), 2C19(86.2%), 2C8(83.8%), 2C9(84.5%), 2D6(85.9%), 2E1(82.8%), 3A4(82.3%) and merged(86.0%). Comprehensive datamining of each substrate set and careful statistical analyses of the predictions made by the different models revealed new insights into molecular features that control metabolic regioselectivity and enable accurate prospective prediction of likely SOMs.
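The percentages above are reported for the top two rank-positions. As a rough illustration of how such a figure could be computed, the sketch below assumes that each substrate has one score per candidate atom, that a higher score means a more likely SOM, and that a substrate counts as correctly predicted when at least one observed SOM falls within the top k ranks. The function names, the data layout, and the toy scores are hypothetical and are not taken from RS-Predictor or SMARTCyp; ties are not handled specially.

```python
from typing import Dict, List, Set, TypedDict


class Substrate(TypedDict):
    scores: Dict[str, float]  # hypothetical per-atom scores, higher = more likely SOM
    soms: Set[str]            # atoms labeled as observed sites of metabolism


def top_k_hit(scores: Dict[str, float], soms: Set[str], k: int = 2) -> bool:
    """True if any observed SOM is among the k highest-scored atoms."""
    ranked = sorted(scores, key=lambda atom: scores[atom], reverse=True)
    return any(atom in soms for atom in ranked[:k])


def top_k_accuracy(substrates: List[Substrate], k: int = 2) -> float:
    """Percentage of substrates with at least one observed SOM in the top k."""
    if not substrates:
        raise ValueError("need at least one substrate")
    hits = sum(top_k_hit(s["scores"], s["soms"], k) for s in substrates)
    return 100.0 * hits / len(substrates)


# Toy data (invented for illustration only, not from the published data set).
toy_substrates: List[Substrate] = [
    {"scores": {"C1": 0.9, "C2": 0.5, "N3": 0.1}, "soms": {"C2"}},
    {"scores": {"C1": 0.2, "C2": 0.8, "O3": 0.7}, "soms": {"C1"}},
]

toy_top2 = top_k_accuracy(toy_substrates, k=2)
```

In this toy set only the first substrate has an observed SOM among its two highest-scored atoms, so `toy_top2` evaluates to 50.0.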
Journal: Journal of Chemical Information and Modeling
Issue number: 6
Pages (from-to): 1637-1659
Status: Published - 2012

ID: 38165494