Conformal Prediction to define applicability domain – a case study on predicting ER and AR binding
Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
Umeå University, Faculty of Science and Technology, Department of Chemistry.ORCID iD: 0000-0002-3601-2797
Umeå University, Faculty of Science and Technology, Department of Chemistry.
2016 (English). In: SAR and QSAR in environmental research (Print), ISSN 1062-936X, E-ISSN 1029-046X, Vol. 27, no 4, p. 303-316. Article in journal (Refereed). Published
Abstract [en]

A fundamental element in deriving a robust and predictive in silico model is not only the statistical quality of the model but, equally important, the estimate of its predictive boundaries. This work presents conformal prediction as a new method for applicability domain estimation in the field of endocrine disruptors. The method is applied to binders and non-binders of the oestrogen and androgen receptors. Ensembles of decision trees are used as the statistical method, and three descriptor sets (Dragon, RDKit and signature fingerprints) are investigated as chemical descriptors. Conformal prediction yields valid models with an excellent balance in quality between the internally validated training set and the corresponding external test set, both in terms of validity and with respect to sensitivity and specificity. With this method the user can readily alter the level of confidence and immediately inspect the consequences. Furthermore, the conformal prediction framework rigorously defines the predictive boundaries of the derived models, so no ambiguity exists as to the level of similarity that new compounds need in order to fall within the boundaries where reliable predictions can be expected.
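The role of the user-chosen confidence level can be illustrated with a minimal sketch of class-conditional (Mondrian) inductive conformal prediction. This is not the authors' implementation: the function names, the choice of nonconformity score (for example, one minus the fraction of trees voting for a class) and all numbers are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, Set


def p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Fraction of calibration nonconformity scores at least as large as the
    test score, with the test object itself counted (the +1 terms)."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(
    calibration: Dict[str, List[float]],
    test_scores: Dict[str, float],
    significance: float,
) -> Set[str]:
    """Keep every label whose class-conditional p-value exceeds the
    significance level; the confidence level is 1 - significance."""
    return {
        label
        for label, score in test_scores.items()
        if p_value(calibration[label], score) > significance
    }


# Hypothetical nonconformity scores per class (made-up numbers), e.g.
# 1 - (fraction of trees voting for the class) on a held-out calibration set.
calibration = {
    "binder": [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.80],
    "non-binder": [0.02, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.50],
}
# Hypothetical scores of one new compound under each candidate label.
new_compound = {"binder": 0.35, "non-binder": 0.65}

for significance in (0.05, 0.20, 0.70):
    labels = prediction_set(calibration, new_compound, significance)
    print(f"confidence {1 - significance:.2f}: {sorted(labels)}")
```

In this toy example, raising the confidence level widens the prediction set until both labels are returned, while lowering it can leave the set empty. An empty set can be read as a compound for which no label conforms to the calibration data at that confidence level.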

Place, publisher, year, edition, pages
Taylor & Francis Group, 2016. Vol. 27, no 4, p. 303-316
Keywords [en]
Conformal prediction, oestrogen receptor, androgen receptor, random forest, signature descriptors
National Category
Chemical Sciences Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-120255; DOI: 10.1080/1062936X.2016.1172665; ISI: 000375443100001; Scopus ID: 2-s2.0-84963808622; OAI: oai:DiVA.org:umu-120255; DiVA, id: diva2:927607
Available from: 2016-05-12 Created: 2016-05-12 Last updated: 2023-03-24. Bibliographically approved
In thesis
1. A step forward in using QSARs for regulatory hazard and exposure assessment of chemicals
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title[sv]
Ett steg framåt i användandet av QSARs för regulatorisk riskbedömning och bedömning av exponeringen till kemikalier
Abstract [en]

According to the REACH regulation, chemicals produced in or imported to the European Union need to be assessed to manage the risk of potential hazard to human health and the environment. The increasing number of chemicals in commerce prompts the need for faster and cheaper alternative methods for this assessment, such as quantitative structure-activity or structure-property relationships (QSARs or QSPRs). QSARs and QSPRs are models that seek correlations between data on chemicals' molecular structure and a specific activity or property, such as environmental fate characteristics and (eco)toxicological effects.

The aim of this thesis was to evaluate and develop models for the hazard assessment of industrial chemicals and the exposure assessment of pharmaceuticals. The focus was on the identification of chemicals potentially demonstrating carcinogenic (C), mutagenic (M), or reprotoxic (R) effects and endocrine disruption, on the importance of metabolism in hazard identification, and on the understanding of adsorption of ionisable chemicals to sludge, with implications for the fate of pharmaceuticals in waste water treatment plants (WWTPs). Issues related to QSARs, including consensus modelling, applicability domain, and ionisation of input structures, were also addressed.

The main findings presented herein are as follows:

  • QSARs were successful in identifying almost all carcinogens and most mutagens but performed worse in predicting chemicals toxic to reproduction.
  • Metabolic activation is a key event in the identification of potentially hazardous chemicals, particularly for chemicals demonstrating estrogen (E) and transthyretin (T) related alterations of the endocrine system, but also for mutagens. The accuracy of currently available metabolism simulators is rather low for industrial chemicals. However, when combined with QSARs, the tool was found useful in identifying chemicals that demonstrated E- and T-related effects in vivo.
  • We recommend using a consensus approach in the final judgement about a compound's toxicity, that is, combining QSAR-derived data to reach a consensus prediction. This is particularly useful for models based on data from slightly different molecular events or species.
  • QSAR models need to have well-defined applicability domains (AD) to ensure their reliability, which can be achieved by, e.g., the conformal prediction (CP) method. By providing confidence metrics, CP allows better control over the predictive boundaries of QSAR models than other distance-based AD methods.
  • Pharmaceuticals can interact with sewage sludge through different intermolecular forces, on which the ionisation state also has an impact. The developed models showed that sorption of neutral and positively charged pharmaceuticals was mainly hydrophobicity-driven but also affected by Pi-Pi and dipole-dipole forces. In contrast, negatively charged molecules predominantly interacted via covalent bonding and ion-ion, ion-dipole, and dipole-dipole forces.
  • Using ionised structures in multivariate modelling of sorption to sludge did not improve model performance for positively and negatively charged species, but we noted an improvement for neutral chemicals that may be due to a more correct description of zwitterions.
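The consensus recommendation in the findings above can be sketched as a simple vote across models. The thesis summary does not state which combination rule was used, so the majority rule, the function name `consensus`, the `min_agreement` parameter and the labels below are all illustrative assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus(predictions: Sequence[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the label predicted by more than `min_agreement` of the models,
    or None when no label reaches that share (no consensus)."""
    if not predictions:
        return None
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes / len(predictions) > min_agreement else None


# Hypothetical outputs of three QSAR models for the same compound.
print(consensus(["toxic", "toxic", "non-toxic"]))  # prints: toxic
# A tie between two models gives no consensus.
print(consensus(["toxic", "non-toxic"]))  # prints: None
```

Returning None on a tie keeps an inconclusive result visible rather than forcing a label, which may suit cases where the models are built on data from different molecular events or species.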

 

Overall, the results provided insights into the current weaknesses and strengths of QSAR approaches in hazard and exposure assessment of chemicals. QSARs have great potential to serve as commonly used tools in hazard identification to predict the various responses demanded in chemical safety assessment. In combination with other tools, they can provide a foundation for integrated testing strategies that gather and generate information about a compound's toxicity and provide insights into its potential hazard. The obtained results also show that QSARs can be utilized for pattern recognition that facilitates a better understanding of phenomena related to the fate of chemicals in WWTPs.

Abstract [sv, English translation]

According to the REACH chemicals legislation, chemicals produced in or imported into the European Union must be risk-assessed with respect to hazards to health and the environment. The growing number of chemicals used in society calls for faster and cheaper alternative risk assessment methods, such as quantitative structure-activity or structure-property relationships (QSARs or QSPRs). QSARs and QSPRs are computer models that seek correlations between data on chemicals' structure-related properties and, for example, their persistence or (eco)toxic effects.

The aim of this thesis was to evaluate and develop models for the risk assessment of industrial chemicals and pharmaceuticals in order to study how QSARs/QSPRs can improve the risk assessment process. The thesis focused on developing methods for identifying potentially carcinogenic (C), mutagenic (M), or reprotoxic (R) chemicals and endocrine-active chemicals, on studying the importance of metabolism in risk assessment, and on increasing our understanding of the adsorption of ionisable chemicals to sewage sludge. The thesis also addresses consensus modelling, the description of model validity (applicability domain), and the importance of ionisation for chemical descriptors.

The main results presented in the thesis are:

  • QSAR models identified almost all carcinogens and most mutagens but were worse at identifying chemicals toxic to reproduction.
  • Metabolic activation is of great importance in identifying potentially toxic chemicals, especially for chemicals demonstrating estrogen-related (E) and thyroid-related (T) alterations of the endocrine system, but also for mutagens. The accuracy of the available metabolism simulators is rather low for industrial chemicals, but in combination with QSARs the tool was useful for identifying chemicals that demonstrated E- and T-related effects in vivo.
  • We recommend using consensus modelling in in silico based assessment of chemical toxicity, i.e. creating a weighted overall prediction based on several QSAR models. This is especially useful for models based on data from partly different mechanisms or species.
  • QSAR models must have a well-defined applicability domain (AD) to guarantee their reliability, which can be achieved with, e.g., the conformal prediction (CP) method. The CP method gives better control over the predictive boundaries of QSAR models than other distance-based AD methods.
  • Pharmaceuticals can interact with sewage sludge through different intermolecular forces, which are also affected by the ionisation state. The models showed that adsorption of neutral and positively charged pharmaceuticals was mainly hydrophobicity-driven but also affected by Pi-Pi and dipole-dipole forces. Negatively charged molecules interacted with sludge mainly via covalent bonding and ion-ion, ion-dipole, and dipole-dipole forces.
  • Chemical descriptors based on ionised molecules did not improve the performance of adsorption models for positive and negative ions, but we noted an improvement in models for neutral substances that may be due to a more correct description of zwitterions.

In summary, the results showed the strengths and weaknesses of QSAR models as tools in risk and exposure assessment of chemicals. QSARs have great potential for broad use in hazard identification and for predicting the variety of responses required in the risk assessment of chemicals. In combination with other tools, QSARs can provide data for integrated assessments in which data from different methods are weighed together. The results obtained also show that QSARs can be used to assess, and give a better understanding of, the fate of chemicals in waste water treatment plants.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2016. p. 72
Keywords
QSAR, in silico, non-testing tools, risk assessment, exposure assessment, hazard assessment, carcinogenicity, mutagenicity, reproductive toxicity, endocrine disruption, estrogen, androgen, transthyretin, sorption
National Category
Chemical Sciences Computer and Information Sciences
Research subject
biology, Environmental Science; Computer Science; Toxicology; Statistics
Identifiers
URN: urn:nbn:se:umu:diva-120223; ISBN: 978-91-7601-504-9
Public defence
2016-06-03, KB3B1, KBC-huset, Umeå, 13:00 (English)
Available from: 2016-05-13 Created: 2016-05-12 Last updated: 2018-06-07. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Rybacka, Aleksandra; Andersson, Patrik L.
