A step forward in using QSARs for regulatory hazard and exposure assessment of chemicals
Umeå University, Faculty of Science and Technology, Department of Chemistry. ORCID iD: 0000-0002-3601-2797
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title
Ett steg framåt i användandet av QSARs för regulatorisk riskbedömning och bedömning av exponeringen till kemikalier (Swedish)
Abstract [en]

Under the REACH regulation, chemicals produced in or imported into the European Union must be assessed to manage the risk of potential hazard to human health and the environment. The increasing number of chemicals in commerce creates a need for faster and cheaper alternative assessment methods, such as quantitative structure-activity or structure-property relationships (QSARs or QSPRs). QSARs and QSPRs are models that seek correlations between data on a chemical's molecular structure and a specific activity or property, such as environmental fate characteristics and (eco)toxicological effects.

The aim of this thesis was to evaluate and develop models for the hazard assessment of industrial chemicals and the exposure assessment of pharmaceuticals. The focus was on identifying chemicals that potentially demonstrate carcinogenic (C), mutagenic (M), or reprotoxic (R) effects or endocrine disruption, on the importance of metabolism in hazard identification, and on understanding the adsorption of ionisable chemicals to sludge, with implications for the fate of pharmaceuticals in waste water treatment plants (WWTPs). Issues related to QSARs, including consensus modelling, applicability domain, and ionisation of input structures, were also addressed.

The main findings presented herein are as follows:

  • QSARs were successful in identifying almost all carcinogens and most mutagens but performed worse in predicting chemicals toxic to reproduction.
  • Metabolic activation is a key event in the identification of potentially hazardous chemicals, particularly for chemicals demonstrating estrogen (E)- and transthyretin (T)-related alterations of the endocrine system, but also for mutagens. The accuracy of currently available metabolism simulators is rather low for industrial chemicals. However, when combined with QSARs, the tool was found useful in identifying chemicals that demonstrated E- and T-related effects in vivo.
  • We recommend a consensus approach for the final judgement of a compound’s toxicity, that is, combining QSAR-derived data to reach a consensus prediction. This is particularly useful for models based on data on slightly different molecular events or species.
  • QSAR models need well-defined applicability domains (AD) to ensure their reliability, which can be achieved with, e.g., the conformal prediction (CP) method. By providing confidence metrics, CP allows better control over the predictive boundaries of QSAR models than other distance-based AD methods.
  • Pharmaceuticals can interact with sewage sludge through different intermolecular forces, which are also affected by the ionisation state. The developed models showed that sorption of neutral and positively charged pharmaceuticals was mainly hydrophobicity-driven but also affected by Pi-Pi and dipole-dipole forces. In contrast, negatively charged molecules predominantly interacted via covalent bonding and ion-ion, ion-dipole, and dipole-dipole forces.
  • Using ionised structures in multivariate modelling of sorption to sludge did not improve model performance for positively and negatively charged species, but we noted an improvement for neutral chemicals that may be due to a more correct description of zwitterions.


Overall, the results provided insights into the current weaknesses and strengths of QSAR approaches in hazard and exposure assessment of chemicals. QSARs have great potential to serve as commonly used tools in hazard identification for predicting the various responses demanded in chemical safety assessment. In combination with other tools, they can provide a foundation for integrated testing strategies that gather and generate information about a compound’s toxicity and give insight into its potential hazard. The results also show that QSARs can be used for pattern recognition that facilitates a better understanding of phenomena related to the fate of chemicals in WWTPs.

Abstract [sv, translated]

According to the chemicals legislation REACH, chemicals produced in or imported into the European Union must be risk-assessed with respect to hazards to health and the environment. The growing number of chemicals used in society calls for faster and cheaper alternative risk assessment methods, such as quantitative structure-activity or structure-property relationships (QSARs or QSPRs). QSARs and QSPRs are computational models that seek correlations between data on chemicals' structure-related properties and, for example, the persistence or (eco)toxic effects of chemicals.

The aim of this thesis was to evaluate and develop models for the risk assessment of industrial chemicals and pharmaceuticals in order to study how QSARs/QSPRs can improve the risk assessment process. The thesis focused on developing methods for identifying potentially carcinogenic (C), mutagenic (M), or reprotoxic (R) chemicals and endocrine-active chemicals, on studying the importance of metabolism in risk assessment, and on increasing our understanding of the adsorption of ionisable chemicals to sewage sludge. The thesis also addresses consensus modelling, the description of model validity, and the importance of ionisation for chemical descriptors.

The main results presented in the thesis are:

  • QSAR models identified almost all carcinogenic substances and most mutagens but were worse at identifying chemicals toxic to reproduction.
  • Metabolic activation is of great importance in the identification of potentially toxic chemicals, especially for chemicals that show estrogen-related (E) and thyroid-related (T) alterations of the endocrine system, but also for mutagens. The accuracy of the available metabolism simulators is rather low for industrial chemicals, but in combination with QSARs the tool was useful for identifying chemicals that showed E- and T-related effects in vivo.
  • We recommend using consensus modelling in in silico based assessment of the toxicity of chemicals, i.e. creating a combined prediction based on several QSAR models. This is especially useful for models based on data from partly different mechanisms or species.
  • QSAR models must have a well-defined applicability domain (AD) to guarantee their reliability, which can be achieved with, e.g., the conformal prediction (CP) method. The CP method gives better control over the predictive boundaries of QSAR models than other distance-based AD methods.
  • Pharmaceuticals can interact with sewage sludge through different intermolecular forces, which are also affected by the ionisation state. The models showed that the adsorption of neutral and positively charged pharmaceuticals was mainly hydrophobicity-driven but also affected by Pi-Pi and dipole-dipole forces. Negatively charged molecules interacted with sludge mainly via covalent bonding and ion-ion, ion-dipole, and dipole-dipole forces.
  • Chemical descriptors based on ionised molecules did not improve the performance of adsorption models for positive and negative ions, but we noted an improvement in models for neutral substances, which may be due to a more correct description of zwitterions.

In summary, the results revealed the strengths and weaknesses of QSAR models as tools for risk and exposure assessment of chemicals. QSARs have great potential for broad use in hazard identification and for predicting a range of responses required in the risk assessment of chemicals. In combination with other tools, QSARs can supply data for integrated assessments in which data from different methods are weighed together. The results also show that QSARs can be used to assess, and give a better understanding of, the fate of chemicals in wastewater treatment plants.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2016. 72 p.
Keyword [en]
QSAR, in silico, non-testing tools, risk assessment, exposure assessment, hazard assessment, carcinogenicity, mutagenicity, reproductive toxicity, endocrine disruption, estrogen, androgen, transthyretin, sorption
National Category
Chemical Sciences; Computer and Information Science
Research subject
biology, Environmental Science; Computer Science; Toxicology; Statistics
URN: urn:nbn:se:umu:diva-120223
ISBN: 978-91-7601-504-9
OAI: diva2:927389
Public defence
2016-06-03, KB3B1, KBC-huset, Umeå, 13:00 (English)
Available from: 2016-05-13 Created: 2016-05-12 Last updated: 2016-05-26 Bibliographically approved
List of papers
1. On the Use of In Silico Tools for Prioritising Toxicity Testing of the Low-Volume Industrial Chemicals in REACH
2014 (English). In: Basic & Clinical Pharmacology & Toxicology, ISSN 1742-7835, E-ISSN 1742-7843, Vol. 115, no 1, 77-87 p. Article, review/survey (Refereed). Published
Abstract [en]

This study was conducted to evaluate the utility of a selection of commercially and freely available non-testing tools and to analyse how REACH registrants can apply these as prioritisation tools for low-volume chemicals. The analysis was performed on a set of organic industrial chemicals and pesticides with extensive peer-reviewed risk assessment data. The in silico model systems analysed included Derek Nexus, Toxtree, QSAR Toolbox, LAZAR, TEST and VEGA, and their results were compared with expert-judged risk classification according to the classification, labelling and packaging (CLP) regulation. The most reliable results were obtained for carcinogenicity; predictions for mutagenicity and reproductive toxicity were less reliable. A group of compounds frequently predicted as false negatives was identified. These were relatively small molecules with low structural complexity, for example benzene derivatives with hydroxyl-, amino- or aniline-substituents. A rat liver S9 metabolite simulator was applied to illustrate the importance of considering metabolism in the risk assessment procedure. We also discuss the outcome of combining predictions from multiple model systems and advise on how to apply in silico tools. These models are proposed for prioritising low-volume chemicals for testing within the REACH legislation, and we conclude that further guidance is needed so that industry can select and apply models in a reliable, systematic and transparent way.

National Category
Organic Chemistry
urn:nbn:se:umu:diva-91186 (URN); 10.1111/bcpt.12193 (DOI); 000337583400011 ()
Available from: 2014-07-23 Created: 2014-07-21 Last updated: 2016-05-12 Bibliographically approved
2. Identifying potential endocrine disruptors among industrial chemicals and their metabolites - development and evaluation of in silico tools
2015 (English). In: Chemosphere, ISSN 0045-6535, E-ISSN 1879-1298, Vol. 139, 372-378 p. Article in journal (Refereed). Published
Abstract [en]

The aim of this study was to improve the identification of endocrine disrupting chemicals (EDCs) by developing and evaluating in silico tools that predict interactions at the estrogen (E) and androgen (A) receptors, and binding to transthyretin (T). In particular, the study focuses on evaluating the use of the EAT models in combination with a metabolism simulator to study the significance of bioactivation for endocrine disruption. Balanced accuracies of the EAT models were 77-87%, 62-77%, and 65-89% for E-, A-, and T-binding, respectively. The developed models were applied to a set of more than 6000 commonly used industrial chemicals, of which 9% were predicted E- and/or A-binders and 1% were predicted T-binders. The numbers of E- and T-binders increased 2- and 3-fold, respectively, after metabolic transformation, while the number of A-binders changed only marginally. In-depth validation confirmed that several of the predicted bioactivated E- or T-binders demonstrated in vivo estrogenic activity or influenced blood levels of thyroxine in vivo. The metabolite simulator was evaluated using in vivo data from the literature, which showed 50% accuracy for the studied chemicals. In summary, the study stresses the importance of including metabolic activation in prioritization activities for potentially emerging contaminants.
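
The balanced accuracies quoted above can be read as the mean of sensitivity and specificity. The following is a minimal sketch of that standard definition, assuming binary labels (1 = binder, 0 = non-binder); the function name and the example labels are hypothetical and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = binder, 0 = non-binder)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # binders predicted as binders
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-binders predicted as non-binders
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed binders
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical example: 4 binders and 6 non-binders.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(round(balanced_accuracy(observed, predicted), 3))
```

Unlike plain accuracy, this measure is not inflated when non-binders greatly outnumber binders, which is typically the case in screening sets of industrial chemicals.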

Keyword [en]
Endocrine disruptor, Metabolism, Estrogen, Androgen, Transthyretin, QSAR
National Category
Other Earth and Related Environmental Sciences
urn:nbn:se:umu:diva-110546 (URN); 10.1016/j.chemosphere.2015.07.036 (DOI); 000361868000049 (); 26210185 (PubMedID)
Available from: 2015-11-10 Created: 2015-10-23 Last updated: 2016-05-12 Bibliographically approved
3. CERAPP: Collaborative Estrogen Receptor Activity Prediction Project
2016 (English). In: Environmental Health Perspectives, ISSN 0091-6765, E-ISSN 1552-9924, Vol. 124, no 7, 1023-1033 p. Article in journal (Refereed). Published
Abstract [en]

Background: Humans are exposed to thousands of man-made chemicals in the environment. Some chemicals mimic natural endocrine hormones and, thus, have the potential to be endocrine disruptors. Most of these chemicals have never been tested for their ability to interact with the estrogen receptor (ER). Risk assessors need tools to prioritize chemicals for evaluation in costly in vivo tests, for instance, within the EPA Endocrine Disruptor Screening Program (EDSP).

Objectives: Here, we describe a large-scale modeling project called CERAPP (Collaborative Estrogen Receptor Activity Prediction Project) and demonstrate the efficacy of using predictive computational models trained on high-throughput screening data to evaluate thousands of chemicals for ER-related activity and prioritize them for further testing.

Methods: CERAPP combined multiple models developed in collaboration among 17 groups in the United States and Europe to predict ER activity of a common set of 32,464 chemical structures. Quantitative structure-activity relationship models and docking approaches were employed, mostly using a common training set of 1,677 chemical structures provided by US EPA, to build a total of 40 categorical and 8 continuous models for binding, agonist, and antagonist ER activity. All predictions were evaluated on a set of 7,522 chemicals curated from the literature. To overcome the limitations of single models, a consensus was built by weighting models on scores based on their evaluated accuracies.

Results: Individual model scores ranged from 0.69 to 0.85, showing high prediction reliabilities. Out of the 32,464 chemicals, the consensus model predicted 4,001 chemicals (12.3%) as high priority actives and 6,742 potential actives (20.8%) to be considered for further testing.

Conclusion: This project demonstrated the feasibility of screening large libraries of chemicals using a consensus of different in silico approaches. This concept will be applied in future projects related to other endpoints.
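
The idea of weighting individual models by their evaluation scores can be illustrated with a small sketch. This is not the CERAPP scoring procedure, whose details are not given in the abstract: the model names, the score values assigned to them, the treatment of missing predictions, and the 0.5 decision threshold are all assumptions made for illustration.

```python
def weighted_consensus(predictions, scores, threshold=0.5):
    """Combine binary calls from several models into one weighted call.

    predictions: dict of model name -> 1 (active), 0 (inactive) or None (no prediction)
    scores:      dict of model name -> evaluation score used as the weight
    Returns (call, weighted_fraction_active), or (None, None) if no model predicted.
    """
    total_weight = 0.0
    active_weight = 0.0
    for name, call in predictions.items():
        if call is None:          # model could not predict this chemical: skip it
            continue
        weight = scores[name]
        total_weight += weight
        active_weight += weight * call
    if total_weight == 0.0:
        return None, None
    fraction = active_weight / total_weight
    return (1 if fraction > threshold else 0), fraction


# Hypothetical models and scores (not taken from the paper).
scores = {"model_a": 0.85, "model_b": 0.69, "model_c": 0.75}
call, fraction = weighted_consensus({"model_a": 1, "model_b": 0, "model_c": 1}, scores)
print(call, round(fraction, 3))
```

Skipping models that return no prediction keeps a chemical from being penalised merely because it falls outside one model's coverage; other designs, such as counting a missing call as inactive, would be equally easy to express.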

Place, publisher, year, edition, pages
Environmental Health Perspectives, 2016
National Category
Chemical Sciences; Computer Science
urn:nbn:se:umu:diva-120258 (URN); 10.1289/ehp.1510267 (DOI); 000380749300025 ()
Available from: 2016-05-12 Created: 2016-05-12 Last updated: 2016-10-24
4. Conformal Prediction to define applicability domain – a case study on predicting ER and AR binding
2016 (English). In: SAR and QSAR in environmental research (Print), ISSN 1062-936X, E-ISSN 1029-046X, Vol. 27, no 4, 303-316 p. Article in journal (Refereed). Published
Abstract [en]

A fundamental element when deriving a robust and predictive in silico model is not only the statistical quality of the model in question but, equally important, the estimate of its predictive boundaries. This work presents a new method, conformal prediction, for applicability domain estimation in the field of endocrine disruptors. The method is applied to binders and non-binders related to the oestrogen and androgen receptors. Ensembles of decision trees are used as the statistical method, and three different sets (Dragon, RDKit and signature fingerprints) are investigated as chemical descriptors. The conformal prediction method results in valid models with an excellent balance in quality between the internally validated training set and the corresponding external test set, both in terms of validity and with respect to sensitivity and specificity. With this method, the level of confidence can be readily altered by the user and the consequences immediately inspected. Furthermore, the predictive boundaries of the derived models are rigorously defined by the conformal prediction framework, so no ambiguity exists as to the level of similarity needed for new compounds to be inside or outside the boundaries within which reliable predictions can be expected.
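
How a user-chosen confidence level translates into a prediction can be sketched with a class-conditional (Mondrian) inductive conformal classifier. This is a generic illustration and not the implementation used in the paper: the nonconformity measure (one minus the predicted class probability), the calibration scores, and the function names are assumptions, and the class probabilities are taken as given from some underlying classifier such as an ensemble of decision trees.

```python
def p_value(calibration_scores, new_score):
    """Share of calibration compounds at least as nonconforming as the new compound."""
    at_least = sum(1 for s in calibration_scores if s >= new_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(class_probs, calibration, significance):
    """Return every label that cannot be rejected at the given significance level.

    class_probs:  dict label -> probability from the underlying model for the new compound
    calibration:  dict label -> nonconformity scores (1 - probability of the true class)
                  for calibration compounds belonging to that label
    significance: tolerated error rate, e.g. 0.2 for 80% confidence
    """
    region = set()
    for label, prob in class_probs.items():
        nonconformity = 1.0 - prob
        if p_value(calibration[label], nonconformity) > significance:
            region.add(label)
    return region


# Hypothetical calibration scores and model output for one new compound.
calibration = {
    "binder": [0.1, 0.2, 0.3, 0.4, 0.6],
    "non-binder": [0.05, 0.1, 0.2, 0.3, 0.5],
}
probs = {"binder": 0.75, "non-binder": 0.25}
print(prediction_set(probs, calibration, significance=0.2))
print(sorted(prediction_set(probs, calibration, significance=0.1)))
```

A single-label set is a confident call, a set with both labels means the compound cannot be assigned at that confidence, and an empty set suggests the compound resembles neither class in the calibration data. Lowering the significance level can only enlarge the set, which is the trade-off the user inspects when changing the confidence.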

Place, publisher, year, edition, pages
Taylor & Francis Group, 2016
Keyword [en]
Conformal prediction, oestrogen receptor, androgen receptor, random forest, signature descriptors
National Category
Chemical Sciences; Computer Science
urn:nbn:se:umu:diva-120255 (URN); 10.1080/1062936X.2016.1172665 (DOI); 000375443100001 ()
Available from: 2016-05-12 Created: 2016-05-12 Last updated: 2016-06-20 Bibliographically approved
5. Considering ionic state in modelling sorption of pharmaceuticals to sewage sludge
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Partitioning of chemicals between particulate matter and water in sewage treatment plants provides essential information on the fate of chemicals and is particularly challenging for pharmaceuticals, which are frequently present in ionized form. The aim of this study was to investigate how ionization state affects partitioning to sludge of active pharmaceutical ingredients (APIs). In addition, we investigated the use of chemical descriptors based on ionized structures to improve our understanding of the underlying mechanisms of sludge sorption and for use in quantitative structure-property relationship (QSPR) models. We collected KD values for 110 APIs, which were classified as neutral, positive, or negative at pH 7. The models with the highest performance exceeded 0.75 for R2Y and 0.65 for Q2. We found that neutral and positively charged APIs share dominant intermolecular forces with sludge, i.e., hydrophobic, Pi-Pi and dipole-dipole interactions. In contrast, hydrophobicity-driven interactions were of little importance for negatively charged APIs, and sorption was mainly driven by covalent bonding and by ion-ion, ion-dipole, and dipole-dipole interactions. The performance of the models increased by 5-10% when charge-related descriptors were added, implying the importance of electrostatic interactions. Using descriptors calculated for ionized structures did not improve model statistics for positive and negative APIs; however, the model statistics for the neutral APIs increased. We believe that this increase resulted from a better description of neutral zwitterions present in the dataset.
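
The R2Y and Q2 statistics quoted above are usually both computed as one minus the ratio of a residual sum of squares to the total sum of squares; they differ in whether the predictions come from the fitted model (R2Y) or from cross-validation (Q2). The sketch below shows that common definition under this assumption. The function name and the numbers are hypothetical and are not data from the manuscript.

```python
def explained_fraction(observed, predicted):
    """1 - SS_residual / SS_total.

    With fitted values as `predicted` this is R2Y; with cross-validated
    predictions (each compound predicted by a model that did not see it)
    the same formula gives Q2.
    """
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_total == 0.0:
        raise ValueError("observed values must not all be identical")
    return 1.0 - ss_resid / ss_total


# Hypothetical sorption values (e.g. log KD) and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
print(round(explained_fraction(observed, fitted), 2))
```

Q2 is typically lower than R2Y for the same model, and a large gap between the two is a common warning sign of overfitting.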

Keyword [en]
QSAR, in silico, sorption, sludge, pharmaceuticals, charge
National Category
Chemical Sciences; Computer Science; Environmental Sciences
urn:nbn:se:umu:diva-120257 (URN)
Available from: 2016-05-12 Created: 2016-05-12 Last updated: 2016-05-12

By author/editor
Rybacka, Aleksandra
By organisation
Department of Chemistry