Synthetic accessibility score as a filter for virtual chemical structures in polymer materials design
Article
Citation information for this article was obtained from Scopus.
The article is published in a journal indexed in Web of Science and/or Scopus.
Date of the last search for this article in external sources: April 1, 2026.
Abstract:

Context
Computer-aided design of polymers with specific properties often encounters excessive computational demands because of the vast number of potential chemical structures. Reducing the number of candidates is critical for efficient virtual screening. A promising way to address this challenge is the synthetic accessibility score (SAscore), originally proposed by Ertl and Schuffenhauer (2009). It relies on decomposing chemical structures into molecular fragments and on statistical analysis of fragment frequencies in large chemical databases. The initial implementation of the SAscore algorithm was not suitable for polymers. In this study, we demonstrate that our previously developed polymer-adapted approach can be used to filter polymer structures. We analyzed how the SAscore results depend on the database whose compounds were decomposed to obtain the fragments. We showed that, in most cases, the differences fall within the error range. However, structures containing fragments that are rare in the databases may show notable SAscore fluctuations from one fragment source to another.

Method
The Ertl and Schuffenhauer SAscore algorithm was adapted to account for the specific structural features of polymers. We implemented our algorithm in a program that decomposes large databases of chemical structures into fragments and calculates their frequency of occurrence. The decomposition of polymers into fragments takes into account the ambiguity of the smallest repeating unit (SRU) specification and the possibility of specifying a polymer structure with multiple SRUs. This ensures identical calculated SAscore values for polymers specified in different ways. For SAscore calculations, we used PubChem Compounds (115 million records as of August 2023) and Aurora Fine Chemicals (116 million records) as sources of molecular fragments. The resulting database of molecular fragments is publicly accessible.
A test database of virtually generated polymers with calculated SAscore values (MCKintech, 2.2 million records) is also freely available.
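
The requirement that a polymer written with a shifted SRU, or with several SRUs joined together, receives the same score can be illustrated with a deliberately simplified sketch. It is not the authors' program: the polymer is reduced to a list of unit labels instead of a molecular graph, fragments are windows of consecutive units along the endless chain, the contribution formula (base-10 logarithm of a fragment count divided by a reference count, averaged over fragments) is only loosely modelled on the fragment term of SAscore, and the complexity penalty is left out. The function names, the `TOY_COUNTS` values, the reference count of 1000, and the rule that an unseen fragment counts as 1 are all assumptions made for illustration. Reading the chain in the opposite direction is not handled.

```python
import math
from collections import Counter


def primitive_period(units):
    """Shortest block whose repetition reproduces `units` (A B A B -> A B)."""
    units = tuple(units)
    n = len(units)
    for p in range(1, n + 1):
        if n % p == 0 and units[:p] * (n // p) == units:
            return units[:p]
    return units  # only reached for empty input


def chain_fragments(units, size=2):
    """Count windows of `size` consecutive units along the endless chain.

    Windows wrap around the primitive period, so the result does not
    depend on where the repeat unit was cut or how many copies were given.
    """
    base = primitive_period(units)
    n = len(base)
    return Counter(
        tuple(base[(i + k) % n] for k in range(size)) for i in range(n)
    )


def fragment_score(units, db_counts, reference_count, size=2):
    """Mean log10 frequency ratio of the chain fragments (toy analogue)."""
    frags = chain_fragments(units, size)
    if not frags:
        raise ValueError("repeat unit must contain at least one unit")
    total = sum(frags.values())
    weighted = math.fsum(
        # Assumption: a fragment absent from the database counts as 1.
        n * math.log10(db_counts.get(frag, 1) / reference_count)
        for frag, n in frags.items()
    )
    return weighted / total


# Made-up fragment frequencies, standing in for a real fragment database.
TOY_COUNTS = {("O", "CH2"): 900, ("CH2", "O"): 900, ("CH2", "CH2"): 5000}

# Three ways of writing the same chain ...-O-CH2-CH2-O-CH2-CH2-...
same_polymer = [
    ["O", "CH2", "CH2"],
    ["CH2", "CH2", "O"],
    ["CH2", "O", "CH2", "CH2", "O", "CH2"],
]
scores = [fragment_score(u, TOY_COUNTS, reference_count=1000) for u in same_polymer]
print(scores)
```

All three specifications reduce to the same set of wrapped windows, so they receive the same value. A repeat unit built from fragments missing in `TOY_COUNTS` gets a much lower value, which mirrors the observation above that rarely seen fragments make the score sensitive to the choice of fragment source.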