ISTINA PskovGU
This presentation concerns the use of dimensionality reduction and data visualization techniques in chemoinformatics. It begins by justifying data visualization as an important step in transferring to humans the knowledge that computers acquire by analyzing raw data. We live in a three-dimensional world and move on the nearly flat surface of the Earth, and our senses and brain are adapted to work effectively under these conditions. We therefore perceive information, including chemical information, better when it resembles the world we are evolutionarily adapted to interact with. Since chemical data are highly multidimensional, the number of dimensions must be reduced to 2 or 3 to take advantage of our natural biological mechanisms of vision and spatial analysis. Mathematicians have developed numerous dimensionality reduction methods for this purpose [1-3], and the approaches frequently used in chemoinformatics are briefly discussed.

The main part of the presentation deals with the theory of Generative Topographic Mapping (GTM) [4] and the use of this method in chemoinformatics [5-10]. GTM is a Bayesian approach originally developed as a probabilistic extension of Kohonen self-organizing maps (SOMs) in order to overcome some of their drawbacks [4]. The advantages of using GTM to process chemical information are explained, and several examples of mapping chemical datasets with GTM are discussed.
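To make the idea of GTM more concrete, the following is a minimal NumPy sketch of the standard formulation from [4]: a regular grid of latent nodes is mapped into data space through radial basis functions, and the mapping is fitted by expectation-maximization. It is an illustration under stated assumptions, not the implementation used in the presentation. All function names, parameter values, and the synthetic "descriptor" data are hypothetical, and the distance computation is written for small data sets only.

```python
import numpy as np


def make_grid(n_side):
    """Regular n_side x n_side grid of 2D points covering [-1, 1]^2."""
    axis = np.linspace(-1.0, 1.0, n_side)
    gx, gy = np.meshgrid(axis, axis)
    return np.column_stack([gx.ravel(), gy.ravel()])


def sq_dists(a, b):
    """Squared Euclidean distances between every row of a and every row of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)


def responsibilities(node_images, data, beta):
    """E-step: posterior probability of each node (rows) for each data point (columns)."""
    log_r = -0.5 * beta * sq_dists(node_images, data)
    log_r -= log_r.max(axis=0, keepdims=True)  # stabilise the exponentials
    r = np.exp(log_r)
    return r / r.sum(axis=0, keepdims=True)


def fit_gtm(data, n_nodes_side=8, n_rbf_side=4, reg=1e-3, n_iter=30):
    """Fit a GTM by EM; returns latent node coordinates and final responsibilities."""
    n_samples, n_dims = data.shape
    nodes = make_grid(n_nodes_side)      # K x 2 latent grid
    centers = make_grid(n_rbf_side)      # M x 2 RBF centres
    width = 2.0 / (n_rbf_side - 1)       # assumed RBF width: one centre spacing
    phi = np.exp(-sq_dists(nodes, centers) / (2.0 * width ** 2))
    phi = np.hstack([phi, np.ones((len(nodes), 1))])  # bias column

    # Initialise the manifold on the plane of the first two principal components.
    mean = data.mean(axis=0)
    _, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    target = mean + (nodes * (s[:2] / np.sqrt(n_samples))) @ vt[:2]
    weights = np.linalg.lstsq(phi, target, rcond=None)[0]
    beta = 1.0 / data.var(axis=0).mean()  # initial inverse noise variance (assumption)

    for _ in range(n_iter):
        resp = responsibilities(phi @ weights, data, beta)
        # M-step: regularised weighted least squares for the mapping weights.
        g = resp.sum(axis=1)
        lhs = phi.T @ (g[:, None] * phi) + (reg / beta) * np.eye(phi.shape[1])
        weights = np.linalg.solve(lhs, phi.T @ (resp @ data))
        # M-step: update the inverse noise variance.
        beta = n_samples * n_dims / (resp * sq_dists(phi @ weights, data)).sum()

    return nodes, responsibilities(phi @ weights, data, beta)


def project(nodes, resp):
    """2D coordinates of each data point: responsibility-weighted mean of the nodes."""
    return resp.T @ nodes


# Synthetic stand-in for a descriptor matrix: 3 clusters of 20 points in 5 dimensions.
rng = np.random.default_rng(0)
cluster_centers = rng.normal(scale=3.0, size=(3, 5))
data = np.vstack([c + rng.normal(size=(20, 5)) for c in cluster_centers])

nodes, resp = fit_gtm(data)
coords = project(nodes, resp)
print(coords.shape)  # (60, 2)
```

Unlike a SOM, each data point here is described by a probability distribution over all nodes (one column of `resp`) rather than by a single winning node. The 2D coordinates are only one possible summary of that distribution.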
Recent developments discussed in the presentation concern: (i) the use of iterative GTM and two-level meta-GTM to process big data sets [8]; (ii) the concept of GTM-based activity landscapes and their use to build QSAR/QSPR regression models and define their applicability domains [9]; (iii) different approaches to building GTM-based classification models and defining their applicability domains [6-7]; and (iv) the Stargate GTM approach, which can be used both to predict several properties or activities simultaneously and to detect structures with a specified activity profile (inverse QSAR) [10].
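As an illustration of item (ii), one common way to build an activity landscape from a fitted GTM is to give each node the responsibility-weighted mean activity of the training molecules, then predict a new molecule's activity as the responsibility-weighted mean of the node values. The sketch below assumes that construction; the abstract does not spell out the exact scheme. The function names, the `min_mass` threshold, and the toy numbers are hypothetical. Returning `NaN` for molecules that fall on unpopulated nodes is only one simple way to express an applicability domain.

```python
import numpy as np


def activity_landscape(resp, activities, min_density=1e-12):
    """Node activity values from K x N training responsibilities and N activities.

    Nodes with (almost) no training data get NaN instead of a value.
    """
    density = resp.sum(axis=1)                   # cumulated responsibility per node
    node_values = np.full(len(density), np.nan)
    populated = density > min_density
    node_values[populated] = (resp @ activities)[populated] / density[populated]
    return node_values, density


def predict_activity(node_values, resp_new, min_mass=0.5):
    """Predict activities for new molecules from their K x N_new responsibilities.

    A molecule is treated as outside the applicability domain (NaN) when less
    than min_mass of its responsibility lies on populated nodes (assumption).
    """
    valid = ~np.isnan(node_values)
    w = resp_new[valid]
    mass = w.sum(axis=0)
    predictions = np.full(resp_new.shape[1], np.nan)
    inside = mass > min_mass
    predictions[inside] = (node_values[valid] @ w)[inside] / mass[inside]
    return predictions


# Toy example: 3 nodes, 3 training molecules, 2 new molecules.
resp_train = np.array([[1.0, 0.5, 0.0],
                       [0.0, 0.5, 1.0],
                       [0.0, 0.0, 0.0]])
activities = np.array([2.0, 4.0, 6.0])
node_values, density = activity_landscape(resp_train, activities)

resp_new = np.array([[0.5, 0.0],
                     [0.5, 0.0],
                     [0.0, 1.0]])
predictions = predict_activity(node_values, resp_new)
print(predictions)  # first molecule: 4.0; second: nan (lies on an unpopulated node)
```

In practice the responsibilities would come from a fitted GTM rather than being written by hand. The threshold for the applicability domain would need to be tuned for the data set at hand.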
No. | Name | Description | File name | Size | Added |
---|---|---|---|---|---|
1. | Presentation | | baskin-kazan-2016.pdf | 2.7 MB | 22 May 2016 [BaskinII] |