QSPR Study of Alkylbenzenes using Principal Component Regression Analysis
Aditya Pegu1*, Monjit Chamua2, Sumanta Borah3 and A Bharali4

The quantitative structure-property relationship (QSPR), which connects structural features to physicochemical properties, is a useful part of drug design and discovery. The ultimate goal of a QSPR formulation is to develop mathematical models that estimate the physicochemical properties of molecular structures. There are over 3000 topological indices (TIs) in the literature, so one must decide how to pick those that best describe the physicochemical property being studied. Including a large number of TIs in a regression equation may improve the fit, but the predictive ability of the resulting model decreases substantially because of multicollinearity among the indices. Principal component analysis (PCA) is well suited to this problem: it reduces the dimension while retaining most of the information in the original indices, and it eliminates the multicollinearity among them, which yields a better predictive model. In this article we consider 37 degree-based and neighborhood degree-based topological indices to predict physicochemical properties of 42 alkylbenzenes, such as boiling point, critical pressure, critical volume and critical temperature, using multilinear regression analysis. We also use PCA to reduce the dimension and to overcome the multicollinearity among the indices.
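To illustrate the idea of regressing a property on principal components instead of on the raw indices, the following is a minimal sketch in Python using NumPy. It is not the authors' procedure or data. The descriptor matrix `X` (42 rows, 37 columns) and the property vector `y` are synthetic placeholders, and the function names `pcr_fit`, `pcr_scores` and `pcr_predict`, as well as the 95% explained-variance threshold, are assumptions made only for this example.

```python
import numpy as np


def pcr_fit(X, y, var_threshold=0.95):
    """Principal component regression (illustrative sketch).

    Standardizes X, keeps the leading principal components whose cumulative
    explained variance reaches var_threshold, then fits ordinary least
    squares of y on the component scores.
    """
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    Z = (X - mu) / sigma
    # SVD of the standardized matrix; rows of Vt are the component loadings.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, len(s))
    scores = Z @ Vt[:k].T
    # Design matrix: intercept column plus the k component scores.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mu": mu, "sigma": sigma, "loadings": Vt[:k], "coef": coef, "k": k}


def pcr_scores(model, X):
    """Project new descriptor rows onto the retained components."""
    Z = (X - model["mu"]) / model["sigma"]
    return Z @ model["loadings"].T


def pcr_predict(model, X):
    """Predict the property from raw descriptors via the component scores."""
    scores = pcr_scores(model, X)
    return model["coef"][0] + scores @ model["coef"][1:]


# Synthetic placeholder data: 42 "molecules", 37 strongly correlated
# "indices" generated from 3 hidden factors. These are NOT real
# topological indices or measured properties.
rng = np.random.default_rng(0)
latent = rng.normal(size=(42, 3))
mixing = rng.normal(size=(3, 37))
X = latent @ mixing + 0.05 * rng.normal(size=(42, 37))
y = 150.0 + 20.0 * latent[:, 0] - 5.0 * latent[:, 1] + rng.normal(scale=0.5, size=42)

model = pcr_fit(X, y)
scores = pcr_scores(model, X)
preds = pcr_predict(model, X)
r2 = 1.0 - np.sum((y - preds) ** 2) / np.sum((y - y.mean()) ** 2)
print("components kept:", model["k"])
print("training R^2:", r2)
```

Because the component scores are mutually orthogonal, the regression on them does not suffer from the multicollinearity present among the original columns of `X`. In practice the number of retained components would be chosen by a criterion appropriate to the data, and predictive ability would be assessed on held-out compounds or by cross-validation, not by the training fit shown here.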