Identification and extraction of teratogenic drugs using text mining
Abstract
Introduction: Teratogens are drugs, chemicals, or infections that may impair fetal development; exposure to a teratogen before or during pregnancy can therefore lead to congenital defects or miscarriage. The present study aims to develop a method for identifying teratogenic drugs and their complications from recent studies.
Materials and Methods: In this study, the abstracts of all articles (n=1117) on teratogenic drugs were reviewed and analyzed. First, the abstracts were converted into a processable format. Next, the teratogenic drugs mentioned in the articles were extracted, and the related side effects were recorded for each drug. The drugs were then clustered according to the complications identified in the feature extraction phase. Finally, the extracted drugs were compared with the FDA list of teratogenic drugs to produce a list of new teratogenic drugs.
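The steps described above (extraction of drugs, recording of side effects per drug, and clustering by complication) can be illustrated with a minimal sketch. It does not reproduce the study's actual implementation: the lexicons, the sentence-level co-occurrence rule, the Jaccard similarity, the threshold, and all function names are assumptions chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical lexicons (assumption); a real system would load curated vocabularies.
DRUG_LEXICON = {"valproate", "isotretinoin", "warfarin"}
EFFECT_LEXICON = {"neural tube defect", "cleft palate",
                  "craniofacial malformation", "nasal hypoplasia"}

def split_sentences(abstract):
    """Lowercase an abstract and split it into sentences on ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", abstract)
    return [p.strip().lower() for p in parts if p.strip()]

def extract_drug_effects(abstracts):
    """Map each drug to the side effects that co-occur with it in a sentence."""
    drug_effects = defaultdict(set)
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            drugs = {d for d in DRUG_LEXICON
                     if re.search(rf"\b{re.escape(d)}\b", sentence)}
            effects = {e for e in EFFECT_LEXICON if e in sentence}
            for drug in drugs:
                # A drug mentioned without any effect still gets an (empty) entry.
                drug_effects[drug] |= effects
    return dict(drug_effects)

def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_by_effects(drug_effects, threshold=0.5):
    """Greedy clustering: a drug joins the first cluster containing a member
    whose effect set is at least `threshold` similar; otherwise it starts one."""
    clusters = []
    for drug in sorted(drug_effects):
        for cluster in clusters:
            if any(jaccard(drug_effects[drug], drug_effects[m]) >= threshold
                   for m in cluster):
                cluster.append(drug)
                break
        else:
            clusters.append([drug])
    return clusters
```

Sentence-level co-occurrence is the simplest possible association rule and will miss effects described in a neighboring sentence; it is used here only to keep the sketch short.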
Results: Text mining extracted 149 drugs, of which 123 were found in the FDA list of teratogenic drugs and 26 were not. For 11 of these 26 drugs, no complications were reported in the articles; the other 15 were identified as new teratogenic drugs on the basis of the side effects extracted from the latest studies. These drugs may be used to update the FDA classification of drugs during pregnancy.
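The breakdown reported here (149 = 123 + 26, and 26 = 11 + 15) corresponds to a simple set comparison between the extracted drugs and a reference list. The sketch below shows that comparison on toy data; the function name, the data structures, and the example drug names are hypothetical and are not taken from the study.

```python
def compare_with_reference(extracted, reference):
    """Split extracted drugs into three groups.

    extracted: dict mapping drug name -> set of extracted side effects
    reference: set of drug names already on the reference list
    Returns (listed, new_candidates, no_reported_effects).
    """
    listed = {d for d in extracted if d in reference}
    unlisted = set(extracted) - listed
    # Unlisted drugs with at least one extracted side effect are candidates.
    new_candidates = {d for d in unlisted if extracted[d]}
    no_reported_effects = unlisted - new_candidates
    return listed, new_candidates, no_reported_effects


# Toy data (hypothetical names, for illustration only).
extracted = {
    "drug_a": {"cleft palate"},
    "drug_b": {"limb defect"},
    "drug_c": set(),
}
reference = {"drug_a"}
listed, new_candidates, no_reported_effects = compare_with_reference(extracted, reference)
```

The three groups are disjoint and together cover every extracted drug, which is the property the reported counts rely on.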
Conclusion: The presented model, based on text processing and data mining, can automatically identify and extract teratogenic drugs and their complications from articles. Consequently, this model may contribute to greater drug safety, the prevention of drug abuse, and a reduction in the side effects of teratogenic drugs.