Honors Projects in Biological and Biomedical Sciences

Identification of Significant Gene Expression Changes Incorporating Heterogeneity in Perturbation Experiments

Katharine Cross, Bryant University

Document Type

Thesis

Comments

The dataset used is publicly available on the LINCS Portal. LINCS is a National Institutes of Health (NIH) Common Fund program involving more than 15 institutions, including Harvard Medical School and Stanford. The chosen dataset contains perturbation-experiment observations of gene expression changes.

First Faculty Advisor

TingTing Zhao

Second Faculty Advisor

Brian Blais

Keywords

knockoffs; significant gene expression; perturbation experiment; heterogeneity

Publisher

Bryant University

Rights Management

CC - BY

Abstract

Machine learning methods have been widely applied in genomics and bioinformatics. In particular, using novel machine learning algorithms to study gene-drug interactions has the potential to make a major positive impact on drug discovery. Heterogeneity may exist within Vorinostat drug perturbation experiments because of the effects of the perturbations on gene expression. Thus, the challenge is to identify the most important genes in a high-dimensional setting after first identifying subpopulations to address population heterogeneity. In this work, clustering techniques are first applied to identify subpopulation structure in the gene expression changes across multiple Vorinostat perturbations. Next, statistical knockoffs are applied to identify important gene expression changes within each subpopulation with theoretically guaranteed false discovery rate control. Gaussian mixture knockoff generation is used to construct negative controls, identify important genes across these subpopulations within the Vorinostat family, and compare them. This research has the potential to aid future drug discovery and to enhance the potential of drug repurposing within the field of pharmacoeconomics. Identifying such gene-drug interactions can facilitate a better understanding of disease mechanisms and help identify new drug targets. The results support the hypothesis of heterogeneity, as two distinct clusters were discovered. Cluster zero appears to consist mostly of genes with positive coefficients after interacting with the Vorinostat perturbations, indicating up-regulation of those genes. Cluster one consisted largely of genes with negative coefficients after the Vorinostat treatment, indicating down-regulation of those genes.
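
The selection step mentioned in the abstract (choosing genes within each subpopulation while controlling the false discovery rate) can be illustrated with a minimal sketch. It assumes the knockoff+ thresholding rule that is standard in the knockoff filter literature. The abstract does not say which rule, target rate, or importance statistics the thesis uses, so the function names, the target rate q = 0.2, and the toy statistics below are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Sequence


def knockoff_plus_threshold(w: Sequence[float], q: float) -> float:
    """Return the smallest t among the nonzero |W_j| whose estimated
    false discovery proportion, (1 + #{W_j <= -t}) / max(1, #{W_j >= t}),
    is at most q. Returns infinity when no such t exists."""
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")  # nothing meets the target, so nothing is selected


def select_features(w: Sequence[float], q: float) -> List[int]:
    """Indices of features whose statistic W_j reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics: a large positive W_j means the real gene looked
# more important than its knockoff copy. These numbers are made up.
toy_w_by_cluster: Dict[int, List[float]] = {
    0: [5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 0.3, -0.2, 0.1, 2.0],
    1: [1.0, -1.0],
}

# Apply the same rule separately inside each (already identified) cluster.
selected_by_cluster = {
    cluster: select_features(w, q=0.2)
    for cluster, w in toy_w_by_cluster.items()
}
```

With these toy inputs, cluster 0 selects the six features with the largest positive statistics and cluster 1 selects none, because no threshold brings its estimated false discovery proportion down to the target. In practice the statistics would come from comparing each gene with its knockoff copy in a fitted model, and the cluster labels from the clustering step; neither is reproduced here.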

Included in

Bioinformatics Commons, Data Science Commons, Genomics Commons
