Feature Engineering for Microbiome Data

Repo for a project done with Persimune, CHIP at Rigshospitalet.

Aim

The aim of this research was to find novel feature engineering methods for microbiome data that preserve the information encoded in the data while also, where possible, reducing dataset size. Explainability is essential when working with biological data, and the methods here were tried with that in mind. The methods were evaluated on a machine learning classification task of disease vs. control. Each method was run both with and without compositional normalization of the data, to help show why it is important to account for the compositional nature of microbiome data in any analysis.
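To illustrate what compositional normalization can look like, here is a minimal sketch of the centered log-ratio (CLR) transform, one common choice for microbiome count tables. It is an assumption that CLR resembles what the notebooks do; the function name `clr_transform`, the `pseudocount` parameter, and the example counts are hypothetical and not taken from this repo.

```python
import numpy as np

def clr_transform(counts, pseudocount=1.0):
    """Centered log-ratio transform of a (samples x taxa) count matrix.

    Hypothetical helper for illustration only; not the repo's own code.
    """
    counts = np.asarray(counts, dtype=float)
    # A pseudocount keeps zero counts from producing log(0).
    log_vals = np.log(counts + pseudocount)
    # Subtracting each sample's mean log value is equivalent to dividing
    # by that sample's geometric mean before taking the log.
    return log_vals - log_vals.mean(axis=1, keepdims=True)

# Two made-up samples with identical proportions but a 10x difference
# in sequencing depth. No zeros here, so the pseudocount is set to 0.
example = np.array([[10, 20, 70], [100, 200, 700]])
clr = clr_transform(example, pseudocount=0.0)
```

In this sketch both rows of `clr` come out identical, because the transform depends only on the ratios between taxa and not on the total read count. That depth-independence is the property that makes compositional methods attractive for sequencing data.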

Methods

The general workflow followed in this analysis is shown below:

Results

The details of the analysis and results can be found in the report, CHIP_report.pdf.

How to use this repo

To replicate the results obtained in this analysis, follow these steps:

Run 0.ipynb

This file creates the CRC_all file shown in the workflow, along with some plots.

Run 1.1.ipynb and 1.2.ipynb

1.1 = Control, 1.2 = Control (CRC)

Run 2.1.ipynb and 2.2.ipynb

2.1 = Feature summation, 2.2 = Feature summation (CRC)

Run 3.1.ipynb and 3.2.ipynb

3.1 = Symbolic regression, 3.2 = Symbolic regression (CRC)
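The steps above can also be scripted. The following is a minimal sketch that runs the notebooks in the listed order using `jupyter nbconvert`. It assumes Jupyter and nbconvert are installed, that the script is started from the repository root, and that the notebooks need no manual input; the helper names `build_command` and `run_all` are hypothetical.

```python
import subprocess

# Notebooks in the order given in the steps above.
NOTEBOOKS = [
    "0.ipynb",
    "1.1.ipynb", "1.2.ipynb",
    "2.1.ipynb", "2.2.ipynb",
    "3.1.ipynb", "3.2.ipynb",
]

def build_command(notebook):
    """Return the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]

def run_all(notebooks=NOTEBOOKS, dry_run=False):
    """Execute each notebook in order; with dry_run=True only print the commands."""
    commands = [build_command(nb) for nb in notebooks]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence at the first failing notebook.
            subprocess.run(cmd, check=True)
    return commands

# Preview the commands without executing anything.
commands = run_all(dry_run=True)
```

Calling `run_all()` without `dry_run` executes the notebooks for real and overwrites each one with its executed version, so it may be worth trying it on a copy of the repository first.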

siddhijain25/CHIP-project

Folders and files

Latest commit

History

Repository files navigation

Feature Engineering for Microbiome Data

Aim

Methods

Results

How to use this repo

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages