Skip to content

Egem2 #50

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 31 commits into
base: master
Choose a base branch
from
Open

Egem2 #50

wants to merge 31 commits into from

Conversation

fanela
Copy link
Contributor

@fanela fanela commented Sep 25, 2019

No description provided.

fanela added 30 commits June 28, 2019 13:50
…o make scatter plots and calculate correlation in MATLAB_CODE_me. Saved outputs constrain_flux_regulation in VariablesSaved. Renamed min variable to min_model to avoid conflict with min function.
…into structures called sigRxn... while long program ran on desktop.
…alue. compared 1E-1 FBA and 1E-1 fBA and saved variables.�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D�[D
offgenes = unique(ccleids_met(ccle_expression_metz(:,iii) < -2));

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% set the glucose uptake based on media
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lower bounds only change glucose abundance for 4 medium conditions. The code matlab/scripts/nutrient_sensitivty/medium_LB_constraints.m in master can improve your drug sensitivity predictions.

load supplementary_software_code hdacexpfcs hdacexpallgeneids
%contains gene expression data after treatment with hdac inhibitors

% Converting arrays to tables and renaming some Variables to make
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you prettify these variables, you should save them in vars and just load them rather than formatting them during every iteration within this script.

Copy link
Member

@ScottCampit ScottCampit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fanela, there are several changes that need to be made before pushing this to the master branch. General stuff that I saw:

  • Variable names are still super confusing - please help make the code readable by changing the variables names to something more understandable (ie instead of using v1, use KMTi_auc directly, which is much more clear.
  • There's a ton of code that is repetitive between the three new scripts in the MeCorr directory. Since you're reusing these variables, save them as .mat files and load them at the beginning of each script. That way, it's easier to figure out what this script is actually doing.
  • I found a few bugs that should be addressed and that should change your results and potentially your interpretation of the results.

Also, I moved some of your old matlab scripts in the drug_sensitivity folder. I though MeCorr was too generic, because there are several modules that compute correlation values between expression and Me flux. If you can, move these scripts there and save the new matlab variables in the vars directory.

fluxes_allrxns= NaN(height(exptidcelllinemediamatch),length(model2.rxns));
g_rate= NaN(height(exptidcelllinemediamatch), length(model2.rxns));

for i = 1:height(exptidcelllinemediamatch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change i, ii, and iii into better variable names that are more accurate? It becomes really confusing reading the rest of the code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeat this for a lot of the variables below. Our goal is to have this code more readable so that it is easy to figure out what's going on without comments.

@@ -0,0 +1,419 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename the three files with MATLAB_CODE_* files - it's not clear from the get-go what the difference between allrxns, me, and methyl are without opening and carefully reading the file contents.

ii = find(ismember(ctd2celllineidname_id, exptidcelllinemediamatch(i,2)));
iii = find(ismember(celllinenames_ccle1, ctd2celllineidname(ii,1)));
if ~isempty(iii)
iii = iii(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is iii set to the first value if it's not empty? If length(iii) > 1, then you may want to average ccle_expression_metz for those columns, then get ongenes and offgenes. This gets the average gene expression values for the duplicated cell line.

rho_p= NaN(height(hmei_list),length(model2.rxns));
for j = 1:height(hmei_list)
fx = find(ismember(ctd2compoundidname_name, hmei_list(j,1)));
ix = ismember(drug_auc_expt(:,3), ctd2compoundidname_id(fx,1)); sum(ix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate code after semicolon to new line to improve readability

%% Create Table of Significant Reactions by Correlation value
% Workflow: Change threshold. Change struct field name (e.g. above3)
% accordingly
sigTF= (abs(rho) > 0.3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you taking the absolute value of rho? Negative values are important and have a statistical interpretation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if you want to get significant reactions, you would want to filter using the p-values as well as R value. That is, you want the reaction to have an R-value >= 0.3 and a p-value < 0.05.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 195: I coded abs(rho) > 0.3 so that the negative correlations below -0.3 would be selected as well. I am including those correlations in downstream analysis.

Copy link
Member

@ScottCampit ScottCampit Oct 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, you should create two variables have rho > 0.3 and rho < -0.3, unless you wrote in a way to distinguish between the negatively and positively correlated entries by saving their respective indices first. That way, you can combine them later to make a plot while keeping the respective scalar directions.

@@ -124,7 +121,6 @@
glucflag1 = [ ones(1,6), 0 ,0, 0, 0];
expval1 = [ ones(1,7), 0 , 1, 0];


[ix, aapos] = ismember( mediareactions1(20:37),acetylation_model.rxns);
posgluc = 1385; % glucose uptake reaction in recon1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't hard-code this - use find(ismember(model.rxns, 'EX__glc_D_e')) to find the glucose exchange reaction. Same for glutamine and linoleic acid.

@@ -155,24 +151,18 @@
model3.lb(pyrpos) = -10;
end


model3.c(rxnpos) = epsilon_acetylation;
model3.c(rxnpos) = epsilon_methylation;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repetitive



model3.c(rxnpos) = epsilon_acetylation;
model3.c(rxnpos) = epsilon_methylation;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also repetitive - model3.c should not change the way your code is written.

acetgenessorted = unqgenes(spos);
[sx spos] = sort(acet_screen_rpmi(:,1));
acetgenessorted = unqgenes(spos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change variable names to something more generic since this code looks at methylation now

model3 = modeltemp;


[solf.x,sol11] = constrain_flux_regulation(model3,[],[],0,0,0,[],[],minfluxflag);
if ~isempty(solf.x) && ~isnan(solf.x)
acet_screen_rpmi(i,2) = solf.x(objpos); % impact on growth
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change variable name so it's more obvious that it's the growth rate from the constrained metabolic model. Also, are you intentionally using the acetylation model? It is confusing since you're storing the objective coefficient value into the variable name epsilon_methylation.

@ScottCampit ScottCampit added bug Something isn't working enhancement New feature or request labels Sep 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants