-
Notifications
You must be signed in to change notification settings - Fork 5
subsitute vector<double> for map<PeakType, double> in SignalToNoiseEstimator #98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
||
virtual double getSignalToNoise(const PeakType & data_point) | ||
///Return to signal/noise estimate for date point @p index | ||
///@note the first query to this function will taake longer, as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taake
? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ops, thanks.
|
||
if (write_debuginfo) | ||
{ | ||
std::cerr << first_it->getPrecursor().getMZ() << " " << first_it->getProduct().getMZ() << " "; | ||
} | ||
|
||
PeakSpectrum::FloatDataArray signal_to_noise; | ||
for (PeakSpectrum::Iterator sit = chromatogram.begin(); sit != chromatogram.end(); ++sit) | ||
for (Size sit = 0; sit < chromatogram.size(); ++sit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename sit
to i
, because its not an iterator anymore
PeakIterator last_; | ||
std::vector<double> stn_estimates_; | ||
/// raw data; | ||
Container c; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems very wasteful to copy the whole underlying data!
Can you try and remove this member entirely?!
The user calls .init(container) and there we can compute the S/N values and store them in stn_estimates
. There should be no need to store the container as well.
The only thing that does not work anymore is the is_result_valid_
idea. But that should not really be required and can be removed as well.
This should make the code quite a bit faster...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at the code, you can simply remove the is_result_valid_
member, and within SignalToNoiseEstimator::update()
simply do stn_estimates_.clear();
, this way, if someone sets new parameters and forgets to call .init() afterwards, we will see a segfault and thus the error is clear. You can also add a OPENMS_PRECONDITION to check the index is within the vectors size in getSignalToNoise()
, so in debug builds we will get a proper error message and not just a segfault.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good. only the container member seems overkill.
Removing it should give a speedup.
Please remeasure when done :)
} | ||
|
||
///Return to signal/noise estimate for date point @p index | ||
///@note the first query to this function will taake longer, as | ||
///@note the first query to this function will take longer, as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not correct anymore, right? Since everything is computed during the call to init()
.
init(c); | ||
} | ||
|
||
OPENMS_POSTCONDITION(condition,message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should say OPENMS_POSTCONDITION(index < stn_estimates_.size(), "S/N estimate beyond container size was requested");
or similar
ok. looks good. |
It coud be found that the type pf data, map<PeakType, doule>, was used in SignalToNoiseEstimator of Tool OpenSwathWorkflow, in which the values corresponded to the iteration of map are storaged. The stored values are synchronously updated contiguously with the iteration, using the function named getSignalToNoise. However, vector could be more efficient when processing large amounts of data. Thus, vectors are applied here. From the diagram shown below, it could be easily found that the processing speed differs a lot while using vector and map.(left: map | right: vector)

runtime of OpenSwathWorkflow with map: 17:31 m
runtime of OpenSwathWorkflow with vector: 15:38m