Lessons learned:

A) Performance freaks should stop using #rstats's runif for random number generation. The Xoshiro random number generator (https://arxiv.org/abs/1805.01407) is 10x faster. The implementations in #perl's #PDL, #rstats (dqrng) and #python #numpy are within 20% of each other.

B) But does it make a difference in applications? To get to the bottom of this, I coded a truncated random variate generator in #rstats and in #perl, the latter both with #pdl and in standard #perl, using the #GSL packages https://metacpan.org/pod/PDL::GSL::CDF & https://metacpan.org/pod/Math::GSL to access the CDF & quantile functions. In this context, it is the calculation of the #CDF that is the computationally intensive part, not the drawing of the random number itself.

C) I should probably blog about these experiments at some point. Note that #pdl (but not base #perl) is a rather competitive choice for large array processing with numerical operations. I mostly stay away from #python, but it would not surprise me if, for compute-intensive stuff (where the heavy-duty work is done in C/C++), it does not matter (much) which high-level language one uses to build data applications.

submitted by /u/ReplacementSlight413
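For readers who have not seen the technique in B), here is a minimal sketch of an inverse-CDF truncated variate generator. It is not the code from the experiments: it assumes a normal target distribution and uses Python's standard library (`random`, `statistics.NormalDist`) in place of GSL, and the function name `truncated_normal` and its parameters are hypothetical. It draws a uniform between the CDF values at the two bounds and maps it back through the quantile function, which is why the CDF and quantile calls, not the uniform draw, carry most of the cost.

```python
import random
from statistics import NormalDist


def truncated_normal(n, lower, upper, mu=0.0, sigma=1.0, seed=None):
    """Draw n variates from Normal(mu, sigma) restricted to [lower, upper].

    Hypothetical helper for illustration: inverse-CDF sampling with the
    standard library standing in for the GSL CDF and quantile functions.
    """
    rng = random.Random(seed)
    dist = NormalDist(mu, sigma)

    # CDF evaluated at the truncation bounds.
    p_lo, p_hi = dist.cdf(lower), dist.cdf(upper)

    draws = []
    for _ in range(n):
        # Uniform draw restricted to the probability mass inside the bounds.
        u = rng.uniform(p_lo, p_hi)
        # Quantile function maps the probability back to a variate.
        x = dist.inv_cdf(u)
        # Clamp to guard against floating-point rounding at the edges.
        draws.append(min(max(x, lower), upper))
    return draws


if __name__ == "__main__":
    sample = truncated_normal(5, -1.0, 2.0, seed=42)
    print(sample)
```

This sketch assumes the bounds are not so far in the tails that the CDF rounds to exactly 0 or 1, in which case `inv_cdf` raises an error; a vectorised version would evaluate the CDF and quantile functions on whole arrays rather than in a loop.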