We said we didn't want to go into this now, but I did take a quick jump in and out of the rabbit hole.
I converted calc_1_seq to pass i directly in as an integerScalar to {calc,sim}_one without populating idx as a new tensor.
It didn't make much difference in a model with a million IID dnorm nodes: 0.2s for calc and 0.8s for sim when using idx and 0.16s for calc and 0.65s for sim when using i. Of course with more indices and reordering indices there will be more overhead.
Other things to note:
- We are still 10x slower than using dnorm and rnorm from R: 0.02s for calc and 0.07s for sim. These are of course basically straight C for loops. So the overhead of calling calc_one is substantial in relative terms.
- Current nimble takes 40s for calc and 100s for sim, so this is a huge improvement.
- Current nimble model building takes 200s, while with the new nimbleModel it takes 20s (and I'm surprised it is even that long, so there is something to look into there).
- nCompile and compileNimble for the model both take ~40s.
This is all with system.time and not microbenchmark, so take the sub-one-second timings with a grain of salt.
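
For anyone who wants to reproduce the plain-R baseline, here is a rough sketch of how it could be timed. It is not the script used for the numbers above: the helper `median_elapsed` and the choice of 11 repetitions are assumptions added only to steady the sub-second measurements using base R.

```r
# Baseline sketch: one million IID standard-normal "nodes" in plain R.
set.seed(1)
n <- 1e6
x <- rnorm(n)

# "calc" analogue: summed log-density over all nodes.
calc_time <- system.time(logprob <- sum(dnorm(x, mean = 0, sd = 1, log = TRUE)))

# "sim" analogue: draw a fresh value for every node.
sim_time <- system.time(x_new <- rnorm(n, mean = 0, sd = 1))

# Hypothetical helper: repeat a timing and report the median elapsed
# seconds, since a single system.time() call is noisy below one second.
median_elapsed <- function(f, reps = 11) {
  median(replicate(reps, system.time(f())[["elapsed"]]))
}

calc_med <- median_elapsed(function() sum(dnorm(x, log = TRUE)))
sim_med  <- median_elapsed(function() rnorm(n))

cat("calc: single", calc_time[["elapsed"]], "s, median", calc_med, "s\n")
cat("sim:  single", sim_time[["elapsed"]], "s, median", sim_med, "s\n")
```

The same repeat-and-take-the-median wrapper could be put around the compiled calc and sim calls so both sides are measured the same way.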