The most intensive function is rkf45_apply in my case. I simply added
a few pragmas to the loops, and it sped things up quite a lot.
Having looked at where you've placed the #pragma omp parallels, have
you tried enabling vectorization to see if the time spent in those
axpy-like operations could be improved? A good SSE-ready optimizer
should nail those.
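
To make that concrete, here is a minimal sketch of the kind of
axpy-like loop I mean, written so a vectorizer has an easy time with
it. The function name axpy and its signature are made up for
illustration and are not taken from the rkf45 source. The restrict
qualifiers assume x and y never overlap, and #pragma omp simd is only
a hint that the compiler is free to ignore.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GSL: computes y[i] += a * x[i].
 * "restrict" promises the compiler that x and y do not alias, which
 * removes the main obstacle to vectorizing this loop. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    /* Document the no-alias assumption that restrict relies on. */
    assert(x != y);

    /* Vectorization hint; without OpenMP SIMD support enabled the
     * pragma is ignored and the loop still runs correctly. */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

With GCC, something like "gcc -O3 -march=native -fopenmp-simd
-fopt-info-vec" should report whether the loop was vectorized. Whether
that helps in your case depends on how much of the time is bound by
memory bandwidth rather than arithmetic.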
I may have misunderstood your "millions of differential equations"
statement. Are you rather solving one problem with a million degrees
of freedom?
- Rhys