Hi all.
Sorry if this has been said before, if it’s already implemented and I just need to download the latest version, or if I shouldn’t be making a thread about it at all. I’m sure I’m guilty of at least one of those.
***Anyways, as the title suggests, I would like to see multi-threading support for all simulations. Before you chastise me because communicating between cores obviously slows things down so much you might as well use a single thread, I have a couple of ideas that could make this work. Perhaps it could even be extended to GPUs, but I’m not that good with programming in general, so I don’t know whether that’s easy or next to impossible (probably the latter). I’m putting the idea up for the experts to judge.
***The main problem is that every bit of a simulation, whether it’s the voxels in smoke or the vertices in cloth, needs to interact with its surroundings. No matter how you divide it up, you either end up with inconsistencies along the edges of the pieces, or the simulation gets slowed down by the communication between cores, plus synchronization issues among other problems.
***The first thing to do is to not compare with/reference the current subframe/step, but the one before it.
Obviously this sacrifices a bit of accuracy and requires increasing the number of subframes/steps to compensate (the worst case, I think, is roughly double), but now you have all cores working instead of one. Each core compares its own chunk against the entire previous subframe/step, which should already be cached anyway (there’s no reason to throw it away other than to free up memory). Not much extra memory used, no need for communication between cores, and overall much faster. A bit of efficiency is lost indirectly, but it’s a big leap forward.
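To show roughly what I mean, here’s a toy Python/NumPy sketch of the double-buffering idea. The diffusion-style update and the names (`step_chunk`, `simulate`) are just mine for illustration, not anything from Blender’s actual solvers; the point is only that every chunk reads the frozen previous step and writes its own slice of the new one, so chunks never depend on each other within a step.

```python
import numpy as np

# Toy "smoke"-style grid: each cell relaxes toward the average of its
# neighbours.  Key point: double buffering.  Every chunk reads ONLY the
# previous step (prev) and writes ONLY its own slice of curr, so the
# chunks never have to talk to each other inside a step.

def step_chunk(prev, curr, lo, hi):
    """Update rows lo..hi of curr using nothing but the cached prev step."""
    for i in range(max(lo, 1), min(hi, prev.shape[0] - 1)):
        for j in range(1, prev.shape[1] - 1):
            curr[i, j] = 0.25 * (prev[i - 1, j] + prev[i + 1, j] +
                                 prev[i, j - 1] + prev[i, j + 1])

def simulate(grid, steps, n_chunks):
    prev = grid.copy()
    curr = grid.copy()
    bounds = np.linspace(0, grid.shape[0], n_chunks + 1).astype(int)
    for _ in range(steps):
        # Each (lo, hi) pair could go to a different core -- the order
        # doesn't matter because they all reference the same frozen prev.
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            step_chunk(prev, curr, lo, hi)
        prev, curr = curr, prev   # the finished step becomes the new cache
    return prev

field = np.zeros((64, 64))
field[32, 32] = 1.0               # a single puff of "smoke"
result = simulate(field, steps=100, n_chunks=4)
```

Here the chunks are processed one after another, but nothing about the update forces that order, which is exactly what makes it safe to run them on separate cores.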
***
***Still, there’s an obvious bottleneck. Let’s say I have 4 cores (which I actually do) and, with a bunch of other programs running at the same time, each one effectively runs at a different speed. The least busy one finishes its chunk first, followed by the next, and so on until the last, which everyone else has to sit and wait for. You could go back more frames, but that drastically reduces the efficiency and at the end of the day doesn’t solve the problem. It would become more pronounced with more cores and would really slow things down.
***Instead, you could do something like the tiles used in rendering, except with 3D chunks (or however the domain gets divided). Each core takes one small chunk (group of hairs, whatever) at a time and doesn’t need to sync or communicate, because it’s comparing against the previous, already-cached subframe/step; as soon as it finishes, it grabs the next chunk off the pile, so faster cores naturally end up doing more of the work. Multithreading accomplished.
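Continuing the toy sketch above, the scheduling part might look something like this. `ThreadPoolExecutor` is only here to illustrate the render-tile-style hand-out of chunks; the real thing would obviously live in Blender’s C/C++ simulation code, and plain Python threads wouldn’t actually speed up this kind of loop because of the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

def run_step_tiled(prev, curr, step_chunk, chunk_rows=8, workers=4):
    """Hand out many small row-chunks to a worker pool, render-tile style.

    Because every chunk only reads the frozen prev buffer, a fast core
    simply grabs the next chunk off the queue as soon as it finishes one,
    so nobody spends the end of the step waiting on the slowest core.
    """
    n_rows = prev.shape[0]
    chunks = [(lo, min(lo + chunk_rows, n_rows))
              for lo in range(0, n_rows, chunk_rows)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(step_chunk, prev, curr, lo, hi)
                   for lo, hi in chunks]
        for f in futures:
            f.result()   # the step only ends once every chunk is done
```

You would call this in place of the inner chunk loop in `simulate()` above and swap the two buffers afterwards, exactly as before.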
***There are other technicalities as well (how the chunks would be determined, how the physics would be adjusted to retain accuracy while staying fast, etc.), plus the question of how to extend this to GPUs, but in theory at least this would be a massive leap forward. No more CPU cores idling, and people with a butt ton of cores would no longer have to suffer through painfully slow simulations.
If it can be done, I think this should have rather high priority.
Thoughts?