Parallelize over the different bodies within the same iteration. Each body's update reads only the state from the previous iteration, so you won't have data dependencies across the parallel processes.
Granted, you may actually see a slowdown for a two-body simulation, since the parallel overhead outweighs the tiny amount of work, but with more bodies it should shine.
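
To make that concrete, here's a rough sketch of one way to do it, not a definitive implementation. I'm assuming a 2D gravitational simulation with a simple Euler-style update; the names `acceleration` and `step`, the constant `G`, and the softening term `eps` are all made up for illustration. The point is only the structure: every task reads the same frozen snapshot of positions, and nothing gets written until all tasks are done.

```python
from concurrent.futures import ThreadPoolExecutor

G = 1.0  # gravitational constant in arbitrary units (assumption for this sketch)


def acceleration(i, positions, masses, eps=1e-9):
    """Acceleration on body i due to all other bodies.

    Only reads the shared snapshot, never writes to it, so calls for
    different i are independent of each other.
    """
    xi, yi = positions[i]
    ax = ay = 0.0
    for j, (xj, yj) in enumerate(positions):
        if j == i:
            continue
        dx, dy = xj - xi, yj - yi
        r2 = dx * dx + dy * dy + eps  # eps avoids division by zero (assumption)
        inv_r3 = r2 ** -1.5
        ax += G * masses[j] * dx * inv_r3
        ay += G * masses[j] * dy * inv_r3
    return ax, ay


def step(positions, velocities, masses, dt, pool):
    """Advance the system by one iteration of length dt."""
    # Phase 1 (parallel): one task per body, all reading the same snapshot.
    snapshot = tuple(positions)
    accs = list(pool.map(lambda i: acceleration(i, snapshot, masses),
                         range(len(snapshot))))
    # Phase 2 (after every task has finished): build the new state.
    new_v = [(vx + ax * dt, vy + ay * dt)
             for (vx, vy), (ax, ay) in zip(velocities, accs)]
    new_p = [(x + vx * dt, y + vy * dt)
             for (x, y), (vx, vy) in zip(positions, new_v)]
    return new_p, new_v


if __name__ == "__main__":
    pos = [(-1.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
    vel = [(0.0, 0.0)] * 3
    mass = [1.0, 1.0, 1.0]
    with ThreadPoolExecutor() as pool:
        for _ in range(10):
            pos, vel = step(pos, vel, mass, 0.01, pool)
    print(pos)
```

One caveat: in CPython, threads running pure-Python arithmetic are serialized by the global interpreter lock, so this version shows the dependency-free structure rather than a real speedup. For actual gains you'd likely move the per-body work to processes (with a module-level function instead of the lambda) or to a compiled or vectorized kernel, and as noted above, it only pays off once there are enough bodies to amortize the overhead.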