# Thread: More precise orbital simulation program

1. ## More precise orbital simulation program

I thought this might be the best place for this thread. Actually, I don't know much about how simulation programs are made or how precise they are, but I make a few of my own, and I've noticed that adding a couple of simple lines to the calculations for the gravitation of a two-body system can make them much more precise, up to millions of times as much, so that they can be run faster as well. It works best for masses of similar quantity, though, such as binary star systems and such, while a small mass orbitting a much larger one will remain about the same.

The general program lines for gravity might normally look something like this:

d=((x(z)-x(3-z))^2+(y(z)-y(3-z))^2)^.5
vx(z)=vx(z)-G*M(3-z)*(x(z)-x(3-z))/(d^3)*ti
vy(z)=vy(z)-G*M(3-z)*(y(z)-y(3-z))/(d^3)*ti
x(z)=x(z)+vx(z)*ti
y(z)=y(z)+vy(z)*ti

which is repeated in a loop as the orbits proceed. Here, d is the total distance between the two bodies each time the loop is run through, x and y are the coordinates of each body, vx and vy the vectors of speed in the x and y directions, and z is the designated number for the bodies, 1 and 2, which alternate through these calculations to aquire their orbits. G is the gravitational constant and M is the mass of the body, which is given here as M(3-z) since we are considering the mass of the other body, so for z=1, 3-z=2 and for z=2, 3-z=1.

Okay, now here is what I have found to be a surprisingly precise program for at least a two body system of similar masses:

x(z)=x(z)+vx(z)*ti-G*M(3-z)*(x(z)-x(3-z))/(d^3)*(ti^2)/2*M(z)/(M(z)+M(3-z))
y(z)=y(z)+vy(z)*ti-G*M(3-z)*(y(z)-y(3-z))/(d^3)*(ti^2)/2*M(z)/(M(z)+M(3-z))
d=((x(z)-x(3-z))^2+(y(z)-y(3-z))^2)^.5
vx(z)=vx(z)-G*M(3-z)*(x(z)-x(3-z))/(d^3)*ti
vy(z)=vy(z)-G*M(3-z)*(y(z)-y(3-z))/(d^3)*ti

in exactly that order. I'm not sure how many programmers we have on this forum, but I think one will like the precision this program provides. If anyone else has any more suggestions for creating a faster or more precise simulation program, please let me know.

2. Established Member
Join Date
Mar 2005
Posts
1,234
Why are you dividing by d^3? Your acceleration formula should be GM/d^2.

Also, if you want to make it faster, you computer probably does d * d much faster than d^2. Try them both and compare.

3. Originally Posted by tony873004
Why are you dividing by d^3? Your acceleration formula should be GM/d^2.
I am using vectors for the acceleration in the x and y directions, so ax=(GM/d^2)*(x/d)=GMx/d^3 and ay=(GM/d^2)*(y/d)=GMy/d^3. How do you do it?

Also, if you want to make it faster, you computer probably does d * d much faster than d^2. Try them both and compare.
Yes, that does make it slightly faster also. I've found that out while experimenting with different variations of writing the same program in different ways, along with a few other "tricks", such as only printing text periodically, since that seems to take an incredible amount of time. But while these can increase it somewhat as far as the program itself is concerned, the lines for the program in this thread can make it hundreds to thousands of times faster (for two-body systems of similar mass, anyway) by allowing one to increase the time increments with the same precision, or up to millions of times more precise for the same running time.

4. Established Member
Join Date
Mar 2005
Posts
1,234
Originally Posted by grav
I am using vectors for the acceleration in the x and y directions, so ax=(GM/d^2)*(x/d)=GMx/d^3 and ay=(GM/d^2)*(y/d)=GMy/d^3. How do you do it?
The same way you did it, at least in the Euler integrator. I just broke it into 2 steps, that's why I didn't recognize your d^3.

Here's a link to a thread on my forum where the Euler integrator source code appears. I compute all the accelerations and velocities first, then in another loop, I update the positions from the newly-computed velocities.
http://www.orbitsimulator.com/cgi-bi...1160874415/6#6
Also, sqrt(x) is faster than (x)^0.5. These little things make a big difference since they're inside a loop that will be executed thousands to millions of times.
Last edited by tony873004; 2007-Apr-15 at 10:35 PM.

5. Well, I saw the program code you referenced. I like the way you saved time by running the loop for the coordinates separately so that it only needs to run through n times for the number of bodies rather than n^2. The thing is, though, I ran the configuration you are using, and it was way off compared to this one. Unless I put wrote yours incorrectly or something, I think you will be very pleased with the new configuration by itself, much less the improvement in the stability of the orbit. I had forgotten I originally experimented with the configuration (the order in which the calculations take place) to come up with the most stable orbit before I even tried to improve on that. The new lines will all have to be run together through the loop and so might appear that they would take more time, but you could probably run it up to hundreds of times faster and still gain a more stable orbit than that one seems to give, so I think you will like it.

I will rewrite the lines to be used and post it. Just to be sure I get it right, though, I have a couple of questions. How do you figure tM and the object mass. It appears that the increments of time are also figured in, but how exactly? And what is 398600440000000?

And just in case this turns out to be a million dollar deal or something (would be nice ), this thread constitutes a copyright.

6. Originally Posted by grav
Just to be sure I get it right, though, I have a couple of questions. How do you figure tM and the object mass. It appears that the increments of time are also figured in, but how exactly? And what is 398600440000000?
Okay, I saw on your forum that 398600440000000 is calculated as G times Earth's mass. But where is the time? I guess it must be figured in separately for the coordinates when setting the graphics, right?

7. Established Member
Join Date
Mar 2005
Posts
1,234
You'll have to tell me where you made the change.

The reason you don't see time step in my code is that it is handled external to this code. When a user increases the time step by 2, the program increases the velocities of the objects by 2 and their mass by 2^2. This saves me from having to do the conversion in the loop, hence saving a lot of time.

8. Originally Posted by tony873004
You'll have to tell me where you made the change.
I just changed these

Code:
```f = (1/D) * (1/D) * tM;
fx = (dx / D) * f;
fy = (dy / D) * f;
fz = (dz / D) * f;
objvx[j] = objvx[j] - fx;
objvy[j] = objvy[j] - fy;
objvz[j] = objvz[j] - fz;

and

f = (1/D) * (1/D) * tM2;
fx = (-dx / D) * f;
fy = (-dy / D) * f;
fz = (-dz / D) * f;
objvx[k] = objvx[k] - fx;
objvy[k] = objvy[k] - fy;
objvz[k] = objvz[k] - fz;```
to these.

Code:
``` f = tM/D/D/D;
objvx[j] = objvx[j] - dx*f;
objvy[j] = objvy[j] - dy*f;
objvz[j] = objvz[j] - dz*f;

and

f = tM2/D/D/D;
objvx[k] = objvx[k] + dx*f;
objvy[k] = objvy[k] + dy*f;
objvz[k] = objvz[k] + dy*f;```
Check that you're using double precision variables. I find it odd that you say your program gives the same results for short simulations, but not long ones.
I'm using UBasic, which gives me up to 2600 digits, but the more I use, the slower it runs. I generally run it to about thirty or fourty decimal places. I think the reason my program runs off like that is because I was finding for the coordinates within the main loop, instead of doing it separately like you are, as well as running a loop for each body individually, so the slight change in coordinates of one body with each run affects the next.

9. Established Member
Join Date
Mar 2005
Posts
1,234
Check that you're using double precision variables. I find it odd that you say your program gives the same results for short simulations, but not long ones.

10. Ah, you've just explained what I couldn't understand.

I got confused when one of you wrote that x*x is faster that x^2, when all the compilers I know would result in the same calculation for both, (unless you force 2 to be real or use an explicit power function)

Try to learn a compiled language, you'll be amazed that the speed increase, and possibly confounded by the lack of effect many of the tweaks such as the ones you're doing has, since the compiler does most of them already.

11. Originally Posted by HenrikOlsen
Ah, you've just explained what I couldn't understand.

I got confused when one of you wrote that x*x is faster that x^2, when all the compilers I know would result in the same calculation for both, (unless you force 2 to be real or use an explicit power function)

Try to learn a compiled language, you'll be amazed that the speed increase, and possibly confounded by the lack of effect many of the tweaks such as the ones you're doing has, since the compiler does most of them already.
Thanks. I had been looking around for another program that might run faster and/or more accurately, but I was experimenting with different variations of formulas for gravity, and Basic is all I know, so I was hoping I could just change the few lines of calculations for gravity from another program to my needs, which wouldn't require much language skill, except for the layout I prefer.

12. Well, it looks like changing the orientation of the calculations only exchanged one problem for another. Instead of the size of the orbits steadily increasing or decreasing with time, they now steadily precess counter to their lines of motion around each other, although the size of the orbits themselves don't change. It seems like there might be some simple way to help resolve one or the other of these.
Last edited by grav; 2007-Apr-16 at 08:36 AM.

13. Established Member
Join Date
Mar 2005
Posts
1,234
Originally Posted by grav
...Instead of the size of the orbits steadily increasing or decreasing with time, they now steadily precess counter to their lines of motion around each other, although the size of the orbits themselves don't change...
My code does that if you use too large of a time step. Take Mercury for example. If you use a time step like 64K seconds, Mercury's orbit noticabely precesses. And it's precessing backwards from the direction it precesses in real life. If you slow the time step down, the precession gets less and less. If you slow it below 2k seconds, you've slowed it past 0, and it is now free to precess in the natural direction. As you slow it further, the rate of precession asymptotically approaches the correct Newtonian-predicted value of about 530 arcseconds per century. Here's a graph I made of time step vs precession in arcseconds per century.

Try running it to 15 decimal places. I doubt you need much more than that. 15 is what my program uses (Double precision). With a slow enough time step, if you propogate Apophis' orbit from today's vectors, to its 2029 approach of Earth, it only misses JPL's predicted close approach distance by about 200 km. That's not too bad considering both Apophis and Earth travelled over 20 billion kilometers each, on completely separate trajectories during this 22 year period to get to their encounter. For comparison, if someone could hit a golf ball from San Francisco to Tijuana, Mexico, this would be like predicting within 1/10th of a millimeter where it would land. This shows you just how powerful a Newtonian-only model, using a simple Euler algorithim with double precision variables and a slow time step can be.

I suspect that the 200 km difference has little to do with my use of only 15 digits. Their simulator takes GR into account, and perhaps more gravitational perturbers from the asteroid belt, as well as non-spherical gravity.

Since you say that UBasic lets you choose how many digits you use, if you get your code working, I'd be interested in seeing a plot of how different number of digits effects the results. How many are required for good results? How many are overkill?

14. Tony,

Is the source code for your simulator available? If it's "trade secrets", then no problem.

The reason I ask is I'd like to run it through Intel's C++ compiler and compare it. That sucker does vectorizing and parallelizing for multi-CPUs (and multi-core and hyperthreading) and I always like to see if it's as good as they like to brag about. It has impressed me with other stuff.

-Richard

15. Established Member
Join Date
Mar 2005
Posts
1,234
Originally Posted by publius
Tony,

You didn't tell me the main program would simply execute the calculation itself if the "codeoptimizer.dll" failed to load...
Yes I did
Originally Posted by tony873004
...BTW, if you want to see a comparison between the speed of VB 6.0 and .NET compiled C++, simply rename codeoptomizer.dll, so VB can't find it. This forces it to use VB to do the math...
But I've just discovered in most circumstances that the C++ dll doesn't speed things up at all, and on some computers it actually slows it down. So I retract my claim of 6x faster that I made above. (One of the biggest hassels of programming is assuming that something will behave on other computers the same way it behaves in my computer.) The only way to see for yourself is to temporarily re-name codeoptomizer.dll and compare the speed of a simulation to its speed using codepotomizer.dll. If you re-compile it using a different compiler, you can probably guage the relative difference between your compiled version and the .NET compiled version.

If it helps you, here's the VB code that calls the .dll.
This declares it. This snipet of code is only executed once:
Code:
```Private Declare Function RunLoop Lib "CodeOptomizer.dll" (ByVal NumObjects As Long, _
ByRef ObjMass As Double, ByRef objx As Double, ByRef objy As Double, _
ByRef objz As Double, ByRef objvx As Double, ByRef objvy As Double, _
ByRef objvz As Double, ByRef objsize As Double, _
ByRef CollisionObjectA As Long, ByRef CollisionObjectB As Long) As Integer```
And this calls it. This code is executed once per iteration.
Code:
`xxx = RunLoop(objects, ObjMass(0), objx(0), objy(0), objz(0), objvx(0), objvy(0), objvz(0), objsize(0), CollisionObjectA(0), CollisionObjectB(0))`
Also, if you make a change to codeoptomizer.dll, either by replacing it with a different version, or by renaming it as to hide it, you must exit Gravity Simulator and start again as it only reads it in once at the beginning of the program, and won't recognize your change.

16. Tony,

Indeed you did tell me. Doh! Read the whole thing, Richard, read the whole thing carefully.............. More later when I get around to it.....

-Richard

17. Is it actually truly called CodeOptomizer?

Reminds me on a time I had to support code originally written by a guy who is severely dyslexic, all the function names where not quite written in English.

18. Originally Posted by grav
...To run it just a little faster, you might try rewriting it as...
Originally Posted by tony873004
Thanks. That make it 8% faster.
If you liked that little bit, you should really like this. It has nothing to do with what I've been working on, but it should more than double your speed, depending on the time consumption of the rest of the program (the more time the rest of the program takes up, the more this will help), with the same accuracy, or you can run it at half of the time intervals for the same speed, but with twice the accuracy. If you're lucky enough that it quadruples it, you can have twice the accuracy and twice the speed. Just type in the three lines in blue. You can experiment with the value of 'qw' to find the point where the speed tops off. But with qw=10, it'll only plot the points one out of every ten times, so you don't want to go over qw=1000 or so to keep a smooth progression in the graphics. Unfortunately, however, the speed tops off for me at about qw=5 at about twice as fast, so I don't have the luxury of worrying about it . I have to go as high as qw=100 just to the speed up from there to about 2.5 . Maybe yours will do better.

Code:
```// CodeOptomizer.cpp : Defines the entry point for the DLL application.
//

#include "stdafx.h"

BOOL APIENTRY DllMain( HANDLE hModule,
DWORD  ul_reason_for_call,
LPVOID lpReserved
)
{
return TRUE;
}

int _stdcall RunLoop(int NumObjects, double * ObjMass, double * objx, double * objy, double * objz, double * objvx, double * objvy, double * objvz, double * objsize, long * CollisionObjectA, long * CollisionObjectB)
{
if (NumObjects < 1) return 0; //Should be 1 or more

double tM;
double tM2;
double dx;
double dy;
double dz;
double D;
double f;
double fx;
double fy;
double fz;
int Collisions;
Collisions = 0;
qw=10;
for (int qq=1;qq<=qw;qq++) {
for (int k=1;k<=NumObjects;k++) {
tM = ObjMass[k] * 398600440000000;
for (int j=k+1;j<=NumObjects;j++) {
dx = objx[j] - objx[k];
dy = objy[j] - objy[k];
dz = objz[j] - objz[k];
D = sqrt(dx*dx+dy*dy+dz*dz);
if  (((objsize[k] + objsize[j])/2) > D) {
Collisions = Collisions + 1;
CollisionObjectA[Collisions] = j;
CollisionObjectB[Collisions] = k;
}
if (tM > 0) {
f = tM/D/D/D;
objvx[j] = objvx[j] - dx*f;
objvy[j] = objvy[j] - dy*f;
objvz[j] = objvz[j] - dz*f;
}
tM2 = ObjMass[j] * 398600440000000;
if (tM2 > 0) {
f = tM2/D/D/D;
objvx[k] = objvx[k] + dx*f;
objvy[k] = objvy[k] + dy*f;
objvz[k] = objvz[k] + dz*f;
}

}
}

for (int h=0;h<=NumObjects;h++) {
objx[h] = objx[h] + objvx[h];
objy[h] = objy[h] + objvy[h];
objz[h] = objz[h] + objvz[h];
}
}

//End of loop computations ------
return 1; //Finished successfully

}```
Last edited by grav; 2007-Apr-18 at 06:19 PM.

19. Established Member
Join Date
Mar 2005
Posts
1,234
Thanks, I'll try this later today when I get home. I already added your previous suggestion (8%, which sometimes is as much as 14%) to the lastet beta version. Together with other changes, it's now 60% faster. The link is in my web forum. I'm anxious to try your new idea.

20. If your program keeps track of the total time that has elapsed since the simulation started so that it can be displayed and so forth, that will have to be multiplied by 'qw' as well, so you actually might want to put the line " qw=10; " in the very beginning of your program.

21. Established Member
Join Date
Mar 2005
Posts
1,234
Looking at your code, this is already implemented in an unreleased beta version Gravity Simulator. But in the released version, I take it a step farther. There's a plot / don't plot button. When not plotting, it only crunches numbers and is 10-20 times faster when simulating a few objects. But when the number of objects grows, the computation time rises quadratically, while the plotting time rises linearly. So the speed improvement asymptotically approaches 0% as the number of objects rise. I believe at about 50 objects, the improvement is negligable.

22. Originally Posted by tony873004
Looking at your code, this is already implemented in an unreleased beta version Gravity Simulator. But in the released version, I take it a step farther. There's a plot / don't plot button. When not plotting, it only crunches numbers and is 10-20 times faster when simulating a few objects. But when the number of objects grows, the computation time rises quadratically, while the plotting time rises linearly. So the speed improvement asymptotically approaches 0% as the number of objects rise. I believe at about 50 objects, the improvement is negligable.
Oh, well then, back to the drawing boards.

23. Established Member
Join Date
Mar 2005
Posts
1,234
Originally Posted by grav
Oh, well then, back to the drawing boards.
It was still a great idea. I've tweaked the code many times trying to squeeze as much speed out of it as possible. That's why its nice to still find some room for improvement like your d/d/d idea.

24. Tony,

Just some musings: what I'd like to do is get the whole thing off into C code (or Fortran -- I've got that too) and let Intel's compiler have at the whole thing.

Off the top of my head, what I would do is just a launch a thread to crank the integrations, letting it run wide open. It would fill a little array (of a certain step size) with the position info, and that array would be like a circular buffer. The plotting program would be in a separate thread and would just peek at the results and plot them at its own pace.

If the compiler could see the whole picture, especially for a large number of objects, it could vectorize better (SIMD instructions). Now the problem is, for SSE only, SIMD is single precision only. It requires SSE2 to do vector double precision operations.

Get me the whole loop off in C, and I'll bet Intel's compiler could go to down using SSE2 instructions.

-Richard

25. Originally Posted by Intel Compiler

CodeOptimizer.cpp
icl: Command line warning: overriding '/Qvc8' with '/Qvc7.1'
.\CodeOptimizer.cpp(29) : (col. 2) remark: loop was not vectorized: not inner loop.
.\CodeOptimizer.cpp(31) : (col. 3) remark: loop was not vectorized: existence of vector dependence.
.\CodeOptimizer.cpp(64) : (col. 2) remark: loop was not vectorized: vectorization possible but seems inefficient.
I'll take it's word for the latter, but it sees a data dependency that is preventing it from vectoring the inner loop. Can that be fixed? I'll knock around.......

-Richard

26. Established Member
Join Date
Mar 2005
Posts
1,234
Just to test the speed difference between the 2 compilers, you could always just hard-code the starting variables, creating a stand-alone program:

numobjects=2;
CollisionObjectA=0;
CollisionObjectB=0;
objx[1]=1;
objy[1]=1;
objz[1]=1;
objvx[1]=1;
objvy[1]=1;
objvz[1]=1;
objmass[1]=332000;
objx[2]=150000000000;
objy[2]=0;
objz[2]=0;
objvx[2]=0;
objvy[2]=30000;
objvz[2]=0;
objmass[2]=1;

This approximately gives you the Earth around the Sun.

Then just add a counter and a way for you to break out of the loop.

27. Tony,

Thanks. First though, I like to attack trying to re-write to get it to vectorize -- ie be able to use the SIMD, single instruction multiple data instructions to try to do some of the work in parallel.

Thinking about it, the larger time loop can NOT parallel, because the results of each iteration depend on the previous one -- you just can't split that up. However, for a large number of objects, we ought to be able to get some of that vectorized.

What we want to do is arrange it so each iteration doesn't depend on the results of the previous one. The way it's written now is what is killing that.

-Richard

28. Tony,

Oh, BTW, I got that .dll compiled with Intel's compiler (C/C++ v9.1) and working. I compiled several versions using different instruction set possibilities. Anyway, you might like to try out the one with SSE2 instructions. PM me and I'll e-mail you a copy. It's only 48KB in size.

It requires a CPU with SSE2 capability and will NOT gracefully fail if the CPU doesn't have it, but will fault out.

Note this is not vectorized, just doing what it could with the code as is using the SSE2 instructions.

I haven't had much time to really play with it, but it will take some modifying to allow vectorization, and, not being any expert on this, it's gonna take some thought. How can we modify that loop to allow parallel execution of different iterations.......... If succesful, the results will blow you away.

-Richard

29. Go for parallel computation of different bodies within the same iteration, that way you won't have data dependencies across the parallel processes.
Granted, you may actually see degradation of speed for a two body simulation, but for more it should shine.

30. Established Member
Join Date
Mar 2005
Posts
1,234
Originally Posted by HenrikOlsen
Go for parallel computation of different bodies within the same iteration, that way you won't have data dependencies across the parallel processes.
Granted, you may actually see degradation of speed for a two body simulation, but for more it should shine.
I don't know how to do that. Does this require a multiple processors?

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•