Vector Class Discussion

simple ray casting test slower when vectorized
Author: epsilon  Date: 2013-03-02 12:57
Hi

First, let me thank you for providing this very interesting vectorclass library!

I'd like to use the library to accelerate a ray casting application, and I did some preliminary testing.
Unfortunately, the vectorized loop (using Vec3d, Vec4d) runs twice as slow as the non-vectorized version on my Sandy Bridge Xeon, compiled with gcc 4.7:

Vec4d v_ray_pos(ray_pos[0], ray_pos[1], ray_pos[2], 0),
      v_ray_dir(ray_dir[0], ray_dir[1], ray_dir[2], 0);
Vec4d v_pos(0, 0, 0, 0);
double pos[3] = {0, 0, 0};

double t, t0 = 0, t1 = 10000, dt = (t1 - t0) / 500000000.;

// non-vectorized loop
TIMER_START(&tm);
for (t = t0; t < t1; t += dt)
{
    for (k = 0; k < 3; k++) pos[k] += dt * ray_dir[k];
    for (k = 0; k < 3; k++) pos[k] += dt * ray_dir[k];
    for (k = 0; k < 3; k++) pos[k] += dt * ray_dir[k];
}
TIMER_STOP(&tm);

// vectorized loop
TIMER_START(&tm);
for (t = t0; t < t1; t += dt)
{
    v_pos += dt * v_ray_dir;
    v_pos += dt * v_ray_dir;
    v_pos += dt * v_ray_dir;
}
TIMER_STOP(&tm);

Am I using the vectorclass types incorrectly here? If not, what else could explain the poor performance of the vectorized loop compared to the plain array-based version?

Thanks a lot for any help and insights!

P.S.: Apparently gcc did not auto-vectorize the non-vectorized loop (at least it did not report doing so with -ftree-vectorizer-verbose=2).
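
For anyone who wants to reproduce the scalar half of this measurement, here is a minimal self-contained sketch. It is not the exact code I timed: `march_scalar` and `time_seconds` are hypothetical names made up for illustration, `std::chrono` stands in for my TIMER_START/TIMER_STOP macros, and the loop uses an integer step count instead of the floating-point counter `t`, so the number of iterations does not depend on rounding. It needs C++11 (`-std=c++11` on gcc 4.7).

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper mirroring the non-vectorized loop above:
// each iteration advances pos by three steps of size dt along ray_dir.
void march_scalar(double pos[3], const double ray_dir[3], double dt, long steps)
{
    assert(steps >= 0);
    for (long i = 0; i < steps; ++i)
    {
        for (int rep = 0; rep < 3; ++rep)
            for (int k = 0; k < 3; ++k)
                pos[k] += dt * ray_dir[k];
    }
}

// Hypothetical stand-in for the TIMER_START/TIMER_STOP macros:
// runs f once and returns the elapsed wall-clock time in seconds.
template <class F>
double time_seconds(F&& f)
{
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A call such as `time_seconds([&] { march_scalar(pos, ray_dir, dt, 500000000L); })` would correspond to the first timed loop. Printing `pos` afterwards is worth doing, since an optimizing compiler may otherwise drop a loop whose result is never used and make the timing meaningless.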

 
simple ray casting test slower when vectorized - epsilon - 2013-03-02
simple ray casting test slower when vectorized - Agner - 2013-03-03
simple ray casting test slower when vectorized - epsilon - 2013-03-03
simple ray casting test slower when vectorized - Agner - 2013-03-03
simple ray casting test slower when vectorized - chad - 2013-03-14