Vector Class Discussion

Converting between the Vectors - Piece - 2012-12-30

Converting between the Vectors - Agner - 2012-12-30

Converting between the Vectors

Author: Piece Date: 2012-12-30 06:13

Hello,

Great library. After implementing it in my program, I got a speedup by a factor of about 30. But I wonder if I overlooked something. My code does not look very good where I try to convert a Vec8s to a Vec8f. Just take a look at this code:

    ....
    Vec8f weight_cap_f, weights, cap_weights;
    short val[8];
    short current_hit;
    .....

    inline float stats::weight()
    {
        // get total float vector
        Vec8f total_f = to_float( Vec8i( current_hit, val[1], val[2], val[3], val[4], val[5], val[6], val[7] ) );
        // find overcap values
        Vec8f over_cap = ( total_f - weight_cap_f ) * to_float( reinterpret_i( total_f > weight_cap_f ) );
        // weight it
        Vec8f sum = ( total_f + over_cap ) * weights - over_cap * cap_weights;
        // sum it all up
        return horizontal_add( sum );
    }

What I am trying to do is basically this: I have a (short) number (e.g. 750), a (short) cap (e.g. 500) and two (float) weights (e.g. 4.0 and 2.0). I weight everything up to the cap with 4.0 and everything above it with 2.0 (= 500 * 4.0 + 250 * 2.0 = 2500.0).
All of that is done over 8 vector elements. Can I optimize this code using your library? My program spends 70% of its time in this method. Maybe there is a way to directly multiply Vec8f and Vec8s?
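
For reference, the per-element rule described above can be sketched in plain scalar C++ as follows. This is only an illustration of the arithmetic, not code from the program or from the library; the function name weighted_value and its parameters are hypothetical.

```cpp
#include <algorithm>

// Hypothetical scalar helper (not part of the vector class library):
// weights the part of `value` up to `cap` with `weight`,
// and the part above `cap` with `cap_weight`.
inline float weighted_value(short value, short cap, float weight, float cap_weight)
{
    const float v = static_cast<float>(value);
    const float c = static_cast<float>(cap);
    const float below = std::min(v, c);         // portion up to the cap
    const float above = std::max(v - c, 0.0f);  // portion above the cap, 0 if none
    return below * weight + above * cap_weight;
}
```

With the numbers from the example, weighted_value(750, 500, 4.0f, 2.0f) evaluates 500 * 4.0 + 250 * 2.0. The vectorized method does the same for 8 elements at once and then adds the 8 results.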

Best regards


Converting between the Vectors

Author: Agner Date: 2012-12-30 12:57


    Vec8f weight_cap_f, weights, cap_weights;
    short val[8];
    short current_hit;
    .....

    inline float stats::weight()
    {
        // load val into vector
        Vec8s val_s = Vec8s().load(val);
        // replace first element by current_hit
        Vec8s total_s = blend<0,9,10,11,12,13,14,15>(Vec8s(current_hit), val_s);
        // convert short to int
        Vec8i total_i = Vec8i(extend_low(total_s),extend_high(total_s));
        // convert to float
        Vec8f total_f = to_float(total_i);
        // find overcap values
        Vec8f over_cap = select(total_f > weight_cap_f, weight_cap_f - total_f, 0.0f);
        // weight it
        Vec8f sum = ( total_f + over_cap ) * weights - over_cap * cap_weights;
        // sum it all up
        return horizontal_add( sum );
    }

You may consider using the same type throughout to avoid the many type conversions, which are expensive. I haven't tested this code, but I think you get the idea.
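
As a rough illustration of what "the same type throughout" could look like, the sketch below stores everything as float from the start, so the hot function needs no short-to-int-to-float conversion. It is written in plain C++ without the library so that it is self-contained; the struct StatsF and its member names are hypothetical, and element 0 of val is assumed to play the role of current_hit. Whether this is faster in practice would have to be measured.

```cpp
#include <cstddef>

// Hypothetical float-only layout (an assumption, not the original class):
// all inputs are kept as float, so weight() does no type conversion.
struct StatsF {
    static constexpr std::size_t N = 8;
    float val[N];          // val[0] stands in for current_hit
    float weight_cap[N];
    float weights[N];
    float cap_weights[N];

    float weight() const
    {
        float sum = 0.0f;
        for (std::size_t i = 0; i < N; ++i) {
            // cap - value (a negative number) when over the cap, else 0;
            // same sign convention as over_cap in the code above
            const float over_cap =
                val[i] > weight_cap[i] ? weight_cap[i] - val[i] : 0.0f;
            sum += (val[i] + over_cap) * weights[i] - over_cap * cap_weights[i];
        }
        return sum;
    }
};
```

With the library, the four arrays would presumably become Vec8f members and the loop body would turn into the select-based expression shown above, with only the to_float step and the integer extension dropped.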

And don't expect me to solve all your programming problems...
