Optimize matrixTransp1Mul_fx.
Closes #2181
Complexity analysis
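Tiny improvement in the worst-case (max) total, from 138.376 WMOPS to 137.352 WMOPS (about 1.02 WMOPS, or 0.74%). The optimisation is bit-exact and removes many unnecessary shift operations. The saving shows up entirely in ivas_dec_render; all other routines are unchanged.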
Before:
--- Complexity analysis [WMOPS] ---
                                           |------ SELF ------|       |--- CUMULATIVE ---|
routine                        calls      min      max      avg      min      max      avg
---------------------------  -------  -------  -------  -------  -------  -------  -------
ivas_jbm_dec_tc                 1.00    1.891    1.915    1.915   26.345   38.997   29.507
ivas_spar_decode                1.00    1.010    1.041    1.019    2.544    2.684    2.628
ivas_spar_dec_MD                1.00    1.526    1.665    1.609    1.526    1.665    1.609
ivas_sce_dec                    1.00    0.246    0.246    0.246   21.794   34.495   24.964
ivas_core_dec                   1.00    3.152   11.390    7.909   21.548   34.249   24.718
acelp_core_dec                  0.61   12.830   19.132   14.543   12.830   19.132   14.543
ivas_dec_prepare_renderer       1.00    7.068    8.724    7.173    7.068    8.724    7.173
ivas_dec_render                 1.00   77.336   88.004   87.518   81.487   92.262   91.776
ivas_sba_prototype_renderer     4.00    3.922    4.258    4.258    3.922    4.258    4.258
stereo_tcx_core_dec             0.39   17.651   31.015   20.324   17.651   31.015   20.324
---------------------------  -------  -------  -------  -------
total                        1000.00  119.955  138.376  128.456
After:
--- Complexity analysis [WMOPS] ---
                                           |------ SELF ------|       |--- CUMULATIVE ---|
routine                        calls      min      max      avg      min      max      avg
---------------------------  -------  -------  -------  -------  -------  -------  -------
ivas_jbm_dec_tc                 1.00    1.891    1.915    1.915   26.345   38.997   29.507
ivas_spar_decode                1.00    1.010    1.041    1.019    2.544    2.684    2.628
ivas_spar_dec_MD                1.00    1.526    1.665    1.609    1.526    1.665    1.609
ivas_sce_dec                    1.00    0.246    0.246    0.246   21.794   34.495   24.964
ivas_core_dec                   1.00    3.152   11.390    7.909   21.548   34.249   24.718
acelp_core_dec                  0.61   12.830   19.132   14.543   12.830   19.132   14.543
ivas_dec_prepare_renderer       1.00    7.068    8.724    7.173    7.068    8.724    7.173
ivas_dec_render                 1.00   76.312   86.980   86.494   80.463   91.238   90.752
ivas_sba_prototype_renderer     4.00    3.922    4.258    4.258    3.922    4.258    4.258
stereo_tcx_core_dec             0.39   17.651   31.015   20.324   17.651   31.015   20.324
---------------------------  -------  -------  -------  -------
total                        1000.00  118.931  137.352  127.432
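For readers unfamiliar with this kind of change, here is a minimal sketch of one way a shift can be removed bit-exactly: hoisting a shift of a loop-invariant operand out of the inner loops of a transposed matrix product. This is an illustration only and not the actual code of matrixTransp1Mul_fx. The function names, the matrix size N, the data types, and the returned shift counter are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define N 2 /* hypothetical matrix size, chosen for illustration */

/* Reference version: out = (A >> shift)^T * B, with the shift applied
 * inside the innermost loop. Returns the number of shifts executed.
 * Assumes the inputs leave enough headroom that the 64-bit accumulator
 * cannot overflow. */
int transp_mul_ref(const int32_t a[N][N], const int32_t b[N][N],
                   int shift, int64_t out[N][N])
{
    int shifts = 0;
    assert(shift >= 0 && shift < 31);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            int64_t acc = 0;
            for (int k = 0; k < N; k++) {
                /* Same element of A is re-shifted for every column j. */
                int32_t a_s = a[k][i] >> shift;
                shifts++;
                acc += (int64_t)a_s * b[k][j];
            }
            out[i][j] = acc;
        }
    }
    return shifts;
}

/* Optimised version: each element of A is shifted exactly once up front.
 * The multiply-accumulate sees identical operands in identical order,
 * so the result is bit-exact with the reference version. */
int transp_mul_opt(const int32_t a[N][N], const int32_t b[N][N],
                   int shift, int64_t out[N][N])
{
    int32_t a_s[N][N];
    int shifts = 0;
    assert(shift >= 0 && shift < 31);
    for (int k = 0; k < N; k++) {
        for (int i = 0; i < N; i++) {
            a_s[k][i] = a[k][i] >> shift;
            shifts++;
        }
    }
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            int64_t acc = 0;
            for (int k = 0; k < N; k++) {
                acc += (int64_t)a_s[k][i] * b[k][j];
            }
            out[i][j] = acc;
        }
    }
    return shifts;
}
```

In this sketch the shift count drops from N*N*N to N*N while the outputs stay identical. Both versions apply the same `>>` to the same values, so they match each other even though right-shifting a negative signed integer is implementation-defined in C.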
Edited by Nicolas Roussin