tune WMOPS of matrix_product_mant_exp_fx (regular case)
Tuned matrix_product_mant_exp_fx (regular case) to lower its WMOPS (about -36.2); results are bit-exact with the previous version.