Optimize matrix-mul output-format fix.
# Summary
The following code, which appears in all matrix-multiplication operations, can be optimised:
```
if ( L_and( is_zero_arr( outRe_fx[0], size ), is_zero_arr( outIm_fx[0], size ) ) )
{
    *q_out = Q31;
    move16();
}
```
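One possible direction, sketched here under assumptions rather than taken from the code base, is to replace the two separate array scans with a single pass that exits at the first non-zero element. The helper `both_zero_arr`, the wrapper `fix_output_format`, the plain `<stdint.h>` types, and the value `Q31 = 31` are all hypothetical stand-ins used for illustration. The fixed-point basic operators (`L_and`, `move16`) are left out so that the sketch compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

#define Q31 31 /* assumed value of the Q31 constant */

/* Hypothetical helper: returns 1 when both arrays contain only zeros,
 * 0 otherwise. It scans both arrays in one loop and stops at the first
 * non-zero element, instead of scanning each array in full. */
static int both_zero_arr(const int32_t *re, const int32_t *im, size_t size)
{
    for (size_t i = 0; i < size; i++)
    {
        if ((re[i] | im[i]) != 0)
        {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical wrapper mirroring the snippet above: forces the output
 * Q-format to Q31 when the real and imaginary outputs are all zero,
 * and leaves *q_out untouched otherwise. */
static void fix_output_format(const int32_t *re, const int32_t *im,
                              size_t size, int16_t *q_out)
{
    if (both_zero_arr(re, im, size))
    {
        *q_out = Q31;
    }
}
```

Whether this is actually cheaper depends on how `is_zero_arr` is implemented and on how complexity is counted in the target build, so it would need to be measured before being adopted.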