Simplification of arithmetic codec subfunctions
The arithmetic codec combines some subroutines with comparators in a very inefficient way.
Example:
```
IF( GT_32( L_multi31x16_X2( range_h, range_l, p[8] ), cum ) )
{
    p = p + 8;
}
```
First, the `Word32` value `range` is split into the `Word16` values `range_h` and `range_l` at a cost of 4 operations. This split is not needed at all.
The function `L_multi31x16_X2` costs 3 operations instead of 1.
The comparison of the product with the `Word32` value `cum` could be merged into a multiply-subtract (MSU) operation at no additional cost.
The costly `IF` macro could be replaced by a plain lower-case `if`.
The new solution (naming still t.b.d.) could then look like this:
```
if (L_msui_32_16(cum, range, p[8]) < 0)
{
    p = p + 8;
    move16();
}
```
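To make the intended semantics concrete, here is a minimal reference sketch in plain C of what `L_msui_32_16` might compute. The function does not exist yet and its name is still t.b.d., so everything here is an assumption: the sketch takes it to return `acc - x * y` saturated to 32 bits, treats `y` as a signed 16-bit value, and omits operation counting. The helper `advance_if_above` is a hypothetical name used only for illustration.

```c
#include <stdint.h>

typedef int16_t Word16;
typedef int32_t Word32;

/* Assumed semantics (not an existing basop): acc - x * y,
 * computed in 64 bits and saturated to the Word32 range. */
static Word32 L_msui_32_16(Word32 acc, Word32 x, Word16 y)
{
    int64_t r = (int64_t)acc - (int64_t)x * (int64_t)y;

    if (r > INT32_MAX)
    {
        return INT32_MAX;
    }
    if (r < INT32_MIN)
    {
        return INT32_MIN;
    }
    return (Word32)r;
}

/* Hypothetical helper mirroring the proposed branch:
 * advance p by 8 entries when range * p[8] > cum. */
static const Word16 *advance_if_above(const Word16 *p, Word32 range, Word32 cum)
{
    /* cum - range * p[8] < 0  is equivalent to  range * p[8] > cum */
    if (L_msui_32_16(cum, range, p[8]) < 0)
    {
        p = p + 8;
    }
    return p;
}
```

Saturation does not change the sign of the result, so under these assumptions the test `< 0` gives the same decision as the original `GT_32` comparison of the product against `cum`.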