Speech / music classification differs between float and fixed point

Basic info

The output of the speech/music classifier differs between float and fixed point, leading to different core-coder usage and thus different quality.

In the figure below, from top to bottom:

IVAS_cod -STEREO 13200 48 .\scripts\testv\stvST48c.wav bit13200

IVAS_dec STEREO 48 bit13200 syn.wav

The differences shown above are for stereo encoding, but similar observations were made for -ISM 1.
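To put a number on the divergence rather than relying on visual inspection, the per-frame classifier decisions of the two builds could be compared directly. The following is only a minimal sketch: it assumes the decisions have already been dumped from each encoder as one integer per frame (0 = speech, 1 = music). The dump mechanism, the function name `compare_decisions`, and the sample values are all hypothetical and not part of the IVAS code base.

```python
from typing import List, Sequence, Tuple


def compare_decisions(
    float_dec: Sequence[int], fixed_dec: Sequence[int]
) -> Tuple[List[int], float]:
    """Return the indices of frames whose decisions differ and the mismatch ratio.

    Only the frames present in both sequences are compared.
    """
    n = min(len(float_dec), len(fixed_dec))
    mismatches = [i for i in range(n) if float_dec[i] != fixed_dec[i]]
    ratio = len(mismatches) / n if n else 0.0
    return mismatches, ratio


if __name__ == "__main__":
    # Hypothetical per-frame decisions (0 = speech, 1 = music), for illustration only.
    float_track = [0, 0, 1, 1, 1, 0]
    fixed_track = [0, 1, 1, 1, 0, 0]
    frames, ratio = compare_decisions(float_track, fixed_track)
    print(f"mismatching frames: {frames}, mismatch ratio: {ratio:.2f}")
```

The list of mismatching frame indices can then be lined up with the time axis of the figure to check that the quality differences coincide with the frames where the classifier decisions diverge.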
