Speech / music classification differs between float and fixed point

Basic info

Bug description

The output of the speech/music classifier differs between float and fixed point, leading to different core-coder usage and thus different quality.

In the figure below, from top to bottom:

  • synthesis of stvST48c.wav
  • float sp_aud_decision0
  • fixed sp_aud_decision0
  • float sp_aud_decision1
  • fixed sp_aud_decision1
  • float sp_aud_decision2
  • fixed sp_aud_decision2

image

Ways to reproduce

IVAS_cod -STEREO 13200 48 .\scripts\testv\stvST48c.wav bit13200
IVAS_dec STEREO 48 bit13200 syn.wav

The differences shown above are for stereo encoding, but similar observations were made for -ISM 1.
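
To quantify the divergence instead of reading it off the plot, the per-frame decisions from both builds could be compared with a small script. This is only a minimal sketch: it assumes each build can dump one integer decision per frame to a text file, and the helpers `load_trace` and `mismatch_runs` as well as the file names in the usage note are hypothetical, not part of the IVAS code base.

```python
from itertools import groupby


def load_trace(path):
    """Read a whitespace-separated list of integer decisions, one per frame.

    Assumption: the dump format is plain text; adapt this to the real dump.
    """
    with open(path) as fh:
        return [int(tok) for tok in fh.read().split()]


def mismatch_runs(float_dec, fixed_dec):
    """Return (first_frame, last_frame) runs where the two traces disagree."""
    if len(float_dec) != len(fixed_dec):
        raise ValueError("traces must have the same number of frames")

    # Frame indices at which float and fixed decisions differ.
    bad = [i for i, (a, b) in enumerate(zip(float_dec, fixed_dec)) if a != b]

    runs = []
    # Consecutive frame indices share the same (index - position) offset,
    # so grouping on that offset yields contiguous runs.
    for _, grp in groupby(enumerate(bad), key=lambda p: p[1] - p[0]):
        frames = [frame for _, frame in grp]
        runs.append((frames[0], frames[-1]))
    return runs


if __name__ == "__main__":
    # Toy traces standing in for the float and fixed sp_aud_decision0 dumps.
    flt = [0, 0, 1, 1, 0, 1]
    fix = [0, 1, 0, 1, 0, 0]
    print(mismatch_runs(flt, fix))  # [(1, 2), (5, 5)]
```

With real dumps this would be called as, for example, `mismatch_runs(load_trace("float_dec0.txt"), load_trace("fixed_dec0.txt"))`, where both file names are placeholders. Reporting runs rather than single frames makes it easier to line the mismatches up with segments of the synthesis.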