TCX/HQ classifier: BASOP increases HQ usage for stereo

Basic info

  • Float reference:
    • Encoder (float): f2eb9e08 (Thu Mar 6 16:21:04 2025 +0000)
    • Decoder (float): f2eb9e08 (Thu Mar 6 16:21:04 2025 +0000)
  • Fixed point:
    • Encoder (fixed): 7f488e0a (Thu Mar 13 04:19:03 2025 +0000)
    • Decoder (fixed): 7f488e0a (Thu Mar 13 04:19:03 2025 +0000)

Bug description

The BASOP encoder selects the HQ core more often than the float reference. When running stereo at 32 kbps on a longer database (~3 minutes), the statistics look like this:

image

I ran a test on 166 music files and got a similar result: the BASOP encoder tends to select the HQ core more often. The following plots show the core decision for float (FL) and fixed point (FX) for the three items with the largest increase in HQ core usage.

m017 test item

image

m091 test item

image

m101 test item

image

The TCX/HQ mdct_classifier was not changed from EVS to IVAS, so the difference is somewhat unexpected. I noted, however, that an IVAS-specific variant has been created. I am not sure why this was needed, since the input seems to use the same precision.

I have also tried extracting the down-mix signal and running it in ISM 1 NULL, but in that case the HQ usage did not increase. However, I extracted the down-mix from the floating-point code; perhaps I should extract it from the BASOP code instead and repeat the experiment.

Ways to reproduce

You need to compile the BASOP main and ivas-float-update reference with DEBUGGING and DEBUG_MODE_INFO enabled.

Box folder: ...\Box_EXTERNAL_IVAS_BASOP_VERIFICATION\issues\issue-xxxx

IVAS_cod_ref -stereo 32000 48 m017.wav tmp.bs
mv res/core core_ref
IVAS_cod -stereo 32000 48 m017.wav tmp.bs
mv res/core core_dut

The core_ref and core_dut files can be loaded, e.g. in MATLAB, and inspected. Note that DEBUG_MODE_INFO repeats each value 960 times to match the length of the audio frame.
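As an alternative to MATLAB, the comparison could be sketched in Python along the following lines. This is only a rough sketch under assumptions that are not confirmed in this issue: that the dump files are raw 16-bit integers in native byte order, and that the HQ core is coded as 3. The function names and the HQ_CORE constant are hypothetical; only the repetition factor of 960 comes from the description above.

```python
from array import array

REPEAT = 960  # from the issue: each per-frame value is repeated 960 times
HQ_CORE = 3   # ASSUMPTION: numeric code used for the HQ core in the dump


def read_dump(path, typecode="h"):
    """Read a raw dump file. ASSUMPTION: 16-bit integers, native byte order."""
    values = array(typecode)
    with open(path, "rb") as f:
        values.frombytes(f.read())
    return values


def decimate(samples, repeat=REPEAT):
    """Keep one value per frame by taking every `repeat`-th sample."""
    return list(samples[::repeat])


def hq_share(frames, hq_code=HQ_CORE):
    """Fraction of frames whose core decision equals `hq_code`."""
    if not frames:
        return 0.0
    return sum(1 for c in frames if c == hq_code) / len(frames)


def changed_frames(ref, dut):
    """Indices of frames where the two core decisions differ."""
    return [i for i, (r, d) in enumerate(zip(ref, dut)) if r != d]


def compare(ref_path, dut_path):
    """Summarize HQ usage for a reference dump and a device-under-test dump."""
    ref = decimate(read_dump(ref_path))
    dut = decimate(read_dump(dut_path))
    return {
        "hq_share_ref": hq_share(ref),
        "hq_share_dut": hq_share(dut),
        "changed": changed_frames(ref, dut),
    }
```

Calling compare("core_ref", "core_dut") after the commands above would then return the HQ share for each encoder and the frame indices where the decisions differ. If the dump uses a different sample type or core numbering, the typecode and HQ_CORE values would need to be adjusted.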

Edited by norvell