DFT-Stereo residual decoding difference between flt and fx can lead to tonal artifacts

Tonal artifacts were observed in some frames at 32 kbps stereo (git commit 7ddadf2c). The decoding mode there is DFT-Stereo with residual coding. In ivas_stereo_dft_dec.c:2937++, the residual spectrum is decoded and dequantized; the final decoded residual spectrum is available in res_buf after line 2996. Most of the time the decoded values are close to those from the floating-point decoder, but occasionally a single bin mismatches drastically.

How to reproduce:

Run BASOP codec:

./IVAS_cod -stereo 32000 32 ltv32_STEREO.wav bit
./IVAS_dec stereo 32 bit out_fx.wav

Run the float decoder on the same bitstream:

git clone https://forge.3gpp.org/rep/ivas-codec-pc/ivas-codec.git
make -j -C ivas-codec
./ivas-codec/IVAS_dec stereo 32 bit out_float.wav

The difference between the two output signals starts at 11.40 s (frame 570); see the attached screenshot Screenshot_2024-01-31_at_15.11.05.

In this example, the artifact is barely audible but causes a short low-frequency tone in the decoded output. The issue was also found in other signals, where it was more audible.
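To locate such frames without inspecting a plot, the two decoder outputs can be compared frame by frame. The following is only a minimal sketch: the function names and the threshold are hypothetical, it works on one channel given as a plain list of samples (reading out_fx.wav and out_float.wav is left out), and it assumes 20 ms frames at 32 kHz, which matches 11.40 s corresponding to frame 570.

```python
def frame_peak_diff(a, b, fs=32000, frame_ms=20):
    """Peak absolute sample difference per frame for two sample lists.

    Assumes both lists hold one channel at the same sampling rate.
    """
    frame_len = fs * frame_ms // 1000
    n_frames = min(len(a), len(b)) // frame_len
    peaks = []
    for f in range(n_frames):
        start = f * frame_len
        end = start + frame_len
        peaks.append(max(abs(x - y) for x, y in zip(a[start:end], b[start:end])))
    return peaks


def first_frame_above(peaks, threshold):
    """Index of the first frame whose peak difference exceeds threshold, else None."""
    for f, p in enumerate(peaks):
        if p > threshold:
            return f
    return None


def frame_to_seconds(frame, frame_ms=20):
    """Start time of a frame in seconds."""
    return frame * frame_ms / 1000


# Synthetic demo: three identical frames, with one deviating sample in frame 2.
ref = [0] * (3 * 640)
test = list(ref)
test[2 * 640 + 10] = 500

peaks = frame_peak_diff(test, ref)
bad_frame = first_frame_above(peaks, threshold=100)
print(bad_frame, frame_to_seconds(bad_frame))
```

The threshold has to be chosen per signal, since the fx and float outputs are not bit-exact even in frames without an artifact.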

The problem was not present at 21493ea7; it first appears at 8e418e5c. Because multiple conversions were added at once, I disabled only the fixed-point residual decoding using the attached patch file (patch_disable_eclvq_fx.diff), applied with git apply patch_disable_eclvq_fx.diff.

Comparing the values of res_buf between the fx decoder and the float decoder (or the patched fx decoder) in frame 570 shows a large difference at index 4 (the last value in the arrays listed below). The sign difference between the two values may hint at an overflow (?).

# res_buf from basop decoder:
[-96.4335937, -675.035156, 2217.96875, 675.03125, -12150.6289, ...]
# res_buf from patched decoder (basop decoder with float residual decoding):
[-96.4388656, -675.072083, 2218.09399, 675.072083, 12537.0527, ...]
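A comparison like the one above can be automated when dumping res_buf for many frames. The sketch below is only an illustration: find_mismatches and the 1% relative tolerance are hypothetical choices, and the input is the five values quoted above.

```python
def find_mismatches(fx, flt, rel_tol=0.01):
    """Return (index, fx_value, flt_value, reason) for bins that disagree.

    A bin is flagged when the signs differ, or when the relative error
    against the float reference exceeds rel_tol.
    """
    mismatches = []
    for i, (a, b) in enumerate(zip(fx, flt)):
        sign_flip = a != 0 and b != 0 and (a < 0) != (b < 0)
        rel_err = abs(a - b) / max(abs(b), 1e-12)
        if sign_flip:
            mismatches.append((i, a, b, "sign flip"))
        elif rel_err > rel_tol:
            mismatches.append((i, a, b, "magnitude"))
    return mismatches


# Values from frame 570 as listed in this issue.
res_buf_fx = [-96.4335937, -675.035156, 2217.96875, 675.03125, -12150.6289]
res_buf_flt = [-96.4388656, -675.072083, 2218.09399, 675.072083, 12537.0527]

mismatches = find_mismatches(res_buf_fx, res_buf_flt)
for idx, a, b, reason in mismatches:
    print(f"bin {idx}: fx={a} float={b} ({reason})")
```

On these values only index 4 is flagged. Bins 0 to 3 agree to well under 0.01%, while at index 4 the magnitudes also differ by about 3% in addition to the sign, so the mismatch is not a pure sign flip.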