MDCT-Stereo DTX: inter-channel coherence for perfectly correlated background noise is incorrectly decoded as a negative value
Git SHA: 518d9f24
In ivas_decision_matrix_dec.c:51++, the array get_next_index_4_by_15 used for decoding the inter-channel coherence in MDCT-Stereo DTX is defined. Its type is Word16, but the last value in the array (at index 15) is 32768, which is out of range for a signed 16-bit integer. If a 15 is decoded from the bitstream (indicating a coherence of 1.0 between the two channels of background noise), the overflow in line 547 causes st->hFdCngDec->hFdCngCom->coherence_fx to become -32768, and three lines below st->hFdCngDec->hFdCngCom->coherence_flt is set to -1.0 for the same reason. Later, e.g. in fd_cng_dec.c:1129, sqrt() is called on this negative value, which produces a NaN and triggers an assert further on. The coherence value should always be in [0, 1.0].
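For reference, a standalone sketch of the failure mechanism (not codec code; Word16 is assumed to be int16_t as in the BASOP typedefs, and the variable names only mimic the fields mentioned above):

#include <math.h>
#include <stdint.h>
#include <stdio.h>

typedef int16_t Word16; /* assumption: mirrors the BASOP typedef */

int main( void )
{
    /* A Q15 value of 1.0 would be 32768, which does not fit into Word16;
     * on two's-complement platforms the conversion wraps to -32768. */
    Word16 coherence_fx = (Word16) 32768;
    float coherence_flt = coherence_fx / 32768.0f; /* -1.0 instead of +1.0 */

    printf( "coherence_fx  = %d\n", coherence_fx );            /* -32768 */
    printf( "coherence_flt = %f\n", coherence_flt );           /* -1.000000 */
    printf( "sqrtf()       = %f\n", sqrtf( coherence_flt ) );  /* nan */
    return 0;
}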
This is the cause of the failure of the test case tests/test_param_file_ltv.py::test_param_file_tests[stereo at 48 kbps, 48 kHz in, 48 kHz out, DTX on], see e.g. https://forge.3gpp.org/rep/sa4/audio/ivas-basop/-/jobs/218035. I stumbled over this with other internal test files and only realized later that it also occurs in that test case, so maybe all of this is already known; if not, I hope the analysis helps. In this test case, the problem occurs in frame 7251:
./IVAS_cod -stereo -dtx 48000 48 ltv48_STEREO.wav bit
./IVAS_dec stereo 48 bit out.wav
My attempt at a fix can be found in the attached diff (apply with git apply diff_coherence_decoding.patch). With it the coherence value should be decoded correctly, but applying the diff makes the decoder run into an assert even earlier, so some deeper analysis is needed.
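For illustration only (this is not the attached patch): one way to keep the decoded coherence in [0, 1.0] would be to saturate the Q15 value at 32767 before it is stored, so that index 15 maps to roughly 1.0 instead of wrapping negative; the helper name and the Word32 intermediate below are assumptions for the sketch.

#include <stdint.h>

typedef int16_t Word16; /* assumption: mirrors the BASOP typedefs */
typedef int32_t Word32;
#define MAX_16 ( (Word16) 0x7FFF )

/* Clamp an intermediate Q15 coherence value to [0, 32767] so that the
 * "fully correlated" case decodes to ~1.0 instead of -1.0. */
static Word16 saturate_coherence_q15( Word32 q15_val )
{
    if ( q15_val < 0 )
    {
        return 0;
    }
    if ( q15_val > MAX_16 )
    {
        return MAX_16; /* 32768 -> 32767 */
    }
    return (Word16) q15_val;
}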