Wrong and unstable spatial image at the beginning of active speech for ISM1 with DTX

In the P800-11 Characterization experiment, samples *a1s03* and *a2s06* have a wrong and unstable spatial image. At the beginning of the first sentence of both samples, the spatial image starts at the far left and rapidly moves to approximately the correct position, causing serious spatial instability. As a reminder, P800-11 used DTX and 5% FER.

The issue is still present (re-evaluated on Jan 8), even in the absence of FEs. Worryingly, it is present at all bitrates (verified for every bitrate tested in Characterization, up to the highest bitrate of 128 kbps). The Jan 8 verification used the double concatenated input from `\ivas-processing-scripts\experiments\characterization\P800-11\proc_output\cat?\out_-?6LKFS\preprocessing_2`, coded at 128 kbps.

The issue is present in both FL and FX, and for both BINAURAL and 7.1+4 rendering.
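For the BINAURAL output, the "starts at far left, then moves to the correct position" behaviour could be quantified by tracking the level difference between the left and right channels frame by frame. The snippet below is only a minimal sketch of that idea, not part of the characterization scripts: the function name `ild_track_db`, the 960-sample frame (20 ms at an assumed 48 kHz sampling rate) and the synthetic test signal are all assumptions made for illustration. A real check would load the decoded binaural samples instead of the synthetic tone.

```python
import math


def ild_track_db(left, right, frame_len=960, eps=1e-12):
    """Per-frame level difference (left minus right) in dB.

    Hypothetical helper: frame_len=960 assumes 20 ms frames at 48 kHz.
    Large positive values mean the image leans to the left.
    """
    n = min(len(left), len(right))
    track = []
    for start in range(0, n - frame_len + 1, frame_len):
        e_l = sum(x * x for x in left[start:start + frame_len])
        e_r = sum(x * x for x in right[start:start + frame_len])
        track.append(10.0 * math.log10((e_l + eps) / (e_r + eps)))
    return track


# Synthetic stand-in for a decoded binaural signal (assumption, not real data):
# the right channel is attenuated in the first frames, mimicking an image that
# starts at the left and then settles in the centre.
FS = 48000
FRAME = 960
right_gains = [0.1, 0.3, 0.6] + [1.0] * 7  # one gain per 20 ms frame

left, right = [], []
for k, g in enumerate(right_gains):
    for i in range(FRAME):
        s = math.sin(2.0 * math.pi * 440.0 * (k * FRAME + i) / FS)
        left.append(s)
        right.append(g * s)

track = ild_track_db(left, right, FRAME)
for k, ild in enumerate(track):
    print(f"frame {k}: ILD = {ild:6.2f} dB")
```

Running the same measurement on the output decoded with and without -dtx would show whether the initial left-leaning frames appear only in the DTX case. A level difference alone ignores time-difference cues, so this is a rough indicator rather than a full localization measure.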

The issue seems to be related to DTX: when the double concatenated files are processed without -dtx, the issue disappears.

Edited by vaclav