OSBA/OMASA 0.5 normalization at encoder pre-rendering and at decoder rendering

Basic info

  • Float reference:
    • Encoder (float):
    • Decoder (float):
  • Fixed point:

Bug description

For rendering, OSBA applies a scaling factor of 0.5 to the combination of objects and SBA. OMASA has no such factor.

We originally introduced this 0.5 factor in OSBA to avoid clipping (e.g., at the input to the core coder in pre-rendering modes and in the binaural rendering in discrete modes) in cases where the SBA and the object content are each reasonably leveled on their own but leave too little headroom when combined. OMASA possibly assumes that the individual signals have sufficient headroom. (As an emergency measure, we have the limiter in the decoder, but it may not always be reliable.)
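The headroom argument can be illustrated with a minimal float sketch. This is not the codec's code: the sample values and the helper names `mix` and `peak` are made up for illustration, and signals are assumed to be normalized so that full scale is ±1.0.

```python
def mix(obj, sba, gain=1.0):
    """Sum object and SBA contributions sample by sample, then apply a gain."""
    return [gain * (o + s) for o, s in zip(obj, sba)]


def peak(signal):
    """Largest absolute sample value."""
    return max(abs(x) for x in signal)


# Hypothetical samples: each signal is within full scale on its own.
obj = [0.9, -0.8, 0.7]
sba = [0.9, -0.9, 0.6]

print("peak without 0.5 factor:", peak(mix(obj, sba)))
print("peak with 0.5 factor:   ", peak(mix(obj, sba, gain=0.5)))
```

With in-phase content, the plain sum exceeds full scale even though neither input does, while the 0.5 factor bounds the sum of two in-range signals to ±1.0.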

To check the impact of removing the 0.5 normalization in OSBA, the following experiment was done (see "Ways to reproduce").

In the discrete case, without the 0.5 normalization at the binaural renderer, clipping is observed in ivas_syn_output_fx() (even when the limiter is enabled).

In the pre-rendering case, the signal takes values beyond ±1 after pre-rendering (however, the data is stored in 32-bit buffers with sufficient headroom, and no saturation/clipping is observed at the core coder input). At bitrates of 48 kbps and above, AGC is not present in SBA, and hence the input to the core coder exceeds ±1. I have not checked where exactly clipping may happen in the core coder in this case, but out-of-bound input signals to the core coder are known to cause issues.
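The distinction between "fits in a wide buffer" and "fits in the nominal ±1 range" can be sketched in fixed point. The Q15 format, the sample values and all helper names below are assumptions for illustration, not the actual Q formats or functions used in the BASOP code. Python integers do not overflow, so they stand in for a 32-bit buffer with headroom.

```python
Q = 15  # assumed Q format for this sketch: 1.0 is represented as 1 << 15
INT16_MIN, INT16_MAX = -32768, 32767


def to_q(x, q=Q):
    """Convert a float sample to a Q-format integer."""
    return int(round(x * (1 << q)))


def merge(obj_q, sba_q, shift):
    """Add two Q-format signals in a wide accumulator, then right-shift.

    shift=1 corresponds to the 0.5 normalization, shift=0 to its removal.
    """
    return [(o + s) >> shift for o, s in zip(obj_q, sba_q)]


def saturate16(x):
    """Clamp a wide value to the 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))


def count_out_of_range(samples):
    """Number of samples that would be altered by 16-bit saturation."""
    return sum(1 for x in samples if saturate16(x) != x)


obj_q = [to_q(0.9), to_q(-0.9)]
sba_q = [to_q(0.9), to_q(-0.9)]

print("out of range, shift 0:", count_out_of_range(merge(obj_q, sba_q, 0)))
print("out of range, shift 1:", count_out_of_range(merge(obj_q, sba_q, 1)))
```

The wide buffer holds the unshifted sum without loss, but those samples are already outside the range that a later 16-bit stage can represent.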

Ways to reproduce

Took stvOSBA_1ISM_FOA48c.wav and silenced the Y and Z channels. Normalized the object, W and X channels to full scale.
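A minimal sketch of this preparation step on in-memory 16-bit frames is given below. The channel order (object, W, Y, Z, X), the per-channel peak normalization and the helper name `prepare` are assumptions for illustration; the tool actually used to edit the file is not stated here.

```python
def prepare(frames, silence, full_scale=32767):
    """Zero the channels listed in `silence` and peak-normalize the others.

    frames  : list of tuples, one tuple of integer samples per time instant
    silence : set of channel indices to mute
    """
    n_ch = len(frames[0])
    peaks = [max(abs(f[c]) for f in frames) for c in range(n_ch)]
    out = []
    for f in frames:
        row = []
        for c in range(n_ch):
            if c in silence or peaks[c] == 0:
                row.append(0)
            else:
                # Scale so that this channel's peak lands on full scale.
                row.append(int(f[c] * full_scale / peaks[c]))
        out.append(tuple(row))
    return out


# Hypothetical 5-channel frames, assumed order: object, W, Y, Z, X.
frames = [(100, 50, 7, -3, 25), (-200, 10, -7, 3, -50)]
prepared = prepare(frames, silence={2, 3})
print(prepared)
```

Each kept channel is scaled independently here, so the object, W and X each reach full scale regardless of their original relative levels.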

Used stvOSBA_1ISM_FOA48c_ISM1_dbg.csv, in which the object is panned to 0 degrees azimuth and 0 degrees elevation. This is done so that ISM-to-SBA rendering puts data in the W and X channels only.
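Why a frontal object lands only in W and X can be checked with the standard first-order ambisonic encoding gains. The sketch below assumes SN3D normalization and a helper name `foa_gains` chosen for illustration; the codec's own gain computation may differ in scaling.

```python
import math


def foa_gains(az_deg, el_deg):
    """First-order ambisonic encoding gains (SN3D assumed) for one direction."""
    az = math.radians(az_deg)
    el = math.radians(el_deg)
    return {
        "W": 1.0,
        "X": math.cos(az) * math.cos(el),
        "Y": math.sin(az) * math.cos(el),
        "Z": math.sin(el),
    }


print(foa_gains(0.0, 0.0))
```

At 0 degrees azimuth and 0 degrees elevation the Y and Z gains vanish, so the rendered object adds fully in phase to the W and X channels, which is the worst case for headroom when those channels are already at full scale.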

To check the impact of the normalization at pre-rendering in the encoder, 32 kbps and 48 kbps bitrates were used, as shown below.

..\IVAS_cod.exe -ism_sba 1 1 osba_in\stvOSBA_1ISM_FOA48c_ISM1_dbg.csv 48000 48 osba_in\stvOSBA_1ISM_FOA48c_WX_norm.wav osba_in\out48.pkt

..\IVAS_cod.exe -ism_sba 1 1 osba_in\stvOSBA_1ISM_FOA48c_ISM1_dbg.csv 32000 48 osba_in\stvOSBA_1ISM_FOA48c_WX_norm.wav osba_in\out32.pkt

Then removed the right shift by 1 in ivas_merge_sba_transports_fx() and generated the 32 and 48 kbps encoded packets again.

..\IVAS_cod.exe -ism_sba 1 1 osba_in\stvOSBA_1ISM_FOA48c_ISM1_dbg.csv 48000 48 osba_in\stvOSBA_1ISM_FOA48c_WX_norm.wav osba_in\out48_no_05norm.pkt

..\IVAS_cod.exe -ism_sba 1 1 osba_in\stvOSBA_1ISM_FOA48c_ISM1_dbg.csv 32000 48 osba_in\stvOSBA_1ISM_FOA48c_WX_norm.wav osba_in\out32_no_05norm.pkt

Then ran the IVAS decoder on the 32 and 48 kbps coded packets:

..\IVAS_dec.exe FOA 48 osba_in\out<>.pkt osba_in\out<>_<>_foa.wav

To check the impact of the normalization at binaural rendering in the decoder, a 512 kbps bitrate was used, as shown below.

..\IVAS_cod.exe -ism_sba 1 1 osba_in\stvOSBA_1ISM_FOA48c_ISM1_dbg.csv 512000 48 osba_in\stvOSBA_1ISM_FOA48c_WX_norm.wav osba_in\out512.pkt

Then ran IVAS decoder

..\IVAS_dec.exe FOA 48 osba_in\out512.pkt osba_in\out512_bin.wav

Then removed the right shift by 1 from ivas_osba_dirac_td_binaural_jbm_fx() and ran the decoder again:

..\IVAS_dec.exe FOA 48 osba_in\out512.pkt osba_in\out512_no_05norm_bin.wav

Input and output files are present here:

Box folder: ...\Box_EXTERNAL_IVAS_BASOP_VERIFICATION\issues\issue-1540

Edited by TYAGIRIS