Encoder downmix signal exhibits localised gain issue when input FS == 48 kHz

Basic info

Float reference:
- Encoder (float):
- Decoder (float):
Fixed point:
- Encoder (fixed): a00cb1a4
- Decoder (fixed):

Bug description

While looking into #1317 (closed) and trying to understand why the coding mode differs between 48 kHz and 16 kHz input sampling frequencies, I observed that, for the 48 kHz input, the signal right after the downmix sometimes shows a 6 dB decrease in energy (3.42s:3.44s, 5.28s:5.32s, 5.58s:5.62s and 6.81s:6.84s).

The first figure shows the 48 kHz and 16 kHz inputs down-sampled to 12.8 kHz and de-normalized (the effect of the Q_new scaling is removed from both signals).
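For reference, the de-normalization mentioned above can be sketched as follows. This is only an illustration, not the code used to produce the figures: it assumes Q_new is the number of fractional bits of the fixed-point samples (value = integer / 2^Q) and that it may change from frame to frame. The function names `denormalize` and `denormalize_frames` are hypothetical.

```python
def denormalize(samples_fx, q):
    """Convert fixed-point integers with `q` fractional bits to floats.

    Assumption: the real value is sample / 2**q.
    """
    scale = 2.0 ** -q
    return [s * scale for s in samples_fx]


def denormalize_frames(frames_fx, q_per_frame):
    """De-normalize a list of frames, each with its own Q value,
    and concatenate them into one signal on a common scale."""
    out = []
    for frame, q in zip(frames_fx, q_per_frame):
        out.extend(denormalize(frame, q))
    return out


if __name__ == "__main__":
    # Two frames holding the same integers but scaled with different Q values.
    print(denormalize_frames([[4, 8], [4, 8]], [2, 3]))
```

Once both signals are on a common scale, any remaining level difference between them cannot come from the Q_new scaling itself.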

The second figure focuses on the section from 5.25s to 5.75s. The signals shown are:

1. 16 kHz input **downsampled** to 12.8 kHz
2. 48 kHz input **downsampled** to 12.8 kHz
3. Difference between 1 and 2
4. 16 kHz **downmix** (effect of in_q_temp taken into account)
5. 48 kHz **downmix**, resampled to 16 kHz for comparison (effect of in_q_temp taken into account)
6. Spectrum of the section 5.28s:5.32s for the downmix signals
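A frame-wise energy comparison along the following lines could be used to locate such drops automatically. This is a minimal sketch, not the script behind the figures: it assumes both downmix signals are already de-normalized, time-aligned and at the same sampling rate, and the 20 ms frame length, the 4 dB threshold and the function names are my own choices for illustration. Note that a 6 dB drop in energy corresponds to the amplitude being halved (20*log10(0.5) is about -6.02 dB).

```python
import math


def frame_energy_db(samples, frame_len):
    """Return the mean energy in dB of each full frame of `samples`."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(energy + 1e-12))
    return energies


def find_energy_dips(ref, test, fs, frame_ms=20, threshold_db=4.0):
    """List (start_s, end_s, delta_db) for every frame in which `test`
    is quieter than `ref` by more than `threshold_db`."""
    frame_len = int(fs * frame_ms / 1000)
    ref_db = frame_energy_db(ref, frame_len)
    test_db = frame_energy_db(test, frame_len)
    dips = []
    for i, (r, t) in enumerate(zip(ref_db, test_db)):
        delta = t - r
        if delta < -threshold_db:
            dips.append((i * frame_ms / 1000, (i + 1) * frame_ms / 1000, delta))
    return dips


if __name__ == "__main__":
    fs = 16000
    # Synthetic check: a sine whose amplitude is halved in one 20 ms frame.
    ref = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
    test = list(ref)
    for n in range(3200, 3520):
        test[n] *= 0.5
    for start_s, end_s, delta in find_energy_dips(ref, test, fs):
        print(f"{start_s:.2f}s:{end_s:.2f}s  {delta:.2f} dB")
```

Here `ref` would be the 16 kHz downmix (signal 4) and `test` the 48 kHz downmix resampled to 16 kHz (signal 5).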

Ways to reproduce

Box folder: ...\Box_EXTERNAL_IVAS_BASOP_VERIFICATION\issues*issue-1317*

IVAS_cod -no_delay_cmp -sba 1  13200  16 am5ba1s11_HOA1_16_-16.wav bit
IVAS_cod -no_delay_cmp -sba 1  13200  48 am5ba1s11_HOA1_48_-16.wav bit