[WIP] Resolve "Enable rendering to all output formats for EVS mono and IVAS Stereo bitstreams"
- Related issues: #1419
- Requested reviewers:
Reason why this change is needed
- Currently the decoder crashes with a segfault if an IVAS command line with OutputConf is specified for EVS Mono bitstreams. For IVAS Stereo, rendering to ambisonics outputs and binaural flavours returns an "Invalid output format" error.
- Switching between bitstream formats in the decoder cannot preserve the output configuration, since not all output formats are supported for all input (bitstream) formats.
Description of the change
- Rendering is enabled for:
- Mono to all output formats
- Stereo to all output formats (i.e. extends now to ambisonics and binaural variants)
 
- By default, the upmix for mono/stereo is non-spatial; this means:
- Mono/[Stereo] to ambisonics is routed to W/[W and Y]
- Binaural rendering uses passive upmix (non-diegetic panning to center)/[passthrough]
 
- To enable a spatial upmix, a renderer configuration file must be supplied specifying channel positions (chapter name MSUPMIX):
- This file contains 3 parameters: AZIMUTH[], ELEVATION[] and RADIUS[].
- Radius 0 will be interpreted as omni/non-spatial
- Any other values are used as a spatial position for rendering (1 value required for mono, 2 for stereo)
 
- Technical details:
- Existing renderers are reused for implementation of this functionality. Overall the changes in this MR are really only "plumbing" to get things to work.
- LS setup conversion renderer is used for multichannel outputs
- Ambisonics spherical response (ivas_mc2sba()) or passive upmix is used for ambisonics rendering
- In case of binaural rendering the precedent set by high bitrate multichannel is followed:
- TD object renderer is used for everything except BRIRs, with channel positions set and propagated through hTransSetup.
- CRend is used for BINAURAL_ROOM_IR; if head rotation is enabled, the input sources are rotated using EFAP on the pseudo 7.1+4 layout prior to rendering.
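The MSUPMIX position rules in the description above (one value per channel, radius 0 meaning non-spatial) can be sketched as a small check. This is only an illustration: `classify_upmix` is a hypothetical helper that is not part of the IVAS code base, and the comma-separated argument format is an assumption that does not reflect the actual render config syntax.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the IVAS code base: checks one set of
# MSUPMIX positions against the rules stated in the description.
set -euo pipefail

# classify_upmix <num_channels> <azimuths> <elevations> <radii>
# Lists are comma-separated here for illustration only; the real render
# config syntax may differ.
classify_upmix() {
    local nch=$1
    local -a az el rad
    IFS=',' read -r -a az <<< "$2"
    IFS=',' read -r -a el <<< "$3"
    IFS=',' read -r -a rad <<< "$4"

    # One value per channel: 1 for mono, 2 for stereo.
    if [ "${#az[@]}" -ne "$nch" ] || [ "${#el[@]}" -ne "$nch" ] || [ "${#rad[@]}" -ne "$nch" ]; then
        echo "error: expected $nch value(s) per parameter" >&2
        return 1
    fi

    local i
    for (( i = 0; i < nch; i++ )); do
        # Radius 0 means omni/non-spatial; any other value is a position.
        if awk -v r="${rad[i]}" 'BEGIN { exit !(r + 0 == 0) }'; then
            echo "ch$i: non-spatial"
        else
            echo "ch$i: spatial az=${az[i]} el=${el[i]} r=${rad[i]}"
        fi
    done
}

# Example: stereo input with both channels placed at +/-30 degrees azimuth.
classify_upmix 2 "30,-30" "0,0" "1,1"
```

A check along these lines could be a starting point for the missing validation of user-supplied positions, once the radius behaviour has been verified.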
 
 
⚠️  This work is still in progress! Outstanding tasks:
- Split rendering Mono with LC3plus to BINAURAL_SPLIT_CODED has clicks (_PCM is OK)
- External renderer implementation is missing -> will be dealt with in another issue
- VoIP mode untested
- Validation of user-input positions in render config file missing
- Split rendering outputs are currently crashing
- Rendering mono/stereo to ambisonics needs to be changed to default to a passive upmix (currently the spatial option is enabled by default)
- Radius functionality is not yet verified/implemented
Affected operating points
- Decoding a mono bitstream when specifying output format
- Decoding a stereo bitstream to binaural variants or ambisonics
Overview of rendering paths
| Input Format | Output Format | Render config supplied? | Rendering path | 
|---|---|---|---|
| Mono | Mono | N/A | Not changed in this MR (direct decoding) | 
| Mono | Stereo | N/A | Not changed in this MR (Non-diegetic upmix) | 
| Mono | Multichannel | N/A | Mixing matrices (ivas_ls_setup_conversion()) | 
| Mono | Ambisonics | N/A | Passthrough to channel index 0 (W/Omni) | 
| Mono | Binaural | NO | Non-diegetic upmix | 
| Mono | Binaural (ROOM_IR) | NO | Non-diegetic upmix | 
| Mono | Binaural | YES | TD Object renderer, position specified via renderer config | 
| Mono | Binaural (ROOM_IR) | YES | CRend, position specified by render config but only 0,0 is supported | 
| Stereo | Mono | N/A | Not changed in this MR (Directly handled by CPE decoding) | 
| Stereo | Stereo | N/A | Not changed in this MR (passthrough/direct decoding) | 
| Stereo | Multichannel | N/A | Not changed in this MR (ivas_ls_setup_conversion()) | 
| Stereo | Ambisonics | N/A | M/S routing to W and Y (W/mid = $\frac{L+R}{2}$; Y/side = $\frac{L-R}{2}$) | 
| Stereo | Binaural | NO | Passthrough as Stereo | 
| Stereo | Binaural (ROOM_IR) | NO | Passthrough as Stereo | 
| Stereo | Binaural | YES | TD Object renderer, positions specified via render config | 
| Stereo | Binaural (ROOM_IR) | YES | CRend, positions specified by render config but will snap to ±30 or ±90 azimuth with zero elevation | 
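The Stereo to Ambisonics row above is a mid/side transform. Written out as a worked form, under two assumptions that the table does not state: Y sits at ACN channel index 1 (the table only gives index 0 for W), and all remaining ambisonics channels stay silent.

```latex
\[
\begin{aligned}
W &= \frac{L+R}{2}, & Y &= \frac{L-R}{2}, \\
L &= W + Y,         & R &= W - Y .
\end{aligned}
\]
```

The second line shows that the routing is invertible, so the original stereo pair can be recovered from W and Y. A centred source ($L = R$) gives $Y = 0$, which matches the mono case where the signal is passed to W only.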
Commandlines for testing and review (also in zip file below).
Test script
#!/usr/bin/bash
set -euxo pipefail
### Mono upmix testing ###
# generate bitstream
../IVAS_cod 128000 48 ../scripts/testv/stv48c.wav mono.192
# mono to multichannel
../IVAS_dec -evs 5_1 48 mono.192 mono_to_51.wav
# mono to ambisonics
../IVAS_dec -evs FOA 48 mono.192 mono_to_FOA.wav
# mono to binaural formats (nonspatial by default)
../IVAS_dec -evs BINAURAL 48 mono.192 mono_to_hrir_dry.wav
../IVAS_dec -evs BINAURAL_ROOM_REVERB 48 mono.192 mono_to_reverb_dry.wav
../IVAS_dec -evs BINAURAL_ROOM_IR 48 mono.192 mono_to_brir_dry.wav
# mono to binaural formats with spatial upmix
../IVAS_dec -evs -render_config mono.txt BINAURAL 48 mono.192 mono_to_hrir_spatial.wav
../IVAS_dec -evs -render_config mono.txt BINAURAL_ROOM_REVERB 48 mono.192 mono_to_reverb_spatial.wav
../IVAS_dec -evs -render_config mono.txt BINAURAL_ROOM_IR 48 mono.192 mono_to_brir_spatial.wav
### Mono split rendering ###
## BINAURAL_SPLIT_CODED
# LCLD
# 0DOF@256kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LCLD_0dof.txt BINAURAL_SPLIT_CODED 48 mono.192 mono_split_LCLD_0dof.192
../ISAR_post_rend -i mono_split_LCLD_0dof.192 -if BINAURAL_SPLIT_CODED -o mono_split_0dof_LCLD_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LCLD_2dof.txt BINAURAL_SPLIT_CODED 48 mono.192 mono_split_LCLD_2dof.192
../ISAR_post_rend -i mono_split_LCLD_2dof.192 -if BINAURAL_SPLIT_CODED -o mono_split_2dof_LCLD_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LCLD_3dofhq.txt BINAURAL_SPLIT_CODED 48 mono.192 mono_split_LCLD_3dofhq.192
../ISAR_post_rend -i mono_split_LCLD_3dofhq.192 -if BINAURAL_SPLIT_CODED -o mono_split_3dofhq_LCLD_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# LC3plus
# 0DOF@256kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LC3plus_0dof.txt BINAURAL_SPLIT_CODED 48 mono.192 mono_split_LC3plus_0dof.192
../ISAR_post_rend -i mono_split_LC3plus_0dof.192 -if BINAURAL_SPLIT_CODED -o mono_split_0dof_LC3plus_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LC3plus_2dof.txt BINAURAL_SPLIT_CODED 48 mono.192 mono_split_LC3plus_2dof.192
../ISAR_post_rend -i mono_split_LC3plus_2dof.192 -if BINAURAL_SPLIT_CODED -o mono_split_2dof_LC3plus_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LC3plus_3dofhq.txt BINAURAL_SPLIT_CODED 48 mono.192 mono_split_LC3plus_3dofhq.192
../ISAR_post_rend -i mono_split_LC3plus_3dofhq.192 -if BINAURAL_SPLIT_CODED -o mono_split_3dofhq_LC3plus_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
## BINAURAL_SPLIT_PCM
# LCLD
# 0DOF@256kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LCLD_0dof.txt -om mono_split_LCLD_0dof.md BINAURAL_SPLIT_PCM 48 mono.192 mono_split_LCLD_0dof.192
../ISAR_post_rend -i mono_split_LCLD_0dof.192 -im mono_split_LCLD_0dof.md -if BINAURAL_SPLIT_PCM -o mono_split_0dof_LCLD_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LCLD_2dof.txt -om mono_split_LCLD_2dof.md BINAURAL_SPLIT_PCM 48 mono.192 mono_split_LCLD_2dof.192
../ISAR_post_rend -i mono_split_LCLD_2dof.192 -im mono_split_LCLD_2dof.md -if BINAURAL_SPLIT_PCM -o mono_split_2dof_LCLD_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LCLD_3dofhq.txt -om mono_split_LCLD_3dofhq.md BINAURAL_SPLIT_PCM 48 mono.192 mono_split_LCLD_3dofhq.192
../ISAR_post_rend -i mono_split_LCLD_3dofhq.192 -im mono_split_LCLD_3dofhq.md -if BINAURAL_SPLIT_PCM -o mono_split_3dofhq_LCLD_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# LC3plus
# 0DOF@256kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LC3plus_0dof.txt -om mono_split_LC3plus_0dof.md BINAURAL_SPLIT_PCM 48 mono.192 mono_split_LC3plus_0dof.192
../ISAR_post_rend -i mono_split_LC3plus_0dof.192 -im mono_split_LC3plus_0dof.md -if BINAURAL_SPLIT_PCM -o mono_split_0dof_LC3plus_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LC3plus_2dof.txt -om mono_split_LC3plus_2dof.md BINAURAL_SPLIT_PCM 48 mono.192 mono_split_LC3plus_2dof.192
../ISAR_post_rend -i mono_split_LC3plus_2dof.192 -im mono_split_LC3plus_2dof.md -if BINAURAL_SPLIT_PCM -o mono_split_2dof_LC3plus_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -evs -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config mono_split_LC3plus_3dofhq.txt -om mono_split_LC3plus_3dofhq.md BINAURAL_SPLIT_PCM 48 mono.192 mono_split_LC3plus_3dofhq.192
../ISAR_post_rend -i mono_split_LC3plus_3dofhq.192 -im mono_split_LC3plus_3dofhq.md -if BINAURAL_SPLIT_PCM -o mono_split_3dofhq_LC3plus_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
### Stereo upmix testing ###
../IVAS_cod -stereo 256000 48 ../scripts/testv/stvST48c.wav stereo.192
# stereo to ambisonics
../IVAS_dec FOA 48 stereo.192 stereo_to_FOA.wav
# stereo to binaural formats (nonspatial by default)
../IVAS_dec BINAURAL 48 stereo.192 stereo_to_hrir_dry.wav
../IVAS_dec BINAURAL_ROOM_REVERB 48 stereo.192 stereo_to_reverb_dry.wav
../IVAS_dec BINAURAL_ROOM_IR 48 stereo.192 stereo_to_brir_dry.wav
# stereo to binaural formats with spatial upmix
../IVAS_dec -render_config stereo.txt BINAURAL 48 stereo.192 stereo_to_hrir_spatial.wav
../IVAS_dec -render_config stereo.txt BINAURAL_ROOM_REVERB 48 stereo.192 stereo_to_reverb_spatial.wav
../IVAS_dec -render_config stereo.txt BINAURAL_ROOM_IR 48 stereo.192 stereo_to_brir_spatial.wav
### Stereo split rendering ###
## BINAURAL_SPLIT_CODED
# LCLD
# 0DOF@256kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LCLD_0dof.txt BINAURAL_SPLIT_CODED 48 stereo.192 stereo_split_LCLD_0dof.192
../ISAR_post_rend -i stereo_split_LCLD_0dof.192 -if BINAURAL_SPLIT_CODED -o stereo_split_0dof_LCLD_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LCLD_2dof.txt BINAURAL_SPLIT_CODED 48 stereo.192 stereo_split_LCLD_2dof.192
../ISAR_post_rend -i stereo_split_LCLD_2dof.192 -if BINAURAL_SPLIT_CODED -o stereo_split_2dof_LCLD_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LCLD_3dofhq.txt BINAURAL_SPLIT_CODED 48 stereo.192 stereo_split_LCLD_3dofhq.192
../ISAR_post_rend -i stereo_split_LCLD_3dofhq.192 -if BINAURAL_SPLIT_CODED -o stereo_split_3dofhq_LCLD_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# LC3plus
# 0DOF@256kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LC3plus_0dof.txt BINAURAL_SPLIT_CODED 48 stereo.192 stereo_split_LC3plus_0dof.192
../ISAR_post_rend -i stereo_split_LC3plus_0dof.192 -if BINAURAL_SPLIT_CODED -o stereo_split_0dof_LC3plus_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LC3plus_2dof.txt BINAURAL_SPLIT_CODED 48 stereo.192 stereo_split_LC3plus_2dof.192
../ISAR_post_rend -i stereo_split_LC3plus_2dof.192 -if BINAURAL_SPLIT_CODED -o stereo_split_2dof_LC3plus_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LC3plus_3dofhq.txt BINAURAL_SPLIT_CODED 48 stereo.192 stereo_split_LC3plus_3dofhq.192
../ISAR_post_rend -i stereo_split_LC3plus_3dofhq.192 -if BINAURAL_SPLIT_CODED -o stereo_split_3dofhq_LC3plus_coded.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
## BINAURAL_SPLIT_PCM
# LCLD
# 0DOF@256kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LCLD_0dof.txt -om stereo_split_LCLD_0dof.md BINAURAL_SPLIT_PCM 48 stereo.192 stereo_split_LCLD_0dof.192
../ISAR_post_rend -i stereo_split_LCLD_0dof.192 -im stereo_split_LCLD_0dof.md -if BINAURAL_SPLIT_PCM -o stereo_split_0dof_LCLD_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LCLD_2dof.txt -om stereo_split_LCLD_2dof.md BINAURAL_SPLIT_PCM 48 stereo.192 stereo_split_LCLD_2dof.192
../ISAR_post_rend -i stereo_split_LCLD_2dof.192 -im stereo_split_LCLD_2dof.md -if BINAURAL_SPLIT_PCM -o stereo_split_2dof_LCLD_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LCLD_3dofhq.txt -om stereo_split_LCLD_3dofhq.md BINAURAL_SPLIT_PCM 48 stereo.192 stereo_split_LCLD_3dofhq.192
../ISAR_post_rend -i stereo_split_LCLD_3dofhq.192 -im stereo_split_LCLD_3dofhq.md -if BINAURAL_SPLIT_PCM -o stereo_split_3dofhq_LCLD_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# LC3plus
# 0DOF@256kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LC3plus_0dof.txt -om stereo_split_LC3plus_0dof.md BINAURAL_SPLIT_PCM 48 stereo.192 stereo_split_LC3plus_0dof.192
../ISAR_post_rend -i stereo_split_LC3plus_0dof.192 -im stereo_split_LC3plus_0dof.md -if BINAURAL_SPLIT_PCM -o stereo_split_0dof_LC3plus_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 2DOF@512kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LC3plus_2dof.txt -om stereo_split_LC3plus_2dof.md BINAURAL_SPLIT_PCM 48 stereo.192 stereo_split_LC3plus_2dof.192
../ISAR_post_rend -i stereo_split_LC3plus_2dof.192 -im stereo_split_LC3plus_2dof.md -if BINAURAL_SPLIT_PCM -o stereo_split_2dof_LC3plus_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
# 3DOFHQ@768kbps
../IVAS_dec -T ../scripts/trajectories/full_circle_in_15s_delayed.csv -render_config stereo_split_LC3plus_3dofhq.txt -om stereo_split_LC3plus_3dofhq.md BINAURAL_SPLIT_PCM 48 stereo.192 stereo_split_LC3plus_3dofhq.192
../ISAR_post_rend -i stereo_split_LC3plus_3dofhq.192 -im stereo_split_LC3plus_3dofhq.md -if BINAURAL_SPLIT_PCM -o stereo_split_3dofhq_LC3plus_pcm.wav -T ../scripts/trajectories/full_circle_in_15s.csv -fs 48
Zip file with scripts and input render config files. Unzip in IVAS root or adjust paths accordingly.
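The split-rendering part of the test script repeats the same decoder/post-renderer pair for every combination of output format, codec and DOF setting. A minimal sketch of how those command lines could be generated in a loop is shown below. `gen_split_cmds` is a hypothetical helper name; it only prints the commands (it does not run them) and uses the same file names and flags as the script.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) the split-rendering command pairs for one input type.
set -euo pipefail

TRAJ_DIR=../scripts/trajectories   # same relative path as in the test script

# gen_split_cmds <input> [decoder flags]
#   input: "mono" or "stereo"; decoder flags: "-evs" for mono, none for stereo
gen_split_cmds() {
    local input=$1 dec_flags=${2:-}
    local fmt codec dof stem md_out md_in suffix
    for fmt in CODED PCM; do
        for codec in LCLD LC3plus; do
            for dof in 0dof 2dof 3dofhq; do
                stem="${input}_split_${codec}_${dof}"
                md_out="" md_in="" suffix="coded"
                # Only the PCM variant passes a metadata file between the tools.
                if [ "$fmt" = PCM ]; then
                    md_out=" -om ${stem}.md"
                    md_in=" -im ${stem}.md"
                    suffix="pcm"
                fi
                echo "../IVAS_dec${dec_flags:+ $dec_flags} -T $TRAJ_DIR/full_circle_in_15s_delayed.csv -render_config ${stem}.txt${md_out} BINAURAL_SPLIT_${fmt} 48 ${input}.192 ${stem}.192"
                echo "../ISAR_post_rend -i ${stem}.192${md_in} -if BINAURAL_SPLIT_${fmt} -o ${input}_split_${dof}_${codec}_${suffix}.wav -T $TRAJ_DIR/full_circle_in_15s.csv -fs 48"
            done
        done
    done
}

gen_split_cmds mono -evs
gen_split_cmds stereo
```

To execute the commands, the output can be piped into a shell, for example `gen_split_cmds mono -evs | bash -ex`.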
Edited  by Archit Tamarapu