(C) 2022-2025 IVAS codec Public Collaboration with portions copyright Dolby International AB, Ericsson AB,
Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
@@ -67,16 +67,16 @@ To facilitate the preparation of items for P800-{X} listening tests, it is possi
The YAML configuration file (`scene_description_config_file.yml`) defines how individual mono files should be spatially positioned and combined into the target format. For advanced formats like OMASA or OSBA, note that additional SBA items may be required. Refer to the `examples/` folder for template `.yml` files demonstrating the expected structure and usage.
Relative paths are resolved from the working directory (not the YAML file location). Use absolute paths if you're unsure. Avoid using dots `.` in file names (e.g., use `item_xxa3s1.wav`, not `item.xx.a3s1.wav`). Windows users: Use double backslashes `\\` and add `.exe` to executables if needed. Input and output files follow structured naming conventions to encode metadata like lab, language, speaker ID, etc. These are explained in detail in the file under *Filename conventions*.
Relative paths are resolved from the working directory (not the YAML file location). Use absolute paths if you're unsure. Avoid using dots `.` in file names (e.g., use `item_xxa3s1.wav`, not `item.xx.a3s1.wav`). Windows users: Use double backslashes `\\` and add `.exe` to executables if needed. Input and output files follow structured naming conventions to encode metadata like lab, language, speaker ID, etc. These are explained in detail in the file under _Filename conventions_.
Each entry under `scenes:` describes one test item, specifying:
* `output`: output file name
* `description`: human-readable description
* `input`: list of mono `.wav` files
* `azimuth` / `elevation`: spatial placement (°)
* `level`: loudness in dB
* `shift`: timing offsets in seconds
- `output`: output file name
- `description`: human-readable description
- `input`: list of mono `.wav` files
- `azimuth` / `elevation`: spatial placement (°)
- `level`: loudness in dB
- `shift`: timing offsets in seconds
Dynamic positioning (e.g., `"-20:1.0:360"`) means the source will move over time, stepping every 20 ms.
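As an illustration of dynamic positioning, the following minimal Python sketch expands such a string into one angle per 20 ms step. It assumes the string reads as `start:step:stop` in degrees, which is an interpretation and not confirmed by the text above; `expand_trajectory` is a hypothetical helper and not part of the scripts.

```python
import math

FRAME_SEC = 0.02  # 20 ms step interval, as stated above


def expand_trajectory(spec: str, frame_sec: float = FRAME_SEC):
    """Expand a "start:step:stop" string into (time_s, angle_deg) pairs.

    ASSUMPTION: the three fields are start angle, per-step increment and
    stop angle, all in degrees. The real scripts may interpret them differently.
    """
    start, step, stop = (float(part) for part in spec.split(":"))
    if step == 0:
        raise ValueError("step must be non-zero")
    ratio = (stop - start) / step
    if ratio < 0:
        raise ValueError("step sign does not lead from start to stop")
    # Small epsilon guards against floating-point round-off at the end point.
    n_steps = math.floor(ratio + 1e-9) + 1
    return [(round(i * frame_sec, 6), start + i * step) for i in range(n_steps)]


trajectory = expand_trajectory("-20:1.0:360")
print(len(trajectory), trajectory[0], trajectory[-1])
```

Under this assumption, `"-20:1.0:360"` yields 381 positions, so the source takes 7.6 s to travel from -20° to 360°.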
@@ -271,6 +271,10 @@ input:
# fmt: "7_1_4"
### Define mask (HP50 or 20KBP) for input signal filtering; default = null
# mask: "HP50"
### Gain factor to be applied BEFORE any other processing (linear, or add dB suffix)
# gain_pre: 10 dB
### Gain factor to be applied AFTER any other processing (linear, or add dB suffix)
# gain_post: 3.1622776602
### Target sampling rate in Hz for resampling; default = null (no resampling)
# fs: 16000
### Target loudness in LKFS; default = null (no loudness change applied)
@@ -373,6 +377,8 @@ input:
### mono_dmx generate mono downmix condition
### evs generate an EVS coded condition (see below examples for additional required keys)
### ivas generate an IVAS coded condition (see below examples for additional required keys)
### ivas_combined generate a combined-format IVAS coded condition using two IVAS instances, one for each part
### (see below examples for additional required keys)
conditions_to_generate:
### Reference and anchor conditions ##########################
c01:
@@ -401,7 +407,7 @@ conditions_to_generate:
c06:
### REQUIRED: type of condition
type: ivas
### REQUIRED: Bitrates to use for coding
### REQUIRED: Bitrates to use for coding; for ivas_combined, the first and second bitrates apply to the object and spatial parts, respectively
bitrates:
- 160000
# - 32000
@@ -428,7 +434,7 @@ conditions_to_generate:
c07:
### REQUIRED: type of condition
type: ivas
### REQUIRED: Bitrates to use for coding
### REQUIRED: Bitrates to use for coding; for ivas_combined, the first and second bitrates apply to the object and spatial parts, respectively
bitrates:
- 160000
# - 32000
@@ -490,6 +496,10 @@ postprocessing:
fmt: "BINAURAL"
### REQUIRED: Target sampling rate in Hz for resampling; default = null (no resampling)
fs: 48000
### Gain factor to be applied BEFORE any other processing (linear, or add dB suffix)
# gain_pre: 10 dB
### Gain factor to be applied AFTER any other processing (linear, or add dB suffix)
# gain_post: 3.1622776602
### Low-pass cut-off frequency in Hz; default = null (no filtering)
# lp_cutoff: 24000
### Target loudness in LKFS; default = null (no loudness change applied)
@@ -517,7 +527,7 @@ postprocessing:
The following values may be used for the `type` key of a condition: