Setting up a listening test from the experiments folder according to the Processing Plan (IVAS-7) and Test Plan (IVAS-8) consists of two steps:
item generation and item processing.
In the following sections, curly brackets only mark variables that have to be replaced with the actual values.
## P800
### Item generation
To set up the P800-{X} listening test (X = 1, 2, ..., 9), copy your mono input files to 'experiments/selection/P800-{X}/gen_input/items_mono'.
These files have to follow the naming scheme '{l}{LL}p0{X}{name_of_item}' where 'l' stands for the listening lab designator: a (Force Technology),
b (HEAD acoustics), c (MQ University), d (Mesaqin.com), and 'LL' stands for the language: EN, GE, JA, MA, DA, FR.
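
The naming scheme above can be checked before the files are copied. The following is a minimal sketch and not part of the repository: the function name `check_item_name` is hypothetical, and it assumes that `{name_of_item}` is any non-empty remainder of the filename (extension included).

```python
import re

# Designators as listed in the naming scheme above.
LABS = "abcd"
LANGUAGES = ("EN", "GE", "JA", "MA", "DA", "FR")


def check_item_name(filename, experiment):
    """Return True if filename follows '{l}{LL}p0{X}{name_of_item}'.

    `experiment` is the number X of the P800-{X} test (1..9).
    """
    langs = "|".join(LANGUAGES)
    # One lab letter, one language code, 'p0' + X, then a non-empty item name.
    pattern = rf"[{LABS}](?:{langs})p0{experiment}.+"
    return re.fullmatch(pattern, filename) is not None
```

For example, `check_item_name("aENp01_item.wav", 1)` accepts the name, while the same name is rejected for experiment 2.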
The impulse responses have to be copied to 'experiments/selection/P800-{X}/gen_input/IRs'.
To generate the items, run `python -m ivas_processing_scripts.generation experiments/selection/P800-{X}/config/item_gen_P800-{X}_{l}.yml` from the root folder of the repository.
The resulting files can be found in 'experiments/selection/P800-{X}/proc_input' sorted by category.
For P800-3, the input files for the processing are already provided by the listening lab, so this step can be skipped.
For tests with ISM input format (P800-6 and P800-7), no IRs are needed, only mono sentences.
### Item processing
If the test includes background noise, the corresponding files have to be copied to 'experiments/selection/P800-{X}/background_noise'.
The naming has to follow the scheme 'background_noise_cat{c}.wav' where 'c' denotes the category with a number between one and six.
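
The background noise naming can be verified with a short helper before processing starts. This is only a sketch under assumptions: both function names are hypothetical, and it assumes that all six categories need a file, which may not hold for every test.

```python
from pathlib import Path


def expected_background_noise_names(categories=range(1, 7)):
    """Return the filenames 'background_noise_cat{c}.wav' for the given categories."""
    return [f"background_noise_cat{c}.wav" for c in categories]


def missing_background_noise(folder, categories=range(1, 7)):
    """Return the expected background noise files that are not present in folder."""
    folder = Path(folder)
    return [
        name
        for name in expected_background_noise_names(categories)
        if not (folder / name).is_file()
    ]
```

Calling `missing_background_noise` on the test's 'background_noise' folder returns an empty list when every expected file is in place.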
To process the items, run `python generate_test.py P800-{X},{l}` from the root folder of this repository.
The results can be found in 'experiments/selection/P800-{X}/proc_output'.
For more information about this processing step see
[How to generate the configs and process items for the selection test experiments](#how-to-generate-the-configs-and-process-items-for-the-selection-test-experiments).
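
When the processing step is scripted, the command above can be assembled programmatically. The helper below is a hypothetical sketch (the name `processing_command` is not part of the repository); it only builds the argument list for a single test and lab and does not run anything.

```python
def processing_command(experiment, lab):
    """Build the argument list for `python generate_test.py P800-{X},{l}`.

    `experiment` is the number X and `lab` the listening lab designator l.
    """
    # Test name and lab are joined by a comma without any whitespace.
    return ["python", "generate_test.py", f"P800-{experiment},{lab}"]


# The list could then be passed to subprocess.run(cmd, cwd=repo_root, check=True),
# where repo_root is the root folder of the repository.
cmd = processing_command(1, "a")
```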
## MUSHRA
todo
---
# Item generation
The `item_generation_scripts` module may be used to generate audio items for the P.800 listening test according to the scene description. All scenes must be fully described in the `SCENE.yml` file. The module takes monophonic audio
@@ -706,7 +744,7 @@ options:
--no_parallel If given, configs will not be run in parallel
--create_cfg_only If given, only create the configs and folder structure without processing items
```
Before running the script, put the input files in the respective input folder (including the background noise files, see below). If input files are missing, the script will complain and stop. For example, to process tests P800-3 and BS1534-4a for labs b and d, respectively, the command line would look like this (no whitespace around the commas!):
### Target loudness in LKFS; default = null (no loudness normalization applied)
loudness: -26
### Pre-amble and Post-amble length in seconds (default = 0.0)
preamble: 0.5
postamble: 1.0
### Flag for adding low-level random background noise (amplitude +-4) instead of silence; default = False (silence)
add_low_level_random_noise: true
### File designators, default is "l" for listening lab, "EN" for language, "p01" for exp and "g" for provider
listening_lab: "d"
language: "FR"
exp: "p01"
provider: "g"
### Use prefix for all input filenames (default: "")
### l stands for the 'listening_lab' designator, L stands for the 'language', e stands for the 'exp' designator (the number of consecutive letters defines the length of the field)
use_input_prefix: "lLLeee"
### Use prefix for all IR filenames (default: "")
### p stands for the 'provider', e stands for the 'exp' designator (the number of consecutive letters defines the length of the field)
# use_IR_prefix: "IR_pp_eee_"
### Use prefix for all output filenames (default: "")
### l stands for the 'listening_lab' designator, e stands for the 'exp' designator (the number of consecutive letters defines the length of the field)
use_output_prefix: "leee"
################################################
### Scene description
################################################
### Each scene must begin by specifying the category in the following format: catN_I where N is the category index and I is the scene index
### Each scene shall be described using the following parameters/properties:
### name: filename of the generated output item (the program will save the generated items in the output_path folder; note: it is possible to use subfolders, e.g. items_stereo/x1_s01.wav)
### description: textual description of the scene
### source: filename(s) of the mono input sources (the program will search for them in the input_path folder)
### IR: filename(s) of the input IRs (the program will search for them in the IR_path folder)
### overlap: overlap length between two input sources in seconds (negative value creates a gap)
### Note 1: use brackets [val1, val2, ...] when specifying multiple values
### Naming convention for the input mono files
### The input filenames are represented by:
### lLLeeettszz.wav
### where:
### l stands for the listening lab designator: a (Force Technology), b (HEAD acoustics), c (MQ University), d (Mesaqin.com)
### LL stands for the language: JP, FR, GE, MA, DA, EN
### eee stands for the experiment designator: p01, p02, p04, p05, p06, p07, p08, p09
### tt stands for the talker ID: f1, f2, f3, m1, m2, m3
### s stands for 'sample' and zz is the sample number; 01, ..., 14
### Naming convention for the input IR files
### The input IR filenames are represented by:
### IR_pp_eee_r_tt_mm_ffffff.wav
### where:
### pp stands for the provider: do (Dolby), no (Nokia), or (Orange), vo (VoiceAge), g (G.191)
### eee stands for the experiment designator: p01, p02, p04, p05, p06, p07, p08, p09
### r stands for the room ID: a, b, c, ...
### tt stands for the talker position: 01, 02, ...
### mm stands for the microphone position: 00, 01, 02, ...
### ffffff stands for the format ID: stAB20, stABC20, stAB100, stAB150, stMS, stBin, FOA, HOA2
### Naming convention for the generated output files
### The output filenames are represented by:
### leeeayszz.wav
### The filenames of the accompanying output metadata files (applicable to metadata-assisted spatial audio, object-based audio) are represented by:
### leeeayszz.met for metadata-assisted spatial audio
### leeeayszz.wav.o.csv for object-based audio
### where:
### l stands for the listening lab designator: a (Force Technology), b (HEAD acoustics), c (MQ University), d (Mesaqin.com)
### eee stands for the experiment designator: p01, p02, p04, p05, p06, p07, p08, p09
### a stands for 'audio'
### y is the per-experiment category according to IVAS-8a: 01, 02, 03, 04, 05, 06
### s stands for sample and zz is the sample number; 01, 02, 03, 04, 05, 06, 07 (07 is the preliminary sample)