update the documentation (e575d8cb) · Commits · IVAS Codec Public Collaboration / IVAS Processing Scripts

README.md

+22 −9

		@@ -55,21 +55,34 @@ In the following sections the only purpose of the curly brackets is to mark the
		## P800

		The setup for a P800 test from the experiments folder consists of two steps:
		item generation and item processing. The two steps can be applied independent of each other.
		item generation and item processing. The two steps can be applied independently of each other.

		### Item generation

		To set up the P800-{X} listening test (X = 1, 2, ..., 9), copy your mono input files to `experiments/selection/P800-{X}/gen_input/items_mono`.
		These files have to follow the naming scheme `{l}{LL}p0{X}{name_of_item}` where 'l' stands for the listening lab designator: a (Force Technology),
		b (HEAD acoustics), c (MQ University), d (Mesaqin.com), and 'LL' stands for the language: EN, GE, JP, MA, DK, FR.
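
The naming scheme above can be checked mechanically before files are copied. The following is a minimal sketch and not part of the repository: `parse_item_name` is a hypothetical helper, and it assumes that the lab and language codes are case-sensitive, that X is a single digit from 1 to 9, and that the item name is everything between the test number and the file extension.

```python
import re
from pathlib import Path

# Pattern for {l}{LL}p0{X}{name_of_item}, using the lab and language codes listed above.
# Assumption: codes are case-sensitive and the item name is non-empty.
ITEM_NAME_RE = re.compile(
    r"^(?P<lab>[abcd])(?P<language>EN|GE|JP|MA|DK|FR)p0(?P<test>[1-9])(?P<name>.+)$"
)


def parse_item_name(filename):
    """Return the fields encoded in an item file name, or None if it does not match."""
    match = ITEM_NAME_RE.match(Path(filename).stem)
    if match is None:
        return None
    fields = match.groupdict()
    fields["test"] = int(fields["test"])  # the X in P800-{X}
    return fields
```

Calling `parse_item_name` on every file in `items_mono` and reporting the names that return `None` would flag misnamed items before item generation starts.
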
		To facilitate the preparation of items for P800-{X} listening tests, it is possible to generate samples of complex formats (STEREO, SBA, ISMn, OMASA, OSBA) from mono samples. To generate items, run the following command from the root of the repository:

		The impulse responses have to be copied to `experiments/selection/P800-{X}/gen_input/IRs`.
		```bash
		python generate_items.py --config path/to/scene_description_config_file.yml
		```

		The YAML configuration file (`scene_description_config_file.yml`) defines how individual mono files should be spatially positioned and combined into the target format. For advanced formats like OMASA or OSBA, note that additional SBA items may be required. Refer to the `examples/` folder for template `.yml` files demonstrating the expected structure and usage.

		Relative paths are resolved from the working directory, not from the location of the YAML file; use absolute paths if you are unsure. Avoid dots `.` in file names (e.g., use `item_xxa3s1.wav`, not `item.xx.a3s1.wav`). On Windows, use double backslashes `\\` in paths and append `.exe` to executable names if needed. Input and output file names follow structured naming conventions that encode metadata such as lab, language, and speaker ID. These conventions are explained in detail under Filename conventions.
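
The two path rules above can be illustrated with a short sketch. Both helpers are hypothetical and only mirror the described behaviour; they are not functions of the scripts themselves.

```python
from pathlib import Path


def resolve_config_path(path_str):
    """Interpret a path from the config relative to the current working directory."""
    # Joining with an absolute path_str simply yields that absolute path.
    return Path.cwd() / path_str


def has_extra_dots(filename):
    """Return True if the name contains dots other than the one before the extension."""
    return "." in Path(filename).stem
```

With these definitions, `has_extra_dots("item.xx.a3s1.wav")` is true and `has_extra_dots("item_xxa3s1.wav")` is false, matching the example names above.
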

		Each entry under `scenes:` describes one test item, specifying:

		* `output`: output file name
		* `description`: human-readable description
		* `input`: list of mono `.wav` files
		* `azimuth` / `elevation`: spatial placement (°)
		* `level`: loudness in dB
		* `shift`: timing offsets in seconds
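
As a rough illustration of the fields listed above, the sketch below checks one scene entry after it has been loaded into a Python dictionary. It rests on assumptions the README does not state: that all listed fields are required, and that `azimuth`, `elevation`, `level` and `shift` are lists with one value per input file. `check_scene` and the file names are hypothetical; the templates in `examples/` remain the reference for the actual structure.

```python
REQUIRED_FIELDS = ("output", "description", "input", "azimuth", "elevation", "level", "shift")
PER_SOURCE_FIELDS = ("azimuth", "elevation", "level", "shift")


def check_scene(scene):
    """Return a list of human-readable problems found in one scene entry."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in scene]
    n_inputs = len(scene.get("input", []))
    for name in PER_SOURCE_FIELDS:
        values = scene.get(name)
        # Assumption: per-source fields are lists with one value per input file.
        if isinstance(values, list) and len(values) != n_inputs:
            problems.append(f"{name}: expected {n_inputs} values, got {len(values)}")
    return problems


example_scene = {
    "output": "out/item_01.wav",  # hypothetical file names
    "description": "two talkers, left and right",
    "input": ["talker1.wav", "talker2.wav"],
    "azimuth": [30, -30],      # degrees
    "elevation": [0, 0],       # degrees
    "level": [-26, -26],       # dB
    "shift": [0.0, 1.0],       # seconds
}
```

Running `check_scene(example_scene)` returns an empty list; dropping a field or giving a per-source list the wrong length produces one message per problem.
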

		Dynamic positioning (e.g., `"-20:1.0:360"`) means the source will move over time, stepping every 20 ms.
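
The format of the dynamic-positioning string is not spelled out here. The sketch below assumes it reads `start:step:stop` in degrees, with one step per 20 ms frame, and that the source holds its final position once `stop` is reached. `expand_trajectory` is a hypothetical helper that only illustrates this interpretation; it is not the scripts' own implementation.

```python
FRAME_SECONDS = 0.02  # one position update every 20 ms, as stated above


def expand_trajectory(spec, duration_seconds):
    """Return one angle in degrees per 20 ms frame for a "start:step:stop" string."""
    start, step, stop = (float(part) for part in spec.split(":"))
    n_frames = round(duration_seconds / FRAME_SECONDS)
    angles = []
    for frame in range(n_frames):
        angle = start + frame * step
        # Assumption: the source stays at `stop` once it has been reached.
        angles.append(min(angle, stop) if step >= 0 else max(angle, stop))
    return angles
```

Under these assumptions, `"-20:1.0:360"` moves the source by 1 degree per frame, i.e. 50 degrees per second, so the full sweep from -20 to 360 degrees takes 7.6 seconds.
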

		To generate the items run `python -m ivas_processing_scripts.generation experiments/selection/P800-{X}/config/item_gen_P800-{X}_{l}.yml` from the root folder of the repository.
		The resulting files can be found in `experiments/selection/P800-{X}/proc_input_{l}` sorted by category.
		The total duration of the output signal can be controlled using the `duration` field. The output signal may optionally be rendered to the BINAURAL format by specifying the `binaural_output` field.

		For P800-3 the input files for the processing are already provided by the listening lab. This means this step can be skipped.
		For tests with ISM input format (P800-6 and P800-7) no IRs are needed, only mono sentences.
		Start by running a single scene to verify settings. Output includes both audio and optional metadata files. You can enable multiprocessing by setting `multiprocessing: true`.

		### Item processing