Merge branch 'main' of... (7c6aa734) · Commits · IVAS Codec Public Collaboration / IVAS Processing Scripts

.gitignore

+6 −2

		__pycache__/
		__pycache__/
		venv/
		*.py[cod]
		*$py.class
		@@ -11,4 +11,8 @@ venv/
		*.pcm
		*.bs
		*.192

		mc.double
		proc_input/*.wav
		proc_input/*.pcm
		proc_output/
		*~

README.md

+47 −14

		@@ -53,12 +53,16 @@ The `ivas_processing_scripts` module helps to quickly setup listening tests with

		This module may be used by executing the top level python module i.e. `python -m ivas_processing_scripts CONFIG.YML`.

		## Configuration file
		## Configuration file for processing module

		The processing module can be configured to set up a test via a YAML configuration file.

		YAML is a superset of JSON. Unlike JSON, however, it permits comments, which allows useful information to be added to the configuration file.

		## Configuration file for binaries/executables

		The user can specify custom binary paths and names via a YAML configuration file called [_binary_paths.yml_](ivas_processing_scripts/binary_paths.yml). More information on usage can be found in the comments in that file.

		## YAML reference

		A read through of the YAML reference card is highly recommended. This can be found here: <https://yaml.org/refcard.html>
		@@ -104,6 +108,7 @@ conditions_to_generate:
		bin: ~/git/ivas-codec/IVAS_dec
		postprocessing:
		fmt: "BINAURAL"
		fs: 48000
		```

		</details>
		@@ -126,6 +131,8 @@ postprocessing:
		### Deletion of temporary directories containing
		### intermediate processing files, bitstreams etc.; default = false
		# delete_tmp: true
		### Master seed for random processes like bitstream error pattern generation; default = 0
		# master_seed: 5

		### Any relative paths will be interpreted relative to the working directory the script is called from!
		### Usage of absolute paths is recommended.
		@@ -156,13 +163,6 @@ output_path: "./tmp_output"
		### searches for the specified substring in found filenames; default = null
		# input_select:
		# - "48kHz"

		### Horizontally concatenate input items into one long file; default = false
		# concatenate_input: true
		### Specify preamble duration in ms; default = 0
		# preamble: 40
		### Flag whether to use noise (amplitude +-4) for the preamble or silence; default = false (silence)
		# pad_noise_preamble: true
		```

		</details>
		@@ -216,6 +216,37 @@ input:

		</details>

		### Optional pre-processing on whole signal(s)

		<details>
		<summary>Click to expand</summary>

		```yaml
		# preprocessing_2:
		### Options for processing of the concatenated item (concatenate_input: true) or
		### the individual items (concatenate_input: false) after previous pre-processing step
		### Horizontally concatenate input items into one long file; default = false
		# concatenate_input: true
		### Specify the concatenation order in a list of strings. If not specified, the concatenation order would be
		### as per the filesystem on the users' device
		### Should only be used if concatenate_input = true
		# concatenation_order: []
		### Specify preamble duration in ms; default = 0
		# preamble: 10000
		### Flag whether to use noise (amplitude +-4) for the preamble or silence; default = false (silence)
		# preamble_noise: true
		### Additive background noise
		# background_noise:
		### REQUIRED: SNR for background noise in dB
		# snr: 10
		### REQUIRED: Path to background noise, must have same format and sampling rate as input signal(s)
		# background_noise_path: ".../noise.wav"
		### Seed for delay offset; default = 0
		# seed_delay: 10
		```

		</details>
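
The `snr` option above is given in dB. As a rough illustration of what such a value means, the following sketch computes the gain to apply to a noise signal so that the ratio of mean signal power to mean noise power matches a target SNR. This is an assumption made for illustration only: `noise_gain_for_snr` is a hypothetical helper, and the processing scripts may measure levels differently (for example with a loudness tool rather than plain mean power).

```python
import math


def noise_gain_for_snr(signal, noise, snr_db):
    """Hypothetical helper: linear gain for `noise` so that
    10 * log10(P_signal / P_scaled_noise) equals `snr_db`.

    Power here is plain mean-square power, which is an assumption.
    """
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise signal is silent, cannot scale it to a target SNR")
    # SNR_dB = 10*log10(p_signal / (g^2 * p_noise))  =>  solve for g
    return math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))


def measured_snr_db(signal, noise, gain):
    """Recompute the SNR after applying `gain` to the noise."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum((gain * x) ** 2 for x in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)
```

Applying the returned gain to the noise before adding it to the signal yields the requested SNR under the mean-power assumption.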

		### Bitstream processing

		<details>
		@@ -244,6 +275,8 @@ input:
		# error_pattern: "path/pattern.192"
		### Error rate in percent
		# error_rate: 5
		### Additional seed to specify number of preruns; default = 0
		# prerun_seed: 2
		```
		</details>
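
To illustrate why a seed matters for options such as `error_rate`, here is a minimal sketch of a seeded frame-error pattern generator. It is not how the scripts generate patterns (the README lists the external `gen-patt` executable for that), and `make_error_pattern` is a hypothetical name. The point is only that a fixed seed makes a random pattern reproducible.

```python
import random


def make_error_pattern(num_frames, error_rate_percent, seed=0):
    """Hypothetical helper: one flag per frame, 1 = frame lost, 0 = frame OK.

    Each frame is lost independently with probability error_rate_percent / 100.
    A dedicated Random instance keeps the result independent of global RNG state.
    """
    rng = random.Random(seed)
    p = error_rate_percent / 100.0
    return [1 if rng.random() < p else 0 for _ in range(num_frames)]
```

Calling the function twice with the same seed returns the same pattern, which is the property a master seed is meant to provide across runs.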

		@@ -337,7 +370,7 @@ conditions_to_generate:
		### Path to decoder binary; default search for IVAS_dec in bin folder (primary) and PATH (secondary)
		bin: ~/git/ivas-codec/IVAS_dec
		### Decoder output format; default = postprocessing fmt
		fmt: "CICP19"
		fmt: "7_1_4"
		### Decoder output sampling rate; default = null (same as input)
		# fs: 48000
		### Additional commandline options; default = null
		@@ -378,8 +411,8 @@ conditions_to_generate:
		postprocessing:
		### REQUIRED: Target format for output
		fmt: "BINAURAL"
		### Target sampling rate in Hz for resampling; default = null (no resampling)
		# fs: 16000
		### REQUIRED: Target sampling rate in Hz for resampling
		fs: 48000
		### Low-pass cut-off frequency in Hz; default = null (no filtering)
		# lp_cutoff: 24000
		### Target loudness in LKFS; default = null (no loudness change applied)
		@@ -470,9 +503,9 @@ The processing chain is as follows:
		- The postprocessing stage performs a final conversion from the output of the previous stage if necessary and applies the specified processing

		---
		## ITU Tools
		## Additional Executables

		The following binaries/executables are needed for the different processing steps:
		The following additional executables are needed for the different processing steps:

		| Processing step | Executable | Where to find |
		|---------------------------------|-----------------------|-------------------------------------------------------------------------------------------------------------|
		@@ -483,7 +516,7 @@ The following binaries/executables are needed for the different processing steps
		| Error pattern generation | gen-patt | https://www.itu.int/rec/T-REC-G.191-201003-S/en (Note: Version in https://github.com/openitu/STL is buggy!) |
		| Filtering, Resampling | filter | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
		| Random offset/seed generation | random | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
		| JBM network similulator | networkSimulator_g192 | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
		| JBM network simulator | networkSimulator_g192 | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
		| MASA rendering | masaRenderer | https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip |

		The necessary binaries have to be placed in the [ivas_processing_scripts/bin](./ivas_processing_scripts/bin) folder.
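
The README text in this diff describes a lookup order for executables: the bin folder first, then `PATH`. A minimal sketch of that order, assuming a hypothetical helper named `find_binary` (the module's actual lookup code is not shown in this diff), could look like this:

```python
import shutil
from pathlib import Path
from typing import Optional


def find_binary(name: str, bin_dir: Path) -> Optional[Path]:
    """Hypothetical helper: return the path to executable `name`.

    Looks in `bin_dir` first (primary) and falls back to PATH (secondary).
    Returns None if the executable is found in neither place.
    """
    candidate = bin_dir / name
    if candidate.is_file():
        return candidate
    found = shutil.which(name)  # searches the directories listed in PATH
    return Path(found) if found else None
```

A check like this can be run before processing starts so that a missing executable is reported up front rather than partway through a condition.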

examples/TEMPLATE.yml

+45 −20

		@@ -16,17 +16,15 @@
		# delete_tmp: true
		### Master seed for random processes like bitstream error pattern generation; default = 0
		# master_seed: 5
		### Additional seed to specify number of preruns; default = 0
		# prerun_seed: 2

		### Any relative paths will be interpreted relative to the working directory the script is called from!
		### Usage of absolute paths is recommended.
		### Do not use file names with dots "." in them! This is not supported, use "_" instead
		### For Windows user: please use double back slash '\\' in paths and add '.exe' to executable definitions
		### REQUIRED: Input path or file
		input_path: "~/ivas/items/HOA3"
		input_path: ".../ivas/items/HOA3"
		### REQUIRED: Output path or file
		output_path: "./tmp_output"
		output_path: ".../tmp_output"
		### Metadata path or file(s)
		### If input format is ISM{1-4} a path for the metadata files can be specified;
		### default = null (for ISM search for item_name.{wav, raw, pcm}.{0-3}.csv in input folder, otherwise ignored)
		@@ -49,17 +47,6 @@ output_path: "./tmp_output"
		# input_select:
		# - "48kHz"

		### Horizontally concatenate input items into one long file; default = false
		# concatenate_input: true
		### Specify the concatenation order in a list of strings. If not specified, the concatenation order would be
		### as per the filesystem on the users' device
		### Should only be used if concatenate_input = true
		# concatenation_order: []
		### Specify preamble duration in ms; default = 0
		# preamble: 40
		### Flag whether to use noise (amplitude +-4) for the preamble or silence; default = false (silence)
		# pad_noise_preamble: true

		################################################
		### Input configuration
		################################################
		@@ -70,7 +57,7 @@ input:
		# fs: 32000

		################################################
		### Pre-processing
		### Pre-processing on individual items
		################################################
		### Pre-processing step performed prior to core processing for all conditions
		### If not defined, preprocessing step is skipped
		@@ -97,6 +84,33 @@ input:
		### Length of window used at start/end of signal (ms); default = 0
		# window: 100

		################################################
		### Pre-processing on whole signal(s)
		################################################
		# preprocessing_2:
		### Options for processing of the concatenated item (concatenate_input: true) or
		### the individual items (concatenate_input: false) after previous pre-processing step
		### Horizontally concatenate input items into one long file; default = false
		# concatenate_input: true
		### Specify the concatenation order in a list of strings. If not specified, the concatenation order would be
		### as per the filesystem on the users' device
		### Should only be used if concatenate_input = true
		### Specify the filename with extension.
		### For example, concatenation_order: ["file3.wav", "file1.wav", "file4.wav", "file2.wav"]
		# concatenation_order: []
		### Specify preamble duration in ms; default = 0
		# preamble: 10000
		### Flag whether to use noise (amplitude +-4) for the preamble or silence; default = false (silence)
		# preamble_noise: true
		### Additive background noise
		# background_noise:
		### REQUIRED: SNR for background noise in dB
		# snr: 10
		### REQUIRED: Path to background noise, must have same format and sampling rate as input signal(s)
		# background_noise_path: ".../noise.wav"
		### Seed for delay offset; default = 0
		# seed_delay: 10

		#################################################
		### Bitstream processing
		#################################################
		@@ -122,6 +136,8 @@ input:
		# error_pattern: "path/pattern.192"
		### Error rate in percent
		# error_rate: 5
		### Additional seed to specify number of preruns; default = 0
		# prerun_seed: 2

		################################################
		### Configuration for conditions under test
		@@ -187,6 +203,9 @@ conditions_to_generate:
		# fs: 48000
		### Additional commandline options; default = null
		# opts: ["-q", "-no_delay_cmp"]
		### Bitstream options
		# tx:
		### For possible arguments see overall bitstream modification

		### IVAS condition ###############################
		c07:
		@@ -209,11 +228,14 @@ conditions_to_generate:
		### Path to decoder binary; default search for IVAS_dec in bin folder (primary) and PATH (secondary)
		bin: ~/git/ivas-codec/IVAS_dec
		### Decoder output format; default = postprocessing fmt
		fmt: "CICP19"
		fmt: "7_1_4"
		### Decoder output sampling rate; default = null (same as input)
		# fs: 48000
		### Additional commandline options; default = null
		# opts: ["-q", "-no_delay_cmp"]
		### Bitstream options
		# tx:
		### For possible arguments see overall bitstream modification

		### EVS condition ################################
		c08:
		@@ -235,6 +257,9 @@ conditions_to_generate:
		bin: ~/git/ivas-codec/IVAS_dec
		### Decoder output sampling rate; default = null (same as input)
		# fs: 48000
		### Bitstream options
		# tx:
		### For possible arguments see overall bitstream modification

		################################################
		### Post-processing
		@@ -244,8 +269,8 @@ conditions_to_generate:
		postprocessing:
		### REQUIRED: Target format for output
		fmt: "BINAURAL"
		### Target sampling rate in Hz for resampling; default = null (no resampling)
		# fs: 16000
		### REQUIRED: Target sampling rate in Hz for resampling
		fs: 48000
		### Low-pass cut-off frequency in Hz; default = null (no filtering)
		# lp_cutoff: 24000
		### Target loudness in LKFS; default = null (no loudness change applied)

ivas_processing_scripts/__init__.py

+29 −29

		@@ -36,7 +36,6 @@ from itertools import repeat
		import yaml

		from ivas_processing_scripts.audiotools.metadata import check_ISM_metadata
		from ivas_processing_scripts.audiotools.wrappers.bs1770 import scale_files
		from ivas_processing_scripts.constants import (
		LOGGER_DATEFMT,
		LOGGER_FORMAT,
		@@ -44,11 +43,12 @@ from ivas_processing_scripts.constants import (
		)
		from ivas_processing_scripts.processing import chains, config
		from ivas_processing_scripts.processing.processing import (
		concat_setup,
		concat_teardown,
		preprocess,
		preprocess_2,
		preprocess_background_noise,
		process_item,
		reorder_items_list,
		reverse_process_2,
		)
		from ivas_processing_scripts.utils import DirManager, apply_func_parallel

		@@ -95,8 +95,15 @@ def main(args):
		logger = logging_init(args, cfg)

		# Re-ordering items based on concatenation order
		if cfg.concatenate_input and cfg.concatenation_order is not None:
		cfg.items_list = reorder_items_list(cfg.items_list, cfg.concatenation_order)
		if hasattr(cfg, "preprocessing_2"):
		if (
		cfg.preprocessing_2.get("concatenate_input")
		and cfg.preprocessing_2.get("concatenation_order", None) is not None
		):
		cfg.items_list = reorder_items_list(
		cfg.items_list, cfg.preprocessing_2["concatenation_order"]
		)

		# check for ISM metadata
		if cfg.input["fmt"].startswith("ISM"):
		metadata = check_ISM_metadata(
		@@ -121,12 +128,21 @@ def main(args):

		# run preprocessing only once
		if hasattr(cfg, "preprocessing"):
		preprocess(cfg, cfg.metadata_path, logger)

		if cfg.concatenate_input:
		# concatenate items if required
		concat_setup(cfg, logger)

		# save process info for background noise
		cfg.pre = cfg.proc_chains[0]["processes"][0]
		preprocess(cfg, logger)

		# preprocessing on whole signal(s)
		if hasattr(cfg, "preprocessing_2"):
		# save process info to revert it later
		cfg.pre2 = cfg.proc_chains[0]["processes"][0]
		# preprocess background noise
		if hasattr(cfg, "preprocessing") and hasattr(cfg.pre2, "background_noise"):
		preprocess_background_noise(cfg)
		# preprocess 2
		preprocess_2(cfg, logger)

		# run conditions
		for condition, out_dir, tmp_dir in zip(
		cfg.proc_chains, cfg.out_dirs, cfg.tmp_dirs
		):
		@@ -134,11 +150,6 @@ def main(args):

		logger.info(f" Generating condition: {condition['name']}")

		# # TODO: what happens when no concatenation or only one file for concatenation?
		# if condition["processes"][0].name == "ivas": # TODO: check if 0 index sufficient
		# a = {"number_frames": cfg.num_frames, "number_frames_preamble": cfg.num_frames_preamble}
		# condition["processes"][0].tx.update(a)

		apply_func_parallel(
		process_item,
		zip(
		@@ -153,19 +164,8 @@ def main(args):
		"mp" if cfg.multiprocessing else None,
		)

		if cfg.concatenate_input:
		# write out the splits, optionally remove file
		out_paths_splits, out_meta_splits = concat_teardown(cfg, logger)
		# scale individual files
		if cfg.postprocessing.get("loudness", False):
		# TODO: take care of samplingrate
		scale_files(
		out_paths_splits,
		cfg.postprocessing["fmt"],
		cfg.postprocessing["loudness"],
		cfg.postprocessing.get("fs", None),
		out_meta_splits,
		)
		if hasattr(cfg, "preprocessing_2"):
		reverse_process_2(cfg, logger)

		# copy configuration to output directory
		with open(cfg.output_path.joinpath(f"{cfg.name}.yml"), "w") as f:
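
The body of `reorder_items_list` is not part of this diff. The following is a minimal sketch of what re-ordering by `concatenation_order` could look like, assuming the order lists file names with extension (as the template comment suggests). The function name `reorder_by_name` and the handling of unlisted items (appended in their original order) are assumptions for illustration.

```python
from pathlib import Path


def reorder_by_name(items, order):
    """Hypothetical stand-in for reorder_items_list.

    `items` are input file paths, `order` is a list of file names
    with extension. Listed items come first in the given order;
    items not listed keep their original relative order at the end.
    """
    by_name = {Path(item).name: item for item in items}
    unknown = [name for name in order if name not in by_name]
    if unknown:
        raise ValueError(f"Not found among input items: {unknown}")
    listed = [by_name[name] for name in order]
    listed_names = set(order)
    rest = [item for item in items if Path(item).name not in listed_names]
    return listed + rest
```

Raising on unknown names is a deliberate choice in this sketch: a misspelled entry in the configuration then fails early instead of silently changing the concatenation order.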

ivas_processing_scripts/audiotools/audiofile.py

+2 −2

		@@ -150,8 +150,8 @@ def write(
		def concat(
		in_filenames: list,
		out_file: str,
		silence_pre: int,
		silence_post: int,
		silence_pre: Optional[int] = 0,
		silence_post: Optional[int] = 0,
		in_fs: Optional[int] = 48000,
		num_channels: Optional[int] = None,
		pad_noise: Optional[bool] = False,