Merge branch 'loudness_adjustment_after_splitting' into 'main' (63fc5ec9) · Commits · IVAS Codec Public Collaboration / IVAS Processing Scripts

README.md

+34 −22

		@@ -222,16 +222,28 @@ input:
		<summary>Click to expand</summary>

		```yaml
		### Bistream processing (transport simulation) done after encoding and before decoding
		### Bitstream processing (transport simulation) done after encoding and before decoding
		### e.g. frame error insertion or transport simulation for JBM testing
		### can be given globally or in individual conditions of type ivas or evs
		# tx:
		### REQUIRED: Path to network simulation binary
		# bs_proc_bin: ".../ivas_python_testscripts/networkSimulator_g192.exe"
		### Path to error pattern (mandatory if no information for generating the error pattern is given)
		### REQUIRED: Type of bitstream processing; possible types: "JBM" or "FER"
		# type: "JBM"

		### JBM
		### REQUIRED: either error_pattern or error_profile
		### delay error profile file
		# error_pattern: ".../dly_error_profile.dat"
		### options for the binary, possible placeholders are {error_pattern} for the error pattern,
		### {bitstream} for the bitstream to process and {bitstream_processed} for the processed bitstream
		# bs_proc_opts: [ "{error_pattern}", "{bitstream}", "{processed_bitstream}", "{processed_bitstream}_tracefile_sim", "2", "0" ]
		### Index of one of the existing delay error profile files to use (1-11)
		# error_profile: 5
		## nFramesPerPacket parameter for the network simulator (optional); default = 1
		# n_frames_per_packet: 2

		### FER
		### REQUIRED: either error_pattern or error_rate
		### Frame error pattern file
		# error_pattern: "path/pattern.192"
		### Error rate in percent
		# error_rate: 5
		```
		</details>
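The REQUIRED rules in the commented template above can be sketched as a small check. This is only an illustration: `validate_tx` is a hypothetical helper, not a function from the scripts, and treating "either ... or" as "at least one of" is an assumption.

```python
def validate_tx(tx: dict) -> None:
    """Raise ValueError if a `tx` section violates the rules documented above.

    Hypothetical helper for illustration; the real scripts may validate differently.
    """
    tx_type = tx.get("type")
    if tx_type == "JBM":
        alternatives = ("error_pattern", "error_profile")
    elif tx_type == "FER":
        alternatives = ("error_pattern", "error_rate")
    else:
        raise ValueError(f'tx type must be "JBM" or "FER", got {tx_type!r}')

    # Assumption: at least one of the two alternatives has to be present.
    if not any(key in tx for key in alternatives):
        raise ValueError(f"{tx_type} requires one of: " + " or ".join(alternatives))

    # The template documents delay error profile indices 1-11.
    if tx_type == "JBM" and "error_profile" in tx:
        if not 1 <= tx["error_profile"] <= 11:
            raise ValueError("error_profile must be in the range 1-11")


# A FER configuration with an error rate in percent passes the check.
validate_tx({"type": "FER", "error_rate": 5})
# A JBM configuration selecting one of the existing profiles passes as well.
validate_tx({"type": "JBM", "error_profile": 5, "n_frames_per_packet": 2})
```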

		@@ -420,7 +432,9 @@ No required arguments but the `type` key.
		#### EVS
		For EVS a list of at least one bitrate has to be specified with the key `bitrates`. The entries in this list can also be lists containing the bitrates used for the processing of the individual channels.
		This configuration has to match the channel configuration. If the provided list is shorter, the last value will be repeated.
		For the encoding stage `cod` and the decoding stage `dec`, the path to the IVAS_cod and IVAS_dec binaries can be specified under the key `bin`. Additionally some resampling can be applied by using the key `fs` followed by the desired sampling rate.
		For the encoding stage `cod` and the decoding stage `dec`, the path to the IVAS_cod and IVAS_dec binaries can be specified under the key `bin`.
		Additionally some resampling can be applied by using the key `fs` followed by the desired sampling rate.
		The general bitstream processing configuration can be locally overwritten for each EVS and IVAS condition with the key `tx`.
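The rule "if the provided list is shorter, the last value will be repeated" can be sketched as follows. `expand_bitrates` is a hypothetical name used only for illustration, and returning longer lists unchanged is an assumption; the README does not say what happens in that case.

```python
def expand_bitrates(bitrates: list, n_channels: int) -> list:
    """Pad a per-channel bitrate list by repeating its last entry.

    Hypothetical helper; the actual scripts may implement this differently.
    """
    if not bitrates:
        raise ValueError("at least one bitrate has to be specified")
    # Number of entries still missing to cover all channels (never negative).
    missing = max(0, n_channels - len(bitrates))
    return list(bitrates) + [bitrates[-1]] * missing


# One bitrate given for a three-channel configuration: it is reused for every channel.
stereo_plus_one = expand_bitrates([13200], 3)
# Two bitrates given for four channels: the last one fills the remaining channels.
four_channels = expand_bitrates([9600, 13200], 4)
```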
		#### IVAS
		The configuration of the IVAS condition is similar to the EVS condition. However, only one bitrate for all channels (and metadata) can be specified.
		In addition to that, the encoder and decoder take some additional arguments defined by the key `opts`.
		@@ -458,24 +472,22 @@ The processing chain is as follows:
		---
		## ITU Tools

		This module uses the ITU audio processing tools. These tools can be found here: https://github.com/openitu/STL (except for the filter binary which is deprecated). <br />
		The filter binary with all necessary filter types can be found here: https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip. <br />
		It also makes use of the MASA tools provided by Nokia. These can be found here: https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip.

		The following binaries/executables are needed for the different processing steps:

		| processing step | ITU binary |
		|--------------------------|-----------------|
		| LP filtering | filter |
		| HP filtering | filter |
		| Resampling | filter |
		| Loudness adjustment | bs1770demo |
		| MNRU | p50fbmnru |
		| ESDRU | esdru |
		| MASA rendering | masaRenderer |
		| Processing step | Executable | Where to find |
		|---------------------------------|-----------------------|-------------------------------------------------------------------------------------------------------------|
		| Loudness adjustment | bs1770demo | https://github.com/openitu/STL |
		| MNRU | p50fbmnru | https://github.com/openitu/STL |
		| ESDRU | esdru | https://github.com/openitu/STL |
		| Frame error pattern application | eid-xor | https://github.com/openitu/STL |
		| Error pattern generation | gen-patt | https://www.itu.int/rec/T-REC-G.191-201003-S/en (Note: Version in https://github.com/openitu/STL is buggy!) |
		| Filtering, Resampling | filter | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
		| Random offset/seed generation | random | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
		| JBM network simulator | networkSimulator_g192 | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
		| MASA rendering | masaRenderer | https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip |

		The necessary binaries have to be placed in the [ivas_processing_scripts/bin](./ivas_processing_scripts/bin) folder.
		For the ITU tools it is sufficient to copy the binaries while it is necessary to add some additional files for the MASA renderer.
		For most of the tools it is sufficient to copy the binaries while it is necessary to add some additional files for the MASA renderer.

		---

examples/TEMPLATE.yml

+24 −8

		@@ -14,6 +14,10 @@
		### Deletion of temporary directories containing
		### intermediate processing files, bitstreams etc.; default = false
		# delete_tmp: true
		### Master seed for random processes like bitstream error pattern generation; default = 0
		# master_seed: 5
		### Additional seed to specify number of preruns; default = 0
		# prerun_seed: 2

		### Any relative paths will be interpreted relative to the working directory the script is called from!
		### Usage of absolute paths is recommended.
		@@ -92,16 +96,28 @@ input:
		#################################################
		### Bitstream processing
		#################################################
		### Bistream processing (transport simulation) done after encoding and before decoding
		### Bitstream processing (transport simulation) done after encoding and before decoding
		### e.g. frame error insertion or transport simulation for JBM testing
		### can be given globally here or in individual conditions of type ivas or evs
		# tx:
		### REQUIRED: Path to network simulation binary
		# bs_proc_bin: ".../ivas_python_testscripts/networkSimulator_g192.exe"
		### Path to error pattern (mandatory if no information for generating the error pattern is given)
		### REQUIRED: Type of bitstream processing; possible types: "JBM" or "FER"
		# type: "JBM"

		### JBM
		### REQUIRED: either error_pattern or error_profile
		### delay error profile file
		# error_pattern: ".../dly_error_profile.dat"
		### options for the binary, possible placeholders are {error_pattern} for the error pattern,
		### {bitstream} for the bitstream to process and {bitstream_processed} for the processed bitstream
		# bs_proc_opts: [ "{error_pattern}", "{bitstream}", "{processed_bitstream}", "{processed_bitstream}_tracefile_sim", "2", "0" ]
		### Index of one of the existing delay error profile files to use (1-11)
		# error_profile: 5
		## nFramesPerPacket parameter for the network simulator; default = 1
		# n_frames_per_packet: 2

		### FER
		### REQUIRED: either error_pattern or error_rate
		### Frame error pattern file
		# error_pattern: "path/pattern.192"
		### Error rate in percent
		# error_rate: 5

		################################################
		### Configuration for conditions under test

ivas_processing_scripts/init.py

+17 −1

		@@ -36,6 +36,7 @@ from itertools import repeat
		import yaml

		from ivas_processing_scripts.audiotools.metadata import check_ISM_metadata
		from ivas_processing_scripts.audiotools.wrappers.bs1770 import scale_files
		from ivas_processing_scripts.constants import (
		LOGGER_DATEFMT,
		LOGGER_FORMAT,
		@@ -129,6 +130,11 @@ def main(args):

		logger.info(f" Generating condition: {condition['name']}")

		# # TODO: what happens when no concatenation or only one file for concatenation?
		# if condition["processes"][0].name == "ivas": # TODO: check if 0 index sufficient
		# a = {"number_frames": cfg.num_frames, "number_frames_preamble": cfg.num_frames_preamble}
		# condition["processes"][0].tx.update(a)

		apply_func_parallel(
		process_item,
		zip(
		@@ -145,7 +151,17 @@ def main(args):

		if cfg.concatenate_input:
		# write out the splits, optionally remove file
		concat_teardown(cfg, logger)
		out_paths_splits, out_meta_splits = concat_teardown(cfg, logger)
		# scale individual files
		if cfg.postprocessing.get("loudness", False):
		# TODO: take care of samplingrate
		scale_files(
		out_paths_splits,
		cfg.postprocessing["fmt"],
		cfg.postprocessing["loudness"],
		cfg.postprocessing.get("fs", None),
		out_meta_splits,
		)

		# copy configuration to output directory
		with open(cfg.output_path.joinpath(f"{cfg.name}.yml"), "w") as f:
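The hunk above applies loudness scaling to the individual files returned by `concat_teardown`. One plausible reason, sketched here with plain RMS as a stand-in for the BS.1770 measurement (an assumption; the real `scale_files` presumably wraps `bs1770demo`), is that scaling a concatenated signal does not bring each split item to the target level. All function names below are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def scale_to_rms(samples, target_rms):
    """Apply one gain so that the whole signal reaches the target RMS."""
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]


quiet = [0.01, -0.01] * 4  # a quiet item
loud = [0.5, -0.5] * 4     # a loud item
target = 0.1

# Scale the concatenation once, then split: the items keep their level difference.
scaled_concat = scale_to_rms(quiet + loud, target)
split_quiet = scaled_concat[: len(quiet)]
split_loud = scaled_concat[len(quiet):]

# Split first, then scale each item: every item reaches the target on its own.
each_quiet = scale_to_rms(quiet, target)
each_loud = scale_to_rms(loud, target)
```

In the first variant the quiet item ends up below the target and the loud item above it; only per-item scaling makes both match.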

ivas_processing_scripts/audiotools/audio.py

+1 −0

		@@ -214,6 +214,7 @@ class MetadataAssistedSpatialAudio(Audio):
		raise ValueError(
		f"Unsupported metadata assisted spatial audio format {name}"
		)
		self.metadata_files = []

		@classmethod
		def _from_file(

ivas_processing_scripts/audiotools/audioarray.py

+13 −2

		@@ -31,6 +31,7 @@
		#

		import logging
		import warnings
		from typing import Iterator, Optional, Tuple, Union

		import numpy as np
		@@ -42,7 +43,9 @@ logger = logging.getLogger("__main__")
		logger.setLevel(logging.DEBUG)


		# Functions used in this module
		"""Functions used in this module"""


		def trim(
		x: np.ndarray,
		fs: Optional[int] = 48000,
		@@ -266,6 +269,7 @@ def limiter(
		release_heuristics_mem = 0.0
		gain = 1.0
		strong_saturation_cnt = 0
		limited = False

		if x.ndim == 1:
		n_samples_x = x.shape
		@@ -324,16 +328,21 @@ def limiter(
		fr_gain = np.tile(gain * fac + frame_gain * (1.0 - fac), (n_chan_x, 1)).T
		fr_sig *= fr_gain
		gain = fr_gain[-1, 0]
		limited = True
		else:
		gain = 1.0

		release_heuristics_mem = release_heuristic
		# hard limiting for everything that still sticks out
		if (fr_sig > 32767).any() or (fr_sig < -32768).any():
		limited = True
		idx_max = np.where(fr_sig > 32767)
		fr_sig[idx_max] = 32767
		idx_min = np.where(fr_sig < -32768)
		fr_sig[idx_min] = -32768

		if limited:
		warnings.warn("Limiting had to be applied")
		return x
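The flag-and-warn pattern added to `limiter` can be isolated in a minimal sketch. It covers only the final hard-limiting step on a plain list of samples, not the frame-wise gain smoothing of the real function; `hard_limit` is a hypothetical name.

```python
import warnings

INT16_MAX = 32767
INT16_MIN = -32768


def hard_limit(samples):
    """Clamp samples to the 16-bit range and warn once if anything was changed."""
    limited = False
    out = []
    for s in samples:
        if s > INT16_MAX:
            out.append(INT16_MAX)
            limited = True
        elif s < INT16_MIN:
            out.append(INT16_MIN)
            limited = True
        else:
            out.append(s)
    # Same message as in the diff; emitted once per call, not once per sample.
    if limited:
        warnings.warn("Limiting had to be applied")
    return out


clipped = hard_limit([0, 40000, -40000, 123])
```

Tracking a single boolean keeps the warning out of the inner loop, so a heavily saturated signal still produces one message per call.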


		@@ -405,7 +414,9 @@ def framewise_io(
		)


		# Deprecated functions (partly replaced by ITU binaries)
		"""Deprecated functions (partly replaced by ITU binaries)"""


		def resample(
		x: np.ndarray,
		in_freq: int,