diff --git a/README.md b/README.md
index 3449c8600dbcc8df25e8ec5b3fb648b6d34da891..e573150772995f206c2f3d26b025c95f2f774c1f 100755
--- a/README.md
+++ b/README.md
@@ -222,16 +222,28 @@ input:
Click to expand
```yaml
-### Bistream processing (transport simulation) done after encoding and before decoding
+### Bitstream processing (transport simulation) done after encoding and before decoding
### e.g. frame error insertion or transport simulation for JBM testing
+### can be given globally or in individual conditions of type ivas or evs
# tx:
- ### REQUIRED: Path to network simulation binary
- # bs_proc_bin: ".../ivas_python_testscripts/networkSimulator_g192.exe"
- ### Path to error pattern (mandatory if no information for generating the error pattern is given)
+ ### REQUIRED: Type of bitstream processing; possible types: "JBM" or "FER"
+  # type: "JBM"
+
+ ### JBM
+ ### REQUIRED: either error_pattern or error_profile
+ ### delay error profile file
# error_pattern: ".../dly_error_profile.dat"
- ### options for the binary, possible placeholders are {error_pattern} for the error pattern,
- ### {bitstream} for the bitstream to process and {bitstream_processed} for the processed bitstream
- # bs_proc_opts: [ "{error_pattern}", "{bitstream}", "{processed_bitstream}", "{processed_bitstream}_tracefile_sim", "2", "0" ]
+ ### Index of one of the existing delay error profile files to use (1-11)
+ # error_profile: 5
+  ### nFramesPerPacket parameter for the network simulator (optional); default = 1
+ # n_frames_per_packet: 2
+
+ ### FER
+ ### REQUIRED: either error_pattern or error_rate
+ ### Frame error pattern file
+ # error_pattern: "path/pattern.192"
+ ### Error rate in percent
+ # error_rate: 5
```
@@ -420,7 +432,9 @@ No required arguments but the `type` key.
#### EVS
For EVS a list of at least one bitrate has to be specified with the key `bitrates`. The entries in this list can also be lists containing the bitrates used for the processing of the individual channels.
This configuration has to match the channel configuration. If the provided list is shorter, the last value will be repeated.
-For the encoding stage `cod` and the decoding stage `dec`, the path to the IVAS_cod and IVAS_dec binaries can be specified under the key `bin`. Additionally some resampling can be applied by using the key `fs` followed by the desired sampling rate.
+For the encoding stage `cod` and the decoding stage `dec`, the path to the IVAS_cod and IVAS_dec binaries can be specified under the key `bin`.
+Additionally some resampling can be applied by using the key `fs` followed by the desired sampling rate.
+The general bitstream processing configuration can be overridden locally for each EVS and IVAS condition with the key `tx`.
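+For illustration, a minimal condition-level override could look as follows (sketch only; keys as documented in the bitstream processing section above, values purely exemplary):
+```yaml
+type: "evs"
+bitrates: [13200]
+tx:
+  type: "FER"
+  error_rate: 5
+```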
#### IVAS
The configuration of the IVAS condition is similar to the EVS condition. However, only one bitrate for all channels (and metadata) can be specified.
In addition to that, the encoder and decoder take some additional arguments defined by the key `opts`.
@@ -458,24 +472,22 @@ The processing chain is as follows:
---
## ITU Tools
-This module uses the ITU audio processing tools. These tools can be found here: https://github.com/openitu/STL (except for the filter binary which is deprecated).
-The filter binary with all necessary filter types can be found here: https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip.
-It also makes use of the MASA tools provided by Nokia. These can be found here: https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip.
-
The following binaries/executables are needed for the different processing steps:
-| processing step | ITU binary |
-|--------------------------|-----------------|
-| LP filtering | filter |
-| HP filtering | filter |
-| Resampling | filter |
-| Loudness adjustment | bs1770demo |
-| MNRU | p50fbmnru |
-| ESDRU | esdru |
-| MASA rendering | masaRenderer |
+| Processing step | Executable | Where to find |
+|---------------------------------|-----------------------|-------------------------------------------------------------------------------------------------------------|
+| Loudness adjustment | bs1770demo | https://github.com/openitu/STL |
+| MNRU | p50fbmnru | https://github.com/openitu/STL |
+| ESDRU | esdru | https://github.com/openitu/STL |
+| Frame error pattern application | eid-xor | https://github.com/openitu/STL |
+| Error pattern generation | gen-patt | https://www.itu.int/rec/T-REC-G.191-201003-S/en (Note: Version in https://github.com/openitu/STL is buggy!) |
+| Filtering, Resampling | filter | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
+| Random offset/seed generation | random | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
+| JBM network simulator | networkSimulator_g192 | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
+| MASA rendering | masaRenderer | https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip |
The necessary binaries have to be placed in the [ivas_processing_scripts/bin](./ivas_processing_scripts/bin) folder.
-For the ITU tools it is sufficient to copy the binaries while it is necessary to add some additional files for the MASA renderer.
+For most of the tools it is sufficient to copy the binaries; the MASA renderer, however, requires some additional files.
---
diff --git a/examples/TEMPLATE.yml b/examples/TEMPLATE.yml
index c63114e365936f087de7897c95da5e1364fb69dd..4ec760d955033baa5c5711f359d1288af8068f41 100755
--- a/examples/TEMPLATE.yml
+++ b/examples/TEMPLATE.yml
@@ -14,6 +14,10 @@
### Deletion of temporary directories containing
### intermediate processing files, bitstreams etc.; default = false
# delete_tmp: true
+### Master seed for random processes like bitstream error pattern generation; default = 0
+# master_seed: 5
+### Additional seed to specify number of preruns; default = 0
+# prerun_seed: 2
### Any relative paths will be interpreted relative to the working directory the script is called from!
### Usage of absolute paths is recommended.
@@ -92,17 +96,29 @@ input:
#################################################
### Bitstream processing
#################################################
-### Bistream processing (transport simulation) done after encoding and before decoding
+### Bitstream processing (transport simulation) done after encoding and before decoding
### e.g. frame error insertion or transport simulation for JBM testing
+### can be given globally here or in individual conditions of type ivas or evs
# tx:
- ### REQUIRED: Path to network simulation binary
- # bs_proc_bin: ".../ivas_python_testscripts/networkSimulator_g192.exe"
- ### Path to error pattern (mandatory if no information for generating the error pattern is given)
+ ### REQUIRED: Type of bitstream processing; possible types: "JBM" or "FER"
+  # type: "JBM"
+
+ ### JBM
+ ### REQUIRED: either error_pattern or error_profile
+ ### delay error profile file
# error_pattern: ".../dly_error_profile.dat"
- ### options for the binary, possible placeholders are {error_pattern} for the error pattern,
- ### {bitstream} for the bitstream to process and {bitstream_processed} for the processed bitstream
- # bs_proc_opts: [ "{error_pattern}", "{bitstream}", "{processed_bitstream}", "{processed_bitstream}_tracefile_sim", "2", "0" ]
-
+ ### Index of one of the existing delay error profile files to use (1-11)
+ # error_profile: 5
+  ### nFramesPerPacket parameter for the network simulator (optional); default = 1
+ # n_frames_per_packet: 2
+
+ ### FER
+ ### REQUIRED: either error_pattern or error_rate
+ ### Frame error pattern file
+ # error_pattern: "path/pattern.192"
+ ### Error rate in percent
+ # error_rate: 5
+
################################################
### Configuration for conditions under test
################################################
diff --git a/ivas_processing_scripts/__init__.py b/ivas_processing_scripts/__init__.py
index 6d48d5d56ebed59be9057442af9aaf9dd2b7e759..e234efd2bdfa15da8a2716c9cc3456f51951b930 100755
--- a/ivas_processing_scripts/__init__.py
+++ b/ivas_processing_scripts/__init__.py
@@ -36,6 +36,7 @@ from itertools import repeat
import yaml
from ivas_processing_scripts.audiotools.metadata import check_ISM_metadata
+from ivas_processing_scripts.audiotools.wrappers.bs1770 import scale_files
from ivas_processing_scripts.constants import (
LOGGER_DATEFMT,
LOGGER_FORMAT,
@@ -129,6 +130,11 @@ def main(args):
logger.info(f" Generating condition: {condition['name']}")
+ # # TODO: what happens when no concatenation or only one file for concatenation?
+ # if condition["processes"][0].name == "ivas": # TODO: check if 0 index sufficient
+ # a = {"number_frames": cfg.num_frames, "number_frames_preamble": cfg.num_frames_preamble}
+ # condition["processes"][0].tx.update(a)
+
apply_func_parallel(
process_item,
zip(
@@ -145,7 +151,17 @@ def main(args):
if cfg.concatenate_input:
# write out the splits, optionally remove file
- concat_teardown(cfg, logger)
+ out_paths_splits, out_meta_splits = concat_teardown(cfg, logger)
+ # scale individual files
+ if cfg.postprocessing.get("loudness", False):
+            # TODO: take care of sampling rate
+ scale_files(
+ out_paths_splits,
+ cfg.postprocessing["fmt"],
+ cfg.postprocessing["loudness"],
+ cfg.postprocessing.get("fs", None),
+ out_meta_splits,
+ )
# copy configuration to output directory
with open(cfg.output_path.joinpath(f"{cfg.name}.yml"), "w") as f:
diff --git a/ivas_processing_scripts/audiotools/audio.py b/ivas_processing_scripts/audiotools/audio.py
index 1199889cca455ce3f808d91b2427183099eefd32..f6c45fca9778df6d02edba364e1d1221e5866f28 100755
--- a/ivas_processing_scripts/audiotools/audio.py
+++ b/ivas_processing_scripts/audiotools/audio.py
@@ -214,6 +214,7 @@ class MetadataAssistedSpatialAudio(Audio):
raise ValueError(
f"Unsupported metadata assisted spatial audio format {name}"
)
+ self.metadata_files = []
@classmethod
def _from_file(
diff --git a/ivas_processing_scripts/audiotools/audioarray.py b/ivas_processing_scripts/audiotools/audioarray.py
index 5b2c60e4f4c71487f0cf3dc14d41dcd8a5ae7043..76c7d81d79e9bafbac19d5c664b368389b10ec20 100755
--- a/ivas_processing_scripts/audiotools/audioarray.py
+++ b/ivas_processing_scripts/audiotools/audioarray.py
@@ -31,6 +31,7 @@
#
import logging
+import warnings
from typing import Iterator, Optional, Tuple, Union
import numpy as np
@@ -42,7 +43,9 @@ logger = logging.getLogger("__main__")
logger.setLevel(logging.DEBUG)
-# Functions used in this module
+"""Functions used in this module"""
+
+
def trim(
x: np.ndarray,
fs: Optional[int] = 48000,
@@ -266,6 +269,7 @@ def limiter(
release_heuristics_mem = 0.0
gain = 1.0
strong_saturation_cnt = 0
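+    # flag set when soft or hard limiting is applied; used to emit a warning at the end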
+ limited = False
if x.ndim == 1:
n_samples_x = x.shape
@@ -324,16 +328,21 @@ def limiter(
fr_gain = np.tile(gain * fac + frame_gain * (1.0 - fac), (n_chan_x, 1)).T
fr_sig *= fr_gain
gain = fr_gain[-1, 0]
+ limited = True
else:
gain = 1.0
release_heuristics_mem = release_heuristic
# hard limiting for everything that still sticks out
+ if (fr_sig > 32767).any() or (fr_sig < -32768).any():
+ limited = True
idx_max = np.where(fr_sig > 32767)
fr_sig[idx_max] = 32767
idx_min = np.where(fr_sig < -32768)
fr_sig[idx_min] = -32768
+ if limited:
+ warnings.warn("Limiting had to be applied")
return x
@@ -405,7 +414,9 @@ def framewise_io(
)
-# Deprecated functions (partly replaced by ITU binaries)
+"""Deprecated functions (partly replaced by ITU binaries)"""
+
+
def resample(
x: np.ndarray,
in_freq: int,
diff --git a/ivas_processing_scripts/audiotools/audiofile.py b/ivas_processing_scripts/audiotools/audiofile.py
index 3b127f3bd814abefc1f8db223a849bc98eb13c8d..f8fad48f4e2e0e899e815a3cda65544ef84bef39 100755
--- a/ivas_processing_scripts/audiotools/audiofile.py
+++ b/ivas_processing_scripts/audiotools/audiofile.py
@@ -177,7 +177,8 @@ def concat(
Returns
-------
- List of sample indices to split the resulting file at
+ splits
+ List of sample indices to split the resulting file at
"""
y = None
@@ -221,9 +222,10 @@ def split(
splits: list[int],
in_fs: Optional[int] = 48000,
preamble: Optional[int] = 0,
+ loudness: Optional[float] = None,
) -> list[Union[str, Path]]:
"""
- Horizontally splits audio files into multiple shorter files and applied windowing
+ Horizontally splits audio files into multiple shorter files and applies windowing and scaling
Parameters
__________
@@ -237,6 +239,8 @@ def split(
List of sample indices where to cut the signal
in_fs: Optional[int]
Input sampling rate, default 48000 Hz
+ loudness: Optional[float]
+ Desired loudness of individual files
"""
# create a list of output files
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_FULL.mat b/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_FULL.mat
index c6549ecee22846da17e1252a13950d1ef478eb7a..42e702db0e30fa828427b5f5dc28f3615bf3dbe6 100755
--- a/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_FULL.mat
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_FULL.mat
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:3ebbe3d45cd35e1fd8fb17896a7917b2de6dae03fc34c3a7350288b9a53c2e9d
-size 12623137
+oid sha256:a3ddecef64dfcf8887904b5cc370c0d9723bd8fd1637e32232205cdcd739b80d
+size 12623190
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_LS.mat b/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_LS.mat
index 61ba946617a5c35cb56e32814b40f4e728ecdafd..1d590edb9369826d028846a346bb1b53abf9c64e 100755
--- a/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_LS.mat
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/BRIR_IISofficialMPEG222UC_LS.mat
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:081a9053c8b04831d97e6f18d641d4737b2c23b076778a9b41c7b3a41d954c32
-size 6348446
+oid sha256:e2c964b96d802532c0ecf1076092c7d246a54293a3a0c4c72995953c66bfec71
+size 6348499
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA1.mat b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA1.mat
index d30f75716554a9cb72cb9c36e23d256514bf30f8..4f59a8a9147c1fd346bc980ff67a7a35eea952b7 100644
--- a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA1.mat
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA1.mat
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:49fea9097dd01b0529cdbff150542f4612750ea03cdd75913e2d5bffcf284753
-size 4578
+oid sha256:3a9ad5d8d874ac2fb851f5d2b0b303494f1d115612e9f6cab40e5eb33591b05c
+size 4630
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA2.mat b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA2.mat
index 81450496c78b1e3b4c8d8959ff53834c9bbdbc84..1ad2162acb5de9b451f1537d08a543e975c2abd8 100644
--- a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA2.mat
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA2.mat
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:746b5c4b12010d77bd96cb05534eb2eba6b41381fe50949a2ca3df2c75a940ba
-size 10271
+oid sha256:6fc2a15579b80493597a8096bd815e8b847fe1880bdba760d4405122878b0b0a
+size 10323
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA3.mat b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA3.mat
index ffa700cadb964340fe4083f153c56faf5842fdc5..0e7c3ef463fc067bc04b6bed4ba2c7d338066d67 100644
--- a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA3.mat
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_Dolby_SBA3.mat
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:7704ee5a72a3c051eef04f68f346811b550bde62144a0b71f2aa8fa35a931660
-size 18177
+oid sha256:83822cfa090c345a6ece14d1ec1a92023626f467e2f8d982cf099c071dfc1080
+size 18229
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_FULL.mat b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_FULL.mat
index 7fd6e3d179ab854d76ab7119a0f005266ae367d2..a2ab24e5125ad3e01323ae8f3e86f8b9419b5225 100755
--- a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_FULL.mat
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_FULL.mat
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:1a88a3463513647455bcc38bd7180860edfb97195602a8ff832a6be1421474f8
-size 14335861
+oid sha256:bf86a03f0b13932c5c138af22584f864b75c5733df1b01ac3fdf7750a1bdbe5f
+size 14335913
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_LS.mat b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_LS.mat
index e52e031e8c2858501b29b36344144fd9d03f9760..65c2684c94cc6a51bce4ae0a25f528b959606672 100755
--- a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_LS.mat
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_LS.mat
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:9660be83192f7babb4f67e19653a94bc02cee7b3071065880cf618547c19d842
-size 20138
+oid sha256:2e25ef101e9e72c5d70a55bc1451a07d041d29f96a803d7d3f968f20fe403316
+size 20190
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_SBA3.mat b/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_SBA3.mat
deleted file mode 100755
index 0d113a34af498f3c981c6e2d4a59e8dc304851c6..0000000000000000000000000000000000000000
--- a/ivas_processing_scripts/audiotools/binaural_datasets/HRIR_ORANGE53_SBA3.mat
+++ /dev/null
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:02c8a25178b36399054c1802f00bb5a8739f3ac950c21b0c760c046b1dba530d
-size 36201
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/README.txt b/ivas_processing_scripts/audiotools/binaural_datasets/README.txt
index e0836074908513ccad77bb2e47a73a680dfb4459..9fd37c966abf95f652245ae9ff1ae8573754b570 100755
--- a/ivas_processing_scripts/audiotools/binaural_datasets/README.txt
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/README.txt
@@ -1,8 +1,9 @@
Files in this directory should contain impulse responses for use in rendering in Matlab .mat format
+A sampling rate of 48 kHz is assumed
Files should adhere to the following naming scheme:
-{HRIR|BRIR}_{DATASETNAME}_{FULL|LS|SBA3}.mat
+{HRIR|BRIR}_{DATASETNAME}_{FULL|LS|SBA(1-3)}.mat
- HRIR or BRIR
specifies the type of impulse response which will be used
@@ -15,7 +16,9 @@ Files should adhere to the following naming scheme:
FULL: all available measurements on the sphere
LS: superset of supported loudspeaker layouts
(see audiotools.constants.CHANNEL_BASED_AUDIO_FORMATS["LS""])
- SBA3: impulse responses transformed to 3rd order ambisonics by external conversion
+ SBA(1-3): impulse responses transformed to ambisonics by external conversion
+          if available, SBA1 is used for FOA, SBA2 for HOA2 and SBA3 for HOA3
+          if not available, SBA3 is used and truncated for all Ambisonic formats
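+          (e.g. HRIR_ORANGE53_Dolby_SBA1.mat provides the FOA impulse responses of the ORANGE53_Dolby dataset)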
Each Matlab file should contain the following variables:
- IR
@@ -24,7 +27,7 @@ Each Matlab file should contain the following variables:
array of {azimuth, elevation, radius} of dimensions [n_channels x 3]
required for FULL, optional otherwise
- latency_s
- latency of the dataset in seconds
+ latency of the dataset in samples
optional, will be estimated if not provided
LICENSES:
diff --git a/ivas_processing_scripts/audiotools/binaural_datasets/binaural_dataset.py b/ivas_processing_scripts/audiotools/binaural_datasets/binaural_dataset.py
index ab86d66a5c4de13ca9aeaaa9a24890fa45ec1bcd..e5d5ac957b73217805df837a4032a03db611aeff 100755
--- a/ivas_processing_scripts/audiotools/binaural_datasets/binaural_dataset.py
+++ b/ivas_processing_scripts/audiotools/binaural_datasets/binaural_dataset.py
@@ -81,6 +81,8 @@ def load_hrtf(
SourcePosition = mat_contents.get("SourcePosition")
latency_s = mat_contents.get("latency_s")
+ if latency_s is not None:
+ latency_s = latency_s.astype(np.int32)[0, 0]
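+        # latency_s is stored as an integer number of samples in the .mat file (see binaural_datasets/README.txt)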
return IR, SourcePosition, latency_s
@@ -161,18 +163,17 @@ def load_ir(
else:
dataset_suffix = "SBA3"
- IR, SourcePosition, latency_s = load_hrtf(
- Path(__file__).parent.joinpath(
- f"{dataset_prefix}_{dataset}_{dataset_suffix}.mat"
- )
+ path_dataset = Path(__file__).parent.joinpath(
+ f"{dataset_prefix}_{dataset}_{dataset_suffix}.mat"
)
+ IR, SourcePosition, latency_s = load_hrtf(path_dataset)
if latency_s is not None:
- latency_smp = int(latency_s * 48000)
+ latency_smp = latency_s
else:
latency_smp = int(np.min(np.argmax(np.sum(np.abs(IR), axis=1), axis=0)))
warnings.warn(
- f"No latency of HRTF dataset specified in .mat file -> computed latency: {latency_smp}"
+ f"No latency of HRTF dataset specified in {path_dataset} file -> computed latency: {latency_smp} sample(s)"
)
if in_fmt.startswith("STEREO"):
@@ -182,7 +183,6 @@ def load_ir(
and not in_fmt.startswith("CUSTOM_LS")
and not in_fmt.startswith("MOZART")
):
- # TODO update, use _get_audio_dict() instead of using fromtype object?
# extract positions from the loudspeaker file
in_fmt = fromtype(in_fmt)
tmp_fmt = fromtype("LS")
@@ -200,7 +200,6 @@ def load_ir(
if j != in_fmt.lfe_index[0]:
IR[:, :, ir_index] = IR_tmp[:, :, i]
ir_index += 1
- # TODO: add custom ls support
return IR, SourcePosition, latency_smp
diff --git a/ivas_processing_scripts/audiotools/convert/__init__.py b/ivas_processing_scripts/audiotools/convert/__init__.py
index d92c2b3a24f6a4e284bd997dbf6f3b09603215cc..024faa47bb9511076cc7cc03ce8b56254f3b145f 100755
--- a/ivas_processing_scripts/audiotools/convert/__init__.py
+++ b/ivas_processing_scripts/audiotools/convert/__init__.py
@@ -62,7 +62,9 @@ def convert_file(
in_meta: Optional[list] = None,
logger: Optional[logging.Logger] = None,
**kwargs,
-):
+) -> None:
+ """Conversion function for one audio file"""
+
if not in_fmt:
raise ValueError("Input audio format must be specified!")
@@ -149,7 +151,9 @@ def convert(
esdru_alpha: Optional[float] = None,
logger: Optional[logging.Logger] = None,
**kwargs,
-):
+) -> None:
+ """Perform pre-processing, conversion and post-processing"""
+
"""pre-processing"""
process_audio(
x=input,
@@ -203,6 +207,8 @@ def process_audio(
esdru_alpha: Optional[float] = None,
logger: Optional[logging.Logger] = None,
) -> None:
+ """Perform (pre-/pos-) processing of audio"""
+
if fs is None:
fs = x.fs
diff --git a/ivas_processing_scripts/audiotools/convert/channelbased.py b/ivas_processing_scripts/audiotools/convert/channelbased.py
index 6540074c2bbf7b0baf0a6a9a58aceecca356deae..6bdd6b3378dfefc63cf9f17dafc94123094e3305 100755
--- a/ivas_processing_scripts/audiotools/convert/channelbased.py
+++ b/ivas_processing_scripts/audiotools/convert/channelbased.py
@@ -113,7 +113,6 @@ def render_cba_to_binaural(
bin.audio = cba_stereo.audio
return
- # TODO this will change if we have resampled HRTFs
cba.audio = resample_itu(cba, 48000)
old_fs = cba.fs
cba.fs = 48000
@@ -142,72 +141,82 @@ def render_cba_to_binaural(
bin.audio = resample_itu(bin, old_fs)
-# TODO rework impl.
-# def render_custom_ls_binaural(
-# custom_ls: audio.ChannelBasedAudio,
-# output: audio.BinauralAudio,
-# IR: np.ndarray,
-# SourcePosition: np.ndarray,
-# trajectory: str,
-# ):
-
-# # logger.info(" Processing channels on custom LS layout")
-# # azis = ", ".join([f"{a:7.2f}" for a in ls_azi_all])
-# # eles = ", ".join([f"{e:7.2f}" for e in ls_ele_all])
-# # logger.info(f" azi: {azis}")
-# # logger.info(f" ele: {eles}")
-# # logger.info(f" lfe_index: {lfe_index_all}")
-
-# if output.name == "BINAURAL_ROOM":
-# tmp = get_audio_type("MOZART")
-# convert_channel_based(custom_ls, tmp)
-# logger.info(f" {custom_ls.name} -> {tmp.name} -> {output.name}")
-# custom_ls.audio = tmp.audio
-# else:
-# tmp = custom_ls
-
-# ls_azi_all = tmp.ls_azi
-# ls_ele_all = tmp.ls_ele
-# lfe_index_all = tmp.lfe_index
-
-# frame_len = (IVAS_FRAME_LEN_MS // 4) * (fs // 1000)
-# sig_len = custom_ls.audio.shape[0]
-# N_frames = int(sig_len / frame_len)
-
-# i_ls = 0
-# y = np.zeros([sig_len, 2])
-# for i_chan in range(custom_ls.audio.shape[1]):
-
-# # skip LFE
-# if i_chan in lfe_index_all:
-# continue
-
-# # skip silent (or very low volume) channels
-# if np.allclose(custom_ls.audio[:, i_chan], 0.0, atol=32.0):
-# continue
-
-# ls_azi = np.repeat(ls_azi_all[i_ls], N_frames)
-# ls_ele = np.repeat(ls_ele_all[i_ls], N_frames)
-
-# azi, ele = rotateISM(ls_azi, ls_ele, trajectory=trajectory)
-
-# # TODO: use EFAP here
-# y += binaural_fftconv_framewise(
-# custom_ls.audio[:, i_chan],
-# IR,
-# SourcePosition,
-# frame_len=frame_len,
-# azi=azi,
-# ele=ele,
-# )
-# i_ls += 1
-
-# return y
+def render_custom_ls_binaural(
+ custom_ls: audio.ChannelBasedAudio,
+ output: audio.BinauralAudio,
+ IR: np.ndarray,
+ SourcePosition: np.ndarray,
+ trajectory: str,
+):
+ # TODO rework impl. (with EFAP)
+ # logger.info(" Processing channels on custom LS layout")
+ # azis = ", ".join([f"{a:7.2f}" for a in ls_azi_all])
+ # eles = ", ".join([f"{e:7.2f}" for e in ls_ele_all])
+ # logger.info(f" azi: {azis}")
+ # logger.info(f" ele: {eles}")
+ # logger.info(f" lfe_index: {lfe_index_all}")
+
+ # if output.name == "BINAURAL_ROOM":
+ # tmp = get_audio_type("MOZART")
+ # convert_channel_based(custom_ls, tmp)
+ # logger.info(f" {custom_ls.name} -> {tmp.name} -> {output.name}")
+ # custom_ls.audio = tmp.audio
+ # else:
+ # tmp = custom_ls
+ #
+ # ls_azi_all = tmp.ls_azi
+ # ls_ele_all = tmp.ls_ele
+ # lfe_index_all = tmp.lfe_index
+ #
+ # frame_len = (IVAS_FRAME_LEN_MS // 4) * (fs // 1000)
+ # sig_len = custom_ls.audio.shape[0]
+ # N_frames = int(sig_len / frame_len)
+ #
+ # i_ls = 0
+ # y = np.zeros([sig_len, 2])
+ # for i_chan in range(custom_ls.audio.shape[1]):
+ #
+ # # skip LFE
+ # if i_chan in lfe_index_all:
+ # continue
+ #
+ # # skip silent (or very low volume) channels
+ # if np.allclose(custom_ls.audio[:, i_chan], 0.0, atol=32.0):
+ # continue
+ #
+ # ls_azi = np.repeat(ls_azi_all[i_ls], N_frames)
+ # ls_ele = np.repeat(ls_ele_all[i_ls], N_frames)
+ #
+ # azi, ele = rotateISM(ls_azi, ls_ele, trajectory=trajectory)
+ #
+ # y += binaural_fftconv_framewise(
+ # custom_ls.audio[:, i_chan],
+ # IR,
+ # SourcePosition,
+ # frame_len=frame_len,
+ # azi=azi,
+ # ele=ele,
+ # )
+ # i_ls += 1
+ #
+ # return y
+ return
def render_cba_to_cba(
cba_in: audio.ChannelBasedAudio, cba_out: audio.ChannelBasedAudio
) -> None:
+ """
+ Rendering of channel-based input signal to channel-based output
+
+ Parameters
+ ----------
+    cba_in: audio.ChannelBasedAudio
+ Channel-based input audio
+ cba_out: audio.ChannelBasedAudio
+ Channel-based output audio
+ """
+
# Stereo to Mono
if cba_in.name == "STEREO" and cba_out.name == "MONO":
render_mtx = np.vstack([[0.5], [0.5]])
@@ -228,14 +237,13 @@ def render_cba_to_cba(
if i not in cba_in.lfe_index
]
)
- # TODO tmu : implement configurable LFE handling
+
# pass-through for LFE
for index in np.sort(cba_in.lfe_index):
render_mtx = np.insert(render_mtx, index, 0, axis=0)
render_mtx = np.insert(render_mtx, cba_out.lfe_index, 0, axis=1)
render_mtx[cba_in.lfe_index, cba_out.lfe_index] = 1
- # TODO tmu temporarily disable LFE rendering to MONO/STEREO
if cba_out.num_channels <= 2:
render_mtx[cba_in.lfe_index, :] = 0
@@ -243,6 +251,17 @@ def render_cba_to_cba(
def render_cba_to_sba(cba: audio.ChannelBasedAudio, sba: audio.SceneBasedAudio) -> None:
+ """
+ Rendering of channel-based input signal to SBA output
+
+ Parameters
+ ----------
+    cba: audio.ChannelBasedAudio
+ Channel-based input audio
+    sba: audio.SceneBasedAudio
+ SBA output audio
+ """
+
if cba.name == "MONO":
raise ValueError(f"Rendering from MONO to {sba.name} is not supported.")
@@ -282,7 +301,6 @@ def rotate_cba(
Rotated multichannel signal
"""
- # TODO needs optimization, currently slow
trj_data = np.genfromtxt(trajectory, delimiter=",")
trj_frames = trj_data.shape[0]
@@ -344,8 +362,6 @@ def render_lfe_to_binaural(
if lfe.shape[1] > 1:
lfe = np.sum(lfe, axis=1)
- # TODO tmu - disabled temporarily here, disabled in C
- # TODO: add delay compensation
"""
# 120 Hz low-pass filtering for LFE using IVAS filter coefficients
if fs == 48000:
@@ -361,6 +377,7 @@ def render_lfe_to_binaural(
lfe = np.roll(lfe, round(latency_smp), axis=0)
lfe[0 : round(latency_smp), :] = 0
"""
+ lfe_delay_ns = 0
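+    # LFE low-pass filtering and delay compensation above are currently disabled, hence no additional delay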
# apply gain
lfe *= LFE_gain
@@ -370,4 +387,4 @@ def render_lfe_to_binaural(
lfe = lfe[:, np.newaxis]
lfe = np.hstack([lfe, lfe])
- return lfe
+ return lfe, lfe_delay_ns
diff --git a/ivas_processing_scripts/audiotools/convert/masa.py b/ivas_processing_scripts/audiotools/convert/masa.py
index c03977ab6dbfb59046e34e1dc7d6bc3a62ac87ba..c802a78bacb27023ad0376da2a016b1ae4890968 100755
--- a/ivas_processing_scripts/audiotools/convert/masa.py
+++ b/ivas_processing_scripts/audiotools/convert/masa.py
@@ -30,7 +30,6 @@
# the United Nations Convention on Contracts on the International Sales of Goods.
#
-import logging
from pathlib import Path
from typing import Optional, Union
from warnings import warn
@@ -45,7 +44,6 @@ from ivas_processing_scripts.audiotools.wrappers.masaRenderer import masaRendere
def convert_masa(
masa: audio.MetadataAssistedSpatialAudio,
out: audio.Audio,
- logger: Optional[logging.Logger] = None,
**kwargs,
) -> audio.Audio:
"""Convert Metadata Assisted Spatial audio to the requested output format"""
@@ -76,7 +74,22 @@ def render_masa_to_binaural(
trajectory: Optional[Union[str, Path]] = None,
bin_dataset: Optional[str] = None,
**kwargs,
-):
+) -> None:
+ """
+ Binauralization of MASA audio
+
+ Parameters
+ ----------
+ masa: audio.MetadataAssistedSpatialAudio
+ MASA input audio
+ bin: audio.BinauralAudio
+ Output binaural audio
+ trajectory: Optional[Union[str, Path]]
+ Head rotation trajectory path
+ bin_dataset: Optional[str]
+ Name of binaural dataset without prefix or suffix
+ """
+
if "ROOM" in bin.name:
cba_tmp = audio.fromtype("7_1_4")
cba_tmp.fs = masa.fs
@@ -100,7 +113,18 @@ def render_masa_to_binaural(
def render_masa_to_cba(
masa: audio.MetadataAssistedSpatialAudio,
cba: audio.ChannelBasedAudio,
-):
+) -> None:
+ """
+ Rendering of MASA input signal to Channel-based format
+
+ Parameters
+ ----------
+ masa: audio.MetadataAssistedSpatialAudio
+ MASA input audio
+ cba: audio.ChannelBasedAudio
+ Channel-based output audio
+ """
+
if cba.name not in ["5_1", "7_1_4"]:
warn(
f"MasaRenderer does not support {cba.name} natively. Using 7_1_4 as an intermediate format."
@@ -118,7 +142,18 @@ def render_masa_to_cba(
def render_masa_to_sba(
masa: audio.MetadataAssistedSpatialAudio,
sba: audio.SceneBasedAudio,
-):
+) -> None:
+ """
+ Rendering of MASA input signal to SBA format
+
+ Parameters
+ ----------
+ masa: audio.MetadataAssistedSpatialAudio
+ MASA input audio
+ sba: audio.SceneBasedAudio
+ SBA output audio
+ """
+
warn(
f"MasaRenderer does not support {sba.name} natively. Using 7_1_4 as an intermediate format."
)
diff --git a/ivas_processing_scripts/audiotools/convert/objectbased.py b/ivas_processing_scripts/audiotools/convert/objectbased.py
index 4111face8d1ee5c74b2ad66f78aec28737fc2be1..c6d0f1144768abc513b186d516961e7cf3ce0be4 100755
--- a/ivas_processing_scripts/audiotools/convert/objectbased.py
+++ b/ivas_processing_scripts/audiotools/convert/objectbased.py
@@ -261,7 +261,25 @@ def rotate_oba(
ele: np.ndarray,
trajectory: Optional[str] = None,
) -> Tuple[np.ndarray, np.ndarray]:
- """Application of head tracking trajectory"""
+ """
+ Application of head tracking trajectory
+
+ Parameters:
+ ----------
+ azi: np.ndarray
+ Azimuth coordinates of objects
+ ele: np.ndarray
+ Elevation coordinates of objects
+ trajectory: str
+ Head-tracking trajectory path
+
+ Returns:
+ ----------
+ azi_rot: np.ndarray
+ Azimuth coordinates after application of trajectory
+ ele_rot: np.ndarray
+ Elevation coordinates after application of trajectory
+ """
if trajectory is None:
return azi, ele
@@ -286,14 +304,36 @@ def rotate_oba(
def render_object(
- obj_idx,
- obj_pos,
- oba,
- trajectory,
- IR,
- SourcePosition,
+ obj_idx: int,
+ obj_pos: np.ndarray,
+ oba: audio.ObjectBasedAudio,
+ trajectory: str,
+ IR: np.ndarray,
+ SourcePosition: np.ndarray,
) -> np.ndarray:
- """Binaural rendering for one ISM object"""
+ """
+ Binaural rendering for one ISM object
+
+ Parameters:
+ ----------
+ obj_idx: int
+ Index of object in list of all objects
+ obj_pos: np.ndarray
+ Position of object
+ oba: audio.ObjectBasedAudio
+ Input ISM audio object
+ trajectory: str
+ Head-tracking trajectory path
+ IR: np.ndarray
+ HRIRs for binauralization
+ SourcePosition: np.ndarray
+ Positions of HRIR measurements
+
+ Returns:
+ ----------
+ result_audio: np.ndarray
+ Binaurally rendered object
+ """
# repeat each value four times since head rotation data is on sub-frame basis
azi = np.repeat(obj_pos[:, 0], 4)
diff --git a/ivas_processing_scripts/audiotools/convert/scenebased.py b/ivas_processing_scripts/audiotools/convert/scenebased.py
index 10a91522905603eb4f2726989137d04e1650700e..b8295808b5046857c3a3bfd1c3dbe1bfe4bb13ec 100755
--- a/ivas_processing_scripts/audiotools/convert/scenebased.py
+++ b/ivas_processing_scripts/audiotools/convert/scenebased.py
@@ -83,10 +83,6 @@ def convert_scenebased(
return out
-def zero_vert_channels(sba: audio.SceneBasedAudio):
- sba.audio[:, VERT_HOA_CHANNELS_ACN[VERT_HOA_CHANNELS_ACN < sba.num_channels]] = 0
-
-
def render_sba_to_binaural(
sba: audio.SceneBasedAudio,
bin: audio.BinauralAudio,
@@ -106,7 +102,7 @@ def render_sba_to_binaural(
trajectory: Optional[Union[str, Path]]
Head rotation trajectory path
bin_dataset: Optional[str]
- Name of binaural dataset wihtout prefix or suffix
+ Name of binaural dataset without prefix or suffix
"""
if trajectory is not None:
@@ -138,7 +134,18 @@ def render_sba_to_binaural(
def render_sba_to_cba(
sba: audio.SceneBasedAudio,
cba: audio.ChannelBasedAudio,
-):
+) -> None:
+ """
+ Rendering of SBA input signal to channel-based format
+
+ Parameters
+ ----------
+ sba: audio.SceneBasedAudio
+ Scene-based input audio
+ cba: audio.ChannelBasedAudio
+ Channel-based output audio
+ """
+
render_mtx = get_allrad_mtx(sba.ambi_order, cba)
cba.audio = sba.audio @ render_mtx.T
@@ -147,6 +154,17 @@ def render_sba_to_sba(
sba_in: audio.SceneBasedAudio,
sba_out: audio.SceneBasedAudio,
) -> None:
+ """
+ Rendering of SBA input signal to SBA output format
+
+ Parameters
+ ----------
+ sba_in: audio.SceneBasedAudio
+ Scene-based input audio
+ sba_out: audio.SceneBasedAudio
+ Scene-based output audio
+ """
+
if sba_out.ambi_order > sba_in.ambi_order:
sba_out.audio = np.pad(
sba_in.audio, [[0, 0], [0, sba_out.num_channels - sba_in.num_channels]]
@@ -218,15 +236,23 @@ def rotate_sba(
""" Helper functions """
-def nchan_from_ambi_order(ambi_order: int):
+def zero_vert_channels(sba: audio.SceneBasedAudio) -> None:
+ """Remove all ambisonics parts with vertical components"""
+ sba.audio[:, VERT_HOA_CHANNELS_ACN[VERT_HOA_CHANNELS_ACN < sba.num_channels]] = 0
+
+
+def nchan_from_ambi_order(ambi_order: int) -> int:
+ """Compute number of channels based on ambisonics order"""
return (ambi_order + 1) ** 2
-def ambi_order_from_nchan(nchan: int):
+def ambi_order_from_nchan(nchan: int) -> int:
+ """Compute ambisonics order based on number of channels"""
return int(np.sqrt(nchan) - 1)
def rE_weight(order: int) -> np.ndarray:
+ """Compute max-rE weighting matrix"""
return np.array(
[
lpmv(0, l, np.cos(np.deg2rad(137.9) / (order + 1.51)))
@@ -237,12 +263,14 @@ def rE_weight(order: int) -> np.ndarray:
def n2sn(order: int) -> np.ndarray:
+ """Compute conversion matrix for N3D to SN3D normalization"""
return np.array(
[1.0 / np.sqrt(2 * l + 1) for l in range(order + 1) for _ in range(-l, l + 1)]
)
def sn2n(order: int) -> np.ndarray:
+ """Compute conversion matrix for SN3D to N3D normalization"""
return np.array(
[np.sqrt(2 * l + 1) for l in range(order + 1) for _ in range(-l, l + 1)]
)
@@ -257,7 +285,27 @@ def getRSH(
) -> np.ndarray:
"""
Returns real spherical harmonic response for the given position(s)
+
+ Parameters:
+ ----------
+ azi: np.ndarray
+ Azimuth angles
+ ele: np.ndarray
+ Elevation angles
+ ambi_order: int
+ Ambisonics order
+ norm: Optional[str]
+ Normalization of ambisonic bases.
+ Possible values: "sn3d", "n3d", everything else is interpreted as orthogonal
+ degrees: Optional[bool]
+ If true azi and ele are interpreted as angles in degrees, otherwise as radians
+
+ Returns:
+ ----------
+ response: np.ndarray
+ Real spherical harmonic response
"""
+
if degrees:
azi = np.deg2rad(azi)
ele = np.deg2rad(ele)
@@ -310,9 +358,32 @@ def get_allrad_mtx(
ambi_order: int,
cba: audio.ChannelBasedAudio,
norm: Optional[str] = "sn3d",
- rE_weight: Optional[bool] = False,
+ rE_weight_bool: Optional[bool] = False,
intensity_panning: Optional[bool] = True,
) -> np.ndarray:
+ """
+ Returns ALLRAD matrix
+
+ Parameters:
+ ----------
+ ambi_order: int
+ Ambisonics order
+ cba: audio.ChannelBasedAudio
+ Channel-based audio object
+ norm: Optional[str]
+ Normalization of ambisonic bases.
+ Possible values: "sn3d", "ortho", everything else is interpreted as n3d
+    rE_weight_bool: Optional[bool]
+ Flag for max-rE weighting
+ intensity_panning: Optional[bool]
+ Flag for intensity panning
+
+ Returns:
+ ----------
+ hoa_dec: np.ndarray
+ ALLRAD matrix
+ """
+
n_harm = nchan_from_ambi_order(ambi_order)
if cba.name == "MONO":
@@ -348,7 +419,7 @@ def get_allrad_mtx(
elif norm == "ortho":
hoa_dec *= np.sqrt(4 * np.pi)
- if rE_weight:
+ if rE_weight_bool:
a_n = rE_weight(ambi_order)
nrg_pre = np.sqrt(len(n_ls_woLFE) / np.sum(a_n**2))
hoa_dec = hoa_dec @ np.diag(a_n) * nrg_pre
diff --git a/ivas_processing_scripts/audiotools/metadata.py b/ivas_processing_scripts/audiotools/metadata.py
index 045d38d88553a634c1b83995b81affa8d7a123c3..d7fd167d668385c74de88ca352cbd04a60c4aefc 100755
--- a/ivas_processing_scripts/audiotools/metadata.py
+++ b/ivas_processing_scripts/audiotools/metadata.py
@@ -146,7 +146,6 @@ class Metadata:
self.audio.append(sba)
def parse_optional_values(self, f: TextIO):
- # TODO implementation
raise NotImplementedError(
"Additional configuration keys in metadata currently unsupported!"
)
@@ -223,6 +222,22 @@ def trim_meta(
pad_noise: Optional[bool] = False,
samples: Optional[bool] = False,
) -> None:
+ """
+ Trim or pad ISM including metadata
+    Positive limits trim, negative limits pad
+
+ Parameters
+ ----------
+ x: audio.ObjectBasedAudio
+ ISM audio object
+ limits: Optional[Tuple[int, int]]
+ Number of samples to trim or pad at beginning and end
+ pad_noise: Optional[bool]
+ Flag for padding noise instead of silence
+ samples: Optional[bool]
+ Flag for interpreting limits as samples, otherwise milliseconds
+ """
+
if not limits:
return
@@ -289,12 +304,32 @@ def concat_meta_from_file(
input_fmt: str,
preamble: Optional[int] = None,
) -> None:
+ """
+ Concatenate ISM metadata from files
+
+ Parameters
+ ----------
+ audio_files: list[str]
+ List of audio file names
+ meta_files: list[list[str]]
+ List of corresponding metadata file names
+ out_file: list[str]
+ Name of concatenated output file
+ silence_pre: int
+ Silence inserted before each item
+ silence_post: int
+ Silence inserted after each item
+ input_fmt: str
+ Input audio format
+ preamble: Optional[int]
+ Length of preamble in milliseconds
+ """
+
# create audio objects
audio_objects = []
fs = None
for i, audio_file in enumerate(audio_files):
# metadata is cut/looped to signal length in init of audio object
- # TODO check fs for audio object when pcm
audio_object = audio.fromfile(input_fmt, audio_file, in_meta=meta_files[i])
audio_objects.append(audio_object)
if fs:
@@ -400,6 +435,7 @@ def split_meta_in_file(
split_old = 0
for idx, split in enumerate(splits):
+ out_paths_obj = []
for obj in range(audio_object.num_channels):
out_file = (
Path(out_folder)
@@ -407,7 +443,7 @@ def split_meta_in_file(
)
# add the path to our list
- out_paths.append(out_file)
+ out_paths_obj.append(out_file)
# remove preamble
if preamble:
@@ -429,15 +465,21 @@ def split_meta_in_file(
# write file
write_ISM_metadata_in_file([y], [out_file])
+ out_paths.append(out_paths_obj)
+
split_old = split
return out_paths
def check_ISM_metadata(
- in_meta: dict, num_objects: int, num_items: int, item_names: Optional[list] = None
+ in_meta: dict,
+ num_objects: int,
+ num_items: int,
+ item_names: Optional[list] = None,
) -> list:
"""Find ISM metadata"""
+
list_meta = []
if in_meta is None:
for item in item_names:
@@ -490,7 +532,7 @@ def metadata_search(
in_meta: Union[str, Path],
item_names: list[Union[str, Path]],
num_objects: int,
-) -> list:
+) -> list[list[Union[Path, str]]]:
"""Search for ISM metadata with structure item_name.{0-3}.csv in in_meta folder"""
if not item_names:
diff --git a/ivas_processing_scripts/audiotools/rotation.py b/ivas_processing_scripts/audiotools/rotation.py
index 85d8bf539b5f0b47e40592a7d73aeea14a65c92f..742548a8d7f7154a1521f651754d4fe3fc8e51bf 100755
--- a/ivas_processing_scripts/audiotools/rotation.py
+++ b/ivas_processing_scripts/audiotools/rotation.py
@@ -136,14 +136,14 @@ def SHrotmatgen(
R: np.ndarray,
order: Optional[int] = 3,
) -> np.ndarray:
- """Calculate SHD rotation matrix from that in real space
+ """
+    Calculate SHD rotation matrix from the real-space rotation matrix
translated from ivas_rotation.c
Parameters:
----------
R: np.ndarray
real-space rotation matrix
-
order: Optional[int]
Ambisonics order, default = 3
@@ -151,8 +151,8 @@ def SHrotmatgen(
----------
SHrotmat: np.ndarray
SHD rotation matrix
-
"""
+
dim = (order + 1) * (order + 1)
SHrotmat = np.zeros([dim, dim])
@@ -357,6 +357,8 @@ def rotateAziEle(
R: np.ndarray,
is_planar: bool = False,
) -> Tuple[float, float]:
+ """Rotate azimuth and elevation angles with rotation matrix"""
+
w = np.cos(np.deg2rad(ele))
dv = np.array(
[
diff --git a/ivas_processing_scripts/audiotools/wrappers/bs1770.py b/ivas_processing_scripts/audiotools/wrappers/bs1770.py
index 041385f802388ae02644089e1d004989c37a5495..a047c339d76c43a63013a4fbb73957d752800244 100755
--- a/ivas_processing_scripts/audiotools/wrappers/bs1770.py
+++ b/ivas_processing_scripts/audiotools/wrappers/bs1770.py
@@ -30,10 +30,11 @@
# the United Nations Convention on Contracts on the International Sales of Goods.
#
+import copy
import logging
from pathlib import Path
from tempfile import TemporaryDirectory
-from typing import Optional, Tuple
+from typing import Optional, Tuple, Union
from warnings import warn
import numpy as np
@@ -56,7 +57,7 @@ def bs1770demo(
Parameters
----------
- input : Audio
+ input: Audio
Input audio
target_loudness: Optional[float]
Desired loudness in LKFS
@@ -107,7 +108,7 @@ def bs1770demo(
]
if isinstance(input, audio.BinauralAudio):
- cmd[6] = "00" # "11" # -conf
+ cmd[6] = "00" # -conf
elif isinstance(input, audio.ChannelBasedAudio):
# if loudspeaker position fulfills the criteria, set the config string to 1 for that index
conf_str = [
@@ -147,7 +148,7 @@ def get_loudness(
target_loudness: float
Desired loudness in LKFS
loudness_format: str
- Loudness format to render to for loudness computation (default input format)
+ Loudness format to render to for loudness computation (default input format if possible)
Returns
-------
@@ -161,6 +162,7 @@ def get_loudness(
raise ValueError("Desired loudness is too high!")
if loudness_format is None:
+ # for some formats rendering is necessary prior to loudness measurement
if isinstance(input, audio.SceneBasedAudio) or isinstance(
input, audio.MetadataAssistedSpatialAudio
):
@@ -170,6 +172,7 @@ def get_loudness(
elif hasattr(input, "layout_file"):
loudness_format = input.layout_file
else:
+            # by default, use the input format
loudness_format = input.name
# configure intermediate format
@@ -188,7 +191,7 @@ def loudness_norm(
input: audio.Audio,
target_loudness: Optional[float] = -26,
loudness_format: Optional[str] = None,
-) -> Tuple[np.ndarray, float]:
+) -> np.ndarray:
"""
Iterative loudness normalization using ITU-R BS.1770-4
Signal is iteratively scaled after rendering to the specified format
@@ -199,7 +202,7 @@ def loudness_norm(
input : Audio
Input audio
target_loudness: Optional[float]
- Desired loudness level in LKFS/dBov
+ Desired loudness level in LKFS
loudness_format: Optional[str]
Loudness format to render to for loudness computation (default input format)
@@ -207,8 +210,6 @@ def loudness_norm(
-------
norm : Audio
Normalized audio
- scale_factor: float
- Effectively applied scale factor
"""
# repeat until convergence of loudness
@@ -229,9 +230,55 @@ def loudness_norm(
num_iter += 1
- if num_iter > 10:
+ if num_iter >= 10:
warn(
f"Loudness did not converge to desired value, stopping at: {measured_loudness:.2f}"
)
return input.audio
+
+
+def scale_files(
+ file_list: list[list[Union[Path, str]]],
+ fmt: str,
+ loudness: float,
+ fs: Optional[int] = 48000,
+ in_meta: Optional[list] = None,
+) -> None:
+ """
+ Scales audio files to desired loudness
+
+ Parameters
+ ----------
+    file_list: list[list[Union[Path, str]]]
+ List of file paths in a list of the condition folders
+ fmt: str
+ Audio format of files in list
+ loudness: float
+ Desired loudness level in LKFS/dBov
+ fs: Optional[int]
+ Sampling rate
+ in_meta: Optional[list]
+        Metadata for ISM with the same structure as file_list, but with one additional
+        nesting level holding the list of metadata files for each audio file
+ """
+
+ if fmt.startswith("ISM") and in_meta:
+ meta_bool = True
+ else:
+ in_meta = copy.copy(file_list)
+ meta_bool = False
+
+ for folder, meta_folder in zip(file_list, in_meta):
+ for file, meta in zip(folder, meta_folder):
+ # create audio object
+ if meta_bool:
+ audio_obj = audio.fromfile(fmt, file, fs, meta)
+ else:
+ audio_obj = audio.fromfile(fmt, file, fs)
+
+ # adjust loudness
+ scaled_audio = loudness_norm(audio_obj, loudness)
+
+ # write into file
+ write(file, scaled_audio, audio_obj.fs)
diff --git a/ivas_processing_scripts/audiotools/wrappers/eid_xor.py b/ivas_processing_scripts/audiotools/wrappers/eid_xor.py
new file mode 100644
index 0000000000000000000000000000000000000000..72fb5ce2f361d726513e6b119fca95fb2f715570
--- /dev/null
+++ b/ivas_processing_scripts/audiotools/wrappers/eid_xor.py
@@ -0,0 +1,186 @@
+#!/usr/bin/env python3
+
+#
+# (C) 2022-2023 IVAS codec Public Collaboration with portions copyright Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository. All Rights Reserved.
+#
+# This software is protected by copyright law and by international treaties.
+# The IVAS codec Public Collaboration consisting of Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository retain full ownership rights in their respective contributions in
+# the software. This notice grants no license of any kind, including but not limited to patent
+# license, nor is any license granted by implication, estoppel or otherwise.
+#
+# Contributors are required to enter into the IVAS codec Public Collaboration agreement before making
+# contributions.
+#
+# This software is provided "AS IS", without any express or implied warranties. The software is in the
+# development stage. It is intended exclusively for experts who have experience with such software and
+# solely for the purpose of inspection. All implied warranties of non-infringement, merchantability
+# and fitness for a particular purpose are hereby disclaimed and excluded.
+#
+# Any dispute, controversy or claim arising under or in relation to providing this software shall be
+# submitted to and settled by the final, binding jurisdiction of the courts of Munich, Germany in
+# accordance with the laws of the Federal Republic of Germany excluding its conflict of law rules and
+# the United Nations Convention on Contracts on the International Sales of Goods.
+#
+
+import os.path
+from pathlib import Path
+from typing import Optional, Union
+
+from ivas_processing_scripts.audiotools.wrappers.gen_patt import create_error_pattern
+from ivas_processing_scripts.utils import find_binary, run
+
+
+def eid_xor(
+ error_pattern: Union[str, Path],
+ in_bitstream: Union[str, Path],
+ out_bitstream: Union[str, Path],
+) -> None:
+ """
+ Wrapper for eid-xor binary to apply error patterns for the bitstream processing
+
+ Parameters
+ ----------
+ error_pattern: Union[str, Path]
+ Path to error pattern file
+ in_bitstream: Union[str, Path]
+ Path to input bitstream file
+ out_bitstream: Union[str, Path]
+ Output path for modified bitstream
+ """
+
+ # find binary
+ binary = find_binary("eid-xor")
+
+ # check for valid inputs
+ if not Path(in_bitstream).is_file():
+ raise ValueError(
+ f"Input bitstream file {in_bitstream} for bitstream processing does not exist"
+ )
+ elif not Path(error_pattern).is_file():
+ raise ValueError(
+ f"Error pattern file {error_pattern} for bitstream processing does not exist"
+ )
+
+ # set up command line
+ cmd = [
+ str(binary),
+ "-vbr", # Enables variable bit rate operation
+ "-fer", # Error pattern is a frame erasure pattern
+ in_bitstream,
+ error_pattern,
+ out_bitstream,
+ ]
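+    # resulting call (illustrative): eid-xor -vbr -fer <in_bitstream> <error_pattern> <out_bitstream>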
+
+ # run command
+ run(cmd)
+
+ return
+
+
+def create_and_apply_error_pattern(
+ in_bitstream: Union[Path, str],
+ out_bitstream: Union[Path, str],
+ len_sig: int,
+ error_pattern: Optional[Union[Path, str]] = None,
+ error_rate: Optional[float] = None,
+ preamble: Optional[int] = 0,
+ master_seed: Optional[int] = 0,
+ prerun_seed: Optional[int] = 0,
+) -> None:
+ """
+ Function to create (or use existing) frame error pattern for bitstream processing
+
+ Parameters
+ ----------
+ in_bitstream: Union[Path, str]
+ Path of input bitstream
+ out_bitstream: Union[Path, str]
+ Path of output bitstream
+ len_sig: int
+ Length of signal in frames
+ error_pattern: Optional[Union[Path, str]]
+ Path to existing error pattern
+ error_rate: float
+ Error rate in percent
+ preamble: Optional[int]
+ Length of preamble in frames
+ master_seed: Optional[int]
+ Master seed for error pattern generation
+ prerun_seed: Optional[int]
+ Number of preruns in seed generation
+ """
+
+ if error_pattern is None:
+ # create error pattern
+ if error_rate is not None:
+ error_pattern = in_bitstream.parent.joinpath("error_pattern").with_suffix(
+ ".192"
+ )
+ create_error_pattern(
+ len_sig, error_pattern, error_rate, preamble, master_seed, prerun_seed
+ )
+ else:
+ raise ValueError(
+ "Either error pattern or error rate has to be specified for bitstream processing"
+ )
+ elif error_rate is not None:
+ raise ValueError(
+ "Error pattern and error rate are specified for bitstream processing. Can't use both"
+ )
+
+ # apply error pattern
+ eid_xor(error_pattern, in_bitstream, out_bitstream)
+
+ return
+
+
+def validate_error_pattern_application(
+ error_pattern: Optional[Union[Path, str]] = None,
+ error_rate: Optional[int] = None,
+) -> None:
+ """
+ Validate settings for the network simulator
+
+ Parameters
+ ----------
+ error_pattern: Optional[Union[Path, str]]
+ Path to existing error pattern
+ error_rate: Optional[int]
+ Frame error rate
+ """
+
+ if find_binary("gen-patt") is None:
+ raise FileNotFoundError(
+ "The binary gen-patt for error pattern generation was not found! Please check the configuration."
+ )
+ if find_binary("eid-xor") is None:
+ raise FileNotFoundError(
+ "The binary eid-xor for error patter application was not found! Please check the configuration."
+ )
+ if error_pattern is not None:
+ if not os.path.exists(os.path.realpath(error_pattern)):
+ raise FileNotFoundError(
+ f"The frame error profile file {error_pattern} was not found! Please check the configuration."
+ )
+ if error_rate is not None:
+ raise ValueError(
+ "Frame error pattern and error rate are specified for bitstream processing. Can't use both! Please check the configuration."
+ )
+ else:
+ if error_rate is None:
+ raise ValueError(
+ "Either error rate or error pattern has to be specified for FER bitstream processing."
+ )
+ elif error_rate < 0 or error_rate > 100:
+ raise ValueError(
+ f"Specified error rate of {error_rate}% is either too large or too small."
+ )
+ return
diff --git a/ivas_processing_scripts/audiotools/wrappers/esdru.py b/ivas_processing_scripts/audiotools/wrappers/esdru.py
index a26ff511f379b493b0288cad156c56dc6556eac9..4e0dfbea812a5a2e0aed97e229cad26ebbd5777b 100755
--- a/ivas_processing_scripts/audiotools/wrappers/esdru.py
+++ b/ivas_processing_scripts/audiotools/wrappers/esdru.py
@@ -32,7 +32,7 @@
from pathlib import Path
from tempfile import TemporaryDirectory
-from typing import Optional, Tuple
+from typing import Optional
import numpy as np
@@ -47,7 +47,7 @@ def esdru(
sf: Optional[int] = 48000,
e_step: Optional[float] = 0.5,
seed: Optional[int] = 1,
-) -> Tuple[np.ndarray, int]:
+) -> np.ndarray:
"""
Wrapper for ESDRU (Ericsson spatial distortion reference unit) Recommendation ITU-T P.811, requires esdru binary
@@ -72,8 +72,10 @@ def esdru(
binary = find_binary("esdru")
- if input.num_channels != 2:
- raise Exception("Input audio is not stereo.")
+ if not isinstance(input, audio.BinauralAudio) and not input.name == "STEREO":
+ raise Exception(
+ "ESDRU condition only available for STEREO or BINAURAL output format"
+ )
if alpha <= 0.0 or alpha >= 1.0:
raise Exception(
diff --git a/ivas_processing_scripts/audiotools/wrappers/filter.py b/ivas_processing_scripts/audiotools/wrappers/filter.py
index fa58200b8e53806167025a8e55a1b3abb7b430db..1efbf9be0740f979a2ac2864a23408575ef82ae5 100755
--- a/ivas_processing_scripts/audiotools/wrappers/filter.py
+++ b/ivas_processing_scripts/audiotools/wrappers/filter.py
@@ -197,8 +197,6 @@ def lpfilter_itu(
Output low-pass filtered array
"""
- # TODO: change filter functions such that audio is modified -> no return value
-
# find right filter type for cut-off frequency
flt_types = ["LP1p5", "LP35", "LP7", "LP10", "LP12", "LP14", "LP20"]
flt_vals = [1500, 3500, 7000, 10000, 12000, 14000, 20000]
@@ -291,7 +289,7 @@ def resample_itu(
fs_new: int,
) -> np.ndarray:
"""
- Resampling of multichannel audio signal
+ Resampling of multi-channel audio array
Parameters
----------
@@ -347,15 +345,15 @@ def resample_itu(
y.audio = delay_compensation(
y.audio, flt_type=flt, fs=y.fs, up=up[i], down=down[i]
)
- if up[i]:
- if flt == "SHQ2":
- y.fs = y.fs * 2
- elif flt == "SHQ3":
- y.fs = y.fs * 3
- elif down[i]:
- if flt == "SHQ2":
- y.fs = int(y.fs / 2)
- elif flt == "SHQ3":
- y.fs = int(y.fs / 3)
+ # if up[i]:
+ # if flt == "SHQ2":
+ # y.fs = y.fs * 2
+ # elif flt == "SHQ3":
+ # y.fs = y.fs * 3
+ # elif down[i]:
+ # if flt == "SHQ2":
+ # y.fs = int(y.fs / 2)
+ # elif flt == "SHQ3":
+ # y.fs = int(y.fs / 3)
return y.audio
diff --git a/ivas_processing_scripts/audiotools/wrappers/gen_patt.py b/ivas_processing_scripts/audiotools/wrappers/gen_patt.py
new file mode 100644
index 0000000000000000000000000000000000000000..aa480af1103e972ba9173d31df1055bf53a47277
--- /dev/null
+++ b/ivas_processing_scripts/audiotools/wrappers/gen_patt.py
@@ -0,0 +1,164 @@
+#!/usr/bin/env python3
+
+#
+# (C) 2022-2023 IVAS codec Public Collaboration with portions copyright Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository. All Rights Reserved.
+#
+# This software is protected by copyright law and by international treaties.
+# The IVAS codec Public Collaboration consisting of Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository retain full ownership rights in their respective contributions in
+# the software. This notice grants no license of any kind, including but not limited to patent
+# license, nor is any license granted by implication, estoppel or otherwise.
+#
+# Contributors are required to enter into the IVAS codec Public Collaboration agreement before making
+# contributions.
+#
+# This software is provided "AS IS", without any express or implied warranties. The software is in the
+# development stage. It is intended exclusively for experts who have experience with such software and
+# solely for the purpose of inspection. All implied warranties of non-infringement, merchantability
+# and fitness for a particular purpose are hereby disclaimed and excluded.
+#
+# Any dispute, controversy or claim arising under or in relation to providing this software shall be
+# submitted to and settled by the final, binding jurisdiction of the courts of Munich, Germany in
+# accordance with the laws of the Federal Republic of Germany excluding its conflict of law rules and
+# the United Nations Convention on Contracts on the International Sales of Goods.
+#
+
+from os import getcwd
+from pathlib import Path
+from tempfile import TemporaryDirectory
+from typing import Optional, Union
+
+from ivas_processing_scripts.audiotools.wrappers.random_seed import random_seed
+from ivas_processing_scripts.utils import find_binary, run
+
+ERROR_PATTERNS_DIR = Path(__file__).parent.parent.parent.joinpath("error_patterns")
+
+
+def gen_patt(
+ len_sig: int,
+ path_pattern: Union[Path, str],
+ error_rate: float,
+ start: Optional[int] = 0,
+ working_dir: Optional[Union[Path, str]] = None,
+) -> None:
+ """
+    Wrapper for the gen-patt binary to create error patterns for bitstream processing
+
+ Parameters
+ ----------
+ len_sig: int
+ Length of signal in frames
+ path_pattern: Union[Path, str]
+ Path of output pattern
+ error_rate: float
+ Error rate in percent
+ start: Optional[int]
+        Start frame of the error pattern (length of the preamble in frames)
+ working_dir: Optional[Union[Path, str]]
+        Directory in which the binary is called (the sta file has to be in this directory if used)
+ """
+
+ # find binary
+ binary = find_binary("gen-patt")
+
+ if working_dir is None:
+ working_dir = getcwd()
+
+ # set up command line
+ cmd = [
+ str(binary),
+ "-tailstat", # Statistics performed on the tail
+ "-fer", # Frame erasure mode using Gilbert model
+ "-g192", # Save error pattern in 16-bit G.192 format
+ "-gamma", # Correlation for BER|FER modes
+ str(0),
+ "-rate",
+ str(error_rate / 100),
+ "-tol", # Max deviation of specified BER/FER/BFER
+ str(0.001),
+        "-reset",  # Reset EID state in between iterations
+ "-n",
+ str(int(len_sig)),
+ "-start",
+ str(int(start) + 1),
+ path_pattern,
+ ]
+
+ # run command
+ run(cmd, cwd=working_dir)
+
+ return
+
+
+def create_error_pattern(
+ len_sig: int,
+ path_pattern: Union[Path, str],
+ frame_error_rate: float,
+ preamble: Optional[int] = 0,
+ master_seed: Optional[int] = 0,
+ prerun_seed: Optional[int] = 0,
+) -> None:
+ """
+    Creates an error pattern with the desired frame error rate for bitstream processing
+
+ Parameters
+ ----------
+ len_sig: int
+ Length of signal in frames
+ path_pattern: Union[Path, str]
+ Path of output pattern
+ frame_error_rate: float
+ Error rate in percent
+ preamble: Optional[int]
+ Length of preamble in frames
+ master_seed: Optional[int]
+ Master seed for error pattern generation
+    prerun_seed: Optional[int]
+ Number of preruns in seed generation
+ """
+
+ with TemporaryDirectory() as tmp_dir:
+ tmp_dir = Path(tmp_dir)
+
+ sta_file = ERROR_PATTERNS_DIR.joinpath("sta_template")
+ tmp_sta_file = tmp_dir.joinpath("sta")
+
+ # compute seed
+ seed = random_seed(master_seed, prerun_seed)
+
+        # open the sta template and fill in error rate, seed and transition probabilities
+ lines = []
+ with open(sta_file, "r") as sta_file_txt:
+ lines.append(sta_file_txt.readline()) # not changed
+ lines.append(f"{sta_file_txt.readline()[:-2]}{frame_error_rate/100}\n")
+ lines.append(sta_file_txt.readline()) # not changed
+ lines.append(f"{sta_file_txt.readline()[:-2]}{seed}\n")
+ lines.append(sta_file_txt.readline()) # not changed
+ lines.append(
+ f"{sta_file_txt.readline()[:-2]}{1-(frame_error_rate/100*2)}\n"
+ )
+ lines.append(sta_file_txt.readline()) # not changed
+ lines.append(
+ f"{sta_file_txt.readline()[:-2]}{1-(frame_error_rate/100*2)}\n"
+ )
+ lines.append(sta_file_txt.readline()) # not changed
+
+ with open(tmp_sta_file, "w") as tmp_sta_file_txt:
+ tmp_sta_file_txt.write("".join(lines))
+
+ gen_patt(
+ len_sig=len_sig,
+ error_rate=frame_error_rate,
+ path_pattern=path_pattern,
+ start=preamble,
+ working_dir=tmp_dir,
+ )
+
+ return
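
A minimal usage sketch of the new wrapper; all concrete values are hypothetical (a 10 s item at 20 ms frames, i.e. 500 frames, with a 1 s preamble and 5 % frame error rate, and an arbitrary output path):

```python
# Hypothetical usage of create_error_pattern from gen_patt.py
from ivas_processing_scripts.audiotools.wrappers.gen_patt import create_error_pattern

create_error_pattern(
    len_sig=500,                   # number of frames in the signal (10 s at 20 ms frames)
    path_pattern="fer_5pct.192",   # hypothetical output path for the G.192 pattern
    frame_error_rate=5,            # error rate in percent
    preamble=50,                   # preamble length in frames (pattern starts after it)
    master_seed=0,
    prerun_seed=0,
)
```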
diff --git a/ivas_processing_scripts/audiotools/wrappers/masaRenderer.py b/ivas_processing_scripts/audiotools/wrappers/masaRenderer.py
index 682fc91afa6c4a892eae558c4f2c779cacc6d494..7b5eafda04f862df703495f4d59c4c386e3c15c2 100755
--- a/ivas_processing_scripts/audiotools/wrappers/masaRenderer.py
+++ b/ivas_processing_scripts/audiotools/wrappers/masaRenderer.py
@@ -60,6 +60,7 @@ def masaRenderer(
output : np.ndarray
MASA rendered to out_fmt
"""
+
binary = find_binary("masaRenderer")
if out_fmt not in ["5_1", "7_1_4", "BINAURAL"]:
@@ -79,7 +80,7 @@ def masaRenderer(
str(binary),
output_mode,
"", # 2 -> inputPcm
- str(masa.metadata_file.resolve()),
+ str(masa.metadata_files.resolve()),
"", # 4 -> outputPcm
]
diff --git a/ivas_processing_scripts/audiotools/wrappers/networkSimulator.py b/ivas_processing_scripts/audiotools/wrappers/networkSimulator.py
new file mode 100644
index 0000000000000000000000000000000000000000..fa1b7509bab016fa1f643f564a45293a3da1cc0d
--- /dev/null
+++ b/ivas_processing_scripts/audiotools/wrappers/networkSimulator.py
@@ -0,0 +1,198 @@
+#!/usr/bin/env python3
+
+#
+# (C) 2022-2023 IVAS codec Public Collaboration with portions copyright Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository. All Rights Reserved.
+#
+# This software is protected by copyright law and by international treaties.
+# The IVAS codec Public Collaboration consisting of Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository retain full ownership rights in their respective contributions in
+# the software. This notice grants no license of any kind, including but not limited to patent
+# license, nor is any license granted by implication, estoppel or otherwise.
+#
+# Contributors are required to enter into the IVAS codec Public Collaboration agreement before making
+# contributions.
+#
+# This software is provided "AS IS", without any express or implied warranties. The software is in the
+# development stage. It is intended exclusively for experts who have experience with such software and
+# solely for the purpose of inspection. All implied warranties of non-infringement, merchantability
+# and fitness for a particular purpose are hereby disclaimed and excluded.
+#
+# Any dispute, controversy or claim arising under or in relation to providing this software shall be
+# submitted to and settled by the final, binding jurisdiction of the courts of Munich, Germany in
+# accordance with the laws of the Federal Republic of Germany excluding its conflict of law rules and
+# the United Nations Convention on Contracts on the International Sales of Goods.
+#
+
+import os.path
+from pathlib import Path
+from typing import Optional, Union
+
+from ivas_processing_scripts.utils import find_binary, run
+
+LIST_JBM_PROFILES = range(12)
+ERROR_PATTERNS_DIR = Path(__file__).parent.parent.parent.joinpath("dly_error_profiles")
+
+
+def validate_network_simulator(
+ error_pattern: Optional[Union[Path, str]] = None,
+ error_profile: Optional[int] = None,
+ n_frames_per_packet: Optional[int] = None,
+) -> None:
+ """
+ Validate settings for the network simulator
+
+ Parameters
+ ----------
+ error_pattern: Optional[Union[Path, str]]
+ Path to existing error pattern
+ error_profile: Optional[int]
+ Index of existing error pattern
+ n_frames_per_packet: Optional[int]
+        Number of frames per packet
+ """
+
+ if find_binary("networkSimulator_g192") is None:
+ raise FileNotFoundError(
+ "The network simulator binary was not found! Please check the configuration."
+ )
+ if error_pattern is not None:
+ if not os.path.exists(os.path.realpath(error_pattern)):
+ raise FileNotFoundError(
+ f"The network simulator error profile file {error_pattern} was not found! Please check the configuration."
+ )
+ if error_profile is not None:
+ raise ValueError(
+ "JBM pattern and JBM profile number are specified for bitstream processing. Can't use both! Please check the configuration."
+ )
+ elif error_profile is not None:
+ if error_profile not in LIST_JBM_PROFILES:
+ raise ValueError(
+ f"JBM profile number {error_profile} does not exist, should be between {LIST_JBM_PROFILES[0]} and {LIST_JBM_PROFILES[-1]}"
+ )
+ if n_frames_per_packet is not None and n_frames_per_packet not in [1, 2]:
+ raise ValueError(
+            f"n_frames_per_packet is {n_frames_per_packet}. Should be 1 or 2. Please check your configuration."
+ )
+
+ return
+
+
+def network_simulator(
+ error_pattern: Union[str, Path],
+ in_bitstream: Union[str, Path],
+ out_bitstream: Union[str, Path],
+ n_frames_per_packet: int,
+ offset: int,
+) -> None:
+ """
+    Wrapper for the networkSimulator_g192 binary to apply error patterns for bitstream processing
+
+ Parameters
+ ----------
+ error_pattern: Union[str, Path]
+ Path to error pattern file
+ in_bitstream: Union[str, Path]
+ Path to input bitstream file
+ out_bitstream: Union[str, Path]
+ Output path for modified bitstream
+    n_frames_per_packet: int
+        Number of frames per packet [1, 2]
+    offset: int
+        Delay offset
+ """
+
+ # find binary
+ binary = find_binary("networkSimulator_g192")
+
+ # check for valid inputs
+ if not Path(in_bitstream).is_file():
+ raise ValueError(
+ f"Input bitstream file {in_bitstream} for bitstream processing does not exist"
+ )
+ elif not Path(error_pattern).is_file():
+ raise ValueError(
+ f"Error pattern file {error_pattern} for bitstream processing does not exist"
+ )
+
+ # set up command line
+ cmd = [
+ str(binary),
+ error_pattern,
+ in_bitstream,
+ out_bitstream,
+ f"{out_bitstream}_tracefile_sim",
+ str(n_frames_per_packet),
+ str(offset),
+ ]
+
+ # run command
+ run(cmd)
+
+ return
+
+
+def apply_network_simulator(
+ in_bitstream: Union[Path, str],
+ out_bitstream: Union[Path, str],
+ error_pattern: Optional[Union[Path, str]] = None,
+ error_profile: Optional[int] = None,
+ n_frames_per_packet: Optional[int] = None,
+ offset: Optional[int] = 0,
+) -> None:
+ """
+    Function to apply a network simulator profile to a bitstream
+
+ Parameters
+ ----------
+ in_bitstream: Union[Path, str]
+ Path of input bitstream
+ out_bitstream: Union[Path, str]
+ Path of output bitstream
+ error_pattern: Optional[Union[Path, str]]
+ Path to existing error pattern
+ error_profile: Optional[int]
+ Index of existing error pattern
+ n_frames_per_packet: Optional[int]
+        Number of frames per packet
+    offset: Optional[int]
+        Delay offset
+ """
+
+ if error_pattern is None:
+ # create error pattern
+ if error_profile is not None:
+ if error_profile in LIST_JBM_PROFILES:
+ error_pattern = ERROR_PATTERNS_DIR.joinpath(
+ f"dly_error_profile_{error_profile}.dat"
+ )
+ else:
+ raise ValueError(
+ f"JBM profile number {error_profile} does not exist, should be between {LIST_JBM_PROFILES[0]} and {LIST_JBM_PROFILES[-1]}"
+ )
+ else:
+ raise ValueError(
+ "Either error pattern or error profile number has to be specified for network simulator bitstream processing"
+ )
+ elif error_profile is not None:
+ raise ValueError(
+ "JBM pattern and JBM profile number are specified for bitstream processing. Can't use both"
+ )
+
+ if n_frames_per_packet is None:
+ n_frames_per_packet = 1
+ if error_profile is not None and error_profile == 5:
+ n_frames_per_packet = 2
+
+ # apply error pattern
+ network_simulator(
+ error_pattern, in_bitstream, out_bitstream, n_frames_per_packet, offset
+ )
+
+ return
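
A minimal usage sketch of the network simulator wrapper; the paths are hypothetical, and profile 5 is one of the bundled dly_error_profile files (which, per apply_network_simulator, defaults to 2 frames per packet):

```python
# Hypothetical usage of apply_network_simulator from networkSimulator.py
from ivas_processing_scripts.audiotools.wrappers.networkSimulator import (
    apply_network_simulator,
    validate_network_simulator,
)

# fail early if the binary or the settings are invalid
validate_network_simulator(error_pattern=None, error_profile=5, n_frames_per_packet=None)

apply_network_simulator(
    in_bitstream="item.192",             # hypothetical encoded bitstream
    out_bitstream="item_processed.192",  # hypothetical output path
    error_profile=5,                     # use the bundled dly_error_profile_5.dat
)
```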
diff --git a/ivas_processing_scripts/audiotools/wrappers/p50fbmnru.py b/ivas_processing_scripts/audiotools/wrappers/p50fbmnru.py
index 0c67b679c54e12a3d20826200eba1e99672422de..76601048c56de10eb085ce4eb879552371b2e570 100755
--- a/ivas_processing_scripts/audiotools/wrappers/p50fbmnru.py
+++ b/ivas_processing_scripts/audiotools/wrappers/p50fbmnru.py
@@ -32,7 +32,6 @@
from pathlib import Path
from tempfile import TemporaryDirectory
-from typing import Tuple
from warnings import warn
import numpy as np
@@ -46,7 +45,7 @@ from ivas_processing_scripts.utils import find_binary, run
def p50fbmnru(
input: audio.Audio,
q_db: float,
-) -> Tuple[np.ndarray, int]:
+) -> np.ndarray:
"""
Wrapper for P.50 Fullband MNRU (Modulated Noise Reference Unit), requires p50fbmnru binary
The mode is M (Modulated Noise) as specified in section 5.2.1 of S4-141392 - EVS-7c Processing functions for characterization phase v110.doc
diff --git a/ivas_processing_scripts/audiotools/wrappers/random_seed.py b/ivas_processing_scripts/audiotools/wrappers/random_seed.py
new file mode 100644
index 0000000000000000000000000000000000000000..802f68b9e78ff95d8426e44bb0e8a837279149e7
--- /dev/null
+++ b/ivas_processing_scripts/audiotools/wrappers/random_seed.py
@@ -0,0 +1,84 @@
+#!/usr/bin/env python3
+
+#
+# (C) 2022-2023 IVAS codec Public Collaboration with portions copyright Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository. All Rights Reserved.
+#
+# This software is protected by copyright law and by international treaties.
+# The IVAS codec Public Collaboration consisting of Dolby International AB, Ericsson AB,
+# Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Huawei Technologies Co. LTD.,
+# Koninklijke Philips N.V., Nippon Telegraph and Telephone Corporation, Nokia Technologies Oy, Orange,
+# Panasonic Holdings Corporation, Qualcomm Technologies, Inc., VoiceAge Corporation, and other
+# contributors to this repository retain full ownership rights in their respective contributions in
+# the software. This notice grants no license of any kind, including but not limited to patent
+# license, nor is any license granted by implication, estoppel or otherwise.
+#
+# Contributors are required to enter into the IVAS codec Public Collaboration agreement before making
+# contributions.
+#
+# This software is provided "AS IS", without any express or implied warranties. The software is in the
+# development stage. It is intended exclusively for experts who have experience with such software and
+# solely for the purpose of inspection. All implied warranties of non-infringement, merchantability
+# and fitness for a particular purpose are hereby disclaimed and excluded.
+#
+# Any dispute, controversy or claim arising under or in relation to providing this software shall be
+# submitted to and settled by the final, binding jurisdiction of the courts of Munich, Germany in
+# accordance with the laws of the Federal Republic of Germany excluding its conflict of law rules and
+# the United Nations Convention on Contracts on the International Sales of Goods.
+#
+
+from typing import Optional, Union
+
+from ivas_processing_scripts.utils import find_binary, run
+
+
+def random_seed(
+ master_seed: Optional[int] = 0,
+ prerun_seed: Optional[int] = 0,
+ hexa: Optional[bool] = True,
+) -> Union[int, str]:
+ """
+    Wrapper for the random binary to generate a random seed for error pattern generation
+
+ Parameters
+ ----------
+ master_seed: Optional[int]
+ Master seed for error pattern generation
+ prerun_seed: Optional[int]
+ Number of preruns in seed generation
+    hexa: Optional[bool]
+ Flag if output should be in hexadecimal or decimal format
+
+ Returns
+ -------
+    result: Union[int, str]
+        One random value (hexadecimal string if hexa is True)
+ """
+
+ # find binary
+ binary = find_binary("random")
+
+ # set up command line
+ cmd = [
+ str(binary),
+ "-n", # Number of items
+ str(1),
+ "-s",
+ str(master_seed),
+ "-d",
+ str(prerun_seed),
+ "-r", # value range for results
+ str(0),
+ str(99999999),
+ ]
+
+ # run command
+ result = run(cmd)
+ result = int(result.stdout[:-1])
+
+ if hexa:
+ result = hex(result)
+
+ return result
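
A minimal usage sketch; the seed values are arbitrary and the shown result is only an example of the hexadecimal format returned with hexa=True:

```python
# Hypothetical usage of random_seed from random_seed.py
from ivas_processing_scripts.audiotools.wrappers.random_seed import random_seed

seed = random_seed(master_seed=42, prerun_seed=3)  # e.g. "0x2f1a"; the value depends on the random binary
print(seed)
```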
diff --git a/ivas_processing_scripts/bin/README.txt b/ivas_processing_scripts/bin/README.txt
index 00bdbec2c39937e87e73c98c4d16c2ac035a8427..2bbdd44fad830c9d3b0717df8f28845c5e890892 100755
--- a/ivas_processing_scripts/bin/README.txt
+++ b/ivas_processing_scripts/bin/README.txt
@@ -1,9 +1,14 @@
-Necessary binaries/executables:
-- ITU tools: https://github.com/openitu/STL
- - bs1770demo used for loudness measurement and adjustment
- - filter used for 50Hz high-pass filtering, 3.5kHz and 7kHz low-pass filtering and resampling
- - esdru used for ESDRU condition
- - p50fbmnru used for MNRU condition
-- MASA tools: https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip
- - masaRenderer used for rendering of MASA signals to binaural, 7.1+4 and 5.1 format (Attention: some files have to be in the same directory as the renderer)
+Necessary additional executables:
+
+| Processing step | Executable | Where to find |
+|---------------------------------|-----------------------|-------------------------------------------------------------------------------------------------------------|
+| Loudness adjustment | bs1770demo | https://github.com/openitu/STL |
+| MNRU | p50fbmnru | https://github.com/openitu/STL |
+| ESDRU | esdru | https://github.com/openitu/STL |
+| Frame error pattern application | eid-xor | https://github.com/openitu/STL |
+| Error pattern generation | gen-patt | https://www.itu.int/rec/T-REC-G.191-201003-S/en (Note: Version in https://github.com/openitu/STL is buggy!) |
+| Filtering, Resampling | filter | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
+| Random offset/seed generation | random | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip |
+| JBM network simulator           | networkSimulator_g192 | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip                                                         |
+| MASA rendering | masaRenderer | https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip |
diff --git a/ivas_processing_scripts/constants.py b/ivas_processing_scripts/constants.py
index ff56b9bf29e07358b4a7b52f335882784e174e7c..9b3ef3034edd0e2a53097b2c2ee9b085be6ad184 100755
--- a/ivas_processing_scripts/constants.py
+++ b/ivas_processing_scripts/constants.py
@@ -58,6 +58,8 @@ DEFAULT_CONFIG = {
"git_sha": f"{get_gitsha()}",
"multiprocessing": True,
"delete_tmp": False,
+ "master_seed": 0,
+ "prerun_seed": 0,
"concatenate_input": False,
"concat_silence": {
"pre": 0,
@@ -69,7 +71,7 @@ DEFAULT_CONFIG = {
# postprocessing
"postprocessing": {
"hp50": False,
- "limit": True,
+ "limit": False,
},
}
DEFAULT_CONFIG_EVS = {
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_0.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_0.dat
new file mode 100644
index 0000000000000000000000000000000000000000..804fcc90e91c1b6c31475a2b7bd575591e229695
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_0.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a271f2a916b0b6ee6cecb2426f0b3206ef074578be55d9bc94f6f3fe3ab86aa
+size 2
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_1.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_1.dat
new file mode 100644
index 0000000000000000000000000000000000000000..cb7a22599b201cce6b6d9204cff47d62d923b437
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_1.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e3c23b360bd99f9d56d4c39912af8f399b4b3809bb23beecf9fe7be3ac162d72
+size 30000
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_10.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_10.dat
new file mode 100644
index 0000000000000000000000000000000000000000..c6d88aa980be14f7d2502cadf4a3991c4d389753
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_10.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fe10c959f17a185a272fae419b328de835981f4e845b7ac581b44dac6b9833f8
+size 31247
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_11.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_11.dat
new file mode 100644
index 0000000000000000000000000000000000000000..7553ab3973049a05529561522ca9427e611c889f
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_11.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:950ba178f3cacaeec6c3dd97e651177c9768290770edcc998ab7b8925722b147
+size 71025
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_2.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_2.dat
new file mode 100644
index 0000000000000000000000000000000000000000..6bdf655719cd3dffea1aa58fcd370914513a2dea
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_2.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:400ab310d7d5d9aef139939fbe1130c510d275255394ab1e15b32162034ad704
+size 29982
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_3.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_3.dat
new file mode 100644
index 0000000000000000000000000000000000000000..1668e4e32c10d0479d7a1facbc754ff654b5d357
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_3.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:589722d6abd82ce5bdec9e5a57ac37011fa6385de9ad74a6df660bdc8b2f7b05
+size 29962
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_4.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_4.dat
new file mode 100644
index 0000000000000000000000000000000000000000..3fafe8e55527c48a4f6a48fc3b5810ee17bc93ec
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_4.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0db043e25d133b68488f3a815f2f438c97a773f287c1cc50281dfcc3a38c1ea1
+size 29820
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_5.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_5.dat
new file mode 100644
index 0000000000000000000000000000000000000000..cd00cd9eee545126786a36608704ec409ea151e2
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_5.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ff5fc111d516dfa335e85c381b04c229ba40576099af7ae079559d49147f23fd
+size 29556
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_6.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_6.dat
new file mode 100644
index 0000000000000000000000000000000000000000..940ad87c89a7fc6174721a456dfe17f545c5d022
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_6.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d661669942ee60dc6a2bd83e2e05e3daea06149928d350eca97a72868fd9c522
+size 30000
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_7.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_7.dat
new file mode 100644
index 0000000000000000000000000000000000000000..ab1e94f669b32f27cabd33266622bdfd6c79223a
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_7.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd1f5a09a5fef9aacf7b4b030af813c982ce117eb0ff79008e248d575a146824
+size 31737
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_8.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_8.dat
new file mode 100644
index 0000000000000000000000000000000000000000..c0369c37c809b6d346a3e717439098cfede17c94
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_8.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0de3fd89270f89f4d134344b57925f3b7a115dd581cfa9d36ecb26d39cf4b76f
+size 31503
diff --git a/ivas_processing_scripts/dly_error_profiles/dly_error_profile_9.dat b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_9.dat
new file mode 100644
index 0000000000000000000000000000000000000000..ca8384de7e720cc5ee600f3f91477a0a3861dcd2
--- /dev/null
+++ b/ivas_processing_scripts/dly_error_profiles/dly_error_profile_9.dat
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f0eed7aa93fe00fe88595a95c25a4ba250298df73159a42bc53cfcafeefa294
+size 31343
diff --git a/ivas_processing_scripts/error_patterns/sta_template b/ivas_processing_scripts/error_patterns/sta_template
new file mode 100644
index 0000000000000000000000000000000000000000..0683ed82890180bf873b531d6c601a5588fdcb1a
--- /dev/null
+++ b/ivas_processing_scripts/error_patterns/sta_template
@@ -0,0 +1,9 @@
+EID
+BER = x
+GAMMA = 0.000000
+RAN-seed = x
+Current State = G
+GOOD->GOOD = x
+GOOD->BAD = 1.000000
+BAD ->GOOD = x
+BAD ->BAD = 1.000000
\ No newline at end of file
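
As a sketch of how create_error_pattern in gen_patt.py fills this template: with a 5 % frame error rate and a hypothetical seed of 0x2f1a, the generated sta file would look roughly as follows; only the BER, RAN-seed, GOOD->GOOD and BAD ->GOOD lines are modified.

```
EID
BER = 0.05
GAMMA = 0.000000
RAN-seed = 0x2f1a
Current State = G
GOOD->GOOD = 0.9
GOOD->BAD = 1.000000
BAD ->GOOD = 0.9
BAD ->BAD = 1.000000
```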
diff --git a/ivas_processing_scripts/processing/chains.py b/ivas_processing_scripts/processing/chains.py
index 1761cbbdeb02156ecc7ff3e356f57fcca14a391b..fc696d922dd1d6819be38b637db2c9d8f77dcf5f 100755
--- a/ivas_processing_scripts/processing/chains.py
+++ b/ivas_processing_scripts/processing/chains.py
@@ -189,6 +189,32 @@ def get_processing_chain(
cod_cfg = cond_cfg["cod"]
dec_cfg = cond_cfg["dec"]
+ # Frame error pattern bitstream modification
+ if "tx" in cond_cfg.keys():
+ tx_cfg = cond_cfg["tx"]
+ elif hasattr(cfg, "tx"):
+ if cfg.tx.get("type", None) == "FER":
+ tx_cfg = {
+ "type": cfg.tx.get("type", None),
+ "error_pattern": cfg.tx.get("error_pattern", None),
+ "error_rate": cfg.tx.get("error_rate", None),
+ "master_seed": cfg.master_seed,
+ "prerun_seed": cfg.prerun_seed,
+ }
+ elif cfg.tx.get("type", None) == "JBM":
+ tx_cfg = {
+ "type": cfg.tx.get("type", None),
+ "error_pattern": cfg.tx.get("error_pattern", None),
+ "error_profile": cfg.tx.get("error_profile", None),
+ "n_frames_per_packet": cfg.tx.get("n_frames_per_packet", 1),
+ }
+ else:
+ raise ValueError(
+                "Type of bitstream processing is either missing or not valid"
+ )
+ else:
+ tx_cfg = None
+
chain["processes"].append(
EVS(
{
@@ -201,6 +227,8 @@ def get_processing_chain(
"dec_bin": dec_cfg.get("bin"),
"dec_opts": dec_cfg.get("opts"),
"multiprocessing": cfg.multiprocessing,
+ "tx": tx_cfg,
+ "preamble": cfg.preamble,
}
)
)
@@ -211,15 +239,29 @@ def get_processing_chain(
cod_cfg = cond_cfg["cod"]
dec_cfg = cond_cfg["dec"]
- # local tx overrides global one, or just allow global?
+ # Frame error pattern bitstream modification
if "tx" in cond_cfg.keys():
tx_cfg = cond_cfg["tx"]
elif hasattr(cfg, "tx"):
- tx_cfg = {
- "bin": cfg.tx["bs_proc_bin"],
- "error_pattern": cfg.tx["error_pattern"],
- "opts": cfg.tx["bs_proc_opts"],
- }
+ if cfg.tx.get("type", None) == "FER":
+ tx_cfg = {
+ "type": cfg.tx.get("type", None),
+ "error_pattern": cfg.tx.get("error_pattern", None),
+ "error_rate": cfg.tx.get("error_rate", None),
+ "master_seed": cfg.master_seed,
+ "prerun_seed": cfg.prerun_seed,
+ }
+ elif cfg.tx.get("type", None) == "JBM":
+ tx_cfg = {
+ "type": cfg.tx.get("type", None),
+ "error_pattern": cfg.tx.get("error_pattern", None),
+ "error_profile": cfg.tx.get("error_profile", None),
+ "n_frames_per_packet": cfg.tx.get("n_frames_per_packet", 1),
+ }
+ else:
+ raise ValueError(
+                    "Type of bitstream processing is either missing or not valid"
+ )
else:
tx_cfg = None
@@ -237,6 +279,7 @@ def get_processing_chain(
"dec_opts": dec_cfg.get("opts"),
"multiprocessing": cfg.multiprocessing,
"tx": tx_cfg,
+ "preamble": cfg.preamble,
}
)
)
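
As a sketch, a global tx block of type FER with an error_rate of 5 (and no explicit error_pattern) would be translated into roughly the following per-condition dictionary, with the seeds taken from the top-level master_seed/prerun_seed defaults:

```python
# Hypothetical tx_cfg resulting from a global FER tx configuration with error_rate: 5
tx_cfg = {
    "type": "FER",
    "error_pattern": None,
    "error_rate": 5,
    "master_seed": 0,
    "prerun_seed": 0,
}
```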
diff --git a/ivas_processing_scripts/processing/evs.py b/ivas_processing_scripts/processing/evs.py
index 98cb5ce84abb421f05bbfcca03776afb9649ae07..ebe6ebf0ef6440382f3226e4424668ffeb244a85 100755
--- a/ivas_processing_scripts/processing/evs.py
+++ b/ivas_processing_scripts/processing/evs.py
@@ -31,18 +31,29 @@
#
import logging
+import os.path
import platform
from itertools import repeat
from pathlib import Path
from shutil import copyfile
-from typing import Optional
+from typing import Optional, Union
from ivas_processing_scripts.audiotools import audio
from ivas_processing_scripts.audiotools.audiofile import (
combine,
parse_wave_header,
+ read,
split_channels,
)
+from ivas_processing_scripts.audiotools.constants import IVAS_FRAME_LEN_MS
+from ivas_processing_scripts.audiotools.wrappers.eid_xor import (
+ create_and_apply_error_pattern,
+ validate_error_pattern_application,
+)
+from ivas_processing_scripts.audiotools.wrappers.networkSimulator import (
+ apply_network_simulator,
+ validate_network_simulator,
+)
from ivas_processing_scripts.processing.processing import Processing
from ivas_processing_scripts.utils import apply_func_parallel, run
@@ -89,6 +100,19 @@ class EVS(Processing):
self.bitrate.extend(
[self.bitrate[-1]] * (self.in_fmt.num_channels - len(self.bitrate))
)
+        # check the existence of error pattern files (if given) already at this point
+ if self.tx is not None:
+ if self.tx.get("type", None) == "JBM":
+ validate_network_simulator(
+ self.tx["error_pattern"],
+ self.tx["error_profile"],
+ self.tx["n_frames_per_packet"],
+ )
+ elif self.tx.get("type", None) == "FER":
+ validate_error_pattern_application(
+ self.tx["error_pattern"],
+ self.tx["error_rate"],
+ )
def process(
self, in_file: Path, out_file: Path, in_meta, logger: logging.Logger
@@ -158,7 +182,13 @@ class EVS(Processing):
show_progress=False,
)
- # TODO check eid-xor/networkSimulator here
+ split_chan_bs = apply_func_parallel(
+ self.simulate_tx,
+ zip(split_chan_files, split_chan_bs, repeat(logger)),
+ None,
+ "mt" if self.multiprocessing else None,
+ show_progress=False,
+ )
# run all decoders
logger.debug(f"Running EVS decoders for {out_file.stem.split('.')[0]}")
@@ -236,6 +266,60 @@ class EVS(Processing):
run(cmd, logger=logger)
+ def simulate_tx(
+ self,
+ in_file: Union[Path, str],
+ bitstream: Path,
+ logger: Optional[logging.Logger] = None,
+ ) -> Union[Path, str]:
+ if self.tx is not None:
+ if self.tx["type"] == "JBM":
+ bs, ext = os.path.splitext(bitstream)
+ bitstream_processed = Path(f"{bs}_processed{ext}")
+ logger.debug(f"Network simulator {bitstream} -> {bitstream_processed}")
+ apply_network_simulator(
+ bitstream,
+ bitstream_processed,
+ self.tx["error_pattern"],
+ self.tx["error_profile"],
+ self.tx["n_frames_per_packet"],
+ )
+ # add -voip cmdline option to the decoder
+ # TODO: tracefile also?
+ if self.dec_opts:
+ if "-voip" not in self.dec_opts:
+ self.dec_opts.extend(["-voip"])
+ else:
+ self.dec_opts = ["-voip"]
+ return bitstream_processed
+
+ elif self.tx["type"] == "FER":
+ bs, ext = os.path.splitext(bitstream)
+ bitstream_processed = Path(f"{bs}_processed{ext}")
+ signal, _ = read(in_file, fs=self.in_fs, nchannels=1)
+ frame_number = len(signal) / self.in_fs / IVAS_FRAME_LEN_MS * 1000
+ if self.preamble:
+ frame_number_preamble = self.preamble / IVAS_FRAME_LEN_MS
+ else:
+ frame_number_preamble = 0
+ logger.debug(
+ f"Frame loss simulator {bitstream} -> {bitstream_processed}"
+ )
+ create_and_apply_error_pattern(
+ in_bitstream=bitstream,
+ out_bitstream=bitstream_processed,
+ len_sig=frame_number,
+ preamble=frame_number_preamble,
+ error_pattern=self.tx["error_pattern"],
+ error_rate=self.tx["error_rate"],
+ master_seed=self.tx["master_seed"],
+ prerun_seed=self.tx["prerun_seed"],
+ )
+
+ return bitstream_processed
+ else:
+ return bitstream
+
def dec(
self,
bitstream: Path,
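
As a quick sanity check of the frame counts used in simulate_tx, assuming IVAS_FRAME_LEN_MS is 20 and the preamble is given in milliseconds (the 8 s / 48 kHz figures are hypothetical):

```python
# Hypothetical numbers: an 8 s mono item at 48 kHz with a 1 s preamble
IVAS_FRAME_LEN_MS = 20        # assumed frame length in milliseconds
in_fs = 48000
num_samples = 8 * in_fs       # corresponds to len(signal)
preamble_ms = 1000

frame_number = num_samples / in_fs / IVAS_FRAME_LEN_MS * 1000  # -> 400.0 frames
frame_number_preamble = preamble_ms / IVAS_FRAME_LEN_MS        # -> 50.0 frames
```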
diff --git a/ivas_processing_scripts/processing/ivas.py b/ivas_processing_scripts/processing/ivas.py
index d7709360d50e01cbcc2c12b5449dea24e8dcbd03..ddfbc20743efcfdce440e895b41d67982bc58879 100755
--- a/ivas_processing_scripts/processing/ivas.py
+++ b/ivas_processing_scripts/processing/ivas.py
@@ -33,12 +33,20 @@
import logging
import os.path
import platform
-from copy import deepcopy
from pathlib import Path
-from typing import Optional
+from typing import Optional, Union
from ivas_processing_scripts.audiotools import audio
-from ivas_processing_scripts.audiotools.audiofile import parse_wave_header
+from ivas_processing_scripts.audiotools.audiofile import parse_wave_header, read
+from ivas_processing_scripts.audiotools.constants import IVAS_FRAME_LEN_MS
+from ivas_processing_scripts.audiotools.wrappers.eid_xor import (
+ create_and_apply_error_pattern,
+ validate_error_pattern_application,
+)
+from ivas_processing_scripts.audiotools.wrappers.networkSimulator import (
+ apply_network_simulator,
+ validate_network_simulator,
+)
from ivas_processing_scripts.processing.processing import Processing
from ivas_processing_scripts.utils import run
@@ -50,6 +58,8 @@ class IVAS(Processing):
self.name = "ivas"
self.in_fmt = audio.fromtype(self.in_fmt)
self.out_fmt = audio.fromtype(self.out_fmt)
+ if not hasattr(self, "dec_opts"):
+ self.dec_opts = None
def _validate(self):
if not self.cod_bin or not Path(self.cod_bin).exists():
@@ -70,11 +80,19 @@ class IVAS(Processing):
raise FileNotFoundError(
"The IVAS decoder binary was not found! Please check the configuration."
)
+
+        # check the existence of error pattern files (if given) already at this point
if self.tx is not None:
- if not self.tx["bin"] or not Path(self.tx["bin"]).exists():
- bin = self.tx["bin"]
- raise FileNotFoundError(
- f"The transport simulator binary {bin} was not found! Please check the configuration."
+ if self.tx.get("type", None) == "JBM":
+ validate_network_simulator(
+ self.tx["error_pattern"],
+ self.tx["error_profile"],
+ self.tx["n_frames_per_packet"],
+ )
+ elif self.tx.get("type", None) == "FER":
+ validate_error_pattern_application(
+ self.tx["error_pattern"],
+ self.tx["error_rate"],
)
def process(
@@ -100,7 +118,7 @@ class IVAS(Processing):
self.enc(in_file, bitstream, in_meta, logger)
- bitstream = self.simulate_tx(bitstream, logger)
+ bitstream = self.simulate_tx(in_file, bitstream, logger)
self.dec(bitstream, out_file, logger)
@@ -191,27 +209,59 @@ class IVAS(Processing):
run(cmd, logger=logger)
def simulate_tx(
- self, bitstream: Path, logger: Optional[logging.Logger] = None
- ) -> Path:
- # TODO run eid-xor/networkSimulator
+ self,
+ in_file: Union[Path, str],
+ bitstream: Path,
+ logger: Optional[logging.Logger] = None,
+ ) -> Union[Path, str]:
if self.tx is not None:
- cmd = [self.tx["bin"]]
- bs, ext = os.path.splitext(bitstream)
- error_pattern = self.tx["error_pattern"]
- bitstream_processed = Path(bs + "." + os.path.basename(error_pattern) + ext)
- opts = deepcopy(self.tx["opts"])
- opts = [
- x.format(
- error_pattern=error_pattern,
- bitstream=bitstream,
- processed_bitstream=bitstream_processed,
+ if self.tx["type"] == "JBM":
+ bs, ext = os.path.splitext(bitstream)
+ bitstream_processed = Path(f"{bs}_processed{ext}")
+ logger.debug(f"Network simulator {bitstream} -> {bitstream_processed}")
+ apply_network_simulator(
+ bitstream,
+ bitstream_processed,
+ self.tx["error_pattern"],
+ self.tx["error_profile"],
+ self.tx["n_frames_per_packet"],
)
- for x in opts
- ]
- logger.debug(f"Network simulator {bitstream} -> {bitstream_processed}")
- cmd.extend(opts)
- run(cmd, logger=logger)
- return bitstream_processed
+ # add -voip cmdline option to the decoder
+ # TODO: tracefile also?
+ if self.dec_opts:
+ if "-voip" not in self.dec_opts:
+ self.dec_opts.extend(["-voip"])
+
+ else:
+ self.dec_opts = ["-voip"]
+ return bitstream_processed
+
+ elif self.tx["type"] == "FER":
+ bs, ext = os.path.splitext(bitstream)
+ bitstream_processed = Path(f"{bs}_processed{ext}")
+ signal, _ = read(
+ in_file, fs=self.in_fs, nchannels=self.in_fmt.num_channels
+ )
+ frame_number = len(signal) / self.in_fs / IVAS_FRAME_LEN_MS * 1000
+ if self.preamble:
+ frame_number_preamble = self.preamble / IVAS_FRAME_LEN_MS
+ else:
+ frame_number_preamble = 0
+ logger.debug(
+ f"Frame loss simulator {bitstream} -> {bitstream_processed}"
+ )
+ create_and_apply_error_pattern(
+ in_bitstream=bitstream,
+ out_bitstream=bitstream_processed,
+ len_sig=frame_number,
+ preamble=frame_number_preamble,
+ error_pattern=self.tx["error_pattern"],
+ error_rate=self.tx["error_rate"],
+ master_seed=self.tx["master_seed"],
+ prerun_seed=self.tx["prerun_seed"],
+ )
+
+ return bitstream_processed
else:
return bitstream
diff --git a/ivas_processing_scripts/processing/processing.py b/ivas_processing_scripts/processing/processing.py
index 411f32790ccec7f379d1b97c3a9733d2ddf2976e..d0b80a7660b4dfd2aa3a9b5e94f4f57eb5d9c0fa 100755
--- a/ivas_processing_scripts/processing/processing.py
+++ b/ivas_processing_scripts/processing/processing.py
@@ -36,7 +36,6 @@ from itertools import repeat
from pathlib import Path
from shutil import copyfile
from typing import Iterable, Union
-from warnings import warn
from ivas_processing_scripts.audiotools.audiofile import concat, split
from ivas_processing_scripts.audiotools.metadata import (
@@ -63,112 +62,104 @@ def concat_setup(cfg: TestConfig, logger: logging.Logger):
if any([i for i in cfg.items_list if i.suffix == ".txt"]):
raise SystemExit("Concatenation for text files is unsupported")
- if len(cfg.items_list) > 1:
- logger.info(f"Concatenating input files in directory {cfg.input_path}")
+ logger.info(f"Concatenating input files in directory {cfg.input_path}")
- # concatenate ISM metadata
- if cfg.input["fmt"].startswith("ISM"):
- cfg.concat_meta = []
- for obj_idx in range(len(cfg.metadata_path[0])):
- cfg.concat_meta.append(
- cfg.output_path.joinpath(
- f"{cfg.input_path.name}_concatenated.wav.{obj_idx}.csv"
- )
+ # concatenate ISM metadata
+ if cfg.input["fmt"].startswith("ISM"):
+ cfg.concat_meta = []
+ for obj_idx in range(len(cfg.metadata_path[0])):
+ cfg.concat_meta.append(
+ cfg.output_path.joinpath(
+ f"{cfg.input_path.name}_concatenated.wav.{obj_idx}.csv"
)
-
- concat_meta_from_file(
- cfg.items_list,
- cfg.metadata_path,
- cfg.concat_meta,
- cfg.concat_silence.get("pre", 0),
- cfg.concat_silence.get("post", 0),
- cfg.input["fmt"],
- preamble=cfg.preamble,
)
- # set input to the concatenated file we have just written to the output dir
- cfg.metadata_path = [cfg.concat_meta]
-
- # concatenate audio
- cfg.concat_file = cfg.output_path.joinpath(
- f"{cfg.input_path.name}_concatenated.wav"
- )
-
- cfg.splits = concat(
+ concat_meta_from_file(
cfg.items_list,
- cfg.concat_file,
+ cfg.metadata_path,
+ cfg.concat_meta,
cfg.concat_silence.get("pre", 0),
cfg.concat_silence.get("post", 0),
- cfg.input.get("fs", 48000),
+ cfg.input["fmt"],
preamble=cfg.preamble,
- pad_noise_preamble=cfg.pad_noise_preamble,
)
- # save item naming for splits naming in the end
- cfg.split_names = []
- for name in cfg.items_list:
- cfg.split_names.append(Path(name).stem.split(".")[0])
# set input to the concatenated file we have just written to the output dir
- cfg.items_list = [cfg.concat_file]
+ cfg.metadata_path = [cfg.concat_meta]
- # write out splits
- with open(cfg.concat_file.with_suffix(".splits.log"), "w") as f:
- print(", ".join([str(s) for s in cfg.splits]), file=f)
- print(", ".join([str(sn) for sn in cfg.split_names]), file=f)
- print(", ".join([str(i.stem) for i in cfg.items_list]), file=f)
+ # concatenate audio
+ cfg.concat_file = cfg.output_path.joinpath(
+ f"{cfg.input_path.name}_concatenated.wav"
+ )
- logger.info(
- f"Splits written to file {cfg.concat_file.with_suffix('.splits.log')}"
- )
+ cfg.splits = concat(
+ cfg.items_list,
+ cfg.concat_file,
+ cfg.concat_silence.get("pre", 0),
+ cfg.concat_silence.get("post", 0),
+ cfg.input.get("fs", 48000),
+ preamble=cfg.preamble,
+ pad_noise_preamble=cfg.pad_noise_preamble,
+ )
- else:
- warn(
- "Concatenation specified with a single item this will have no effect. Please use preprocessing if padding is required."
- )
- cfg.splits = []
+ # save item naming for splits naming in the end
+ cfg.split_names = []
+ for name in cfg.items_list:
+ cfg.split_names.append(Path(name).stem.split(".")[0])
+ # set input to the concatenated file we have just written to the output dir
+ cfg.items_list = [cfg.concat_file]
+
+ # write out splits
+ with open(cfg.concat_file.with_suffix(".splits.log"), "w") as f:
+ print(", ".join([str(s) for s in cfg.splits]), file=f)
+ print(", ".join([str(sn) for sn in cfg.split_names]), file=f)
+ print(", ".join([str(i.stem) for i in cfg.items_list]), file=f)
+
+ logger.info(f"Splits written to file {cfg.concat_file.with_suffix('.splits.log')}")
def concat_teardown(cfg: TestConfig, logger: logging.Logger):
- try:
- num_splits = len(cfg.splits)
- except AttributeError:
+ if not cfg.splits:
raise ValueError("Splitting not possible without split marker")
output_format = cfg.postprocessing["fmt"]
- if num_splits <= 1:
- logger.info("No splitting of output file necessary since only one signal used.")
+ out_files = []
+ out_meta = []
- else:
- logger.info(f"Splitting output file in directory {cfg.output_path}")
+ logger.info(f"Splitting output file in directory {cfg.output_path}")
+
+ for odir in cfg.out_dirs:
+ path_input = odir / cfg.items_list[0].name
+ out_paths = split(
+ path_input, odir, cfg.split_names, cfg.splits, preamble=cfg.preamble
+ )
+
+ logger.debug(
+ f"Resulting split files condition {odir.name}: {', '.join([str(op) for op in out_paths])}"
+ )
+ out_files.append(out_paths)
+ # split ISM metadata
+ if output_format.startswith("ISM"):
for odir in cfg.out_dirs:
path_input = odir / cfg.items_list[0].name
- out_paths = split(
- path_input, odir, cfg.split_names, cfg.splits, preamble=cfg.preamble
- )
-
- logger.debug(
- f"Resulting split files condition {odir.name}: {', '.join([str(op) for op in out_paths])}"
+ out_meta_paths = split_meta_in_file(
+ path_input,
+ odir,
+ cfg.split_names,
+ cfg.splits,
+ output_format,
+ preamble=cfg.preamble,
)
-
- # split ISM metadata
- if output_format.startswith("ISM"):
- for odir in cfg.out_dirs:
- path_input = odir / cfg.items_list[0].name
- split_meta_in_file(
- path_input,
- odir,
- cfg.split_names,
- cfg.splits,
- output_format,
- preamble=cfg.preamble,
- )
+ out_meta.append(out_meta_paths)
# remove concatenated file
if cfg.delete_tmp:
cfg.concat_file.unlink(missing_ok=True)
+ return out_files, out_meta
+
def preprocess(cfg, in_meta, logger):
preprocessing = cfg.proc_chains[0]