Commit 63fc5ec9 authored by Anika Treffehn

Merge branch 'loudness_adjustment_after_splitting' into 'main'

Loudness adjustment after splitting and bitstream processing

See merge request !7
parents e007ca5c aed8c36e
+34 −22
@@ -222,16 +222,28 @@ input:
<summary>Click to expand</summary>

```yaml
### Bistream processing (transport simulation) done after encoding and before decoding
### Bitstream processing (transport simulation) done after encoding and before decoding
### e.g. frame error insertion or transport simulation for JBM testing
### can be given globally or in individual conditions of type ivas or evs
# tx:
    ### REQUIRED: Path to network simulation binary
    # bs_proc_bin: ".../ivas_python_testscripts/networkSimulator_g192.exe"
    ### Path to error pattern (mandatory if no information for generating the error pattern is given)
    ### REQUIRED: Type of bitstream processing; possible types: "JBM" or "FER"
    #type: "JBM"
    
    ### JBM
    ### REQUIRED: either error_pattern or error_profile
    ### delay error profile file
    # error_pattern: ".../dly_error_profile.dat"
    ### options for the binary, possible placeholders are {error_pattern} for the error pattern,
    ### {bitstream} for the bitstream to process and {bitstream_processed} for the processed bitstream
    # bs_proc_opts: [ "{error_pattern}",  "{bitstream}",  "{processed_bitstream}",  "{processed_bitstream}_tracefile_sim", "2", "0" ]
    ### Index of one of the existing delay error profile files to use (1-11)
    # error_profile: 5
    ## nFramesPerPacket parameter for the network simulator (optional); default = 1
    # n_frames_per_packet: 2
    
    ### FER
    ### REQUIRED: either error_pattern or error_rate
    ### Frame error pattern file
    # error_pattern: "path/pattern.192"
    ### Error rate in percent
    # error_rate: 5
```
</details>
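
The REQUIRED notes in the `tx` comments above can be read as a small set of validation rules. The following is a minimal sketch of such a check, not the project's actual implementation: the function name `validate_tx` is hypothetical, and it assumes that "either ... or" means at least one of the two keys must be present.

```python
def validate_tx(tx: dict) -> None:
    """Check a `tx` mapping against the rules stated in the config comments.

    Hypothetical helper for illustration; raises ValueError on a violation.
    """
    tx_type = tx.get("type")

    if tx_type == "JBM":
        # JBM needs a delay error profile file or the index of an existing one
        if "error_pattern" not in tx and "error_profile" not in tx:
            raise ValueError("JBM requires either error_pattern or error_profile")
        profile = tx.get("error_profile")
        if profile is not None and not 1 <= profile <= 11:
            raise ValueError("error_profile must be in the range 1-11")
    elif tx_type == "FER":
        # FER needs a frame error pattern file or an error rate in percent
        if "error_pattern" not in tx and "error_rate" not in tx:
            raise ValueError("FER requires either error_pattern or error_rate")
    else:
        raise ValueError('type is required and must be "JBM" or "FER"')


# Example: a JBM configuration that selects delay error profile 5
validate_tx({"type": "JBM", "error_profile": 5, "n_frames_per_packet": 2})
```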

@@ -420,7 +432,9 @@ No required arguments but the `type` key.
#### EVS
For EVS a list of at least one bitrate has to be specified with the key `bitrates`. The entries in this list can also be lists containing the bitrates used for the processing of the individual channels.
This configuration has to match the channel configuration. If the provided list is shorter, the last value will be repeated.
For the encoding stage `cod` and the decoding stage `dec`, the path to the IVAS_cod and IVAS_dec binaries can be specified under the key `bin`. Additionally some resampling can be applied by using the key `fs` followed by the desired sampling rate.
For the encoding stage `cod` and the decoding stage `dec`, the path to the IVAS_cod and IVAS_dec binaries can be specified under the key `bin`.
Additionally some resampling can be applied by using the key `fs` followed by the desired sampling rate.
The general bitstream processing configuration can be overridden locally for each EVS and IVAS condition with the key `tx`.
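
The rule that a too-short bitrate list is padded with its last value can be illustrated as follows. This is only a sketch under stated assumptions: `expand_bitrates` is a hypothetical name, and rejecting a list that is longer than the channel count is an assumption, since the text above only says that the configuration has to match the channel configuration.

```python
from typing import List, Union


def expand_bitrates(entry: Union[int, List[int]], n_channels: int) -> List[int]:
    """Return one bitrate per channel, repeating the last value if needed.

    Hypothetical helper for illustration only.
    """
    rates = [entry] if isinstance(entry, int) else list(entry)
    if not rates:
        raise ValueError("at least one bitrate is required")
    if len(rates) > n_channels:
        # Assumption: more bitrates than channels is treated as an error
        raise ValueError("more bitrates than channels")
    # Pad with the last value until every channel has a bitrate
    return rates + [rates[-1]] * (n_channels - len(rates))


print(expand_bitrates([13200, 9600], 4))  # [13200, 9600, 9600, 9600]
print(expand_bitrates(24400, 2))          # [24400, 24400]
```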
#### IVAS
The configuration of the IVAS condition is similar to the EVS condition. However, only one bitrate for all channels (and metadata) can be specified.
In addition to that, the encoder and decoder take some additional arguments defined by the key `opts`.
@@ -458,24 +472,22 @@ The processing chain is as follows:
---
## ITU Tools

This module uses the ITU audio processing tools. These tools can be found here: https://github.com/openitu/STL (except for the filter binary which is deprecated). <br />
The filter binary with all necessary filter types can be found here: https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip. <br />
It also makes use of the MASA tools provided by Nokia. These can be found here: https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip.

The following binaries/executables are needed for the different processing steps:

| processing step          | ITU binary      |
|--------------------------|-----------------|
| LP filtering             |    filter       |
| HP filtering             |    filter       |
| Resampling               |    filter       |
| Loudness adjustment      |    bs1770demo   |
| MNRU                     |    p50fbmnru    |
| ESDRU                    |    esdru        |
| MASA rendering           |    masaRenderer |
| Processing step                 | Executable            | Where to find                                                                                               |
|---------------------------------|-----------------------|-------------------------------------------------------------------------------------------------------------|
| Loudness adjustment             | bs1770demo            | https://github.com/openitu/STL                                                                              |
| MNRU                            | p50fbmnru             | https://github.com/openitu/STL                                                                              |
| ESDRU                           | esdru                 | https://github.com/openitu/STL                                                                              |
| Frame error pattern application | eid-xor               | https://github.com/openitu/STL                                                                              |
| Error pattern generation        | gen-patt              | https://www.itu.int/rec/T-REC-G.191-201003-S/en (Note: Version in https://github.com/openitu/STL is buggy!) |
| Filtering, Resampling           | filter                | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip                                       |
| Random offset/seed generation   | random                | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip                                       |
| JBM network simulator           | networkSimulator_g192 | https://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_76/docs/S4-131277.zip                                       |
| MASA rendering                  | masaRenderer          | https://www.3gpp.org/ftp/TSG_SA/WG4_CODEC/TSGS4_122_Athens/Docs/S4-230221.zip                               |

The necessary binaries have to be placed in the [ivas_processing_scripts/bin](./ivas_processing_scripts/bin) folder.
For the ITU tools it is sufficient to copy the binaries while it is necessary to add some additional files for the MASA renderer.
For most of the tools it is sufficient to copy the binaries while it is necessary to add some additional files for the MASA renderer.

---

+24 −8
@@ -14,6 +14,10 @@
### Deletion of temporary directories containing 
### intermediate processing files, bitstreams etc.; default = false
# delete_tmp: true
### Master seed for random processes like bitstream error pattern generation; default = 0
# master_seed: 5
### Additional seed to specify number of preruns; default = 0
# prerun_seed: 2

### Any relative paths will be interpreted relative to the working directory the script is called from!
### Usage of absolute paths is recommended.
@@ -92,16 +96,28 @@ input:
#################################################
### Bitstream processing
#################################################
### Bistream processing (transport simulation) done after encoding and before decoding
### Bitstream processing (transport simulation) done after encoding and before decoding
### e.g. frame error insertion or transport simulation for JBM testing
### can be given globally here or in individual conditions of type ivas or evs
# tx:
    ### REQUIRED: Path to network simulation binary
    # bs_proc_bin: ".../ivas_python_testscripts/networkSimulator_g192.exe"
    ### Path to error pattern (mandatory if no information for generating the error pattern is given)
    ### REQUIRED: Type of bitstream processing; possible types: "JBM" or "FER"
    #type: "JBM"
    
    ### JBM
    ### REQUIRED: either error_pattern or error_profile
    ### delay error profile file
    # error_pattern: ".../dly_error_profile.dat"
    ### options for the binary, possible placeholders are {error_pattern} for the error pattern,
    ### {bitstream} for the bitstream to process and {bitstream_processed} for the processed bitstream
    # bs_proc_opts: [ "{error_pattern}",  "{bitstream}",  "{processed_bitstream}",  "{processed_bitstream}_tracefile_sim", "2", "0" ]
    ### Index of one of the existing delay error profile files to use (1-11)
    # error_profile: 5
    ## nFramesPerPacket parameter for the network simulator; default = 1
    # n_frames_per_packet: 2
    
    ### FER
    ### REQUIRED: either error_pattern or error_rate
    ### Frame error pattern file
    # error_pattern: "path/pattern.192"
    ### Error rate in percent
    # error_rate: 5
    
################################################
### Configuration for conditions under test
+17 −1
@@ -36,6 +36,7 @@ from itertools import repeat
import yaml

from ivas_processing_scripts.audiotools.metadata import check_ISM_metadata
from ivas_processing_scripts.audiotools.wrappers.bs1770 import scale_files
from ivas_processing_scripts.constants import (
    LOGGER_DATEFMT,
    LOGGER_FORMAT,
@@ -129,6 +130,11 @@ def main(args):

            logger.info(f"  Generating condition: {condition['name']}")

            # # TODO: what happens when no concatenation or only one file for concatenation?
            # if condition["processes"][0].name == "ivas":  # TODO: check if 0 index sufficient
            #     a = {"number_frames": cfg.num_frames, "number_frames_preamble": cfg.num_frames_preamble}
            #     condition["processes"][0].tx.update(a)

            apply_func_parallel(
                process_item,
                zip(
@@ -145,7 +151,17 @@ def main(args):

        if cfg.concatenate_input:
            # write out the splits, optionally remove file
            concat_teardown(cfg, logger)
            out_paths_splits, out_meta_splits = concat_teardown(cfg, logger)
            # scale individual files
            if cfg.postprocessing.get("loudness", False):
                # TODO: take care of samplingrate
                scale_files(
                    out_paths_splits,
                    cfg.postprocessing["fmt"],
                    cfg.postprocessing["loudness"],
                    cfg.postprocessing.get("fs", None),
                    out_meta_splits,
                )

    # copy configuration to output directory
    with open(cfg.output_path.joinpath(f"{cfg.name}.yml"), "w") as f:
+1 −0
@@ -214,6 +214,7 @@ class MetadataAssistedSpatialAudio(Audio):
            raise ValueError(
                f"Unsupported metadata assisted spatial audio format {name}"
            )
        self.metadata_files = []

    @classmethod
    def _from_file(
+13 −2
@@ -31,6 +31,7 @@
#

import logging
import warnings
from typing import Iterator, Optional, Tuple, Union

import numpy as np
@@ -42,7 +43,9 @@ logger = logging.getLogger("__main__")
logger.setLevel(logging.DEBUG)


# Functions used in this module
"""Functions used in this module"""


def trim(
    x: np.ndarray,
    fs: Optional[int] = 48000,
@@ -266,6 +269,7 @@ def limiter(
    release_heuristics_mem = 0.0
    gain = 1.0
    strong_saturation_cnt = 0
    limited = False

    if x.ndim == 1:
        n_samples_x = x.shape
@@ -324,16 +328,21 @@ def limiter(
            fr_gain = np.tile(gain * fac + frame_gain * (1.0 - fac), (n_chan_x, 1)).T
            fr_sig *= fr_gain
            gain = fr_gain[-1, 0]
            limited = True
        else:
            gain = 1.0

        release_heuristics_mem = release_heuristic
        # hard limiting for everything that still sticks out
        if (fr_sig > 32767).any() or (fr_sig < -32768).any():
            limited = True
        idx_max = np.where(fr_sig > 32767)
        fr_sig[idx_max] = 32767
        idx_min = np.where(fr_sig < -32768)
        fr_sig[idx_min] = -32768

    if limited:
        warnings.warn("Limiting had to be applied")
    return x
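
The hard-limiting step and the new warning in this hunk can be reproduced in isolation. The sketch below covers only the final clipping to the 16-bit range, not the frame-wise gain smoothing of `limiter`. The function name `hard_limit` is hypothetical, and unlike the code in the diff it returns a clipped copy instead of modifying the signal in place.

```python
import warnings

import numpy as np


def hard_limit(x: np.ndarray) -> np.ndarray:
    """Clip samples to the int16 range and warn if anything was clipped.

    Hypothetical stand-alone helper; it mirrors only the hard-limiting step.
    """
    # Detect whether any sample lies outside [-32768, 32767]
    limited = bool((x > 32767).any() or (x < -32768).any())
    y = np.clip(x, -32768, 32767)
    if limited:
        warnings.warn("Limiting had to be applied")
    return y


# Two samples are out of range, so they are clipped and one warning is issued
print(hard_limit(np.array([40000.0, -40000.0, 100.0])))
```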


@@ -405,7 +414,9 @@ def framewise_io(
    )


# Deprecated functions (partly replaced by ITU binaries)
"""Deprecated functions (partly replaced by ITU binaries)"""


def resample(
    x: np.ndarray,
    in_freq: int,