diff --git a/readme.txt b/readme.txt index 62d21f51bacdf05764e470b763bbd0ae659f0023..129393976b0c84c780f7fb194fa45bdbb3407ebd 100644 --- a/readme.txt +++ b/readme.txt @@ -35,10 +35,11 @@ These files represent the 3GPP EVS Codec Extension for Immersive Voice and Audio Services (IVAS) floating-point C simulation. All code is writtten in ISO/IEC C99. The system is implemented as six separate programs: - IVAS_cod IVAS Encoder - IVAS_dec IVAS Decoder - IVAS_rend IVAS External Renderer - IVAS_cod_fmtsw IVAS Encoder with support for format switching + IVAS_cod IVAS Encoder + IVAS_dec IVAS Decoder + IVAS_rend IVAS External Renderer + ISAR_post_rend ISAR Post Renderer + IVAS_cod_fmtsw IVAS Encoder with support for format switching ambi_converter example program for Ambisonics format conversion For encoding using the coder program, the input is a binary @@ -124,7 +125,7 @@ should have the following structure: . `-- c-code |-- readme.txt - |-- Makefile + |-- Makefile |-- Workspace_msvc |-- apps |-- lib_com @@ -133,8 +134,9 @@ should have the following structure: |-- lib_enc |-- lib_isar |-- lib_lc3plus - |-- lib_rend + |-- lib_rend |-- lib_util + |-- scripts The package includes a Makefile for gcc, which has been verified on 32-bit Linux systems. The code can be compiled by entering the directory @@ -150,7 +152,7 @@ To compile the code, please open "Workspace_msvc\Workspace_msvc.sln" and build "encoder" for the encoder, "decoder" for the decoder, and "renderer" for the renderer executable. The resulting encoder/decoder/renderer/ISAR_post_renderer executables are "IVAS_cod.exe", "IVAS_dec.exe", "IVAS_rend.exe", and -"ISAR_post_rend.exe". All reside in the c-code main directory. In addition, this +"ISAR_post_rend.exe". All reside in the c-code main directory. In addition, this directory will contain a version of the encoder with support for format switching (named "IVAS_cod_fmtsw.exe") and an example program for Ambisonics format conversion (named "ambi_converter.exe"). @@ -294,7 +296,7 @@ Options: EVS RTP Payload Format or rtpdump files containing TS26.253 Annex A IVAS RTP Payload Format. The SDP parameter hf_only is required. Reading RFC4867 AMR/AMR-WB RTP payload format is not supported. --Tracefile TF : VoIP mode: Generate trace file named TF. Requires -no_delay_cmp to +-Tracefile TF : VoIP mode: Generate trace file named TF. Requires -no_delay_cmp to be enabled so that trace contents remain in sync with audio output. -fec_cfg_file : Optimal channel aware configuration computed by the JBM as described in Section 6.3.1 of TS26.448. The output is @@ -306,7 +308,7 @@ Options: Format files, the magic word in the mime file is used to determine which of the two supported formats is in use. default bitstream file format is G.192 --fr L : render frame size in ms L=(5,10,20), default is 20 +-fr L : render frame size in ms L=(5,10,20), default is 20 -hrtf File : HRTF filter File used in BINAURAL rendering -T File : Head rotation specified by external trajectory File -otr tracking_type : Head orientation tracking type: 'none', 'ref', 'avg', 'ref_vec' @@ -322,14 +324,13 @@ Options: left or l or 90->left, right or r or -90->right, center or c or 0->middle -exof File : External orientation trajectory File for simulation of external orientations -dpid ID : Directivity pattern ID(s) (space-separated list of up to 4 numbers can be - specified) for binaural output configuration --aeid ID | File : Acoustic environment ID (number > 0) or - alternatively, it can be a text file where each line contains "ID duration" - for BINAURAL_ROOM_REVERB output configuration. + specified) for binaural output configurations +-aeid ID | File : Acoustic environment ID (number > 0) or a text file where each line + contains "ID duration" for BINAURAL_ROOM_REVERB output configuration -obj_edit File : Object editing instructions file or NULL for built-in example --level level : Complexity level, level = (1, 2, 3), will be defined after characterisation. --om File : Coded metadata File for BINAURAL_SPLIT_PCM OutputConf - Currently, all values default to level 3 (full functionality). +-level level : Complexity level, level = (1, 2, 3), will be defined after characterisation + Currently, all values default to level 3 (full functionality) +-om File : Coded metadata File for BINAURAL_SPLIT_PCM output configuration -q : Quiet mode, limit printouts to terminal, default is deactivated @@ -358,31 +359,30 @@ Options: -render_config File : Binaural renderer configuration parameters in File (only for binaural outputs) -room_size (S|M|L) : Selects default reverb based on a room size (S - small | M - medium | L - large) -non_diegetic_pan P : Panning mono non-diegetic sound to stereo -90<= P <= 90 - left or l or 90->left, right or r or -90->right, center or c or 0 ->middle + left or l or 90->left, right or r or -90->right, center or c or 0 ->middle -exof File : External orientation trajectory File for simulation of external orientations -dpid ID : Directivity pattern ID(s) (space-separated list of up to 4 numbers can be - specified) for binaural outputs --aeid ID | File : Acoustic environment ID (number > 0) - alternatively, it can be a text file where each line contains "ID duration" for BINAURAL_ROOM_REVERB output. + specified) for binaural output configurations +-aeid ID | File : Acoustic environment ID (number > 0) or a text file where each line + contains "ID duration" for BINAURAL_ROOM_REVERB output configuration -lp Position : Output LFE position. Comma-delimited triplet of [gain, azimuth, elevation] where gain is linear - (like --gain, -g) and azimuth, elevation are in degrees. - If specified, overrides the default behavior which attempts to map input to output LFE channel(s) + (like --gain, -g) and azimuth, elevation are in degrees + If specified, overrides the default behavior which attempts to map input to output LFE channel(s) -lm File : LFE panning matrix File (CSV table) containing a matrix of dimensions [ num_input_lfe x num_output_channels ] with elements specifying linear routing gain (like --gain, -g). - If specified, overrides the output LFE position option and the default behavior which attempts to map input to output LFE channel(s) + If specified, overrides the output LFE position option and the default behavior which attempts to map input to output LFE channel(s) -no_delay_cmp : Turn off delay compensation -g : Input gain (linear, not in dB) to be applied to input audio file -l : List supported audio formats -smd : Metadata Synchronization Delay in ms, Default is 0. Quantized by 5ms subframes. --om File : Coded metadata File (only for BINAURAL_SPLIT_PCM output) --prbfi File : BFI File (only for BINAURAL_SPLIT_PCM output) --level level : Complexity level, level = (1, 2, 3), will be defined after characterisation. +-om File : Coded metadata File for BINAURAL_SPLIT_PCM output configuration +-level level : Complexity level, level = (1, 2, 3), will be defined after characterisation Currently, all values default to level 3 (full functionality). -q : Quiet mode, limit printouts to terminal, default is deactivated -The usage of the "ISAR_post_rend" program as follows: ------------------------------------------------------ +The usage of the "ISAR_post_rend" program is as follows: +-------------------------------------------------------- Usage: ISAR_post_rend [options] @@ -396,6 +396,34 @@ Options: -prbfi File : BFI File +The usage of the "ambi_converter" program is as follows: +-------------------------------------------------------- + +Usage: ambi_converter input_file output_file input_convention output_convention + +input_convention and output convention must be an integer number in [0,5] +the following conventions are supported: +0 : ACN-SN3D +1 : ACN-N3D +2 : FuMa-MaxN +3 : FuMa-FuMa +4 : SID-SN3D +5 : SID-N3D + +Either the input or the output convention must always be ACN-SN3D. + + +The usage of the "IVAS_cod_fmtsw" program is as follows: +-------------------------------------------------------- + +Usage: IVAS_cod_fmtsw format_switching_file + +Mandatory parameters: +--------------------- +format_switching_file: Text file containing a valid encoder command line in each line + + + MULTICHANNEL LOUDSPEAKER INPUT / OUTPUT CONFIGURATIONS ====================================================== The loudspeaker positions for each MC layouts are assumed to have the following azimuth and elevation @@ -423,31 +451,6 @@ omitted, the LFE input is downmixed to all channels with a factor of 1/N. Positi the LFE channel. Maximum number of supported loudskpeakers N is 16. An example custom loudspeaker layout file is available: ls_setup_16ch_8+4+4.txt -The usage of the "ambi_converter" program as follows: ------------------------------------------------------ - -Usage: ambi_converter input_file output_file input_convention output_convention - -input_convention and output convention must be an integer number in [0,5] -the following conventions are supported: -0 : ACN-SN3D -1 : ACN-N3D -2 : FuMa-MaxN -3 : FuMa-FuMa -4 : SID-SN3D -5 : SID-N3D - -Either the input or the output convention must always be ACN-SN3D. - -The usage of the "IVAS_cod_fmtsw" program is as follows: --------------------------------------------------------- - -Usage: IVAS_cod_fmtsw format_switching_file - -Mandatory parameters: ---------------------- -format_switching_file: Text file containing a valid encoder command line in each line - RUNNING THE SELF TEST ===================== @@ -691,17 +694,17 @@ The parameters for the object editing in decoder for the supported formats can b parameter file. Each row of the file corresponds to one 20 ms IVAS frame. The row contains one or more of the following parameters separated by a comma: -bg_gain= linear gain to be applied on the SBA/MASA component in OSBA/OMASA, no effect for ISM -obj__gain= linear gain to be applied on object , 0-based indexing -obj__relgain=0|1 if 1, obj__gain is interpreted as a relative modification. default is absolute modification -obj__azi= azimuth angle in degrees to be applied on object , 0-based indexing -obj__relazi=0|1 if 1, obj__azi is interpreted as a relative modification. default is absolute modification -obj__ele= elevation angle in degrees to be applied on object , 0-based indexing -obj__relele=0|1 if 1, obj__ele is interpreted as a relative modification. default is absolute modification +bg_gain= linear gain to be applied on the SBA/MASA component in OSBA/OMASA, no effect for ISM +obj__gain= linear gain to be applied on object , 0-based indexing +obj__relgain=0|1 if 1, obj__gain is interpreted as a relative modification. default is absolute modification +obj__azi= azimuth angle in degrees to be applied on object , 0-based indexing +obj__relazi=0|1 if 1, obj__azi is interpreted as a relative modification. default is absolute modification +obj__ele= elevation angle in degrees to be applied on object , 0-based indexing +obj__relele=0|1 if 1, obj__ele is interpreted as a relative modification. default is absolute modification obj__radius= linear radius to be applied on object , 0-based indexing obj__relradius=0|1 if 1, obj__radius is interpreted as a relative modification. default is absolute modification -obj__yaw= yaw angle in degrees to be applied on object , 0-based indexing -obj__relyaw=0|1 if 1, obj__yaw is interpreted as a relative modification. default is absolute modification +obj__yaw= yaw angle in degrees to be applied on object , 0-based indexing +obj__relyaw=0|1 if 1, obj__yaw is interpreted as a relative modification. default is absolute modification obj__pitch= pitch angle in degrees to be applied on object , 0-based indexing obj__relpitch=0|1 if 1, obj__pitch is interpreted as a relative modification. default is absolute modification @@ -720,4 +723,3 @@ typedef struct { u_int32 length; /* size of the RTP packet in bytes */ (u_int8 * length) RTP_packet; /* RTP packet (sized length * byte) */ } RTP_streaming_packet; -