Commit 6281ceb1 authored by Jan Kiene

Merge branch 'kiene/readme-adjustment-for-hash-generation' into 'main'

adjust hash generation section in readme

See merge request !243
parents be6f041c ab99ea39
+7 −2
@@ -118,9 +118,14 @@ After the processing is finished, the outputs will be present in the respective

  - These scripts collect items from each experiment's `proc_output*` folder(s) and put the files needed for the listening test into a `proc_final` folder. This folder needs to be uploaded for the dry run and the final delivery of the listening items to the labs.

### Hash generation
### Hash generation and checking for duplicates

The hashes for the `proc_final` can be generated using the [get_md5.py](other/get_md5.py) script:
The hashes for the `proc_final` folder can be generated using the [get_md5.py](other/get_md5.py) script.
The script also checks for identical hashes and thereby identifies duplicates among the output files, which it reports in a printout.
When generating hashes, check whether duplicates are reported and, if so, which files are identical. Note that duplicates between the actual test and the preliminaries/training are acceptable.
If three or more items are identical, or if two items are identical within the test or within the preliminaries, the input files should be checked for duplicates.
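
To illustrate the idea of finding duplicates through identical hashes, here is a minimal sketch. It is not the code of `get_md5.py`; the function names `md5_of_file` and `find_duplicates` are hypothetical, and the sketch assumes that every file below the given folder should be hashed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Map each MD5 hash shared by two or more files under root to those files."""
    by_hash = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[md5_of_file(path)].append(path)
    # Keep only hashes that occur more than once, i.e. the duplicate groups.
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}


if __name__ == "__main__":
    # "proc_final" is the folder name used in the text; adjust the path as needed.
    for md5, paths in find_duplicates(Path("proc_final")).items():
        print(f"{md5}: {len(paths)} identical files")
        for p in paths:
            print(f"  {p}")
```

Each reported group can then be inspected by hand: a group of three or more files, or a pair that lies entirely within the test or entirely within the preliminaries, is the case in which the input files should be checked.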

Script usage:

```shell
> python other/get_md5.py --help