Dataset
Class representing the structure of a dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Dataset name. | required |
audio_dir | str | Audio directory. | required |
ref_dir | str | Reference directory. | required |
Source code in asrbench\dataset.py
name: str
property
Dataset identifier.
pairs: List[TranscribePair]
property
Dataset data pairs.
check_directories()
Check if the Dataset directories are valid.
from_config(name, config)
classmethod
Create a Dataset from its configuration dictionary in the config file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Dataset identifier. | required |
config | Dict[str, str] | Dictionary containing the dataset configuration. | required |
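As a rough illustration of the classmethod pattern described here, the sketch below builds an object from a name plus a `Dict[str, str]`. It is not the asrbench implementation: `DatasetSketch` is a hypothetical stand-in class, and the config keys `audio_dir` and `reference_dir` are assumptions, since this page does not list the keys the real `from_config` expects.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DatasetSketch:
    """Hypothetical stand-in for Dataset, used only for illustration."""

    name: str
    audio_dir: str
    ref_dir: str

    @classmethod
    def from_config(cls, name: str, config: Dict[str, str]) -> "DatasetSketch":
        # Assumed key names; the real configuration keys may differ.
        required = ("audio_dir", "reference_dir")
        missing = [key for key in required if key not in config]
        if missing:
            raise KeyError(f"Dataset '{name}' is missing config keys: {missing}")
        return cls(
            name=name,
            audio_dir=config["audio_dir"],
            ref_dir=config["reference_dir"],
        )
```

Validating the keys up front means a misconfigured dataset fails with a message that names the dataset, rather than with a bare `KeyError` later on.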
get_audio_files()
Collects all files from the audio directory. Raises an error if the directory is empty.
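The behavior described above can be sketched with `pathlib`. This is a minimal illustration, not the library's code: `list_audio_files` is a hypothetical helper, and both the `ValueError` type and the sorted order are assumptions, since the documentation only states that an error is raised.

```python
from pathlib import Path
from typing import List


def list_audio_files(audio_dir: Path) -> List[Path]:
    """Return every regular file in audio_dir, sorted by path.

    Hypothetical stand-in for Dataset.get_audio_files().
    """
    files = sorted(p for p in audio_dir.iterdir() if p.is_file())
    if not files:
        # Assumed exception type; the real method may raise something else.
        raise ValueError(f"Audio directory is empty: {audio_dir}")
    return files
```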
get_data()
Set up the dataset's TranscribePair objects.
get_ref_by_audio(audio)
Fetches the contents of the reference file that corresponds to the given audio file path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
audio | Path | Path to the audio file. | required |
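One plausible way to map an audio path to its reference is to match on the file stem. The sketch below assumes the reference is a UTF-8 `.txt` file in the reference directory with the same stem as the audio file. That naming convention, the helper name `read_reference_for`, and the explicit `ref_dir` argument are all assumptions made for illustration, not confirmed details of `get_ref_by_audio`.

```python
from pathlib import Path


def read_reference_for(audio: Path, ref_dir: Path) -> str:
    """Read the reference text matching an audio file.

    Hypothetical stand-in for Dataset.get_ref_by_audio(); assumes the
    reference file is named <audio stem>.txt inside ref_dir.
    """
    ref_path = ref_dir / f"{audio.stem}.txt"
    if not ref_path.is_file():
        raise FileNotFoundError(f"No reference file for {audio.name}: {ref_path}")
    return ref_path.read_text(encoding="utf-8")
```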