add SMS-WSJ RETURNN datasets #116

Open
wants to merge 47 commits into base: main
Changes from 41 commits
47 commits
59edbeb
add SMS-WSJ RETURNN datasets
vieting Feb 1, 2023
d95f335
black formatting
vieting Feb 2, 2023
2da7e59
remove hard coded label for padding
vieting Feb 2, 2023
7bfa8d4
add SmsWsjMixtureEarlyBpeDataset
vieting Feb 2, 2023
8c539ea
black formatting
vieting Feb 2, 2023
9ad34aa
fix data_types
vieting Feb 3, 2023
ac0718c
bpe add data types
vieting Feb 6, 2023
5c0ea9e
allow original alignment
vieting Feb 6, 2023
ff2f0ea
black
vieting Feb 6, 2023
44774a8
Update common/datasets/sms_wsj/returnn_datasets.py
vieting Feb 7, 2023
eed5171
Update common/datasets/sms_wsj/returnn_datasets.py
vieting Feb 7, 2023
1db9771
Update common/datasets/sms_wsj/returnn_datasets.py
vieting Feb 7, 2023
10a13cf
Update common/datasets/sms_wsj/returnn_datasets.py
vieting Feb 7, 2023
f7ef64a
Update common/datasets/sms_wsj/returnn_datasets.py
vieting Feb 7, 2023
0f4d363
Update common/datasets/sms_wsj/returnn_datasets.py
vieting Feb 7, 2023
c64c6fc
Update common/datasets/sms_wsj/returnn_datasets.py
vieting Feb 7, 2023
f39ebda
review comments
vieting Feb 7, 2023
8b7ad02
black
vieting Feb 7, 2023
f736a50
_segment_to_rasr str input
vieting Feb 9, 2023
4445b29
sms wsj add init
vieting Feb 9, 2023
19d033e
caching: use local files if available
vieting Feb 9, 2023
ac70e9d
non standard imports inside classes
vieting Feb 9, 2023
30a90f7
black
vieting Feb 9, 2023
dcce15a
update of super() usage
vieting Feb 9, 2023
e80e210
simplify buffer logic
vieting Feb 9, 2023
f0a0fd8
print to log
vieting Feb 9, 2023
f2aac11
avoid name RASR
vieting Feb 9, 2023
d55bd5a
simons review comments
vieting Feb 9, 2023
2f0c484
update caching logic
vieting Feb 10, 2023
4074af4
returnn log
vieting Feb 10, 2023
c188b82
pad start int
vieting Feb 10, 2023
439da8f
log unzipping command
vieting Feb 10, 2023
f58844a
rename rasr variable
vieting Feb 10, 2023
caa18a3
use sequence buffer class
vieting Feb 10, 2023
55ff943
black
vieting Feb 10, 2023
8ecf218
chmod -f
vieting Feb 13, 2023
cab1c3c
chmod force exit 0
vieting Feb 13, 2023
15ebbdc
read data from zip instead of unzipping
vieting Feb 13, 2023
b051045
cleanup
vieting Feb 13, 2023
3823e2b
black
vieting Feb 13, 2023
b8c3811
fix buffer logic
vieting Feb 14, 2023
fca6ea3
explicit num_outputs required
vieting Feb 14, 2023
45fb9a0
prefetch buffer size
vieting Feb 14, 2023
c2cb0a0
fix init seq order
vieting Feb 14, 2023
a7fc519
allow shuffling
vieting Feb 14, 2023
ac50834
try reading from zip 5 times
vieting Feb 15, 2023
48ad8b5
black
vieting Feb 15, 2023