[!158][RELEASE] Automatic Subtitling with SBAAM (ACL2024)
# Which work do we release?

"SBAAM! Eliminating Transcript Dependency in Automatic Subtitling"

# What changes does this release refer to?

ec9480d9f5de12269f420848c9c55f820089da4b d0d8ac1ee13c2bd12ab9a483fc2aa6b0653651f5 e2d7504f8d3245532c7e781f8c7b3cb93709d8de cb56d5a6af98913e501a5eb54159c57179433960 fc1f065bfce66922815c840de213d01978917543 16fb354c87ee5397c26b97fb54547c3d463a9dff dee3b0f125d155fc28574ba235c88bc2367e6e76 fea2e98b7f033380f19c718de250332c7f0bf322 fa0e7a3a750db9096bdbcefae1591a2d890dd771
mgaido91 committed May 28, 2024
1 parent ec8ce81 commit 80fe2ca
Showing 3 changed files with 329 additions and 8 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -5,6 +5,7 @@ Dedicated README for each work can be found in the `fbk_works` directory.

### 2024

+- [[ACL 2024] **SBAAM! Eliminating Transcript Dependency in Automatic Subtitling**](fbk_works/SBAAM.md)
- [[ACL 2024] **When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP**](fbk_works/BUGFREE_CONFORMER.md)
- [[LREC-COLING 2024] **How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena**](fbk_works/HYENA_COLING2024.md)

@@ -84,10 +84,10 @@ def aligns(self, boundaries_indexes):
         raise NotImplementedError("Subclasses of AttentionMatrixProcessor should implement aligns")


-class CustomAttentionAligner(AttentionMatrixProcessor):
+class SBAAMNoForceEndAttentionAligner(AttentionMatrixProcessor):
     """
-    Custom method specifically designed by FBK to determine block boundaries,
-    trying to maximize the value of the attention area of corresponding text and audio.
+    Determines subtitling block boundaries, trying to maximize the value of the attention area
+    of corresponding text and audio.
     """
     def normalize(self):
         self.std_normalize()
@@ -131,10 +131,13 @@ def aligns(self, boundaries_indexes):
         return splitting_time_idxs


-class CustomForcedEndAttentionAligner(CustomAttentionAligner):
+class SBAAMAttentionAligner(SBAAMNoForceEndAttentionAligner):
     """
-    The current method does not properly estimate the end time of the last eob.
+    SBAAMNoForceEnd does not properly estimate the end time of the last eob.
     As a workaround, this forces the last eob to terminate at the end of the audio.
+    This is the method used and described in
+    `"SBAAM! Eliminating Transcript Dependency in Automatic Subtitling" <>`_.
     """
     def aligns(self, boundaries_indexes):
         splitting_time_idxs = super().aligns(boundaries_indexes)
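
The docstring describes the forced-end workaround only in words, and the rest of `aligns` is collapsed in this view. The idea might be sketched as follows. This is not the repository's code: the helper name `force_last_boundary_to_end` and the parameter `num_frames` are hypothetical, and the sketch assumes that boundaries are frame indices in increasing order and that the last valid frame index is `num_frames - 1`.

```python
from typing import List


def force_last_boundary_to_end(splitting_time_idxs: List[int], num_frames: int) -> List[int]:
    """Return a copy of the boundaries with the last one moved to the final audio frame.

    Hypothetical helper: assumes frame indices run from 0 to num_frames - 1.
    """
    if not splitting_time_idxs:
        # Nothing to adjust when no boundary was predicted.
        return []
    forced = list(splitting_time_idxs)  # copy, so the caller's list is left untouched
    forced[-1] = num_frames - 1         # last end-of-block now coincides with the end of the audio
    return forced


boundaries = force_last_boundary_to_end([3, 7, 9], num_frames=12)
```

Only the last boundary changes; earlier boundaries keep whatever the parent aligner estimated.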
@@ -243,8 +246,8 @@ def aligns(self, boundaries_indexes):

 class AttentionAlignerArgparse(argparse.Action):
     AVAILABLE_ALIGNERS = {
-        "custom": CustomAttentionAligner,
-        "custom-forceend": CustomForcedEndAttentionAligner,
+        "sbaam-noforce": SBAAMNoForceEndAttentionAligner,
+        "sbaam": SBAAMAttentionAligner,
         "dtw-medianf": DTWMedianFilterAttentionAligner,
     }

@@ -356,7 +359,7 @@ def main(args):
     parser.add_argument('--alignment-operator',
                         action=AttentionAlignerArgparse,
                         choices=AttentionAlignerArgparse.AVAILABLE_ALIGNERS.keys(),
-                        default=AttentionAlignerArgparse.AVAILABLE_ALIGNERS['custom-forceend'],
+                        default=AttentionAlignerArgparse.AVAILABLE_ALIGNERS['sbaam'],
                         help="method to use to perform alignments")
     parser.add_argument('--remove-last-frame', action='store_true', default=False,
                         help="if set, last token is removed before computing alignments")
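
The `--alignment-operator` option takes its choices from the dictionary keys but its default from a dictionary value, so the parsed attribute is expected to be a class rather than a string. The body of `AttentionAlignerArgparse` is not shown in this diff, so the following is only a sketch of how such an action could work, with hypothetical stand-in names (`AlignerChoice`, `FirstAligner`, `SecondAligner`, `build_parser`):

```python
import argparse


class FirstAligner:
    """Hypothetical stand-in for one aligner class."""


class SecondAligner:
    """Hypothetical stand-in for another aligner class."""


class AlignerChoice(argparse.Action):
    # Maps the command-line name to the class that implements it.
    AVAILABLE = {"first": FirstAligner, "second": SecondAligner}

    def __call__(self, parser, namespace, values, option_string=None):
        # argparse has already validated `values` against `choices`,
        # so the lookup cannot fail here.
        setattr(namespace, self.dest, self.AVAILABLE[values])


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--alignment-operator",
        action=AlignerChoice,
        choices=AlignerChoice.AVAILABLE.keys(),
        # The default is already a class: argparse does not pass defaults
        # through the action, so it must be given in its final form.
        default=AlignerChoice.AVAILABLE["second"],
        help="method to use to perform alignments")
    return parser


args = build_parser().parse_args(["--alignment-operator", "first"])
```

With this pattern, renaming a key (as this commit does for `custom` and `custom-forceend`) changes the accepted command-line values, so any script that passes the old names would need updating.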