The Speech-MASSIVE dataset is released under the CC-BY-NC-SA-4.0 license.
Speech-MASSIVE covers 12 languages from different families (Arabic, German, Spanish, French, Hungarian, Korean, Dutch, Polish, European Portuguese, Russian, Turkish, and Vietnamese) and inherits from MASSIVE the annotations for the intent prediction and slot-filling tasks. The dataset was collected through Prolific, where crowdsourced workers were hired to record and validate the speech.