This paper has not yet been published. Once the paper is published, the code will be made publicly available.
This method uses the LRS2, LRS3, and Vox2 datasets to build a multimodal speech separation dataset. The corresponding folders in this GitHub repository contain the files needed to build each dataset, and the repository's code can be used to construct the multimodal datasets.
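As a rough illustration of how a speech separation dataset is typically built from such corpora, the sketch below mixes two single-speaker utterances at a chosen signal-to-noise ratio. This is a minimal, assumed example: the function name, file paths, and SNR value are placeholders and do not come from the paper or repository.

```python
# Minimal sketch of building a 2-speaker mixture for speech separation.
# File paths, function names, and the SNR value are illustrative only.
import numpy as np
import soundfile as sf

def mix_pair(path_a, path_b, snr_db=0.0, out_path="mix.wav"):
    """Mix two single-speaker utterances at a given SNR (in dB)."""
    a, sr_a = sf.read(path_a)
    b, sr_b = sf.read(path_b)
    assert sr_a == sr_b, "both utterances must share a sample rate"

    # Truncate to the shorter utterance so the two sources fully overlap.
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]

    # Scale the second source so the mixture reaches the target SNR
    # relative to the first source.
    power_a = np.mean(a ** 2)
    power_b = np.mean(b ** 2)
    scale = np.sqrt(power_a / (power_b * 10 ** (snr_db / 10) + 1e-8))
    b = b * scale

    mixture = a + b
    # Normalize by the mixture peak to avoid clipping on write.
    peak = np.max(np.abs(mixture)) + 1e-8
    mixture, a, b = mixture / peak, a / peak, b / peak

    sf.write(out_path, mixture, sr_a)
    return mixture, a, b

# Example (hypothetical paths): mix an LRS2 utterance with a Vox2 utterance at 2.5 dB.
# mix_pair("lrs2/audio/spk1.wav", "vox2/audio/spk2.wav", snr_db=2.5)
```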