Ygr avsr #25

yanghaha0908 · 2024-01-15T04:36:09Z

merge AVSR

ddlBoJack · 2024-01-15T05:09:33Z

src/llama_recipes/datasets/avsr_dataset.py

+            "visual": inputBatch[2],  #torch.Size([4, 146, 1, 112, 112])
+            "vis_len": inputBatch[3],  #torch.Size([4])
+
+            "targetoutBatch": targetoutBatch,  #torch.Size([4, 50])


ddlBoJack · 2024-01-15T05:09:43Z

src/llama_recipes/models/slam_model.py

+        visual = kwargs.get("visual", None)  #torch.Size([2, 151, 1, 112, 112])
+        vis_len = kwargs.get("vis_len", None)  #tensor([ 77, 151], device='cuda:0', dtype=torch.int32)
+        maskw2v = kwargs.get("maskw2v", None)  #True
+        targetoutBatch =  kwargs.get("targetoutBatch", None)  #torch.Size([2, 29])


ddlBoJack · 2024-01-15T05:11:29Z

src/llama_recipes/models/slam_model.py

@@ -184,13 +187,25 @@ def forward(self,
        audio_mel_post_mask = kwargs.get("audio_mel_post_mask", None) # 2x downsample for whisper
        audio_mask = kwargs.get("audio_mask", None)


This name should be changed to audio_mel_mask.

ddlBoJack · 2024-01-15T05:11:41Z

src/llama_recipes/models/slam_model.py

@@ -184,13 +187,25 @@ def forward(self,
        audio_mel_post_mask = kwargs.get("audio_mel_post_mask", None) # 2x downsample for whisper
        audio_mask = kwargs.get("audio_mask", None)

+        audio = kwargs.get("audio", None)  #torch.Size([2, 96480])
+        audiomask = kwargs.get("audiomask", None)  #delete #torch.Size([2, 96480])


This name should be changed to audio_mask.
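
A rename like this can break callers that still pass the old key. One possible way to handle the transition is sketched below. It is an illustration only, not code from this PR: the helper get_renamed and the stripped-down forward are hypothetical, and only the key names "audio", "audiomask" and "audio_mask" come from the diff.

```python
def get_renamed(kwargs, new_key, old_key, default=None):
    """Return kwargs[new_key], falling back to the legacy old_key.

    Hypothetical helper for illustration; not part of the PR.
    """
    if new_key in kwargs:
        return kwargs[new_key]
    return kwargs.get(old_key, default)


def forward(**kwargs):
    # Hypothetical stand-in for the model's forward(); only the kwargs
    # lookup is shown, the rest of the model is omitted.
    audio = kwargs.get("audio", None)
    # Prefer the new name "audio_mask", accept the old "audiomask" for now.
    audio_mask = get_renamed(kwargs, "audio_mask", "audiomask")
    return audio, audio_mask
```

Once every caller passes audio_mask, the fallback could be dropped and the lookup reduced to a plain kwargs.get("audio_mask", None).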

ddlBoJack · 2024-01-15T05:17:42Z

src/llama_recipes/datasets/avsr_dataset.py

+            "vis_len": inputBatch[3],  #torch.Size([4])
+
+            "targetoutBatch": targetoutBatch,  #torch.Size([4, 50])
+            "targetLenBatch": targetLenBatch.long(), #torch.Size([4])
            'maskw2v': True,


maskw2v should always be False.
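
A minimal sketch of what the requested change might look like in the batch dictionary. The function build_batch and its argument names are hypothetical, and tensor conversions such as .long() are left out so the snippet runs without torch. Only the dictionary keys are taken from the diff above.

```python
def build_batch(input_batch, target_out_batch, target_len_batch):
    # Hypothetical tail of a collator; the keys mirror the diff above.
    return {
        "visual": input_batch[2],
        "vis_len": input_batch[3],
        "targetoutBatch": target_out_batch,
        "targetLenBatch": target_len_batch,
        "maskw2v": False,  # per the review comment: always False
    }
```

On the model side, reading the flag with kwargs.get("maskw2v", False) would keep the same behaviour even if a batch omits the key. That default is an assumption, since the diff shows a default of None.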

yanghaha0908 added 5 commits January 15, 2024 11:24

save 1 version

113d702

save 1 version

d2ea3d9

debug

307d22e

merge avsr

79abbb8

merge avsr

7c8525c

ddlBoJack reviewed Jan 15, 2024

ddlBoJack approved these changes Jan 15, 2024

ddlBoJack merged commit e0fbb3d into main Jan 15, 2024
1 of 2 checks passed
