Hey!
I have been trying to optimize the inference pipeline using batch inference, but everything in the pipeline is written to process one image at a time. I managed to create batches of data, but the subsequent step of pasting the upscaled face back onto the original image takes an unusually long time.
My video is 15 seconds long and has 433 frames in total. I am running on Colab with a T4 GPU; the current inference time is 4 min 37 sec.
I am using the ONNX version of the model.
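For context, the ort_session used in the snippets below is an ONNX Runtime inference session, created roughly along these lines (the model path is a placeholder, and the provider order is what I would expect on a T4):

import onnxruntime as ort

# Placeholder model path; prefer the CUDA provider on a T4, fall back to CPU.
ort_session = ort.InferenceSession(
    'codeformer.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])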
# -------------------- start to processing ---------------------
# No need to optimize this loop, done in 37 secs on T4.
# Convert the input_img_list to a tensor for batching.
def process_batch_for_inference(cropped_images_list):
    """Function to handle batch inference."""
    # print("Processing the cropped faces into batches for smooth inference!")
    print(f"Number of images being processed: {len(cropped_images_list)}")
    cropped_face_t = img2tensor(cropped_images_list, bgr2rgb=True, float32=True)
    cropped_face_t = torch.stack(cropped_face_t).to(device)
    normalize(cropped_face_t, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5), inplace=True)
    print(f"Shape of cropped faces tensor: {cropped_face_t.shape}")

    try:
        forward_time = time.time()
        with torch.no_grad():
            ort_inputs = {ort_session.get_inputs()[0].name: cropped_face_t.cpu().numpy()}
            # print(f"shape of the inputs: {ort_inputs.shape}")
            ort_outs = ort_session.run(None, ort_inputs)
            output = torch.from_numpy(ort_outs[0])
            print(f"shape of the output: {output.shape}")
            restored_face = tensor2img(output, rgb2bgr=True, min_max=(-1, 1))
        assert type(restored_face) == list, f"Output should be a list, got {type(restored_face)}"
        assert all(x.ndim == 3 for x in restored_face), "Image should be 3-dimensional"
        print(f'Inference time: {time.time() - forward_time:.2f}s')
        del output
        torch.cuda.empty_cache()
    except Exception as error:
        print(f'Failed inference for CodeFormer: {error}')
        traceback.print_exc()
        restored_face = tensor2img(cropped_face_t, rgb2bgr=True, min_max=(-1, 1))

    # Rebinding the loop variable would be a no-op on the list; rebuild it instead.
    restored_face = [face.astype('uint8') for face in restored_face]
    for i in range(len(cropped_images_list)):
        face_helper.add_restored_face(restored_face[i], cropped_images_list[i])
This is my forward pass function.
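One thing worth noting about this function: ort_session.run takes NumPy arrays, so every batch makes a GPU -> CPU -> GPU round trip. If that copy turns out to matter, ONNX Runtime's I/O binding can keep the batch on the GPU. A minimal sketch, assuming the model has a single input and output and that the batch tensor is contiguous:

import numpy as np

def run_onnx_on_gpu(session, batch_t):
    # batch_t: contiguous float32 CUDA tensor, e.g. shape (N, 3, 512, 512)
    binding = session.io_binding()
    binding.bind_input(
        name=session.get_inputs()[0].name,
        device_type='cuda', device_id=0,
        element_type=np.float32,
        shape=tuple(batch_t.shape),
        buffer_ptr=batch_t.data_ptr())
    # Let ONNX Runtime allocate the output buffer on the GPU.
    binding.bind_output(session.get_outputs()[0].name, device_type='cuda')
    session.run_with_iobinding(binding)
    return binding.copy_outputs_to_cpu()[0]  # host NumPy array

Since the forward pass is not my main bottleneck, this is more of a nice-to-have than the fix I'm after.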
# Process each image in the current batch
imgs = []
all_cropped_images_list = []
affine_matrices = []
for i, img_path in enumerate(input_img_list):
    # clean all the intermediate results to process the next image
    face_helper.clean_all()

    if isinstance(img_path, str):
        img_name = os.path.basename(img_path)
        basename, ext = os.path.splitext(img_name)
        print(f'[{i+1}/{test_img_num}] Processing: {img_name}')
        img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    else:  # for video processing
        basename = str(i).zfill(6)
        img_name = f'{video_name}_{basename}' if input_video else basename
        print(f'[{i+1}/{test_img_num}] Processing: {img_name}')
        img = img_path

    if args.has_aligned:
        # the input faces are already cropped and aligned
        img = cv2.resize(img, (512, 512), interpolation=cv2.INTER_LINEAR)
        face_helper.is_gray = is_gray(img, threshold=10)
        if face_helper.is_gray:
            print('Grayscale input: True')
        face_helper.cropped_faces = [img]
    else:
        face_helper.read_image(img)
        imgs.append(face_helper.input_img)
        # get face landmarks for each face
        num_det_faces = face_helper.get_face_landmarks_5(
            only_center_face=args.only_center_face, resize=640, eye_dist_threshold=5)
        print(f'\tdetect {num_det_faces} faces')
        # align and warp each face
        # print(face_helper.all_landmarks_5)
        face_helper.align_warp_face()
        # clean_all() wipes the helper's state, so keep each frame's affine matrix
        affine_matrices.append(face_helper.affine_matrices[0])

    for idx, cropped_face in enumerate(face_helper.cropped_faces):
        all_cropped_images_list.append(cropped_face / 255.)

face_helper.affine_matrices = affine_matrices
print(len(affine_matrices))
print(len(face_helper.affine_matrices))
# Start of processing
print(f"Processing batches of images for cropping out faces")
batched_input_img_list = []
batch_num = 0
for cropped_images in all_cropped_images_list:
    batched_input_img_list.append(cropped_images)
    if len(batched_input_img_list) == args.batch_size:
        # Batch inference
        batch_num += 1
        print(f"Processing batch no {batch_num}")
        process_batch_for_inference(batched_input_img_list)
        # Clear the batched input list for the next batch
        batched_input_img_list = []

if len(batched_input_img_list) > 0 and len(batched_input_img_list) < args.batch_size:
    batch_num += 1
    print(f"Processing last batch {batch_num}")
    # Batch inference for the last (partial) batch
    process_batch_for_inference(batched_input_img_list)
    batched_input_img_list = []
This is how I create the batches to be processed.
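As an aside, the accumulate-and-flush loop above can be collapsed into plain list slicing, which also handles the final partial batch automatically (same names as above):

for batch_num, start in enumerate(range(0, len(all_cropped_images_list), args.batch_size), start=1):
    batch = all_cropped_images_list[start:start + args.batch_size]
    print(f"Processing batch no {batch_num} ({len(batch)} images)")
    process_batch_for_inference(batch)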
# restoring back the faces on the images.
print(f"length of imgs {len(imgs)}")
# for img in imgs:
# paste_back
if not args.has_aligned:
    # upsample the background
    if bg_upsampler is not None:
        # Now only support RealESRGAN for upsampling background
        # NOTE: 'img' here is still the last frame from the loop above
        bg_img = bg_upsampler.enhance(img, outscale=args.upscale)[0]
    else:
        bg_img = None
    # print(f"shape of affine matrices {len(face_helper.affine_matrices)}")
    face_helper.get_inverse_affine(None)
    # paste each restored face to the input image
    if args.face_upsample and face_upsampler is not None:
        restored_img = face_helper.paste_faces_to_input_image(
            upsample_img=bg_img, draw_box=args.draw_box, face_upsampler=face_upsampler)
    else:
        print(f"Length of restored faces {len(face_helper.restored_faces)}")
        print(f"Length of inverse affine matrices {len(face_helper.inverse_affine_matrices)}")
        if bg_img is not None:
            print(f"Shape of bg_img: {bg_img.shape}")
        restored_imgs = face_helper.paste_faces_to_input_image(
            upsample_img=bg_img, draw_box=args.draw_box)

assert type(restored_imgs) == list, f"Output should be a list, got {type(restored_imgs)}"
# save faces
assert len(all_cropped_images_list) == len(restored_imgs), \
    "Length of cropped faces and restored faces should be the same"
for idx, (cropped_face, restored_face) in enumerate(zip(all_cropped_images_list, restored_imgs)):
    # save cropped face
    if not args.has_aligned:
        save_crop_path = os.path.join(result_root, 'cropped_faces', f'{basename}_{idx:02d}.png')
        imwrite(cropped_face, save_crop_path)
    # save restored face
    if args.has_aligned:
        save_face_name = f'{basename}.png'
    else:
        save_face_name = f'{basename}_{idx:02d}.png'
    if args.suffix is not None:
        save_face_name = f'{save_face_name[:-4]}_{args.suffix}.png'
    save_restore_path = os.path.join(result_root, 'restored_faces', save_face_name)
    imwrite(restored_face, save_restore_path)

for restored_img in restored_imgs:
    # save restored img (without mutating basename inside the loop)
    if not args.has_aligned and restored_img is not None:
        save_name = f'{basename}_{args.suffix}' if args.suffix is not None else basename
        save_restore_path = os.path.join(result_root, 'final_results', f'{save_name}.png')
        imwrite(restored_img, save_restore_path)
And for some unknown reason this particular part takes around 2 to 3 minutes to complete. Has anyone worked out a faster pipeline for videos? Any suggestions for optimizing this? I'd appreciate the help. Thanks!
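In case it helps anyone reproduce this, a minimal way to see where those 2 to 3 minutes go is to time the two main stages of the paste-back separately; my suspicion is bg_upsampler.enhance, since RealESRGAN upsamples every full frame:

import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # Print the wall-clock time spent inside the with-block.
    start = time.perf_counter()
    yield
    print(f'{label}: {time.perf_counter() - start:.2f}s')

# Wrapped around the calls from the snippet above:
with timed('background upsampling'):
    bg_img = bg_upsampler.enhance(img, outscale=args.upscale)[0]
with timed('paste faces'):
    restored_imgs = face_helper.paste_faces_to_input_image(
        upsample_img=bg_img, draw_box=args.draw_box)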