Fix: --crop-bottom for equirect data #3525

Merged
merged 2 commits into from
Nov 20, 2024


@kevinddchen kevinddchen commented Nov 13, 2024

I am trying to process an equirect video using --crop-bottom=0.2 to remove myself from the bottom portion of the frames.

ns-process-data video \
    --camera-type equirectangular \
    --images-per-equirect 8 \
    --crop-bottom 0.2 \
    ...

Example frame:

[Image: frame_00001]

However, I get training images that include my body:

[Image: frame_00002_2]

I was able to identify an error in this function:

def _crop_bound_arr_vertical(
    bound_arr: list, fov: int, crop_factor: Tuple[float, float, float, float] = (0.0, 0.0, 0.0, 0.0)
) -> list:
    """Returns a list of vertical bounds adjusted for cropping.
    Args:
        bound_arr (list): Original list of vertical bounds in ascending order.
        fov (int): Field of view of the camera.
        crop_factor (Tuple[float, float, float, float]): Crop arr (top, bottom, left, right).
    Returns:
        list: Cropped bound arr
    """
    if crop_factor[1] > 0:
        bound_arr = _crop_bottom(bound_arr, fov, crop_factor[1])
    if crop_factor[0] > 0:
        bound_arr = _crop_top(bound_arr, fov, crop_factor[0])
    return bound_arr

bound_arr is initially the array [-45, 0, 45], meaning that perspective training images are generated by 120 deg FoV cameras pointed at altitudes of -45, 0, and 45 degrees. However, with --crop-bottom=0.2 we end up with bound_arr == [-57.75, -25.5, -6.0] at the end of the function, which is not what we want: every camera now points below the horizon, toward the region that was supposed to be cropped.

Working through the math, I found that the implementations of _crop_top() and _crop_bottom() were most likely swapped by accident: once I swapped them back, I ended up with bound_arr == [6.0, 25.5, 57.75], which is more correct. Indeed, +6 degrees is the minimum altitude at which a 120 deg FoV perspective camera excludes the bottom 20% of the equirect. This is the new training image created:

[Image: frame_00003]
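The +6 degree figure can be reproduced with a few lines of arithmetic. The sketch below is not the nerfstudio implementation; the function names are hypothetical, and it assumes that the equirect image maps altitudes from -90 to +90 degrees linearly onto its height and that the stated FoV applies vertically.

```python
def min_altitude_for_bottom_crop(fov_deg: float, crop_bottom: float) -> float:
    """Lowest camera altitude (deg) whose view stays above the cropped bottom band.

    Hypothetical helper. Assumes the equirect spans altitudes -90..+90 linearly
    and that the camera's vertical FoV is fov_deg.
    """
    # Altitude of the crop line: the bottom `crop_bottom` fraction of 180 deg is removed.
    lowest_visible = -90.0 + 180.0 * crop_bottom
    # The camera sees fov_deg / 2 below its pointing altitude.
    return lowest_visible + fov_deg / 2.0


def max_altitude_for_top_crop(fov_deg: float, crop_top: float) -> float:
    """Highest camera altitude (deg) whose view stays below the cropped top band."""
    highest_visible = 90.0 - 180.0 * crop_top
    return highest_visible - fov_deg / 2.0


# Bottom 20% cropped, 120 deg FoV: crop line at -54 deg, so the camera must
# point at -54 + 60 = 6 deg or higher.
print(min_altitude_for_bottom_crop(120, 0.2))  # 6.0
```

Under the same assumptions, cropping the top 20% gives a maximum altitude of -6 degrees, which mirrors the bottom-crop result.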

Additional notes:

  • Using --crop-factor 0.2 0 0 0 (i.e. crop top 20%) now correctly gives bound_arr == [-57.75, -25.5, -6.0]
  • I think the implementation is still inherently incorrect: using --crop-factor 0.1 0.1 0 0 (i.e. crop top and bottom 10%) gives bound_arr == [-22.3125, -4.125, 12.0], whereas all altitudes should lie between -12 and 12 degrees. That is work for another PR (which I am happy to do!)
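The bound arrays quoted in these notes can be checked against the allowed altitude range with a small script. This is only a sketch under the same assumptions as the reasoning above (altitudes map linearly from -90 to +90 degrees, and the 120 deg FoV applies vertically); `altitude_limits` and `out_of_range` are hypothetical names, not nerfstudio functions. A side with no crop is left unconstrained.

```python
from typing import List, Tuple


def altitude_limits(fov_deg: float, crop_top: float, crop_bottom: float) -> Tuple[float, float]:
    """Allowed (min, max) camera altitude in degrees; uncropped sides are unbounded."""
    half_fov = fov_deg / 2.0
    lo = -90.0 + 180.0 * crop_bottom + half_fov if crop_bottom > 0 else float("-inf")
    hi = 90.0 - 180.0 * crop_top - half_fov if crop_top > 0 else float("inf")
    return lo, hi


def out_of_range(bound_arr: List[float], fov_deg: float, crop_top: float, crop_bottom: float) -> List[float]:
    """Return the altitudes whose view would reach into a cropped band."""
    lo, hi = altitude_limits(fov_deg, crop_top, crop_bottom)
    eps = 1e-9  # tolerance so values exactly on the limit are accepted
    return [a for a in bound_arr if a < lo - eps or a > hi + eps]


# Crop top 20%: limit is (-inf, -6], so the reported array passes.
print(out_of_range([-57.75, -25.5, -6.0], 120, crop_top=0.2, crop_bottom=0.0))  # []

# Crop top and bottom 10%: limit is [-12, 12], so -22.3125 is flagged.
print(out_of_range([-22.3125, -4.125, 12.0], 120, crop_top=0.1, crop_bottom=0.1))  # [-22.3125]
```

A check like this could also serve as a regression test for the follow-up PR, with the limits replaced by whatever the final implementation defines.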

@jb-ye jb-ye requested a review from THE-COB November 19, 2024 16:59
@kevinddchen kevinddchen merged commit 758ea19 into main Nov 20, 2024
3 checks passed
@kevinddchen kevinddchen deleted the kchen/crop branch November 20, 2024 08:37