Bounding box is invalid Error #1175

Closed
TolgaAktas opened this issue Mar 14, 2023 · 12 comments

Comments

@TolgaAktas

Description

I am taking the intersection of two Landsat8 datasets that have different root paths. One is for the shadow-free images and the other for the shadowed images. I am getting the following error message even though my images are limited to a single WRS path/row location for minimal viable testing.

I have the following code but I am getting the error at sampler stage.

image

Here's the error message

image

Steps to reproduce

  1. I have downloaded the Landsat8 products for path = 16 & row = 30 at multiple dates. I moved the images with 0 - 2 % cloud cover ratio to a clean_image directory and moved the others (10 - 40% cover) to a shadowimage directory.

  2. I instantiated Landsat8 objects using these rootpaths, and defined bands as
    bands = ["B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "B10", "B11", "QA_PIXEL"]

  3. I take the intersection of the two datasets, which cover the same area but have different cloud cover ratios.

  4. At this point the sampler on the dataset would have

Version

0.3.1

@TolgaAktas
Author

More information about the datasets:

image

@adamjstewart
Collaborator

Can you print all_set too? I suspect that it has size 0, that is, that there are no images between the two sets that overlap. Although you may have multiple images from the same location, if they aren't from the same time, then they have no overlap. If you want to ignore the time dimension, you would need to modify these lines from:

max(self.mint, other.mint),
min(self.maxt, other.maxt),

to:

min(self.mint, other.mint),
max(self.maxt, other.maxt),
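
To illustrate why the original lines give an empty result for images taken at different times, here is a minimal sketch using plain numbers in place of timestamps. The function names `intersect_time` and `span_time` are hypothetical and are not part of TorchGeo; they only mirror the two min/max variants quoted above.

```python
def intersect_time(a, b):
    """Overlap of two (mint, maxt) intervals, or None if they are disjoint."""
    mint = max(a[0], b[0])
    maxt = min(a[1], b[1])
    return (mint, maxt) if mint <= maxt else None


def span_time(a, b):
    """Smallest interval covering both (mint, maxt) intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))


# Two acquisitions at different times (arbitrary illustrative values).
clean = (100.0, 101.0)
noisy = (200.0, 201.0)

print(intersect_time(clean, noisy))  # None
print(span_time(clean, noisy))       # (100.0, 201.0)
```

With the max/min form, two acquisitions that do not share any instant have no temporal overlap, so nothing ends up in the intersection even when the footprints coincide.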

@adamjstewart adamjstewart added the datasets Geospatial or benchmark datasets label Mar 14, 2023
@TolgaAktas
Author

You are right, trying to print the datasets produces the same error message. But creating the IntersectionDataset fails silently, without raising any error. When you say they have to be from the same time, is it literally the same date and time? That would be impossible to match; I just need them to be from the same area, and it's OK if the times differ.

image

@adamjstewart
Collaborator

Yes, if the images have temporal resolution down to the second, then they would need to have a matching second to be considered intersecting. If the images don't have any temporal component or it's at the year resolution, then things would be easier.

Actually, an easier hack to get what you want would be:

from torchgeo.datasets import Landsat8

Landsat8.filename_regex = r"_(?P<band>[A-Z0-9_]+)\."

clean_set = ...
noisy_set = ...

This removes the date part of the regex so that TorchGeo assumes there isn't any way to know the datetime at which an image was taken and instead only considers spatial intersection.

@TolgaAktas
Author

That's indeed a good workaround, I will try that. But I just realized that with the screenshots that I have sent you, the problem is with minx and maxx indeed. I specifically tried to keep a minimal dataset to have a working first prototype and here's what the data_dir looks like. Should I be getting bounding box errors?

image

@adamjstewart
Collaborator

The bounding box errors are just a symptom of Toblerity/rtree#204, the real issue is that the intersection dataset contains 0 images.

It contains 0 images because clean_set and noisy_set don't contain any images from the same day, and therefore have no overlap. If you disable the time component, it should work.

@TolgaAktas
Author

I tried changing to the regex you provided but I think that leaves out a lot of essential information for the Dataset object to find the dataset. I am getting a "No Landsat8 data was found in '~/datasets/landsat/wrs1630/clean' " error with the new regex. Any way to just leave out the date component?

@adamjstewart
Collaborator

Oops, try this:

Landsat8.filename_regex = r".*_(?P<band>[A-Z0-9_]+)\."

We use re.match, not re.search, so it always starts looking from the beginning of the filename.
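
The difference between the two patterns can be checked with the standard `re` module alone. This is a small sketch, not TorchGeo code; the filename below is a made-up example in the usual Landsat naming style and is an assumption for illustration only.

```python
import re

# Hypothetical Landsat 8 filename, used only for illustration.
filename = "LC08_L2SP_016030_20230301_20230308_02_T1_SR_B4.TIF"

no_prefix = r"_(?P<band>[A-Z0-9_]+)\."
with_prefix = r".*_(?P<band>[A-Z0-9_]+)\."

# re.match anchors at position 0, and the filename starts with "L", not "_".
print(re.match(no_prefix, filename))  # None

# re.search scans forward, so the same pattern does find a match.
print(re.search(no_prefix, filename) is not None)  # True

# The leading ".*" lets re.match consume everything up to the last "_".
match = re.match(with_prefix, filename)
print(match.group("band"))  # B4
```

Because `.*` is greedy, the `band` group captures only the text after the last underscore, which is worth keeping in mind for band names that themselves contain an underscore.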

@TolgaAktas
Author

TolgaAktas commented Mar 14, 2023

Ok, this helps work around the issue. I can correctly build and sample and get batches. But the images from the two sets are still identical (see torch.allclose() at the bottom) although the datasets and their images are different. Here's the code and the data_dir tree showing that the images are different. Any ideas?

PS: That's why I was asking about how the IntersectionDataset's samples are stacked up.

image

image

@adamjstewart
Collaborator

It could be that it's sampling from a corner where both images have nodata pixels. I would try printing the sample and checking to see if it's all zeros. Also print the shape to make sure you're getting what you want. Also, the data loader returns dictionaries, not images. Shouldn't you be using elem["image"] to access the image?

@TolgaAktas
Author

TolgaAktas commented Mar 14, 2023

I had written a custom collate_fn function, since I am trying to use this dataloader to connect to a different codebase, which requires me to use tensors. But you are right.

The identical images issue also has to do with my custom collate_fn function; I am looking into it. I will close the ticket for now since I think we solved the main problem!

@adamjstewart
Collaborator

Great! Feel free to reopen or open new issues/discussions if you have any other problems!
