
About the readdata_Oxford.py question? #3

Open
PMRS-lab opened this issue Nov 26, 2024 · 5 comments

@PMRS-lab

I couldn't find any executable function in the readdata_Oxford.py file; it only contains class methods. How can I accurately crop the stitched satellite image to the area corresponding to the ground image?

PMRS-lab changed the title from "About In readdata_Oxford.py question?" to "About the readdata_Oxford.py question?" on Nov 26, 2024
@ZiminXia
Member

Thank you for your interest in this project.
If you check the training code, you can see that the next_pair_batch function is called:

batch_sat, batch_grd, batch_gt = input_data.next_pair_batch(batch_size)

The cropping is done at this comment in the dataloader:

# crop a satellite patch centered at the location of the ground image, offset by a randomly generated amount
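For illustration, here is a minimal sketch of what such on-the-fly cropping can look like (the function and variable names, e.g., crop_sat_patch, sat_map, and grd_xy, are hypothetical placeholders rather than the repository's actual identifiers, and border handling is omitted):

```python
import numpy as np

def crop_sat_patch(sat_map, grd_xy, patch_size=512, max_offset=100):
    """Crop a satellite patch centered at the ground image's map location,
    shifted by a random offset; the offset defines the ground-truth position."""
    # fresh random shift (in pixels) drawn at every call, i.e., every iteration
    dx, dy = np.random.randint(-max_offset, max_offset + 1, size=2)
    cx, cy = grd_xy[0] + dx, grd_xy[1] + dy   # shifted patch center
    half = patch_size // 2
    patch = sat_map[cy - half:cy + half, cx - half:cx + half]
    # ground truth: pixel location of the ground image inside the cropped patch
    gt = (half - dx, half - dy)
    return patch, gt
```

Because the offsets are re-sampled at every call, each epoch sees differently centered patches, which is why no fixed patch dataset needs to be written to disk first.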

@PMRS-lab
Author


Thank you very much for your answer. Another question: it seems the network does not reduce the viewpoint difference between the ground and satellite images when extracting features. From the network structure you showed, it appears to learn from the ground images directly, without any perspective transformation. Will this affect the similarity computation? After all, without a perspective change, the appearance gap between the two views is quite large.

@PMRS-lab
Author


Can I take it that the satellite images are cropped on the fly during training, rather than the cropped patches first being written out as a separate satellite-patch dataset?

@ZiminXia
Member

Regarding the dataloader, the cropping is done on the fly since the random offsets are generated at each iteration.

Regarding the model architecture, we did not use the common perspective transformations, e.g., the polar transform or a homography, because (1) the polar transform assumes center alignment between the ground and aerial views, an assumption that does not hold for fine-grained localization, and (2) a homography ignores above-ground objects, effectively limiting the model to lane markings and other ground-plane features.

In general, we find that global descriptors are strong enough to pull the two views together. A stronger perspective transformation may further improve performance, but we see clear limitations in the commonly used ones.
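For context, here is a minimal sketch of the polar transform mentioned above (a generic illustration, not this repository's code), which resamples an aerial patch into a pseudo-panorama under the assumption that the patch center coincides with the camera location:

```python
import numpy as np

def polar_transform(aerial, out_h, out_w):
    """Resample a square aerial patch into polar coordinates around its center.
    Rows sweep the radius, columns sweep the azimuth, mimicking a ground panorama."""
    H, W = aerial.shape[:2]
    cy, cx = H / 2.0, W / 2.0                 # assumed camera location: the center
    radii = np.linspace(0, min(cy, cx) - 1, out_h)
    angles = np.linspace(0, 2 * np.pi, out_w, endpoint=False)
    r, a = np.meshgrid(radii, angles, indexing="ij")
    ys = (cy - r * np.cos(a)).astype(int)     # nearest-neighbor sampling
    xs = (cx + r * np.sin(a)).astype(int)
    return aerial[ys, xs]
```

If the true camera position is offset from the patch center, every sampled ray originates from the wrong point, which is why this center-alignment assumption breaks down for fine-grained, metric localization.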

@PMRS-lab
Author


Thank you very much. Yesterday you mentioned the common perspective projection transformations, and I would like to ask an open-ended question: do you think an orthographic projection transformation would give better results than the polar transform or a homography? Best wishes!
