Question about readdata_Oxford.py #3
Thank you for your interest in this project. The cropping is done at CrossViewMetricLocalization/train_Oxford.py, line 132 (commit cc76e78).
Thank you very much for your answer. Another question: does the network reduce the perspective difference between ground images and satellite images when extracting features? From the network structure you showed, it seems to learn from ground images directly, without applying any perspective change. Will this affect the similarity computation? After all, if features are extracted directly without changing the perspective, the difference between the two views is quite large.
Is it correct that the satellite images are cropped during training, rather than first exporting the cropped patches as a separate satellite image patch dataset?
Regarding the dataloader: the cropping is done on the fly, since the random offsets are generated at each iteration. Regarding the model architecture: we did not use common perspective transformations such as polar transformation or homography, because (1) the polar transformation assumes center alignment between the ground and aerial views, an assumption that does not hold for fine-grained localization, and (2) a homography ignores above-ground objects, limiting the model to lane markings and similar ground-plane features. In general, we find that global descriptors are strong enough to pull the two views together. Of course, a strong perspective transformation might further improve performance, but we see clear limitations in the commonly used ones.
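To illustrate the on-the-fly cropping described above, here is a minimal sketch of how a dataloader can crop a patch from the stitched aerial image with a fresh random offset at every call. Function and parameter names (`random_offset_crop`, `max_offset`, etc.) are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

def random_offset_crop(aerial, gt_row, gt_col, crop=256, max_offset=64, rng=None):
    """Crop a patch around the ground-truth pixel (gt_row, gt_col), shifted by a
    random offset drawn at call time, so each iteration sees a different crop
    and no patch dataset ever needs to be written to disk.

    Returns the patch and the ground-truth position inside the patch."""
    rng = rng or np.random.default_rng()
    dr = int(rng.integers(-max_offset, max_offset + 1))
    dc = int(rng.integers(-max_offset, max_offset + 1))
    # Top-left corner such that the ground-truth point lands at center + offset.
    r0 = gt_row - crop // 2 + dr
    c0 = gt_col - crop // 2 + dc
    # Clamp to the bounds of the stitched aerial image.
    r0 = int(np.clip(r0, 0, aerial.shape[0] - crop))
    c0 = int(np.clip(c0, 0, aerial.shape[1] - crop))
    patch = aerial[r0:r0 + crop, c0:c0 + crop]
    return patch, (gt_row - r0, gt_col - c0)

# Usage: each call produces a differently-offset crop of the same location.
big = np.zeros((1000, 1000, 3), dtype=np.uint8)
patch, (pr, pc) = random_offset_crop(big, gt_row=500, gt_col=500)
```

Because the offset is re-sampled per iteration, the ground-truth location moves around inside the patch, which is what makes pre-exporting a fixed patch dataset unnecessary.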
Thank you very much. Yesterday you mentioned common perspective projection transformations, so I would like to ask an open-ended question: do you think an orthographic projection transformation would give better results than the polar coordinate transformation or a homography? Best wishes!
I couldn't find any executable function in the readdata_Oxford.py file; it only contains class definitions. How can I accurately crop the stitched satellite image into the area corresponding to a ground image?