![teaser](teaser.png)

# OpenStreetView-5M
### The Many Roads to Global Visual Geolocation 📍🌍

Official PyTorch implementation of OpenStreetView-5M: The Many Roads to Global Visual Geolocation.

First authors: Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis
Second authors: Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao XU, Hongyu Zhou
Last author: Loic Landrieu
Research Institute: Imagine, LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, France

## Introduction 🌍

OpenStreetView-5M is the first large-scale open geolocation benchmark of street-view images.
To get a sense of the difficulty of the benchmark, you can play our demo.
Our dataset was used in an extensive benchmark, for which we provide the best-performing model.
For more details and results, please check out our paper and project page.

## Dataset 💾

OpenStreetView-5M is hosted at huggingface/datasets/osv5m/osv5m. To download and extract it, run:

```bash
python scripts/download-dataset.py
```

For other ways of importing the dataset, see DATASET.md.

## Inference 🔥

Our best model on OSV-5M can also be found on Hugging Face.

```python
from PIL import Image
from models.huggingface import Geolocalizer

geolocalizer = Geolocalizer.from_pretrained('osv5m/baseline')
img = Image.open('.media/examples/img1.jpeg')
x = geolocalizer.transform(img).unsqueeze(0)  # transform the image with our dedicated transform
gps = geolocalizer(x)  # (B, 2) tensor: (lat, lon) in radians
```
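Since the model outputs latitude and longitude in radians, you will usually want degrees for display or mapping. A minimal sketch of the conversion on plain floats (on the actual output tensor, `torch.rad2deg` does the same thing elementwise):

```python
import math

def rad_to_deg(lat_rad: float, lon_rad: float) -> tuple:
    """Convert a (latitude, longitude) pair from radians to degrees."""
    return math.degrees(lat_rad), math.degrees(lon_rad)

# Example: a prediction of (pi/4, pi/2) rad corresponds to (45, 90) degrees.
lat_deg, lon_deg = rad_to_deg(math.pi / 4, math.pi / 2)
```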

To reproduce the results of the model on Hugging Face, run:

```bash
python evaluation.py exp=eval_best_model dataset.global_batch_size=1024
```
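Geolocation evaluation compares predicted and ground-truth GPS coordinates by their distance on the sphere. The standard way to compute this is the haversine formula; here is a self-contained sketch (an illustration of the metric, not necessarily the repo's exact evaluation code):

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Paris to London is roughly 340 km as the crow flies.
dist = haversine_km(48.8566, 2.3522, 51.5074, -0.1278)
```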

## Benchmark 🛰️

![benchmark](benchmark.png)

To replicate all the experiments of our paper, we provide dedicated scripts in `scripts/experiments`.

## Installation 🌱

To install our conda environment, run:

```bash
conda env create -f environment.yaml
conda activate osv5m
```

## Replication 🛠️

To run most methods, you first need to precompute the QuadTrees (roughly 10 minutes):

```bash
# Re-run with different splitting/depth arguments to obtain other quadtree configurations
python scripts/preprocessing/preprocess.py data_dir=datasets do_split=1000
```
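The idea behind the quadtree preprocessing is to recursively split the map into cells until each cell holds at most a target number of images, yielding the classes for classification-based geolocation. A hypothetical minimal sketch of that recursion (for intuition only; the repo's implementation and arguments differ):

```python
def split_quadtree(points, bounds, max_points, max_depth=12):
    """Recursively split (lat, lon) points into quadrants until every cell
    holds at most `max_points` points (or `max_depth` is reached).
    `bounds` is (lat_min, lat_max, lon_min, lon_max); returns leaf cells."""
    if len(points) <= max_points or max_depth == 0:
        return [(bounds, points)]
    lat0, lat1, lon0, lon1 = bounds
    lat_mid, lon_mid = (lat0 + lat1) / 2, (lon0 + lon1) / 2
    leaves = []
    for sub in ((lat0, lat_mid, lon0, lon_mid), (lat0, lat_mid, lon_mid, lon1),
                (lat_mid, lat1, lon0, lon_mid), (lat_mid, lat1, lon_mid, lon1)):
        inside = [p for p in points
                  if sub[0] <= p[0] < sub[1] and sub[2] <= p[1] < sub[3]]
        leaves += split_quadtree(inside, sub, max_points, max_depth - 1)
    return leaves

points = [(48.85, 2.35), (40.71, -74.0), (35.68, 139.69), (51.51, -0.13)]
cells = split_quadtree(points, (-90.0, 90.0, -180.0, 180.0), max_points=1)
# every leaf now holds at most one point
```

Denser regions end up with smaller cells, which is what makes the resulting partition usable as a balanced class set.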

Use the `configs/exp` folder to select the experiment you want; feel free to explore it. Every model evaluated in the paper has a dedicated config file.

```bash
# Use more workers in the dataloader
computer.num_workers=20

# Change the number of available devices
computer.devices=1

# Change the batch size distributed across all devices
dataset.global_batch_size=2

# Switch between train and eval mode (default: train)
mode=eval
```

These parameters, and more, can also be changed from the config files. For example, to train the best model:

```bash
python train.py exp=best_model computer.devices=1 computer.num_workers=16 dataset.global_batch_size=2
```
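The `key.sub=value` arguments above follow Hydra-style dotted overrides, which are merged into the nested experiment config at launch. A minimal stdlib sketch of how such merging works (illustration only; the repo relies on Hydra itself):

```python
def apply_overrides(config: dict, overrides: list) -> dict:
    """Merge Hydra-style 'a.b=value' overrides into a nested dict config."""
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        try:                       # naive literal parsing: int, then float, else string
            value = int(raw)
        except ValueError:
            try:
                value = float(raw)
            except ValueError:
                value = raw
        node[leaf] = value
    return config

cfg = apply_overrides({}, ["computer.devices=1", "dataset.global_batch_size=2", "mode=eval"])
# cfg == {'computer': {'devices': 1}, 'dataset': {'global_batch_size': 2}, 'mode': 'eval'}
```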

## Citing 💫

```bibtex
@article{osv5m,
    title = {{OpenStreetView-5M}: {T}he Many Roads to Global Visual Geolocation},
    author = {Astruc, Guillaume and Dufour, Nicolas and Siglidis, Ioannis
      and Aronssohn, Constantin and Bouia, Nacim and Fu, Stephanie and Loiseau, Romain
      and Nguyen, Van Nguyen and Raude, Charles and Vincent, Elliot and Xu, Lintao
      and Zhou, Hongyu and Landrieu, Loic},
    journal = {CVPR},
    year = {2024},
}
```