usail-hkust/Awesome-Urban-Foundation-Models

Awesome-Urban-Foundation-Models

An Awesome Collection of Urban Foundation Models (UFMs).

News 🎉

🌟 2025-01: We have released a significantly extended version of the survey, Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models. It formally defines and conceptualizes Urban General Intelligence, proposes a forward-looking general framework for developing versatile UFMs, and integrates the up-to-date literature on UFMs, adding a new section on geovector-based UFMs and several new subsections.

🌟 2024-05: Urban Foundation Models: A Survey has been accepted as a Tutorial Track Paper at KDD'24 and will be published in the conference proceedings. Additionally, we will host a tutorial on Urban Foundation Models at the KDD'24 conference. More details can be found on the tutorial website.

Urban Foundation Models (UFMs)

Urban Foundation Models (UFMs) are a family of large-scale models pre-trained on vast amounts of multi-source, multi-granularity, and multimodal urban data. They acquire notable general-purpose capabilities during pre-training and exhibit remarkable emergent abilities and adaptability across a range of urban application domains, such as transportation, urban planning, energy management, environmental monitoring, and public safety and security.

Survey Paper

[KDD'24 Version]

Urban Foundation Models: A Survey

Authors: Weijia Zhang, Jindong Han, Zhao Xu, Hang Ni, Hao Liu, Hui Xiong

[Extended Version]

Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models

Authors: Weijia Zhang, Jindong Han, Zhao Xu, Hang Ni, Tengfei Lyu, Hao Liu, Hui Xiong

🌟 If you find this resource helpful, please consider starring this repository and citing our survey paper:

@inproceedings{ufmsurvey-kdd2024,
  title={Urban Foundation Models: A Survey},
  author={Zhang, Weijia and Han, Jindong and Xu, Zhao and Ni, Hang and Liu, Hao and Xiong, Hui},
  booktitle={Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages={6633--6643},
  year={2024}
}
@misc{zhang2024urban,
      title={Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models}, 
      author={Weijia Zhang and Jindong Han and Zhao Xu and Hang Ni and Hao Liu and Hui Xiong},
      year={2024},
      eprint={2402.01749},
      archivePrefix={arXiv},
      primaryClass={cs.CY}
}

Outline

Taxonomy

1. Language-based Models

1.1 Unimodal Pre-training

Geo-text

  • (KDD'22) ERNIE-GeoL: A Geography-and-Language Pre-trained Model and its Applications in Baidu Maps [paper]

1.2 Unimodal Adaptation

Prompt engineering

  • (Urban Informatics'24) Towards Human-AI Collaborative Urban Science Research Enabled by Pre-trained Large Language Models [paper]
  • (ICLR'24) GeoLLM: Extracting Geospatial Knowledge from Large Language Models [paper]
  • (arXiv 2023.10) Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning [paper]
  • (arXiv 2023.05) GPT4GEO: How a Language Model Sees the World's Geography [paper]
  • (arXiv 2023.05) On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence [paper]
  • (arXiv 2023.05) ChatGPT is on the Horizon: Could a Large Language Model be Suitable for Intelligent Traffic Safety Research and Applications? [paper]
  • (GIScience'23) Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations [paper]
  • (SIGSPATIAL'23) Are Large Language Models Geospatially Knowledgeable? [paper]
  • (SIGSPATIAL'23) Towards Understanding the Geospatial Skills of ChatGPT: Taking a Geographic Information Systems (GIS) Exam [paper]

Model fine-tuning

  • (arXiv 2024.6) CityGPT: Empowering Urban Spatial Cognition of Large Language Models [paper]
  • (arXiv 2024.6) UrbanLLM: Autonomous Urban Activity Planning and Management with Large Language Models [paper]
  • (arXiv 2024.3) LAMP: A Language Model on the Map [paper]
  • (WSDM'24) K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization [paper]
  • (EMNLP'23) GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding [paper]
  • (KDD'23) QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search [paper]
  • (TOIS'23) Improving First-stage Retrieval of Point-of-interest Search by Pre-training Models [paper]
  • (EMNLP'22) SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation [paper]

2. Vision-based Models

2.1 Unimodal Pre-training

On-site visual data

  • (WWW'23) Knowledge-infused Contrastive Learning for Urban Imagery-based Socioeconomic Prediction [paper]
  • (CIKM'22) Predicting Multi-level Socioeconomic Indicators from Structural Urban Imagery [paper]
  • (AAAI'20) Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding [paper]

Remote sensing data

  • (TGRS'24) Change-Agent: Toward Interactive Comprehensive Remote Sensing Change Interpretation and Analysis [paper]
  • (JSTARS'24) A Billion-scale Foundation Model for Remote Sensing Images [paper]
  • (TGRS'23) A Decoupling Paradigm With Prompt Learning for Remote Sensing Image Change Captioning [paper]
  • (TGRS'23) Foundation Model-Based Multimodal Remote Sensing Data Classification [paper]
  • (TGRS'23) RingMo-Sense: Remote Sensing Foundation Model for Spatiotemporal Prediction via Spatiotemporal Evolution Disentangling [paper]
  • (ICCV'23) Towards Geospatial Foundation Models via Continual Pretraining [paper]
  • (ICCV'23) Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning [paper]
  • (ICML'23) CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [paper]
  • (TGRS'22) Advancing Plain Vision Transformer Toward Remote Sensing Foundation Model [paper]
  • (TGRS'22) RingMo: A Remote Sensing Foundation Model With Masked Image Modeling [paper]

Urban Raster data

  • (arXiv 2024.05) Aurora: A Foundation Model of the Atmosphere [paper]
  • (arXiv 2023.04) FengWu: Pushing the Skillful Global Medium-range Weather Forecast beyond 10 Days Lead [paper]
  • (arXiv 2023.04) W-MAE: Pre-trained Weather Model with Masked Autoencoder for Multi-variable Weather Forecasting [paper]
  • (Science'23) Learning skillful medium-range global weather forecasting [paper]
  • (Nature'23) Accurate Medium-range Global Weather Forecasting with 3D Neural Networks [paper]
  • (ICML'23) ClimaX: A Foundation Model for Weather and Climate [paper]
  • (arXiv 2022.02) FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators [paper]

2.2 Unimodal Adaptation

Prompt engineering

  • (TGRS'24) RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model [paper]
  • (NeurIPS'23) SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model [paper]

Model fine-tuning

  • (arXiv 2023.11) GeoSAM: Fine-tuning SAM with Sparse and Dense Visual Prompting for Automated Segmentation of Mobility Infrastructure [paper]
  • (arXiv 2023.02) Learning Generalized Zero-Shot Learners for Open-Domain Image Geolocalization [paper]
  • (NeurIPS'23) GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [paper]
  • (TGRS'23) RingMo-SAM: A Foundation Model for Segment Anything in Multimodal Remote-Sensing Images [paper]
  • (IJAEOG'22) Migratable Urban Street Scene Sensing Method based on Vision Language Pre-trained Model [paper]

3. Time Series-based Models

3.1 Unimodal Pre-training

Ordinary time series

  • (ICML'24) A decoder-only foundation model for time-series forecasting [paper]
  • (arXiv 2024.03) UniTS: Building a Unified Time Series Model [paper]
  • (arXiv 2024.02) Timer: Transformers for Time Series Analysis at Scale [paper]
  • (arXiv 2024.02) Generative Pretrained Hierarchical Transformer for Time Series Forecasting [paper]
  • (arXiv 2024.02) TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling [paper]
  • (arXiv 2024.01) TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series [paper]
  • (arXiv 2024.01) HiMTM: Hierarchical Multi-scale Masked Time Series Modeling for Long-term Forecasting [paper]
  • (arXiv 2023.12) Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation [paper]
  • (arXiv 2023.11) PT-Tuning: Bridging the Gap between Time Series Masked Reconstruction and Forecasting via Prompt Token Tuning [paper]
  • (arXiv 2023.10) UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting [paper]
  • (arXiv 2023.03) SimTS: Rethinking Contrastive Representation Learning for Time Series Forecasting [paper]
  • (arXiv 2023.01) Ti-MAE: Self-Supervised Masked Time Series Autoencoders [paper]
  • (NeurIPS'23) ForecastPFN: Synthetically-trained Zero-shot Forecasting [paper]
  • (NeurIPS'23) SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling [paper]
  • (NeurIPS'23) Lag-Llama: Towards Foundation Models for Time Series Forecasting [paper]
  • (ICLR'23) A Time Series is Worth 64 Words: Long-term Forecasting with Transformers [paper]
  • (KDD'23) TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting [paper]
  • (AAAI'22) TS2Vec: Towards Universal Representation of Time Series [paper]
  • (ICLR'22) CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting [paper]
  • (TNNLS'22) Self-Supervised Autoregressive Domain Adaptation for Time Series Data [paper]
  • (IJCAI'21) Time-Series Representation Learning via Temporal and Contextual Contrasting [paper]
  • (ICLR'21) Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding [paper]
  • (AAAI'21) Meta-Learning Framework with Applications to Zero-Shot Time-Series Forecasting [paper]
  • (AAAI'21) Time Series Domain Adaptation via Sparse Associative Structure Alignment [paper]
  • (KDD'21) A Transformer-based Framework for Multivariate Time Series Representation Learning [paper]
  • (KDD'20) Multi-Source Deep Domain Adaptation with Weak Supervision for Time-Series Sensor Data [paper]
  • (NeurIPS'19) Unsupervised Scalable Representation Learning for Multivariate Time Series [paper]

Spatial-correlated time series

  • (KDD'24) UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction [paper]
  • (NeurIPS'23) GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks [paper]
  • (CIKM'23) Mask- and Contrast-Enhanced Spatio-Temporal Learning for Urban Flow Prediction [paper]
  • (CIKM'23) Cross-city Few-Shot Traffic Forecasting via Traffic Pattern Bank [paper]
  • (KDD'23) Transferable Graph Structure Learning for Graph-based Traffic Forecasting Across Cities [paper]
  • (KDD'22) Selective Cross-City Transfer Learning for Traffic Prediction via Source City Region Re-Weighting [paper]
  • (WSDM'22) ST-GSP: Spatial-Temporal Global Semantic Representation Learning for Urban Flow Prediction [paper]
  • (SIGSPATIAL'22) When Do Contrastive Learning Signals Help Spatio-Temporal Graph Forecasting? [paper]
  • (KDD'22) Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting [paper]
  • (WWW'19) Learning from Multiple Cities: A Meta-Learning Approach for Spatial-Temporal Prediction [paper]
  • (IJCAI'18) Cross-City Transfer Learning for Deep Spatio-Temporal Prediction [paper]

3.2 Unimodal Adaptation

Prompt tuning

  • (arXiv 2023.12) Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation [paper]
  • (arXiv 2023.11) PT-Tuning: Bridging the Gap between Time Series Masked Reconstruction and Forecasting via Prompt Token Tuning [paper]
  • (arXiv 2023.05) Spatial-temporal Prompt Learning for Federated Weather Forecasting [paper]
  • (CIKM'23) PromptST: Prompt-Enhanced Spatio-Temporal Multi-Attribute Prediction [paper]
  • (IJCAI'23) Prompt Federated Learning for Weather Forecasting: Toward Foundation Models on Meteorological Data [paper]

3.3 Cross-modal Adaptation

Prompt engineering

  • (TKDE'22) PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting [paper]
  • (NeurIPS'23) Large Language Models Are Zero-Shot Time Series Forecasters [paper]

Model fine-tuning

  • (arXiv 2024.02) AutoTimes: Autoregressive Time Series Forecasters via Large Language Models [paper]
  • (ICLR'24) TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting [paper]
  • (arXiv 2024.03) TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models [paper]
  • (arXiv 2024.01) How can large language models understand spatial-temporal data? [paper]
  • (arXiv 2024.01) Spatial-temporal large language model for traffic prediction [paper]
  • (arXiv 2023.11) One Fits All: Universal Time Series Analysis by Pretrained LM and Specially Designed Adaptors [paper]
  • (arXiv 2023.11) GATGPT: A Pre-trained Large Language Model with Graph Attention Network for Spatiotemporal Imputation [paper]
  • (arXiv 2023.08) LLM4TS: Two-Stage Fine-Tuning for Time-Series Forecasting with Pre-Trained LLMs [paper]
  • (NeurIPS'23) One Fits All: Power General Time Series Analysis by Pretrained LM [paper]

Model reprogramming

  • (arXiv 2024.08) Empowering Pre-Trained Language Models for Spatio-Temporal Forecasting via Decoupling Enhanced Discrete Reprogramming [paper]
  • (KDD'24) UrbanGPT: Spatio-Temporal Large Language Models [paper]
  • (ICLR'24) Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [paper]
  • (arXiv 2023.08) TEST: Text Prototype Aligned Embedding to Activate LLM’s Ability for Time Series [paper]

4. Trajectory-based Models

4.1 Unimodal Pre-training

Road network trajectory

  • (arXiv 2024.08) TrajFM: A Vehicle Trajectory Foundation Model for Region and Task Transferability [paper]
  • (WWW'24) More Than Routing: Joint GPS and Route Modeling for Refine Trajectory Representation Learning [paper]
  • (KDD'23) LightPath: Lightweight and Scalable Path Representation Learning [paper]
  • (ICDM'23) Self-supervised Pre-training for Robust and Generic Spatial-Temporal Representations [paper]
  • (TKDE'23) Pre-Training General Trajectory Embeddings With Maximum Multi-View Entropy Coding [paper]
  • (ICDE'23) Self-supervised trajectory representation learning with temporal regularities and travel semantics [paper]
  • (VLDBJ'22) Unified route representation learning for multi-modal transportation recommendation with spatiotemporal pre-training [paper]
  • (CIKM'21) Robust road network representation learning: When traffic patterns meet traveling semantics [paper]
  • (IJCAI'21) Unsupervised path representation learning with curriculum negative sampling [paper]
  • (TIST'20) Trembr: Exploring road networks for trajectory representation learning [paper]
  • (ICDE'18) Deep representation learning for trajectory similarity computation [paper]
  • (IJCNN'17) Trajectory clustering via deep representation learning [paper]

Free space trajectory

  • (AAAI'23) Contrastive pre-training with adversarial perturbations for check-in sequence representation learning [paper]
  • (KBS'21) Self-supervised human mobility learning for next location prediction and trajectory classification [paper]
  • (AAAI'21) Pre-training context and time aware location embeddings from spatial-temporal trajectories for user next location prediction [paper]
  • (KDD'20) Learning to simulate human mobility [paper]

4.2 Unimodal Adaptation

Model fine-tuning

  • (ToW'23) Pre-Training Across Different Cities for Next POI Recommendation [paper]
  • (TIST'23) Doing More with Less: Overcoming Data Scarcity for POI Recommendation via Cross-Region Transfer [paper]
  • (CIKM'21) Region invariant normalizing flows for mobility transfer [paper]

4.3 Cross-modal Adaptation

Prompt engineering

  • (arXiv 2024.03) DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models [paper]
  • (arXiv 2023.11) Exploring Large Language Models for Human Mobility Prediction under Public Events [paper]
  • (arXiv 2023.10) Large Language Models for Spatial Trajectory Patterns Mining [paper]
  • (arXiv 2023.10) GPT-Driver: Learning to Drive with GPT [paper]
  • (arXiv 2023.10) LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving [paper]
  • (arXiv 2023.09) Can you text what is happening? Integrating pre-trained language encoders into trajectory prediction models for autonomous driving [paper]
  • (arXiv 2023.08) Where Would I Go Next? Large Language Models as Human Mobility Predictors [paper]

Model fine-tuning

  • (TIV'24) Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models [paper]
  • (SIGSPATIAL'22) Leveraging language foundation models for human mobility forecasting [paper]

5. Geovector-based Models

5.1 Unimodal Pre-training

Point-based data

  • (CIKM'24) G2PTL: A Geography-Graph Pre-trained Model [paper]
  • (Applied Sciences'22) GeoBERT: Pre-Training Geospatial Representation Learning on Point-of-Interest [paper]
  • (CIKM'21) GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale [paper]
  • (TKDE'21) Pre-Training Time-Aware Location Embeddings from Spatial-Temporal Trajectories [paper]

Polyline-based data

  • (CIKM'22) Jointly Contrastive Representation Learning on Road Network and Trajectory [paper]
  • (CIKM'21) Robust road network representation learning: When traffic patterns meet traveling semantics [paper]
  • (SIGSPATIAL'20) Enabling Finer Grained Place Embeddings using Spatial Hierarchy from Human Mobility Trajectories [paper]

Polygon-based data

  • (arXiv 2024.08) Urban Region Pre-training and Prompting: A Graph-based Approach [paper]
  • (arXiv 2024.05) Learning Geospatial Region Embedding with Heterogeneous Graph [paper]
  • (KDD'23) Urban Region Representation Learning with OpenStreetMap Building Footprints [paper]
  • (AAAI'20) Learning Geo-Contextual Embeddings for Commuting Flow Prediction [paper]

5.2 Unimodal Adaptation

Model fine-tuning

  • (arXiv 2024.06) Fine-tuning of Geospatial Foundation Models for Aboveground Biomass Estimation [paper]

Prompt tuning

  • (AAAI'23) Heterogeneous Region Embedding with Prompt Learning [paper]

5.3 Cross-modal Adaptation

Prompt engineering

  • (Eval4NLP'23) Zero-shot Probing of Pretrained Language Models for Geography Knowledge [paper]

Model fine-tuning

  • (EMNLP'24) Pretraining and Finetuning Language Models on Geospatial Networks for Accurate Address Matching [paper]

6. Multimodal-based Models

6.1 Multimodal Pre-training

Multimodal urban data

  • (KDD'24) ReFound: Crafting a Foundation Model for Urban Region Understanding upon Language and Visual Foundations [paper]
  • (WWW'24) UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web [paper]
  • (TITS'23) Parallel Transportation in TransVerse: From Foundation Models to DeCAST [paper]
  • (arXiv 2023.12) AllSpark: A Multimodal Spatiotemporal General Model [paper]
  • (arXiv 2023.10) City Foundation Models for Learning General Purpose Representations from OpenStreetMap [paper]

6.2 Multimodal Adaptation

Prompt engineering

  • (arXiv 2024.02) Large Language Model for Participatory Urban Planning [paper]
  • (arXiv 2023.09) TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models [paper]
  • (arXiv 2023.07) GeoGPT: Understanding and Processing Geospatial Tasks through An Autonomous GPT [paper]

Model fine-tuning

  • (ICML'24) GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model [paper]
  • (AAAI'24) VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View [paper]
  • (arXiv 2024.02) TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation [paper]
  • (arXiv 2023.12) Urban Generative Intelligence (UGI): A Foundational Platform for Agents in Embodied City Environment [paper]

7. Others

  • (AAAI'24) Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning [paper]
  • (IJMLC'24) Open-TI: Open Traffic Intelligence with Augmented Language Model [paper]
  • (ITSC'23) Building Transportation Foundation Model via Generative Graph Transformer [paper]
  • (ITSC'23) Can ChatGPT Enable ITS? The Case of Mixed Traffic Control via Reinforcement Learning [paper]
  • (ITSC'23) TransWorldNG: Traffic Simulation via Foundation Model [paper]

8. Contributing

👍 Contributions to this repository are welcome!

If you have come across relevant resources, feel free to open an issue or submit a pull request.

- (*conference|journal*) paper_name [[pdf](link)][[code](link)]
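
For contributors adding several papers at once, the entry template above can be filled in programmatically. The following is only a minimal sketch: the helper name `format_entry`, its parameters, and the `example.org` URL are hypothetical and are not part of this repository. Treating the code link as optional is also an assumption, since the template shows both links.

```python
from typing import Optional


def format_entry(venue: str, title: str, pdf_url: str,
                 code_url: Optional[str] = None) -> str:
    """Render one list entry in the shape of the template above.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    entry = f"- (*{venue}*) {title} [[pdf]({pdf_url})]"
    # Assumption: the code link is left out when no public code exists.
    if code_url:
        entry += f"[[code]({code_url})]"
    return entry


# Placeholder URL, not a real paper link.
print(format_entry("KDD'24", "Urban Foundation Models: A Survey",
                   "https://example.org/paper.pdf"))
```

The printed line can be pasted under the matching taxonomy heading in a pull request.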
