Skip to content
This repository has been archived by the owner on Mar 22, 2022. It is now read-only.

Latest commit

 

History

History
144 lines (87 loc) · 6.91 KB

README.md

File metadata and controls

144 lines (87 loc) · 6.91 KB

Google Sheets to SQL Sync

I don't recommend using this tool anymore these days. I found airbyte a well-maintained and feature-rich alternative.

Build Status cljdoc badge Clojars Project

Keep your SQL database in sync with Google Sheets using googlesheets-sql-sync.

Let users manually insert data using Google Sheets while having the power of all available SQL tooling for further processing.

googlesheets-sql-sync uses JDBC and bundles the PostgreSQL and MySQL drivers. Additional drivers can be added any time. If you also would like to have support for SQLite, open an issue and it can probably be added in no time.

Assumptions and simplifications

To simplify the task of synchronisation, the following assumptions are made:

  • Sync happens not too frequently. Think minutes, not milliseconds.
  • Number of tables to sync is not too high. Maybe a hundred but not a million.
  • The Google Sheets are not too big. They might contain a thousand rows but not millions.
  • Headers are mostly stable. It is an exception to rename, add or delete columns.

This allows for a few simplifications in the implementation:

  • Do all specified sync tasks in sequence. There is enough time for this.
  • Limit API requests to Google's API to one a second to prevent rate limits as much as possible.
  • Truncate each table on every sync to ensure all changes are applied.
  • Log error when table schema doesn't match headers from sheet and require the user to check error manually. Most likely the user simply drops the table in this case and moves on.
  • Re-read the config file from disk before each sync interval, which allows for adjusting the config while the system is running.

Setup

Installation

  1. Make sure you have Java 8+ installed. Check by running java -version
  2. Download latest googlesheets-sql-sync.jar from Github.

Setup Google Application

  1. Create a new Project or work in an existing one
  2. Enable the Sheets API
  3. Setup your app's OAuth consent screen If this is an organisation-internal service, you most likely want to set it as internal and select the scope spreadsheets.readonly.
  4. Create a new OAuth client ID or use an existing one
  5. Set "Application type" to "Web application"
  6. Set at least one correct "Authorized redirect URI". To run googlesheets-sql-sync on your local machine with default settings use http://localhost:9955/oauth
  7. Keep "Client ID" and "Client secret" handy for later

Usage

  1. Create an empty config file
java -jar googlesheets-sql-sync.jar --init
  1. Now fill out the missing information in the config file.

  2. Use your Google credentials from above.

  3. Specify at least one target and one sheet using that target.

  4. You can find more DB options in the JDBC docs for PostgreSQL, MySQL or SQLite.

  5. Name the table as you wish for it to appear in your database.

  6. To get a spreadsheet_id, open one of your Google Sheets and copy the part between /d/ and /edit from the URL bar in your Browser.

  7. Specify the range using the A1:Z10. Skip the number to select all rows - like A:ZZ. You can also specify a sheet if your spreadsheet contains multiple sheets by prefixing th range like SomeSheet!A:ZZ. For example, the spreadsheet_id for https://docs.google.com/spreadsheets/d/1q5BNyL7-FnApmkjq45HlKPK-W-pdEmTrtpz0iaHm8p0/edit#gid=0 is 1q5BNyL7-FnApmkjq45HlKPK-W-pdEmTrtpz0iaHm8p0.

  8. Start the program with:

java -jar googlesheets-sql-sync.jar
  1. You will be prompted to visit an OAuth URL to authorize and connect your Google Account.

  2. After successful authorization, a first sync is triggered and further ones will occur in the specified interval.

Running without Server

Often you don't want to open up another port just for OAuth of a small sync tool. To work around this you can run java -jar googlesheets-sql-sync.jar --auth-only on your local machine, then copy the generated googlesheets_sql_sync.auth.json file to your server and on the server run java -jar googlesheets-sql-sync.jar --no-server

Single Sync

If you don't want to keep the syncing your data in an interval but want to sync only once, you can do so by passing the single-sync like this: java -jar googlesheets-sql-sync.jar --single-sync

Metrics

By default a simple Prometheus counter googlesheets_sql_sync_count is provided at /metrics.

Customization

The program can be configured using command line flags. To see available options, run:

java -jar googlesheets-sql-sync.jar --help

Pitfalls

  • When you authenticate a Google OAuth app, then throw away your .auth.json file and try to re-authenticate, Google for some reason will only send you access_token and expires_in, no refresh_token. To fix this go to https://myaccount.google.com/permissions remove the app's permission and try again.

Let me know if you run into any issues or if you have any suggestions for improvements.

Use as Clojure package

  • You can generate a config file with '#googlesheets-sql-sync.config/generate
  • Generate and validate options for running the system with '#googlesheets-sql-sync.options/defaults and '#googlesheets-sql-sync.options/validate
  • Run the system with '#googlesheets-sql-sync.core/start
  • Overwrite '#googlesheets-sql-sync.log/info, '#googlesheets-sql-sync.log/warn, #googlesheets-sql-sync.log/error to modify or disable logging.

Development

  • Make sure you have Leinigen installed.
  • lein test Run tests
  • lein run Run the whole system
  • lein repl Start in dev mode with REPL enabled

Building for production

  • Run lein uberjar

License

MIT