[firestore-bigquery-export] Add lifecycle event #1303

Merged on Oct 17, 2023 (23 commits)
Changes from 9 commits
2 changes: 2 additions & 0 deletions firestore-bigquery-export/CHANGELOG.md
@@ -8,6 +8,8 @@ feature - add oldData to the record

fixed - updating table metadata too often

feature - add lifecycle event to export existing documents to BigQuery

## Version 0.1.26

docs - correct service account name
4 changes: 3 additions & 1 deletion firestore-bigquery-export/POSTINSTALL.md
@@ -64,7 +64,9 @@ When defining a specific BigQuery project, a manual step to set up permissions i

### _(Optional)_ Import existing documents

This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the import script provided by this extension.
If you chose not to automatically import existing documents when you installed this extension, you can backfill your BigQuery dataset with all the documents in your collection using the import script.

If you neither enable automatic import nor run the import script, the extension only exports the content of documents that are created or changed after installation.

The import script can read all existing documents in a Cloud Firestore collection and insert them into the raw changelog table created by this extension. The script adds a special changelog for each document with the operation of `IMPORT` and the timestamp of epoch. This is to ensure that any operation on an imported document supersedes the `IMPORT`.
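
The superseding behavior described above can be illustrated with a small sketch. The `ChangelogRow` shape and the `importRow` and `latestByDocument` helpers are hypothetical names chosen for this example, not the extension's actual schema or code. The sketch only assumes that the current state of a document is taken from its changelog row with the greatest timestamp, so a row stamped with the epoch loses to any real write.

```typescript
// Hypothetical row shape for illustration; the real changelog schema may differ.
type Operation = "IMPORT" | "CREATE" | "UPDATE" | "DELETE";

interface ChangelogRow {
  documentName: string;
  operation: Operation;
  timestamp: Date;
  data: Record<string, unknown> | null;
}

// Import records are stamped with the Unix epoch (1970-01-01T00:00:00Z).
const EPOCH = new Date(0);

function importRow(
  documentName: string,
  data: Record<string, unknown>
): ChangelogRow {
  return { documentName, operation: "IMPORT", timestamp: EPOCH, data };
}

// Keep, per document, the row with the greatest timestamp.
function latestByDocument(rows: ChangelogRow[]): Map<string, ChangelogRow> {
  const latest = new Map<string, ChangelogRow>();
  for (const row of rows) {
    const current = latest.get(row.documentName);
    if (
      current === undefined ||
      row.timestamp.getTime() > current.timestamp.getTime()
    ) {
      latest.set(row.documentName, row);
    }
  }
  return latest;
}

// A write captured while the import was running, followed by import rows.
const rows: ChangelogRow[] = [
  {
    documentName: "posts/a",
    operation: "UPDATE",
    timestamp: new Date("2023-10-17T00:00:00Z"),
    data: { title: "new" },
  },
  importRow("posts/a", { title: "old" }),
  importRow("posts/b", { title: "only imported" }),
];

const latest = latestByDocument(rows);
console.log(latest.get("posts/a")?.operation); // UPDATE
console.log(latest.get("posts/b")?.operation); // IMPORT
```

Because the import row carries the earliest possible timestamp, the order in which import rows and live writes arrive does not matter under this assumption.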

8 changes: 6 additions & 2 deletions firestore-bigquery-export/PREINSTALL.md
@@ -49,9 +49,13 @@ The response should be indentical in structure.

#### Backfill your BigQuery dataset

This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) provided by this extension.
To import documents that already exist at installation time into BigQuery, answer **Yes** when the installer asks "Import existing Firestore documents into BigQuery?" The extension will export existing documents as part of the installation and update processes.

**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost.
Alternatively, you can run the external [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) to backfill existing documents. If you plan to use this script, answer **No** when prompted to import existing documents.

**Important:** Run the external import script over the entire collection _after_ installing this extension; otherwise, all writes to your database during the import might be lost.

If you neither enable automatic import nor run the import script, the extension only exports the content of documents that are created or changed after installation.

#### Generate schema views

16 changes: 14 additions & 2 deletions firestore-bigquery-export/README.md
@@ -57,9 +57,13 @@ The response should be indentical in structure.

#### Backfill your BigQuery dataset

This extension only sends the content of documents that have been changed -- it does not export your full dataset of existing documents into BigQuery. So, to backfill your BigQuery dataset with all the documents in your collection, you can run the [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) provided by this extension.
To import documents that already exist at installation time into BigQuery, answer **Yes** when the installer asks "Import existing Firestore documents into BigQuery?" The extension will export existing documents as part of the installation and update processes.

**Important:** Run the import script over the entire collection _after_ installing this extension, otherwise all writes to your database during the import might be lost.
Alternatively, you can run the external [import script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/IMPORT_EXISTING_DOCUMENTS.md) to backfill existing documents. If you plan to use this script, answer **No** when prompted to import existing documents.

**Important:** Run the external import script over the entire collection _after_ installing this extension; otherwise, all writes to your database during the import might be lost.

If you neither enable automatic import nor run the import script, the extension only exports the content of documents that are created or changed after installation.

#### Generate schema views

@@ -108,12 +112,20 @@ To install an extension, your project must be on the [Blaze (pay as you go) plan

* Transform function URL: Specify a function URL to call that will transform the payload that will be written to BigQuery. See the pre-install documentation for more details.

* Import existing Firestore documents into BigQuery: Do you want to import existing documents from your Firestore collection into BigQuery? These documents will each have a special changelog with the operation of `IMPORT` and the timestamp of epoch. This ensures that any operation on an imported document supersedes the import record.

* Existing documents collection: What is the path of the Cloud Firestore Collection you would like to import from? (This may, or may not, be the same Collection for which you plan to mirror changes.)

* Docs per backfill: When importing existing documents, how many should be imported at once? The default value of 200 should be OK for most users. If you are using a transform function or have very large documents, you may need to set this to a lower number.
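
To make the "Docs per backfill" setting concrete, here is a minimal in-memory sketch of splitting a set of documents into batches of a given size. The `toBatches` helper and the generated document IDs are hypothetical. The extension's real backfill runs through the `fsimportexistingdocs` task queue function, whose implementation is not shown in this diff, so treat this only as an illustration of what the batch size controls.

```typescript
// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// DOCS_PER_BACKFILL is declared as a string parameter, so parse it first.
const docsPerBackfill = Number.parseInt("200", 10);

// 450 hypothetical document IDs: doc-0 through doc-449.
const documentIds = Array.from({ length: 450 }, (_, i) => `doc-${i}`);

const batchSizes = toBatches(documentIds, docsPerBackfill).map(
  (batch) => batch.length
);
console.log(batchSizes.join(", ")); // 200, 200, 50
```

A lower value means more, smaller batches, which is the trade-off the parameter description points to for transform functions or very large documents.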



**Cloud Functions:**

* **fsexportbigquery:** Listens for document changes in your specified Cloud Firestore collection, then exports the changes into BigQuery.

* **fsimportexistingdocs:** Imports existing documents in ${param:IMPORT_COLLECTION_PATH} into BigQuery. Imported documents will have a special changelog with the operation of `IMPORT` and the timestamp of epoch.



**APIs Used**:
56 changes: 56 additions & 0 deletions firestore-bigquery-export/extension.yaml
@@ -60,6 +60,15 @@ resources:
      eventTrigger:
        eventType: providers/cloud.firestore/eventTypes/document.write
        resource: projects/${param:PROJECT_ID}/databases/(default)/documents/${param:COLLECTION_PATH}/{documentId}
  - name: fsimportexistingdocs
    type: firebaseextensions.v1beta.function
    description:
      Imports existing documents in ${param:IMPORT_COLLECTION_PATH} into BigQuery. Imported documents will have
      a special changelog with the operation of `IMPORT` and the timestamp of epoch.
    properties:
      location: ${param:LOCATION}
      runtime: nodejs14
      taskQueueTrigger: {}

params:
  - param: LOCATION
@@ -352,3 +361,50 @@ params:
        value: no
    default: no
    required: true

  - param: DO_BACKFILL
    label: Import existing Firestore documents into BigQuery
    description: >-
      Do you want to import existing documents from your Firestore collection into BigQuery? These documents
      will each have a special changelog with the operation of `IMPORT` and the timestamp of epoch.
      This ensures that any operation on an imported document supersedes the import record.
    type: select
    default: false
    options:
      - label: Yes
        value: true
      - label: No
        value: false

  - param: IMPORT_COLLECTION_PATH
    label: Existing documents collection
    description: >-
      What is the path of the Cloud Firestore Collection you would like to import from?
      (This may, or may not, be the same Collection for which you plan to mirror changes.)
    type: string
    example: posts
    validationRegex: "^[^/]+(/[^/]+/[^/]+)*$"
    validationErrorMessage: Firestore collection paths must be an odd number of segments separated by slashes, e.g. "path/to/collection".
    default: posts
    required: true

  - param: DOCS_PER_BACKFILL
    label: Docs per backfill
    description: >-
      When importing existing documents, how many should be imported at once?
      The default value of 200 should be OK for most users.
      If you are using a transform function or have very large documents, you may need to set this to a lower number.
    type: string
    example: 200
    validationRegex: "^[1-9][0-9]*$"
    validationErrorMessage: Must be a positive integer.
    default: 200
    required: true

lifecycleEvents:
  onInstall:
    function: fsimportexistingdocs
    processingMessage: Checking whether to import existing documents
  onUpdate:
    function: fsimportexistingdocs
    processingMessage: Checking whether to import existing documents
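
The two `validationRegex` values added in this file can be checked against sample inputs with a short sketch. The helper names `isValidCollectionPath` and `isValidDocsPerBackfill` are hypothetical, and the sketch assumes the patterns behave the same under JavaScript's `RegExp` as under whatever engine the Firebase installer uses, which this diff does not state.

```typescript
// Patterns copied verbatim from the IMPORT_COLLECTION_PATH and
// DOCS_PER_BACKFILL params in extension.yaml.
const COLLECTION_PATH_PATTERN = new RegExp("^[^/]+(/[^/]+/[^/]+)*$");
const DOCS_PER_BACKFILL_PATTERN = new RegExp("^[1-9][0-9]*$");

// True when the path has an odd number of non-empty, slash-separated segments.
function isValidCollectionPath(path: string): boolean {
  return COLLECTION_PATH_PATTERN.test(path);
}

// True when the value is a positive integer without leading zeros.
function isValidDocsPerBackfill(value: string): boolean {
  return DOCS_PER_BACKFILL_PATTERN.test(value);
}

console.log(isValidCollectionPath("posts")); // true (1 segment)
console.log(isValidCollectionPath("path/to/collection")); // true (3 segments)
console.log(isValidCollectionPath("posts/doc1")); // false (2 segments, a document path)
console.log(isValidDocsPerBackfill("200")); // true
console.log(isValidDocsPerBackfill("0")); // false
```

The collection path pattern accepts one segment followed by zero or more pairs of segments, which is how it enforces the odd segment count named in the error message.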