Add instructions for setting up Analytics (#494)
* integrate custom docs with new UI

* more edits

* use website wording for intro

* fix numbering in table

* rename and some edits

* rename manage_repo file, per Bo

* Merge.

* formatting edits

* updates per Keyur's feedback

* Fix typos

* fix nav order

* fix link to API key request form

* update form link

* update key request form and output dir env var

* Revert to gerund

Though the style guide says to just use imperatives, "get started" just sounds weird. Also this is more consistent with "troubleshooting"

* new troubleshooting entry

* fix typo

* new data container procedures

* more work

* more work

* complete data draft

* more changes

* more changes

* more revisions

* update troubleshooting doc etc.

* new version of diagrams

* remove data loading problems troubleshooting entry; can't reproduce

* revert title change

* add example for not mixing entity types

* first draft of changes

* more edits

* commit changes

* starting again on GA stuff

* more text

* more edits

* finish analytics section

* add FAQ entry

* save file

* add external link

* resolve some weird conflict

* fix another weird conflict
kmoscoe authored Sep 3, 2024
1 parent de4b478 commit d2536d6
Showing 5 changed files with 66 additions and 3 deletions.
Binary file added assets/images/custom_dc/analytics1.png
Binary file added assets/images/custom_dc/analytics2.png
Binary file added assets/images/custom_dc/analytics3.png
4 changes: 4 additions & 0 deletions custom_dc/faq.md
Expand Up @@ -45,3 +45,7 @@ The ML model runs entirely on your custom Data Commons instance, inside the Dock

No. However, you have the ability to improve query quality by improving your [search descriptions](/custom_dc/custom_data.html#varparams).

### How can I find out what terms my users are searching on?

The best way to record your users' search queries is with Google Analytics. Data Commons exports many custom Google Analytics events that you can use to create dimensions to report on. In particular, for natural-language (NL) queries, there are three different event types, triggered at different stages of a query, such as when a user submits the query and when results are returned. See [https://github.com/datacommonsorg/website/blob/f5e8e87c2291d87dfa37a3a887f01d7ff28d6467/static/js/shared/ga_events.ts](https://github.com/datacommonsorg/website/blob/f5e8e87c2291d87dfa37a3a887f01d7ff28d6467/static/js/shared/ga_events.ts){: target="_blank"} for details. For procedures on setting this up, see [Report on custom dimensions](/custom_dc/launch_cloud.html#custom-dimensions).

65 changes: 62 additions & 3 deletions custom_dc/launch_cloud.md
Expand Up @@ -15,21 +15,24 @@ parent: Build your own Data Commons

When you are ready to launch your site to external traffic, there are many tasks you will need to perform, including:

- Configure your Cloud Service to serve external traffic, over SSL. GCP offers many options for this; see [Mapping custom domains](https://cloud.google.com/run/docs/mapping-custom-domains){: target="_blank"}.
- Configure your Cloud Service to serve external traffic, over SSL. GCP offers many options for this; see [Mapping a domain using a global external Application Load Balancer](https://cloud.google.com/run/docs/mapping-custom-domains#https-load-balancer){: target="_blank"}.
- Optionally, restrict access to your service; see [Custom audiences (services)](https://cloud.google.com/run/docs/configuring/custom-audiences){: target="_blank"}.
- Optionally, add a caching layer to improve performance. We have provided specific procedures to set up a Redis Memorystore in [Improve database performance](#redis).
<!--- Optionally, add [Google Analytics](https://marketingplatform.google.com/about/analytics/){: target="_blank"} to track your website's usage. -->
- Optionally, add [Google Analytics](https://marketingplatform.google.com/about/analytics/){: target="_blank"} to track your website's usage. Procedures for configuring Google Analytics support are in [Add Google Analytics reporting](#analytics).

## Improve database performance {#redis}

We recommend that you use a caching layer to improve the performance of your database. We recommend [Google Cloud Redis Memorystore](https://cloud.google.com/memorystore){: target="_blank"}, a fully managed solution, which will boost the performance of both natural-language searches and regular database lookups in your site. Redis Memorystore runs as a standalone instance in a Google-managed virtual private cloud (VPC), and connects to your VPC network ("default" or otherwise) via [direct peering](https://cloud.google.com/vpc/docs/vpc-peering){: target="_blank"}. Your Cloud Run service connects to the instance using a [VPC connector](https://cloud.google.com/vpc/docs/serverless-vpc-access){: target="_blank"}.
We recommend that you use a caching layer to improve the performance of your database. We recommend [Google Cloud Redis Memorystore](https://cloud.google.com/memorystore){: target="_blank"}, a fully managed solution, which will boost the performance of both natural-language searches and regular database lookups in your site. Redis Memorystore runs as a standalone instance in a Google-managed virtual private cloud (VPC), and connects to your VPC network ("default" or otherwise) via [direct peering](https://cloud.google.com/vpc/docs/vpc-peering){: target="_blank"}. Your Cloud Run service connects to the instance using a [VPC connector](https://cloud.google.com/vpc/docs/serverless-vpc-access){: target="_blank"}.

In the following procedures, we show you how to create a Redis instance that connects to your project's "default" VPC network.

**Step 1: Create the Redis instance**

The following is a sample configuration that you can tune as needed. For additional information, see [Create and manage Redis instances](https://cloud.google.com/memorystore/docs/redis/create-manage-instances){: target="_blank"}.
The following is a sample configuration that you can tune as needed. For additional information, see [Create and manage Redis instances](https://cloud.google.com/memorystore/docs/redis/create-manage-instances){: target="_blank"}.

1. Go to [https://console.cloud.google.com/memorystore/redis/instances](https://console.cloud.google.com/memorystore/redis/instances){: target="_blank"} for your project.
1. Go to [https://console.cloud.google.com/memorystore/redis/instances](https://console.cloud.google.com/memorystore/redis/instances){: target="_blank"} for your project.
1. Select the **Redis** tab and click **Create Instance**.
1. If prompted to enable the Redis API server, accept.
Expand All @@ -47,6 +50,7 @@ The following is a sample configuration that you can tune as needed. For additio

**Step 3: Create the VPC connector**

1. Go to [https://console.cloud.google.com/networking/connectors/list](https://console.cloud.google.com/networking/connectors/list){: target="_blank"} for your instance.
1. Go to [https://console.cloud.google.com/networking/connectors/list](https://console.cloud.google.com/networking/connectors/list){: target="_blank"} for your instance.
1. If you are prompted to enable the VPC Access API, accept.
1. In the **Serverless VPC Access** screen, click **Create Connector**.
Expand All @@ -57,6 +61,7 @@ The following is a sample configuration that you can tune as needed. For additio
1. In the **IP Range** field, enter a valid IP range; for example, `10.9.0.0`.
1. Click **Create**.

For additional information, see [Serverless VPC Access](https://cloud.google.com/vpc/docs/serverless-vpc-access){: target="_blank"}.
For additional information, see [Serverless VPC Access](https://cloud.google.com/vpc/docs/serverless-vpc-access){: target="_blank"}.

**Step 4: Configure your Cloud Run service to connect to the VPC**
Expand All @@ -80,4 +85,58 @@ To verify that traffic is hitting the cache:
1. Run some queries against your running Cloud Run service.
1. In the Cloud Console, go to the Memorystore page and select Redis instance.
1. Under **Instance Functions**, click **Monitoring**.
1. Scroll to the **Cache Hit Ratio** graph. You should see a significant percentage of your traffic hitting the cache.
1. Scroll to the **Cache Hit Ratio** graph. You should see a significant percentage of your traffic hitting the cache.
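
As a rough illustration of what the graph represents, a hit ratio can be derived from the `keyspace_hits` and `keyspace_misses` counters that Redis reports in its `INFO` output. The following is a minimal sketch in Python; the function name and the sample numbers are hypothetical, and it assumes the graph is computed as hits divided by total lookups.

```python
def cache_hit_ratio(keyspace_hits: int, keyspace_misses: int) -> float:
    """Return hits / (hits + misses), or 0.0 when there have been no lookups."""
    total = keyspace_hits + keyspace_misses
    if total == 0:
        return 0.0
    return keyspace_hits / total


# Sample counters (made up for illustration), as might be read from Redis INFO.
print(f"{cache_hit_ratio(900, 100):.0%}")
```

A ratio that stays near zero after you run repeated queries may indicate that the service is not reaching the Redis instance.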

## Add Google Analytics reporting {#analytics}

Google Analytics provides detailed reports on user engagement with your site. In addition, Data Commons provides a number of custom parameters you can use to report on specific attributes of a Data Commons site, such as search queries and specific page views.

### Enable Analytics tracking

1. If you don't already have a Google Analytics account, create one, following the procedures in [Set up Analytics for a website and/or app](https://support.google.com/analytics/answer/9304153){: target="_blank"}. Record the Analytics tag ID assigned to your account.
1. Go to the Cloud Console for your [Cloud Run service](https://console.cloud.google.com/run/), and click **Edit & deploy new revision**.
1. Expand **Variables and secrets** and click **Add new variable**.
1. Add the name `GOOGLE_ANALYTICS_TAG_ID` and in the value field, type in your tag ID.
1. Click **Deploy** to redeploy the service. Data collection will take a day or two to start and begin showing up in your reports.
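
If you prefer the command line to the Cloud Console, the same environment variable can likely be set with `gcloud run services update` and its `--update-env-vars` flag. The sketch below only builds and prints the command so you can review it before running it; the service name, region, and tag ID are placeholders, and the helper function is hypothetical.

```python
import shlex


def build_update_command(service: str, region: str, tag_id: str) -> list[str]:
    """Build a gcloud command that sets GOOGLE_ANALYTICS_TAG_ID on a Cloud Run service."""
    if not tag_id.strip():
        raise ValueError("tag_id must not be empty")
    return [
        "gcloud", "run", "services", "update", service,
        "--region", region,
        # Adds or updates one variable without touching the others.
        "--update-env-vars", f"GOOGLE_ANALYTICS_TAG_ID={tag_id}",
    ]


# Placeholder values; substitute your own service, region, and tag ID.
cmd = build_update_command("my-datacommons-service", "us-central1", "G-XXXXXXXXXX")
print(shlex.join(cmd))
```

Updating the variable this way deploys a new revision, just as clicking **Deploy** does in the console.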

### Report on custom dimensions {#custom-dimensions}

Data Commons exports many Google Analytics [custom events](https://support.google.com/analytics/answer/12229021){: target="_blank"} and [parameters](https://support.google.com/analytics/answer/13675006){: target="_blank"}, so that Data Commons-specific activity, such as search queries and specific page views, can be logged. You can use these to create custom reports and explorations. The full set is defined in [`website/static/js/shared/ga_events.ts`](https://github.com/datacommonsorg/website/blob/7f896a982e8567cd96a0d8b01d1cd5eaaf285974/static/js/shared/ga_events.ts){: target="_blank"}. Before you can get reports on them, you need to create [custom dimensions](https://support.google.com/analytics/answer/14240153){: target="_blank"} from them.

To create a custom dimension for a Data Commons custom event:

1. In the [Google Analytics dashboard](https://analytics.google.com/analytics/web/){: target="_blank"} for your account, go to the **Admin** page.
1. Select **Data display** > **Custom definitions**.
1. Click **Create custom dimension**.
1. Keep the **Scope** as **Event** and click the **Event parameter** > **Select event parameter** drop-down to see the list of custom event parameters.

![Custom parameters](/assets/images/custom_dc/analytics1.png){: width="400"}

1. Select the parameter you need, for example, **query**.
1. Add a dimension name and description. These can be anything you want, but the name should be meaningful because it shows up in reports; for example, `Search query`.
1. When done, click **Save**.
1. Select **Data display** > **Events** and you should see a number of new custom events that have been added to your account.
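
If you need to register several dimensions, the Google Analytics Admin API offers a `customDimensions` resource that may be easier to script than the UI. The sketch below only constructs the request URL and JSON body; it does not send anything or handle authentication. The helper function and the property ID are hypothetical, and the `query` parameter name is taken from the example above.

```python
import json

ADMIN_API = "https://analyticsadmin.googleapis.com/v1beta"


def custom_dimension_request(property_id: str, parameter_name: str,
                             display_name: str, description: str = ""):
    """Return (url, body) for creating an event-scoped custom dimension."""
    url = f"{ADMIN_API}/properties/{property_id}/customDimensions"
    body = {
        "parameterName": parameter_name,  # the event parameter, e.g. "query"
        "displayName": display_name,      # the name shown in reports
        "description": description,
        "scope": "EVENT",                 # matches the Event scope chosen in the UI
    }
    return url, body


# "123456789" is a placeholder Analytics property ID.
url, body = custom_dimension_request("123456789", "query", "Search query",
                                     "Text of the user's search query")
print(url)
print(json.dumps(body, indent=2))
```

To send the request, you would POST the body to the URL with an OAuth 2.0 access token that carries an Analytics edit scope.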

To create a report based on a custom event:

1. In the [Google Analytics dashboard](https://analytics.google.com/analytics/web/){: target="_blank"} for your account, go to the **Explore** page and select **Blank - create a new exploration**.
1. Select **Variables** > **Dimensions** > **+** to open the **Select dimensions** window.
1. Select **Custom**, select the dimension you want, for example, **Search query**, and click **Import**.

![Custom parameters](/assets/images/custom_dc/analytics2.png){: width="400"}

1. Select **Variables** > **Metrics** > **+** to open the **Select metrics** window.
1. Select the metric you want, such as users, sessions, or views, and click **Import**.
1. Select **Settings** > **Rows** > **Drop or select dimension** and from the drop-down menu, select the dimension you want, such as **Search query**.
1. Select **Settings** > **Values** > **Drop or select metric** and from the drop-down menu, select the metric of interest, such as users, sessions, or views.
1. Edit any other settings you like and name the report. For the first 48 hours you will see **(not set)** for the first row. Afterwards, rows will be populated with real values.

![Custom exploration](/assets/images/custom_dc/analytics3.png){: width="400"}
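
A similar report can also be retrieved programmatically through the Google Analytics Data API, where event-scoped custom dimensions are addressed as `customEvent:<parameter name>`. The following sketch builds a `runReport` request that ranks search queries by event count; it does not send the request or handle authentication. The helper function, property ID, and default values are hypothetical, and it assumes a custom dimension has already been registered for the `query` parameter.

```python
import json

DATA_API = "https://analyticsdata.googleapis.com/v1beta"


def search_query_report_request(property_id: str, parameter_name: str = "query",
                                days: int = 7, limit: int = 25):
    """Return (url, body) for a runReport call that ranks values of a custom dimension."""
    url = f"{DATA_API}/properties/{property_id}:runReport"
    body = {
        "dateRanges": [{"startDate": f"{days}daysAgo", "endDate": "today"}],
        # Rows: one per distinct value of the custom dimension.
        "dimensions": [{"name": f"customEvent:{parameter_name}"}],
        # Values: how many events carried each value.
        "metrics": [{"name": "eventCount"}],
        "orderBys": [{"metric": {"metricName": "eventCount"}, "desc": True}],
        "limit": limit,
    }
    return url, body


# "123456789" is a placeholder Analytics property ID.
url, body = search_query_report_request("123456789")
print(url)
print(json.dumps(body, indent=2))
```

As with the exploration above, rows may show **(not set)** until the custom dimension has been collecting data for a while.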