
Setting up the Cloudfront collector

Yali Sassoon edited this page Aug 2, 2013 · 8 revisions


Introduction

The Cloudfront collector is the collector most commonly used by Snowplow users.

How it works

A tracking pixel (called i) is uploaded to Amazon's Cloudfront CDN. The Snowplow tracker sends data to the collector by making a GET request for the pixel, with the data appended to the request as a query string. The Cloudfront collector uses Cloudfront logging to record each request (including the query string) to S3.
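To make the request concrete, here is a minimal Python sketch of how a tracker might build the GET URL for the pixel. The Cloudfront domain `d1234example.cloudfront.net` and the field names `e` and `page` are placeholders chosen for illustration, not values taken from this guide; a real tracker defines its own field names.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_pixel_url(distribution_domain, event):
    """Return the URL of the pixel `i` with the event data as a query string."""
    # urlencode percent-encodes the values and joins the pairs with "&"
    return f"http://{distribution_domain}/i?{urlencode(event)}"


# Hypothetical distribution domain and illustrative field names
url = build_pixel_url(
    "d1234example.cloudfront.net",
    {"e": "pv", "page": "Home page"},
)
print(url)
# → http://d1234example.cloudfront.net/i?e=pv&page=Home+page

# The data can be recovered from the query string alone
decoded = parse_qs(urlsplit(url).query)
print(decoded)
# → {'e': ['pv'], 'page': ['Home page']}
```

Because all of the data travels in the query string, the collector itself needs no application logic: logging the request is enough to capture the event.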

Advantages of the Cloudfront Collector

  1. Simple and robust (no moving parts). All the collector does is faithfully log GET requests from trackers. Because logging is done using the standard Amazon Cloudfront logging, it is incredibly reliable.
  2. Scalable. The Cloudfront collector is powered by Amazon's cloud infrastructure: specifically its content delivery network, which is built to handle billions of requests per day.

Setting up the Cloudfront collector: an overview

Pre-requisites

This guide assumes you have:

Setting up the Cloudfront collector is a five-stage process:

  1. Set up a bucket on Amazon S3 for the 1x1 tracking pixel, i. This is the pixel that will be requested by every GET made by the Snowplow tracker.
  2. Upload the tracking pixel to the bucket.
  3. Create a bucket on S3 for the Snowplow logs generated by the Cloudfront collector.
  4. Create a Cloudfront distribution for serving the tracking pixel that is now stored in S3. This ensures that the pixel is fetched very quickly (using Cloudfront's CDN) and, crucially, lets us use Cloudfront logging to record every request made for the tracking pixel. These requests contain all the data passed to the collector from the tracker, appended to the GET request in the form of a query string.
  5. Test your tracking pixel on Cloudfront.
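Once requests are being logged, the data can be read back out of the log files. The following Python sketch assumes a tab-separated log with a `#Fields:` header line that names a `cs-uri-query` column, as in Cloudfront's standard access log format. The sample log lines are invented for illustration, and real logs may need further decoding than shown here.

```python
from urllib.parse import parse_qs


def parse_cloudfront_log(lines):
    """Yield the decoded query-string parameters of each logged request."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # The header names the columns, so we do not hard-code their order
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and other comment lines
        if not fields:
            raise ValueError("log record found before the #Fields header")
        row = dict(zip(fields, line.split("\t")))
        query = row.get("cs-uri-query", "-")
        # A lone "-" marks an empty value in the log
        yield {} if query == "-" else parse_qs(query)


# Invented sample data, for illustration only
sample_log = [
    "#Version: 1.0",
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) "
    "cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query",
    "\t".join([
        "2013-08-02", "12:00:00", "LHR5", "480", "192.0.2.1", "GET",
        "d1234example.cloudfront.net", "/i", "200", "-", "Mozilla/5.0",
        "e=pv&page=Home%20page",
    ]),
]

events = list(parse_cloudfront_log(sample_log))
print(events)
# → [{'e': ['pv'], 'page': ['Home page']}]
```

Reading the column order from the header rather than hard-coding positions keeps the sketch usable if the log format gains extra fields.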

Note: We recommend running all Snowplow AWS operations through an IAM user with the bare minimum permissions required to run Snowplow. Please see our IAM user setup page for more information on doing this.

Return to the setup guide.


