
File Uploads

Foo Yong Jie edited this page May 20, 2020 · 5 revisions

In the near future, GoGovSG will release a feature that allows users to upload files and share them via shortlinks. This page documents the design decisions and thought process behind the implementation of this feature.

Constraints

We took the following constraints into account when designing our implementation of this feature.

  1. One-to-one mapping between a shortlink and an S3 bucket's object key - This lets us tell at a glance which link an object belongs to. It also gives us another guarantee: if a particular shortlink has not been taken yet, the corresponding S3 object key is also available.
  2. Deletion of shortlinks is not allowed.

S3 configuration

The S3 bucket was originally configured with a bucket-wide public-read policy. This was in alignment with the philosophy of GoGovSG being a public link shortener. However, if we wish to allow file URLs to be disabled, we need to be able to make individual S3 objects private. This can only be done by setting an object's access control list (ACL). A bucket-wide public-read policy grants read access to every object regardless of its ACL, so a private ACL on a single object would have no effect under the original configuration. This necessitated a switch in our configuration: we now have a bucket policy under which all objects are private by default, and each object needs the 'public-read' ACL set in order to be publicly visible.
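As a rough illustration of the per-object ACL approach, the sketch below builds the parameters for an S3 PutObjectAcl request. The `Bucket`, `Key` and `ACL` fields and the 'public-read' and 'private' values come from S3 itself; the helper name `aclParamsFor`, the `enabled` flag and the example bucket and key names are assumptions made for illustration, not GoGovSG's actual code.

```typescript
// The two canned ACL values relevant to this page.
type ObjectAcl = 'public-read' | 'private'

// Mirrors the Bucket/Key/ACL parameters of S3's PutObjectAcl operation.
interface PutObjectAclParams {
  Bucket: string
  Key: string
  ACL: ObjectAcl
}

// Hypothetical helper: pick the ACL from whether the file link is enabled.
function aclParamsFor(
  bucket: string,
  key: string,
  enabled: boolean,
): PutObjectAclParams {
  return { Bucket: bucket, Key: key, ACL: enabled ? 'public-read' : 'private' }
}

// Disabling a file link would then mean sending these parameters to S3.
const disableParams = aclParamsFor('example-bucket', 'annual-report', false)
```

The resulting object could be passed to whichever S3 client the server uses. Since objects are private by default, a failed ACL update leaves a file hidden rather than exposed.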

Phases in a file upload

In light of the constraints and S3 configuration, the file upload process is split into three operations.

  1. Creation of the shortUrl - This serves as a way for us to 'reserve' both the shortlink and the S3 bucket key. If this operation fails, there might be a collision in the bucket key, so we should not perform the upload operation.
  2. Upload file to S3 - In this step, the client could either obtain a pre-signed URL to upload the file directly to S3, or send the file to the server and have it forwarded to the bucket.
  3. Set the object's ACL to 'public-read'.
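The three operations above can be sketched as a single function. This is a minimal sketch under stated assumptions, not the actual GoGovSG implementation: the `LinkStore` and `FileStore` interfaces and the name `createFileLink` are hypothetical, the object key is assumed to be the shortlink itself, and the server-forwarded upload variant is used rather than a pre-signed URL.

```typescript
// Hypothetical interface over the shortlink database.
interface LinkStore {
  // Resolves to false if the shortlink is already taken.
  reserve(shortUrl: string): Promise<boolean>
}

// Hypothetical interface over the S3 bucket.
interface FileStore {
  upload(key: string, body: Uint8Array): Promise<void>
  setAcl(key: string, acl: 'public-read' | 'private'): Promise<void>
}

type UploadResult = { ok: true; key: string } | { ok: false; reason: string }

async function createFileLink(
  links: LinkStore,
  files: FileStore,
  shortUrl: string,
  body: Uint8Array,
): Promise<UploadResult> {
  // Phase 1: reserve the shortlink, which also reserves the object key.
  const reserved = await links.reserve(shortUrl)
  if (!reserved) {
    // The key may already be in use, so the upload must not run.
    return { ok: false, reason: 'shortlink already taken' }
  }

  // Assumption: the object key is identical to the shortlink.
  const key = shortUrl

  // Phase 2: upload the file (server-forwarded variant).
  await files.upload(key, body)

  // Phase 3: make the object publicly readable.
  await files.setAcl(key, 'public-read')

  return { ok: true, key }
}
```

The sketch does not handle a failure in phase 2 or 3. In that case the reserved shortlink would be left without a public object.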

Ensuring atomicity

Because this upload operation spans multiple services, it needs a guarantee of atomicity. We would not want shortlinks pointing to nonexistent S3 objects, nor orphaned S3 objects that do not belong to any shortlink.
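To make the two failure states concrete, the sketch below checks a list of file shortlinks against a list of S3 object keys and reports both kinds of inconsistency. It only detects violations and does not provide atomicity, and it is not the mechanism this page goes on to describe. The name `checkConsistency` and the assumption that an object key equals its shortlink are illustrative.

```typescript
interface ConsistencyReport {
  danglingLinks: string[] // shortlinks with no matching S3 object
  orphanedKeys: string[] // S3 objects with no matching shortlink
}

// Assumption: an object's key is identical to its shortlink.
function checkConsistency(
  fileShortUrls: string[],
  objectKeys: string[],
): ConsistencyReport {
  const links = new Set(fileShortUrls)
  const keys = new Set(objectKeys)
  return {
    danglingLinks: fileShortUrls.filter((shortUrl) => !keys.has(shortUrl)),
    orphanedKeys: objectKeys.filter((key) => !links.has(key)),
  }
}
```

With an atomic upload, both lists would be empty once every upload has either completed or failed.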

To be Continued
