diff --git a/README.md b/README.md
index 5fe7715745..2bca6e9e75 100644
--- a/README.md
+++ b/README.md
@@ -2,32 +2,18 @@
Cortex is an open source platform for deploying machine learning models—trained with nearly any framework—as production web services.
-
-
-
![Demo](https://d1zqebknpdh033.cloudfront.net/demo/gif/v0.8.gif)
-
-
## Key features
-- **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
-
-- **Multi framework:** Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
-
-- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
-
-- **Spot instances:** Cortex supports EC2 spot instances.
-
-- **Rolling updates:** Cortex updates deployed APIs without any downtime.
-
-- **Log streaming:** Cortex streams logs from deployed models to your CLI.
-
-- **Prediction monitoring:** Cortex monitors network metrics and tracks predictions.
-
-- **Minimal configuration:** Deployments are defined in a single `cortex.yaml` file.
-
-
+* **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
+* **Multi framework:** Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
+* **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
+* **Spot instances:** Cortex supports EC2 spot instances.
+* **Rolling updates:** Cortex updates deployed APIs without any downtime.
+* **Log streaming:** Cortex streams logs from deployed models to your CLI.
+* **Prediction monitoring:** Cortex monitors network metrics and tracks predictions.
+* **Minimal configuration:** Deployments are defined in a single `cortex.yaml` file.
## Usage
@@ -92,19 +78,15 @@ positive 8
negative 4
```
-
-
## How it works
-The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.
-
-
+The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing \(ELB\), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service \(EKS\) while logs and metrics are streamed to CloudWatch.
## Examples
-
-- [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/sentiment-analyzer) in TensorFlow with BERT
-- [Image classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/image-classifier) in TensorFlow with Inception
-- [Text generation](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/text-generator) in PyTorch with DistilGPT2
-- [Reading comprehension](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/reading-comprehender) in PyTorch with ELMo-BiDAF
-- [Iris classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/sklearn/iris-classifier) in scikit-learn
+* [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/sentiment-analyzer) in TensorFlow with BERT
+* [Image classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/image-classifier) in TensorFlow with Inception
+* [Text generation](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/text-generator) in PyTorch with DistilGPT2
+* [Reading comprehension](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/reading-comprehender) in PyTorch with ELMo-BiDAF
+* [Iris classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/sklearn/iris-classifier) in scikit-learn
+
diff --git a/docs/cluster/aws.md b/docs/cluster-management/aws.md
similarity index 78%
rename from docs/cluster/aws.md
rename to docs/cluster-management/aws.md
index b935cdd573..cc9d935910 100644
--- a/docs/cluster/aws.md
+++ b/docs/cluster-management/aws.md
@@ -2,4 +2,5 @@
As of now, Cortex only runs on AWS. We plan to support other cloud providers in the future. If you don't have an AWS account you can get started with one [here](https://portal.aws.amazon.com/billing/signup#/start).
-Follow this [tutorial](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key) to create an access key. Enable programmatic access for the IAM user, and attach the built-in `AdministratorAccess` policy to your IAM user (or see [security](security.md) for a minimal access configuration).
+Follow this [tutorial](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key) to create an access key. Enable programmatic access for the IAM user, and attach the built-in `AdministratorAccess` policy to your IAM user \(or see [security](security.md) for a minimal access configuration\).
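+
+For example, one common way to make the access key available to the Cortex CLI and the AWS CLI is via environment variables \(a sketch with placeholder values; substitute your own keys\):
+
+```bash
+export AWS_ACCESS_KEY_ID=***        # your access key ID
+export AWS_SECRET_ACCESS_KEY=***    # your secret access key
+```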
+
diff --git a/docs/cluster/config.md b/docs/cluster-management/config.md
similarity index 93%
rename from docs/cluster/config.md
rename to docs/cluster-management/config.md
index 57b85d8013..ca810a809f 100644
--- a/docs/cluster/config.md
+++ b/docs/cluster-management/config.md
@@ -1,8 +1,6 @@
# Cluster configuration
-The Cortex cluster may be configured by providing a configuration file to `cortex cluster up` or `cortex cluster update` via the `--config` flag (e.g. `cortex cluster up --config=cluster.yaml`). Below is the schema for the cluster configuration file, with default values shown (unless otherwise specified):
-
-
+The Cortex cluster may be configured by providing a configuration file to `cortex cluster up` or `cortex cluster update` via the `--config` flag \(e.g. `cortex cluster up --config=cluster.yaml`\). Below is the schema for the cluster configuration file, with default values shown \(unless otherwise specified\):
```yaml
# cluster.yaml
@@ -83,3 +81,4 @@ image_istio_pilot: cortexlabs/istio-pilot:0.11.0
image_istio_citadel: cortexlabs/istio-citadel:0.11.0
image_istio_galley: cortexlabs/istio-galley:0.11.0
```
+
diff --git a/docs/cluster/security.md b/docs/cluster-management/security.md
similarity index 86%
rename from docs/cluster/security.md
rename to docs/cluster-management/security.md
index 451c1b01c3..3b1c19598c 100644
--- a/docs/cluster/security.md
+++ b/docs/cluster-management/security.md
@@ -8,7 +8,7 @@ If you are not using a sensitive AWS account and do not have a lot of experience
The operator requires read permissions for any S3 bucket containing exported models, read and write permissions for the Cortex S3 bucket, read and write permissions for the Cortex CloudWatch log group, and read and write permissions for CloudWatch metrics. The policy below may be used to restrict the Operator's access:
-```json
+```javascript
{
"Version": "2012-10-17",
"Statement": [
@@ -43,8 +43,9 @@ In order to connect to the operator via the CLI, you must provide valid AWS cred
## API access
-By default, your Cortex APIs will be accessible to all traffic. You can restrict access using AWS security groups. Specifically, you will need to edit the security group with the description: "Security group for Kubernetes ELB (istio-system/apis-ingressgateway)".
+By default, your Cortex APIs will be accessible to all traffic. You can restrict access using AWS security groups. Specifically, you will need to edit the security group with the description: "Security group for Kubernetes ELB \(istio-system/apis-ingressgateway\)".
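+
+For example, you could locate that security group with the AWS CLI and then limit inbound traffic to a trusted CIDR range \(a sketch with placeholder values; the group ID, port, and CIDR below are illustrative\):
+
+```bash
+# find the security group by its description
+aws ec2 describe-security-groups \
+  --filters "Name=description,Values=Security group for Kubernetes ELB (istio-system/apis-ingressgateway)" \
+  --query "SecurityGroups[*].GroupId"
+
+# remove the rule that allows traffic from anywhere (if present)
+aws ec2 revoke-security-group-ingress \
+  --group-id sg-0123456789abcdef0 --protocol tcp --port 443 --cidr 0.0.0.0/0
+
+# allow traffic from your network only
+aws ec2 authorize-security-group-ingress \
+  --group-id sg-0123456789abcdef0 --protocol tcp --port 443 --cidr 203.0.113.0/24
+```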
## HTTPS
-All APIs are accessible via HTTPS. The certificate is autogenerated during installation using `localhost` as the Common Name (CN). Therefore, clients will need to skip certificate verification (e.g. `curl -k`) when using HTTPS.
+All APIs are accessible via HTTPS. The certificate is autogenerated during installation using `localhost` as the Common Name \(CN\). Therefore, clients will need to skip certificate verification \(e.g. `curl -k`\) when using HTTPS.
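+
+For example \(hypothetical endpoint and payload\):
+
+```bash
+curl -k -X POST -H "Content-Type: application/json" \
+  -d '{ "key": "value" }' \
+  https://***.amazonaws.com/my-deployment/my-api
+```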
+
diff --git a/docs/cluster/uninstall.md b/docs/cluster-management/uninstall.md
similarity index 96%
rename from docs/cluster/uninstall.md
rename to docs/cluster-management/uninstall.md
index fcaa0bdfa7..4c32bd3778 100644
--- a/docs/cluster/uninstall.md
+++ b/docs/cluster-management/uninstall.md
@@ -4,7 +4,7 @@
1. [AWS credentials](aws.md)
2. [Docker](https://docs.docker.com/install)
-3. [Cortex CLI](install.md)
+3. [Cortex CLI](../install.md)
4. [AWS CLI](https://aws.amazon.com/cli)
## Uninstalling Cortex
@@ -34,3 +34,4 @@ aws s3 rb --force s3://
# delete the log group
aws logs describe-log-groups --log-group-name-prefix= --query logGroups[*].[logGroupName] --output text | xargs -I {} aws logs delete-log-group --log-group-name {}
```
+
diff --git a/docs/cluster/update.md b/docs/cluster-management/update.md
similarity index 94%
rename from docs/cluster/update.md
rename to docs/cluster-management/update.md
index a38c9535a8..56ac9bb2c8 100644
--- a/docs/cluster/update.md
+++ b/docs/cluster-management/update.md
@@ -15,8 +15,6 @@ cortex cluster update
## Upgrading to a newer version of Cortex
-
-
```bash
# spin down your cluster
cortex cluster down
@@ -30,3 +28,4 @@ cortex version
# spin up your cluster
cortex cluster up
```
+
diff --git a/docs/development.md b/docs/contributing/development.md
similarity index 86%
rename from docs/development.md
rename to docs/contributing/development.md
index 68492d33f5..459a840843 100644
--- a/docs/development.md
+++ b/docs/contributing/development.md
@@ -1,11 +1,11 @@
-# Development Environment
+# Development
## Prerequisites
-1. Go (>=1.12.9)
-1. Docker
-1. eksctl
-1. kubectl
+1. Go \(>=1.12.9\)
+2. Docker
+3. eksctl
+4. kubectl
## Cortex Dev Environment
@@ -135,23 +135,24 @@ path/to/cortex/bin/cortex deploy
If you're making changes in the operator and want faster iterations, you can run an off-cluster operator.
1. `make operator-stop` to stop the in-cluster operator
-1. `make devstart` to run the off-cluster operator (which rebuilds the CLI and restarts the Operator when files change)
-1. `path/to/cortex/bin/cortex configure` (on a separate terminal) to configure your cortex CLI to use the off-cluster operator. When prompted for operator URL, use `http://localhost:8888`
+2. `make devstart` to run the off-cluster operator \(which rebuilds the CLI and restarts the Operator when files change\)
+3. `path/to/cortex/bin/cortex configure` \(in a separate terminal\) to configure your cortex CLI to use the off-cluster operator. When prompted for operator URL, use `http://localhost:8888`
Note: `make cortex-up-dev` will start Cortex without installing the operator.
If you want to switch back to the in-cluster operator:
1. `` to stop your off-cluster operator
-1. `make operator-start` to install the operator in your cluster
-1. `path/to/cortex/bin/cortex configure` to configure your cortex CLI to use the in-cluster operator. When prompted for operator URL, use the URL shown when running `make cortex-info`
+2. `make operator-start` to install the operator in your cluster
+3. `path/to/cortex/bin/cortex configure` to configure your cortex CLI to use the in-cluster operator. When prompted for operator URL, use the URL shown when running `make cortex-info`
## Dev Workflow
1. `make cortex-up-dev`
-1. `make devstart`
-1. Make changes
-1. `make registry-dev`
-1. Test your changes with projects in `examples` or your own
+2. `make devstart`
+3. Make changes
+4. `make registry-dev`
+5. Test your changes with projects in `examples` or your own
See `Makefile` for additional dev commands
+
diff --git a/docs/dependencies/python-packages.md b/docs/dependency-management/python-packages.md
similarity index 70%
rename from docs/dependencies/python-packages.md
rename to docs/dependency-management/python-packages.md
index 977002b354..0007211467 100644
--- a/docs/dependencies/python-packages.md
+++ b/docs/dependency-management/python-packages.md
@@ -2,7 +2,7 @@
## PyPI packages
-You can install your required PyPI packages and import them in your Python files. Cortex looks for a `requirements.txt` file in the top level Cortex project directory (i.e. the directory which contains `cortex.yaml`):
+You can install your required PyPI packages and import them in your Python files. Cortex looks for a `requirements.txt` file in the top level Cortex project directory \(i.e. the directory which contains `cortex.yaml`\):
```text
./iris-classifier/
@@ -12,7 +12,7 @@ You can install your required PyPI packages and import them in your Python files
└── requirements.txt
```
-Note that some packages are pre-installed by default (see [predictor](../deployments/predictor.md) or [request handlers](../deployments/request-handlers.md) depending on which runtime you're using).
+Note that some packages are pre-installed by default \(see [predictor](../deployments/predictor.md) or [request handlers](../deployments/request-handlers.md) depending on which runtime you're using\).
## Private packages on GitHub
@@ -28,7 +28,7 @@ You can generate a personal access token by following [these steps](https://help
## Project files
-Cortex makes all files in the project directory (i.e. the directory which contains `cortex.yaml`) available to request handlers. Python bytecode files (`*.pyc`, `*.pyo`, `*.pyd`), files or folders that start with `.`, and `cortex.yaml` are excluded.
+Cortex makes all files in the project directory \(i.e. the directory which contains `cortex.yaml`\) available to request handlers. Python bytecode files \(`*.pyc`, `*.pyo`, `*.pyd`\), files or folders that start with `.`, and `cortex.yaml` are excluded.
The contents of the project directory are available in `/mnt/project/` in the API containers. For example, if this is your project directory:
@@ -53,3 +53,4 @@ def pre_inference(sample, signature, metadata):
print(config)
...
```
+
diff --git a/docs/dependencies/system-packages.md b/docs/dependency-management/system-packages.md
similarity index 91%
rename from docs/dependencies/system-packages.md
rename to docs/dependency-management/system-packages.md
index fe7423b60e..aff25ebadf 100644
--- a/docs/dependencies/system-packages.md
+++ b/docs/dependency-management/system-packages.md
@@ -1,8 +1,8 @@
# System packages
-Cortex uses Docker images to deploy your models. These images can be replaced with custom images that you can augment with your system packages and libraries. You will need to push your custom images to a container registry that your cluster has access to (e.g. [Docker Hub](https://hub.docker.com/) or [AWS ECR](https://aws.amazon.com/ecr/)).
+Cortex uses Docker images to deploy your models. These images can be replaced with custom images that you can augment with your system packages and libraries. You will need to push your custom images to a container registry that your cluster has access to \(e.g. [Docker Hub](https://hub.docker.com/) or [AWS ECR](https://aws.amazon.com/ecr/)\).
-See the `image paths` section in [cluster configuration](../cluster/config.md) for a complete list of customizable images.
+See the `image paths` section in [cluster configuration](../cluster-management/config.md) for a complete list of customizable images.
## Create a custom image
@@ -14,7 +14,7 @@ mkdir my-api && cd my-api && touch Dockerfile
Specify the base image you want to override followed by your customizations. The sample Dockerfile below inherits from Cortex's Python serving image and installs the `tree` system package.
-```dockerfile
+```text
# Dockerfile
FROM cortexlabs/predictor-serve
@@ -79,3 +79,4 @@ def predict(sample, metadata):
subprocess.run(["tree"])
...
```
+
diff --git a/docs/deployments/autoscaling.md b/docs/deployments/autoscaling.md
index e4772be01c..4ddecfdb80 100644
--- a/docs/deployments/autoscaling.md
+++ b/docs/deployments/autoscaling.md
@@ -8,4 +8,5 @@ Cortex adjusts the number of replicas that are serving predictions by monitoring
## Autoscaling Nodes
-Cortex spins up and down nodes based on the aggregate resource requests of all APIs. The number of nodes will be at least `min_instances` and no more than `max_instances` (configured during installation and modifiable via `cortex cluster update` or the [AWS console](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-manual-scaling.html)).
+Cortex spins up and down nodes based on the aggregate resource requests of all APIs. The number of nodes will be at least `min_instances` and no more than `max_instances` \(configured during installation and modifiable via `cortex cluster update` or the [AWS console](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-manual-scaling.html)\).
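+
+For example, both values can be set in the cluster configuration file passed to `cortex cluster up` or `cortex cluster update` \(illustrative values; see [cluster configuration](../cluster-management/config.md) for the full schema\):
+
+```yaml
+# cluster.yaml (excerpt)
+min_instances: 2
+max_instances: 10
+```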
+
diff --git a/docs/cluster/cli.md b/docs/deployments/cli.md
similarity index 99%
rename from docs/cluster/cli.md
rename to docs/deployments/cli.md
index da2e7100e6..7ebd82f849 100644
--- a/docs/cluster/cli.md
+++ b/docs/deployments/cli.md
@@ -186,3 +186,4 @@ Usage:
Flags:
-h, --help help for completion
```
+
diff --git a/docs/deployments/compute.md b/docs/deployments/compute.md
index 296810dd29..433148b82b 100644
--- a/docs/deployments/compute.md
+++ b/docs/deployments/compute.md
@@ -11,22 +11,22 @@ For example:
cpu: 1
gpu: 1
mem: 1G
-
```
-CPU, GPU, and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the API will only be scheduled once 1 CPU, 1GPU, and 1G of memory are available on any instance, and the deployment will be guaranteed to have access to those resources throughout its execution. In some cases, resource requests can be (or may default to) `Null`.
+CPU, GPU, and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the API will only be scheduled once 1 CPU, 1 GPU, and 1G of memory are available on any instance, and the deployment will be guaranteed to have access to those resources throughout its execution. In some cases, resource requests can be \(or may default to\) `Null`.
## CPU
-One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix (`0.2` and `200m` are equivalent).
+One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix \(`0.2` and `200m` are equivalent\).
## Memory
-One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` (or their power-of two counterparts: `Ki`, `Mi`, `Gi`, `Ti`). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.
+One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` \(or their power-of-two counterparts: `Ki`, `Mi`, `Gi`, `Ti`\). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.
## GPU
1. Make sure your AWS account is subscribed to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM).
2. You may need to [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type.
-3. Set instance type to an AWS GPU instance (e.g. p2.xlarge) when installing Cortex.
+3. Set instance type to an AWS GPU instance \(e.g. p2.xlarge\) when installing Cortex.
4. Note that one unit of GPU corresponds to one virtual GPU on AWS. Fractional requests are not allowed.
+
diff --git a/docs/deployments/deployments.md b/docs/deployments/deployments.md
index 4179ce271a..d56e3c908f 100644
--- a/docs/deployments/deployments.md
+++ b/docs/deployments/deployments.md
@@ -15,3 +15,4 @@ Deployments are used to group a set of APIs that are deployed together. It must
- kind: deployment
name: my_deployment
```
+
diff --git a/docs/deployments/onnx.md b/docs/deployments/onnx.md
index 0dac8e3c6e..2e25097d45 100644
--- a/docs/deployments/onnx.md
+++ b/docs/deployments/onnx.md
@@ -26,7 +26,7 @@ Deploy ONNX models as web services.
mem: # memory request per replica (default: Null)
```
-See [packaging ONNX models](../packaging/onnx.md) for information about exporting ONNX models.
+See [packaging ONNX models](../packaging-models/onnx.md) for information about exporting ONNX models.
## Example
@@ -45,6 +45,7 @@ See [packaging ONNX models](../packaging/onnx.md) for information about exportin
You can log information about each request by adding a `?debug=true` parameter to your requests. This will print:
1. The raw sample
-2. The value after running the `pre_inference` function (if provided)
+2. The value after running the `pre_inference` function \(if provided\)
3. The value after running inference
-4. The value after running the `post_inference` function (if provided)
+4. The value after running the `post_inference` function \(if provided\)
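+
+For example, append the parameter to your request and stream the output with `cortex logs` \(hypothetical names\):
+
+```bash
+curl -X POST -H "Content-Type: application/json" \
+  -d '{ "key": "value" }' \
+  "https://***.amazonaws.com/my-deployment/my-api?debug=true"
+
+cortex logs my-api
+```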
+
diff --git a/docs/deployments/prediction-monitoring.md b/docs/deployments/prediction-monitoring.md
index b22a9c6067..5ab04a7a38 100644
--- a/docs/deployments/prediction-monitoring.md
+++ b/docs/deployments/prediction-monitoring.md
@@ -24,3 +24,4 @@ For classification models, the tracker should be configured with `model_type: cl
tracker:
model_type: classification
```
+
diff --git a/docs/deployments/predictor.md b/docs/deployments/predictor.md
index 34f78f11f6..4ae017e206 100644
--- a/docs/deployments/predictor.md
+++ b/docs/deployments/predictor.md
@@ -1,13 +1,15 @@
# Predictor APIs
+## Predictor APIs
+
You can deploy models from any Python framework by implementing Cortex's Predictor interface. The interface consists of an `init()` function and a `predict()` function. The `init()` function is responsible for preparing the model for serving, downloading vocabulary files, etc. The `predict()` function is called on every request and is responsible for responding with a prediction.
In addition to supporting Python models via the Predictor interface, Cortex can serve the following exported model formats:
-- [TensorFlow](tensorflow.md)
-- [ONNX](onnx.md)
+* [TensorFlow](tensorflow.md)
+* [ONNX](onnx.md)
-## Configuration
+### Configuration
```yaml
- kind: api
@@ -31,7 +33,7 @@ In addition to supporting Python models via the Predictor interface, Cortex can
mem: # memory request per replica (default: Null)
```
-### Example
+#### Example
```yaml
- kind: api
@@ -42,22 +44,22 @@ In addition to supporting Python models via the Predictor interface, Cortex can
gpu: 1
```
-## Debugging
+### Debugging
You can log information about each request by adding a `?debug=true` parameter to your requests. This will print:
1. The raw sample
2. The value after running the `predict` function
-# Predictor
+## Predictor
A Predictor is a Python file that describes how to initialize a model and use it to make a prediction.
-The lifecycle of a replica running a Predictor starts with loading the implementation file and executing code in the global scope. Once the implementation is loaded, Cortex calls the `init()` function to allow for any additional preparations. The `init()` function is typically used to download and initialize the model. It receives the metadata object, which is an arbitrary dictionary defined in the API configuration (it can be used to pass in the path to the exported/pickled model, vocabularies, aggregates, etc). Once the `init()` function is executed, the replica is available to accept requests. Upon receiving a request, the replica calls the `predict()` function with the JSON payload and the metadata object. The `predict()` function is responsible for returning a prediction from a sample.
+The lifecycle of a replica running a Predictor starts with loading the implementation file and executing code in the global scope. Once the implementation is loaded, Cortex calls the `init()` function to allow for any additional preparations. The `init()` function is typically used to download and initialize the model. It receives the metadata object, which is an arbitrary dictionary defined in the API configuration \(it can be used to pass in the path to the exported/pickled model, vocabularies, aggregates, etc\). Once the `init()` function is executed, the replica is available to accept requests. Upon receiving a request, the replica calls the `predict()` function with the JSON payload and the metadata object. The `predict()` function is responsible for returning a prediction from a sample.
Global variables can be shared across functions safely because each replica handles one request at a time.
-## Implementation
+### Implementation
```python
# initialization code and variables can be declared here in global scope
@@ -87,7 +89,7 @@ def predict(sample, metadata):
"""
```
-## Example
+### Example
```python
import boto3
@@ -123,7 +125,7 @@ def predict(sample, metadata):
return labels[torch.argmax(output[0])]
```
-## Pre-installed packages
+### Pre-installed packages
The following packages have been pre-installed and can be used in your implementations:
@@ -154,4 +156,5 @@ torchvision==0.4.2
xgboost==0.90
```
-Learn how to install additional packages [here](../dependencies/python-packages.md).
+Learn how to install additional packages [here](../dependency-management/python-packages.md).
+
diff --git a/docs/deployments/python-client.md b/docs/deployments/python-client.md
index 7b5bfbf758..781d54e888 100644
--- a/docs/deployments/python-client.md
+++ b/docs/deployments/python-client.md
@@ -2,7 +2,6 @@
The Python client can be used to programmatically deploy models to a Cortex Cluster.
-
```bash
pip install git+https://github.com/cortexlabs/cortex.git@v0.11.0#egg=cortex\&subdirectory=pkg/workloads/cortex/client
```
@@ -43,3 +42,4 @@ sample = {
resp = requests.post(api_url, json=sample)
resp.json()
```
+
diff --git a/docs/deployments/request-handlers.md b/docs/deployments/request-handlers.md
index 55931a2853..fb01d1d6f7 100644
--- a/docs/deployments/request-handlers.md
+++ b/docs/deployments/request-handlers.md
@@ -84,4 +84,5 @@ tensorflow-hub==0.7.0 # TensorFlow runtime only
tensorflow==2.0.0 # TensorFlow runtime only
```
-Learn how to install additional packages [here](../dependencies/python-packages.md).
+Learn how to install additional packages [here](../dependency-management/python-packages.md).
+
diff --git a/docs/deployments/statuses.md b/docs/deployments/statuses.md
index 261243c4f2..18fd7938af 100644
--- a/docs/deployments/statuses.md
+++ b/docs/deployments/statuses.md
@@ -1,12 +1,13 @@
# API statuses
-| Status | Meaning |
-|-----------------------|---|
-| live | API is deployed and ready to serve prediction requests (at least one replica is running) |
-| pending | API is pending |
-| creating | API is being created |
-| stopping | API is stopping |
-| stopped | API is stopped |
-| error | API was not created due to an error; run `cortex logs ` to view the logs |
-| error (out of memory) | API was terminated due to excessive memory usage; try allocating more memory to the API and re-deploying |
-| compute unavailable | API could not start due to insufficient memory, CPU, or GPU in the cluster; some replicas may be ready |
+| Status | Meaning |
+| :--- | :--- |
+| live | API is deployed and ready to serve prediction requests \(at least one replica is running\) |
+| pending | API is pending |
+| creating | API is being created |
+| stopping | API is stopping |
+| stopped | API is stopped |
+| error | API was not created due to an error; run `cortex logs ` to view the logs |
+| error \(out of memory\) | API was terminated due to excessive memory usage; try allocating more memory to the API and re-deploying |
+| compute unavailable | API could not start due to insufficient memory, CPU, or GPU in the cluster; some replicas may be ready |
+
diff --git a/docs/deployments/tensorflow.md b/docs/deployments/tensorflow.md
index 723d3e243a..81e54d6748 100644
--- a/docs/deployments/tensorflow.md
+++ b/docs/deployments/tensorflow.md
@@ -27,7 +27,7 @@ Deploy TensorFlow models as web services.
mem: # memory request per replica (default: Null)
```
-See [packaging TensorFlow models](../packaging/tensorflow.md) for how to export a TensorFlow model.
+See [packaging TensorFlow models](../packaging-models/tensorflow.md) for how to export a TensorFlow model.
## Example
@@ -46,6 +46,7 @@ See [packaging TensorFlow models](../packaging/tensorflow.md) for how to export
You can log information about each request by adding a `?debug=true` parameter to your requests. This will print:
1. The raw sample
-2. The value after running the `pre_inference` function (if provided)
+2. The value after running the `pre_inference` function \(if provided\)
3. The value after running inference
-4. The value after running the `post_inference` function (if provided)
+4. The value after running the `post_inference` function \(if provided\)
+
diff --git a/docs/cluster/install.md b/docs/install.md
similarity index 62%
rename from docs/cluster/install.md
rename to docs/install.md
index dc4d0ae620..247e34c222 100644
--- a/docs/cluster/install.md
+++ b/docs/install.md
@@ -3,13 +3,12 @@
## Prerequisites
1. [Docker](https://docs.docker.com/install)
-2. [AWS credentials](aws.md)
+2. [AWS credentials](cluster-management/aws.md)
## Installation
-See [cluster configuration](config.md) to learn how you can customize your cluster.
+See [cluster configuration](cluster-management/config.md) to learn how you can customize your cluster.
-
```bash
# install the Cortex CLI on your machine
bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.11/get-cli.sh)"
@@ -18,12 +17,10 @@ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.11/get
cortex cluster up
```
-Note: This will create resources in your AWS account which aren't included in the free tier, e.g. an EKS cluster, two Elastic Load Balancers, and EC2 instances (quantity and type as specified above). To use GPU nodes, you may need to subscribe to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM) and [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type.
+Note: This will create resources in your AWS account which aren't included in the free tier, e.g. an EKS cluster, two Elastic Load Balancers, and EC2 instances \(quantity and type as specified above\). To use GPU nodes, you may need to subscribe to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM) and [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type.
## Deploy a model
-
-
```bash
# clone the Cortex repository
git clone -b 0.11 https://github.com/cortexlabs/cortex.git
@@ -53,4 +50,5 @@ curl -X POST -H "Content-Type: application/json" \
cortex delete iris
```
-See [uninstall](uninstall.md) if you'd like to uninstall Cortex.
+See [uninstall](cluster-management/uninstall.md) if you'd like to uninstall Cortex.
+
diff --git a/examples/sklearn/iris-classifier/README.md b/docs/iris-classifier.md
similarity index 93%
rename from examples/sklearn/iris-classifier/README.md
rename to docs/iris-classifier.md
index 91151b7606..590dbba342 100644
--- a/examples/sklearn/iris-classifier/README.md
+++ b/docs/iris-classifier.md
@@ -1,15 +1,13 @@
-# Deploy a scikit-learn model as a web service
+# Tutorial
This example shows how to deploy a classifier trained on the famous [iris data set](https://archive.ics.uci.edu/ml/datasets/iris) using scikit-learn.
-
-
## Train your model
1. Create a Python file `trainer.py`.
2. Use scikit-learn's `LogisticRegression` to train your model.
-3. Add code to pickle your model (you can use other serialization libraries such as joblib).
-4. Upload it to S3 (boto3 will need access to valid AWS credentials).
+3. Add code to pickle your model \(you can use other serialization libraries such as joblib\).
+4. Upload it to S3 \(boto3 will need access to valid AWS credentials\).
```python
import boto3
@@ -46,8 +44,6 @@ $ pip3 install sklearn boto3
$ python3 trainer.py
```
-
-
## Implement a predictor
1. Create another Python file `predictor.py`.
@@ -84,8 +80,6 @@ def predict(sample, metadata):
return labels[label_id]
```
-
-
## Specify Python dependencies
Create a `requirements.txt` file to specify the dependencies needed by `predictor.py`. Cortex will automatically install them into your runtime once you deploy:
@@ -96,9 +90,7 @@ Create a `requirements.txt` file to specify the dependencies needed by `predicto
numpy
```
-You can skip dependencies that are [pre-installed](../../../docs/deployments/predictor.md#pre-installed-packages) to speed up the deployment process. Note that `pickle` is part of the Python standard library so it doesn't need to be included.
-
-
+You can skip dependencies that are [pre-installed](deployments/predictor.md#pre-installed-packages) to speed up the deployment process. Note that `pickle` is part of the Python standard library so it doesn't need to be included.
## Configure a deployment
@@ -117,8 +109,6 @@ Create a `cortex.yaml` file and add the configuration below. A `deployment` spec
model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
```
-
-
## Deploy to AWS
`cortex deploy` takes the declarative configuration from `cortex.yaml` and creates it on your Cortex cluster:
@@ -142,8 +132,6 @@ endpoint: http://***.amazonaws.com/iris/classifier
The output above indicates that one replica of the API was requested and is available to serve predictions. Cortex will automatically launch more replicas if the load increases and spin down replicas if there is unused capacity.
-
-
## Serve real-time predictions
We can use `curl` to test our prediction service:
@@ -156,8 +144,6 @@ $ curl http://***.amazonaws.com/iris/classifier \
"iris-setosa"
```
-
-
## Configure prediction tracking
Add a `tracker` to your `cortex.yaml` and specify that this is a classification model:
@@ -197,8 +183,6 @@ positive 8
negative 4
```
-
-
## Configure compute resources
This model is fairly small but larger models may require more compute resources. You can configure this in your `cortex.yaml`:
@@ -239,8 +223,6 @@ positive 8
negative 4
```
-
-
## Add another API
If you trained another model and want to A/B test it with your previous model, simply add another `api` to your configuration and specify the new model:
@@ -290,8 +272,6 @@ another-classifier live 1 1 1 8s
classifier live 1 1 1 5m
```
-
-
## Clean up
Run `cortex delete` to spin down your API:
@@ -306,3 +286,4 @@ deleting another-classifier api
Running `cortex delete` will free up cluster resources and allow Cortex to scale down to the minimum number of instances you specified during cluster installation. It will not spin down your cluster.
Any questions? [chat with us](https://gitter.im/cortexlabs/cortex).
+
diff --git a/docs/packaging/onnx.md b/docs/packaging-models/onnx.md
similarity index 96%
rename from docs/packaging/onnx.md
rename to docs/packaging-models/onnx.md
index 08ff7a6da5..7576719ce1 100644
--- a/docs/packaging/onnx.md
+++ b/docs/packaging-models/onnx.md
@@ -1,8 +1,8 @@
-# Packaging ONNX models
+# ONNX
Export your trained model to the ONNX model format. Here is an example of an sklearn model being exported to ONNX:
-```Python
+```python
from sklearn.linear_model import LogisticRegression
from onnxmltools import convert_sklearn
from onnxconverter_common.data_types import FloatTensorType
@@ -32,3 +32,4 @@ Reference your model in an `api`:
onnx:
model: s3://my-bucket/model.onnx
```
+
diff --git a/docs/packaging/tensorflow.md b/docs/packaging-models/tensorflow.md
similarity index 79%
rename from docs/packaging/tensorflow.md
rename to docs/packaging-models/tensorflow.md
index e54778c939..ec63620ebc 100644
--- a/docs/packaging/tensorflow.md
+++ b/docs/packaging-models/tensorflow.md
@@ -1,9 +1,8 @@
-# Packaging TensorFlow models
+# TensorFlow
-
-Export your trained model and upload the export directory, or a checkpoint directory containing the export directory (which is usually the case if you used `estimator.train_and_evaluate`). An example is shown below (here is the [complete example](https://github.com/cortexlabs/cortex/blob/0.11/examples/tensorflow/sentiment-analyzer)):
+Export your trained model and upload the export directory, or a checkpoint directory containing the export directory \(which is usually the case if you used `estimator.train_and_evaluate`\). An example is shown below \(here is the [complete example](https://github.com/cortexlabs/cortex/blob/0.11/examples/tensorflow/sentiment-analyzer)\):
-```Python
+```python
import tensorflow as tf
...
@@ -56,3 +55,4 @@ Reference the zipped model in an `api`:
tensorflow:
model: s3://my-bucket/bert.zip
```
+
diff --git a/docs/summary.md b/docs/summary.md
index 8bb25b9ecd..dfb0dd2f87 100644
--- a/docs/summary.md
+++ b/docs/summary.md
@@ -1,10 +1,10 @@
-# Summary
+# Table of contents
* [Deploy machine learning models in production](../README.md)
-* [Install](cluster/install.md)
-* [Tutorial](../examples/sklearn/iris-classifier/README.md)
+* [Install](install.md)
+* [Tutorial](iris-classifier.md)
* [GitHub](https://github.com/cortexlabs/cortex)
-* [Examples](https://github.com/cortexlabs/cortex/tree/0.11/examples)
+* [Examples](https://github.com/cortexlabs/cortex/tree/0.11/examples)
* [Chat with us](https://gitter.im/cortexlabs/cortex)
* [Email us](mailto:hello@cortex.dev)
* [We're hiring](https://angel.co/cortex-labs-inc/jobs)
@@ -19,28 +19,29 @@
* [Autoscaling](deployments/autoscaling.md)
* [Prediction monitoring](deployments/prediction-monitoring.md)
* [Compute](deployments/compute.md)
-* [CLI commands](cluster/cli.md)
+* [CLI commands](deployments/cli.md)
* [API statuses](deployments/statuses.md)
* [Python client](deployments/python-client.md)
## Packaging models
-* [TensorFlow](packaging/tensorflow.md)
-* [ONNX](packaging/onnx.md)
+* [TensorFlow](packaging-models/tensorflow.md)
+* [ONNX](packaging-models/onnx.md)
## Dependency management
-* [Python packages](dependencies/python-packages.md)
-* [System packages](dependencies/system-packages.md)
+* [Python packages](dependency-management/python-packages.md)
+* [System packages](dependency-management/system-packages.md)
## Cluster management
-* [Cluster configuration](cluster/config.md)
-* [AWS credentials](cluster/aws.md)
-* [Security](cluster/security.md)
-* [Update](cluster/update.md)
-* [Uninstall](cluster/uninstall.md)
+* [Cluster configuration](cluster-management/config.md)
+* [AWS credentials](cluster-management/aws.md)
+* [Security](cluster-management/security.md)
+* [Update](cluster-management/update.md)
+* [Uninstall](cluster-management/uninstall.md)
## Contributing
-* [Development](development.md)
+* [Development](contributing/development.md)
+