GitBook: [0.11] 38 pages modified
Vishal Bollu authored and gitbook-bot committed Nov 28, 2019
1 parent e6604eb commit 2e2962a
Showing 25 changed files with 127 additions and 149 deletions.
48 changes: 15 additions & 33 deletions README.md
@@ -2,32 +2,18 @@

Cortex is an open source platform for deploying machine learning models—trained with nearly any framework—as production web services.

-<br>

<!-- Set header Cache-Control=no-cache on the S3 object metadata (see https://help.github.com/en/articles/about-anonymized-image-urls) -->
![Demo](https://d1zqebknpdh033.cloudfront.net/demo/gif/v0.8.gif)

-<br>

## Key features

-- **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
-- **Multi framework:** Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
-- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
-- **Spot instances:** Cortex supports EC2 spot instances.
-- **Rolling updates:** Cortex updates deployed APIs without any downtime.
-- **Log streaming:** Cortex streams logs from deployed models to your CLI.
-- **Prediction monitoring:** Cortex monitors network metrics and tracks predictions.
-- **Minimal configuration:** Deployments are defined in a single `cortex.yaml` file.

-<br>
+* **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
+* **Multi framework:** Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
+* **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
+* **Spot instances:** Cortex supports EC2 spot instances.
+* **Rolling updates:** Cortex updates deployed APIs without any downtime.
+* **Log streaming:** Cortex streams logs from deployed models to your CLI.
+* **Prediction monitoring:** Cortex monitors network metrics and tracks predictions.
+* **Minimal configuration:** Deployments are defined in a single `cortex.yaml` file.

## Usage

@@ -92,19 +78,15 @@

```
positive 8
negative 4
```

-<br>

## How it works

-The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.
-
-<br>
+The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing \(ELB\), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service \(EKS\) while logs and metrics are streamed to CloudWatch.

## Examples

-<!-- CORTEX_VERSION_README_MINOR x5 -->
-- [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/sentiment-analyzer) in TensorFlow with BERT
-- [Image classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/image-classifier) in TensorFlow with Inception
-- [Text generation](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/text-generator) in PyTorch with DistilGPT2
-- [Reading comprehension](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/reading-comprehender) in PyTorch with ELMo-BiDAF
-- [Iris classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/sklearn/iris-classifier) in scikit-learn
+* [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/sentiment-analyzer) in TensorFlow with BERT
+* [Image classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/image-classifier) in TensorFlow with Inception
+* [Text generation](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/text-generator) in PyTorch with DistilGPT2
+* [Reading comprehension](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/reading-comprehender) in PyTorch with ELMo-BiDAF
+* [Iris classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/sklearn/iris-classifier) in scikit-learn
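
For reference, a minimal `cortex.yaml` for an example like these might look like the sketch below; the deployment name, API name, and model path are hypothetical, not taken from this commit:

```yaml
# cortex.yaml -- hypothetical minimal example
- kind: deployment
  name: iris

- kind: api
  name: classifier
  model: s3://my-bucket/iris-classifier.onnx  # assumed model location
```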

3 changes: 2 additions & 1 deletion docs/cluster/aws.md → docs/cluster-management/aws.md
@@ -2,4 +2,5 @@

As of now, Cortex only runs on AWS. We plan to support other cloud providers in the future. If you don't have an AWS account you can get started with one [here](https://portal.aws.amazon.com/billing/signup#/start).

-Follow this [tutorial](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key) to create an access key. Enable programmatic access for the IAM user, and attach the built-in `AdministratorAccess` policy to your IAM user (or see [security](security.md) for a minimal access configuration).
+Follow this [tutorial](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key) to create an access key. Enable programmatic access for the IAM user, and attach the built-in `AdministratorAccess` policy to your IAM user \(or see [security](security.md) for a minimal access configuration\).
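
Once the access key exists, one common way to make the credentials available to the CLI is via the standard AWS environment variables; the values below are placeholders:

```bash
# standard AWS credential environment variables
export AWS_ACCESS_KEY_ID=AKIA***************
export AWS_SECRET_ACCESS_KEY=****************************************
```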

5 changes: 2 additions & 3 deletions docs/cluster/config.md → docs/cluster-management/config.md
@@ -1,8 +1,6 @@
# Cluster configuration

-The Cortex cluster may be configured by providing a configuration file to `cortex cluster up` or `cortex cluster update` via the `--config` flag (e.g. `cortex cluster up --config=cluster.yaml`). Below is the schema for the cluster configuration file, with default values shown (unless otherwise specified):
-
-<!-- CORTEX_VERSION_BRANCH_STABLE -->
+The Cortex cluster may be configured by providing a configuration file to `cortex cluster up` or `cortex cluster update` via the `--config` flag \(e.g. `cortex cluster up --config=cluster.yaml`\). Below is the schema for the cluster configuration file, with default values shown \(unless otherwise specified\):

```yaml
# cluster.yaml
@@ -83,3 +81,4 @@
image_istio_pilot: cortexlabs/istio-pilot:0.11.0
image_istio_citadel: cortexlabs/istio-citadel:0.11.0
image_istio_galley: cortexlabs/istio-galley:0.11.0
```
@@ -8,7 +8,7 @@ If you are not using a sensitive AWS account and do not have a lot of experience

The operator requires read permissions for any S3 bucket containing exported models, read and write permissions for the Cortex S3 bucket, read and write permissions for the Cortex CloudWatch log group, and read and write permissions for CloudWatch metrics. The policy below may be used to restrict the Operator's access:

-```json
+```javascript
{
"Version": "2012-10-17",
"Statement": [
@@ -43,8 +43,9 @@ In order to connect to the operator via the CLI, you must provide valid AWS credentials

## API access

-By default, your Cortex APIs will be accessible to all traffic. You can restrict access using AWS security groups. Specifically, you will need to edit the security group with the description: "Security group for Kubernetes ELB <ELB name> (istio-system/apis-ingressgateway)".
+By default, your Cortex APIs will be accessible to all traffic. You can restrict access using AWS security groups. Specifically, you will need to edit the security group with the description: "Security group for Kubernetes ELB \(istio-system/apis-ingressgateway\)".

## HTTPS

-All APIs are accessible via HTTPS. The certificate is autogenerated during installation using `localhost` as the Common Name (CN). Therefore, clients will need to skip certificate verification (e.g. `curl -k`) when using HTTPS.
+All APIs are accessible via HTTPS. The certificate is autogenerated during installation using `localhost` as the Common Name \(CN\). Therefore, clients will need to skip certificate verification \(e.g. `curl -k`\) when using HTTPS.
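
For example, a request might look like the following sketch; the endpoint shape and payload are placeholders, and only the `-k` flag comes from the text above:

```bash
# -k skips certificate verification, since the autogenerated cert's CN is localhost
curl -k -X POST -H "Content-Type: application/json" \
  -d @sample.json \
  https://<elb-endpoint>/<deployment-name>/<api-name>
```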

@@ -4,7 +4,7 @@

1. [AWS credentials](aws.md)
2. [Docker](https://docs.docker.com/install)
-3. [Cortex CLI](install.md)
+3. [Cortex CLI](../install.md)
4. [AWS CLI](https://aws.amazon.com/cli)

## Uninstalling Cortex
@@ -34,3 +34,4 @@

```bash
aws s3 rb --force s3://<bucket-name>
# delete the log group
aws logs describe-log-groups --log-group-name-prefix=<log_group_name> --query logGroups[*].[logGroupName] --output text | xargs -I {} aws logs delete-log-group --log-group-name {}
```

3 changes: 1 addition & 2 deletions docs/cluster/update.md → docs/cluster-management/update.md
@@ -15,8 +15,6 @@

```bash
cortex cluster update
```

## Upgrading to a newer version of Cortex

-<!-- CORTEX_VERSION_MINOR -->

```bash
# spin down your cluster
cortex cluster down
@@ -30,3 +28,4 @@
cortex version
# spin up your cluster
cortex cluster up
```

27 changes: 14 additions & 13 deletions docs/development.md → docs/contributing/development.md
@@ -1,11 +1,11 @@
-# Development Environment
+# Development

## Prerequisites

-1. Go (>=1.12.9)
-1. Docker
-1. eksctl
-1. kubectl
+1. Go \(&gt;=1.12.9\)
+2. Docker
+3. eksctl
+4. kubectl

## Cortex Dev Environment

@@ -135,23 +135,24 @@

```bash
path/to/cortex/bin/cortex deploy
```
If you're making changes in the operator and want faster iterations, you can run an off-cluster operator.

1. `make operator-stop` to stop the in-cluster operator
-1. `make devstart` to run the off-cluster operator (which rebuilds the CLI and restarts the Operator when files change)
-1. `path/to/cortex/bin/cortex configure` (on a separate terminal) to configure your cortex CLI to use the off-cluster operator. When prompted for operator URL, use `http://localhost:8888`
+2. `make devstart` to run the off-cluster operator \(which rebuilds the CLI and restarts the Operator when files change\)
+3. `path/to/cortex/bin/cortex configure` \(on a separate terminal\) to configure your cortex CLI to use the off-cluster operator. When prompted for operator URL, use `http://localhost:8888`

Note: `make cortex-up-dev` will start Cortex without installing the operator.

If you want to switch back to the in-cluster operator:

1. `<ctrl+C>` to stop your off-cluster operator
-1. `make operator-start` to install the operator in your cluster
-1. `path/to/cortex/bin/cortex configure` to configure your cortex CLI to use the in-cluster operator. When prompted for operator URL, use the URL shown when running `make cortex-info`
+2. `make operator-start` to install the operator in your cluster
+3. `path/to/cortex/bin/cortex configure` to configure your cortex CLI to use the in-cluster operator. When prompted for operator URL, use the URL shown when running `make cortex-info`

## Dev Workflow

1. `make cortex-up-dev`
-1. `make devstart`
-1. Make changes
-1. `make registry-dev`
-1. Test your changes with projects in `examples` or your own
+2. `make devstart`
+3. Make changes
+4. `make registry-dev`
+5. Test your changes with projects in `examples` or your own

See `Makefile` for additional dev commands

@@ -2,7 +2,7 @@

## PyPI packages

-You can install your required PyPI packages and import them in your Python files. Cortex looks for a `requirements.txt` file in the top level Cortex project directory (i.e. the directory which contains `cortex.yaml`):
+You can install your required PyPI packages and import them in your Python files. Cortex looks for a `requirements.txt` file in the top level Cortex project directory \(i.e. the directory which contains `cortex.yaml`\):

```text
./iris-classifier/
@@ -12,7 +12,7 @@
└── requirements.txt
```
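
For instance, a hypothetical `requirements.txt` for such a project might pin a few PyPI packages; the package names and versions below are illustrative only:

```text
# requirements.txt -- hypothetical contents
numpy==1.17.2
scikit-learn==0.21.3
```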

-Note that some packages are pre-installed by default (see [predictor](../deployments/predictor.md) or [request handlers](../deployments/request-handlers.md) depending on which runtime you're using).
+Note that some packages are pre-installed by default \(see [predictor](../deployments/predictor.md) or [request handlers](../deployments/request-handlers.md) depending on which runtime you're using\).

## Private packages on GitHub

@@ -28,7 +28,7 @@ You can generate a personal access token by following [these steps](https://help.

## Project files

-Cortex makes all files in the project directory (i.e. the directory which contains `cortex.yaml`) available to request handlers. Python bytecode files (`*.pyc`, `*.pyo`, `*.pyd`), files or folders that start with `.`, and `cortex.yaml` are excluded.
+Cortex makes all files in the project directory \(i.e. the directory which contains `cortex.yaml`\) available to request handlers. Python bytecode files \(`*.pyc`, `*.pyo`, `*.pyd`\), files or folders that start with `.`, and `cortex.yaml` are excluded.

The contents of the project directory is available in `/mnt/project/` in the API containers. For example, if this is your project directory:

@@ -53,3 +53,4 @@

```python
def pre_inference(sample, signature, metadata):
print(config)
...
```

@@ -1,8 +1,8 @@
# System packages

-Cortex uses Docker images to deploy your models. These images can be replaced with custom images that you can augment with your system packages and libraries. You will need to push your custom images to a container registry that your cluster has access to (e.g. [Docker Hub](https://hub.docker.com/) or [AWS ECR](https://aws.amazon.com/ecr/)).
+Cortex uses Docker images to deploy your models. These images can be replaced with custom images that you can augment with your system packages and libraries. You will need to push your custom images to a container registry that your cluster has access to \(e.g. [Docker Hub](https://hub.docker.com/) or [AWS ECR](https://aws.amazon.com/ecr/)\).

-See the `image paths` section in [cluster configuration](../cluster/config.md) for a complete list of customizable images.
+See the `image paths` section in [cluster configuration](../cluster-management/config.md) for a complete list of customizable images.

## Create a custom image

@@ -14,7 +14,7 @@

```bash
mkdir my-api && cd my-api && touch Dockerfile
```

Specify the base image you want to override followed by your customizations. The sample Dockerfile below inherits from Cortex's Python serving image and installs the `tree` system package.

-```dockerfile
+```text
# Dockerfile
FROM cortexlabs/predictor-serve
@@ -79,3 +79,4 @@

```python
def predict(sample, metadata):
subprocess.run(["tree"])
...
```

3 changes: 2 additions & 1 deletion docs/deployments/autoscaling.md
@@ -8,4 +8,5 @@ Cortex adjusts the number of replicas that are serving predictions by monitoring

## Autoscaling Nodes

-Cortex spins up and down nodes based on the aggregate resource requests of all APIs. The number of nodes will be at least `min_instances` and no more than `max_instances` (configured during installation and modifiable via `cortex cluster update` or the [AWS console](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-manual-scaling.html)).
+Cortex spins up and down nodes based on the aggregate resource requests of all APIs. The number of nodes will be at least `min_instances` and no more than `max_instances` \(configured during installation and modifiable via `cortex cluster update` or the [AWS console](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-manual-scaling.html)\).
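
As a sketch, the corresponding fields in the cluster configuration file (see the `cluster.yaml` schema earlier in this diff) might be set as follows; the values are illustrative:

```yaml
# cluster.yaml (excerpt) -- illustrative values
min_instances: 2   # the cluster will never scale below this
max_instances: 5   # ...or above this
```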

1 change: 1 addition & 0 deletions docs/cluster/cli.md → docs/deployments/cli.md
@@ -186,3 +186,4 @@

```text
Flags:
-h, --help help for completion
```

10 changes: 5 additions & 5 deletions docs/deployments/compute.md
@@ -11,22 +11,22 @@ For example:

```yaml
cpu: 1
gpu: 1
mem: 1G

```
-CPU, GPU, and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the API will only be scheduled once 1 CPU, 1 GPU, and 1G of memory are available on any instance, and the deployment will be guaranteed to have access to those resources throughout its execution. In some cases, resource requests can be (or may default to) `Null`.
+CPU, GPU, and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the API will only be scheduled once 1 CPU, 1 GPU, and 1G of memory are available on any instance, and the deployment will be guaranteed to have access to those resources throughout its execution. In some cases, resource requests can be \(or may default to\) `Null`.

## CPU

-One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix (`0.2` and `200m` are equivalent).
+One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix \(`0.2` and `200m` are equivalent\).
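
For example, these two requests are equivalent:

```yaml
cpu: 0.2    # floating point form
# cpu: 200m # "m" (milli) form: 200m == 0.2
```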

## Memory

-One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` (or their power-of two counterparts: `Ki`, `Mi`, `Gi`, `Ti`). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.
+One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` \(or their power-of-two counterparts: `Ki`, `Mi`, `Gi`, `Ti`\). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.
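
For example:

```yaml
mem: 128974848   # an integer, in bytes
# mem: 129e6     # roughly the same amount
# mem: 129M      # power-of-ten suffix
# mem: 123Mi     # power-of-two suffix
```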

## GPU

1. Make sure your AWS account is subscribed to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM).
2. You may need to [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type.
-3. Set instance type to an AWS GPU instance (e.g. p2.xlarge) when installing Cortex.
+3. Set instance type to an AWS GPU instance \(e.g. p2.xlarge\) when installing Cortex.
4. Note that one unit of GPU corresponds to one virtual GPU on AWS. Fractional requests are not allowed.

1 change: 1 addition & 0 deletions docs/deployments/deployments.md
@@ -15,3 +15,4 @@ Deployments are used to group a set of APIs that are deployed together. It must

```yaml
- kind: deployment
name: my_deployment
```
7 changes: 4 additions & 3 deletions docs/deployments/onnx.md
@@ -26,7 +26,7 @@ Deploy ONNX models as web services.

```yaml
mem: <string> # memory request per replica (default: Null)
```
-See [packaging ONNX models](../packaging/onnx.md) for information about exporting ONNX models.
+See [packaging ONNX models](../packaging-models/onnx.md) for information about exporting ONNX models.
## Example
@@ -45,6 +45,7 @@
You can log information about each request by adding a `?debug=true` parameter to your requests. This will print:

1. The raw sample
-2. The value after running the `pre_inference` function (if provided)
+2. The value after running the `pre_inference` function \(if provided\)
3. The value after running inference
-4. The value after running the `post_inference` function (if provided)
+4. The value after running the `post_inference` function \(if provided\)
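
For example, a debug request might look like this sketch; the endpoint and payload are placeholders:

```bash
# ?debug=true prints the sample at each of the stages listed above
curl -k -X POST -H "Content-Type: application/json" \
  -d @sample.json \
  "https://<elb-endpoint>/<deployment-name>/<api-name>?debug=true"
```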

1 change: 1 addition & 0 deletions docs/deployments/prediction-monitoring.md
@@ -24,3 +24,4 @@ For classification models, the tracker should be configured with `model_type: classification`:

```yaml
tracker:
model_type: classification
```