cschaefer26/ClassifierAWS

Classifier AWS Deployment

This is a minimal implementation of a pipeline that deploys a simple text classifier into the AWS cloud using the AWS Cloud Development Kit (CDK).

Installation

Create a virtual environment and install the dependencies:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

You can test the installation by running:

PYTHONPATH=. pytest classifier/tests

Before Deployment

  • Make sure you have a valid AWS account and have installed the AWS CLI and the CDK command line tools.
  • Create an AWS IAM user whose credentials have permission to create resources.
  • Set up a CLI profile with the IAM credentials and a default region (e.g. eu-central-1).
  • Get your 12-digit AWS account ID by running aws sts get-caller-identity.
  • Create an AWS GitHub connection that allows AWS to clone your fork of this repo. The connection has a unique identifier (ARN).
  • Open the file infrastructure/app.py and put the default region (e.g. eu-central-1), the account ID and the GitHub connection ARN into the dictionary shared_context.
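
The last steps of the list above can be sketched in Python. This is only an illustration under assumptions, not the repository's code: the helper names and the key names (region, account, github_connection_arn) are hypothetical, so check infrastructure/app.py for the keys that shared_context actually uses. The sketch reads the account ID through the AWS CLI and sanity-checks the values before they go into the dictionary.

```python
import json
import re
import subprocess


def parse_account_id(sts_json: str) -> str:
    """Extract the 12-digit account ID from `aws sts get-caller-identity` JSON output."""
    account = json.loads(sts_json)["Account"]
    if not re.fullmatch(r"\d{12}", account):
        raise ValueError(f"unexpected account id: {account!r}")
    return account


def fetch_account_id(profile: str = "default") -> str:
    """Ask the AWS CLI (must be installed and configured) for the account ID."""
    result = subprocess.run(
        ["aws", "sts", "get-caller-identity", "--output", "json", "--profile", profile],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_account_id(result.stdout)


def build_shared_context(region: str, account: str, connection_arn: str) -> dict:
    """Assemble the values for shared_context.

    The key names below are hypothetical; use the ones found in infrastructure/app.py.
    """
    if not re.fullmatch(r"\d{12}", account):
        raise ValueError("the account id must have exactly 12 digits")
    if not connection_arn.startswith("arn:"):
        raise ValueError("the GitHub connection id must be an ARN")
    return {
        "region": region,
        "account": account,
        "github_connection_arn": connection_arn,
    }
```

Calling build_shared_context("eu-central-1", fetch_account_id(), "<your connection ARN>") would then give the values to copy into app.py.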

Deployment

Train a valid classifier:

PYTHONPATH=. python classifier/train.py

The model is saved to /tmp/classifier.pkl. When you trigger the deployment, the pipeline looks for the model in the S3 bucket specified in shared_context in the file infrastructure/app.py, so you need to upload it there first. Create a new bucket named classifier-serving-model-bucket in the AWS S3 console and upload the classifier to it.
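
If you prefer the command line to the S3 console, the bucket creation and upload could be scripted roughly as follows. This is a sketch under assumptions: it shells out to the AWS CLI (aws s3 mb and aws s3 cp), the function names are made up for illustration, and the object key classifier.pkl is a guess, since the key the pipeline expects is defined in the repository's infrastructure code.

```python
import os
import posixpath
import subprocess

BUCKET = "classifier-serving-model-bucket"  # must match shared_context in app.py
MODEL_PATH = "/tmp/classifier.pkl"          # where classifier/train.py saves the model


def upload_commands(bucket: str, model_path: str, region: str) -> list:
    """Return the AWS CLI commands that create the bucket and upload the model."""
    # Assumption: the object key equals the file name of the model.
    key = posixpath.basename(model_path)
    return [
        # `mb` fails if the bucket already exists; skip it in that case.
        ["aws", "s3", "mb", f"s3://{bucket}", "--region", region],
        ["aws", "s3", "cp", model_path, f"s3://{bucket}/{key}"],
    ]


def upload_model(bucket: str = BUCKET, model_path: str = MODEL_PATH,
                 region: str = "eu-central-1") -> None:
    """Run the commands; requires a configured AWS CLI profile."""
    if not os.path.isfile(model_path):
        raise FileNotFoundError(f"train the classifier first: {model_path} is missing")
    for command in upload_commands(bucket, model_path, region):
        subprocess.run(command, check=True)
```

S3 bucket names are globally unique, so if the name is already taken you would need a different one and would have to change it in shared_context as well.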

Go to infrastructure, create a virtual environment and install the dependencies:

cd infrastructure
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

From the infrastructure directory (you are already there if you followed the previous step), synthesize and deploy the CloudFormation templates:

cdk synth
cdk deploy classifier-cicd-stack classifier-networking-stack classifier-serving-stack

Once the deployment is finished, you can go to the AWS console and verify that the CodePipeline build went through. Logs are available in CloudWatch under Logs Insights. The classifier is exposed to the internet via a load balancer, whose DNS name you can find under the EC2 service: go to Load balancers and select the running load balancer; its DNS name is displayed there. If you open dns-address/classify in your browser, the input text field for the classifier should be displayed.
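
The browser check described above can also be done from a script. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes the load balancer listens on plain HTTP and that the /classify page answers a GET request with status 200.

```python
import urllib.request


def classify_url(dns_address: str) -> str:
    """Build the URL of the classifier page from the load balancer DNS name."""
    host = dns_address.strip().rstrip("/")
    if not host.startswith(("http://", "https://")):
        host = "http://" + host  # assumption: the listener uses plain HTTP
    return host + "/classify"


def endpoint_is_up(dns_address: str, timeout: float = 10.0) -> bool:
    """Return True if the classifier page responds with HTTP 200."""
    try:
        with urllib.request.urlopen(classify_url(dns_address), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError, as are timeouts.
        return False
```

Pass the DNS name copied from the Load balancers page to endpoint_is_up; a freshly deployed service may need a few minutes before it responds.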

Make sure you destroy the resources once you no longer need them:

cdk destroy classifier-cicd-stack classifier-networking-stack classifier-serving-stack
