AWS CloudFormation template for deploying the NLP Sandbox benchmarking infrastructure
# Install pre-commit
pre-commit install
# Run pre-commit
pre-commit run --all-files
Most of the S3 configuration is included in sceptre/nlpsandbox/templates/s3.yaml, but the CloudFront distribution must be configured manually. Follow these AWS instructions to serve HTTPS requests for your S3 bucket.
- When updating redirections on the S3 bucket, you may have to clear the CloudFront cache. Follow this Stack Overflow solution.
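One common way to clear the cache is a CloudFront invalidation. The sketch below is an assumption and not necessarily what the linked solution describes: the distribution ID `E123EXAMPLE` is a placeholder, and `DRY_RUN` is a hypothetical switch added for this example so the command can be printed without being run.

```shell
# Invalidate cached objects in a CloudFront distribution.
# $1 = distribution ID (placeholder in the usage below)
# $2 = path pattern to invalidate (defaults to every path)
# With DRY_RUN=1 (hypothetical switch for this sketch) the command is only printed.
invalidate_cache() {
  dist_id=$1
  paths=${2:-/*}
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "aws cloudfront create-invalidation --distribution-id $dist_id --paths $paths"
  else
    aws cloudfront create-invalidation --distribution-id "$dist_id" --paths "$paths"
  fi
}

# Print the command that would clear every cached path.
DRY_RUN=1 invalidate_cache E123EXAMPLE '/*'
```

Quote the path pattern when calling the function so the shell does not expand `/*` against the local filesystem.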
The instructions below install the CloudWatch agent on the EC2 instances to collect metrics such as CPU, memory, disk, and network usage.
- SSH into the EC2 instance.
- Download the CloudWatch agent for Ubuntu (x86-64).
wget https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
- Install the agent.
sudo dpkg -i -E ./amazon-cloudwatch-agent.deb
- Download the CloudWatch agent configuration.
wget https://raw.githubusercontent.com/nlpsandbox/nlpsandbox-infra/main/cloudwatch-config.json
- Start the CloudWatch agent.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:cloudwatch-config.json -s
- Check the status of the CloudWatch agent.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
To stop the agent:
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop
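Before starting the agent in the steps above, it can help to confirm that the downloaded configuration is JSON and not an HTML page, which is what `wget` saves when given a GitHub `blob` URL. The helper `looks_like_json` below is a hypothetical name for a minimal sketch; it only inspects the first non-blank character and is not a full JSON validator.

```shell
# Return success if the first non-blank character of file $1 is '{' or '['.
# This is a cheap sanity check, not a JSON parser.
looks_like_json() {
  first_char=$(tr -d '[:space:]' < "$1" | head -c 1)
  [ "$first_char" = "{" ] || [ "$first_char" = "[" ]
}

# Warn if the configuration file exists but does not start like JSON.
config=cloudwatch-config.json
if [ -f "$config" ] && ! looks_like_json "$config"; then
  echo "$config does not look like JSON (an HTML page may have been downloaded)" >&2
fi
```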
Create a log group named /var/log/syslog that will gather logs from the file with the same name.
aws logs create-log-group --log-group-name /var/log/syslog
Create a log stream per file or type of log file and add it to an existing log group. Here we name the stream after the instance (controller).
aws logs create-log-stream \
--log-group-name /var/log/syslog \
--log-stream-name controller
Send messages to a log stream for testing. These messages should then be visible in AWS Console > CloudWatch > Log groups.
aws logs put-log-events \
--log-group-name /var/log/syslog \
--log-stream-name controller \
--log-events \
timestamp=1630159633000,message="This message contains an Error" \
timestamp=1630159633000,message="checking progress or starting new job"
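The timestamps in the command above are fixed values in milliseconds since the Unix epoch. As far as I understand the `put-log-events` documentation, events that are too old (or too far in the future) are not accepted, so a current timestamp is safer for testing. The helpers `now_ms` and `log_event` below are hypothetical names for a minimal sketch.

```shell
# Current time in milliseconds since the Unix epoch, the unit used by --log-events.
now_ms() {
  echo $(( $(date +%s) * 1000 ))
}

# Build one log event in the CLI shorthand syntax. $1 = message text.
# Assumption: the message contains no commas, which the shorthand syntax
# would treat as field separators.
log_event() {
  printf 'timestamp=%s,message=%s\n' "$(now_ms)" "$1"
}

# Show the event string that would be sent.
log_event "This message contains an Error"

# Example call (not run here because it needs AWS credentials):
# aws logs put-log-events \
#   --log-group-name /var/log/syslog \
#   --log-stream-name controller \
#   --log-events "$(log_event 'This message contains an Error')"
```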
Another way of testing that syslog messages reach AWS is to write a message to syslog.
echo -e "This is a test message captured by syslog" | tee >(exec logger)
Create a metric filter that listens to the log group /var/log/syslog.
aws logs put-metric-filter \
--log-group-name /var/log/syslog \
--filter-name ErrorCount \
--filter-pattern 'Error' \
--metric-transformations \
metricName=Count,metricNamespace=MyNamespace,metricValue=1,defaultValue=0
The pattern value is case-sensitive. Also, the error priority defined by the syslog file is "err", so a second filter matches that string.
aws logs put-metric-filter \
--log-group-name /var/log/syslog \
--filter-name errCount \
--filter-pattern 'err' \
--metric-transformations \
metricName=Count,metricNamespace=MyNamespace,metricValue=1,defaultValue=0
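The case sensitivity noted above can be previewed locally with `grep`. This is only a rough approximation under the assumption that a plain fixed-string search behaves like a simple single-term pattern; CloudWatch's own matching rules may differ. The function names are hypothetical, and the sample messages are the two used in the `put-log-events` test.

```shell
# Two sample messages, taken from the put-log-events test above.
sample_logs() {
  printf '%s\n' \
    "This message contains an Error" \
    "checking progress or starting new job"
}

# Count the input lines that contain the fixed string $1, case-sensitively.
# grep exits non-zero when nothing matches, so "|| true" keeps the status clean.
count_matches() {
  grep -c -F -- "$1" || true
}

sample_logs | count_matches 'Error'   # prints 1
sample_logs | count_matches 'error'   # prints 0
```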
To let the CW agent read the content of /var/log/syslog and push it to CW, add the user cwagent, which runs the agent, to the group adm.
sudo usermod -a -G adm cwagent
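To confirm that the group change took effect, the membership can be checked with `id`. The helper `user_in_group` is a hypothetical name for a minimal sketch; a process that is already running generally keeps its old groups, so restarting the agent after the change is likely needed.

```shell
# Return success if user $1 belongs to group $2.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Report whether cwagent is a member of adm on this machine.
if user_in_group cwagent adm; then
  echo "cwagent is in group adm"
else
  echo "cwagent is not in group adm (or the user does not exist on this machine)"
fi
```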
In the "logs" section of the CW agent configuration file, list the files whose content needs to be pushed to CW.
"logs": {
  "logs_collected": {
    "files": {
      "collect_list": [
        {
          "file_path": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
          "log_group_name": "amazon-cloudwatch-agent.log",
          "log_stream_name": "controller"
        },
        {
          "file_path": "/var/log/syslog",
          "log_group_name": "/var/log/syslog",
          "log_stream_name": "controller"
        }
      ]
    }
  }
}
Create the log group:
aws logs create-log-group --log-group-name docker-logs
Create the log stream:
aws logs create-log-stream \
--log-group-name docker-logs \
--log-stream-name controller
Start a container with the log driver awslogs:
docker run --rm \
--log-driver=awslogs \
--log-opt awslogs-group=docker-logs \
--log-opt awslogs-stream=controller \
alpine echo 'Test message'
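The Docker daemon needs AWS credentials and a region to deliver these logs. If the region is not picked up automatically, the `awslogs-region` log option can be passed explicitly. The sketch below builds the option string; `awslogs_opts` is a hypothetical helper and `us-east-1` is an assumed region, not one stated in this repository.

```shell
# Print the docker run options for the awslogs log driver.
# $1 = log group, $2 = log stream, $3 = AWS region (assumed value in the usage below)
awslogs_opts() {
  printf '%s\n' "--log-driver=awslogs --log-opt awslogs-region=$3 --log-opt awslogs-group=$1 --log-opt awslogs-stream=$2"
}

# Show the options for the log group and stream created above.
awslogs_opts docker-logs controller us-east-1

# Example call (not run here because it needs Docker and AWS credentials).
# The substitution is left unquoted on purpose so the options split into words:
# docker run --rm $(awslogs_opts docker-logs controller us-east-1) alpine echo 'Test message'
```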