Upload headshots to S3 via django-storages #1134

Merged: 17 commits, Jul 15, 2024
Changes from 12 commits
12 changes: 6 additions & 6 deletions .env.local → .env.local.example
@@ -1,24 +1,24 @@
DJANGO_SECRET_KEY=replacethiswithsomethingsecret
DATABASE_URL=postgis://postgres:postgres@postgres:5432/lametro
SEARCH_URL=http://elasticsearch:9200
DEBUG=True
DJANGO_DEBUG=True
DJANGO_ALLOWED_HOSTS=localhost,127.0.0.1,0.0.0.0
SHOW_TEST_EVENTS=False
MERGE_HOST=https://datamade-metro-pdf-merger-testing.s3.amazonaws.com/
MERGE_ENDPOINT=http://host.docker.internal:8080/api/experimental/dags/make_packet/dag_runs
FLUSH_KEY=super secret junk
REFRESH_KEY=something very secret
API_KEY=test api key
SMART_LOGIC_KEY=smartlogic api key
SMART_LOGIC_ENVIRONMENT=0ef5d755-1f43-4a7e-8b06-7591bed8d453
SMART_LOGIC_ENVIRONMENT=d3807554-347e-4091-90ea-f107a906aaff
SMART_LOGIC_KEY=smart logic key
ANALYTICS_TRACKING_CODE=
REMOTE_ANALYTICS_FOLDER=
SENTRY_DSN=
LOCAL_DOCKER=True
AWS_KEY=
AWS_SECRET=
AWS_S3_ACCESS_KEY_ID=
AWS_S3_SECRET_ACCESS_KEY=
AWS_STORAGE_BUCKET_NAME=la-metro-headshots-staging
RECAPTCHA_PUBLIC_KEY=
RECAPTCHA_PRIVATE_KEY=
GOOGLE_API_KEY=
AWS_STORAGE_BUCKET_NAME="la-metro-headshots-staging"
GOOGLE_SERVICE_ACCT_API_KEY=
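
The settings changes themselves are not shown in this part of the diff, so the following is only a minimal sketch of how the three `AWS_*` variables above might be turned into django-storages settings. The function name `build_storage_settings` and the fallback to local file storage when credentials are blank are assumptions made for illustration, not the project's actual configuration. The setting names `AWS_S3_ACCESS_KEY_ID`, `AWS_S3_SECRET_ACCESS_KEY`, and `AWS_STORAGE_BUCKET_NAME` and the `storages.backends.s3.S3Storage` backend path come from django-storages; the `STORAGES` layout assumes Django 4.2 or later.

```python
import os
from typing import Mapping


def build_storage_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the S3 variables named in .env.local.example into settings.

    Hypothetical helper: the real project may wire these up differently.
    """
    settings = {
        "AWS_S3_ACCESS_KEY_ID": env.get("AWS_S3_ACCESS_KEY_ID", ""),
        "AWS_S3_SECRET_ACCESS_KEY": env.get("AWS_S3_SECRET_ACCESS_KEY", ""),
        # Some env loaders keep surrounding quotes, so strip them defensively.
        "AWS_STORAGE_BUCKET_NAME": env.get("AWS_STORAGE_BUCKET_NAME", "").strip('"'),
    }

    # Assumption: use S3 only when all three values are present; otherwise
    # fall back to Django's local file storage so the app still starts.
    use_s3 = all(settings.values())
    default_backend = (
        "storages.backends.s3.S3Storage"
        if use_s3
        else "django.core.files.storage.FileSystemStorage"
    )

    settings["STORAGES"] = {
        "default": {"BACKEND": default_backend},
        # Django expects a staticfiles entry whenever STORAGES is overridden.
        "staticfiles": {
            "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage"
        },
    }
    return settings
```

In a settings module, the returned dictionary could be merged into the module namespace, for example with `globals().update(build_storage_settings())`.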
1 change: 1 addition & 0 deletions .github/workflows/main.yml
@@ -64,4 +64,5 @@ jobs:
run: |
flake8 .
black --check .
cp .env.local.example .env.local
pytest -sv
17 changes: 1 addition & 16 deletions .gitignore
@@ -1,23 +1,15 @@
# Byte-compiled / optimized / DLL files
.DS_Store

# Settings
settings_deployment.py

# Collected Static Files
static/**
**/static/images/ocd-person/*.jpg
**/static/pdf/agenda-*.pdf

solr-4.10.4/**

**/__pycache__
.pytest_cache/

downloads
/keyrings/live/pubring.gpg~
/keyrings/live/pubring.kbx~
/keyrings/live/secring.gpg

merged_pdfs/*
debug.log
@@ -27,15 +19,8 @@ debug.log
*.csv
!historical_events.csv
!lametro_divisions.csv
.gnupg/

*_bk/

lametro/secrets.py
/configs/settings_deployment.upgrade.py

/configs/upgrade-config.conf
/configs/upgrade.conf.nginx
/configs/upgrade.conf.supervisor
.env
.venv
.env.local
16 changes: 1 addition & 15 deletions Dockerfile
@@ -1,20 +1,11 @@
FROM ubuntu:20.04 as builder

# Clone and build Blackbox
RUN apt-get update && \
apt-get install -y build-essential git-core && \
git clone https://github.com/StackExchange/blackbox.git && \
cd blackbox && \
make copy-install

FROM python:3.10-slim-bullseye
LABEL maintainer "DataMade <[email protected]>"

RUN apt-get update && \
apt-get install -y libpq-dev gcc gdal-bin gnupg && \
apt-get install -y libxml2-dev libxslt1-dev antiword unrtf poppler-utils \
tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 \
sox libjpeg-dev swig libpulse-dev curl && \
sox libjpeg-dev swig libpulse-dev curl git && \
apt-get clean && \
@hancush (Collaborator, Author), Jul 9, 2024:

TODO: Need git to install django-councilmatic from the branch, will remove when new release is cut.

rm -rf /var/cache/apt/* /var/lib/apt/lists/*

@@ -28,11 +19,6 @@ RUN pip install pip==24.0 && \

COPY . /app

# Copy Blackbox executables from builder stage
COPY --from=builder /usr/local/bin/blackbox* /usr/local/bin/
COPY --from=builder /usr/local/bin/_blackbox* /usr/local/bin/
COPY --from=builder /usr/local/bin/_stack_lib.sh /usr/local/bin/

RUN DJANGO_SETTINGS_MODULE=councilmatic.minimal_settings python manage.py collectstatic

ENTRYPOINT ["/app/docker-entrypoint.sh"]
3 changes: 3 additions & 0 deletions Dockerfile.dev
@@ -1,5 +1,8 @@
FROM ghcr.io/metro-records/la-metro-councilmatic:main

RUN apt-get update && \
apt-get install -y git

COPY ./requirements.txt /app/requirements.txt
RUN pip install pip==24.0 && \
pip install --no-cache-dir -r requirements.txt
177 changes: 26 additions & 151 deletions README.md
Collaborator comment:

Line 17 here refers to a legacy way of running the app without Docker. Is this still useful to us? Those old docs talk about using Solr, which has been removed here, so they're also out of date. If we want to keep the reference to the old magicks, we could maybe change the language to something like:

These days, we run apps in containers for local development. More on that here. For reference, if you'd like to read up on how we used to run this app locally, see the legacy setup instructions.

The line in question:

These days, we run apps in containers for local development. More on that [here](https://github.com/datamade/how-to/docker/local-development.md). Prefer to run the app locally? See the [legacy setup instructions](https://github.com/datamade/la-metro-councilmatic/blob/b8bc14f6d90f1b05e24b5076b1bfcd5e0d37527a/README.md).

Collaborator (Author) reply:

Great call. Since we don't run apps without Docker any more, I just removed the reference to the old setup instructions.

@@ -38,28 +38,14 @@ to set up the git hooks.

Since hooks are run locally, you can modify which scripts are run before each commit by modifying `.pre-commit-config.yaml`.

### Get the API key
### Get the Legistar API key

There should be an entry in the DataMade LastPass account called 'LA Metro - secrets.py.' Copy its contents into a file called `secrets.py` and place it in `lametro/`.

### Generate the deployment settings

Run `cp councilmatic/settings_deployment.py.example councilmatic/settings_deployment.py`.

### Install OS level dependencies:

* [Docker](https://www.docker.com/get-started)

### Run the application

```bash
docker-compose up -d
```

Note that you can omit the `-d` flag to follow the application and service logs. If you prefer a quieter environment, you can view one log stream at a time with `docker-compose logs -f SERVICE_NAME`, where `SERVICE_NAME` is the name of one of the services defined in `docker-compose.yml`, e.g., `app`, `postgres`, etc.

When the command exits (`-d`) or your logs indicate that your app is up and running, visit http://localhost:8001 to visit your shiny, new local application!

### Load in the data

The Metro app ingests updated data from the Legistar API several times an hour.
@@ -80,19 +66,33 @@ docker-compose run --rm scrapers
This may take a few minutes to an hour, depending on the volume of recent
updates.

Once it's finished, head over to http://localhost:8001 to view your shiny new app!
### Run the application

First, create your own local env file:

```bash
cp .env.local.example .env.local
```

Next, bring up the app:

```bash
docker-compose up
```

When your logs indicate that your app is up and running, visit http://localhost:8001 to visit your shiny, new local application!

### Optional: Populate the search index

If you wish to use search in your local install, you need a SmartLogic API
key. Initiated DataMade staff may decrypt application secrets for use:
key. Initiated DataMade staff may retrieve values for the `SMART_LOGIC_ENVIRONMENT`
and `SMART_LOGIC_KEY` environment variables from Heroku:

```bash
blackbox_cat configs/settings_deployment.staging.py
heroku config:get SMART_LOGIC_ENVIRONMENT SMART_LOGIC_KEY -a la-metro-councilmatic-staging
```

Grab the `SMARTLOGIC_API_KEY` value from the decrypted settings, and swap it
into your local `councilmatic/settings_deployment.py` file.
Paste these values into your `.env.local` file.
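
After pasting, it can be handy to confirm that nothing was left blank. The following is a rough sketch, not part of the repository: the helper names `parse_env_text` and `missing_keys` are hypothetical, and the parser only handles the simple `KEY=value` lines seen in `.env.local.example` (no multi-line values or `export` prefixes).

```python
from pathlib import Path

# The two variables the README asks you to copy from Heroku for search.
REQUIRED_FOR_SEARCH = ("SMART_LOGIC_ENVIRONMENT", "SMART_LOGIC_KEY")


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=value lines; if a key repeats, the last value wins."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def missing_keys(values: dict, required: tuple = REQUIRED_FOR_SEARCH) -> list:
    """Return the required keys that are absent or empty, in the given order."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env.local")
    if env_path.exists():
        missing = missing_keys(parse_env_text(env_path.read_text()))
        if missing:
            print("Still unset in .env.local:", ", ".join(missing))
        else:
            print("Search-related variables are set.")
```

Run from the repository root, the script reports which of the two SmartLogic variables still need a value.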

Then, run the `refresh_guid` management command to grab the appropriate
classifications for topics in the database.
@@ -109,8 +109,7 @@ Haystack.
docker-compose run --rm app python manage.py update_index
```

When the command exits, your search index has been filled. (You can view the
Solr admin panel at http://localhost:8987/solr.)
When the command exits, your search index has been filled.

## Running arbitrary scrapes
Occasionally, after a while without running an event scrape, you may find that your local app is broken. If this happens, make sure you have events in your database that are scheduled for the future, as the app queries for upcoming events in order to render the landing page.
@@ -135,133 +134,21 @@ It's sometimes helpful to make sure you have a specific bill in your database fo
docker-compose run --rm scrapers pupa update lametro bills matter_ids=<bill_matter_id> --rpm=0
```

## Making changes to the Solr schema

Did you make a change to the schema file that Solr uses to make its magic (`solr_configs/conf/schema.xml`)? Did you add a new field or adjust how Solr indexes data? If so, you need to take a few steps – locally and on the server.

### Local development

First, remove your Solr container.

```bash
# Remove your existing Metro containers and the volume containing your Solr data
docker-compose down
docker volume rm la-metro-councilmatic_lametro-solr-data

# Build the containers anew
docker-compose up -d
```

Then, rebuild your index.

```bash
python manage.py refresh_guid # Run if you made a change to facets based on topics
docker-compose run --rm app python manage.py rebuild_index --batch-size=50
```

### On the Server

The Dockerized versions of Solr on the server need your attention, too. Perform
the following steps first on staging, then – after confirming that everything
is working as expected – on production.

1. Deploy your changes to the appropriate environment (staging or production).
- To deploy to staging, merge the relevant PR into `main`.
- To deploy to production, [create and push a tag](https://github.com/datamade/deploy-a-site/blob/master/How-to-deploy-with-continuous-deployment.md#3-deploy-to-production).

2. Shell into the server, and `cd` into the relevant project directory.
```bash
ssh [email protected]

# Staging project directory: lametro-staging
# Production project directory: lametro
cd /home/datamade/${PROJECT_DIRECTORY}
```

3. Remove and restart the Solr container.
```bash
# Staging Solr container: lametro-staging-solr
# Production Solr container: lametro-production-solr
sudo docker stop ${SOLR_CONTAINER}
sudo docker rm ${SOLR_CONTAINER}

sudo docker-compose -f docker-compose.deployment.yml up -d ${SOLR_CONTAINER}
```

4. Solr will only apply changes to the schema and config upon core creation, so
consult the Solr logs to confirm the core was remade.
```bash
# Staging Solr service: solr-staging
# Production Solr service: solr-production
sudo docker-compose -f docker-compose.deployment.yml logs -f ${SOLR_SERVICE}
```

You should see logs resembling this:

```bash
Attaching to ${SOLR_CONTAINER}
Executing /opt/docker-solr/scripts/solr-create -c ${SOLR_CORE} -d /la-metro-councilmatic_configs
Running solr in the background. Logs are in /opt/solr/server/logs
Waiting up to 180 seconds to see Solr running on port 8983 [\]
Started Solr server on port 8983 (pid=64). Happy searching!

Solr is running on http://localhost:8983
Creating core with: -c ${SOLR_CORE} -d /la-metro-councilmatic_configs
INFO - 2020-11-18 13:57:09.874; org.apache.solr.util.configuration.SSLCredentialProviderFactory; Processing SSL Credential Provider chain: env;sysprop

Created new core '${SOLR_CORE}' <---- IMPORTANT MESSAGE
Checking core
```

If you see something like "Skipping core creation", you need to perform the
additional step of recreating the Solr core.

```bash
# Staging Solr core: lametro-staging
# Production Solr core: lametro
sudo docker exec ${SOLR_CONTAINER} solr delete -c ${SOLR_CORE}
sudo docker exec ${SOLR_CONTAINER} solr-create -c ${SOLR_CORE} -d /la-metro-councilmatic_configs
```

Note that we remove and recreate the core, rather than the blunt force
option of removing the Docker volume containing the Solr data, because the
staging and production Solr containers use the same volume, so removing it
would wipe out both indexes at once.

5. Switch to the `datamade` user.
```bash
sudo su - datamade
```

6. Rebuild the index:
```bash
# Staging and production virtual environments are named after the corresponding project directory
source ~/.virtualenvs/${PROJECT_DIRECTORY}/bin/activate
python manage.py refresh_guid # Run if you made a change to facets based on topics
python manage.py rebuild_index --batch-size=50
```

Nice! The production server should have the newly edited schema and freshly
built index, ready to search, filter, and facet.

## Connecting to AWS S3 for development

If you want to use the S3 bucket, you’ll need the AWS S3 API keys. This can be found by running:
```bash
blackbox_cat configs/settings_deployment.staging.py
```
```bash
heroku config:get AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY -a la-metro-councilmatic-staging
```

Grab the values for the `AWS_S3_ACCESS_KEY_ID` and the `AWS_S3_SECRET_ACCESS_KEY`. Then, find/create your `.env` file in the `councilmatic/` folder and paste in your values, naming them `ACCESS_KEY` and `SECRET_KEY` respectively.
Grab the values for the `AWS_ACCESS_KEY_ID` and the `AWS_SECRET_ACCESS_KEY` and add them to your `.env.local` file.

Now you should be able to start uploading some files!

## Adding a new board member

Hooray! A new member has been elected or appointed to the Board of Directors.
Metro will provide a headshot and bio for the new member. There are a few
changes you need to make so they appear correctly on the site.

**N.b., these changes can be made in any order.**
There are a few changes you need to make so they appear correctly on the site.

### Update the scraper

@@ -275,14 +162,6 @@ new member that were created without a post.
Person.objects.get(family_name='<MEMBER LAST NAME>').memberships.filter(organization__name='Board of Directors', post__isnull=True).delete()
```

### Update the Metro app

- Add the new member's headshot to the `lametro/static/images/manual-headshots`
directory. **Be sure to follow the naming convention `${given_name}-${family_name}.jpg`, all lowercase with punctuation stripped.**
- Add the new member's bio to the `MEMBER_BIOS` object in `councilmatic/settings_jurisdiction.py`, again **following the `${given_name}-${family_name}.jpg` naming convention.**
- Example: https://github.com/datamade/la-metro-councilmatic/pull/686
- Tip: Replace newlines in the provided bio with `<br /><br />`.

### Check your work

To confirm your changes worked, run the app locally and confirm the following:
@@ -295,10 +174,6 @@ listing and confirm the new member is listed with the correct post, e.g.,
scraper (e.g., does the member's name as it appears in the API match the
key you added?), that your changes to the scraper have been deployed, and
that a person scrape has been run since the deployment.
- View the new member's detail page and confirm that their headshot and bio
appear as expected, and without any formatting issues.

If everything looks good, you can deploy to staging, check again, then push the changes to the live site.

## A note on tests
