Skip to content

Commit

Permalink
Merge pull request juju#15544 from barrettj12/docs-check
Browse files Browse the repository at this point in the history
juju#15544

juju#15406 introduced a Docs workflow which syncs CLI docs to Discourse on `push` to `develop`. No feedback was provided on PRs - so merging a PR could potentially cause this workflow to fail without warning. This PR adds a pre-merge check on PRs to ensure that the doc sync will succeed after merging.

The `discourse-sync` script has been majorly refactored, and now has four commands:
- `check`: ensures that each doc has an entry in the `$TOPIC_IDS` yaml file, and that the corresponding topic exists on Discourse.
- `sync`: syncs the generated documentation to Discourse.
- `create <doc-names> ...`: create new Discourse topics for the provided docs, and add the new topic IDs to the `$TOPIC_IDS` file.
- `delete`: delete all topics in the `$TOPIC_IDS` file.

The Docs workflow has been updated to match. The `check` command will now be run on `pull_request`, while `sync` will be run on `push`. If a PR adds a new Juju command, then the `check` command will find that there is no corresponding topic ID, and fail. This will warn the PR author and allow them to manually create the new topic (using the `create` command) and check in the changes.
  • Loading branch information
jujubot authored May 2, 2023
2 parents e2c7b4c + edccf70 commit b27ec03
Show file tree
Hide file tree
Showing 3 changed files with 215 additions and 84 deletions.
18 changes: 16 additions & 2 deletions .github/workflows/docs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,9 @@ on:
push:
branches:
- 'develop'
pull_request:
branches:
- 'develop'
workflow_dispatch:

jobs:
Expand Down Expand Up @@ -36,11 +39,22 @@ jobs:
juju documentation --split --out=$DOCS_DIR --discourse-ids $TOPIC_IDS
# TODO: save $DOCS_DIR as an artifact

- name: pip install requirements
run: |
python3 -m pip install -r ./scripts/discourse-sync/requirements.txt
- name: Check docs on Discourse
if: github.event_name == 'pull_request'
env:
DOCS_DIR: ${{ steps.gen.outputs.dir }}
run: |
python3 ./scripts/discourse-sync/main.py check
- name: Sync docs to Discourse
if: github.event_name == 'push'
env:
DISCOURSE_API_USERNAME: ${{ secrets.DISCOURSE_API_USERNAME }}
DISCOURSE_API_KEY: ${{ secrets.DISCOURSE_API_KEY }}
DOCS_DIR: ${{ steps.gen.outputs.dir }}
run: |
python3 -m pip install -r ./scripts/discourse-sync/requirements.txt
python3 ./scripts/discourse-sync/main.py
python3 ./scripts/discourse-sync/main.py sync
44 changes: 23 additions & 21 deletions scripts/discourse-sync/README.md
Original file line number Diff line number Diff line change
@@ -1,26 +1,28 @@
# discourse-sync

This Python script is used to sync our CLI docs to Discourse using the output
of the `juju documentation` command. It requires the following environment
variables to be set:
This Python script syncs Markdown docs to Discourse.

## Commands

| Command | Description |
|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `check` | Check that all docs in `$DOCS_DIR` have a corresponding entry in the `$TOPIC_IDS` file, and that this topic exists on Discourse. To be run on a pull request. |
| `sync` | Sync all docs in `$DOCS_DIR` to Discourse. To be run on each commit to the main branch. |
| `create <doc-names> ...` | Create new topics for the provided doc names, and update the `$TOPIC_IDS` file. |
| `delete` | Delete all topics with IDs listed in the `$TOPIC_IDS` file. |

| Variable name | Description |
|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `DISCOURSE_HOST` | URL for the Discourse server to sync docs to. If not set, this defaults to `https://discourse.charmhub.io/`. |
| `DISCOURSE_API_USERNAME` | Username to use for Discourse API requests. Must be a user with access to the API key provided in `DISCOURSE_API_KEY`. Use your own Discourse username if running locally. |
| `DISCOURSE_API_KEY` | [API key](https://meta.discourse.org/t/create-and-configure-an-api-key/230124) for accessing the Discourse server's API. |
| `DOCS_DIR` | Path to a directory containing Markdown files to sync (i.e. the argument provided to the `--out` flag of `juju documentation`). |
| `POST_IDS` | Path to a YAML file mapping each doc name to its post ID on Discourse. |
| `CI` | Set to `true` if this workflow is running in CI. | |

When we discover a doc with no corresponding entry in the `POST_IDS` file, one
of two things can happen, depending on whether we are running the script inside
a CI workflow:
- When running in CI (i.e. `$CI == 'true'`), we can't make persistent changes
to the `POST_IDS` file, so log a warning for now, and exit with a nonzero
return status at the end.
- When running locally (i.e. `CI` is unset), we create a new Discourse post for
the doc, and add the URL as a new entry in the `POST_IDS` file.
## Configuration

The script can be configured using the following environment variables:

| Variable name | Description |
|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `DISCOURSE_HOST` | URL for the Discourse server to sync docs to. If not set, this defaults to `https://discourse.charmhub.io/`. |
| `DISCOURSE_API_USERNAME` | Username to use for Discourse API requests. Must be a user with access to the API key provided in `DISCOURSE_API_KEY`. Use your own Discourse username if running locally. |
| `DISCOURSE_API_KEY` | [API key](https://meta.discourse.org/t/create-and-configure-an-api-key/230124) for accessing the Discourse server's API. |
| `DOCS_DIR` | Path to a directory containing Markdown files to sync (i.e. the argument provided to the `--out` flag of `juju documentation`). |
| `TOPIC_IDS` | Path to a YAML file mapping each doc name to its topic ID on Discourse. |


## Suggested usage
Expand All @@ -31,6 +33,6 @@ export DISCOURSE_API_KEY=[api-key]
export DOCS_DIR=./docs
export TOPIC_IDS=./.github/discourse-topic-ids.yaml

juju documentation --split --out=$DOCS_DIR --no-index --discourse-ids $TOPIC_IDS
python3 ./scripts/discourse-sync/main.py
juju documentation --split --out=$DOCS_DIR --discourse-ids=$TOPIC_IDS # --no-index
python3 ./scripts/discourse-sync/main.py sync
```
237 changes: 176 additions & 61 deletions scripts/discourse-sync/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,99 +2,214 @@
import sys
import yaml
from pydiscourse import DiscourseClient
from pydiscourse.exceptions import DiscourseClientError


# Get configuration from environment variables
DISCOURSE_HOST = os.environ.get('DISCOURSE_HOST', 'https://discourse.charmhub.io/')
DISCOURSE_API_USERNAME = os.environ.get('DISCOURSE_API_USERNAME')
DISCOURSE_API_KEY = os.environ.get('DISCOURSE_API_KEY')
DOCS_DIR = os.environ.get('DOCS_DIR')
TOPIC_IDS = os.environ.get('TOPIC_IDS')

client = DiscourseClient(
host=DISCOURSE_HOST,
api_username=DISCOURSE_API_USERNAME,
api_key=DISCOURSE_API_KEY,
)


def main():
# Get configuration from environment variables
DISCOURSE_HOST = os.environ.get('DISCOURSE_HOST', 'https://discourse.charmhub.io/')
DISCOURSE_API_USERNAME = os.environ.get('DISCOURSE_API_USERNAME')
DISCOURSE_API_KEY = os.environ.get('DISCOURSE_API_KEY')
DOCS_DIR = os.environ.get('DOCS_DIR')
TOPIC_IDS = os.environ.get('TOPIC_IDS')
CI = os.environ.get('CI')

client = DiscourseClient(
host=DISCOURSE_HOST,
api_username=DISCOURSE_API_USERNAME,
api_key=DISCOURSE_API_KEY,
)
if len(sys.argv) < 1:
sys.exit('no command provided, must be one of: check, sync, create, delete')

command = sys.argv[1]
if command == 'check':
check()
elif command == 'sync':
sync()
elif command == 'create':
create()
elif command == 'delete':
delete()
else:
exit(f'unknown command "{command}"')


def check():
"""
Check all docs in the DOCS_DIR have a corresponding entry in the TOPIC_IDS
file, and that the corresponding topic exists on Discourse.
"""
topic_ids = get_topic_ids()
no_topic_id = []
no_discourse_topic = []

for entry in os.scandir(DOCS_DIR):
if not is_markdown_file(entry):
print(f'entry {entry.name}: not a Markdown file: skipping')
continue

doc_name = removesuffix(entry.name, ".md")

if doc_name not in topic_ids:
print(f'doc {doc_name}: no topic ID found')
no_topic_id.append(doc_name)
continue

topic_id = topic_ids[doc_name]
print(f'doc {doc_name} (topic #{topic_id}): checking topic on Discourse')
try:
client.topic(
slug='',
topic_id=topic_ids[doc_name],
)
except DiscourseClientError:
print(f'doc {doc_name} (topic #{topic_id}): not found on Discourse')
no_discourse_topic.append(doc_name)

if no_topic_id:
print(f"The following docs don't have corresponding entries in {TOPIC_IDS}.")
print(f"Please create new Discourse topics for them, and add the new topic IDs to {TOPIC_IDS}.")
for doc_name in no_topic_id:
print(f' - {doc_name}')

if no_discourse_topic:
print("The following docs don't have corresponding topics on Discourse.")
print(f"Please create new Discourse topics for them, and update the topic IDs in f{TOPIC_IDS}.")
for doc_name in no_discourse_topic:
print(f' - {doc_name} (topic #{topic_ids[doc_name]})')

if no_topic_id or no_discourse_topic:
sys.exit(1)

with open(TOPIC_IDS, 'r') as file:
topic_ids = yaml.safe_load(file)
if topic_ids is None:
topic_ids = {}

# Keep track of docs which don't have a matching topic ID.
# If any are found while running in CI, we need to fail and ask the
# developer to manually intervene.
not_found_docs = []
def sync():
"""
Sync all docs in the DOCS_DIR with their corresponding topics on Discourse.
"""
topic_ids = get_topic_ids()
couldnt_sync = {} # doc_name -> reason

for entry in os.scandir(DOCS_DIR):
if not is_markdown_file(entry):
print(f'file {entry.name}: not a Markdown file: skipping')
print(f'entry {entry.name}: not a Markdown file: skipping')
continue

doc_name = removesuffix(entry.name, ".md")
content = open(entry.path, 'r').read()

if topic_ids and doc_name in topic_ids:
print(f'doc {doc_name} (topic #{topic_ids[doc_name]}): checking for changes')
if doc_name not in topic_ids:
couldnt_sync[doc_name] = 'no topic ID in yaml file'
continue

topic_id = topic_ids[doc_name]
print(f'doc {doc_name} (topic #{topic_id}): checking for changes')
try:
# API call to get the post ID from the topic ID
# TODO: we could save the post IDs in a separate yaml file and
# avoid this extra API call
topic = client.topic(
slug='',
topic_id=topic_ids[doc_name],
topic_id=topic_id,
)
post_id = topic['post_stream']['posts'][0]['id']
except DiscourseClientError:
couldnt_sync[doc_name] = f'no topic with ID #{topic_id} on Discourse'
continue

# Get current contents of post
post_id = topic['post_stream']['posts'][0]['id']
# Get current contents of post
try:
post2 = client.post_by_id(
post_id=post_id
)
current_contents = post2['raw']
if current_contents == content.rstrip('\n'):
print(f'doc {doc_name} (topic #{topic_ids[doc_name]}): already up-to-date: skipping')
continue
except DiscourseClientError as e:
couldnt_sync[doc_name] = f"couldn't get post for topic ID #{topic_id}: {e}"
continue

# Update Discourse post
print(f'doc {doc_name} (topic #{topic_ids[doc_name]}): updating')
current_contents = post2['raw']
if current_contents == content.rstrip('\n'):
print(f'doc {doc_name} (topic #{topic_ids[doc_name]}): already up-to-date: skipping')
continue

# Update Discourse post
print(f'doc {doc_name} (topic #{topic_ids[doc_name]}): updating')
try:
client.update_post(
post_id=post_id,
content=content,
)
except DiscourseClientError as e:
couldnt_sync[doc_name] = f"couldn't update post with ID #{post_id}: {e}"
continue

elif CI == 'true':
# Running in CI - we can't edit the TOPIC_IDS yaml file.
# Log a warning and return a non-zero exit code later
print(f'WARNING: no topic ID found for doc {doc_name}')
not_found_docs.append(doc_name)

else:
# Create new Discourse post
print(f'doc {doc_name}: no topic ID found: creating new post')
post = client.create_post(
title=post_title(doc_name),
category_id=22,
content=content,
tags=['olm', 'autogenerated'],
)
new_topic_id = post['topic_id']
print(f'doc {doc_name}: created new topic #{new_topic_id}')

# Save topic ID in yaml map for later
topic_ids[doc_name] = new_topic_id
with open(TOPIC_IDS, 'w') as file:
yaml.safe_dump(topic_ids, file)

if len(not_found_docs) > 0:
print("The following docs don't have corresponding entries in the TOPIC_IDS yaml file. Please run "
"scripts/discourse-sync locally to create new topics and update the topic IDs.")
for doc_name in not_found_docs:
print(f' - {doc_name}')
if len(couldnt_sync) > 0:
print("Failed to sync the following docs:")
for doc_name, reason in couldnt_sync.items():
print(f' - {doc_name}: {reason}')
sys.exit(1)


def create():
"""
Create new Discourse topics for each doc name provided.
"""
topic_ids = get_topic_ids()
docs = sys.argv[2:]

for doc_name in docs:
if doc_name in topic_ids:
print(f'skipping doc {doc_name}, it already has a topic ID')
continue

path = os.path.join(DOCS_DIR, doc_name+'.md')
try:
content = open(path, 'r').read()
except OSError as e:
print(f"couldn't open {path}: {e}")
continue

# Create new Discourse post
print(f'creating new post for doc {doc_name}')
post = client.create_post(
title=post_title(doc_name),
category_id=22,
content=content,
tags=['olm', 'autogenerated'],
)
new_topic_id = post['topic_id']
print(f'doc {doc_name}: created new topic #{new_topic_id}')

# Save topic ID in yaml map for later
topic_ids[doc_name] = new_topic_id
with open(TOPIC_IDS, 'w') as file:
yaml.safe_dump(topic_ids, file)


def delete():
"""
Delete all Discourse topics in the TOPIC_IDS file.
"""
topic_ids = get_topic_ids()

for doc_name, topic_id in topic_ids.items():
print(f'deleting doc {doc_name} (topic #{topic_id})')
client.delete_topic(
topic_id=topic_id
)

# Update topic ID yaml map
del topic_ids[doc_name]
with open(TOPIC_IDS, 'w') as file:
yaml.safe_dump(topic_ids, file)


def get_topic_ids():
with open(TOPIC_IDS, 'r') as file:
topic_ids = yaml.safe_load(file)
return topic_ids or {}


def is_markdown_file(entry: os.DirEntry) -> bool:
return entry.is_file() and entry.name.endswith(".md")

Expand Down

0 comments on commit b27ec03

Please sign in to comment.