chore: add dynamic require failure to runbook (#1315)
This PR moves the "Errors Encountered in the Past" section from the
`operator-runbook` to a new document named `errors-encountered`. An
entry for `Transliterator` task failures caused by missing files has
been added to the `errors-encountered` document which explains that the
cause may be from a dynamic require being used in a dependency. A link
to the `errors-encountered` doc has been added to the appendix section
in the `operator-runbook`.

----

*By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache-2.0 license*

---------

Signed-off-by: Francis <[email protected]>
colifran authored Sep 25, 2023
1 parent b553bc9 commit 8b03c5f
Showing 2 changed files with 67 additions and 39 deletions.
65 changes: 65 additions & 0 deletions docs/errors-encountered.md
@@ -0,0 +1,65 @@
# Errors Encountered in the Past

## General Errors

### `Forbidden: null`

Usually, "Forbidden" with no additional details occurs when you attempt to read
S3 objects that are SSE-encrypted but lack permission to decrypt them with the
KMS key that encrypted the object; when you attempt to read an S3 object that
does not exist (S3 reports a missing key as forbidden when the caller lacks
`s3:ListBucket` on the bucket); or when you simply lack the appropriate IAM
permissions for the object.

If you see this error, try checking that IAM permissions are configured
correctly for the respective backend component (including policies on VPC
resources if Construct Hub is running in a VPC, etc.).

## Transliteration Task Errors

### Running Out of File Descriptors (`ENOFILE`)

The Transliterator task in particular has been susceptible to running out of
file descriptors in the past, making the task extremely slow, or causing it to
fail or time out (sending the Step Functions heartbeat requires opening a
network connection, which requires at least one available file descriptor).

To determine where file descriptors are going, tasks can be configured to run
`lsof` on each heartbeat tick. This prints the list of all open files to
`STDOUT`, where it is visible in the task's log.

To enable this feature, the task input must contain an
`env.RUN_LSOF_ON_HEARTBEAT` key with a string value (the value is arbitrary, but
it must be truthy in JavaScript, that is, non-empty, for the logging to be enabled).
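
The truthiness rule above can be illustrated with a minimal sketch. It assumes the task applies an ordinary JavaScript truthiness check to the value; `isLsofEnabled` is a hypothetical helper written for illustration, not a function from Construct Hub.

```ts
// Hypothetical helper: mirrors a plain JavaScript truthiness check on the flag.
function isLsofEnabled(env: Record<string, string | undefined>): boolean {
  return Boolean(env.RUN_LSOF_ON_HEARTBEAT);
}

console.log(isLsofEnabled({ RUN_LSOF_ON_HEARTBEAT: "YES" }));   // true
console.log(isLsofEnabled({ RUN_LSOF_ON_HEARTBEAT: "false" })); // true: any non-empty string is truthy
console.log(isLsofEnabled({ RUN_LSOF_ON_HEARTBEAT: "" }));      // false: the empty string is falsy
console.log(isLsofEnabled({}));                                 // false: the key is absent
```

Note that under this rule a value such as `"false"` or `"0"` would still enable the logging, so the key should be removed entirely to turn it off.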

In the case of the Transliterator task, the command includes the entire state
machine's input object, so one can simply re-run the state machine after having
merged the following into the state machine input object:

```json
{
  "env": {
    "RUN_LSOF_ON_HEARTBEAT": "YES"
  }
}
```
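
If the input object is prepared programmatically, the merge could look like the sketch below. The `StateMachineInput` type, the `withLsofOnHeartbeat` helper, and the `examplePayload` field are assumptions made for illustration; the real input shape is whatever the state machine execution was originally started with.

```ts
// Assumed shape: an arbitrary input object that may already carry an "env" map.
type StateMachineInput = { env?: Record<string, string>; [key: string]: unknown };

// Hypothetical helper: returns a copy of the input with the lsof flag merged in,
// keeping every other top-level field and any existing "env" entries.
function withLsofOnHeartbeat(input: StateMachineInput): StateMachineInput {
  return {
    ...input,
    env: { ...input.env, RUN_LSOF_ON_HEARTBEAT: "YES" },
  };
}

// "examplePayload" and "EXISTING" are placeholders, not real Construct Hub fields.
const original: StateMachineInput = { examplePayload: "x", env: { EXISTING: "1" } };
console.log(JSON.stringify(withLsofOnHeartbeat(original), null, 2));
```

Merging rather than replacing `env` matters here: overwriting the whole object would drop any environment entries the original execution relied on.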

### Missing Files

esbuild cannot bundle dependencies that are required dynamically, because the
path passed to `require` must be known at build time. As an example,
the following code snippet is incompatible with esbuild's bundling:

```ts
require('./commands').forEach(function (command) {
  require('./src/' + command);
});
```

In one instance, a dependency upgrade introduced a new dependency that was performing
a dynamic require. By default, the dynamic require error in esbuild is suppressed.
As a result, the bundle used in the Transliterator task was missing files and was
failing on start-up.

If you see Transliterator task failures where the stack trace points to missing files,
this may be a result of a dynamic require being used. It is recommended that you
look at any dependency upgrades and whether they introduced a new dependency that
might be using a dynamic require.
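
When reviewing a suspect dependency, a quick scan of its source for non-literal `require` arguments can narrow the search. The sketch below is a regex heuristic under stated assumptions, not a parser and not part of esbuild or Construct Hub: `findDynamicRequires` is a hypothetical helper, it ignores `require` calls whose arguments contain nested parentheses, and it also flags template-literal paths.

```ts
// Heuristic sketch: flag require() calls whose argument is not a single
// quoted string literal. This is a regex scan, so it can miss or misreport cases.
function findDynamicRequires(source: string): string[] {
  const callPattern = /\brequire\s*\(([^()]*)\)/g;
  const literalPattern = /^(['"])[^'"]*\1$/;
  const dynamic: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = callPattern.exec(source)) !== null) {
    const argument = match[1].trim();
    if (!literalPattern.test(argument)) {
      dynamic.push(argument);
    }
  }
  return dynamic;
}

// The snippet from this section: the first require is static, the second is dynamic.
const sample = `
require('./commands').forEach(function (command) {
  require('./src/' + command);
});
`;
console.log(findDynamicRequires(sample)); // [ "'./src/' + command" ]
```

Any hit is only a lead to inspect by hand; a clean result does not prove that a dependency is free of dynamic requires.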
41 changes: 2 additions & 39 deletions docs/operator-runbook.md
@@ -736,43 +736,6 @@ ECS tasks emit logs into CloudWatch under a log group called
in its name and the log stream `transliterator/Resource/$TASKID` (e.g.
`transliterator/Resource/6b5c48f0a7624396899c6a3c8474d5c7`).

## Errors encountered in the past
## Appendix

### `Forbidden: null`

Usually, "Forbidden" with no additional details occurs when you attempt to read
S3 objects that are SSE-encrypted but lack permission to decrypt them with the
KMS key that encrypted the object; when you attempt to read an S3 object that
does not exist (S3 reports a missing key as forbidden when the caller lacks
`s3:ListBucket` on the bucket); or when you simply lack the appropriate IAM
permissions for the object.

If you see this error, try checking that IAM permissions are configured
correctly for the respective backend component (including policies on VPC
resources if Construct Hub is running in a VPC, etc.).

### Running out of file descriptors (`ENOFILE`)

The Transliterator task in particular has been susceptible to running out of
file descriptors in the past, making the task extremely slow, or causing it to
fail or time out (sending the Step Functions heartbeat requires opening a
network connection, which requires at least one available file descriptor).

To determine where file descriptors are going, tasks can be configured to run
`lsof` on each heartbeat tick. This prints the list of all open files to
`STDOUT`, where it is visible in the task's log.

To enable this feature, the task input must contain an
`env.RUN_LSOF_ON_HEARTBEAT` key with a string value (the value is arbitrary, but
it must be truthy in JavaScript, that is, non-empty, for the logging to be enabled).

In the case of the Transliterator task, the command includes the entire state
machine's input object, so one can simply re-run the state machine after having
merged the following into the state machine input object:

```json
{
  "env": {
    "RUN_LSOF_ON_HEARTBEAT": "YES"
  }
}
```
1. [Errors encountered in the past](./errors-encountered.md)
