docs(docs): Incorporate feedback from Engineering discussion and comm…

…ents.
input-output-hk · Oct 18, 2024 · 3de9ef1 · 3de9ef1
1 parent db5a99c
commit 3de9ef1
Show file tree

Hide file tree

Showing 2 changed files with 81 additions and 15 deletions.
diff --git a/.config/dictionaries/project.dic b/.config/dictionaries/project.dic
@@ -106,6 +106,7 @@ icudtl
 ideascale
 idents
 ilap
+IMEI
 Instantitation
 integ
 Intellij

diff --git a/docs/src/architecture/09_architecture_decisions/0008-structured-logging.md b/docs/src/architecture/09_architecture_decisions/0008-structured-logging.md
@@ -26,27 +26,88 @@ That both the Frontend and Backend are using logging libraries capable of struct
 * We will use structured logging for all log messages that are to be sent to a log collection system.
 * We will not use string formatting within structured log messages, but embed all data within fields in the log message.
 * All log levels *MUST* use structured logging.
-  As we may selectively enable debug or trace level logs in production to help investigate issues.
+  * It is possible that **DEBUG** or **TRACE** level logs can selectively be enabled in production to help investigate issues.
 * Log text messages should be specific enough to identify the actual log messages when searching or summarizing logs.
-Don't use the same log message multiple times in the same code.
+  * Avoid using the same log message multiple times in the same code.
 * As far as we are able, we should be consistent with field names.
-For example, we always use "error" for error values, not "error" or "err" or "e".
-Fields do not need to have the same name as the variable, what is important is the field name be unambiguous and consistent.
+  * For example, we always use "error" for error values, not "error" or "err" or "e".
+  * Fields do not need to have the same name as the variable, what is important is the field name be unambiguous and consistent.
 * Logs *MUST* include the date and time of the event being logged.
-Date and Time *MUST* always be referenced to UTC, and not Local Time.
+  * Date and Time *MUST* always be referenced to UTC, and not Local Time.
+  * Date and Time *SHOULD* be formatted according to [RFC3339] (An open formalized subset of [ISO8601]).
+  * If [RFC3339] is not possible to use, then [ISO8601] format *MAY* be used instead.
 * Logs *SHOULD*, if possible, include the filename, and line of the log message in the code.
-* It is *RECOMMENDED* that basic log information be pre-configured such that all log messages automatically include it.
-* API Endpoints *MUST* obfuscate sensitive information before logging, such as IP Addresses of the client.
-However, in the case of data that uniquely identifies a client or session, the obfuscation should be consistent.
-  * For example, `ip addresses` can be hashed and turned into a `UUID`.
-The `UUID` in this case will always be the same for an `ip address`.
-However, the privacy of the client is protected because it is not easily possible to derive the `ip address` from the UUID.
+* Loggers *SHOULD* be configured to automatically include log fields required for every log message.
+* All logs *MUST* obfuscate a client's private information before logging.
+  * A non-exhaustive list of private information includes:
+    * IP Addresses
+    * User-Agent Strings
+    * Location Information (Longitude/Latitude)
+    * Device identifiers (IMEI, MAC Address, etc.)
+    * Browser Fingerprints
+  * For data like this, that uniquely identifies a client or session, the obfuscation of that data should be consistent.
+    * For example, `ip addresses` can be hashed and turned into a [UUID].
+      * The [UUID] in this case will always be the same for an `ip address`.
+      * The privacy of the client is protected because it is not easily possible to derive the `ip address` from the [UUID].
+      * This gives us the ability to cluster multiple interactions from a common connection without revealing the users' identity.
+      * It also allows the Logs to help provide support to a user if they supply their IP Address. (Client Directed Unmasking).
+* *NEVER* log Security sensitive information either in the clear or obfuscated, such as (non-exhaustive):
+  * Usernames
+  * Email Addresses
+  * Passwords
+  * API Keys
+  * Private Encryption Keys
 
 *NOTE :* **IF AND ONLY IF** log messages are a developer convenience,
 **AND** will never be collected for online diagnostic purposes,
-**THEN** they do not need to follow this adr.
+**THEN** they do not need to follow this ADR.
 *IF in doubt, use structured logging.*
 
+### Log Levels
+
+The Standard Abstract Log levels are defined (from the highest priority to lowest) as:
+
+* **CRITICAL** *(Optional)* - When a system has completely failed.
+  * **ERROR** can also be used for this.
+* **ERROR** - Something has failed which normally should not.
+  * Does not have to be a fatal failure.
+  * The Log message should clearly show if the error is fatal or unrecoverable.
+    * For example, a failed DB connection is an *ERROR* but is not fatal.
+    * Errors should be considered not fatal, unless they explicitly state otherwise.
+* **WARNING** - Something which can happen, but normally shouldn't.
+* **INFO** - Important System informational messages.
+* **DEBUG** - Highest level detailed system operation logs.
+* **TRACE** - Verbose and detailed system operation logs.
+
+Individual languages may use different terms for these, and may have more or less defined levels.
+Use the language's native log levels which map closest to this list.
+For example, some languages do not have *CRITICAL*, we would map that level to *ERROR*.
+In all such cases, use the next most appropriate level which matches closest with the above definitions.
+
+#### Logging in Libraries
+
+* Libraries *SHOULD NOT* use *INFO* level logs.
+  * this level of log should be used only by applications or services.
+* Libraries *MAY* use any other log level.
+
+### Log Verbosity
+
+Logs should aim to provide exactly the right amount of information, and no more.
+
+* All **ERROR** logs should be logged in the most appropriate location.
+  * **ERROR** or **WARNING** logs, *SHOULD* include enough context to identify the actual source of the error.
+* Info level logs should only be used to provide ***IMPORTANT*** operational data.
+  * Such as:
+    * Build or configuration information.
+    * The start of internal persistent services
+  * **INFO** logs should not typically continuously stream as a result of normal system use.
+  * API Endpoint statistics logs are an example of an exception to this guidance.
+    * However, they *SHOULD* also have a method of selectively enabling/disabling these *INFO* logs.
+    * The same principle *SHOULD* apply to any other regularly logged **INFO** level logs that may be required.
+* **DEBUG** level logs should be assumed to be enabled selectively in production.
+  * Their purpose *SHOULD* be aimed to helping to diagnose faults.
+* Detailed and Verbose or Streaming Debug level logs should be at **TRACE** level.
+
 ### Example of Unstructured Logging to avoid
 
 An example of unstructured logging in rust, which is not to be used:
@@ -57,18 +118,18 @@ An example of unstructured logging in rust, which is not to be used:
 
 The problems this example exhibits are:
 
-1. This is difficult to process.  
+1. This is difficult to process with automation.
 2. It would need to be searched with a regex.
 3. The fixed words probably appear in similar patterns in other log messages, making it difficult to discern what the log is about.
 4. It may be easy for a developer to read.
-But, it is not easy to read when there are 10s of 1000s of messages from dozens of instances of a service.
+However, it is not easy to read when there are tens of thousands of logs from multiple instances of a service.
 
 ### Example of Structured Logging to utilize
 
 Instead, we should use:
 
 ```rust
-    error!(name=name, thing=%thing, something=?something, error=?err "An error occurred processing named updates to the database.");
+    error!(name=name, thing=%thing, something=?something, error=?err, "An error occurred processing named updates to the database");
 ```
 
 * Each piece of dynamic data is a field, including the string representation of the error.
@@ -93,3 +154,7 @@ Without it, log messages quickly become difficult to use, which defeats the purp
 
 * [Sematext: What is structured logging](https://sematext.com/glossary/structured-logging/)
 * [Stackify: What Is Structured Logging and Why Developers Need It](https://stackify.com/what-is-structured-logging-and-why-developers-need-it/)
+
+[RFC3339]: <https://www.rfc-editor.org/rfc/rfc3339>
+[ISO8601]: <https://www.iso.org/iso-8601-date-and-time-format.html>
+[UUID]: <https://www.rfc-editor.org/rfc/rfc9562.html>