diff --git a/.github/workflows/mega-linter.yml b/.github/workflows/mega-linter.yml index 220fd133d2..59f3fd6e06 100644 --- a/.github/workflows/mega-linter.yml +++ b/.github/workflows/mega-linter.yml @@ -51,7 +51,7 @@ jobs: # Upload MegaLinter artifacts - name: Archive production artifacts if: always() - uses: actions/upload-artifact@v2 + uses: actions/upload-artifact@v4 with: name: MegaLinter reports path: | diff --git a/README.md b/README.md index 5db720cc72..df4a5a9859 100644 --- a/README.md +++ b/README.md @@ -19,11 +19,11 @@ This is our playbook. All contributions welcome! Please feel free to submit a [p ## "The" Checklist -If you do nothing else follow the [Engineering Fundamentals Checklist](docs/ENG-FUNDAMENTALS-CHECKLIST.md)! It's here to help follow the Engineering Fundamentals. +If you do nothing else, follow the [Engineering Fundamentals Checklist](docs/engineering-fundamentals-checklist.md)! It's here to help you follow the Engineering Fundamentals. ## Structure of a Sprint -A [breakdown of sections](docs/SPRINT-STRUCTURE.md) according to the structure of an Agile sprint. +A [breakdown of sections](docs/the-first-week-of-an-ise-project.md) according to the structure of an Agile sprint. ## General Guidance @@ -38,19 +38,19 @@ A [breakdown of sections](docs/SPRINT-STRUCTURE.md) according to the structure o * Report product issues found and provide clear and repeatable engineering feedback! * We all own our code and each one of us has an obligation to make all parts of the solution great. 
-## QuickLinks +## Resources -* [Engineering Fundamentals Checklist](docs/ENG-FUNDAMENTALS-CHECKLIST.md) -* [Structure of a Sprint](docs/SPRINT-STRUCTURE.md) +* [Engineering Fundamentals Checklist](docs/engineering-fundamentals-checklist.md) +* [The first week of an ISE project](docs/the-first-week-of-an-ise-project.md) ## Engineering Fundamentals -* [Accessibility](docs/accessibility/README.md) +* [Accessibility](docs/non-functional-requirements/accessibility.md) * [Agile Development](docs/agile-development/README.md) * [Automated Testing](docs/automated-testing/README.md) * [Code Reviews](docs/code-reviews/README.md) -* [Continuous Delivery (CD)](docs/continuous-delivery/README.md) -* [Continuous Integration (CI)](docs/continuous-integration/README.md) +* [Continuous Delivery (CD)](docs/CI-CD/continuous-delivery.md) +* [Continuous Integration (CI)](docs/CI-CD/continuous-integration.md) * [Design](docs/design/readme.md) * [Developer Experience](docs/developer-experience/README.md) * [Documentation](docs/documentation/README.md) @@ -59,12 +59,12 @@ A [breakdown of sections](docs/SPRINT-STRUCTURE.md) according to the structure o * [Security](docs/security/README.md) * [Privacy](docs/privacy/README.md) * [Source Control](docs/source-control/README.md) -* [Reliability](docs/reliability/README.md) +* [Reliability](docs/non-functional-requirements/reliability.md) ## Fundamentals for Specific Technology Areas * [Machine Learning Fundamentals](docs/machine-learning/README.md) -* [User-Interface Engineering](docs/user-interface-engineering/README.md) +* [User-Interface Engineering](docs/UI-UX/README.md) ## Contributing diff --git a/docs/.pages b/docs/.pages new file mode 100644 index 0000000000..dc39d89874 --- /dev/null +++ b/docs/.pages @@ -0,0 +1,10 @@ +nav: + - ISE Engineering Fundamentals Playbook: README.md + - Engineering Fundamentals Checklist: engineering-fundamentals-checklist.md + - The First Week of an ISE Project: the-first-week-of-an-ise-project.md + - Who is 
ISE?: ISE.md + - Agile Development: agile-development + - Automated Testing: automated-testing + - CI/CD: CI-CD + - ... + - UI/UX: UI-UX diff --git a/docs/continuous-integration/CICD.md b/docs/CI-CD/README.md similarity index 58% rename from docs/continuous-integration/CICD.md rename to docs/CI-CD/README.md index de5da3f64a..bc7a75b4c5 100644 --- a/docs/continuous-integration/CICD.md +++ b/docs/CI-CD/README.md @@ -1,10 +1,10 @@ -# Continuous Integration and Delivery +# Continuous Integration and Continuous Delivery -Continuous Integration is the engineering practice of frequently committing code in a shared repository, ideally several times a day, and performing an automated build on it. These changes are built with other simultaneous changes to the system, which enables early detection of integration issues between multiple developers working on a project. Build breaks due to integration failures are treated as the highest priority issue for all the developers on a team and generally work stops until they are fixed. +[**Continuous Integration (CI)**](./continuous-integration.md) is the engineering practice of frequently committing code in a shared repository, ideally several times a day, and performing an automated build on it. These changes are built with other simultaneous changes to the system, which enables early detection of integration issues between multiple developers working on a project. Build breaks due to integration failures are treated as the highest priority issue for all the developers on a team and generally work stops until they are fixed. Paired with an automated testing approach, continuous integration also allows us to test the integrated build, verifying not only that the code base still builds correctly, but also that it is still functionally correct. This is also a best practice for building robust and flexible software systems. 
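As a hedged illustration of the commit-triggered build described above (the workflow name and script paths are placeholders, not part of the playbook), a minimal CI quality pipeline in GitHub Actions might look like this:

```yaml
# Minimal CI sketch: build and test on every push to main and on every PR.
# All names and script paths below are hypothetical.
name: ci-quality
on:
  push:
    branches: [main]
  pull_request:

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: ./scripts/build.sh   # assumed project build entry point
      - name: Unit tests
        run: ./scripts/test.sh    # a failing test breaks the build
```

A broken run here is the integration failure that, per the practice above, becomes the team's highest-priority fix.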
-Continuous Delivery takes the Continuous Integration concept further to also test deployments of the integrated code base on a replica of the environment it will be ultimately deployed on. This enables us to learn early about any unforeseen operational issues that arise from our changes as quickly as possible and also learn about gaps in our test coverage. +[**Continuous Delivery (CD)**](./continuous-delivery.md) takes the **Continuous Integration (CI)** concept further to also test deployments of the integrated code base on a replica of the environment it will be ultimately deployed on. This enables us to learn early about any unforeseen operational issues that arise from our changes as quickly as possible and also learn about gaps in our test coverage. The goal of all of this is to ensure that the main branch is always shippable, meaning that we could, if we needed to, take a build from the main branch of our code base and ship it on production. @@ -14,6 +14,23 @@ Our expectation is that CI/CD should be used in all the engineering projects tha For a much deeper understanding of all of these concepts, the books [Continuous Integration](https://www.amazon.com/Continuous-Integration-Improving-Software-Reducing/dp/0321336380) and [Continuous Delivery](https://www.amazon.com/gp/product/0321601912) provide a comprehensive background. +## Why CI/CD + +- We want to have an automated build and deployment of our software +- We want automated configuration of all components +- We want to be able to quickly re-build the environment from scratch in case of disaster +- We want the latest version of the code to always be deployed to our dev/test environments +- We want a reliable release strategy, where the policies for release are well understood by all + +## The Fundamentals + +- We run a quality pipeline (with linting, unit tests etc.) 
on each PR/update of the main branch +- All cloud resources (including secrets and permissions) are provisioned through infrastructure as code templates – e.g. Terraform, Bicep (ARM), Pulumi etc. +- All release candidates are deployed to a non-production environment through an automated process (e.g. Azure DevOps or GitHub pipelines) +- Releases are deployed to the production environment through an automated process +- Release rollbacks are carried out through a repeatable process +- Our release pipeline runs automated tests, validating all release candidate artifact(s) end-to-end against a non-production environment + ## Tools ### Azure Pipelines diff --git a/docs/continuous-delivery/README.md b/docs/CI-CD/continuous-delivery.md similarity index 98% rename from docs/continuous-delivery/README.md rename to docs/CI-CD/continuous-delivery.md index ad795a6b4f..8eb9a676d2 100644 --- a/docs/continuous-delivery/README.md +++ b/docs/CI-CD/continuous-delivery.md @@ -49,7 +49,7 @@ Code changes released into the *test* environment typically targets the main bra The very first deployment of any application should be showcased to the customer in a production-like environment (*UAT*) to solicit feedback early. The UAT environment is used to obtain product owner sign-off acceptance to ultimately promote the release to production. -#### Criteria for a production-like environment +#### Criteria for a Production-Like Environment * Runs the same operating system as production. * Has the same software installed as production. @@ -57,7 +57,7 @@ The very first deployment of any application should be showcased to the customer * Mirrors production's networking topology. * Simulated production-like load tests are executed following a release to surface any latency or throughput degradation. 
-#### Modeling your Release Pipeline +#### Modeling Your Release Pipeline It's critical to model your test and release process to establish a common understanding between the application engineers and customer stakeholders. Specifically, align expectations for how many cloud environments need to be pre-provisioned, as well as for sign-off gate roles and responsibilities. @@ -85,7 +85,7 @@ The stages within your release workflow are ultimately testing a version of your A release should be running for a period of time before it's considered live and allowed to accept user traffic. These *warm up* activities may include application server(s) and database(s) pre-filling any dependent cache(s) as well as establishing all service connections (e.g. *connection pool allocations, etc.*). -#### Pre-production releases +#### Pre-production Releases Application release candidates should be deployed to a staging environment similar to production for carrying out final manual/automated tests (*including capacity testing*). Your production and staging / pre-prod cloud environments should be set up at the beginning of your project. @@ -135,13 +135,13 @@ Canary releases simplify rollbacks as you can avoid routing users to bad applica Try to limit the number of versions of your application running parallel in production, as it can complicate maintenance and monitoring controls. -### Low code solutions +### Low Code Solutions Low code solutions play an increasingly large part in applications and processes, and as a result their development requires the same combination of engineering disciplines as traditional code. -Here is a guide for [continuous deployment for Low Code Solutions](low-code-solutions/README.md). +Here is a guide for [continuous deployment for Low Code Solutions](recipes/cd-on-low-code-solutions.md). -## References +## Resources * [Continuous Delivery](https://www.continuousdelivery.com/) by Jez Humble, David Farley. * [Continuous integration vs. continuous delivery vs. 
continuous deployment](https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment) diff --git a/docs/continuous-integration/README.md b/docs/CI-CD/continuous-integration.md similarity index 94% rename from docs/continuous-integration/README.md rename to docs/CI-CD/continuous-integration.md index bc43027ec8..1a291434b1 100644 --- a/docs/continuous-integration/README.md +++ b/docs/CI-CD/continuous-integration.md @@ -25,7 +25,7 @@ A robust build automation pipeline will: ## Build Definition Managed in Git -### Code / manifest artifacts required to build your project should be maintained in within your project(s) git repository(s) +### Code / Manifest Artifacts Required to Build Your Project Should Be Maintained Within Your Project's Git Repository - CI provider-specific build pipeline definition(s) should reside within your project(s) git repository(s). @@ -44,8 +44,8 @@ An automated build should encompass the following principles: ### Code Style Checks - Code across an engineering team must be formatted to agreed coding standards. Such standards keep code consistent, and most importantly easy for the team and customer(s) to read and refactor. Code styling consistency encourages collective ownership for project scrum teams and our partners. -- There are several open source code style validation tools available to choose from ([code style checks](https://github.com/checkstyle/checkstyle), [StyleCop](https://en.wikipedia.org/wiki/StyleCop)). The [Code Review recipes section](../code-reviews/recipes/README.md) of the playbook has suggestions for linters and preferred styles for a number of languages. -- Your code and documentation should avoid the use of non-inclusive language wherever possible. Follow the [Inclusive Linting section](inclusive-linting.md) to ensure your project promotes an inclusive work environment for both the team and for customers. 
+- There are several open source code style validation tools available to choose from ([code style checks](https://github.com/checkstyle/checkstyle), [StyleCop](https://en.wikipedia.org/wiki/StyleCop)). The [Code Review recipes section](../code-reviews/recipes/) of the playbook has suggestions for linters and preferred styles for a number of languages. +- Your code and documentation should avoid the use of non-inclusive language wherever possible. Follow the [Inclusive Linting section](./recipes/inclusive-linting.md) to ensure your project promotes an inclusive work environment for both the team and for customers. - We recommend incorporating security analysis tools within the build stage of your pipeline such as: code credential scanner, security risk detection, static analysis, etc. For Azure DevOps, you can add a security scan task to your pipeline by installing the [Microsoft Security Code Analysis Extension](https://secdevtools.azurewebsites.net/#pills-onboard). GitHub Actions supports a similar extension with the [RIPS security scan solution](https://github.com/marketplace/actions/rips-security-scan). - Code standards are maintained within a single configuration file. There should be a step in your build pipeline that asserts code in the latest commit conforms to the known style definition. @@ -57,21 +57,21 @@ An automated build should encompass the following principles: - It's essential to have a build that's runnable through standalone scripts and not dependent on a particular IDE. Build pipeline targets can be triggered locally on their desktops through their IDE of choice. The build process should maintain enough flexibility to run within a CI server as well. As an example, dockerizing your build process offers this level of flexibility as VSCode and IntelliJ supports [docker plugin](https://code.visualstudio.com/docs/containers/overview) extensions. 
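As a sketch of the dockerized approach mentioned above (the image name, Dockerfile, and build script are assumptions), the same containerized build can run unchanged on a developer desktop or a CI agent:

```yaml
# Illustrative Azure Pipelines steps: the build runs inside a container,
# so developers and the CI server share an identical build environment.
steps:
  - script: docker build -t myapp-build -f Dockerfile.build .
    displayName: Build the containerized build environment
  - script: docker run --rm myapp-build ./build.sh
    displayName: Run the standalone build script inside the container
```

Locally, a developer runs the same two docker commands from a terminal, independent of any IDE.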
-### DevOps security checks +### DevOps Security Checks -- Introduce security to your project at early stages. Follow the [DevSecOps section](dev-sec-ops/README.md) to introduce security practices, automation, tools and frameworks as part of the CI. +- Introduce security to your project at early stages. Follow the [DevSecOps section](./dev-sec-ops/README.md) to introduce security practices, automation, tools and frameworks as part of the CI. ## Build Environment Dependencies -### Automated local environment setup +### Automated Local Environment Setup - We encourage maintaining a consistent developer experience for all team members. There should be a central automated manifest / process that streamlines the installation and setup of any software dependencies. This way developers can replicate the same build environment locally as the one running on a CI server. - Build automation scripts often require specific software packages and version pre-installed within the runtime environment of the OS. This presents some challenges as build processes typically version lock these dependencies. - All developers on the team should be able to emulate the build environment from their local desktop regardless of their OS. -- For projects using VS Code, leveraging [Dev Containers](../developer-experience/devcontainers.md) can really help standardize the local developer experience across the team. +- For projects using VS Code, leveraging [Dev Containers](../developer-experience/devcontainers-getting-started.md) can really help standardize the local developer experience across the team. - Well established software packaging tools like Docker, Maven, npm, etc should be considered when designing your build automation tool chain. -### Document local setup +### Document Local Setup - The setup process for setting up a local build environment should be well documented and easy for developers to follow. 
@@ -172,7 +172,6 @@ Implementing schema validation is divided in two - the generation of the schemas There are two options to generate a schema: - [From code](https://json-schema.org/implementations.html#from-code) - we can leverage the existing models and objects in the code and generate a customized schema. - - [From data](https://json-schema.org/implementations.html#from-data) - we can take yaml/json samples which reflect the configuration in general and use the various online tools to generate a schema. ### Validation @@ -183,16 +182,16 @@ The schema has 30+ [validators](https://json-schema.org/implementations.html#val An effective way to identify bugs in your build at a rapid pace is to invest early into a reliable suite of automated tests that validate the baseline functionality of the system: -### End to end integration tests +### End-to-End Integration Tests - Include tests in your pipeline to validate the build candidate conforms to automated business functionality assertions. Any bugs or broken code should be reported in the test results including the failed test and relevant stack trace. All tests should be invoked through a single command. - Keep the build fast. Consider automated test runtime when deciding to pull in dependencies like databases, external services and mock data loading into your test harness. Slow builds often become a bottleneck for dev teams when parallel builds on a CI server are not an option. Consider adding max timeout limits for lengthy validations to fail fast and maintain high velocity across the team. -### Avoid checking in broken builds +### Avoid Checking in Broken Builds - Automated build checks, tests, lint runs, etc should be validated locally before committing your changes to the scm repo. [Test Driven Development](https://martinfowler.com/bliki/TestDrivenDevelopment.html) is a practice dev crews should consider to help identify bugs and failures as early as possible within the development lifecycle. 
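The schema validation described earlier can run as an ordinary pipeline step, so the same check is available locally before committing. This is an illustrative sketch only; the tool choice (ajv-cli) and file paths are assumptions, not playbook recommendations:

```yaml
# Illustrative step: validate configuration files against a generated
# JSON Schema; a validation failure fails the build.
steps:
  - script: npx -p ajv-cli ajv validate -s schemas/config.schema.json -d "config/*.json"
    displayName: Validate configuration against schema
```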
-### Reporting build failures +### Reporting Build Failures - If the build step happens to fail then the build pipeline run status should be reported as failed including relevant logs and stack traces. @@ -206,22 +205,22 @@ An effective way to identify bugs in your build at a rapid pace is to invest ear ## Git Driven Workflow -### Build on commit +### Build on Commit - Every commit to the baseline repository should trigger the CI pipeline to create a new build candidate. - Build artifact(s) are built, packaged, validated and deployed continuously into a non-production environment per commit. Each commit against the repository results into a CI run which checks out the sources onto the integration machine, initiates a build, and notifies the committer of the result of the build. -### Avoid commenting out failing tests +### Avoid Commenting Out Failing Tests - Avoid commenting out tests in the mainline branch. By commenting out tests, we get an incorrect indication of the status of the build. -### Branch policy enforcement +### Branch Policy Enforcement - Protected [branch policies](https://help.github.com/en/github/administering-a-repository/about-protected-branches) should be setup on the main branch to ensure that CI stage(s) have passed prior to starting a code review. Code review approvers will only start reviewing a pull request once the CI pipeline run passes for the latest pushed git commit. - Broken builds should block pull request reviews. - Prevent commits directly into main branch. -### Branch strategy +### Branch Strategy - Release branches should auto trigger the deployment of a build artifact to its target cloud environment. 
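A release-branch trigger of the kind described above might be sketched in Azure Pipelines YAML as follows (the branch naming convention and deployment script are assumptions):

```yaml
# Illustrative: commits to release/* branches automatically trigger
# deployment of the build artifact to its target environment.
trigger:
  branches:
    include:
      - release/*

stages:
  - stage: DeployStaging
    jobs:
      - job: deploy
        steps:
          - script: ./scripts/deploy.sh staging   # hypothetical deployment script
```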
You can find additional guidance on the Azure DevOps documentation site under the [Manage deployments](https://learn.microsoft.com/en-us/azure/devops/repos/git/git-branching-guidance?view=azure-devops#manage-deployments) section @@ -231,7 +230,7 @@ An effective way to identify bugs in your build at a rapid pace is to invest ear In the spirit of transparency and embracing frequent communication across a dev crew, we encourage developers to commit code on a daily cadence. This approach provides visibility to feature progress and accelerates pair programming across the team. Here are some principles to consider: -### Everyone commits to the git repository each day +### Everyone Commits to the Git Repository Each Day - End of day checked-in code should contain unit tests at the minimum. - Run the build locally before checking in to avoid CI pipeline failure saturation. You should verify what caused the error, and try to solve it as soon as possible instead of committing your code. We encourage developers to follow [lean SDLC principles](https://leankit.com/learn/lean/principles-of-lean-development/). @@ -241,19 +240,19 @@ In the spirit of transparency and embracing frequent communication across a dev One of the key goals of build validation is to isolate and identify failures in staging environment(s) and minimize any disruption to live production traffic. Our E2E automated tests should run in an environment which mimics our production environment (as much as possible). This includes consistent software versions, OS, test data volume simulations, network traffic parity with production, etc. -### Test in a clone of production +### Test in a Clone of Production - The production environment should be duplicated into a staging environment (QA and/or Pre-Prod) at a minimum. 
-### Pull request update(s) trigger staged releases +### Pull Request Updates Trigger Staged Releases - New commits related to a pull request should trigger a build / release into an integration environment. The production environment should be fully isolated from this process. -### Promote infrastructure changes across fixed environments +### Promote Infrastructure Changes Across Fixed Environments - Infrastructure as code changes should be tested in an integration environment and promoted to all staging environment(s) then migrated to production with zero downtime for system users. -### Testing in production +### Testing in Production - There are various [approaches](https://medium.com/@copyconstruct/testing-in-production-the-safe-way-18ca102d0ef1) to safely carrying out automated tests for production deployments. Some of these may include: - Feature flagging @@ -264,11 +263,11 @@ One of the key goals of build validation is to isolate and identify failures in Our DevOps workflow should enable developers to get, install and run the latest system executable. Release executable(s) should be auto generated as part of our CI/CD pipeline(s). -### Developers can access latest executable +### Developers Can Access the Latest Executable - The latest system executable is available for all developers on the team. There should be a well-known place where developers can reference the release artifact. -### Release artifact is published for each pull request or merges into main branch +### Release Artifacts Are Published for Each Pull Request or Merge into the Main Branch ## Integration Observability @@ -276,16 +275,16 @@ Applied state changes to the mainline build should be made available and communi We recommend integrating Teams or Slack with CI/CD pipeline runs which helps keep the team continuously plugged into failures and build candidate status(s). 
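One way to wire up such notifications (the webhook variable name and message format are assumptions, not a prescribed setup) is a final pipeline step that posts the run status to a channel's incoming webhook:

```yaml
# Illustrative Azure Pipelines step: post the build outcome to a Teams/Slack
# incoming webhook. TEAMS_WEBHOOK_URL is a hypothetical secret variable.
steps:
  - bash: |
      curl -H 'Content-Type: application/json' \
        -d "{\"text\": \"Build $(Build.BuildNumber) finished: $(Agent.JobStatus)\"}" \
        "$(TEAMS_WEBHOOK_URL)"
    condition: always()   # notify on success and failure alike
    displayName: Notify team channel
```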
-### Continuous integration top level dashboard +### Continuous Integration Top Level Dashboard - Modern CI providers have the capability to consolidate and report build status(s) within a given dashboard. - Your CI dashboard should be able to correlate a build failure with a git commit. -### Build status badge in project readme +### Build Status Badge in the Project Readme - There should be a build status badge included in the root README of the project. -### Build notifications +### Build Notifications - Your CI process should be configured to send notifications to messaging platforms like Teams / Slack once the build completes. We recommend creating a separate channel to help consolidate and isolate these notifications. diff --git a/docs/CI-CD/dev-sec-ops/README.md b/docs/CI-CD/dev-sec-ops/README.md new file mode 100644 index 0000000000..263d7896d6 --- /dev/null +++ b/docs/CI-CD/dev-sec-ops/README.md @@ -0,0 +1,21 @@ +# DevSecOps + +## The Concept of DevSecOps + +DevSecOps or DevOps security is about introducing security earlier in the life cycle of application development (a.k.a. shift-left), thus minimizing the impact of vulnerabilities and bringing security closer to the development team. + +## Why + +By embracing a shift-left mentality, DevSecOps encourages organizations to bridge the gap that often exists between development and security teams to the point where many of the security processes are automated and are effectively handled by the development team. + +## DevSecOps Practices + +This section covers different tools, frameworks and resources that help introduce DevSecOps best practices to your project at early stages of development. +Topics covered: + +1. [Credential Scanning](./secrets-management/credential_scanning.md) - automatically inspecting a project to ensure that no secrets are included in the project's source code. +1. 
[Secrets Rotation](./secrets-management/secrets_rotation.md) - automated process by which the secret, used by the application, is refreshed and replaced by a new secret. +1. [Static Code Analysis](./secrets-management/static-code-analysis.md) - analyze source code or compiled versions of code to help find security flaws. +1. [Penetration Testing](./penetration-testing.md) - a simulated attack against your application to check for exploitable vulnerabilities. +1. [Container Dependencies Scanning](./dependency-and-container-scanning.md) - search for vulnerabilities in container operating systems, language packages and application dependencies. +1. [Evaluation of Open Source Libraries](./evaluate-open-source-software.md) - make it harder to apply open source supply chain attacks by evaluating the libraries you use. diff --git a/docs/continuous-integration/dev-sec-ops/azure-devops/service-connection-security.md b/docs/CI-CD/dev-sec-ops/azure-devops-service-connection-security.md similarity index 81% rename from docs/continuous-integration/dev-sec-ops/azure-devops/service-connection-security.md rename to docs/CI-CD/dev-sec-ops/azure-devops-service-connection-security.md index df14b0d832..8ab0ac59f6 100644 --- a/docs/continuous-integration/dev-sec-ops/azure-devops/service-connection-security.md +++ b/docs/CI-CD/dev-sec-ops/azure-devops-service-connection-security.md @@ -13,10 +13,11 @@ Securing Service Connections can be achieved by using several methods. - [Project permissions](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/service-endpoints#project-permissions---cross-project-sharing-of-service-connections) can be configured to ensure only certain Azure DevOps projects are able to use the Service Connection. After using the above methods, what is secured is **who** can use the Service Connections. -What still *isn't* secured however, is **what** can be done with the Service Connections. 
+What still *isn't* secured, however, is **what** can be done with the Service Connections. + +Because Service Connections have all the necessary permissions in the external services, it is crucial to secure Service Connections so they cannot be misused by accident or by malicious users. +An example of this is an Azure DevOps Pipeline that uses a Service Connection to an Azure Resource Group (or entire subscription) to list all resources and then delete those resources. Without the correct security in place, it could be possible to execute this Pipeline without any validation or reviews being done. -Because Service Connections have all the necessary permissions in the external services, it is crucial to secure Service Connections so they cannot be misused by accident or by malicious users. -An example of this is a Azure DevOps Pipeline that uses a Service Connection to an Azure Resource Group (or entire subscription) to list all resources and then delete those resources. Without the correct security in place, it could be possible to execute this Pipeline, without any validation or reviews being done. ```yaml pool: vmImage: ubuntu-latest @@ -34,14 +35,14 @@ steps: } ``` -## Pipeline Security caveat +## Pipeline Security Caveat -YAML pipelines can be triggered without the need for a pull request, this introduces a security risk. +YAML pipelines can be triggered without the need for a pull request, which introduces a security risk. -In good practice, [Pull Requests](../../../code-reviews/pull-requests.md) and [Code Reviews](../../../code-reviews/README.md) should be used to ensure the code that is being deployed, is being reviewed by a second person and potentially automatically being checked for vulnerabilities and other security issues. -However, YAML Pipelines can be executed without the need for a Pull Request and Code Reviews. This allows the (malicious) user to make changes using the Service Connection which would normally require a reviewer. 
+In good practice, [Pull Requests](../../code-reviews/pull-requests.md) and Code Reviews should be used to ensure the code that is being deployed is reviewed by a second person and potentially automatically checked for vulnerabilities and other security issues. +However, YAML Pipelines can be executed without the need for a Pull Request and Code Reviews. This allows the (malicious) user to make changes using the Service Connection which would normally require a reviewer. -The configuration of *when* a pipeline should be triggered is specified in the YAML Pipeline itself and therefore a pipeline can be configured to execute on changes in a temporary branch. In this temporary branch, any changes made to the pipeline itself will be executed without being reviewed. +The configuration of *when* a pipeline should be triggered is specified in the YAML Pipeline itself and therefore a pipeline can be configured to execute on changes in a temporary branch. In this temporary branch, any changes made to the pipeline itself will be executed without being reviewed. If the given pipeline has been granted [Pipeline-level permissions](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/service-endpoints#pipeline-permissions) to use a specific Service Connection, any command can be executed using that Service Connection, without anyone reviewing the command. Since Service Connections can have a lot of permissions in the external service, executing any pipeline without review could potentially have big consequences. @@ -50,17 +51,17 @@ Since Service Connections can have a lot of permissions in the external service, To prevent accidental mis-use of Service Connections there are several checks that can be configured. These checks are configured on the Service Connection itself and therefore can only be configured by the owner or administrator of that Service Connection. 
A user of a certain YAML Pipeline cannot modify these checks since the checks are not defined in the YAML file itself. Configuration can be done in the Approvals and Checks menu on the Service Connection. -![ApprovalsAndChecks](images/approvals-and-checks.png) +![ApprovalsAndChecks](./images/approvals-and-checks.png) ### Branch Control -By configuring Branch Control on a Service Connection, you can control that the Service Connection can only be used in a YAML Pipeline if the pipeline is running from a specific branch. +By configuring Branch Control on a Service Connection, you can ensure that the Service Connection can only be used in a YAML Pipeline if the pipeline is running from a specific branch. By configuring Branch Control to only allow the main branch (and potentially release branches) you can ensure a YAML Pipeline can only use the Service Connection after any changes to that pipeline have been merged into the main branch, and therefore has passed any Pull Requests checks and Code Reviews. As an additional check, Branch Control can verify if Branch Protections (like required Pull Requests and Code Reviews) are actually configured on the allowed branches. -With Branch Control in place, in combination with Branch Protections, it is not possible anymore to run any commands against a Service Connection without having multiple persons review the commands. Therefore accidental, or malicious, mis-use of the permissions a Service Connection has is not possible anymore. +With Branch Control in place, in combination with Branch Protections, it is no longer possible to run any commands against a Service Connection without having multiple persons review the commands. Therefore accidental, or malicious, misuse of the permissions a Service Connection has is no longer possible. -**Note: When setting a wildcard for the Allowed Branches, anyone could still create a branch matching that wildcard and would be able to use the Service Connection. 
Using [git permissions](https://learn.microsoft.com/en-us/azure/devops/repos/git/require-branch-folders#enforce-permissions) it can be configured so only administrators are allowed to create certain branches, like release branches.* +> **Note:** When setting a wildcard for the Allowed Branches, anyone could still create a branch matching that wildcard and would be able to use the Service Connection. Using [git permissions](https://learn.microsoft.com/en-us/azure/devops/repos/git/require-branch-folders#enforce-permissions) you can configure it so that only administrators are allowed to create certain branches, like release branches. -![BranchControl](images/branch-control.png) +![BranchControl](./images/branch-control.png) diff --git a/docs/continuous-integration/dev-sec-ops/dependency-container-scanning/README.md b/docs/CI-CD/dev-sec-ops/dependency-and-container-scanning.md similarity index 100% rename from docs/continuous-integration/dev-sec-ops/dependency-container-scanning/README.md rename to docs/CI-CD/dev-sec-ops/dependency-and-container-scanning.md diff --git a/docs/continuous-integration/dev-sec-ops/evaluate-oss/README.md b/docs/CI-CD/dev-sec-ops/evaluate-open-source-software.md similarity index 100% rename from docs/continuous-integration/dev-sec-ops/evaluate-oss/README.md rename to docs/CI-CD/dev-sec-ops/evaluate-open-source-software.md diff --git a/docs/continuous-integration/dev-sec-ops/azure-devops/images/approvals-and-checks.png b/docs/CI-CD/dev-sec-ops/images/approvals-and-checks.png similarity index 100% rename from docs/continuous-integration/dev-sec-ops/azure-devops/images/approvals-and-checks.png rename to docs/CI-CD/dev-sec-ops/images/approvals-and-checks.png diff --git a/docs/continuous-integration/dev-sec-ops/azure-devops/images/branch-control.png b/docs/CI-CD/dev-sec-ops/images/branch-control.png similarity index 100% rename from docs/continuous-integration/dev-sec-ops/azure-devops/images/branch-control.png rename to
docs/CI-CD/dev-sec-ops/images/branch-control.png diff --git a/docs/continuous-integration/dev-sec-ops/penetration-testing/README.md b/docs/CI-CD/dev-sec-ops/penetration-testing.md similarity index 100% rename from docs/continuous-integration/dev-sec-ops/penetration-testing/README.md rename to docs/CI-CD/dev-sec-ops/penetration-testing.md diff --git a/docs/continuous-delivery/secrets-management/README.md b/docs/CI-CD/dev-sec-ops/secrets-management/README.md similarity index 79% rename from docs/continuous-delivery/secrets-management/README.md rename to docs/CI-CD/dev-sec-ops/secrets-management/README.md index 1241b46cbc..5f17a4f5fd 100644 --- a/docs/continuous-delivery/secrets-management/README.md +++ b/docs/CI-CD/dev-sec-ops/secrets-management/README.md @@ -1,12 +1,24 @@ # Secrets Management -Secrets Management refers to the way in which we protect configuration settings and other sensitive data which, if -made public, would allow unauthorized access to resources. Examples of secrets are usernames, passwords, api keys, SAS -tokens etc. +Secret management refers to the tools and practices used to manage digital authentication credentials (like API keys, tokens, passwords, and certificates). These secrets are used to protect access to sensitive data and services, making their management critical for security. We should assume any repo we work on may go public at any time and protect our secrets, even if the repo is initially private. +## Importance of Secrets Management + +In modern software development, applications often need to interact with other software components, APIs, and services. These interactions often require authentication, which is typically handled using secrets. If these secrets are not managed properly, they can be exposed, leading to potential security breaches. + +## Best Practices for Secrets Management + +1. **Centralized Secret Storage:** Store all secrets in a centralized, encrypted location. 
This reduces the risk of secrets being lost or exposed. +1. **Access Control:** Implement strict access control policies. Only authorized entities should have access to secrets. +1. **Rotation of Secrets:** Regularly change secrets to reduce the risk if a secret is compromised. +1. **Audit Trails:** Keep a record of when and who accessed which secret. This can help in identifying suspicious activities. +1. **Automated Secret Management:** Automate the processes of secret creation, rotation, and deletion. This reduces the risk of human error. + +Remember, the goal of secret management is to protect sensitive information from unauthorized access and potential security threats. + ## General Approach The general approach is to keep secrets in separate configuration files that are not checked in @@ -19,7 +31,7 @@ the Azure CLI to do the same is a useful time-saving utility. See [az webapp con It's best practice to maintain separate secrets configurations for each environment that you run. e.g. dev, test, prod, local etc -The [secrets-per-branch recipe](../gitops/secret-management/azure-devops-secret-management-per-branch.md) describes a simple way to manage separate secrets configurations for each environment. +The [secrets-per-branch recipe](../../gitops/secret-management/azure-devops-secret-management-per-branch.md) describes a simple way to manage separate secrets configurations for each environment. > Note: even if the secret was only pushed to a feature branch and never merged, it's still a part of the git history. Follow [these instructions](https://help.github.com/en/github/authenticating-to-github/removing-sensitive-data-from-a-repository) to remove any sensitive data and/or regenerate any keys and other sensitive information added to the repo. If a key or secret made it into the code base, rotate the key/secret so that it's no longer active @@ -53,7 +65,7 @@ These techniques make the loading of secrets transparent to the developer. 
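As a minimal sketch of this approach (file and variable names below are illustrative, not from any particular project), a local, git-ignored env file can hold the secrets and be sourced before running the app:

```shell
# keep plain-text secrets in a local file that git never sees
echo ".env.local" >> .gitignore
cat > .env.local <<'EOF'
export STORAGE_CONNECTION_STRING="UseDevelopmentStorage=true"
EOF

# load the secrets into the environment before starting the application
. ./.env.local
echo "loaded: ${STORAGE_CONNECTION_STRING:?not set}"
```

The same sourcing step works for CI agents, provided the file is delivered to them through a secure channel rather than the repository.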
For .NET SDK (version 2.0 or higher) we have `dotnet user-secrets`, a tool provided by the .NET SDK that allows you to manage and protect sensitive information, such as API keys, connection strings, and other secrets, during development. The secrets are stored securely on your machine and can be accessed by your .NET applications. ```shell -# Initialize dotnet secret +# Initialize dotnet user-secrets dotnet user-secrets init # Adding secret @@ -176,4 +188,4 @@ The following steps lay out a clear pathway to creating new secrets and then uti ### Validation -Automated credential scanning can be performed on the code regardless of the programming language. Read more about it [here](../../continuous-integration/dev-sec-ops/secret-management/credential_scanning.md) +Automated [credential scanning](./credential_scanning.md) can be performed on the code regardless of the programming language. diff --git a/docs/continuous-integration/dev-sec-ops/secret-management/credential_scanning.md b/docs/CI-CD/dev-sec-ops/secrets-management/credential_scanning.md similarity index 89% rename from docs/continuous-integration/dev-sec-ops/secret-management/credential_scanning.md rename to docs/CI-CD/dev-sec-ops/secrets-management/credential_scanning.md index bdc3171417..fa7bf4db1a 100644 --- a/docs/continuous-integration/dev-sec-ops/secret-management/credential_scanning.md +++ b/docs/CI-CD/dev-sec-ops/secrets-management/credential_scanning.md @@ -2,11 +2,11 @@ Credential scanning is the practice of automatically inspecting a project to ensure that no secrets are included in the project's source code. Secrets include database passwords, storage connection strings, admin logins, service principals, etc. -## Why Credential scanning +## Why Credential Scanning -Including secrets in a project's source code is a significant risk, as it might make those secrets available to unwanted parties.
Even if it seems that the source code is accessible to the same people who are privy to the secrets, this situation is likely to change as the project grows. Spreading secrets in different places makes them harder to manage, access control, and revoke efficiently. Secrets that are committed to source control are also harder to discard of, since they will persist in the source's history. +Including secrets in a project's source code is a significant risk, as it might make those secrets available to unwanted parties. Even if it seems that the source code is accessible to the same people who are privy to the secrets, this situation is likely to change as the project grows. Spreading secrets in different places makes them harder to manage, control access to, and revoke efficiently. Secrets that are committed to source control are also harder to dispose of, since they will persist in the source's history. Another consideration is that coupling the project's code to its infrastructure and deployment specifics is limiting and considered a bad practice. From a software design perspective, the code should be independent of the runtime configuration that will be used to run it, and that runtime configuration includes secrets. - As such, there should be a clear boundary between code and secrets: secrets should be managed outside of the source code (read more [here](../../../continuous-delivery/secrets-management/README.md)) and credential scanning should be employed to ensure that this boundary is never violated. +As such, there should be a clear boundary between code and secrets: secrets should be managed outside of the source code and credential scanning should be employed to ensure that this boundary is never violated.
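To illustrate the idea at its simplest (the directory, file, and patterns below are invented for the example; real scanners such as detect-secrets use tuned regexes and entropy heuristics):

```shell
# simulate a source tree containing a leaked key
mkdir -p scan-demo
echo 'connectionString = "AccountKey=abc123supersecret"' > scan-demo/app.cfg

# naive scan: flag likely credential patterns so the build can be failed
found=0
if grep -rnE '(AccountKey|[Pp]assword|[Ss]ecret)[^=]*=' scan-demo; then
  found=1
  echo "potential secret found - a CI gate would fail the build here"
fi
```

A real pipeline would run the scanner on every push and on a schedule, and fail fast on any hit.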
## Applying Credential Scanning diff --git a/docs/continuous-integration/dev-sec-ops/secret-management/recipes/detect-secrets-ado.md b/docs/CI-CD/dev-sec-ops/secrets-management/recipes/detect-secrets-ado.md similarity index 81% rename from docs/continuous-integration/dev-sec-ops/secret-management/recipes/detect-secrets-ado.md rename to docs/CI-CD/dev-sec-ops/secrets-management/recipes/detect-secrets-ado.md index 0ec0b2c155..8b884ee135 100644 --- a/docs/continuous-integration/dev-sec-ops/secret-management/recipes/detect-secrets-ado.md +++ b/docs/CI-CD/dev-sec-ops/secrets-management/recipes/detect-secrets-ado.md @@ -1,4 +1,4 @@ -# Running detect-secrets in Azure DevOps Pipelines +# Running `detect-secrets` in Azure DevOps Pipelines ## Overview @@ -12,13 +12,13 @@ Proposed Azure DevOps Pipeline contains multiple steps described below: 1. Install detect-secrets using pip 1. Run detect-secrets tool 1. Publish results in the Pipeline Artifact - > NOTE: It's an optional step, but for future investigation .json file with results may be helpful. + > **Note:** It's an optional step, but for future investigation, the .json file with results may be helpful. 1. Analyzing detect-secrets results - > NOTE: This step does a simple analysis of the .json file. If any secret has been detected, then break the build with exit code 1. + > **Note:** This step does a simple analysis of the .json file. If any secret has been detected, then break the build with exit code 1. -> NOTE: The below example has 2 jobs: for Linux and Windows agents. You do not have to use both jobs - just adjust the pipeline to your needs. +> **Note:** The below example has two jobs: one for Linux agents and one for Windows agents. You do not have to use both jobs - just adjust the pipeline to your needs. > -> NOTE: Windows example does not use the latest version of detect-secrets. It is related to the bug in the detect-secret tool (see more in [Issue#452](https://github.com/Yelp/detect-secrets/issues/452)).
It is highly recommended to monitor the fix for the issue and use the latest version if possible by removing version tag `==1.0.3` in the pip install command. +> **Note:** The Windows example does not use the latest version of detect-secrets due to a bug in the detect-secrets tool (see more in [Issue#452](https://github.com/Yelp/detect-secrets/issues/452)). It is highly recommended to monitor the fix for the issue and use the latest version when possible by removing the version tag `==1.0.3` from the pip install command. ```yaml trigger: diff --git a/docs/continuous-integration/dev-sec-ops/secret-management/recipes/detect-secrets.md b/docs/CI-CD/dev-sec-ops/secrets-management/recipes/detect-secrets.md similarity index 96% rename from docs/continuous-integration/dev-sec-ops/secret-management/recipes/detect-secrets.md rename to docs/CI-CD/dev-sec-ops/secrets-management/recipes/detect-secrets.md index 33216aa237..dff07a55cf 100644 --- a/docs/continuous-integration/dev-sec-ops/secret-management/recipes/detect-secrets.md +++ b/docs/CI-CD/dev-sec-ops/secrets-management/recipes/detect-secrets.md @@ -1,4 +1,4 @@ -# Credential Scanning Tool: detect-secrets +# Credential Scanning Tool: `detect-secrets` ## Background @@ -28,7 +28,7 @@ python3 -m pip install detect-secrets detect-secrets scan > .secrets.baseline ``` -## Pre-commit hook +## Pre-Commit Hook It is recommended to use `detect-secrets` in your development environment as a Git pre-commit hook.
@@ -45,7 +45,7 @@ repos: args: ['--baseline', '.secrets.baseline'] ``` -## Usage in CI pipelines +## Usage in CI Pipelines ```sh # backup the list of known secrets diff --git a/docs/continuous-integration/dev-sec-ops/secret-management/secrets_rotation.md b/docs/CI-CD/dev-sec-ops/secrets-management/secrets_rotation.md similarity index 100% rename from docs/continuous-integration/dev-sec-ops/secret-management/secrets_rotation.md rename to docs/CI-CD/dev-sec-ops/secrets-management/secrets_rotation.md diff --git a/docs/continuous-integration/dev-sec-ops/secret-management/static-code-analysis.md b/docs/CI-CD/dev-sec-ops/secrets-management/static-code-analysis.md similarity index 100% rename from docs/continuous-integration/dev-sec-ops/secret-management/static-code-analysis.md rename to docs/CI-CD/dev-sec-ops/secrets-management/static-code-analysis.md diff --git a/docs/continuous-delivery/gitops/deploying/README.md b/docs/CI-CD/gitops/deploying-with-gitops.md similarity index 94% rename from docs/continuous-delivery/gitops/deploying/README.md rename to docs/CI-CD/gitops/deploying-with-gitops.md index fdf9b85e66..11ca335384 100644 --- a/docs/continuous-delivery/gitops/deploying/README.md +++ b/docs/CI-CD/gitops/deploying-with-gitops.md @@ -9,7 +9,7 @@ GitOps simply allows faster deployments by having git repositories in the center offering a clear audit trail via git commits and no direct environment access. 
Read more on [Why should I use GitOps?](https://www.gitops.tech/#why-should-i-use-gitops) The below diagram compares traditional CI/CD vs GitOps workflow: -![push based vs pull based deployments](images/GitopsWorflowVsTraditionalPush.jpg) +![push based vs pull based deployments](./images/GitopsWorflowVsTraditionalPush.jpg) ## Tools for GitOps @@ -19,7 +19,7 @@ Some popular GitOps frameworks for Kubernetes backed by [CNCF](https://landscape - [Argo CD](https://argo-cd.readthedocs.io/en/stable/) - [Rancher Fleet](https://fleet.rancher.io/) -## Deploying using GitOps +## Deploying Using GitOps GitOps with Flux v2 can be enabled in Azure Kubernetes Service (AKS) managed clusters or Azure Arc-enabled Kubernetes connected clusters as a cluster extension. After the microsoft.flux cluster extension is installed, you can create one or more fluxConfigurations resources that sync your Git repository sources to the cluster and reconcile the cluster to the desired state. With GitOps, you can use your Git repository as the source of truth for cluster configuration and application deployment. diff --git a/docs/continuous-delivery/recipes/github-workflows/README.md b/docs/CI-CD/gitops/github-workflows.md similarity index 87% rename from docs/continuous-delivery/recipes/github-workflows/README.md rename to docs/CI-CD/gitops/github-workflows.md index 94af923278..dc701498be 100644 --- a/docs/continuous-delivery/recipes/github-workflows/README.md +++ b/docs/CI-CD/gitops/github-workflows.md @@ -2,21 +2,21 @@ A workflow is a configurable automated process made up of one or more jobs where each of these jobs can be an action in GitHub. Currently, a YAML file format is supported for defining a workflow in GitHub. -Additional information on GitHub actions and GitHub Workflows in the links posted in the [references](#references) section below. +Additional information on GitHub actions and GitHub Workflows in the links posted in the [resources](#resources) section below. 
-## Workflow Per Environment +## Workflow per Environment The general approach is to have one pipeline, where the code is built, tested and deployed, and the artifact is then promoted to the next environment, eventually to be deployed into production. There are multiple ways in GitHub that an environment setup can be achieved. One way it can be done is to have one workflow for multiple environments, but the complexity increases as additional processes and jobs are added to a workflow, which does not mean it cannot be done for small pipelines. The plus point of having one workflow is that, when an artifact flows from one environment to another, the state and environment values can be passed easily between the deployment environments. -![Workflow-Designs-Dependent-Workflows](images/Workflow-Designs-Dependent-Workflows.png) +![Workflow-Designs-Dependent-Workflows](./images/Workflow-Designs-Dependent-Workflows.png) One way to get around the complexity of a single workflow is to have separate workflows for different environments, making sure that only the artifacts created and validated are promoted from one environment to another, as well as keeping each workflow small enough to debug any issues seen in any of the workflows. In this case, the state and environment values need to be passed from one deployment environment to another. Multiple workflows also help to keep the deployments to the environments independent, thus reducing the time to deploy and surfacing issues earlier in the process. Also, since the environments are independent of each other, any failures in deploying to one environment do not block deployments to other environments. One tradeoff of this method is that with different workflows for each environment, the maintenance increases as the complexity of the workflows increases over time.
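A per-environment workflow can be sketched roughly as below (repository layout, script path, and environment name are hypothetical; the artifact hand-off between environments would be adapted to the project):

```yaml
# deploy-staging.yml - one workflow per environment (names are illustrative)
name: deploy-staging
on:
  push:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    environment: staging        # ties the job to the GitHub "staging" environment
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: ./scripts/build.sh # hypothetical build script
      - name: Upload artifact   # promoted artifact consumed by the next environment
        uses: actions/upload-artifact@v4
        with:
          name: app-package
          path: dist/
```

A sibling `deploy-production.yml` would follow the same shape, downloading the validated artifact instead of rebuilding it.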
-![Workflow-Designs-Independent-Workflows](images/Workflow-Designs-Independent-Workflows.png) +![Workflow-Designs-Independent-Workflows](./images/Workflow-Designs-Independent-Workflows.png) -## References +## Resources - [GitHub Actions](https://docs.github.com/en/actions) - [GitHub Workflows](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions) diff --git a/docs/continuous-delivery/gitops/deploying/images/GitopsWorflowVsTraditionalPush.jpg b/docs/CI-CD/gitops/images/GitopsWorflowVsTraditionalPush.jpg similarity index 100% rename from docs/continuous-delivery/gitops/deploying/images/GitopsWorflowVsTraditionalPush.jpg rename to docs/CI-CD/gitops/images/GitopsWorflowVsTraditionalPush.jpg diff --git a/docs/continuous-delivery/recipes/github-workflows/images/Workflow-Designs-Dependent-Workflows.png b/docs/CI-CD/gitops/images/Workflow-Designs-Dependent-Workflows.png similarity index 100% rename from docs/continuous-delivery/recipes/github-workflows/images/Workflow-Designs-Dependent-Workflows.png rename to docs/CI-CD/gitops/images/Workflow-Designs-Dependent-Workflows.png diff --git a/docs/continuous-delivery/recipes/github-workflows/images/Workflow-Designs-Independent-Workflows.png b/docs/CI-CD/gitops/images/Workflow-Designs-Independent-Workflows.png similarity index 100% rename from docs/continuous-delivery/recipes/github-workflows/images/Workflow-Designs-Independent-Workflows.png rename to docs/CI-CD/gitops/images/Workflow-Designs-Independent-Workflows.png diff --git a/docs/continuous-delivery/gitops/secret-management/README.md b/docs/CI-CD/gitops/secret-management/README.md similarity index 98% rename from docs/continuous-delivery/gitops/secret-management/README.md rename to docs/CI-CD/gitops/secret-management/README.md index 7734c977de..def78f7b89 100644 --- a/docs/continuous-delivery/gitops/secret-management/README.md +++ b/docs/CI-CD/gitops/secret-management/README.md @@ -1,4 +1,4 @@ -# Secret management with GitOps +# Secrets Management 
with GitOps GitOps projects have git repositories in the center that are considered a source of truth for managing both infrastructure and application. This infrastructure and application will require secured access to other resources of the system through secrets. Committing clear-text secrets into git repositories is unacceptable even if the repositories are private to your team and organization. Teams need a secure way to handle secrets when using GitOps. @@ -10,7 +10,7 @@ There are many ways to manage secrets with GitOps and at high level can be categ > **TLDR**: Referencing secrets in an external key vault is the recommended approach. It is easier to orchestrate secret rotation and more scalable with multiple clusters and/or teams. -## Encrypted secrets in git repositories +## Encrypted Secrets in Git Repositories In this approach, developers manually encrypt secrets using a public key, and the secrets can only be decrypted by the custom Kubernetes controller running in the target cluster. Some popular tools for this approach are [Bitnami Sealed Secrets](https://github.com/bitnami-labs/sealed-secrets) and [Mozilla SOPS](https://github.com/mozilla/sops) @@ -50,7 +50,7 @@ Some of the key points of using SOPS are: - The public key is sufficient for creating brand new files. The secret key is required for decrypting and editing existing files because SOPS computes a MAC on all values. When using the public key solely to add or remove a field, the whole file should be deleted and recreated - Supports several types of keys that can be used in both connected and disconnected state. A secret can have a list of keys and will try to decrypt with all of them.
-## Reference to secrets stored in an external key vault (recommended) +## Reference to Secrets Stored in an External Key Vault (Recommended) This approach relies on a key management system like [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview) to hold the secrets, and the git manifests in the repositories reference the key vault secrets. Developers do not perform any cryptographic operations with files in repositories. Kubernetes operators running in the target cluster are responsible for pulling the secrets from the key vault and making them available either as Kubernetes secrets or as secret volumes mounted to the pod. @@ -64,7 +64,7 @@ All the below tools share the following: - Easily scalable with multi-cluster and larger teams - Both solutions support either Azure Active Directory (Azure AD) [service principal](https://learn.microsoft.com/en-us/azure/active-directory/develop/app-objects-and-service-principals) or [managed identity](https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview) for [authentication with the Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/authentication).
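As one concrete shape of this pattern, an External Secrets Operator manifest referencing a key vault secret might look like the sketch below (store, target, and secret names are hypothetical):

```yaml
# ExternalSecret: the manifest committed to git holds only references,
# never the secret values themselves
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secrets
spec:
  refreshInterval: 1h          # how often to re-sync from the key vault
  secretStoreRef:
    name: azure-kv-store       # hypothetical SecretStore pointing at the vault
    kind: SecretStore
  target:
    name: app-secrets          # the K8s secret created in the cluster
  data:
    - secretKey: db-password
      remoteRef:
        key: db-password       # name of the secret in the key vault
```

The operator reconciles this resource and keeps the in-cluster secret in sync with the vault, so rotation happens without touching the git repository.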
-For secret rotation ideas, see [Secrets Rotation on Environment Variables and Mounted Secrets](secret-rotation-in-pods.md) +For secret rotation ideas, see [Secrets Rotation on Environment Variables and Mounted Secrets](./secret-rotation-in-pods.md) For how to authenticate private container registries with a service principal see: [Authenticated Private Container Registry](#authenticated-private-container-registry) @@ -119,7 +119,7 @@ Disadvantages: - The GitOps repo must contain the name of the Key Vault within the SecretStore / ClusterSecretStore or a ConfigMap linking to it - Must create secrets as K8s secrets -## Important Links +## Resources - [Sealed Secrets with Flux v2](https://toolkit.fluxcd.io/guides/sealed-secrets/) - [Mozilla SOPS with Flux v2](https://toolkit.fluxcd.io/guides/mozilla-sops/) diff --git a/docs/continuous-delivery/gitops/secret-management/azure-devops-secret-management-per-branch.md b/docs/CI-CD/gitops/secret-management/azure-devops-secret-management-per-branch.md similarity index 87% rename from docs/continuous-delivery/gitops/secret-management/azure-devops-secret-management-per-branch.md rename to docs/CI-CD/gitops/secret-management/azure-devops-secret-management-per-branch.md index ca52e77ea6..722b120055 100644 --- a/docs/continuous-delivery/gitops/secret-management/azure-devops-secret-management-per-branch.md +++ b/docs/CI-CD/gitops/secret-management/azure-devops-secret-management-per-branch.md @@ -6,7 +6,7 @@ When using [Azure DevOps Pipelines](https://azure.microsoft.com/en-us/services/d - *Pipeline variables are global shared state.* This can lead to confusing situations and hard to debug problems when developers make concurrent changes to the pipeline variables which may override each other. Having a single global set of pipeline variables also makes it impossible for secrets to vary per environment (e.g. 
when using a branch-based deployment model where 'master' deploys using the production secrets, 'development' deploys using the staging secrets, and so forth). -A solution to these limitations is to manage secrets in the Git repository jointly with the project's source code. As described in [secrets management](README.md), don't check secrets into the repository in plain text. Instead we can add an encrypted version of our secrets to the repository and enable our CI/CD agents and developers to decrypt the secrets for local usage with some pre-shared key. This gives us the best of both worlds: a secure storage for secrets as well as side-by-side management of secrets and code. +A solution to these limitations is to manage secrets in the Git repository jointly with the project's source code. As described in [secrets management](./README.md), don't check secrets into the repository in plain text. Instead we can add an encrypted version of our secrets to the repository and enable our CI/CD agents and developers to decrypt the secrets for local usage with some pre-shared key. This gives us the best of both worlds: a secure storage for secrets as well as side-by-side management of secrets and code. 
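The encrypt-before-commit idea can be sketched with `openssl` (file names and the secret value are illustrative; the pre-shared key would be distributed through a secure channel and never committed):

```shell
# a plain-text secrets file that must never be committed
echo "DB_PASSWORD=example-value" > secrets.env

# generate a strong pre-shared key (kept out of the repository)
openssl rand -hex 32 > encryption.key

# encrypt; the encrypted copy is safe to commit next to the code
openssl enc -aes-256-cbc -pbkdf2 -salt -in secrets.env -out secrets.env.enc -pass file:encryption.key

# CI agents and developers holding the key decrypt for local use
openssl enc -d -aes-256-cbc -pbkdf2 -in secrets.env.enc -out secrets.decrypted -pass file:encryption.key
```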
```sh # first, make sure that we never commit our plain text secrets and generate a strong encryption key diff --git a/docs/continuous-delivery/gitops/secret-management/secret-rotation-in-pods.md b/docs/CI-CD/gitops/secret-management/secret-rotation-in-pods.md similarity index 88% rename from docs/continuous-delivery/gitops/secret-management/secret-rotation-in-pods.md rename to docs/CI-CD/gitops/secret-management/secret-rotation-in-pods.md index 215ac8e895..0623f4896f 100644 --- a/docs/continuous-delivery/gitops/secret-management/secret-rotation-in-pods.md +++ b/docs/CI-CD/gitops/secret-management/secret-rotation-in-pods.md @@ -1,8 +1,8 @@ -# Secrets rotation of environment variables and mounted secrets in pods +# Secrets Rotation of Environment Variables and Mounted Secrets in Pods This document covers some ways you can do secret rotation with environment variables and mounted secrets in Kubernetes pods -## Mapping Secrets via secretKeyRef with environment variables +## Mapping Secrets via secretKeyRef with Environment Variables If we map a K8s native secret via a `secretKeyRef` into an environment variable and we rotate keys, the environment variable is not updated even though the K8s native secret has been updated. We need to restart the Pod so changes get populated. [Reloader](https://github.com/stakater/Reloader) solves this issue with a K8S controller. @@ -17,7 +17,7 @@ If we map a K8s native secret via a `secretKeyRef` into an environment variable ... ``` -## Mapping Secrets via volumeMounts (ESO way) +## Mapping Secrets via volumeMounts (ESO Way) If we map a K8s native secret via a volume mount and we rotate keys, the file gets updated. The application then needs to be able to pick up the changes without a restart (most likely requiring custom logic in the application to support this). If it can, no restart of the application is required. @@ -34,7 +34,7 @@ If we map a K8s native secret via a volume mount and we rotate keys the file get ...
``` -## Mapping Secrets via volumeMounts (AKVP SSCSID way) +## Mapping Secrets via volumeMounts (AKVP SSCSID Way) SSCSID focuses on mounting external secrets into the CSI. Thus if we rotate keys, the file gets updated. The application then needs to be able to pick up the changes without a restart (most likely requiring custom logic in the application to support this). If it can, no restart of the application is required. diff --git a/docs/continuous-delivery/images/blue_green.png b/docs/CI-CD/images/blue_green.png similarity index 100% rename from docs/continuous-delivery/images/blue_green.png rename to docs/CI-CD/images/blue_green.png diff --git a/docs/continuous-delivery/images/canary_release.png b/docs/CI-CD/images/canary_release.png similarity index 100% rename from docs/continuous-delivery/images/canary_release.png rename to docs/CI-CD/images/canary_release.png diff --git a/docs/continuous-delivery/images/environments.png b/docs/CI-CD/images/environments.png similarity index 100% rename from docs/continuous-delivery/images/environments.png rename to docs/CI-CD/images/environments.png diff --git a/docs/continuous-delivery/images/example_release_flow.png b/docs/CI-CD/images/example_release_flow.png similarity index 100% rename from docs/continuous-delivery/images/example_release_flow.png rename to docs/CI-CD/images/example_release_flow.png diff --git a/docs/continuous-integration/ci-in-data-science/working-with-notebooks/assets/repository-properties.png b/docs/CI-CD/images/repository-properties.png similarity index 100% rename from docs/continuous-integration/ci-in-data-science/working-with-notebooks/assets/repository-properties.png rename to docs/CI-CD/images/repository-properties.png diff --git a/docs/CI-CD/recipes/.pages b/docs/CI-CD/recipes/.pages new file mode 100644 index 0000000000..af7ff8a0a7 --- /dev/null +++ b/docs/CI-CD/recipes/.pages @@ -0,0 +1,6 @@ +nav: + - CD on low code solutions: cd-on-low-code-solutions.md + - CI pipeline for better documentation:
ci-pipeline-for-better-documentation.md + - CI with jupyter notebooks: ci-with-jupyter-notebooks.md + - GitHub actions: github-actions + - ... diff --git a/docs/continuous-delivery/low-code-solutions/README.md b/docs/CI-CD/recipes/cd-on-low-code-solutions.md similarity index 96% rename from docs/continuous-delivery/low-code-solutions/README.md rename to docs/CI-CD/recipes/cd-on-low-code-solutions.md index f3ebc2f22f..747ccff53a 100644 --- a/docs/continuous-delivery/low-code-solutions/README.md +++ b/docs/CI-CD/recipes/cd-on-low-code-solutions.md @@ -1,4 +1,4 @@ -# Continuous delivery on low-code and no-code solutions +# Continuous Delivery on Low-Code and No-Code Solutions Low-code and no-code platforms have taken a spot in a wide variety of Business Solutions involving process automation, AI models, Bots, Business Applications and Business Intelligence. The scenarios enabled by these platforms are constantly evolving and opening a spot for productive roles. This is exactly why bringing more professional tools, such as controlled and automated delivery, to their development has become necessary. @@ -10,11 +10,11 @@ Environments are spaces where Power Platform Solutions exists. They store, manag ![image](../images/environments.png) -### Environments considerations +### Environments Considerations Whenever an environment has been created, its resources can only be accessed by users within the same tenant (an Azure Active Directory tenant). When you create an app in an environment, that app can only interact with data sources that are also deployed in that same environment; this includes connections, flows, and Dataverse databases. This is an important consideration when dealing with a CD process. -## Deployment strategy +## Deployment Strategy With three environments already created to represent the stages of the deployment, the goal now is to automate the deployment from one environment to another.
Each environment will require the creation of its own solution: business logic and data. @@ -36,10 +36,9 @@ Third and final step will import the solution into the production environment, t The most used tools to complete this process are: -* [Power Platform Build Tools](https://marketplace.visualstudio.com/items?itemName=microsoft-IsvExpTools.PowerPlatform-BuildTools). - +* [Power Platform Build Tools](https://marketplace.visualstudio.com/items?itemName=microsoft-IsvExpTools.PowerPlatform-BuildTools) + * There is also a non-graphical tool that can be used to work with this CD process: the [Power CLI](https://aka.ms/PowerAppsCLI) tool. -## References +## Resources [Application lifecycle management with Microsoft Power Platform](https://learn.microsoft.com/en-us/power-platform/alm/) diff --git a/docs/continuous-integration/markdown-linting/README.md b/docs/CI-CD/recipes/ci-pipeline-for-better-documentation.md similarity index 90% rename from docs/continuous-integration/markdown-linting/README.md rename to docs/CI-CD/recipes/ci-pipeline-for-better-documentation.md index 5f6679c401..90fd1e1360 100644 --- a/docs/continuous-integration/markdown-linting/README.md +++ b/docs/CI-CD/recipes/ci-pipeline-for-better-documentation.md @@ -1,11 +1,10 @@ -# CI Pipeline for better documentation +# CI Pipeline for Better Documentation ## Introduction Most projects start with spikes, where developers and analysts produce lots of documentation. -Sometimes, these documents don't have a standard and each team member writes them accordingly with their preference. Add to that -the time a reviewer will spend confirming grammar, searching for typos or non-inclusive language. +Sometimes, these documents don't have a standard and each team member writes them according to their own preference. Add to that the time a reviewer will spend confirming grammar, searching for typos or non-inclusive language. This pipeline helps address that!
@@ -21,7 +20,7 @@ ones We have been using this pipeline for more than one year in different engagements and always received great feedback from the customers! -## How does it work +## How Does it Work To start using this pipeline: @@ -33,6 +32,6 @@ To start using this pipeline: the name of the `.azdo` folder. 1. Create the pipeline in Azure DevOps or GitHub -## References +## Resources [Markdown Code Reviews in the Engineering Fundamentals Playbook](https://microsoft.github.io/code-with-engineering-playbook/code-reviews/recipes/markdown/#code-review-checklist) diff --git a/docs/continuous-integration/ci-in-data-science/working-with-notebooks/README.md b/docs/CI-CD/recipes/ci-with-jupyter-notebooks.md similarity index 97% rename from docs/continuous-integration/ci-in-data-science/working-with-notebooks/README.md rename to docs/CI-CD/recipes/ci-with-jupyter-notebooks.md index c8f8d9b199..4acb76b833 100644 --- a/docs/continuous-integration/ci-in-data-science/working-with-notebooks/README.md +++ b/docs/CI-CD/recipes/ci-with-jupyter-notebooks.md @@ -1,16 +1,15 @@ -# Data Science Pipeline +# CI with Jupyter Notebooks As Azure DevOps doesn't allow code reviewers to comment directly in Jupyter Notebooks, Data Scientists (DSs) have to convert the notebooks to scripts before they commit and push these files to the repository. This document aims to automate this process in Azure DevOps, so the DSs don't need to execute anything locally. -## Problem statement +## Problem Statement A Data Science repository has this folder structure: ```bash - .
├── notebooks │   ├── Machine Learning Experiments - 00.ipynb @@ -22,7 +21,6 @@ A Data Science repository has this folder structure:    ├── Machine Learning Experiments - 01.py    ├── Machine Learning Experiments - 02.py    └── Machine Learning Experiments - 03.py - ``` The Python files are needed to allow Pull Request reviewers to add comments to the notebooks; they can add comments @@ -41,7 +39,7 @@ We can add a pipeline with the following steps to the repository to run in `ipyn 1. Go to the *Project Settings* -> *Repositories* -> *Security* -> *User Permissions* 1. Add the *Build Service* in *Users* the permission to *Contribute* - ![Contribute](assets/repository-properties.png) + ![Contribute](../images/repository-properties.png) 1. Create a new pipeline. In the newly created pipeline we add: diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/README.md b/docs/CI-CD/recipes/github-actions/runtime-variables/README.md similarity index 96% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/README.md rename to docs/CI-CD/recipes/github-actions/runtime-variables/README.md index f44adfab89..66eea07075 100644 --- a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/README.md +++ b/docs/CI-CD/recipes/github-actions/runtime-variables/README.md @@ -16,13 +16,13 @@ We assume that you, as a CI/CD engineer, want to inject environment variables or Many integration or end-to-end workflows require specific environment variables that are only available at runtime. For example, a workflow might be doing the following: -![Workflow Diagram](images/workflow-diagram.png) +![Workflow Diagram](./images/workflow-diagram.png) In this situation, testing the pipeline is extremely difficult without having to make external calls to the resource.
In many cases, making external calls to the resource can be expensive or time-consuming, significantly slowing down inner loop development. Azure DevOps, as an example, offers a way to define pipeline variables on a manual trigger: -![AzDo Example](images/AzDoExample.PNG) +![AzDo Example](./images/AzDoExample.PNG) GitHub Actions does not do so yet. @@ -39,8 +39,8 @@ Out of Scope: - While the solution is obviously extensible using shell scripting or any other means of creating variables, this solution serves well as the proof of the basic concept. No such scripting is provided in this guide. - Additionally, teams may wish to formalize this process using a PR Template that has an additional section for the variables being provided. This is not however included in this guide. -> Security Warning: -> **This is NOT for injecting secrets** as the commit messages and PR body can be retrieved by a third party, are stored in `git log`, and can otherwise be read by a malicious individual using a variety of tools. Rather, this is for testing a workflow that needs simple variables to be injected into it, as above. +> Security Warning: +> **This is NOT for injecting secrets** as the commit messages and PR body can be retrieved by a third party, are stored in `git log`, and can otherwise be read by a malicious individual using a variety of tools. Rather, this is for testing a workflow that needs simple variables to be injected into it, as above. > **If you need to retrieve secrets or sensitive information**, use the [GitHub Action for Azure Key Vault](https://github.com/marketplace/actions/get-secrets-from-azure-key-vault) or some other similar secret storage and retrieval service. ## Commit Message Variables @@ -83,8 +83,6 @@ jobs: run: echo "Flag is available and true" ``` -Available as a .YAML [here](examples/commit-example.yaml). 
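As an aside, the conditional behind `COMMIT_VAR` in the workflow above can be sketched outside of CI. This is a hypothetical stand-in for the shell step in the YAML, just to make the flag logic concrete:

```python
def commit_flag(commit_message: str, marker: str = "[commit var]") -> str:
    """Return the string the workflow step would export: 'true' when the
    marker tag appears anywhere in the commit message, 'false' otherwise."""
    return "true" if marker in commit_message else "false"

# Pushing with the tag flips the flag on; a plain message leaves it off.
print(commit_flag("Update pipeline [commit var]"))  # prints: true
print(commit_flag("Update pipeline"))               # prints: false
```

In the real workflow the same check runs as a shell `case`/`grep` over the last commit message, so this Python version is only illustrative.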
- Code Explanation: The first part of the code is setting up Push triggers on the working branch and checking out the repository, so we will not explore that in detail. @@ -153,7 +151,7 @@ Including the Variable 2. This triggers the workflow (as will any push). As the `[commit var]` is in the commit message, the `${COMMIT_VAR}` variable in the workflow will be set to `true` and result in the following: - ![Commit True Scenario](images/CommitTrue.PNG) + ![Commit True Scenario](./images/CommitTrue.PNG) Not Including the Variable @@ -167,7 +165,7 @@ Not Including the Variable 2. This triggers the workflow (as will any push). As the `[commit var]` is **not** in the commit message, the `${COMMIT_VAR}` variable in the workflow will be set to `false` and result in the following: - ![Commit False Scenario](images/CommitFalse.PNG) + ![Commit False Scenario](./images/CommitFalse.PNG) ## PR Body Variables @@ -211,8 +209,6 @@ jobs: run: echo "Flag is available and true" ``` -Available as a .YAML [here](examples/pr-example.yaml). - Code Explanation: The first part of the YAML file simply sets up the Pull Request Trigger. The majority of the following code is identical, so we will only explain the differences. @@ -256,7 +252,7 @@ There are many real world scenarios where controlling environment variables can Developer A is in the process of writing and testing an integration pipeline. The integration pipeline needs to make a call to an external service such as Azure Data Factory or Databricks, wait for a result, and then echo that result. The workflow could look like this: -![Workflow A](images/DevAWorkflow.png) +![Workflow A](./images/DevAWorkflow.png) The workflow inherently takes time and is expensive to run, as it involves maintaining a Databricks cluster while also waiting for the response. 
This external dependency can be removed by essentially mocking the response for the duration of writing and testing other parts of the workflow, and mocking the response in situations where the actual response either does not matter, or is not being directly tested. @@ -264,7 +260,7 @@ The workflow inherently takes time and is expensive to run, as it involves maint Developer B is in the process of writing and testing a CI/CD pipeline. The pipeline has multiple CI stages, each of which runs sequentially. The workflow might look like this: -![Workflow B](images/DevBWorkflow.png) +![Workflow B](./images/DevBWorkflow.png) In this case, each CI stage needs to run before the next one starts, and errors in the middle of the process can cause the entire pipeline to fail. While this might be intended behavior for the pipeline in some situations (perhaps you don't want to run a more involved, longer build or run a time-consuming test coverage suite if the CI process is failing), it means that steps need to be commented out or deleted when testing the pipeline itself.
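The mocking idea above can be illustrated with a small sketch. The variable name `MOCK_EXTERNAL` and the canned payload are invented for this example; they are not part of the recipe:

```python
import os

def fetch_run_result(mock=None):
    """Return the external service's result, or a canned response when
    mocking is enabled, so the rest of the workflow can be exercised
    without the slow, expensive external call."""
    if mock is None:
        # The runtime variable injected by the workflow decides the mode.
        mock = os.environ.get("MOCK_EXTERNAL", "false") == "true"
    if mock:
        return {"status": "Succeeded", "runId": "mock-0001"}
    raise NotImplementedError("the real call to Data Factory/Databricks would go here")

os.environ["MOCK_EXTERNAL"] = "true"
print(fetch_run_result())  # {'status': 'Succeeded', 'runId': 'mock-0001'}
```

The commit-message or PR-body variable described earlier would set `MOCK_EXTERNAL` for the run, letting the remaining steps execute against the canned result.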
diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/AzDoExample.PNG b/docs/CI-CD/recipes/github-actions/runtime-variables/images/AzDoExample.PNG similarity index 100% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/AzDoExample.PNG rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/AzDoExample.PNG diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/CommitFalse.PNG b/docs/CI-CD/recipes/github-actions/runtime-variables/images/CommitFalse.PNG similarity index 100% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/CommitFalse.PNG rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/CommitFalse.PNG diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/CommitTrue.PNG b/docs/CI-CD/recipes/github-actions/runtime-variables/images/CommitTrue.PNG similarity index 100% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/CommitTrue.PNG rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/CommitTrue.PNG diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/DevAWorkflow.png b/docs/CI-CD/recipes/github-actions/runtime-variables/images/DevAWorkflow.png similarity index 100% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/DevAWorkflow.png rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/DevAWorkflow.png diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/DevBWorkflow.png b/docs/CI-CD/recipes/github-actions/runtime-variables/images/DevBWorkflow.png similarity index 100% rename from 
docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/DevBWorkflow.png rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/DevBWorkflow.png diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/PRExample.PNG b/docs/CI-CD/recipes/github-actions/runtime-variables/images/PRExample.PNG similarity index 100% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/PRExample.PNG rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/PRExample.PNG diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/PRTrue.PNG b/docs/CI-CD/recipes/github-actions/runtime-variables/images/PRTrue.PNG similarity index 100% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/PRTrue.PNG rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/PRTrue.PNG diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/workflow-diagram.png b/docs/CI-CD/recipes/github-actions/runtime-variables/images/workflow-diagram.png similarity index 100% rename from docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/images/workflow-diagram.png rename to docs/CI-CD/recipes/github-actions/runtime-variables/images/workflow-diagram.png diff --git a/docs/continuous-integration/inclusive-linting.md b/docs/CI-CD/recipes/inclusive-linting.md similarity index 76% rename from docs/continuous-integration/inclusive-linting.md rename to docs/CI-CD/recipes/inclusive-linting.md index 1e4cb52194..bd702987b5 100644 --- a/docs/continuous-integration/inclusive-linting.md +++ b/docs/CI-CD/recipes/inclusive-linting.md @@ -14,9 +14,9 @@ The ability to add additional terms to your linter has the added benefit of enab ## Getting Started with an Inclusive Linter -### [`woke`] +### woke -One inclusive 
linter we recommend is `woke`. It is a language-agnostic CLI tool that detects non-inclusive language in your source code and recommends alternatives. While `woke` automatically applies a [default ruleset] with non-inclusive terms to lint for, you can also apply a custom rule config (via a yaml file) with additional terms to lint for. See [`example.yaml`] for an example of adding custom rules. +One inclusive linter we recommend is `woke`. It is a language-agnostic CLI tool that detects non-inclusive language in your source code and recommends alternatives. While `woke` automatically applies a default ruleset with non-inclusive terms to lint for, you can also apply a custom rule config (via a yaml file) with additional terms to lint for. Running the tool locally on a file or directory is relatively straightforward: @@ -30,8 +30,8 @@ test.txt:2:2-6: `guys` may be insensitive, use `folks`, `people` instead (warnin `woke` can be run locally on your machine or CI/CD system via CLI and is also available as two GitHub Actions: -- [Run woke] -- [Run woke with Reviewdog] +- Run woke +- Run woke with Reviewdog To use the standard "Run woke" GitHub Action with the default ruleset in a CI pipeline: @@ -59,11 +59,11 @@ To use the standard "Run woke" GitHub Action with the default ruleset in a CI pi 1. Run your pipeline 1. View the output in the "Actions" tab in the main repository view -For more information about additional configuration and usage, see the official [docs].
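To make the reported output concrete, here is a toy approximation of what such a linter does. The two rules below are invented examples; real `woke` rules live in a yaml ruleset and use proper word-boundary matching:

```python
# Hypothetical mini-ruleset: term -> suggested alternatives.
RULES = {"whitelist": ["allowlist"], "guys": ["folks", "people"]}

def lint_inclusive(text: str) -> list:
    """Report non-inclusive terms per line, with suggested alternatives,
    in a format loosely resembling woke's `line:col: message` output."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for term, alternatives in RULES.items():
            col = line.lower().find(term)
            if col != -1:
                findings.append(
                    f"{lineno}:{col}: `{term}` may be insensitive, "
                    f"use {', '.join(alternatives)} instead"
                )
    return findings

for finding in lint_inclusive("add them to the whitelist\nhey guys\n"):
    print(finding)
```

This is only a sketch of the detection idea; the actual tool handles word boundaries, severities, and file traversal.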
+## Resources -[`woke`]: https://github.com/get-woke/woke -[default ruleset]: https://github.com/get-woke/woke/blob/main/pkg/rule/default.yaml -[`example.yaml`]: https://github.com/get-woke/woke/blob/main/example.yaml -[Run woke]: https://github.com/marketplace/actions/run-woke -[Run woke with reviewdog]: https://github.com/marketplace/actions/run-woke-with-reviewdog -[docs]: https://docs.getwoke.tech/ +- [woke](https://github.com/get-woke/woke) +- [default ruleset](https://github.com/get-woke/woke/blob/main/pkg/rule/default.yaml) +- [example.yaml](https://github.com/get-woke/woke/blob/main/example.yaml) +- [Run woke](https://github.com/marketplace/actions/run-woke) +- [Run woke with reviewdog](https://github.com/marketplace/actions/run-woke-with-reviewdog) +- [docs](https://docs.getwoke.tech/) diff --git a/docs/CI-CD/recipes/reusing-devcontainers-within-a-pipeline.md b/docs/CI-CD/recipes/reusing-devcontainers-within-a-pipeline.md new file mode 100644 index 0000000000..195e0387c2 --- /dev/null +++ b/docs/CI-CD/recipes/reusing-devcontainers-within-a-pipeline.md @@ -0,0 +1,72 @@ +# Reusing Dev Containers Within a Pipeline + +Given a repository with a local development container (a.k.a. dev container) that contains all the tooling required for development, would it make sense to reuse that container for running the tooling in the Continuous Integration pipelines? + +## Options for Building Dev Containers Within a Pipeline + +There are three ways to build devcontainers within a pipeline: + +- With [GitHub - devcontainers/ci](https://github.com/devcontainers/ci), which builds the container from the `devcontainer.json`. Example here: [devcontainers/ci · Getting Started](https://github.com/devcontainers/ci/blob/main/docs/github-action.md#getting-started). +- With [GitHub - devcontainers/cli](https://github.com/devcontainers/cli), which is the same as the above, but using the underlying CLI directly without tasks. +- Building the `Dockerfile` with `docker build`.
This option excludes all configuration/features specified within the `devcontainer.json`. + +## Considered Options + +- Run CI pipelines in the native environment +- Run CI pipelines in the dev container via building image locally +- Run CI pipelines in the dev container with a container registry + +The pros and cons of each approach are listed below: + +### Run CI Pipelines in the Native Environment + +| Pros | Cons | +| -- | -- | +| Can use any pipeline tasks available | Need to keep two sets of tooling and their versions in sync | +| No container registry | Can take some time to start, based on tools/dependencies required | +| Agent will always be up to date with security patches | The dev container should always be built within each run of the CI pipeline, to verify the changes within the branch haven't broken anything | + +### Run CI Pipelines in the Dev Container Without Image Caching + +| Pros | Cons | +| -- | -- | +| Utility scripts will work out of the box | Need to rebuild the container for each run, given that there may be changes within the branch being built | +| Rules used (for linting or unit tests) will be the same on the CI | Not everything in the container is needed for the CI pipeline¹ | +| No surprise for the developers, local outputs (of linting for instance) will be the same in the CI | Some pipeline tasks will not be available | +| All tooling and their versions defined in a single place | Building the image for each pipeline run is slow² | +| Tools/dependencies are already present || +| The dev container is being tested to include all new tooling in addition to not being broken || + +> ¹: container size can be reduced by exporting the layer that contains only the tooling needed for the CI pipeline +> +> ²: could be mitigated via adding image caching without using a container registry + +### Run CI Pipelines in the Dev Container with Image Registry + +| Pros | Cons | +| -- | -- | +| Utility scripts will work out of the box | Need to rebuild
the container for each run, given that there may be changes within the branch being built | +| No surprise for the developers, local outputs (of linting for instance) will be the same in the CI | Not everything in the container is needed for the CI pipeline¹ | +| Rules used (for linting or unit tests) will be the same on the CI | Some pipeline tasks will not be available² | +| All tooling and their versions defined in a single place | Requires access to a container registry to host the container within the pipeline³ | +| Tools/dependencies are already present || +| The dev container is being tested to include all new tooling in addition to not being broken || +| Publishing the container built from `devcontainer.json` allows you to reference it in the cacheFrom in `devcontainer.json` (see [docs](https://containers.dev/implementors/json_reference/#image-specific)). By doing this, VS Code will use the published image as a layer cache when building || + +> ¹: container size can be reduced by exporting the layer that contains only the tooling needed for the CI pipeline. This would require building the image without tasks +> +> ²: using container jobs in AzDO you can use all tasks (as far as I can tell). Reference: [Dockerizing DevOps V2 - AzDO container jobs - DEV Community](https://dev.to/eliises/dockerizing-devops-v2-azdo-container-jobs-3hbf) +> +> ³: within GH actions, the default Github Actions token can be used for accessing GHCR without setting up a separate registry, see the example below.
+> **Note:** This does not build the `Dockerfile` together with the `devcontainer.json` + +```yaml +    - uses: whoan/docker-build-with-cache-action@v5 +        id: cache +        with: +          username: $GITHUB_ACTOR +          password: "${{ secrets.GITHUB_TOKEN }}" +          registry: docker.pkg.github.com +          image_name: devcontainer +          dockerfile: .devcontainer/Dockerfile +``` diff --git a/docs/continuous-delivery/recipes/terraform/save-output-to-variable-group.md b/docs/CI-CD/recipes/terraform/save-output-to-variable-group.md similarity index 99% rename from docs/continuous-delivery/recipes/terraform/save-output-to-variable-group.md rename to docs/CI-CD/recipes/terraform/save-output-to-variable-group.md index 743031a776..8bb21618c2 100644 --- a/docs/continuous-delivery/recipes/terraform/save-output-to-variable-group.md +++ b/docs/CI-CD/recipes/terraform/save-output-to-variable-group.md @@ -1,4 +1,4 @@ -# Save terraform output to a variable group (Azure DevOps) +# Save Terraform Output to a Variable Group (Azure DevOps) This recipe applies only to [terraform](https://www.terraform.io/) usage with Azure DevOps. It assumes you're familiar with terraform commands and Azure Pipelines. @@ -137,4 +137,4 @@ Roles are defined for Library items, and membership of these roles governs the o When using `System.AccessToken`, service account ` Build Service` identity will be used to access the Library. -Please ensure in `Pipelines > Library > Security` section that this service account has `Administrator` role at the `Library` or `Variable Group` level to create/update/delete variables (see. [Library of assets](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/?view=azure-devops) for additional information)). +Please ensure in `Pipelines > Library > Security` section that this service account has `Administrator` role at the `Library` or `Variable Group` level to create/update/delete variables (see.
[Library of assets](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/?view=azure-devops) for additional information). diff --git a/docs/continuous-delivery/recipes/terraform/share-common-variables-naming-conventions.md b/docs/CI-CD/recipes/terraform/share-common-variables-naming-conventions.md similarity index 95% rename from docs/continuous-delivery/recipes/terraform/share-common-variables-naming-conventions.md rename to docs/CI-CD/recipes/terraform/share-common-variables-naming-conventions.md index f9e6865db5..4660c77b9a 100644 --- a/docs/continuous-delivery/recipes/terraform/share-common-variables-naming-conventions.md +++ b/docs/CI-CD/recipes/terraform/share-common-variables-naming-conventions.md @@ -1,6 +1,6 @@ # Sharing Common Variables / Naming Conventions Between Terraform Modules -## What are we trying to solve? +## What are we Trying to Solve? When deploying infrastructure using code, it's common practice to split the code into different modules that are responsible for the deployment of a part or a component of the infrastructure. In Terraform, this can be done by using [modules](https://www.terraform.io/language/modules/develop). @@ -15,13 +15,13 @@ There are dependencies between these modules, like the Kubernetes cluster that w This page explains a way to solve this with Terraform. -## How to do it? +## How to Do It? 
### Context Let's consider the following structure for our modules: -```console +```sh modules ├── kubernetes │   ├── main.tf @@ -40,7 +40,7 @@ Now, assume that you deploy a virtual network for the development environment, w Then at some point, you need to inject these values into the Kubernetes module, to get a reference to it through a data source, for example: -```hcl +```tf data "azurerm_virtual_network" "vnet" { name = var.vnet_name resource_group_name = var.vnet_rg_name @@ -51,17 +51,17 @@ In the snippet above, the virtual network name and resource group are defined th Being able to manage naming in a central place will make sure the code can easily be refactored in the future, without updating all modules. -### About Terraform variables +### About Terraform Variables In Terraform, every [input variable](https://www.terraform.io/language/values/variables) must be defined at the configuration (or module) level, using the `variable` block. By convention, this is often done in a `variables.tf` file, in the module. This file contains variable declaration and default values. Values can be set using variables configuration files (.tfvars), environment variables or CLI args when using the terraform `plan` or `apply` commands. One of the limitations of variable declarations is that it's not possible to compose variables; [locals](https://www.terraform.io/language/values/locals) or Terraform [built-in functions](https://www.terraform.io/language/functions) are used for that. -### Common Terraform module +### Common Terraform Module One way to bypass this limitation is to introduce a "common" module, that will not deploy any resources, but just compute/calculate and output the resource names and shared variables, and be used by all other modules, as a dependency. -```console +```sh modules ├── common │   ├── output.tf @@ -78,7 +78,7 @@ modules *variables.tf:* -```hcl +```tf variable "environment_name" { type = string description = "The name of the environment."
@@ -93,7 +93,7 @@ variable "location" { *output.tf:* -```hcl +```tf # Shared variables output "location" { value = var.location @@ -126,7 +126,7 @@ output "aks_name" { Now, if you execute the Terraform apply for the common module, you get all the shared/common variables in outputs: -```console +```sh $ terraform plan -var environment_name="dev" -var subscription="$(az account show --query id -o tsv)" Changes to Outputs: @@ -140,11 +140,11 @@ Changes to Outputs: You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure. ``` -### Use the common Terraform module +### Use the Common Terraform Module Using the common Terraform module in any other module is super easy. For example, this is what you can do in the Azure Kubernetes module `main.tf` file: -```hcl +```tf module "common" { source = "../common" environment_name = var.environment_name @@ -177,7 +177,7 @@ resource "azurerm_kubernetes_cluster" "aks" { Then, you can execute the `terraform plan` and `terraform apply` commands to deploy! -```console +```sh terraform plan -var environment_name="dev" -var subscription="$(az account show --query id -o tsv)" data.azurerm_subnet.aks_subnet: Reading... data.azurerm_subnet.aks_subnet: Read complete after 1s [id=/subscriptions/01010101-1010-0101-1010-010101010101/resourceGroups/rg-network-dev/providers/Microsoft.Network/virtualNetworks/vnet-dev/subnets/AksSubnet] @@ -240,9 +240,9 @@ Terraform will perform the following actions: Plan: 1 to add, 0 to change, 0 to destroy. 
``` -Note: the usage of a common module is also valid if you decide to deploy all your modules in the same operation from a main Terraform configuration file, like: +> **Note:** the usage of a common module is also valid if you decide to deploy all your modules in the same operation from a main Terraform configuration file, like: -```hcl +```tf module "common" { source = "./common" environment_name = var.environment_name @@ -262,13 +262,13 @@ module "kubernetes" { } ``` -### Centralize input variables definitions +### Centralize Input Variables Definitions In case you choose to define variable values directly in the source control (e.g. gitops scenario) using [variables definitions files](https://www.terraform.io/language/values/variables#variable-definitions-tfvars-files) (`.tfvars`), having a common module will also help to not have to duplicate the common variables definitions in all modules. Indeed, it is possible to have a global file that is defined once, at the common module level, and merge it with module-specific variables definitions files at Terraform `plan` or `apply` time. Let's consider the following structure: -```console +```sh modules ├── common │   ├── dev.tfvars @@ -297,7 +297,7 @@ Then, it's possible to merge these files when running the `terraform apply` or ` terraform plan -var-file=<(cat ../common/dev.tfvars ./dev.tfvars) ``` -*Note: using this, it is really important to ensure that you have not the same variable names in both files, otherwise that will generate an error.* +> **Note:** when using this, it is really important to ensure that you do not have the same variable names in both files, otherwise that will generate an error.
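The duplicate-variable caveat can be illustrated with a sketch. This is a deliberately naive `.tfvars` reader (only simple `name = "value"` lines, invented for illustration), not how Terraform actually parses the files:

```python
def parse_tfvars(text: str) -> dict:
    """Naive .tfvars reader: one `name = value` assignment per line."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "=" in line:
            name, value = line.split("=", 1)
            entries[name.strip()] = value.strip().strip('"')
    return entries

def merge_tfvars(common: str, module: str) -> dict:
    """Merge two variable definition files, refusing duplicate names to
    mirror the error the note above warns about."""
    a, b = parse_tfvars(common), parse_tfvars(module)
    duplicates = sorted(a.keys() & b.keys())
    if duplicates:
        raise ValueError(f"variables defined in both files: {duplicates}")
    return {**a, **b}

common = 'environment_name = "dev"\nlocation = "westeurope"\n'
module = 'aks_node_count = 3\n'
print(merge_tfvars(common, module))
```

With distinct names the concatenated file works as shown in the recipe; a variable present in both inputs is exactly the conflict the note warns against.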
## Conclusion diff --git a/docs/continuous-delivery/recipes/terraform/terraform-structure-guidelines.md b/docs/CI-CD/recipes/terraform/terraform-structure-guidelines.md similarity index 97% rename from docs/continuous-delivery/recipes/terraform/terraform-structure-guidelines.md rename to docs/CI-CD/recipes/terraform/terraform-structure-guidelines.md index fbb39438c7..f115a1447a 100644 --- a/docs/continuous-delivery/recipes/terraform/terraform-structure-guidelines.md +++ b/docs/CI-CD/recipes/terraform/terraform-structure-guidelines.md @@ -3,12 +3,11 @@ ## Context When creating an infrastructure configuration, it is important to follow a consistent and organized structure to ensure maintainability, scalability and reusability of the code. The goal of this section is to briefly describe how to structure your Terraform configuration in order to achieve this. -## Structuring the Terraform configuration +## Structuring the Terraform Configuration The recommended structure is as follows: 1. Place each component you want to configure in its own module folder. Analyze your infrastructure code and identify the logical components that can be separated into reusable modules. This will give you a clear separation of concerns and will make it straightforward to include new resources, update existing ones or reuse them in the future. For more details on modules and when to use them, see the [Terraform guidance](https://developer.hashicorp.com/terraform/language/modules/develop#when-to-write-a-module). - 2. Place the `.tf` module files at the root of each folder and make sure to include a [`README`](#generating-the-documentation) file in a markdown format which can be automatically generated based on the module code. It's recommended to follow this approach as this file structure will be automatically picked up by the [Terraform Registry](https://registry.terraform.io/browse/modules). 3. Use a consistent set of files to structure your modules.
While this can vary depending on the specific needs of the project, one good example can be the following: - **provider.tf**: defines the list of providers according to the plugins used @@ -25,7 +24,7 @@ The tests folder includes one or more files to test the example module together An example configuration structure obtained using the guidelines above is: -```console +```sh modules ├── mlops │   ├── doc @@ -42,7 +41,7 @@ modules ├── main ``` -## Testing the configuration +## Testing the Configuration To test Terraform configurations, the [Terratest library](https://terratest.gruntwork.io/) is utilized. A comprehensive guide to best practices with Terratest, including unit tests, integration tests, and end-to-end tests, is available for reference [here](https://terratest.gruntwork.io/docs/testing-best-practices/unit-integration-end-to-end-test/). @@ -62,7 +61,7 @@ For simple Terraform configurations, extensive unit testing might be overkill. I - **Validation of Key Vault contents**: Ensuring the presence of necessary keys, certificates, or secrets in the Azure Key Vault that are stored as part of resource configuration. - **Properties that can influence the cost or location**: This can be achieved by asserting the locations, service tiers, storage settings, depending on the properties available for the resources. -## Naming convention +## Naming Convention When naming Terraform variables, it's essential to use clear and consistent naming conventions that are easy to understand and follow. The general convention is to use lowercase letters and numbers, with underscores instead of dashes, for example: "azurerm_resource_group". When naming resources, start with the provider's name, followed by the target resource, separated by underscores. For instance, "azurerm_postgresql_server" is an appropriate name for an Azure provider resource. When it comes to data sources, use a similar naming convention, but make sure to use plural names for lists of items. 
For example, "azurerm_resource_groups" is a good name for a data source that represents a list of resource groups. @@ -70,7 +69,7 @@ Variable and output names should be descriptive and reflect the purpose or use o Make sure you include a description for outputs and variables, as well as marking the values as 'default' or 'sensitive' when the case. This information will be captured in the generated documentation. -## Generating the documentation +## Generating the Documentation The documentation can be automatically generated based on the configuration code in your modules with the help of [terraform-docs](https://terraform-docs.io/). To generate the Terraform module documentation, go to the module folder and enter this command: @@ -84,11 +83,11 @@ Then, the documentation will be generated inside the component root directory. The approach presented in this section is designed to be flexible and easy to use, making it straight forward to add new resources or update existing ones. The separation of concerns also makes it easy to reuse existing components in other projects, with all the information (modules, examples, documentation and tests) located in one place. 
-## References and Further Reading +## Resources - [Terraform-docs](https://github.com/terraform-docs/terraform-docs) - [Terraform Registry](https://registry.terraform.io/browse/modules) - [Terraform Module Guidance](https://developer.hashicorp.com/terraform/language/modules/develop#when-to-write-a-module) - [Terratest](https://terratest.gruntwork.io/) - [Testing HashiCorp Terraform](https://www.hashicorp.com/blog/testing-hashicorp-terraform) -- [Build Infrastructure - Terraform Azure Example](https://developer.hashicorp.com/terraform/tutorials/azure-get-started/azure-build) \ No newline at end of file +- [Build Infrastructure - Terraform Azure Example](https://developer.hashicorp.com/terraform/tutorials/azure-get-started/azure-build) diff --git a/docs/ISE.md b/docs/ISE.md index ff0425c391..eeff8db1a1 100644 --- a/docs/ISE.md +++ b/docs/ISE.md @@ -1,4 +1,4 @@ -# Who We Are +# Who is ISE (Industry Solutions Engineering) Our team, ISE (Industry Solutions Engineering), works side-by-side with customers to help them tackle their toughest technical problems both in the cloud and on the edge. We meet customers where they are, work in the languages they use, with the open source frameworks they use, and on the operating systems they use. We work with enterprises and start-ups across many industries from financial services to manufacturing. Our work covers a broad spectrum of domains including IoT, machine learning, and high scale compute. Our "superpower" is that we work closely with both our customers’ engineering teams and Microsoft’s product engineering teams, developing real-world expertise that we can use to help our customers grow their business and help Microsoft improve our products and services. diff --git a/docs/index.md b/docs/README.md similarity index 59% rename from docs/index.md rename to docs/README.md index 6bf536aa89..16ee1ba4e5 100644 --- a/docs/index.md +++ b/docs/README.md @@ -9,19 +9,15 @@ An engineer working for a [ISE](ISE.md) project... 
This is our playbook. All contributions are welcome! Please feel free to submit a pull request to get involved. -## Why Have A Playbook +## Why Have a Playbook * To increase overall efficiency for team members and the whole team in general. * To reduce the number of mistakes and avoid common pitfalls. * To strive to be better engineers and learn from other people's shared experience. -## "The" Checklist +If you do nothing else follow the [Engineering Fundamentals Checklist](./engineering-fundamentals-checklist.md)! -If you do nothing else follow the [Engineering Fundamentals Checklist](ENG-FUNDAMENTALS-CHECKLIST.md)! - -## Structure of a Sprint - -The [structure of a sprint](SPRINT-STRUCTURE.md) is a breakdown of the sections of the playbook according to the structure of an Agile sprint. +The [first week of an ISE project](./the-first-week-of-an-ise-project.md) is a breakdown of the sections of the playbook according to the structure of an Agile sprint. ## General Guidance @@ -36,34 +32,6 @@ The [structure of a sprint](SPRINT-STRUCTURE.md) is a breakdown of the sections * Report product issues found and provide clear and repeatable engineering feedback! * We all own our code and each one of us has an obligation to make all parts of the solution great. 
-## QuickLinks - -* [Engineering Fundamentals Checklist](ENG-FUNDAMENTALS-CHECKLIST.md) -* [Structure of a Sprint](SPRINT-STRUCTURE.md) - -## Engineering Fundamentals - -* [Accessibility](accessibility/README.md) -* [Agile Development](agile-development/README.md) -* [Automated Testing](automated-testing/README.md) -* [Code Reviews](code-reviews/README.md) -* [Continuous Delivery (CD)](continuous-delivery/README.md) -* [Continuous Integration (CI)](continuous-integration/README.md) -* [Design](design/readme.md) -* [Developer Experience](developer-experience/README.md) -* [Documentation](documentation/README.md) -* [Engineering Feedback](engineering-feedback/README.md) -* [Observability](observability/README.md) -* [Security](security/README.md) -* [Privacy](privacy/README.md) -* [Source Control](source-control/README.md) -* [Reliability](reliability/README.md) - -## Fundamentals for Specific Technology Areas - -* [Machine Learning Fundamentals](machine-learning/README.md) -* [User-Interface Engineering](user-interface-engineering/README.md) - ## Contributing See [CONTRIBUTING.md](https://github.com/microsoft/code-with-engineering-playbook/tree/master/CONTRIBUTING.md) for contribution guidelines. diff --git a/docs/user-interface-engineering/README.md b/docs/UI-UX/README.md similarity index 92% rename from docs/user-interface-engineering/README.md rename to docs/UI-UX/README.md index 61be61ef7a..56475b1d84 100644 --- a/docs/user-interface-engineering/README.md +++ b/docs/UI-UX/README.md @@ -12,10 +12,10 @@ Keep in mind that like all software, there is no "right way" to build a user int The state of web platform engineering is fast moving. There is no one-size-fits-all solution. For any team to be successful in building a UI, they need to have an understanding of the higher-level aspects of all UI project. -1. 
[**Accessibility**](../accessibility/README.md) - ensuring your application is usable and enjoyed by as many people as possible is at the heart of accessibility and inclusive design. -1. [**Usability**](./usability.md) - how effortless should it be for any given user to use the application? Do they need special training or a document to understand how to use it, or will it be intuitive? +1. [**Accessibility**](../non-functional-requirements/accessibility.md) - ensuring your application is usable and enjoyed by as many people as possible is at the heart of accessibility and inclusive design. +1. [**Usability**](../non-functional-requirements/usability.md) - how effortless should it be for any given user to use the application? Do they need special training or a document to understand how to use it, or will it be intuitive? 1. [**Maintainability**](../non-functional-requirements/maintainability.md) - is the application just a proof of concept to showcase an idea for future work, or will it be an MVP and act as the starting point for a larger, production-ready application? Sometimes you don't need React or any other framework. Sometimes you need React, but not all the bells and whistles from create-react-app. Understanding project maintainability requirements can simplify an engagement’s tooling needs significantly and let folks iterate without headaches. -1. [**Stability**](./stability.md) - what is the cost of adding a dependency? Is it actively stable/updated/maintained? If not, can you afford the tech debt (sometimes the answer can be yes!)? Could you get 90% of the way there without adding another dependency? +1. **Stability** - what is the cost of adding a dependency? Is it actively stable/updated/maintained? If not, can you afford the tech debt (sometimes the answer can be yes!)? Could you get 90% of the way there without adding another dependency? More information is available for each general guidance section in the corresponding pages. 
@@ -57,7 +57,3 @@ For more information of choosing the right implementation tool, read the [Recomm Continue reading the [Trade Study](./../design/design-reviews/trade-studies/README.md) section of this site for more information on completing this step in the design process. After iterating through multiple trade study documents, this design process can be considered complete! With an agreed upon solution and implementation in mind, it is now time to begin development. A natural continuation of the design process is to get users (or stakeholders) involved as early as possible. Constantly look for design and usability feedback, and utilize this to improve the application as it is being developed. - -### Example - -> Coming soon! diff --git a/docs/user-interface-engineering/recommended-technologies.md b/docs/UI-UX/recommended-technologies.md similarity index 98% rename from docs/user-interface-engineering/recommended-technologies.md rename to docs/UI-UX/recommended-technologies.md index 1a2e9ddd0a..80e4ffa326 100644 --- a/docs/user-interface-engineering/recommended-technologies.md +++ b/docs/UI-UX/recommended-technologies.md @@ -48,11 +48,3 @@ npm init vite@latest my-app --template react-ts # npm 7.x npm init vite@latest my-app -- --template react-ts ``` - -## HTML/CSS/JS - -> Coming soon! - -## Web Components - -> Coming soon! diff --git a/docs/agile-development/README.md b/docs/agile-development/README.md index 14d83a8139..df3b626a50 100644 --- a/docs/agile-development/README.md +++ b/docs/agile-development/README.md @@ -1,15 +1,15 @@ -# Agile development +# Agile Development In this documentation we refer to the team working on an engagement a **"Crew"**. This includes the dev team, dev lead, PM, data scientists, etc. 
-## Why agile +## Why Agile - We want to be quick to respond to change - We want to get to a state of working software fast, and iterate on it to improve it - We want to keep the customer/end users involved all the way through - We care about individuals and interactions over documents and processes -## The fundamentals +## The Fundamentals We care about the goal for each activity, but not necessarily about how they are accomplished. The suggestions in parenthesis are common ways to accomplish the goals. @@ -23,7 +23,7 @@ We care about the goal for each activity, but not necessarily about how they are - The team has a joint idea of how we work together (ex. team agreement) - We value and respect the opinions and work of all team members. -## Links +## References - [What Is Scrum?](https://www.scrum.org/resources/what-is-scrum) - [Essential Scrum: A Practical Guide to The Most Popular Agile Process](https://www.goodreads.com/book/show/13663747-essential-scrum) diff --git a/docs/agile-development/advanced-topics/backlog-management/minimal-slices.md b/docs/agile-development/advanced-topics/backlog-management/minimal-slices.md index ba64bc67a2..a0f9267541 100644 --- a/docs/agile-development/advanced-topics/backlog-management/minimal-slices.md +++ b/docs/agile-development/advanced-topics/backlog-management/minimal-slices.md @@ -1,9 +1,8 @@ -# Minimalism Slices +# Minimal Slices -## Always deliver your work using minimal valuable slices +## Always Deliver Your Work Using Minimal Valuable Slices - Split your work item into small chunks that are contributed in incremental commits. - - Contribute your chunks frequently. Follow an iterative approach by regularly providing updates and changes to the team. This allows for instant feedback and early issue discovery and ensures you are developing in the right direction, both technically and functionally. - Do NOT work independently on your task without providing any updates to your team. 
@@ -12,11 +11,11 @@ Imagine you are working on adding UWP (Universal Windows Platform) application building functionality for existing continuous integration service which already has Android/iOS support. -#### Bad approach +#### Bad Approach After six weeks of work you created PR with all required functionality, including portal UI (build settings), backend REST API (UWP build functionality), telemetry, unit and integration tests, etc. -#### Good approach +#### Good Approach You divided your feature into smaller user stories (which in turn were divided into multiple tasks) and started working on them one by one: diff --git a/docs/agile-development/advanced-topics/collaboration/add-pairing-field-azure-devops-cards.md b/docs/agile-development/advanced-topics/collaboration/add-pairing-field-azure-devops-cards.md index a857b7d78e..c0592bcb61 100644 --- a/docs/agile-development/advanced-topics/collaboration/add-pairing-field-azure-devops-cards.md +++ b/docs/agile-development/advanced-topics/collaboration/add-pairing-field-azure-devops-cards.md @@ -1,18 +1,18 @@ -# How to add a Pairing Custom Field in Azure DevOps User Stories +# How to Add a Pairing Custom Field in Azure DevOps User Stories This document outlines the benefits of adding a custom field of type _Identity_ in [Azure DevOps](https://learn.microsoft.com/en-us/azure/devops/user-guide/what-is-azure-devops) user stories, prerequisites, and a step-by-step guide. -## Benefits of adding a custom field +## Benefits of Adding a Custom Field -Having the names of both individuals [pairing on a story](README.md) visible on the Azure DevOps cards can be helpful during sprint ceremonies and lead to greater accountability by the pairing assignee. For example, it is easier to keep track of the individuals assigned stories as part of a pair during sprint planning by using the "pairing names" field. 
During stand-up it can also help the Process Lead filter stories assigned to the individual (both as an owner or as a pairing assignee) and show these on the board. Furthermore, the pairing field can provide an additional data point for reports and burndown rates. +Having the names of both individuals pairing on a story visible on the Azure DevOps cards can be helpful during sprint ceremonies and lead to greater accountability by the pairing assignee. For example, it is easier to keep track of the individuals assigned stories as part of a pair during sprint planning by using the "pairing names" field. During stand-up it can also help the Process Lead filter stories assigned to the individual (both as an owner or as a pairing assignee) and show these on the board. Furthermore, the pairing field can provide an additional data point for reports and burndown rates. ## Prerequisites Prior to customizing Azure DevOps, review [Configure and customize Azure Boards](https://learn.microsoft.com/en-us/azure/devops/boards/configure-customize). -In order to add a custom field to user stories in Azure DevOps changes must be made as an **Organizational setting**. This document therefore assumes use of an existing Organization in Azure DevOps and that the user account used to make these changes is a member of the [Project Collection Administrators Group](https://learn.microsoft.com/en-us/azure/devops/organizations/security/set-project-collection-level-permissions). +In order to add a custom field to user stories in Azure DevOps, changes must be made as an **Organization setting**. This document therefore assumes use of an existing Organization in Azure DevOps and that the user account used to make these changes is a member of the [Project Collection Administrators Group](https://learn.microsoft.com/en-us/azure/devops/organizations/security/set-project-collection-level-permissions). -### Change the organization settings +### Change the Organization Settings 1.
Duplicate the process currently in use. @@ -44,7 +44,7 @@ In order to add a custom field to user stories in Azure DevOps changes must be m This completes the change in Organization settings. The rest of the instructions must be completed under Project Settings. -### Change the project settings +### Change the Project Settings 1. Go to the Project that is to be modified, select **Project Settings**. diff --git a/docs/agile-development/advanced-topics/collaboration/pair-programming-tools.md b/docs/agile-development/advanced-topics/collaboration/pair-programming-tools.md index 55321727a7..ea76cad119 100644 --- a/docs/agile-development/advanced-topics/collaboration/pair-programming-tools.md +++ b/docs/agile-development/advanced-topics/collaboration/pair-programming-tools.md @@ -4,7 +4,7 @@ Pair programming used to be a software development technique in which two progra Through the effective utilization of a range of tools and techniques, we have successfully implemented both pair and swarm programming methodologies. As such, we are eager to share some of the valuable insights and knowledge gained from this experience. -## How to make pair programming a painless experience? +## How to Make Pair Programming a Painless Experience? ### Working Sessions diff --git a/docs/agile-development/advanced-topics/collaboration/social-question.md b/docs/agile-development/advanced-topics/collaboration/social-question.md index 5b9ba92606..537c933eab 100644 --- a/docs/agile-development/advanced-topics/collaboration/social-question.md +++ b/docs/agile-development/advanced-topics/collaboration/social-question.md @@ -6,13 +6,13 @@ The social question should be chosen before the stand-up. The facilitator should > **Tip:** having the stand-up facilitator role rotate each sprint lets the facilitator choose the social question independently without burdening any one team member. 
-## Properties of a good question +## Properties of a Good Question A good question has a brief answer with small optional elaboration. A yes or no answer doesn't tell you very much about someone, while knowing that their favorite fruit is a [durian](https://en.wikipedia.org/wiki/Durian) is informative. Good questions are low in consequence but allow controversy. Watching someone strongly exclaim that salmon and lox on cinnamon-raisin is the best bagel order is endearing. As a corollary, a good question is one someone is likely to be passionate about. You know a little more about a team member's personality if their eyes light up when describing their favorite karaoke song. -## Starter list of questions +## Starter List of Questions Potentially good questions include: diff --git a/docs/agile-development/advanced-topics/collaboration/teaming-up.md b/docs/agile-development/advanced-topics/collaboration/teaming-up.md index a08a0b6bee..4d88650325 100644 --- a/docs/agile-development/advanced-topics/collaboration/teaming-up.md +++ b/docs/agile-development/advanced-topics/collaboration/teaming-up.md @@ -12,7 +12,7 @@ However those phases can be extremely fast or sometimes mismatched in teams due In order to minimize the risk and set the expectations on the right way for all parties, an identification phase is important to understand each other. 
Some potential steps in this phase may be as following (not limited): -- [Working agreement](../team-agreements/working-agreements.md) +- [Working agreement](../../team-agreements/working-agreement.md) - Identification of styles/preferences in communication, sharing, learning, decision making of each team member @@ -29,7 +29,7 @@ Some potential steps in this phase may be as following (not limited): - Identification of communication channels, feedback loops and recurrent team call slots out of regular sprint meetings -- Introduction to [Technical Agility Team Manifesto](../team-agreements/team-manifesto.md) and planning the technical delivery by aiming to keep +- Introduction to [Technical Agility Team Manifesto](../../team-agreements/team-manifesto.md) and planning the technical delivery by aiming to keep technical debt risk minimum. ## Following the Plan and Agile Debugging @@ -50,7 +50,7 @@ Just as an example, agility debugging activities may include: - Are Acceptance Criteria enough and right? - Is everyone ready-to-go after taking the User Story/Task? -- Running [Efficient Retrospectives](../../basics/ceremonies.md#retrospectives) +- Running [efficient retrospectives](../../ceremonies.md#retrospectives) - Is the Sprint Goal clear in every iteration ? 
@@ -63,5 +63,4 @@ Following that, above suggestions aim to remove agile/team disfunctionalities an ## Resources - [Tuckman's Stages of Group Development](https://en.wikipedia.org/wiki/Tuckman%27s_stages_of_group_development) - - [Scrum Values](https://scrumguides.org/scrum-guide.html) diff --git a/docs/agile-development/advanced-topics/collaboration/virtual-collaboration.md b/docs/agile-development/advanced-topics/collaboration/virtual-collaboration.md index f6b199548e..fc9c624fee 100644 --- a/docs/agile-development/advanced-topics/collaboration/virtual-collaboration.md +++ b/docs/agile-development/advanced-topics/collaboration/virtual-collaboration.md @@ -12,13 +12,13 @@ Virtual work patterns are different from the in-person patterns we are accustome Pair programming is one way to achieve these results. Red Team Testing (RTT) is an alternate programming method that uses the same principles but with some of the advantages that virtual work methods provide. -## Red Team Testing +## Red Team Testing (RTT) Red Team Testing borrows its name from the “Red Team” and “Blue Team” paradigm of penetration testing, and is a collaborative, parallel way of working virtually. In Red Team Testing, two developers jointly decide on the interface, architecture, and design of the program, and then separate for the implementation phase. One developer writes tests using the public interface, attempting to perform edge case testing, input validation, and otherwise stress testing the interface. The second developer is simultaneously writing the implementation which will eventually be tested. Red Team Testing has the same philosophy as any other Test-Driven Development lifecycle: All implementation is separated from the interface, and the interface can be tested with no knowledge of the implementation. 
-![ptt-diagram](images/PTTdiagram.PNG) +![ptt-diagram](./images/PTTdiagram.PNG) ## Steps @@ -34,7 +34,7 @@ Red Team Testing has the same philosophy as any other Test-Driven Development li - Realistic Scenario: The tests have either broken or failed due to flaws in testing. This leads to further clarification of the design and a discussion of why the tests failed. 1. The developers will repeat the three phases until the code is functional and tested. -## When to follow the RTT strategy +## When to Follow the RTT Strategy RTT works well under specific circumstances. If collaboration needs to happen virtually, and all communication is virtual, RTT reduces the need for constant communication while maintaining the benefits of a joint design session. This considers the human element: Virtual communication is more exhausting than in person communication. @@ -50,7 +50,7 @@ RTT has many of the same benefits as Pair Programming and Test-Driven developmen - RTT encourages testing to be prioritized alongside implementation, instead of having testing follow or be influenced by the implementation of the code. - Documentation is inherently a part of RTT, since both the implementer and the tester need correct, up to date documentation, in the implementation phase. -## What you need for RTT to work well +## What You Need for RTT to Work Well - Demand for constant communication and good teamwork may pose a challenge; daily updates amongst team members are essential to maintain alignment on varying code requirements. - Clarity of the code design and testing strategy must be established beforehand and documented as reference. Lack of an established design will cause misalignment between the two major pieces of work and a need for time-consuming refactoring. 
diff --git a/docs/agile-development/advanced-topics/collaboration/why-collaboration.md b/docs/agile-development/advanced-topics/collaboration/why-collaboration.md index b5905742ad..135fdb7b7d 100644 --- a/docs/agile-development/advanced-topics/collaboration/why-collaboration.md +++ b/docs/agile-development/advanced-topics/collaboration/why-collaboration.md @@ -1,7 +1,7 @@ # Why Collaboration -## Why collaboration is important +## Why is Collaboration Important In engagements, we aim to be highly collaborative because when we code together, we perform better, have a higher sprint velocity, and have a greater degree of knowledge sharing across the team. @@ -11,7 +11,7 @@ There are two common patterns we use for collaboration: Pairing and swarming. **Swarm programming** (“swarming”) - three or more software engineers collaborating on a high-priority item to bring it to completion. -## How to pair program +## How to Pair Program As mentioned, every story is intentionally assigned to a pair. The pairing assignee may be in the process of upskilling, nevertheless, they are equal partners in the development effort. Below are some general guidelines for pairing: @@ -25,7 +25,7 @@ Below are some general guidelines for pairing: - The pairing assignee is the voice representing the pair during the daily standup while being supported by the story owner. - Having the names of both individuals (owner and pair assignee) visible on the PBI can be helpful during sprint ceremonies and lead to greater accountability by the pairing assignee. An example of this using Azure DevOps cards can be found [here](./add-pairing-field-azure-devops-cards.md). -## Why pair programming helps collaboration +## Why Pair Programming Helps Collaboration Pair programming helps collaboration because both engineers share equal responsibility for bringing the story to completion. 
This is a mutually beneficial exercise because, while the story owner often has more experience to lean on, the pairing assignee brings a fresh view that is unclouded by repetition. @@ -36,7 +36,7 @@ Some other benefits include: - Even something as simple as describing the problem out loud can help uncover issues or bugs in the code. - Pairing can help brainstorming as well as validating details such as making the variable names consistent. -## When to swarm program +## When to Swarm Program It is important to know that not every PBI needs to use swarming. Some sprints may not even warrant swarming at all. Swarm when: @@ -46,7 +46,7 @@ Swarm when: - An unknown is discovered that needs a collaborative effort to form a decision on how to move forward. The collective knowledge and expertise help move the story forward more quickly and ultimately produced better quality code. - A conflict or unresolved difference of opinion arises during a pairing session. Promote the work to become a swarming session to help resolve the conflict. -## How to swarm program +## How to Swarm Program As soon the pair finds out that the PBI will warrant swarming, the pair brings it up to the rest of the team (via parking lot during stand-up or asynchronously). Members of the team agree or volunteer to assist. @@ -54,20 +54,20 @@ As soon the pair finds out that the PBI will warrant swarming, the pair brings i - During a swarming session, an engineer can branch out if there is something that needs to be handled while the swarm tackles the main problem at hand, then reconnects and reports back. This allows the swarm to focus on a core aspect and to be all on the same page. - The Teams call is repeated until resolution is found or alternative path forward is formulated. -## Why swarm programming helps collaboration +## Why Swarm Programming Helps Collaboration - Swarming allows the collective knowledge and expertise of the team to come together in a focused and unified way. 
- Not only does swarming help close out the item faster, but it also helps the team understand each other’s strengths and weaknesses. - Allows the team to build a higher level of trust and work as a cohesive unit. -## When to decide to swarm, pair, and/or split +## When to Decide to Swarm, Pair, and/or Split - While a lot of time can be spent on pair programming, it does make sense to split the work when folks understand how the work will be carried out, and the work to be done is largely prescriptive. - Once the story has been jointly tasked out by both engineers, the engineers may choose to tackle some tasks separately and then combine the work together at the end. - Pair programming is more helpful when the engineers do not have perfect clarity about what is needed to be done or how it can be done. - Swarming is done when the two engineers assigned to the story need an additional sounding board or need expertise that other team members could provide. -## Benefits of increased collaboration +## Benefits of Increased Collaboration Knowledge sharing and bringing ISE and customer engineers together in a ‘code-with’ manner is an important aspect of ISE engagements. This grows both our customers’ and our ISE team’s capability to build on Azure. We are responsible for demonstrating engineering fundamentals and leaving the customer in a better place after we disengage. This can only happen if we collaborate and engage together as a team. In addition to improved software quality, this also adds a beneficial social aspect to the engagements. 
diff --git a/docs/agile-development/advanced-topics/effective-organization/delivery-plan.md b/docs/agile-development/advanced-topics/effective-organization/delivery-plan.md index 1d515e0341..c744adbbbd 100644 --- a/docs/agile-development/advanced-topics/effective-organization/delivery-plan.md +++ b/docs/agile-development/advanced-topics/effective-organization/delivery-plan.md @@ -19,20 +19,20 @@ Delivery Plans ensure your teams are aligning with your organizational goals. One approach you can take to accomplish is with stickies and a spreadsheet. -Step 1: Stack rank the features for everything in your backlog -- Functional Features -- [Non-functional Features] (docs/non-functional-requirements) -- User Research and Design -- Testing -- Documentation -- Knowledge Transfer/Support Processes +1. Stack rank the features for everything in your backlog + - Functional Features + - [Non-functional Features](../../../non-functional-requirements/) + - User Research and Design + - Testing + - Documentation + - Knowledge Transfer/Support Processes -Step 2: T-Shirt Features in terms of working weeks per person. In some scenarios, you have no idea how complex the work. In this situation, you can ask for time to conduct a spike (timebox the effort so you can get back on time). +1. T-shirt size the features in terms of working weeks per person. In some scenarios, you have no idea how complex the work is. In this situation, you can ask for time to conduct a spike (timebox the effort so you can get back on time). -Step 3: Calculate the capacity for the team based on the number of weeks person with his/her start and end date and minus holidays, vacation, conferences, training, and onboarding days. Also, minus time if the person is also working on defects and support. +1. Calculate the team's capacity based on each person's number of working weeks between their start and end date, minus holidays, vacation, conferences, training, and onboarding days.
Also, minus time if the person is also working on defects and support. -Step 4: Based on your capacity, you know have the options +Based on your capacity, you now have the following options - Ask for more resources. Caution: onboarding new resources take time. - Reduce the scope to the most MVP. Caution: as you trim more of the scope, it might not be valuable anymore to the customer. Consider a cupcake which is everything you need. You don't want to skim off the frosting. diff --git a/docs/agile-development/advanced-topics/effective-organization/scrum-of-scrums.md b/docs/agile-development/advanced-topics/effective-organization/scrum-of-scrums.md index fb0b3c9155..c1b5a67d6b 100644 --- a/docs/agile-development/advanced-topics/effective-organization/scrum-of-scrums.md +++ b/docs/agile-development/advanced-topics/effective-organization/scrum-of-scrums.md @@ -1,6 +1,6 @@ # Scrum of Scrums -Scrum of scrums is a technique used to scale Scrum to a larger group working towards the same project goal. In Scrum, we consider a team being too big when going over 10-12 individuals. This should be decided on a case by case basis. If the project is set up in multiple work streams that contain a fixed group of people and a common [stand-up](../../basics/ceremonies.md#stand-up) meeting is slowing down productivity: scrum of scrums should be considered. The team would identify the different subgroups that would act as a separate scrum teams with their own backlog, board and stand-up. +Scrum of scrums is a technique used to scale Scrum to a larger group working towards the same project goal. In Scrum, we consider a team being too big when going over 10-12 individuals. This should be decided on a case by case basis. If the project is set up in multiple work streams that contain a fixed group of people and a common [stand-up](../../ceremonies.md#stand-up) meeting is slowing down productivity: scrum of scrums should be considered.
The team would identify the different subgroups that would act as separate scrum teams with their own backlog, board and stand-up. ## Goals @@ -15,7 +15,7 @@ The scrum of scrums ceremony happens every day and can be seen as a regular stan The outcome of the meeting will result in a list of impediments related to coordination of the whole project. Solutions could be: agreeing on interfaces between teams, discussing architecture changes, evolving responsibility boundaries, etc. -This list of impediments is usually managed in a separate [backlog](../../basics/backlog-management.md) but does not have to. +This list of impediments is usually managed in a separate [backlog](../../backlog-management.md) but does not have to be. ## Participation @@ -29,7 +29,7 @@ When choosing to implement Scrum of Scrums, you need to keep in mind that some t ## Measures -The easiest way to measure the impact is by tracking the time to resolve issues in the scrum of scrums backlog. You can also track issues reported during the [retrospective](../../basics/ceremonies.md#retrospectives) related to global coordination (is it well done? can it be improved?). +The easiest way to measure the impact is by tracking the time to resolve issues in the scrum of scrums backlog. You can also track issues reported during the [retrospective](../../ceremonies.md#retrospectives) related to global coordination (is it well done? can it be improved?).
## Facilitation Guidance diff --git a/docs/agile-development/basics/backlog-management.md b/docs/agile-development/backlog-management.md similarity index 71% rename from docs/agile-development/basics/backlog-management.md rename to docs/agile-development/backlog-management.md index 7dd6d50488..e3e106166d 100644 --- a/docs/agile-development/basics/backlog-management.md +++ b/docs/agile-development/backlog-management.md @@ -1,18 +1,20 @@ -# Backlog Management basics for the Product and Sprint backlog +# Backlog Management ## Backlog +**Goals** + - User stories have a clear acceptance criteria and definition of done. - Design activities are planned as part of the backlog (a design for a story that needs it should be done before it is added in a Sprint). **Suggestions** - Consider the backlog refinement as an ongoing activity, that expands outside of the typical "Refinement meeting". +- The team should decide on and have a clear understanding of a [definition of ready](./team-agreements/definition-of-ready.md) and a [definition of done](./team-agreements/definition-of-done.md). +- The team should have a clear understanding of what constitutes good acceptance criteria for a story/task, and decide on how stories/tasks are handled. E.g., in some projects, stories are refined as a crew, but tasks are created by individual developers on an as-needed basis. - Technical debt is mostly due to shortcuts made in the implementation as well as the future maintenance cost as the natural result of continuous improvement. Shortcuts should generally be avoided. In some rare instances where they happen, prioritizing and planning improvement activities to reduce this debt at a later time is the recommended approach. -## Other - -This section has links directing you to best practices for managing Product and Sprint backlogs.
After reading through the best practices you should have a basic understanding for managing both product and sprint backlogs, how to create acceptance criteria for user stories, creating a definition of done and definition of ready for user stories and the basics around estimating user stories. +## Resources - [Product Backlog](https://scrumguides.org/scrum-guide.html#product-backlog) - [Sprint Backlog](https://scrumguides.org/scrum-guide.html#sprint-backlog) diff --git a/docs/agile-development/basics/ceremonies.md b/docs/agile-development/basics/ceremonies.md deleted file mode 100644 index 2a3a8d71ae..0000000000 --- a/docs/agile-development/basics/ceremonies.md +++ /dev/null @@ -1,157 +0,0 @@ -# Agile Ceremonies - -## Sprint planning - -- The planning supports Diversity and Inclusion principles and provides equal opportunities. -- The Planning defines how the work is going to be completed in the sprint. -- Stories fit in a sprint and are [designed](https://github.com/microsoft/code-with-engineering-playbook/tree/main/docs/design/design-reviews) and [ready](../advanced-topics/team-agreements/definition-of-ready.md) before the planning. - -### Sprint goal - -Consider defining a sprint goal, or list of goals for each sprint. Effective sprint goals are a concise bullet point list of items. A Sprint goal can be created first and used as an input to choose the Stories for the sprint. A sprint goal could also be created from the list of stories that were picked for the Sprint. - -The sprint goal can be used: - -- At the end of each stand up meeting, to remember the north star for the Sprint and help everyone taking a step back -- *During the sprint review ("was the goal achieved?", "If not, why?") - -> Note: A simple way to define a sprint goal, is to create a User Story in each sprint backlog and name it "Sprint XX goal". 
You can add the bullet points in the description.* - -### Stories - -- Example 1 - Preparing in advance: - - The dev lead and product owner plan time to prepare the sprint backlog ahead of sprint planning. - - The dev lead uses their experience (past and on the current project) and the estimation made for these stories to gauge how many should be in the sprint. - - The dev lead asks the entire team to look at the tentative sprint backlog in advance of the sprint planning. - - The dev lead assigns stories to specific developers after confirming with them that it makes sense - - During the sprint planning meeting, the team reviews the sprint goal and the stories. Everyone confirm they understand the plan and feel it's reasonable. -- Example 2 - Building during the planning meeting: - - The product owner ensures that the highest priority items of the product backlog is refined and estimated following the team estimation process. - - During the Sprint planning meeting, the product owner describe each stories, one by one, starting by highest priority. - - For each story, the dev lead and the team confirm they understand what needs to be done and add the story to the sprint backlog. - - The team keeps considering more stories up to a point where they agree the sprint backlog is full. This should be informed by the estimation, past developer experience and past experience in this specific project. - - Stories are assigned during the planning meeting: - - Option 1: The dev lead makes suggestion on who could work on each stories. Each engineer agrees or discuss if required. - - Option 2: The team review each story and engineer volunteer select the one they want to be assigned to. (*Note*: this option might cause issues with the first core expectations. Who gets to work on what? Ultimately, it is the dev lead responsibility to ensure each engineer gets the opportunity to work on what makes sense for their growth.) 
- -### Tasks - -- Examples of approaches for task creation and assignment: - - Stories are split into tasks ahead of time by dev lead and assigned before/during sprint planning to engineers. - - Stories are assigned to more senior engineers who are responsible for splitting into tasks. - - Stories are split into tasks during the Sprint planning meeting by the entire team. - - *Note*: Depending on the seniority of the team, consider splitting into tasks before sprint planning. This can help getting out of sprint planning with all work assigned. It also increase clarity for junior engineers. - -### Sprint planning links - -- [Definition of Ready](../advanced-topics/team-agreements/definition-of-ready.md) -- [Sprint Goal Template](https://www.scrum.org/resources/blog/five-questions-sprint-goal) -- [Planning](https://scrumguides.org/scrum-guide.html#sprint-planning 'Sprint Planning') -- [Refinement](https://learn.microsoft.com/devops/plan/what-is-agile-development#diligent-backlog-refinement 'Refinement') -- [User Stories Applied: For Software Development](https://www.goodreads.com/book/show/3856.User_Stories_Applied) - -> Note: Self assignment by team members can give a feeling of fairness in how work is split in the team. Sometime, this ends up not being the case as it can give an advantage to the loudest or more experienced voices in the team. Individuals also tend to stay in their comfort zone, which might not be the right approach for their own growth.* - -## Estimation - -- Estimation supports the predictability of the team work and delivery. -- Estimation re-enforces the value of accountability to the team. -- The estimation process is improved over time and discussed on a regular basis. -- Estimation is inclusive of the different individuals in the team. - -**Suggestions** - -Rough estimation is usually done for a generic SE 2 dev. - -- Example 1 - - The team use t-shirt sizes (S, M, L, XL) and agrees in advance which size fits a sprint. 
- - In this example: S, M fits a sprint, L, XL too big for a sprint and need to be split / refined - - The dev lead with support of the team roughly estimates how much S and M stories can be done in the first sprints - - This rough estimation is refined over time and used to as an input for future sprint planning and to adjust project end date forecasting -- Example 2 - - The team uses a single indicator: "does this story fits in one sprint?", if not, the story needs to be split - - The dev lead with support of the team roughly estimates how many stories can be done in the first sprints - - How many stories are done in each sprint on average is used as an input for future sprint planning and as an indicator to adjust project end date forecasting -- Example 3 - - The team does planning poker and estimates in story points - - Story points are roughly used to estimate how much can be done in next sprint - - The dev lead and the TPM uses the past sprints and observed velocity to adjust project end date forecasting - -- Other considerations - - Estimating stories using story points in smaller project does not always provide the value it would in bigger ones. - - Avoid converting story points or t-shirt sizes to days. - - Measure estimation accuracy: - - Collect data to monitor estimation accuracy and sprint completion over time to drive improvements. - - Use the sprint goal to understand if the estimation was correct. If the sprint goal is met: does anything else matter? - - Scrum Practices: While Scrum does not prescribe how to size work, Professional Scrum is biased away from absolute estimation (hours, function points, ideal-days, etc.) and towards relative sizing. - - Planning Poker: is a collaborative technique to assign relative size. Developers may choose whatever units they want - story points and t-shirt sizes are examples of units. 
- - 'Same-Size' PBIs is a relative estimation approach that involves breaking items down small enough that they are roughly the same size. Velocity can be understood as a count of PBIs; this is sometimes used by teams doing continuously delivery. - - 'Right-Size' PBIs is a relative estimation approach that involves breaking things down small enough to deliver value in a certain time period (i.e. get to Done by the end of a Sprint). This is sometimes associated with teams utilizing flow for forecasting. Teams use historical data to determine if they think they can get the PBI done within the confidence level that their historical data says they typically get a PBI done. - -### Estimation links - -- [The Most Important Thing You Are Missing about Estimation](https://www.scrum.org/resources/blog/most-important-thing-you-are-missing-about-estimation) - -## Retrospectives - -- Retrospectives lead to actionable items that help grow the team's engineering practices. These items are in the backlog, assigned, and prioritized to be fixed by a date agreed upon (default being next retrospective). -- Is used to ask the hard questions ("we usually don't finish what we plan, let's talk about this") when necessary. - -**Suggestions** - -- Consider [other retro formats](https://www.goodreads.com/book/show/721338.Agile_Retrospectives) available outside of Mad Sad Glad. - - Gather Data: Triple Nickels, Timeline, Mad Sad Glad, Team Radar - - Generate Insights: 5 Whys, Fishbone, Patterns and Shifts -- Consider setting a retro focus area. -- Schedule enough time to ensure that you can have the conversation you need to get the correct plan an action and improve how you work. -- Bring in a neutral facilitator for project retros or retros that introspect after a difficult period. 
- -### Retrospective links - -- [Agile Retrospective: Making Good Teams Great](https://www.goodreads.com/book/show/721338.Agile_Retrospectives) -- [Retrospective](https://scrumguides.org/scrum-guide.html#sprint-retrospective 'Retrospective') - -### Use the following retrospectives techniques to address specific trends that might be emerging on an engagement - -#### 5 whys - -If a team is confronting a problem and is unsure of the exact root cause, the 5 whys exercise taken from the business analysis sector can help get to the bottom of it. For example, if a team cannot get to *Done* each Sprint, that would go at the top of the whiteboard. The team then asks why that problem exists, writing that answer in the box below.  Next, the team asks why again, but this time in response to the *why* they just identified. Continue this process until the team identifies an actual root cause, which usually becomes apparent within five steps. - -#### Processes, tools, individuals, interactions and the Definition of Done - -This approach encourages team members to think more broadly.  Ask team members to identify what is going well and ideas for improvement within the categories of processes, tools, individuals/interactions, and the Definition of Done.  Then, ask team members to vote on which improvement ideas to focus on during the upcoming Sprint. - -#### Focus - -This retrospective technique incorporates the concept of visioning. Using this technique, you ask team members where they would like to go?  Decide what the team should look like in 4 weeks, and then ask what is holding them back from that and how they can resolve the impediment.  If you are focusing on specific improvements, you can use this technique for one or two Retrospectives in a row so that the team can see progress over time. - -## Sprint Demo - -- Each sprint has demos that illustrate the sprint goal and how it fits in the engagement goal. 
- -**Suggestions** - -- Consider not pre-recording sprint demos in advance. You can record the demo meeting and archive them. -- A demo does not have to be about running code. It can be showing documentation that was written. - -## Stand-up - -- The stand-up is run efficiently. -- The stand-up helps the team understand what was done, what will be done and what are the blockers. -- The stand-up helps the team understand if they will meet the sprint goal or not. - -**Suggestions** - -- Keep stand up short and efficient. Table the longer conversations for a parking lot section, or for a conversation that will be planned later. -- Run daily stand ups: 15 minutes of stand up and 15 minutes of parking lot. -- If someone cannot make the stand-up exceptionally: Ask them to do a written stand up in advance. -- Stand ups should include everyone involved in the project, including the customer. -- Projects with widely divergent time zones should be avoided if possible, but if you are on one, you should adapt the standups to meet the needs and time constraints of all team members. - -### Sprint demo links - -- [Sprint Review/Demo](https://scrumguides.org/scrum-guide.html#sprint-review 'Sprint Review') - -### Stand-up links - -- [Stand-Up/Daily Scrum](https://scrumguides.org/scrum-guide.html#daily-scrum 'Stand-up/Daily Scrum') diff --git a/docs/agile-development/ceremonies.md b/docs/agile-development/ceremonies.md new file mode 100644 index 0000000000..87cd662cc2 --- /dev/null +++ b/docs/agile-development/ceremonies.md @@ -0,0 +1,190 @@ +# Agile Ceremonies + +## Sprint Planning + +**Goals** + +- The planning supports Diversity and Inclusion principles and provides equal opportunities. +- The Planning defines how the work is going to be completed in the sprint. +- Stories fit in a sprint and are [designed](../design/design-reviews) and [ready](./team-agreements/definition-of-ready.md) before the planning. 
+ +> **Note:** Self assignment by team members can give a feeling of fairness in how work is split in the team. Sometimes, this ends up not being the case as it can give an advantage to the loudest or more experienced voices in the team. Individuals also tend to stay in their comfort zone, which might not be the right approach for their own growth. + +### Sprint Goal + +Consider defining a sprint goal, or list of goals for each sprint. Effective sprint goals are a concise bullet point list of items. A Sprint goal can be created first and used as an input to choose the Stories for the sprint. A sprint goal could also be created from the list of stories that were picked for the Sprint. + +The sprint goal can be used: + +- At the end of each stand up meeting, to remember the north star for the Sprint and help everyone take a step back +- During the sprint review ("was the goal achieved?", "If not, why?") + +> **Note:** A simple way to define a sprint goal is to create a User Story in each sprint backlog and name it "Sprint XX goal". You can add the bullet points in the description. + +### Stories + +Example 1: Preparing in advance + +- The dev lead and product owner plan time to prepare the sprint backlog ahead of sprint planning. +- The dev lead uses their experience (past and on the current project) and the estimation made for these stories to gauge how many should be in the sprint. +- The dev lead asks the entire team to look at the tentative sprint backlog in advance of the sprint planning. +- The dev lead assigns stories to specific developers after confirming with them that it makes sense. +- During the sprint planning meeting, the team reviews the sprint goal and the stories. Everyone confirms they understand the plan and feel it's reasonable. + +Example 2: Building during the planning meeting + +- The product owner ensures that the highest priority items of the product backlog are refined and estimated following the team estimation process.
+- During the Sprint planning meeting, the product owner describes each story, one by one, starting with the highest priority. +- For each story, the dev lead and the team confirm they understand what needs to be done and add the story to the sprint backlog. +- The team keeps considering more stories up to a point where they agree the sprint backlog is full. This should be informed by the estimation, past developer experience and past experience in this specific project. +- Stories are assigned during the planning meeting: + - **Option 1:** The dev lead makes suggestions on who could work on each story. Each engineer agrees or discusses if required. + - **Option 2:** The team reviews each story and engineers volunteer to select the ones they want to be assigned to. + > **Note**: this option might cause issues with the first core expectations. Who gets to work on what? Ultimately, it is the dev lead's responsibility to ensure each engineer gets the opportunity to work on what makes sense for their growth. + +### Tasks + +Examples of approaches for task creation and assignment: + +- Stories are split into tasks ahead of time by the dev lead and assigned before/during sprint planning to engineers. +- Stories are assigned to more senior engineers who are responsible for splitting into tasks. +- Stories are split into tasks during the Sprint planning meeting by the entire team. + +> **Note**: Depending on the seniority of the team, consider splitting into tasks before sprint planning. This can help get out of sprint planning with all work assigned. It also increases clarity for junior engineers.
+ +### Sprint Planning Resources + +- [Definition of Ready](team-agreements/definition-of-ready.md) +- [Sprint Goal Template](https://www.scrum.org/resources/blog/five-questions-sprint-goal) +- [Planning](https://scrumguides.org/scrum-guide.html#sprint-planning 'Sprint Planning') +- [Refinement](https://learn.microsoft.com/devops/plan/what-is-agile-development#diligent-backlog-refinement 'Refinement') +- [User Stories Applied: For Software Development](https://www.goodreads.com/book/show/3856.User_Stories_Applied) + +## Estimation + +**Goals** + +- Estimation supports the predictability of the team's work and delivery. +- Estimation reinforces the value of accountability to the team. +- The estimation process is improved over time and discussed on a regular basis. +- Estimation is inclusive of the different individuals in the team. + +Rough estimation is usually done for a generic SE 2 dev. + +### Example 1: T-shirt Sizes + +- The team uses t-shirt sizes (S, M, L, XL) and agrees in advance which sizes fit in a sprint.
In this example, S and M fit in a sprint; L and XL are too big for a sprint and need to be split/refined +- The dev lead with support of the team roughly estimates how many S and M stories can be done in the first sprints +- This rough estimation is refined over time and used as an input for future sprint planning and to adjust project end date forecasting + +### Example 2: Single Indicator + +- The team uses a single indicator: "does this story fit in one sprint?"; if not, the story needs to be split +- The dev lead with support of the team roughly estimates how many stories can be done in the first sprints +- How many stories are done in each sprint on average is used as an input for future sprint planning and as an indicator to adjust project end date forecasting + +### Example 3: Planning Poker + +- The team does planning poker and estimates in story points +- Story points are roughly used to estimate how much can be done in the next sprint +- The dev lead and the TPM use past sprints and observed velocity to adjust project end date forecasting + +### Other Considerations + +- Estimating stories using story points in smaller projects does not always provide the value it would in bigger ones. +- Avoid converting story points or t-shirt sizes to days. + +#### Measure Estimation Accuracy + +- Collect data to monitor estimation accuracy and sprint completion over time to drive improvements. +- Use the sprint goal to understand if the estimation was correct. If the sprint goal is met: does anything else matter? + +#### Scrum Practices + +While Scrum does not prescribe how to size work, Professional Scrum is biased away from absolute estimation (hours, function points, ideal-days, etc.) and towards relative sizing. + +**Planning Poker** + +Planning Poker is a collaborative technique to assign relative size. Developers may choose whatever units they want - story points and t-shirt sizes are examples of units.
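The velocity-based forecasting described in Example 3 can be sketched in a few lines. This is a hypothetical illustration with made-up numbers and an invented `forecast_sprints` helper, not a tool prescribed by the playbook:

```python
# Sketch: forecast remaining sprints from observed velocity.
# All numbers below are hypothetical examples.

def forecast_sprints(remaining_points: float, velocities: list[float]) -> float:
    """Estimate sprints left using the average of past sprint velocities."""
    if not velocities:
        raise ValueError("need at least one completed sprint to forecast")
    avg_velocity = sum(velocities) / len(velocities)
    return remaining_points / avg_velocity

# Example: 60 story points remain; the last three sprints delivered 18, 22 and 20 points.
print(forecast_sprints(60, [18, 22, 20]))  # 3.0 sprints at an average velocity of 20
```

The dev lead and TPM would revisit this forecast each sprint, since the average velocity shifts as more sprints complete.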
+ +**'Same-Size' Product Backlog Items (PBIs)** + +'Same-Size' PBIs is a relative estimation approach that involves breaking items down small enough that they are roughly the same size. Velocity can be understood as a count of PBIs; this is sometimes used by teams doing continuous delivery. + +**'Right-Size' Product Backlog Items (PBIs)** + +'Right-Size' PBIs is a relative estimation approach that involves breaking things down small enough to deliver value in a certain time period (i.e. get to Done by the end of a Sprint). This is sometimes associated with teams utilizing flow for forecasting. Teams use historical data to determine whether they can get a PBI to Done within a Sprint at the confidence level their historical completion data suggests. + +### Estimation Resources + +- [The Most Important Thing You Are Missing about Estimation](https://www.scrum.org/resources/blog/most-important-thing-you-are-missing-about-estimation) + +## Retrospectives + +**Goals** + +- Retrospectives lead to actionable items that help grow the team's engineering practices. These items are in the backlog, assigned, and prioritized to be fixed by a date agreed upon (default being the next retrospective). +- Retrospectives are used to ask the hard questions ("we usually don't finish what we plan, let's talk about this") when necessary. + +**Suggestions** + +- Consider [other retro formats](https://www.goodreads.com/book/show/721338.Agile_Retrospectives) available outside of Mad Sad Glad. + - **Gather Data:** Triple Nickels, Timeline, Mad Sad Glad, Team Radar + - **Generate Insights:** 5 Whys, Fishbone, Patterns and Shifts +- Consider setting a retro focus area. +- Schedule enough time to ensure that you can have the conversation you need to get to the correct plan of action and improve how you work. +- Bring in a neutral facilitator for project retros or retros that introspect after a difficult period.
+ +Use the following retrospective techniques to address specific trends that might be emerging on an engagement: + +### 5 Whys + +If a team is confronting a problem and is unsure of the exact root cause, the 5 whys exercise taken from the business analysis sector can help get to the bottom of it. For example, if a team cannot get to *Done* each Sprint, that would go at the top of the whiteboard. The team then asks why that problem exists, writing that answer in the box below. Next, the team asks why again, but this time in response to the *why* they just identified. Continue this process until the team identifies an actual root cause, which usually becomes apparent within five steps. + +### Processes, Tools, Individuals, Interactions and the Definition of Done + +This approach encourages team members to think more broadly. Ask team members to identify what is going well and ideas for improvement within the categories of processes, tools, individuals/interactions, and the Definition of Done. Then, ask team members to vote on which improvement ideas to focus on during the upcoming Sprint. + +### Focus + +This retrospective technique incorporates the concept of visioning. Using this technique, ask team members where they would like to go. Decide what the team should look like in 4 weeks, and then ask what is holding them back from that and how they can resolve the impediment. If you are focusing on specific improvements, you can use this technique for one or two Retrospectives in a row so that the team can see progress over time.
+ +**Suggestions** + +- Consider not pre-recording sprint demos in advance. You can record the demo meeting and archive the recordings. +- A demo does not have to be about running code. It can be showing documentation that was written. + +### Sprint Demo Resources + +- [Sprint Review/Demo](https://scrumguides.org/scrum-guide.html#sprint-review 'Sprint Review') + +## Stand-Up + +**Goals** + +- The stand-up is run efficiently. +- The stand-up helps the team understand what was done, what will be done and what the blockers are. +- The stand-up helps the team understand if they will meet the sprint goal or not. + +**Suggestions** + +- Keep stand-ups short and efficient. Table the longer conversations for a parking lot section, or for a conversation that will be planned later. +- Run daily stand-ups: 15 minutes of stand-up and 15 minutes of parking lot. +- If someone exceptionally cannot make the stand-up, ask them to do a written stand-up in advance. +- Stand-ups should include everyone involved in the project, including the customer. +- Projects with widely divergent time zones should be avoided if possible, but if you are on one, you should adapt the stand-ups to meet the needs and time constraints of all team members. + +### Stand-Up Resources + +- [Stand-Up/Daily Scrum](https://scrumguides.org/scrum-guide.html#daily-scrum 'Stand-up/Daily Scrum') diff --git a/docs/agile-development/basics/roles.md b/docs/agile-development/roles.md similarity index 86% rename from docs/agile-development/basics/roles.md rename to docs/agile-development/roles.md index 477f66a125..68b4fee147 100644 --- a/docs/agile-development/basics/roles.md +++ b/docs/agile-development/roles.md @@ -1,6 +1,6 @@ # Agile/Scrum Roles -- We prefer the usage of "process lead" over "scrum master". It describes the same role. +- We prefer using "process lead" over "scrum master". It describes the same role. This section has links directing you to definitions for the traditional roles within Agile/Scrum.
After reading through the best practices you should have a basic understanding of the key Agile roles in terms of what they are and the expectations for the role. diff --git a/docs/agile-development/advanced-topics/team-agreements/definition-of-done.md b/docs/agile-development/team-agreements/definition-of-done.md similarity index 100% rename from docs/agile-development/advanced-topics/team-agreements/definition-of-done.md rename to docs/agile-development/team-agreements/definition-of-done.md diff --git a/docs/agile-development/advanced-topics/team-agreements/definition-of-ready.md b/docs/agile-development/team-agreements/definition-of-ready.md similarity index 85% rename from docs/agile-development/advanced-topics/team-agreements/definition-of-ready.md rename to docs/agile-development/team-agreements/definition-of-ready.md index dad73a7b1e..29321db212 100644 --- a/docs/agile-development/advanced-topics/team-agreements/definition-of-ready.md +++ b/docs/agile-development/team-agreements/definition-of-ready.md @@ -10,7 +10,7 @@ When the development team picks a user story from the top of the backlog, the us It can be understood as a checklist that helps the Product Owner to ensure that the user story they wrote contains all the necessary details for the scrum team to understand the work to be done. -### Examples of ready checklist items +### Examples of Ready Checklist Items * [ ] Does the description have the details including any input values required to implement the user story? * [ ] Does the user story have clear and complete acceptance criteria? @@ -21,22 +21,22 @@ It can be understood as a checklist that helps the Product Owner to ensure that * The completion of unfinished work * A deliverable provided by another team (code artifact, data, etc...) -## Who writes it +## Who Writes it The ready checklist can be written by a Product Owner in agreement with the development team and the Process Lead. 
-## When should a Definition of Ready be updated +## When Should a Definition of Ready be Updated Update or change the definition of ready anytime the scrum team observes that there are missing information in the user stories that recurrently impacts the planning. -## What should be avoided +## What Should be Avoided The ready checklist should contain items that apply broadly. Don't include items or details that only apply to one or two user stories. This may become an overhead when writing the user stories. -## How to get stories ready +## How to get Stories Ready In the case that the highest priority work is not yet ready, it still may be possible to make forward progress. Here are some strategies that may help: -* [Backlog Refinement](../backlog-management/README.md) sessions are a good time to validate that high priority user stories are verified to have a clear description, acceptance criteria and demonstrable business value. It is also a good time to breakdown large stories will likely not be completable in a single sprint. +* Backlog Refinement sessions are a good time to validate that high priority user stories are verified to have a clear description, acceptance criteria and demonstrable business value. It is also a good time to breakdown large stories will likely not be completable in a single sprint. * Prioritization sessions are a good time to prioritize user stories that unblock other blocked high priority work. * Blocked user stories can often be broken down in a way that unblocks a portion of the original stories scope. This is a good way to make forward progress even when some work is blocked. 
diff --git a/docs/agile-development/advanced-topics/team-agreements/team-manifesto.md b/docs/agile-development/team-agreements/team-manifesto.md similarity index 100% rename from docs/agile-development/advanced-topics/team-agreements/team-manifesto.md rename to docs/agile-development/team-agreements/team-manifesto.md diff --git a/docs/agile-development/advanced-topics/team-agreements/working-agreements.md b/docs/agile-development/team-agreements/working-agreement.md similarity index 92% rename from docs/agile-development/advanced-topics/team-agreements/working-agreements.md rename to docs/agile-development/team-agreements/working-agreement.md index 9102bea4fa..e998b1e62a 100644 --- a/docs/agile-development/advanced-topics/team-agreements/working-agreements.md +++ b/docs/agile-development/team-agreements/working-agreement.md @@ -21,7 +21,7 @@ their own, and adjust times, communication channels, branch naming policies etc. ## Communication - We communicate all information relevant to the team through the Project Teams channel -- We add all [technical spikes](../../../design/design-reviews/recipes/technical-spike.md), [trade studies](../../../design/design-reviews/trade-studies/README.md), and other technical documentation to the project repository through [async design reviews in PRs](../../../design/design-reviews/recipes/async-design-reviews.md) +- We add all [technical spikes](../../design/design-reviews/recipes/technical-spike.md), [trade studies](../../design/design-reviews/trade-studies/README.md), and other technical documentation to the project repository through [async design reviews in PRs](../../design/design-reviews/recipes/async-design-reviews.md) ## Work-life Balance @@ -51,7 +51,7 @@ their own, and adjust times, communication channels, branch naming policies etc. The Process Lead is responsible for leading any scrum or agile practices to enable the project to move forward. 
- Facilitate standup meetings and hold team accountable for attendance and participation. -- Keep the meeting moving as described in the [Project Standup](../../basics/ceremonies.md) page. +- Keep the meeting moving as described in the [Project Standup](../ceremonies.md) page. - Make sure all action items are documented and ensure each has an owner and a due date and tracks the open issues. - Notes as needed after planning / stand-ups. - Make sure that items are moved to the parking lot and ensure follow-up afterwards. @@ -80,4 +80,4 @@ The Process Lead is responsible for leading any scrum or agile practices to enab - All PRs are reviewed by one person from and one from Microsoft (for knowledge transfer and to ensure code and security standards are met) - We always review existing PRs before starting work on a new task - We look through open PRs at the end of stand-up to make sure all PRs have reviewers. -- We treat documentation as code and apply the same [standards to Markdown](../../../code-reviews/recipes/markdown.md) as code +- We treat documentation as code and apply the same [standards to Markdown](../../code-reviews/recipes/markdown.md) as code diff --git a/docs/automated-testing/README.md b/docs/automated-testing/README.md index 5dac2e1dae..d6668bf785 100644 --- a/docs/automated-testing/README.md +++ b/docs/automated-testing/README.md @@ -1,64 +1,19 @@ # Testing -## Map of Outcomes to Testing Techniques +## Why Testing -The table below maps outcomes -- the results that you may want to achieve in your validation efforts -- to one or more techniques that can be used to accomplish that outcome. +- Tests allow us to find flaws in our software +- Good tests document the code by describing the intent +- Automated tests save time compared to manual tests +- Automated tests allow us to safely change and refactor our code without introducing regressions -| When I am working on... | I want to get this outcome...
| ...so I should consider | -|-------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Development | Prove backward compatibility with existing callers and clients | [Shadow testing](shadow-testing/README.md) | Development; [Integration testing](integration-testing/README.md) | Ensure telemetry is sufficiently detailed and complete to trace and diagnose malfunction in [End-to-End testing](e2e-testing/README.md) flows | Distributed Debug challenges ; Orphaned call chain analysis | -| Development | Ensure program logic is correct for a variety of expected, mainline, edge and unexpected inputs | [Unit testing](unit-testing/README.md); Functional tests; [Consumer-driven Contract Testing](cdc-testing/README.md); [Integration testing](integration-testing/README.md) | -| Development | Prevent regressions in logical correctness; earlier is better | [Unit testing](unit-testing/README.md); Functional tests; [Consumer-driven Contract Testing](cdc-testing/README.md); [Integration testing](integration-testing/README.md); Rings (each of these are expanding scopes of coverage) | | Development | Quickly validate mainline correctness of a point of functionality (e.g. 
single API), manually | Manual smoke testing Tools: postman, powershell, curl | -| Development | Validate interactions between components in isolation, ensuring that consumer and provider components are compatible and conform to a shared understanding documented in a contract | [Consumer-driven Contract Testing](cdc-testing/README.md) | -| Development; [Integration testing](integration-testing/README.md) | Validate that multiple components function together across multiple interfaces in a call chain, incl network hops | [Integration testing](integration-testing/README.md); End-to-end ([End-to-End testing](e2e-testing/README.md)) tests; Segmented end-to-end ([End-to-End testing](e2e-testing/README.md)) | -| Development | Prove disaster recoverability – recover from corruption of data | DR drills | -| Development | Find vulnerabilities in service Authentication or Authorization | Scenario (security) | | Development | Prove correct RBAC and claims interpretation of Authorization code | Scenario (security) | | Development | Document and/or enforce valid API usage | [Unit testing](unit-testing/README.md); Functional tests; [Consumer-driven Contract Testing](cdc-testing/README.md) | -| Development | Prove implementation correctness in advance of a dependency or absent a dependency | [Unit testing](unit-testing/README.md) (with mocks); [Unit testing](unit-testing/README.md) (with emulators); [Consumer-driven Contract Testing](cdc-testing/README.md) | -| Development | Ensure that the user interface is accessible | [Accessibility](../accessibility/README.md) | -| Development | Ensure that users can operate the interface | [UI testing (automated)](ui-testing/README.md) (human usability observation) | -| Development | Prevent regression in user experience | UI automation; [End-to-End testing](e2e-testing/README.md) | -| Development | Detect and prevent 'noisy neighbor' phenomena | [Load testing](performance-testing/load-testing.md) | -| Development | Detect availability drops 
| [Synthetic Transaction testing](synthetic-monitoring-tests/README.md); Outside-in probes | -| Development | Prevent regression in 'composite' scenario use cases / workflows (e.g. an e-commerce system might have many APIs that used together in a sequence perform a "shop-and-buy" scenario) | [End-to-End testing](e2e-testing/README.md); Scenario | -| Development; Operations | Prevent regressions in runtime performance metrics e.g. latency / cost / resource consumption; earlier is better | Rings; [Synthetic Transaction testing](synthetic-monitoring-tests/README.md) / Transaction; Rollback Watchdogs | -| Development; Optimization | Compare any given metric between 2 candidate implementations or variations in functionality | Flighting; A/B testing | -| Development; Staging | Prove production system of provisioned capacity meets goals for reliability, availability, resource consumption, performance | [Load testing (stress)](performance-testing/load-testing.md); Spike; Soak; [Performance testing](performance-testing/README.md) | -| Development; Staging | Understand key user experience performance characteristics – latency, chattiness, resiliency to network errors | Load; [Performance testing](performance-testing/README.md); Scenario (network partitioning) | -| Development; Staging; Operation | Discover melt points (the loads at which failure or maximum tolerable resource consumption occurs) for each individual component in the stack | Squeeze; [Load testing (stress)](performance-testing/load-testing.md) | -| Development; Staging; Operation | Discover overall system melt point (the loads at which the end-to-end system fails) and which component is the weakest link in the whole stack | Squeeze; [Load testing (stress)](performance-testing/load-testing.md) | -| Development; Staging; Operation | Measure capacity limits for given provisioning to predict or satisfy future provisioning needs | Squeeze; [Load testing (stress)](performance-testing/load-testing.md) | -| 
Development; Staging; Operation | Create / exercise failover runbook | Failover drills | -| Development; Staging; Operation | Prove disaster recoverability – loss of data center (the meteor scenario); measure MTTR | DR drills | -| Development; Staging; Operation | Understand whether observability dashboards are correct, and telemetry is complete; flowing | Trace Validation; [Load testing (stress)](performance-testing/load-testing.md); Scenario; [End-to-End testing](e2e-testing/README.md) | -| Development; Staging; Operation | Measure impact of seasonality of traffic | [Load testing](performance-testing/load-testing.md) | -| Development; Staging; Operation | Prove Transaction and alerts correctly notify / take action | [Synthetic Transaction testing](synthetic-monitoring-tests/README.md) (negative cases); [Load testing](performance-testing/load-testing.md) | -| Development; Staging; Operation; Optimizing | Understand scalability curve, i.e. how the system consumes resources with load | [Load testing (stress)](performance-testing/load-testing.md); [Performance testing](performance-testing/README.md) | -| Operation; Optimizing | Discover system behavior over long-haul time | Soak | -| Optimizing | Find cost savings opportunities | Squeeze | -| Staging; Operation | Measure impact of failover / scale-out (repartitioning, increasing provisioning) / scale-down | Failover drills; Scale drills | -| Staging; Operation | Create/Exercise runbook for increasing/reducing provisioning | Scale drills | -| Staging; Operation | Measure behavior under rapid changes in traffic | Spike | -| Staging; Optimizing | Discover cost metrics per unit load volume (what factors influence cost at what load points, e.g. 
cost per million concurrent users) | Load (stress) | -| Development; Operation | Discover points where a system is not resilient to unpredictable yet inevitable failures (network outage, hardware failure, VM host servicing, rack/switch failures, random acts of the Malevolent Divine, solar flares, sharks that eat undersea cable relays, cosmic radiation, power outages, renegade backhoe operators, wolves chewing on junction boxes, …) | Chaos | -| Development | Perform unit testing on Power platform custom connectors | [Custom Connector Testing](unit-testing/custom-connector.md) | +## The Fundamentals -## Sections within Testing - -- [Consumer-driven contract (CDC) testing](cdc-testing/README.md) -- [End-to-End testing](e2e-testing/README.md) -- [Fault Injection testing](fault-injection-testing/README.md) -- [Integration testing](integration-testing/README.md) -- [Performance testing](performance-testing/README.md) -- [Shadow testing](shadow-testing/README.md) -- [Smoke testing](smoke-testing/README.md) -- [Synthetic Transaction testing](synthetic-monitoring-tests/README.md) -- [UI testing](ui-testing/README.md) -- [Unit testing](unit-testing/README.md) - -## Technology Specific Testing - -- [Using DevTest Pattern for building containers with AzDO](tech-specific-samples/azdo-container-dev-test-release/README.md) -- [Using Azurite to run blob storage tests in pipeline](tech-specific-samples/blobstorage-unit-tests/README.md) +- We consider code to be incomplete if it is not accompanied by tests +- We write unit tests (tests without external dependencies) that can run before every PR merge to validate that we don’t have regressions +- We write Integration tests/E2E tests that test the whole system end to end, and run them regularly +- We write our tests early and block any further code merging if tests fail. 
+- We run load tests/performance tests where appropriate to validate that the system performs under stress ## Build for Testing @@ -79,3 +34,55 @@ Testing is a critical part of the development process. It is important to build - **Log metadata.** When logging, it is important to include metadata that is relevant to the activity. For example, a Tenant ID, Customer ID, or Order ID. This allows someone reviewing the logs to understand the context of the activity and filter to a manageable set of logs. - **Log performance metrics.** Even if you are using App Insights to capture how long dependency calls are taking, it is often useful to know how long certain functions of your application took. It then becomes possible to evaluate the performance characteristics of your application as it is deployed on different compute platforms with different limitations on CPU, memory, and network bandwidth. For more information, please see [Metrics](../observability/pillars/metrics.md). + + +## Map of Outcomes to Testing Techniques + +The table below maps outcomes (the results that you may want to achieve in your validation efforts) to one or more techniques that can be used to accomplish that outcome. + +| When I am working on... | I want to get this outcome...
| ...so I should consider | +| -- | -- | -- | +| Development | Prove backward compatibility with existing callers and clients | [Shadow testing](./shadow-testing/README.md) | +| Development | Ensure telemetry is sufficiently detailed and complete to trace and diagnose malfunction in [End-to-End testing](./e2e-testing/README.md) flows | Distributed Debug challenges; Orphaned call chain analysis | +| Development | Ensure program logic is correct for a variety of expected, mainline, edge and unexpected inputs | [Unit testing](./unit-testing/README.md); Functional tests; [Consumer-driven Contract Testing](./cdc-testing/README.md); [Integration testing](./integration-testing/README.md) | +| Development | Prevent regressions in logical correctness; earlier is better | [Unit testing](./unit-testing/README.md); Functional tests; [Consumer-driven Contract Testing](./cdc-testing/README.md); [Integration testing](./integration-testing/README.md); Rings (each of these are expanding scopes of coverage) | +| Development | Quickly validate mainline correctness of a point of functionality (e.g. 
single API), manually | Manual smoke testing Tools: postman, powershell, curl | +| Development | Validate interactions between components in isolation, ensuring that consumer and provider components are compatible and conform to a shared understanding documented in a contract | [Consumer-driven Contract Testing](./cdc-testing/README.md) | +| Development | Validate that multiple components function together across multiple interfaces in a call chain, incl network hops | [Integration testing](./integration-testing/README.md); End-to-end ([End-to-End testing](./e2e-testing/README.md)) tests; Segmented end-to-end ([End-to-End testing](./e2e-testing/README.md)) | +| Development | Prove disaster recoverability – recover from corruption of data | DR drills | +| Development | Find vulnerabilities in service Authentication or Authorization | Scenario (security) | +| Development | Prove correct RBAC and claims interpretation of Authorization code | Scenario (security) | +| Development | Document and/or enforce valid API usage | [Unit testing](./unit-testing/README.md); Functional tests; [Consumer-driven Contract Testing](./cdc-testing/README.md) | +| Development | Prove implementation correctness in advance of a dependency or absent a dependency | [Unit testing](./unit-testing/README.md) (with mocks); [Unit testing](./unit-testing/README.md) (with emulators); [Consumer-driven Contract Testing](./cdc-testing/README.md) | +| Development | Ensure that the user interface is accessible | [Accessibility](../non-functional-requirements/accessibility.md) | +| Development | Ensure that users can operate the interface | [UI testing (automated)](./ui-testing/README.md) (human usability observation) | +| Development | Prevent regression in user experience | UI automation; [End-to-End testing](./e2e-testing/README.md) | +| Development | Detect and prevent 'noisy neighbor' phenomena | [Load testing](./performance-testing/load-testing.md) | +| Development | Detect availability drops | 
[Synthetic Transaction testing](./synthetic-monitoring-tests/README.md); Outside-in probes | +| Development | Prevent regression in 'composite' scenario use cases / workflows (e.g. an e-commerce system might have many APIs that used together in a sequence perform a "shop-and-buy" scenario) | [End-to-End testing](./e2e-testing/README.md); Scenario | +| Development; Operations | Prevent regressions in runtime performance metrics e.g. latency / cost / resource consumption; earlier is better | Rings; [Synthetic Transaction testing](./synthetic-monitoring-tests/README.md) / Transaction; Rollback Watchdogs | +| Development; Optimization | Compare any given metric between 2 candidate implementations or variations in functionality | Flighting; A/B testing | +| Development; Staging | Prove production system of provisioned capacity meets goals for reliability, availability, resource consumption, performance | [Load testing (stress)](./performance-testing/load-testing.md); Spike; Soak; [Performance testing](./performance-testing/README.md) | +| Development; Staging | Understand key user experience performance characteristics – latency, chattiness, resiliency to network errors | Load; [Performance testing](./performance-testing/README.md); Scenario (network partitioning) | +| Development; Staging; Operation | Discover melt points (the loads at which failure or maximum tolerable resource consumption occurs) for each individual component in the stack | Squeeze; [Load testing (stress)](./performance-testing/load-testing.md) | +| Development; Staging; Operation | Discover overall system melt point (the loads at which the end-to-end system fails) and which component is the weakest link in the whole stack | Squeeze; [Load testing (stress)](./performance-testing/load-testing.md) | +| Development; Staging; Operation | Measure capacity limits for given provisioning to predict or satisfy future provisioning needs | Squeeze; [Load testing (stress)](./performance-testing/load-testing.md) 
| +| Development; Staging; Operation | Create / exercise failover runbook | Failover drills | +| Development; Staging; Operation | Prove disaster recoverability – loss of data center (the meteor scenario); measure MTTR | DR drills | +| Development; Staging; Operation | Understand whether observability dashboards are correct, and telemetry is complete; flowing | Trace Validation; [Load testing (stress)](./performance-testing/load-testing.md); Scenario; [End-to-End testing](./e2e-testing/README.md) | +| Development; Staging; Operation | Measure impact of seasonality of traffic | [Load testing](./performance-testing/load-testing.md) | +| Development; Staging; Operation | Prove Transaction and alerts correctly notify / take action | [Synthetic Transaction testing](./synthetic-monitoring-tests/README.md) (negative cases); [Load testing](./performance-testing/load-testing.md) | +| Development; Staging; Operation; Optimizing | Understand scalability curve, i.e. how the system consumes resources with load | [Load testing (stress)](./performance-testing/load-testing.md); [Performance testing](./performance-testing/README.md) | +| Operation; Optimizing | Discover system behavior over long-haul time | Soak | +| Optimizing | Find cost savings opportunities | Squeeze | +| Staging; Operation | Measure impact of failover / scale-out (repartitioning, increasing provisioning) / scale-down | Failover drills; Scale drills | +| Staging; Operation | Create/Exercise runbook for increasing/reducing provisioning | Scale drills | +| Staging; Operation | Measure behavior under rapid changes in traffic | Spike | +| Staging; Optimizing | Discover cost metrics per unit load volume (what factors influence cost at what load points, e.g. 
cost per million concurrent users) | Load (stress) | +| Development; Operation | Discover points where a system is not resilient to unpredictable yet inevitable failures (network outage, hardware failure, VM host servicing, rack/switch failures, random acts of the Malevolent Divine, solar flares, sharks that eat undersea cable relays, cosmic radiation, power outages, renegade backhoe operators, wolves chewing on junction boxes, …) | Chaos | +| Development | Perform unit testing on Power platform custom connectors | [Custom Connector Testing](./unit-testing/custom-connector.md) | + +## Technology Specific Testing + +- [Using DevTest Pattern for building containers with AzDO](./tech-specific-samples/building-containers-with-azure-devops.md) +- [Using Azurite to run blob storage tests in pipeline](./tech-specific-samples/blobstorage-unit-tests/README.md) diff --git a/docs/automated-testing/cdc-testing/README.md b/docs/automated-testing/cdc-testing/README.md index 24938fb345..3fc7a56009 100644 --- a/docs/automated-testing/cdc-testing/README.md +++ b/docs/automated-testing/cdc-testing/README.md @@ -1,8 +1,8 @@ -# Consumer-driven Contract Testing (CDC) +# Consumer-Driven Contract Testing (CDC) Consumer-driven Contract Testing (or CDC for short) is a software testing methodology used to test components of a system in isolation while ensuring that provider components are compatible with the expectations that consumer components have of them. -## Why Consumer-driven Contract Testing +## Why Consumer-Driven Contract Testing CDC tries to overcome the [several painful drawbacks](https://pactflow.io/blog/proving-e2e-tests-are-a-scam) of automated E2E tests with components interacting together: @@ -21,7 +21,7 @@ Some E2E tests are still required to verify the system as a whole when deployed CDC testing was initially developed for testing RESTful APIs, but the pattern scales to all consumer-provider systems, and tooling for other messaging protocols besides HTTP does exist.
-## Consumer-driven Contract Testing Design Blocks +## Consumer-Driven Contract Testing Design Blocks In a [consumer-driven approach](https://martinfowler.com/articles/consumerDrivenContracts.html) the consumer drives changes to contracts between a consumer (the client) and a provider (the server). This may sound counterintuitive, but it helps providers create APIs that fit the real requirements of the consumers rather than trying to guess these in advance. Next we describe the CDC building blocks ordered by their occurrence in the development cycle. diff --git a/docs/automated-testing/e2e-testing/README.md b/docs/automated-testing/e2e-testing/README.md index 2505008b78..3dde56102c 100644 --- a/docs/automated-testing/e2e-testing/README.md +++ b/docs/automated-testing/e2e-testing/README.md @@ -6,7 +6,7 @@ At times, these systems are developed in different technologies by different tea ![End to End Testing](./images/e2e-testing.png) -## Why E2E Testing [The Why] +## Why E2E Testing In many commercial software application scenarios, a modern software system consists of its interconnection with multiple sub-systems. These sub-systems can be within the same organization or can be components of different organizations. Also, these sub-systems can have somewhat similar or different lifetime release cycle from the current system. As a result, if there is any failure or fault in any sub-system, it can adversely affect the whole software system leading to its collapse. @@ -14,9 +14,9 @@ In many commercial software application scenarios, a modern software system cons The above illustration is a testing pyramid from [Kent C. Dodd's blog](https://blog.kentcdodds.com/write-tests-not-too-many-mostly-integration-5e8c7fff591c) which is a combination of the pyramids from [Martin Fowler’s blog](https://martinfowler.com/bliki/TestPyramid.html) and the [Google Testing Blog](https://testing.googleblog.com/2015/04/just-say-no-to-more-end-to-end-tests.html). 
-The majority of your tests are at the bottom of the pyramid. As you move up the pyramid, the number of tests gets smaller. Also, going up the pyramid, tests get slower and more expensive to write, run, and maintain. Each type of testing vary for its purpose, application and the areas it's supposed to cover. For more information on comparison analysis of different testing types, please see this [## Unit vs Integration vs System vs E2E Testing](../README.md) document. +The majority of your tests are at the bottom of the pyramid. As you move up the pyramid, the number of tests gets smaller. Also, going up the pyramid, tests get slower and more expensive to write, run, and maintain. Each type of testing varies in its purpose, application and the areas it's supposed to cover. For a comparative analysis of the different testing types, please see the [Unit vs Integration vs System vs E2E Testing](./testing-comparison.md) document. -## E2E Testing Design Blocks [The What] +## E2E Testing Design Blocks ![E2E Testing Design Framework](./images/e2e-blocks.png) @@ -45,7 +45,7 @@ Following factors should be considered for building test cases: - For every scenario, one or more test cases should be created to test each and every functionality of the user functions. If possible, these test cases should be automated through the standard CI/CD build pipeline processes with the track of each successful and failed build in AzDO. - Every single condition should be enlisted as a separate test case. -## Applying the E2E testing [The How] +## Applying the E2E Testing Like any other testing, E2E testing also goes through formal planning, test execution, and closure phases.
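To make the planning / execution / closure flow concrete, here is a minimal, self-contained sketch of an automated E2E check. This is illustrative only and not from the playbook: the `/health` endpoint and the in-process stand-in service are invented so the example can run without a deployed system.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class FakeSystem(BaseHTTPRequestHandler):
    """Throwaway in-process stand-in for the deployed system under test."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *_):
        pass  # keep test output quiet


def run_e2e_check(base_url):
    """One E2E test case: exercise the public endpoint end to end and
    assert on what a real caller would observe."""
    with urlopen(base_url + "/health") as resp:
        return resp.status == 200 and json.load(resp)["status"] == "ok"


# Execution phase: start the system, run the scenario, then tear down (closure).
server = HTTPServer(("127.0.0.1", 0), FakeSystem)  # port 0 -> pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
healthy = run_e2e_check("http://127.0.0.1:%d" % server.server_port)
server.shutdown()
```

In a real project the same test body would point `base_url` at a deployed environment and be wired into the CI/CD pipeline described above.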
@@ -74,7 +74,7 @@ E2E testing is done with the following steps: - Report the Bugs in the bug reporting tool - Re-verify the bug fixes -### Test closure +### Test Closure - Test report preparation - Evaluation of exit criteria diff --git a/docs/automated-testing/e2e-testing/recipes/README.md b/docs/automated-testing/e2e-testing/recipes/README.md deleted file mode 100644 index 0d6f0fc1ed..0000000000 --- a/docs/automated-testing/e2e-testing/recipes/README.md +++ /dev/null @@ -1,4 +0,0 @@ -# Templates - -- [Gauge Framework](gauge-framework.md) -- [Postman](postman-testing.md) \ No newline at end of file diff --git a/docs/automated-testing/e2e-testing/recipes/gauge-framework.md b/docs/automated-testing/e2e-testing/recipes/gauge-framework.md index c66eb351b4..00ac139472 100644 --- a/docs/automated-testing/e2e-testing/recipes/gauge-framework.md +++ b/docs/automated-testing/e2e-testing/recipes/gauge-framework.md @@ -156,7 +156,7 @@ This getting started guide takes you through the core features of Gauge. By the This section gives specific instructions on setting up Gauge in a Microsoft Windows environment. Download the following [installation bundle](https://github.com/getgauge/gauge/releases/download/v1.0.6/gauge-1.0.6-windows.x86_64.exe) to get the latest stable release of Gauge. 
-### Step 2: Installing Gauge extension for Visual Studio Code +### Step 2: Installing Gauge Extension for Visual Studio Code Follow the steps to add the Gauge Visual Studio Code plugin from the IDE @@ -196,7 +196,7 @@ This section gives specific instructions on setting up Gauge in a macOS environm /usr/local/Cellar/gauge/1.4.3: 6 files, 18.9MB ``` -### Step 2 : Installing Gauge extension for Visual Studio Code +### Step 2 : Installing Gauge Extension for Visual Studio Code Follow the steps to add the Gauge Visual Studio Code plugin from the IDE diff --git a/docs/automated-testing/e2e-testing/recipes/postman-testing.md b/docs/automated-testing/e2e-testing/recipes/postman-testing.md index 5cbbfc73df..9cd104a5cc 100644 --- a/docs/automated-testing/e2e-testing/recipes/postman-testing.md +++ b/docs/automated-testing/e2e-testing/recipes/postman-testing.md @@ -133,7 +133,7 @@ Steps may look like the following: ``` 5. Build a script that automatically generates your environment files. - > NOTE: App Configuration references Key Vault, however, your script is responsible for authenticating properly to both App Configuration and Key Vault. The two services don't communicate directly. + > **Note:** App Configuration references Key Vault, however, your script is responsible for authenticating properly to both App Configuration and Key Vault. The two services don't communicate directly. ```powershell (CreatePostmanEnvironmentFiles.ps1) # Please treat this as pseudocode, and adjust as necessary. @@ -172,7 +172,7 @@ Ending with this approach has the following downsides: - Secrets may happen to get exposed in the git commit history if .gitIgnore is not updated to ignore Postman Environment files. - Collections can only be used locally to hit APIs (local or deployed). Not CI based. 
-### Use Case - E2E testing With Continuous Integration and Newman +### Use Case - E2E Testing with Continuous Integration and Newman A developer or QA analyst may have an existing API test suite of local Postman Collections that follow security best practices for development, however, they now want E2E tests to run as part of automated CI pipeline. With the advent of Newman, you can now more readily use Postman to craft an API test suite executable in your CI. @@ -198,12 +198,12 @@ Steps may look like the following: envVars = az appconfig kv list --name PostmanAppConfig --label $env | ConvertFrom-Json # 3. step through envVars array to get Key Vault uris keyvaultURI = "" - @envVars | % {if($_.key -eq 'password'){keyvaultURI = $_.value}} + @envVars | % {if($_.key -eq 'password'){keyvaultURI = $_.value}} # 4. parse uris for Key Vault name and secret names # 5. get secret from Key Vault kvsecret = az keyvault secret show --name $secretName --vault-name $keyvaultName --query "value" # 6. set password value to returned Key Vault secret - $envVars | % {if($_.key -eq 'password'){$_.value=$kvsecret}} + $envVars | % {if($_.key -eq 'password'){$_.value=$kvsecret}} # 7. create environment file envFile = @{ "_postman_variable_scope" = "environment", "name" = $env, values = @() } foreach($var in $envVars){ diff --git a/docs/automated-testing/fault-injection-testing/README.md b/docs/automated-testing/fault-injection-testing/README.md index 9b56fd7497..80b0473293 100644 --- a/docs/automated-testing/fault-injection-testing/README.md +++ b/docs/automated-testing/fault-injection-testing/README.md @@ -1,6 +1,6 @@ # Fault Injection Testing -Fault injection testing is the deliberate introduction of errors and faults to a system to validate and harden its [stability and reliability](../../reliability/README.md). The goal is to improve the system's design for resiliency and performance under intermittent failure conditions over time. 
+Fault injection testing is the deliberate introduction of errors and faults to a system to validate and harden its [stability and reliability](../../non-functional-requirements/reliability.md). The goal is to improve the system's design for resiliency and performance under intermittent failure conditions over time. ## When To Use @@ -38,7 +38,7 @@ Fault injection is an advanced form of testing where the system is subjected to Fault injection testing is a specific approach to testing one condition. It introduces a failure into a system to validate its robustness. Chaos engineering, coined by Netflix, is a practice for generating new information. There is an overlap in concerns and often in tooling between the terms, and many times chaos engineering uses fault injection to introduce the required effects to the system. -### High-level Step-by-step +### High-level Step-by-Step #### Fault Injection Testing in the Development Cycle @@ -52,7 +52,7 @@ Examples of performing fault injection during the development lifecycle: * Write regression and acceptance tests based on issues that were found and fixed or based on resolved service incidents. * Ad-hoc (manual) validations of fault in the dev environment for new features. -#### Fault injection testing in the release cycle +#### Fault Injection Testing in the Release Cycle Much like [Synthetic Monitoring Tests](../synthetic-monitoring-tests/README.md), fault injection testing in the release cycle is part of the [Shift-Right testing](https://learn.microsoft.com/en-us/devops/deliver/shift-right-test-production) approach, which uses safe methods to perform tests in a production or pre-production environment. Given the nature of distributed, cloud-based applications, it is very difficult to simulate the real behavior of services outside their production environment. Testers are encouraged to run tests where it really matters, on a live system with customer traffic.
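The baseline → induce fault → observe loop described above can be illustrated with a toy in-process harness. This is only a sketch under invented names (`FlakyDependency`, `fetch_with_retry`): a real fault injection test would target a deployed system, not a stub, but the shape of the validation is the same.

```python
import time


class FlakyDependency:
    """Test double that injects `fail_times` transient faults before succeeding."""

    def __init__(self, fail_times):
        self.fail_times = fail_times
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls <= self.fail_times:
            raise ConnectionError("injected transient fault")
        return "payload"


def fetch_with_retry(dep, attempts=3, backoff_s=0.0):
    """Behavior under test: a bounded retry policy around the dependency."""
    for attempt in range(1, attempts + 1):
        try:
            return dep.fetch()
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(backoff_s)


# Induce a fault burst shorter than the retry budget: the system should recover.
flaky = FlakyDependency(fail_times=2)
assert fetch_with_retry(flaky, attempts=3) == "payload"

# Induce a burst longer than the budget: the fault must surface, not hang.
dead = FlakyDependency(fail_times=5)
try:
    fetch_with_retry(dead, attempts=3)
    raise AssertionError("expected the injected fault to propagate")
except ConnectionError:
    pass
```

Documenting the observations (how many retries were consumed, how the error surfaced) is what turns such a run into actionable reliability feedback.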
@@ -65,7 +65,7 @@ Fault injection tests rely on metrics observability and are usually statistical; * Document the process and the observations. * Identify and act on the result. -#### Fault injection testing in kubernetes +#### Fault Injection Testing in Kubernetes With the advancement of kubernetes (k8s) as the infrastructure platform, fault injection testing in kubernetes has become inevitable to ensure that the system behaves in a reliable manner in the event of a fault or failure. There could be different types of workloads running within a k8s cluster, written in different languages. For example, within a K8s cluster you can run a microservice, a web app and/or a scheduled job. Hence you need a mechanism to inject faults into any kind of workload running within the cluster. In addition, kubernetes clusters are managed differently from traditional infrastructure. The tools used for fault injection testing within kubernetes should have compatibility with k8s infrastructure. These are the main characteristics which are required: diff --git a/docs/automated-testing/integration-testing/README.md b/docs/automated-testing/integration-testing/README.md index 01d6a83c63..e3698a4098 100644 --- a/docs/automated-testing/integration-testing/README.md +++ b/docs/automated-testing/integration-testing/README.md @@ -12,7 +12,7 @@ Consider a banking application with three modules: login, transfers, and current Integration testing is done by the developer or QA tester. In the past, integration testing always happened after unit and before system and E2E testing. Compared to unit-tests, integration tests are fewer in quantity, usually run slower, and are more expensive to set up and develop. Now, if a team is following agile principles, integration tests can be performed before or after unit tests, early and often, as there is no need to wait for sequential processes. Additionally, integration tests can utilize mock data in order to simulate a complete system.
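To make this concrete, here is a minimal Python sketch of an integration check across the login and transfers modules from the banking example above — the module behavior, credentials, and balances are invented purely for illustration:

```python
# Hypothetical stand-ins for the banking application's modules.
class LoginModule:
    def authenticate(self, user, password):
        # Toy rule: a fixed password yields a session; anything else fails.
        return {"user": user, "token": f"token-{user}"} if password == "secret" else None

class TransferModule:
    def __init__(self, balances):
        self.balances = balances

    def transfer(self, session, src, dst, amount):
        # Refuse transfers without a valid session or sufficient funds.
        if session is None or self.balances[src] < amount:
            return False
        self.balances[src] -= amount
        self.balances[dst] += amount
        return True

# Integration test: verify the two modules cooperate at a technical level --
# a session produced by login is honoured by the transfers module.
login, transfers = LoginModule(), TransferModule({"alice": 100, "bob": 0})
session = login.authenticate("alice", "secret")
assert transfers.transfer(session, "alice", "bob", 40)
assert transfers.balances == {"alice": 60, "bob": 40}
# And a failed login must not allow transfers.
assert not transfers.transfer(login.authenticate("alice", "wrong"), "alice", "bob", 1)
```

Note how the assertions target the technical contract between the modules, not a business scenario — the latter would be acceptance-test territory.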
There is an abundance of language-specific testing frameworks that can be used throughout the entire development lifecycle. -\*\* It is important to note the difference between integration and acceptance testing. Integration testing confirms a group of components work together as intended from a technical perspective, while acceptance testing confirms a group of components work together as intended from a business scenario. +> It is important to note the difference between integration and acceptance testing. Integration testing confirms a group of components work together as intended from a technical perspective, while acceptance testing confirms a group of components work together as intended from a business perspective. ## Applying Integration Testing diff --git a/docs/automated-testing/performance-testing/README.md b/docs/automated-testing/performance-testing/README.md index bd068f8040..11b33b2ad7 100644 --- a/docs/automated-testing/performance-testing/README.md +++ b/docs/automated-testing/performance-testing/README.md @@ -38,7 +38,7 @@ following: the cost of running the hardware and software infrastructure. - Assess the **system's readiness** for release: - + - Evaluating the system's performance characteristics (response time, throughput) in a production-like environment. The goal is to ensure that performance goals can be achieved upon release.
The goal of endurance testing is to make sure that the system can maintain good performance under extended periods of load. -### Spike testing +### Spike Testing The goal of Spike testing is to validate that a software system can respond well to large and sudden spikes. -### Chaos testing +### Chaos Testing Chaos testing or Chaos engineering is the practice of experimenting on a system to build confidence that the system can withstand turbulent conditions in @@ -93,7 +93,7 @@ Developers often implement fallback procedures for service failure. Chaos testing arbitrarily shuts down different parts of the system to validate that fallback procedures function correctly. -## Best practices +## Best Practices Consider the following best practices for performance testing: @@ -109,7 +109,7 @@ Consider the following best practices for performance testing: single IP address. If you are testing a system that has this type of restriction, you can use different IP addresses to simulate multiple users. -## Performance monitor metrics +## Performance Monitor Metrics When executing the various types of testing approaches, whether it is stress, endurance, spike, or chaos testing, it is important to capture various @@ -235,19 +235,19 @@ of data being sent and received within a unit of time. | Pages/sec | This is actually the sum of "Pages Input/sec" and "Pages Output/sec" counters which is the rate at which pages are being read and written as a result of pages faults. Small spikes with this value do not mean there is an issue but sustained values of greater than 50 can mean that system memory is a bottleneck. | | Paging File(_Total)\% Usage | The percentage of the system page file that is currently in use. This is not directly related to performance, but you can run into serious application issues if the page file does become completely full and additional memory is still being requested by applications. 
| -## Key Performance testing activities +## Key Performance Testing Activities Performance testing activities vary depending on the subcategory of performance testing and the system's requirements and constraints. For specific guidance you can follow the link to the subcategory of performance tests listed above. The following activities might be included depending on the performance test subcategory: -### Identify the Acceptance criteria for the tests +### Identify the Acceptance Criteria for the Tests This will generally include identifying the goals and constraints for the performance characteristics of the system -### Plan and design the tests +### Plan and Design the Tests In general we need to consider the following points: @@ -271,7 +271,7 @@ In general we need to consider the following points: - Execute the tests and collect performance metrics. -### Result analysis and re-testing +### Result Analysis and Re-testing - Analyze the results/performance metrics from the tests. @@ -283,5 +283,4 @@ The [Iterative Performance Test Template](./iterative-perf-test-template.md) can ## Resources -- [Patters and Practices: Performance Testing Guidance for Web - Applications](https://learn.microsoft.com/en-us/archive/blogs/dajung/ebook-pnp-performance-testing-guidance-for-web-applications) +- [Patterns and Practices: Performance Testing Guidance for Web Applications](https://learn.microsoft.com/en-us/archive/blogs/dajung/ebook-pnp-performance-testing-guidance-for-web-applications) diff --git a/docs/automated-testing/performance-testing/iterative-perf-test-template.md b/docs/automated-testing/performance-testing/iterative-perf-test-template.md index d851d8afb0..85fc0b23dd 100644 --- a/docs/automated-testing/performance-testing/iterative-perf-test-template.md +++ b/docs/automated-testing/performance-testing/iterative-perf-test-template.md @@ -26,7 +26,7 @@ ### Results ```md -In bullet points document the results from the test.
+In bullet points document the results from the test. - Attach any documents supporting the test results. - Add links to the dashboard for metrics and logs such as Application Insights. - Capture screenshots for metrics and include it in the results. Good candidate for this is CPU/Memory/Disk usage. @@ -34,4 +34,4 @@ In bullet points document the results from the test. ### Observations -> Observations are insights derived from test results. Keep the observations brief and as bullet points. Mention outcomes supporting the goal of the iteration. If any of the observation results in a work item (task, story, bug) then add the link to the work item together with the observation. +> Observations are insights derived from test results. Keep the observations brief and as bullet points. Mention outcomes supporting the goal of the iteration. If any of the observation results in a work item (task, story, bug) then add the link to the work item together with the observation. diff --git a/docs/automated-testing/performance-testing/load-testing.md b/docs/automated-testing/performance-testing/load-testing.md index 3cccf3a3c9..3e9304e99d 100644 --- a/docs/automated-testing/performance-testing/load-testing.md +++ b/docs/automated-testing/performance-testing/load-testing.md @@ -12,7 +12,7 @@ Additionally, the results of a load test can also be used as data to help with c ## Load Testing Design Blocks -There are a number of basic components that are required to carry out a load test. +There are a number of basic components that are required to carry out a load test. 1. In order to have meaningful results the system needs to be tested in a production-like environment with a network and hardware which closely resembles the expected deployment environment. @@ -51,24 +51,25 @@ Evaluate whether load tests should be run as part of the PR strategy. ### Execution -It is recommended to use an existing testing framework (see below). 
These tools will provide a method of both specifying the user activity scenarios and how to execute those at load. Depending on the situation, it may be advisable to coordinate testing activities with the platform operations team. +It is recommended to use an existing testing framework (see below). These tools will provide a method of both specifying the user activity scenarios and how to execute those at load. Depending on the situation, it may be advisable to coordinate testing activities with the platform operations team. It is common to slowly ramp up to your desired load to better replicate real world behavior. Once you have reached your defined workload, maintain this level long enough to see if your system stabilizes. To finish up the test you should also ramp down to record how the system slows down as well. You should also consider the origin of your load test traffic. Depending on the scope of the target system you may want to initiate from a different location to better replicate real world traffic such as from a different region. -**Note:** Before starting please be aware of any restrictions on your network such as DDOS protection where you may need to notify a network administrator or apply for an exemption. - -**Note:** In general, the preferred approach to load testing would be the usage of a standard test framework such as the ones discussed below. There are cases, however, where a custom test client may be advantageous. Examples include batch oriented workloads that can be run under a single security context and the same test data can be re-used for multiple load tests. In such a scenario it may be beneficial to develop a custom script that can be used interactively as well as non-interactively. +> **Note:** Before starting please be aware of any restrictions on your network such as DDOS protection where you may need to notify a network administrator or apply for an exemption.
+> +> **Note:** In general, the preferred approach to load testing would be the usage of a standard test framework such as the ones discussed below. There are cases, however, where a custom test client may be advantageous. Examples include batch oriented workloads that can be run under a single security context and the same test data can be re-used for multiple load tests. In such a scenario it may be beneficial to develop a custom script that can be used interactively as well as non-interactively. ### Analysis The analysis phase represents the work that brings all previous activities together: -* Set aside time to allow for collection of new test data based on the analysis of the load tests. -* Correlate application metrics and platform metrics to identify potential pitfalls and bottlenecks. -* Include business stakeholders early in the analysis phase to validate application findings. Include platform operations to validate platform findings. -### Report writing +- Set aside time to allow for collection of new test data based on the analysis of the load tests. +- Correlate application metrics and platform metrics to identify potential pitfalls and bottlenecks. +- Include business stakeholders early in the analysis phase to validate application findings. Include platform operations to validate platform findings. + +### Report Writing Summarize your findings from the analysis phase. Be sure to include application and platform enhancement suggestions, if any. diff --git a/docs/automated-testing/shadow-testing/README.md b/docs/automated-testing/shadow-testing/README.md index 0427424765..1b522a9190 100644 --- a/docs/automated-testing/shadow-testing/README.md +++ b/docs/automated-testing/shadow-testing/README.md @@ -2,13 +2,13 @@ Shadow testing is one approach to reduce risks before going to production. Shadow testing is also known as "Shadow Deployment" or "Shadowing Traffic" and has similarities with "Dark launching".
-## When to use +## When to Use Shadow Testing reduces risks when you consider replacing the current environment (V-Current) with a candidate environment that includes a new feature (V-Next). This approach monitors and captures differences between the two environments, then compares them to reduce risks before you introduce a new feature/release. In our test cases, code coverage is very important; however, code coverage alone can make it tricky to replicate real-life combinations and possibilities. In this approach, to test the V-Next environment we have a side-by-side deployment: we replicate the same traffic as the V-Current environment and direct it to the V-Next environment. The only difference is that we don't return any response from the V-Next environment to users; instead we collect those responses to compare with the V-Current responses. -![Shadow Testing Overview](images/shadow-testing.png) +![Shadow Testing Overview](./images/shadow-testing.png) Referencing back to one of the Principles of Chaos Engineering, which mentions the importance of sampling real traffic: @@ -32,6 +32,7 @@ There are some tools to implement shadow testing. The main purpose of these tool - [Envoy](https://www.envoyproxy.io) - [McRouter](https://github.com/facebook/mcrouter) - [Scientist](https://github.com/github/scientist) +- [Keploy](https://github.com/keploy/keploy) One of the most popular tools is [Diffy](https://github.com/opendiffy/diffy). It was created and used at Twitter. Now the original author, a former Twitter employee, maintains their own version of this project, called [Opendiffy](https://github.com/opendiffy/diffy). Twitter announced this tool on their engineering blog as "[Testing services without writing tests](https://blog.twitter.com/engineering/en_us/a/2015/diffy-testing-services-without-writing-tests.html)".
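The core Diffy/Scientist idea — multicast each request to both versions, serve the current version's response, and record any divergence for offline analysis — can be sketched at the function level in Python; the implementations and the injected regression below are illustrative only:

```python
mismatches = []

def shadow_call(request, current, candidate):
    """Send one request to both implementations; always serve the current
    response, and record any divergence for later comparison (Diffy-style)."""
    primary = current(request)
    try:
        shadow = candidate(request)
        if shadow != primary:
            mismatches.append({"request": request, "current": primary, "candidate": shadow})
    except Exception as err:  # a candidate failure must never affect users
        mismatches.append({"request": request, "error": repr(err)})
    return primary

# V-Current and V-Next stand-ins, with a deliberate regression in V-Next.
v_current = lambda r: r.upper()
v_next = lambda r: r.upper() if r != "edge" else "EDGE!"

for req in ["one", "two", "edge"]:
    shadow_call(req, v_current, v_next)
print(mismatches)
```

Only the divergence on `"edge"` is recorded; users always received the V-Current response, which is exactly the risk-reduction property shadow testing relies on.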
@@ -39,7 +40,7 @@ As of today Diffy is used in production by Twitter, Airbnb, Baidu and Bytedance > Diffy finds potential bugs in your service using running instances of your new code, and your old code side by side. Diffy behaves as a proxy and multicasts whatever requests it receives to each of the running instances. It then compares the responses, and reports any regressions that may surface from those comparisons. The premise for Diffy is that if two implementations of the service return “similar” responses for a sufficiently large and diverse set of requests, then the two implementations can be treated as equivalent, and the newer implementation is regression-free. -![Diffy Shadow Testing Architecture](images/diffy-shadow-testing.png) +![Diffy Shadow Testing Architecture](./images/diffy-shadow-testing.png) Diffy architecture @@ -54,7 +55,7 @@ Some advantages of shadow testing are: - We can test real-life scenarios with real-life data. - We can simulate scale with replicated production traffic. -## References +## Resources - [Martin Fowler - Dark Launching](https://martinfowler.com/bliki/DarkLaunching.html) - [Martin Fowler - Feature Toggle](https://martinfowler.com/bliki/FeatureToggle.html) diff --git a/docs/automated-testing/synthetic-monitoring-tests/README.md b/docs/automated-testing/synthetic-monitoring-tests/README.md index cb4f5b7425..f07f29282e 100644 --- a/docs/automated-testing/synthetic-monitoring-tests/README.md +++ b/docs/automated-testing/synthetic-monitoring-tests/README.md @@ -2,7 +2,7 @@ Synthetic Monitoring Tests are a set of functional tests that target a live system in production. The focus of these tests, which are sometimes named "watchdog", "active monitoring" or "synthetic transactions", is to verify the product's health and resilience continuously. 
-## Why Synthetic Monitoring tests +## Why Synthetic Monitoring Tests Traditionally, software providers rely on testing through CI/CD stages in the well known [testing pyramid](https://martinfowler.com/bliki/TestPyramid.html) (unit, integration, e2e) to validate that the product is healthy and without regressions. Such tests will run on the build agent or in the test/stage environment before being deployed to production and released to live user traffic. During the services' lifetime in the production environment, they are safeguarded by monitoring and alerting tools that rely on Real User Metrics/Monitoring ([RUM](https://en.wikipedia.org/wiki/Real_user_monitoring)). @@ -21,7 +21,7 @@ Synthetic Monitoring tests are a subset of tests that run in production, sometim With [Shift-Left](https://en.wikipedia.org/wiki/Shift-left_testing) paradigms that are so popular, the approach is to perform testing as early as possible in the application development lifecycle (i.e., moved left on the project timeline). Shift right complements and adds on top of Shift-Left. It refers to running tests late in the cycle, during deployment, release, and post-release when the product is serving production traffic. They provide modern engineering teams a broader set of tools to assure high SLAs over time. -## Synthetic Monitoring tests Design Blocks +## Synthetic Monitoring Tests Design Blocks A synthetic monitoring test is a test that uses synthetic data and real testing accounts to inject user behaviors into the system and validates their effect, usually by passively relying on existing monitoring and alerting capabilities. Components of synthetic monitoring tests include **Probes**, test code/accounts which generate data, and **Monitoring tools** placed to validate both the system's behavior under test and the health of the probes themselves.
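Synthetic monitors typically assert statistically, comparing an observed measurement against a historical norm rather than a fixed value. A minimal sketch of such a "deviation of the norm" check — the baseline numbers and the 3-sigma band are illustrative assumptions:

```python
from statistics import mean, stdev

def within_norm(history, observed, k=3.0):
    """Flag healthy if the observed metric lies within k standard
    deviations of the historical mean."""
    mu, sigma = mean(history), stdev(history)
    return abs(observed - mu) <= k * sigma

# Hypothetical AddToCart response times (ms) from previous runs.
baseline = [250, 245, 260, 255, 248, 252, 258, 244]

assert within_norm(baseline, 262)       # small drift: probably healthy
assert not within_norm(baseline, 400)   # large deviation: alert-worthy
```

In practice the baseline would come from the monitoring system's running averages (often sliced by time of day), and a breach would trigger the alerting path rather than a hard assertion.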
@@ -40,7 +40,7 @@ There would usually be a finite set of tests, and key metrics that are used to b ## Applying Synthetic Monitoring Tests -### Asserting the system under tests +### Asserting the System under Test Synthetic monitoring tests are usually statistical. Test metrics are compared against some historical or running average with a time dimension *(Example: Over the last 30 days, for this time of day, the mean average response time is 250ms for AddToCart operation with a standard deviation from the mean of +/- 32ms)*. So if an observed measurement is within a [deviation of the norm](https://en.wikipedia.org/wiki/Standard_deviation) at any time, the services are probably healthy. @@ -53,7 +53,7 @@ At a high level, building synthetic monitors usually consists of the following s - Set up monitoring alarms/actions/responses that detect the failure of the system to meet the desired goal of the metric. - Run the test case automation continuously at an appropriate interval. -### Monitoring the health of tests +### Monitoring the Health of Tests The probes' runtime is a production environment of its own, and the health of tests is critical. Many providers offer cloud-based systems that host such runtimes, while some organizations use existing production environments to run these tests on. Either way, a monitor-the-monitor strategy should be a first-class citizen of the production environment's alerting systems. @@ -75,7 +75,7 @@ Testing in production, in general, has a risk factor attached to it, which does - Skewed analytics (traffic funnels, A/B test results, etc.) - Auth/AuthZ - Tests are required to run in production where access to tokens and secrets may be restricted or more challenging to retrieve. -## Synthetic Monitoring tests Frameworks and Tools +## Synthetic Monitoring Tests Frameworks and Tools Most key monitoring/APM players have an enterprise product that supports synthetic monitoring built into their systems (see list below).
Such offerings make some of the risks raised above irrelevant as the integration and runtime aspects of the solution are OOTB. However, such solutions are typically pricey. diff --git a/docs/automated-testing/tech-specific-samples/README.md b/docs/automated-testing/tech-specific-samples/README.md deleted file mode 100644 index f6d7e47470..0000000000 --- a/docs/automated-testing/tech-specific-samples/README.md +++ /dev/null @@ -1,4 +0,0 @@ -# Tech specific samples - -- [azdo-container-dev-test-release](azdo-container-dev-test-release/README.md) -- [blobstorage-unit-tests](blobstorage-unit-tests/README.md) \ No newline at end of file diff --git a/docs/automated-testing/tech-specific-samples/blobstorage-unit-tests/README.md b/docs/automated-testing/tech-specific-samples/blobstorage-unit-tests/README.md index b985356bcf..fcbf2f9b50 100644 --- a/docs/automated-testing/tech-specific-samples/blobstorage-unit-tests/README.md +++ b/docs/automated-testing/tech-specific-samples/blobstorage-unit-tests/README.md @@ -4,7 +4,7 @@ This document determines the approach for writing automated tests with a short f Once private endpoints are enabled for the Azure Storage accounts, the current tests will fail when executed locally or as part of a pipeline because this connection will be blocked. -## Utilize an Azure Storage emulator - Azurite +## Utilize an Azure Storage Emulator - Azurite To emulate a local Azure Blob Storage, we can use [Azure Storage Emulator](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-emulator). The Storage Emulator currently runs only on Windows. If you need a Storage Emulator for Linux, one option is the community maintained, open-source Storage Emulator [Azurite](https://github.com/azure/azurite). 
@@ -14,7 +14,7 @@ Some differences in functionality exist between the Storage Emulator and Azure s There are several ways to install and run Azurite on your local system as listed [here](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azurite#install-and-run-azurite-by-using-npm). In this document we will cover `Install and run Azurite using NPM` and `Install and run the Azurite Docker image`. -## 1. Install and run Azurite +## 1. Install and Run Azurite ### a. Using NPM @@ -44,7 +44,7 @@ Azurite Queue service is starting at http://127.0.0.1:10001 Azurite Queue service is successfully listening at http://127.0.0.1:10001 ``` -### b. Using a docker image +### b. Using a Docker Image Another way to run Azurite is using docker, using default `HTTP` endpoint @@ -68,7 +68,7 @@ services: - "10001:10001" ``` -## 2. Run tests on your local machine +## 2. Run Tests on Your Local Machine Python 3.8.7 is used for this, but it should be fine on other 3.x versions as well. @@ -93,11 +93,11 @@ Python 3.8.7 is used for this, but it should be fine on other 3.x versions as we 1. In Azure Storage Explorer, select `Attach to a local emulator` - ![connect blob](images/blob_storage_connection.png) + ![connect blob](./images/blob_storage_connection.png) 1. Provide a Display name and port number, then your connection will be ready, and you can use Storage Explorer to manage your local blob storage. - ![attach to local](images/blob_storage_connection_attach.png) + ![attach to local](./images/blob_storage_connection_attach.png) To test and see how these endpoints are running you can attach your local blob storage to the [**Azure Storage Explorer**](https://azure.microsoft.com/en-us/features/storage-explorer/). 
@@ -136,9 +136,9 @@ Python 3.8.7 is used for this, but it should be fine on other 3.x versions as we After running tests, you can see the files in your local blob storage -![https local blob](images/http_local_blob_storage.png) +![https local blob](./images/http_local_blob_storage.png) -## 3. Run tests on Azure Pipelines +## 3. Run Tests on Azure Pipelines After running tests locally we need to make sure these tests pass on Azure Pipelines too. We have two options here: we can use a docker image as a hosted agent on Azure or install an npm package in the pipeline steps. @@ -155,7 +155,7 @@ steps: - bash: | pip install -r requirements_tests.txt displayName: 'Setup requirements for tests' - + - bash: | sudo npm install -g azurite sudo mkdir azurite @@ -181,4 +181,4 @@ steps: Once we set up our pipeline in Azure Pipelines, the result will look like below -![azure pipelines](images/azure_pipeline.png) +![azure pipelines](./images/azure_pipeline.png) diff --git a/docs/automated-testing/tech-specific-samples/azdo-container-dev-test-release/README.md b/docs/automated-testing/tech-specific-samples/building-containers-with-azure-devops.md similarity index 97% rename from docs/automated-testing/tech-specific-samples/azdo-container-dev-test-release/README.md rename to docs/automated-testing/tech-specific-samples/building-containers-with-azure-devops.md index f8faa3bcac..b7923d22d0 100644 --- a/docs/automated-testing/tech-specific-samples/azdo-container-dev-test-release/README.md +++ b/docs/automated-testing/tech-specific-samples/building-containers-with-azure-devops.md @@ -1,4 +1,4 @@ -# Building Containers with Azure DevOps using DevTest Pattern +# Building Containers with Azure DevOps Using the DevTest Pattern In this document, we highlight learnings from applying the DevTest pattern to container development in Azure DevOps through pipelines.
@@ -8,13 +8,6 @@ We will dive into tools needed to build, test and push a container, our environm Follow this link to dive deeper or revisit the [DevTest pattern](https://learn.microsoft.com/en-us/azure/architecture/solution-ideas/articles/dev-test-paas). -## Table of Contents - -[Build the Container](#build-the-container) -[Test the Container](#test-the-container) -[Push Container](#push-container) -[References](#references) - ## Build the Container The first step in container development, after creating the necessary Dockerfiles and source code, is building the container. Even the Dockerfile itself can include some basic testing. Code tests are performed when pushing the code to the repository origin, where it is then used to build the container. @@ -161,7 +154,7 @@ As a last task of this pipeline to build and test the container, we set a variab echo '##vso[task.setvariable variable=testsPassed]true' ``` -## Push container +## Push the Container After building and testing, if our container runs as expected, we want to release it to our Azure Container Registry (ACR) to be used by our larger application. Before that, we want to automate the push behavior and define a meaningful tag. @@ -271,7 +264,7 @@ if the tests succeeded: If you don't want to include the `latest` tag, you can also remove the steps involving latest (SetLatestSuffixTag & pushSuccessfulDockerImageLatest). 
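One way to think about the "meaningful tag" mentioned above is as a pure function of the branch, the build id, and the test outcome. The sketch below is a hedged illustration — the naming scheme is an assumption for this example, not the format prescribed by the DevTest pattern or the pipeline above:

```python
def image_tag(branch: str, build_id: str, tests_passed: bool) -> str:
    """Derive an illustrative container tag: builds from main that passed
    tests get a clean tag; everything else is marked as a dev build.
    (Scheme is hypothetical -- adapt to your registry's conventions.)"""
    safe_branch = branch.replace("/", "-")  # ACR tags cannot contain '/'
    suffix = "" if branch == "main" and tests_passed else "-dev"
    return f"{safe_branch}-{build_id}{suffix}"

assert image_tag("main", "20240101.1", True) == "main-20240101.1"
assert image_tag("feature/login", "42", False) == "feature-login-42-dev"
```

Keeping the tag derivation in one small, testable function (or one pipeline variable expression) makes it easy to reason about which images are release candidates versus dev builds.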
-## References +## Resources - [DevTest pattern](https://learn.microsoft.com/en-us/azure/architecture/solution-ideas/articles/dev-test-paas) - [Azure Docs on Azure DevOps](https://learn.microsoft.com/en-us/azure/devops/pipelines/build/variables?view=azure-devops&tabs=yaml#build-variables-devops-services) diff --git a/docs/automated-testing/templates/README.md b/docs/automated-testing/templates/README.md deleted file mode 100644 index eea709bfab..0000000000 --- a/docs/automated-testing/templates/README.md +++ /dev/null @@ -1,4 +0,0 @@ -# Templates - -- [case-study-template](./case-study-template.md) -- [test-type-template](./test-type-template.md) \ No newline at end of file diff --git a/docs/automated-testing/templates/case-study-template.md b/docs/automated-testing/templates/case-study-template.md index 14a520c8aa..c552bd8762 100644 --- a/docs/automated-testing/templates/case-study-template.md +++ b/docs/automated-testing/templates/case-study-template.md @@ -1,4 +1,6 @@ -# ~Customer Project~ Case Study +# Case study template + +**[Customer Project] Case Study** ## Background @@ -29,7 +31,7 @@ Describe any architecture solution were used to monitor, observe and track the v Describe any testing architecture that was built to run E2E testing. -## E2E Testing Implementation (Code samples) +## E2E Testing Implementation (Code Samples) Include sample test cases and their implementation in the programming language of choice. Include any common reusable code implementation blocks that could be leveraged in the future project's E2E testing implementation.
diff --git a/docs/automated-testing/templates/test-type-template.md b/docs/automated-testing/templates/test-type-template.md index 6f313e3383..0924c76761 100644 --- a/docs/automated-testing/templates/test-type-template.md +++ b/docs/automated-testing/templates/test-type-template.md @@ -1,4 +1,6 @@ -# Insert Test Technique Name Here +# Test Type Template + +**[Test Technique Name Here]** Put a 2-3 sentence overview about the test technique here. @@ -22,7 +24,7 @@ How much is enough? For example, some opine that unit test ROI drops significan - [ ] Build pipelines - [ ] Non-production deployments - [ ] Production deployments - + ## NOTE: If there is great (clear, succinct) documentation for the technique on the web, supply a pointer and skip the rest of this template. No need to re-type content ## How to Use @@ -35,7 +37,7 @@ Describe the components of the technique and how they interact with each other a Anything required in advance? -### High-level Step-by-step +### High-level Step-by-Step 1. 1. diff --git a/docs/automated-testing/ui-testing/teams-tests.md b/docs/automated-testing/ui-testing/teams-tests.md index 6d6ef5b557..b7939be5fb 100644 --- a/docs/automated-testing/ui-testing/teams-tests.md +++ b/docs/automated-testing/ui-testing/teams-tests.md @@ -1,10 +1,10 @@ -# Automated UI Tests for a Teams application +# Automated UI Tests for a Teams Application ## Overview This is an overview on how you can implement UI tests for a custom Teams application. The insights provided can also be applied to automated end-to-end testing. -### General observations +### General Observations - Testing in a web browser is easier than on a native app. - Testing a Teams app on a mobile device in an automated way is more challenging due to the fact that you are testing an app within an app: @@ -16,9 +16,9 @@ This is an overview on how you can implement UI tests for a custom Teams applica The following are learnings from various engagements: -## 1. 
Web based UI tests +## Web Based UI Tests -To implement web-based UI tests for your Teams application, follow the same approach as you would for testing any other web application with a UI. [UI testing](README.md) provides valuable guidance in this regard. Your starting point for the test would be to automatically launch a browser (using Selenium or similar frameworks) and navigate to [https://teams.microsoft.com](https://teams.microsoft.com). +To implement web-based UI tests for your Teams application, follow the same approach as you would for testing any other web application with a UI. UI testing provides valuable guidance in this regard. Your starting point for the test would be to automatically launch a browser (using Selenium or similar frameworks) and navigate to [https://teams.microsoft.com](https://teams.microsoft.com). If you want to test a Teams app that hasn’t been published in the Teams store yet or if you’d like to test the DEV/QA version of your app, you can use the [Teams Toolkit](https://github.com/OfficeDev/TeamsFx) and package your app based on the [manifest.json](https://learn.microsoft.com/microsoftteams/platform/resources/schema/manifest-schema). @@ -65,13 +65,13 @@ var buildEdgeDriver = function () { }; ``` -## 2. Mobile based UI tests +## Mobile Based UI Tests Testing your custom Teams application on mobile devices is a bit more difficult than using the web-based approach as it requires usage of actual or simulated devices. Running such tests in a CI/CD pipeline can be more difficult and resource-intensive. One approach is to use real devices or cloud-based emulators from vendors such as [BrowserStack](https://www.browserstack.com/) which requires a license. Alternatively, you can use virtual devices hosted in Azure Virtual Machines. -### a) Using Android Virtual Devices (AVD) +### Option 1: Using Android Virtual Devices (AVD) This approach enables the creation of Android UI tests using virtual devices. 
It comes with the advantage of not requiring paid licenses to certain vendors. However, due to the nature of emulators, compared to real devices, it may prove to be less stable. Always choose the solution that best fits your project requirements and resources. @@ -93,7 +93,7 @@ Overall setup: - The advantage of this architecture is that it opens the possibility of running the server in a VM, and the client in a pipeline, enabling the tests to be ran automatically on scheduled basis as part of CI/CD pipelines. -#### How to run mobile tests locally on a Windows machine using AVD? +#### How to Run Mobile Tests Locally on a Windows Machine Using AVD? This approach involves: @@ -135,7 +135,7 @@ Install `appium`: List emulators that you have previously created, without opening Android Studio: -```cli +```sh emulator -list-avds ``` @@ -153,7 +153,7 @@ This approach involves hosting a virtual device within a virtual machine. To set ##### Enable connection from outside to Appium server on the VM -> Note: By default appium server runs on port 4723. The rest of the steps will assume that this is the port where your appium server runs. +> **Note:** By default appium server runs on port 4723. The rest of the steps will assume that this is the port where your appium server runs. In order to be able to reach appium server which runs on the VM from outside: @@ -164,10 +164,10 @@ In order to be able to reach appium server which runs on the VM from outside: ##### Installing Android Studio and create AVD inside the VM -1. Follow the instructions under the [end to end tests on a Windows machine section](#running-mobile-test-locally-on-a-windows-machine) to install Android Studio and create an Android Virtual Device. +1. Follow the instructions under the [end to end tests on a Windows machine section](#how-to-run-mobile-tests-locally-on-a-windows-machine-using-avd) to install Android Studio and create an Android Virtual Device. 1. 
When you launch the emulator, it may show a warning as below and will eventually crash: - ![failure](images/warning.png) + ![failure](./images/warning.png) Solution to fix it: 1. [Enable Windows Hypervisor Platform](https://devblogs.microsoft.com/visualstudio/hyper-v-android-emulator-support/) @@ -212,11 +212,11 @@ Inspecting the app is highly valuable when writing new tests, as it enables you If the appium server runs on your local machine at the default portal, then Remote Host and Remote Port can be kept to the default values. The configuration should look similar to the printscren below: -![appium-inspector](images/appium-inspector.png) +![appium-inspector](./images/appium-inspector.png) 3. Press on **Start Session**. - In the browser, you should see a similar view as below: -![teams-appium-inspector](images/teams-appium-inspector.png) +![teams-appium-inspector](./images/teams-appium-inspector.png) - You can do any action on the emulator, and if you press on the "Refresh" button in the browser, the left hand side of the Appium Inspector will reflect your app. In the **App Source** you will be able to see the IDs of the elements, so you can write relevant selectors in your tests. @@ -250,7 +250,7 @@ Assuming you are using [webdriverio](https://webdriver.io/) as the client, you w - "appium:appActivity": the activity within Teams that you would like to launch on the device. In our case, we would like just to launch the app. The activity name for launching Teams is called "com.microsoft.skype.teams.Launcher". - "appium:automationName": the name of the driver you are using. Note: Appium can communicate to different platforms. This is achieved by installing a dedicated driver, designed for each platform. In our case, it would be [UiAutomator2](https://github.com/appium/appium-uiautomator2-driver) or [Espresso](https://github.com/appium/appium-espresso-driver), since they are both designed for Android platform. 
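Put together, the capabilities discussed above could be assembled in one place like the sketch below. This is a hedged illustration: the device name and the `appium:appPackage` id are assumptions, while the activity name and driver are the values quoted above.

```typescript
// Sketch of the Appium capabilities discussed above, for a webdriverio-style client.
// The device name and app package id are illustrative assumptions.
interface TeamsCapabilities {
  platformName: string;
  "appium:deviceName": string;
  "appium:appPackage": string;
  "appium:appActivity": string;
  "appium:automationName": string;
}

function buildTeamsCapabilities(deviceName: string): TeamsCapabilities {
  return {
    platformName: "Android",
    "appium:deviceName": deviceName, // the AVD created earlier
    "appium:appPackage": "com.microsoft.teams", // assumed package id for the Teams app
    "appium:appActivity": "com.microsoft.skype.teams.Launcher", // launches Teams, as noted above
    "appium:automationName": "UiAutomator2", // driver installed on the Appium server
  };
}

// A webdriverio session could then be created with something like:
// const driver = await remote({ hostname: "localhost", port: 4723,
//   capabilities: buildTeamsCapabilities("Pixel_5_API_33") });
```

Keeping capability construction in one helper makes it easy to swap `appium:automationName` to Espresso, or to point the session at the VM-hosted Appium server described earlier instead of localhost.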
-### b) Using BrowserStack +### Option 2: Using BrowserStack BrowserStack serves as a cloud-based platform that enables developers to test both the web and mobile application across various browsers, operating systems, and real mobile devices. This can be seen as an alternative solution to the approach described earlier. The specific insights provided below relate to implementing such tests for a custom Microsoft Teams application: diff --git a/docs/automated-testing/unit-testing/README.md b/docs/automated-testing/unit-testing/README.md index 49399cfec5..a8c02640d3 100644 --- a/docs/automated-testing/unit-testing/README.md +++ b/docs/automated-testing/unit-testing/README.md @@ -47,7 +47,7 @@ but the general techniques and best practices of writing a unit test are univers ### Techniques These are some commonly used techniques that will help when authoring unit tests. For some examples, see the pages on -using [abstraction and dependency injection to author a unit test](authoring_example.md), or how to do [test-driven development](tdd_example.md). +using [abstraction and dependency injection to author a unit test](./authoring-example.md), or how to do [test-driven development](./tdd-example.md). Note that some of these techniques are more specific to strongly typed, object-oriented languages. Functional languages and scripting languages have similar techniques that may look different, but these terms are commonly used in all unit @@ -58,7 +58,7 @@ testing examples. Abstraction is when we take an exact implementation detail, and we generalize it into a concept instead. This technique can be used in creating testable design and is used often especially in object-oriented languages. For unit tests, abstraction is commonly used to break a hard dependency and replace it with an abstraction. That abstraction then allows -for greater flexibility in the code and allows for the a [mock or simulator](mocking.md) to be used in its place. 
+for greater flexibility in the code and allows for a [mock or simulator](./mocking.md) to be used in its place. One of the side effects of abstracting dependencies is that you may have an abstraction that has no test coverage. This is case where unit testing is not well-suited, you can not expect to unit test everything, things like dependencies will @@ -68,7 +68,7 @@ should still be used - without that, a change in the way the dependency function When building wrappers around third-party dependencies, it is best to keep the implementations with as little logic as possible, using a very simple [facade](https://en.wikipedia.org/wiki/Facade_pattern) that calls the dependency. -An example of using abstraction can be found [here](authoring_example.md#abstraction). +An example of using abstraction can be found [here](./authoring-example.md#abstraction). #### Dependency Injection @@ -93,7 +93,7 @@ system. Many languages include special Dependency Injection frameworks that take care of the boilerplate code and construction of the objects. Examples of this are [Spring](https://spring.io/) in Java or built into [ASP.NET Core](https://learn.microsoft.com/en-us/aspnet/core/fundamentals/dependency-injection?view=aspnetcore-3.1) -An example of using dependency injection can be found [here](authoring_example.md#dependency-injection). +An example of using dependency injection can be found [here](./authoring-example.md#dependency-injection). #### Test-Driven Development @@ -103,7 +103,7 @@ write your test code first and then write the system under test to match the tes design is done up front and by the time you finish writing your system code, you are already at 100% test pass rate and test coverage. It also guarantees testable design is built into the system since the test was written first!
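As a minimal sketch of that test-first flow (illustrative names, not taken from the playbook's worked examples): the test below is conceptually written first and fails because `parseCsvLine` does not yet exist; the simplest implementation then makes it pass.

```typescript
// Step 1 (red): the test is written first and fails — parseCsvLine does not exist yet.
// Step 2 (green): the simplest implementation below makes it pass.
// Step 3 (refactor): clean up while the test keeps guarding the behavior.

// Simplest implementation that satisfies the test:
function parseCsvLine(line: string): string[] {
  return line.split(",").map((field) => field.trim());
}

// The "test", conceptually written before the implementation:
function parseCsvLine_FieldsWithWhitespace_ReturnsTrimmedFields(): void {
  const result = parseCsvLine("a, b ,c");
  if (result.length !== 3 || result[1] !== "b") {
    throw new Error("parseCsvLine failed");
  }
}

parseCsvLine_FieldsWithWhitespace_ReturnsTrimmedFields();
```

Because the test existed before the code, the function's contract (trimmed fields, one entry per comma-separated value) was pinned down up front, and any later refactor has to keep honoring it.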
-For more information on TDD and an example, see the page on [Test-Driven Development](./tdd_example.md) +For more information on TDD and an example, see the page on [Test-Driven Development](./tdd-example.md) ### Best Practices @@ -140,7 +140,7 @@ public void TrySomething_NoElements_ReturnsFalse() } ``` -#### Keep tests small and test only one thing +#### Keep Tests Small and Test Only One Thing Unit tests should be short and test only one thing. This makes it easy to diagnose when there was a failure without needing something like which line number the test failed at. When using [Arrange/Act/Assert](#arrangeactassert), think @@ -150,7 +150,7 @@ There is some disagreement on whether testing one thing means "assert one thing" multiple asserts if needed". Both have their advantages and disadvantages, but as with most technical disagreements there is no "right" answer. Consistency when writing your tests one way or the other is more important! -#### Using a standard naming convention for all unit tests +#### Using a Standard Naming Convention for All Unit Tests Without having a set standard convention for unit test names, unit test names end up being either not descriptive enough, or duplicated across multiple different test classes. Establishing a standard is not only important for keeping @@ -196,7 +196,7 @@ Many projects start with both a unit test framework, and also add a mock framewo uses and sometimes can be a requirement, it should not be something that is added without considering the broader implications and risks associated with heavy usage of mocks. -To see if mocking is right for your project, or if a mock-free approach is more appropriate, see the page on [mocking](mocking.md). +To see if mocking is right for your project, or if a mock-free approach is more appropriate, see the page on [mocking](./mocking.md). 
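To make that trade-off concrete, a hand-rolled stub — often sufficient in place of a full mocking framework — could look like the following sketch. The names are illustrative (loosely echoing the configuration-reader scenario used elsewhere in these pages), and the dependency arrives via constructor injection as described above.

```typescript
// Abstraction over a dependency such as a file on disk.
interface ConfigurationReader {
  read(): string[];
}

// A hand-rolled stub: just another implementation of the interface,
// returning canned lines instead of touching the file system.
class StubConfigurationReader implements ConfigurationReader {
  constructor(private lines: string[]) {}
  read(): string[] {
    return this.lines;
  }
}

// Subject under test receives the dependency via constructor injection.
class Configuration {
  private values = new Map<string, string>();
  constructor(private reader: ConfigurationReader) {}

  initialize(): void {
    for (const line of this.reader.read()) {
      const [key, value] = line.split("=");
      this.values.set(key, value);
    }
  }

  get(key: string): string | undefined {
    return this.values.get(key);
  }
}

const config = new Configuration(new StubConfigurationReader(["verbose=true"]));
config.initialize();
console.log(config.get("verbose")); // prints "true"
```

Because the stub is only another implementation of the interface, there is no setup/verify API to learn, and the test fails only when observable behavior changes — one of the mock-free trade-offs the mocking page weighs.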
### Tools @@ -208,9 +208,9 @@ extremely fast and allows for easy TDD: - [Infinitest](http://infinitest.github.io/) for Java - [PyCrunch](https://plugins.jetbrains.com/plugin/13264-pycrunch--live-testing) for Python -## Things to consider +## Things to Consider -### Transferring responsibility to integration tests +### Transferring Responsibility to Integration Tests In some situations it is worth considering to include the integration tests in the inner development loop to provide a sufficient code coverage to ensure the system is working properly. The prerequisite for this approach to be successful is to have integration tests being able to execute at a speed comparable to that of unit tests both locally and in a CI environment. Modern application frameworks like .NET or Spring Boot combined with the right mocking or stubbing approach for external dependencies offer excellent capabilities to enable such scenarios for testing. diff --git a/docs/automated-testing/unit-testing/authoring_example.md b/docs/automated-testing/unit-testing/authoring-example.md similarity index 98% rename from docs/automated-testing/unit-testing/authoring_example.md rename to docs/automated-testing/unit-testing/authoring-example.md index 039edbe99c..38688970c0 100644 --- a/docs/automated-testing/unit-testing/authoring_example.md +++ b/docs/automated-testing/unit-testing/authoring-example.md @@ -1,4 +1,4 @@ -# Example: Authoring a unit test +# Writing a Unit Test To illustrate some unit testing techniques for an object-oriented language, let's start with an example of some code we wish to add unit tests for. In this example, we have a configuration class that contains all the startup options @@ -193,7 +193,7 @@ public class ConfigurationTests } ``` -## Fixing the bug +## Fixing the Bug All our current tests pass, and give us 100% coverage, however as evidenced by the bug, we must not be covering all possible inputs and outputs. 
In the case of the bug, multiple empty lines would cause an issue. Additionally, @@ -303,7 +303,7 @@ confidence in future changes. ## Untestable Code -As described in the [abstraction section](README.md#abstraction), not all code can be properly unit tested. In our case +As described in the [abstraction section](./README.md#abstraction), not all code can be properly unit tested. In our case we have a single class that has 0% test coverage: `FileConfigurationReader`. This is expected; in this case we kept `FileConfigurationReader` as light as possible with no additional logic other than calling into the third-party dependency. `FileConfigurationReader` is an example of the [facade design pattern](https://en.wikipedia.org/wiki/Facade_pattern). diff --git a/docs/automated-testing/unit-testing/mocking.md b/docs/automated-testing/unit-testing/mocking.md index dcf9f7787b..662f0795c0 100644 --- a/docs/automated-testing/unit-testing/mocking.md +++ b/docs/automated-testing/unit-testing/mocking.md @@ -35,7 +35,7 @@ class StubTestCase(TestBase): def setUp(self) -> None: super(StubTestCase, self).setUp() self.app.container.service_a.override(StubService()) - + def test_service(): service = self.app.container.service_a() self.assertTrue(isinstance(service, StubService)) @@ -133,7 +133,7 @@ to do a time-intensive refactor to make the code unit testable. A common problem languages such as C# is not using dependency injection. Consider using dependency injection so that a mock can easily be injected into your Subject Under Test (SUT) during a unit test. -More information on using dependency injection can be found [here](authoring_example.md#dependency-injection). +More information on using dependency injection can be found [here](./authoring-example.md#dependency-injection). ### Assertions @@ -144,7 +144,6 @@ changes, consider not asserting on the return value. Because if you do, you are mock correctly. 
For a very simple example, look at this class: ```csharp - public class SearchController : ControllerBase { public ISearchClient SearchClient { get; } @@ -212,11 +211,11 @@ store the options in a callback for later assertions. var actualOptions = new SearchOptions(); mockSearchClient - .Setup(x => + .Setup(x => x.Search( - "[This parameter is most relevant]", + "[This parameter is most relevant]", It.IsAny() - ) + ) ) .Returns(mockResults) .Callback((query, searchOptions) => diff --git a/docs/automated-testing/unit-testing/tdd_example.md b/docs/automated-testing/unit-testing/tdd-example.md similarity index 100% rename from docs/automated-testing/unit-testing/tdd_example.md rename to docs/automated-testing/unit-testing/tdd-example.md diff --git a/docs/automated-testing/unit-testing/why-unit-tests.md b/docs/automated-testing/unit-testing/why-unit-tests.md index b4689f3b74..f46130793e 100644 --- a/docs/automated-testing/unit-testing/why-unit-tests.md +++ b/docs/automated-testing/unit-testing/why-unit-tests.md @@ -3,7 +3,7 @@ It is no secret that writing unit tests is hard, and even harder to write well. Writing unit tests also increases the development time for every feature. So why should we bother writing them? -## Reduce costs +## Reduce Costs There is no question that the later a bug is found, the more expensive it is to fix; especially so if the bug makes it into production. A [2008 research study by IBM](ftp://ftp.software.ibm.com/software/rational/info/do-more/RAW14109USEN.pdf) @@ -17,7 +17,7 @@ Having unit tests also helps with making safe, mechanical refactors that are pro refactoring tools to do mechanical refactoring and running unit tests that cover the refactored code should be enough to increase confidence in the commit. -## Speed up development +## Speed Up Development Unit tests take time to write, but they also speed up development? 
While this may seem like an oxymoron, it is one of the strengths of a unit testing suite - over time it continues to grow and evolve until the tests become an essential @@ -35,7 +35,7 @@ writing. Since unit tests execute really quickly, running tests shouldn't be see Tooling such as [Visual Studio Live Unit Testing](https://learn.microsoft.com/en-us/visualstudio/test/live-unit-testing-start?view=vs-2019) also help to shorten the inner loop even more. -## Documentation as code +## Documentation as Code Writing unit tests is a great way to show how the units of code you are writing are supposed to be used. In some ways, unit tests are better than any documentation or samples because they are (or at least should be) executed with every diff --git a/docs/code-reviews/README.md b/docs/code-reviews/README.md index 7bbd065770..67937b5ec4 100644 --- a/docs/code-reviews/README.md +++ b/docs/code-reviews/README.md @@ -12,6 +12,6 @@ Code review is a way to have a conversation about the code where participants wi ## Resources -- [Code review tools](tools.md) +- [Code review tools](./tools.md) - [Google's Engineering Practices documentation: How to do a code review](https://google.github.io/eng-practices/review/reviewer/) - [Best Kept Secrets of Peer Code Review](https://static1.smartbear.co/smartbear/media/pdfs/best-kept-secrets-of-peer-code-review_redirected.pdf) diff --git a/docs/code-reviews/evidence-and-measures/README.md b/docs/code-reviews/evidence-and-measures/README.md index 8e5c7fc205..a13ae2ba20 100644 --- a/docs/code-reviews/evidence-and-measures/README.md +++ b/docs/code-reviews/evidence-and-measures/README.md @@ -25,8 +25,6 @@ It is a perfectly reasonable solution to track these metrics manually e.g. in an Remember that since defects removed thanks to reviews is far less costly compared to finding them in production, the cost of doing code reviews is actually negative! -For more information, see links under [resources](#resources). 
- ## Resources * [A Guide to Code Inspections](http://www.ganssle.com/inspections.pdf) diff --git a/docs/code-reviews/faq.md b/docs/code-reviews/faq.md index 7f8e93ff5d..91a3dbc1f3 100644 --- a/docs/code-reviews/faq.md +++ b/docs/code-reviews/faq.md @@ -2,44 +2,45 @@ This is a list of questions / frequently occurring issues when working with code reviews and answers how you can possibly tackle them. -## What makes a code review different from a PR? +## What Makes a Code Review Different from a PR? A pull request (PR) is a way to notify a task is finished and ready to be merged into the main working branch (source of truth). A code review is having someone go over the code in a PR and validate it before it is merged, but, in general, code reviews can take place outside PRs too. | Code Review | Pull Request | ---- | --- +| -- | -- | | Source code focused | Intended to enhance and enable code reviews. Includes both source code but can have a broader scope (e.g., docs, integration tests, compiles) | | Intended for **early feedback** before submitting a PR | Not intended for **early feedback**. Created when author is ready to merge | | Usually a synchronous review with faster feedback cycles (draft PRs as an exception). Examples: scheduled meetings, over-the-shoulder review, pair programming | Usually a tool assisted asynchronous review but can be elevated to a synchronous meeting when needed | -## Why do we need code reviews? +## Why Do We Need Code Reviews? Our peer code reviews are structured around best practices, to find specific kinds of errors. Much like you would still run a linter over mobbed code, you would still ask someone to make the last pass to make sure the code conforms to expected standards and avoids common pitfalls. -## PRs are too large, how can we fix this? +## PRs Are Too Large, How Can We Fix This? Make sure you size the work items into small clear chunks, so the reviewer will be able to understand the code on their own.
The team is instructed to commit early, before the full product backlog item / user story is complete, but rather when an individual item is done. If the work would result in an incomplete feature, make sure it can be turned off, until the full feature is delivered. More information can be found in [Pull Requests - Size Guidance](./pull-requests.md#size-guidance). -## How can we expedite code reviews? +## How Can We Expedite Code Reviews? Slow code reviews might cause delays in delivering features and cause frustration amongst team members. -### Possible actions you can take +### Possible Actions You Can Take - Add a rule for PR turnaround time to your work agreement. - Set up a slot after the standup to go through pending PRs and assign the ones that are inactive. - Dedicate a PR review manager who will be responsible to keep things flowing by assigning or notifying people when PR got stale. -- Use tools to better indicate stale reviews - [Customize ADO - Task Boards](tools.md#task-boards). +- Use tools to better indicate stale reviews - [Customize ADO - Task Boards](./tools.md#task-boards). -## Which tools can I use to review a complex PR? +## Which Tools Can I Use to Review a Complex PR? Checkout the [Tools](./tools.md) for help on how to perform reviews out of Visual Studio or Visual Studio Code. -## How can we enforce code review policies? +## How Can We Enforce the Code Review Policies? + By configuring [Branch Policies](./tools.md#Configuring-Branch-Policies) , you can easily enforce code reviews rules. -## We pair or mob. How should this reflect in our code reviews? +## We Pair or Mob. How Should This Reflect in Our Code Reviews?
There are two ways to perform a code review: diff --git a/docs/code-reviews/process-guidance/README.md b/docs/code-reviews/process-guidance/README.md index 7b190c9582..87ea37879c 100644 --- a/docs/code-reviews/process-guidance/README.md +++ b/docs/code-reviews/process-guidance/README.md @@ -13,7 +13,7 @@ To ensure that the code review process is healthy, inclusive and meets the goals - Utilize tools to streamline the review process - [Code review tools](../tools.md) - Foster inclusive code reviews - [Inclusion in Code Review](../inclusion-in-code-review.md) -## Measuring code review process +## Measuring Code Review Process If the team is finding that code reviews are taking a significant time to merge, and it is becoming a blocker, consider the following additional recommendations: @@ -22,15 +22,15 @@ If the team is finding that code reviews are taking a significant time to merge, 1. Assess the time to merge across sprints to see if the process is improving. 1. Ping required approvers directly as a reminder. -## Code reviews shouldn't include too many lines of code +## Code Reviews Shouldn't Include Too Many Lines of Code It's easy to say a developer can review few hundred lines of code, but when the code surpasses certain amount of lines, the effectiveness of defects discovery will decrease and there is a lesser chance of doing a good review. It's not a matter of setting a code line limit, but rather using common sense. More code there is to review, the higher chances there are letting a bug sneak through. See [PR size guidance](../pull-requests.md#size-guidance). -## Automate whenever reasonable +## Automate Whenever Reasonable -Use automation (linting, code analysis etc.) to avoid the need for "[nits](https://en.wikipedia.org/wiki/Nitpicking)" and allow the reviewer to focus more on the functional aspects of the PR.
By configuring automated builds, tests and checks (something achievable in the [CI process](../../continuous-integration/README.md)), teams can save human reviewers some time and let them focus in areas like design and functionality for proper evaluation. This will ensure higher chances of success as the team is focusing on the things that matter. +Use automation (linting, code analysis etc.) to avoid the need for "[nits](https://en.wikipedia.org/wiki/Nitpicking)" and allow the reviewer to focus more on the functional aspects of the PR. By configuring automated builds, tests and checks (something achievable in the [CI process](../../CI-CD/continuous-integration.md)), teams can save human reviewers some time and let them focus on areas like design and functionality for proper evaluation. This will ensure higher chances of success as the team is focusing on the things that matter. ## Role specific guidance -- [Author Guidance](author-guidance.md) -- [Reviewer Guidance](reviewer-guidance.md) +- [Author Guidance](./author-guidance.md) +- [Reviewer Guidance](./reviewer-guidance.md) diff --git a/docs/code-reviews/process-guidance/author-guidance.md b/docs/code-reviews/process-guidance/author-guidance.md index 501ec4426c..35492f7720 100644 --- a/docs/code-reviews/process-guidance/author-guidance.md +++ b/docs/code-reviews/process-guidance/author-guidance.md @@ -1,17 +1,17 @@ # Author Guidance -## Properly describe your pull request (PR) +## Properly Describe Your Pull Request (PR) - Give the PR a descriptive title, so that other members can easily (in one short sentence) understand what a PR is about. - Every PR should have a proper description, that shows the reviewer what has been changed and why. -## Add relevant reviewers +## Add Relevant Reviewers - Add one or more reviewers (depending on your project's guidelines) to the PR.
Ideally, you would add at least someone who has expertise and is familiar with the project, or the language used - Adding someone less familiar with the project or the language can aid in verifying the changes are understandable, easy to read, and increases the expertise within the team - In ISE code-with projects with a customer team, it is important to include reviewers from both organizations for knowledge transfer - [Customize Reviewers Policy](../tools.md#reviewer-policies) -## Be open to receive feedback +## Be Open to Receive Feedback Discuss design/code logic and address all comments as follows: @@ -22,14 +22,14 @@ Discuss design/code logic and address all comments as follows: - If you don't understand a comment, ask questions in the review itself as opposed to a private chat - If a thread gets bloated without a conclusion, have a meeting with the reviewer (call them or knock on door) -## Use checklists +## Use Checklists When creating a PR, it is a good idea to add a checklist of objectives of the PR in the description. This helps the reviewers to focus on the key areas of the code changes. -## Link a task to your PR +## Link a Task to Your PR Link the corresponding work items/tasks to the PR. There is no need to duplicate information between the work item and the PR, but if some details are missing in either one, together they provide more context to the reviewer. -## Code should have annotations before the review +## Code Should Have Annotations Before the Review If you can't avoid large PRs, include explanations of the changes in order to make it easier for the reviewer to review the code, with clear comments the reviewer can identify the goal of every code block. 
diff --git a/docs/code-reviews/process-guidance/reviewer-guidance.md b/docs/code-reviews/process-guidance/reviewer-guidance.md index 885df90f05..9c58e7dbda 100644 --- a/docs/code-reviews/process-guidance/reviewer-guidance.md +++ b/docs/code-reviews/process-guidance/reviewer-guidance.md @@ -9,9 +9,9 @@ Since parts of reviews can be automated via linters and such, human reviewers ca Code reviews should use the below guidance and checklists to ensure positive and effective code reviews. -## General guidance +## General Guidance -### Understand the code you are reviewing +### Understand the Code You are Reviewing - Read every line changed. - If we have a stakeholder review, it’s not necessary to run the PR unless it aids your understanding of the code. @@ -19,17 +19,17 @@ Code reviews should use the below guidance and checklists to ensure positive and - If you don’t fully understand a change in a file because you don’t have context, click to view the whole file and read through the surrounding code or checkout the changes and view them in IDE. - Ask the author to clarify. -### Take your time and keep focus on scope +### Take Your Time and Keep Focus on Scope You shouldn't review code hastily but neither take too long in one sitting. If you have many pull requests (PRs) to review or if the complexity of code is demanding, the recommendation is to take a break between the reviews to recover and focus on the ones you are most experienced with. Always remember that a goal of a code review is to verify that the goals of the corresponding task have been achieved. If you have concerns about the related, adjacent code that isn't in the scope of the PR, address those as separate tasks (e.g., bugs, technical debt). Don't block the current PR due to issues that are out of scope. 
-## Foster a positive code review culture +## Foster a Positive Code Review Culture Code reviews play a critical role in product quality and it should not represent an arena for long discussions or even worse a battle of egos. What matters is a bug caught, not who made it, not who found it, not who fixed it. The only thing that matters is having the best possible product. -## Be considerate +## Be Considerate - Be positive – encouraging, appreciation for good practices. - Prefix a “point of polish” with “Nit:”. @@ -68,7 +68,7 @@ Code reviews play a critical role in product quality and it should not represent - Does the code add functionality that isn’t needed? - Can the code be understood easily by code readers? -### Naming/readability +### Naming/Readability - Did the developer pick good names for functions, variables, etc? diff --git a/docs/code-reviews/pull-request-template.md b/docs/code-reviews/pull-request-template.md index b32e108ef8..6cd77a5687 100644 --- a/docs/code-reviews/pull-request-template.md +++ b/docs/code-reviews/pull-request-template.md @@ -1,4 +1,4 @@ -# Pull Request template +# Pull Request Template ```markdown # [Work Item ID](./link-to-the-work-item) @@ -31,7 +31,7 @@ For more information about how to contribute to this repo, visit this [page](htt - [ ] I ran the lint checks which produced no new errors nor warnings for my changes. - [ ] I have checked to ensure there aren't other open Pull Requests for the same update/change. -## Does this introduce a breaking change? +## Does This Introduce a Breaking Change? --- @@ -49,7 +49,7 @@ For more information about how to contribute to this repo, visit this [page](htt > - Which test sets were used. > - Description of test scenarios that you have tried. 
-## Any relevant logs or outputs +## Any Relevant Logs or Outputs --- @@ -58,7 +58,7 @@ For more information about how to contribute to this repo, visit this [page](htt > - When you want to share long logs upload to: > `(StorageAccount)/pr-support/attachments/(PR Number)/(yourFiles) using [Azure Storage Explorer](https://azure.microsoft.com/en-us/features/storage-explorer/)` or [portal.azure.com](https://portal.azure.com) and insert the link here. -## Other information or known dependencies +## Other Information or Known Dependencies --- diff --git a/docs/code-reviews/recipes/README.md b/docs/code-reviews/recipes/README.md deleted file mode 100644 index df34ec7c04..0000000000 --- a/docs/code-reviews/recipes/README.md +++ /dev/null @@ -1,11 +0,0 @@ -# Language Specific Guidance - -- [Bash](bash.md) -- [C#](csharp.md) -- [Go](go.md) -- [Java](java.md) -- [JavaScript and TypeScript](./javascript-and-typescript.md) -- [Markdown](markdown.md) -- [Python](python.md) -- [Terraform](terraform.md) -- [YAML (Azure Pipelines)](./azure-pipelines-yaml.md) diff --git a/docs/code-reviews/recipes/azure-pipelines-yaml.md b/docs/code-reviews/recipes/azure-pipelines-yaml.md index d89bc94daa..7958b99b6f 100644 --- a/docs/code-reviews/recipes/azure-pipelines-yaml.md +++ b/docs/code-reviews/recipes/azure-pipelines-yaml.md @@ -25,7 +25,7 @@ These documents may be useful when reviewing YAML files: - [Key concepts for new Azure Pipelines](https://learn.microsoft.com/en-us/azure/devops/pipelines/get-started/key-pipelines-concepts?view=azure-devops) **Key concepts overview** -![Azure Pipelines key concepts](images/key-concepts-overview.png) +![Azure Pipelines key concepts](./images/key-concepts-overview.png) - A trigger tells a Pipeline to run. - A pipeline is made up of one or more stages. A pipeline can deploy to one or more environments. 
diff --git a/docs/code-reviews/recipes/bash.md b/docs/code-reviews/recipes/bash.md
index f5e6f8a446..f4387c7d33 100644
--- a/docs/code-reviews/recipes/bash.md
+++ b/docs/code-reviews/recipes/bash.md
@@ -6,7 +6,7 @@ Developers should follow [Google's Bash Style Guide](https://google.github.io/st

 ## Code Analysis / Linting

-Projects must check bash code with [shellcheck](https://github.com/koalaman/shellcheck) as part of the [CI process](../../continuous-integration/README.md).
+Projects must check bash code with [shellcheck](https://github.com/koalaman/shellcheck) as part of the [CI process](../../CI-CD/continuous-integration.md).
 Apart from linting, [shfmt](https://github.com/mvdan/sh) can be used to automatically format shell scripts. There are few vscode code extensions which are based on shfmt like shell-format which can be used to automatically format shell scripts.

 ## Project Setup
@@ -15,7 +15,7 @@ Apart from linting, [shfmt](https://github.com/mvdan/sh) can be used to automati

 Shellcheck extension should be used in VS Code, it provides static code analysis capabilities and auto fixing linting issues. To use vscode-shellcheck in vscode do the following:

-#### Install shellcheck on your machine
+#### Install shellcheck on Your Machine

 For macOS

@@ -29,7 +29,7 @@ For Ubuntu:

 apt-get install shellcheck
 ```

-#### Install shellcheck on vscode
+#### Install shellcheck on VSCode

 Find the vscode-shellcheck extension in vscode and install it.
@@ -40,13 +40,15 @@ Find the vscode-shellcheck extension in vscode and install it.

 shell-format extension does automatic formatting of your bash scripts, docker files and several configuration files. It is dependent on shfmt which can enforce google style guide checks for bash. To use shell-format in vscode do the following:

-#### Install shfmt(Requires Go 1.13 or later) on your machine
+#### Install shfmt on Your Machine
+
+Requires Go 1.13 or Later

 ```bash
 GO111MODULE=on go get mvdan.cc/sh/v3/cmd/shfmt
 ```

-#### Install shell-format on vscode
+#### Install shell-format on VSCode

 Find the shell-format extension in vscode and install it.
@@ -140,3 +142,4 @@ In addition to the [Code Review Checklist](../process-guidance/reviewer-guidance
 * [ ] Does the code pass all linting checks as per shellcheck and unit tests as per shunit2 ?
 * [ ] Does the code uses relative paths or absolute paths? Relative paths should be avoided as they are prone to environment attacks. If relative path is needed, check that the `PATH` variable is set.
 * [ ] Does the code take credentials as user input? Are the credentials masked or encrypted in the script?
+S
\ No newline at end of file
diff --git a/docs/code-reviews/recipes/csharp.md b/docs/code-reviews/recipes/csharp.md
index 24d28d82c9..ec2e8bb7e1 100644
--- a/docs/code-reviews/recipes/csharp.md
+++ b/docs/code-reviews/recipes/csharp.md
@@ -55,7 +55,7 @@ Microsoft's .NET analyzers has code quality rules and .NET API usage rules imple

 If you are currently using the legacy FxCop analyzers, [migrate from FxCop analyzers to .NET analyzers](https://learn.microsoft.com/en-us/visualstudio/code-quality/migrate-from-fxcop-analyzers-to-net-analyzers?view=vs-2019).

-### StyleCop analyzer
+### StyleCop Analyzer

 The StyleCop analyzer is a nuget package (StyleCop.Analyzers) that can be installed in any of your projects. It's mainly around code style rules and makes sure the team is following the same rules without having subjective discussions about braces and spaces. Detailed information can be found here: [StyleCop Analyzers for the .NET Compiler Platform](https://github.com/DotNetAnalyzers/StyleCopAnalyzers).
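The bash.md recipe above requires shellcheck in CI and suggests shfmt for formatting. As one hedged illustration of how the two tools might be combined in a CI helper — the script layout and function name are invented for this sketch, only the tool names and flags (`shfmt -d`) are real — a lint step could look like:

```shell
#!/usr/bin/env bash
# Illustrative CI helper: lint each given shell script with shellcheck and
# check its formatting with shfmt. Tools are skipped if not installed, so
# the sketch degrades gracefully on machines without them.
set -euo pipefail

lint_shell_scripts() {
  local status=0
  for f in "$@"; do
    if command -v shellcheck >/dev/null 2>&1; then
      shellcheck "$f" || status=1
    fi
    if command -v shfmt >/dev/null 2>&1; then
      # -d prints a diff and exits non-zero when the file is not formatted
      shfmt -d "$f" >/dev/null || status=1
    fi
  done
  return "$status"
}

lint_shell_scripts "$@"
```

In a pipeline this would typically run over `git ls-files '*.sh'` so every tracked script is checked on each build.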
@@ -65,7 +65,7 @@ The minimum rules set teams should adopt is the [Managed Recommended Rules](http

 Use .editorconfig to configure code formatting rules in your project.

-## Build validation
+## Build Validation

 It's important that you enforce your code style and rules in the CI to avoid any team member merging code that does not comply with your standards into your git repo.
@@ -79,7 +79,7 @@ If you are using FxCop analyzers and StyleCop analyzer, it's very simple to enab
     projects: '**/*.csproj'
 ```

-## Enable Roslyn Support in Visual Studio Code
+## Enable Roslyn Support in VSCode

 The above steps also work in VS Code provided you enable Roslyn support for Omnisharp. The setting is `omnisharp.enableRoslynAnalyzers` and must be set to `true`. After enabling this setting you must "Restart Omnisharp" (this can be done from the Command Palette in VS Code or by restarting VS Code).
diff --git a/docs/code-reviews/recipes/go.md b/docs/code-reviews/recipes/go.md
index e13f85e6c9..c3149f1f18 100644
--- a/docs/code-reviews/recipes/go.md
+++ b/docs/code-reviews/recipes/go.md
@@ -10,7 +10,7 @@ Developers should follow the [Effective Go](https://golang.org/doc/effective_go.

 Below is the project setup that you would like to have in your VS Code.

-#### vscode-go extension
+#### VSCode go Extension

 Using the Go extension for Visual Studio Code, you get language features like IntelliSense, code navigation, symbol search, bracket matching, snippets, etc. This extension includes rich language support for go in VS Code.

@@ -20,7 +20,7 @@

 #### golint

-**:exclamation: NOTICE: The golint library is deprecated and archived.**
+> **Note:** The golint library is deprecated and archived.
 The linter revive (below) might be a suitable replacement.
diff --git a/docs/code-reviews/recipes/javascript-and-typescript.md b/docs/code-reviews/recipes/javascript-and-typescript.md
index 50c566797b..a9d62c40ca 100644
--- a/docs/code-reviews/recipes/javascript-and-typescript.md
+++ b/docs/code-reviews/recipes/javascript-and-typescript.md
@@ -89,7 +89,7 @@ module.exports = {

 This will apply the `prettier` rule set when linting with ESLint.

-## Auto formatting with VS Code
+## Auto Formatting with VSCode

 VS Code can be configured to automatically perform `eslint --fix` on save.

@@ -137,7 +137,7 @@ To automate this process in Azure Devops you can add the following snippet to yo
     workingDir: './scripts/'
 ```

-## Pre-commit hooks
+## Pre-Commit Hooks

 All developers should run `eslint` in a pre-commit hook to ensure standard formatting. We highly recommend using an editor integration like [vscode-eslint](https://github.com/Microsoft/vscode-eslint) to provide realtime feedback.
diff --git a/docs/code-reviews/recipes/markdown.md b/docs/code-reviews/recipes/markdown.md
index c5166444e4..d586ae3c2b 100644
--- a/docs/code-reviews/recipes/markdown.md
+++ b/docs/code-reviews/recipes/markdown.md
@@ -68,13 +68,13 @@

 npx write-good *.md
 ```

 Write Good is also available as an [extension for VS Code](https://marketplace.visualstudio.com/items?itemName=travisthetechie.write-good-linter)

-## VS Code Extensions
+## VSCode Extensions

 ### Write Good Linter

 The [`Write Good Linter Extension`](https://marketplace.visualstudio.com/items?itemName=travisthetechie.write-good-linter) integrates with VS Code to give grammar and language advice while editing the document.

-### markdownlint extension
+### markdownlint Extension

 The [`markdownlint extension`](https://marketplace.visualstudio.com/items?itemName=DavidAnson.vscode-markdownlint) examines the Markdown documents, showing warnings for rule violations while editing.
@@ -166,7 +166,7 @@ Save your guidelines together with your documentation, so they are easy to refer

 - Avoid using symbols and special characters in headers, this causes problems with anchor links
 - Avoid links in headers

-### Links
+### Resources

 - Avoid duplication of content, instead link to the `single source of truth`
 - Link but don't summarize. Summarizing content on another page leads to the content living in two places
@@ -186,7 +186,7 @@ Save your guidelines together with your documentation, so they are easy to refer

 - Name images appropriately, avoiding generic names like `screenshot.png`
 - Avoid adding large images or videos to source control, link to an external location instead

-### Emphasis and special sections
+### Emphasis and Special Sections

 - Use **bold** or _italic_ to emphasize
 > For sections that everyone reading this document needs to be aware of, use blocks
diff --git a/docs/code-reviews/recipes/python.md b/docs/code-reviews/recipes/python.md
index f37d2c1d83..55e9fc29c6 100644
--- a/docs/code-reviews/recipes/python.md
+++ b/docs/code-reviews/recipes/python.md
@@ -72,7 +72,7 @@ Format python code

 black [file/folder]
 ```

-### Autopep8
+### autopep8

 [`Autopep8`](https://github.com/hhatto/autopep8) is more lenient and allows more configuration if you want less stringent formatting.

@@ -102,15 +102,15 @@ yapf [file/folder] --in-place

 ### Bandit

-[Bandit](https://github.com/PyCQA/bandit) is a tool designed by the Python Code Quality Authority (PyCQA) to perform static analysis of Python code, specifically targeting security issues.
+[Bandit](https://github.com/PyCQA/bandit) is a tool designed by the Python Code Quality Authority (PyCQA) to perform static analysis of Python code, specifically targeting security issues. It scans for common security issues in Python codebase.
-
+
 - **Installation**: Add Bandit to your development environment with:

 ```bash
 pip install bandit
 ```

-## VS Code Extensions
+## VSCode Extensions

 ### Python
@@ -125,7 +125,7 @@ def add(first_value: int, second_value: int) -> int:
     return first_value + second_value
 ```

-## Build validation
+## Build Validation

 To automate linting with `flake8` and testing with `pytest` in Azure Devops you can add the following snippet to you `azure-pipelines.yaml` file.
@@ -185,7 +185,7 @@ jobs:

 To perform a PR validation on GitHub you can use a similar YAML configuration with [GitHub Actions](https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions)

-## Pre-commit hooks
+## Pre-Commit Hooks

 Pre-commit hooks allow you to format and lint code locally before submitting the pull request.

@@ -193,7 +193,7 @@ Adding pre-commit hooks for your python repository is easy using the pre-commit

 1. Install pre-commit and add to the requirements.txt

-   ```bash
+   ```sh
    pip install pre-commit
    ```

@@ -214,7 +214,7 @@ Adding pre-commit hooks for your python repository is easy using the pre-commit

 3. Each individual developer that wants to set up pre-commit hooks can then run

-   ```bash
+   ```sh
    pre-commit install
    ```

diff --git a/docs/code-reviews/recipes/terraform.md b/docs/code-reviews/recipes/terraform.md
index 1b83011253..52cc85f6ef 100644
--- a/docs/code-reviews/recipes/terraform.md
+++ b/docs/code-reviews/recipes/terraform.md
@@ -12,7 +12,7 @@ Projects should check Terraform scripts with automated tools.

 [`TFLint`](https://github.com/terraform-linters/tflint) is a Terraform linter focused on possible errors, best practices, etc. Once TFLint installed in the environment, it can be invoked using the VS Code [`terraform extension`](https://marketplace.visualstudio.com/items?itemName=mauve.terraform).

-## VS Code Extensions
+## VSCode Extensions

 The following VS Code extensions are widely used.
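The python.md recipe above installs pre-commit and then runs `pre-commit install`, but the `.pre-commit-config.yaml` it reads is not shown in this diff. As one hedged illustration — the hook ids (`black`, `flake8`) are real pre-commit hooks, while the `rev` values are placeholders, not pins from this repository — a minimal configuration might be:

```yaml
# Illustrative .pre-commit-config.yaml sketch; pin `rev` to real tags.
repos:
  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
  - repo: https://github.com/pycqa/flake8
    rev: 7.0.0
    hooks:
      - id: flake8
```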
@@ -80,7 +80,7 @@ In addition to the [Code Review Checklist](../process-guidance/reviewer-guidance

 * [ ] The repository contains a `README.md` describing the architecture provisioned?
 * [ ] If Terraform code is mixed with application source code, the Terraform code isolated into a dedicated folder?

-### Terraform state
+### Terraform State

 * [ ] The Terraform project configured using Azure Storage as remote state backend?
 * [ ] The remote state backend storage account key stored a secure location (e.g. Azure Key Vault)?
@@ -97,7 +97,7 @@ In addition to the [Code Review Checklist](../process-guidance/reviewer-guidance

 * [ ] Unit and integration tests covering the Terraform code exist (e.g. [`Terratest`](https://terratest.gruntwork.io/), [`terratest-abstraction`](https://github.com/microsoft/terratest-abstraction))?

-### Naming and code structure
+### Naming and Code Structure

 * [ ] Resource definitions and data sources are used correctly in the Terraform scripts?
   * **resource:** Indicates to Terraform that the current configuration is in charge of managing the life cycle of the object
@@ -107,7 +107,7 @@ In addition to the [Code Review Checklist](../process-guidance/reviewer-guidance

 * [ ] Explicit type conversion functions used to normalize types are only returned in module outputs? Explicit type conversions are rarely necessary in Terraform because it will convert types automatically where required.
 * [ ] The `Sensitive` property on schema set to `true` for the fields that contains sensitive information? This will prevent the field's values from showing up in CLI output.

-### General recommendations
+### General Recommendations

 * Try avoiding nesting sub configuration within resources. Create a separate resource section for resources even though they can be declared as sub-element of a resource. For example, declaring subnets within virtual network vs declaring subnets as a separate resources compared to virtual network on Azure.
 * Never hard-code any value in configuration. Declare them in `locals` section if a variable is needed multiple times as a static value and are internal to the configuration.
diff --git a/docs/code-reviews/tools.md b/docs/code-reviews/tools.md
index 3b9c6e7491..4f029daee3 100644
--- a/docs/code-reviews/tools.md
+++ b/docs/code-reviews/tools.md
@@ -2,12 +2,12 @@

 ## Customize ADO

-### Task boards
+### Task Boards

 - AzDO: [Customize cards](https://learn.microsoft.com/en-us/azure/devops/boards/boards/customize-cards?view=azure-devops)
 - AzDO: [Add columns on task board](https://learn.microsoft.com/en-us/azure/devops/boards/sprints/customize-taskboard?view=azure-devops#add-columns)

-### Reviewer policies
+### Reviewer Policies

 - Setting required reviewer group in AzDO - [Automatically include code reviewers](https://learn.microsoft.com/en-us/azure/devops/repos/git/branch-policies?view=azure-devops#automatically-include-code-reviewers)
@@ -19,7 +19,7 @@
    2. [Approval count policy](https://learn.microsoft.com/en-us/rest/api/azure/devops/policy/configurations/create?view=azure-devops-rest-5.1#approval-count-policy)
 1. GitHub: [Configuring protected branches](https://help.github.com/en/github/administering-a-repository/about-protected-branches)

-## Visual Studio Code
+## VSCode

 ### GitHub: [GitHub Pull Requests](https://marketplace.visualstudio.com/items?itemName=GitHub.vscode-pull-request-github)
diff --git a/docs/continuous-delivery/.pages b/docs/continuous-delivery/.pages
deleted file mode 100644
index 5233920c6f..0000000000
--- a/docs/continuous-delivery/.pages
+++ /dev/null
@@ -1,5 +0,0 @@
-nav:
-  - Azure DevOps: azure-devops
-  - DevOps provider recipes: devops-provider-recipes
-  - GitOps: gitops
-  - ...
\ No newline at end of file
diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/examples/commit-example.yaml b/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/examples/commit-example.yaml
deleted file mode 100644
index 5f84782ec8..0000000000
--- a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/examples/commit-example.yaml
+++ /dev/null
@@ -1,28 +0,0 @@
----
-on:
-  push:
-    branches:
-      - master
-
-jobs:
-  Echo-On-Commit:
-    runs-on: ubuntu-latest
-    steps:
-      - name: "Checkout Repository"
-        uses: actions/checkout@v3
-
-      - name: "Set flag from Commit"
-        env:
-          COMMIT_VAR: ${{ contains(github.event.head_commit.message, '[commit var]') }}
-        run: |
-          if ${COMMIT_VAR} == true; then
-            echo "flag=true" >> $GITHUB_ENV
-            echo "flag set to true"
-          else
-            echo "flag=false" >> $GITHUB_ENV
-            echo "flag set to false"
-          fi
-
-      - name: "Use flag if true"
-        if: env.flag
-        run: echo "Flag is available and true"
diff --git a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/examples/pr-example.yaml b/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/examples/pr-example.yaml
deleted file mode 100644
index f639d06af7..0000000000
--- a/docs/continuous-delivery/devops-provider-recipes/github-actions/runtime-variables/examples/pr-example.yaml
+++ /dev/null
@@ -1,27 +0,0 @@
-on:
-  pull_request:
-    branches:
-      - master
-
-jobs:
-  Echo-On-PR:
-    runs-on: ubuntu-latest
-    steps:
-      - name: "Checkout Repository"
-        uses: actions/checkout@v3
-
-      - name: "Set flag from PR"
-        env:
-          PR_VAR: ${{ contains(github.event.pull_request.body, '[pr var]') }}
-        run: |
-          if ${PR_VAR} == true; then
-            echo "flag=true" >> $GITHUB_ENV
-            echo "flag set to true"
-          else
-            echo "flag=false" >> $GITHUB_ENV
-            echo "flag set to false"
-          fi
-
-      - name: "Use flag if true"
-        if: env.flag
-        run: echo "Flag is available and true"
diff --git a/docs/continuous-delivery/recipes/README.md b/docs/continuous-delivery/recipes/README.md
deleted file mode 100644
index 88fa469901..0000000000
--- a/docs/continuous-delivery/recipes/README.md
+++ /dev/null
@@ -1,11 +0,0 @@
-# Recipes
-
-## Github
-
-- [Github workflows](./github-workflows/README.md)
-
-## Terraform
-
-- [Save output to variable group](./terraform/save-output-to-variable-group.md)
-- [Share common variables naming conventions](./terraform/share-common-variables-naming-conventions.md)
-- [Terraform structure guidelines](./terraform/terraform-structure-guidelines.md)
\ No newline at end of file
diff --git a/docs/continuous-delivery/recipes/terraform/README.md b/docs/continuous-delivery/recipes/terraform/README.md
deleted file mode 100644
index d163c83702..0000000000
--- a/docs/continuous-delivery/recipes/terraform/README.md
+++ /dev/null
@@ -1,5 +0,0 @@
-# Terraform recipes
-
-- [Save output to variable group](./save-output-to-variable-group.md)
-- [Share common variables naming conventions](./share-common-variables-naming-conventions.md)
-- [Terraform structure guidelines](./terraform-structure-guidelines.md)
\ No newline at end of file
diff --git a/docs/continuous-integration/.pages b/docs/continuous-integration/.pages
deleted file mode 100644
index 73e6208844..0000000000
--- a/docs/continuous-integration/.pages
+++ /dev/null
@@ -1,5 +0,0 @@
-nav:
-  - CI in data science: ci-in-data-science
-  - DevSecOps: dev-sec-ops
-  - Dev conainters: devcontainers
-  - markdown-linting
\ No newline at end of file
diff --git a/docs/continuous-integration/dev-sec-ops/README.md b/docs/continuous-integration/dev-sec-ops/README.md
deleted file mode 100644
index 57fc3b0508..0000000000
--- a/docs/continuous-integration/dev-sec-ops/README.md
+++ /dev/null
@@ -1,21 +0,0 @@
-# DevSecOps
-
-## The concept of DevSecOps
-
-DevSecOps or DevOps security is about introducing security earlier in the life cycle of application development (a.k.a shift-left), thus minimizing the impact of vulnerabilities and bringing security closer to development team.
-
-## Why
-
-By embracing shift-left mentality, DevSecOps encourages organizations to bridge the gap that often exists between development and security teams to the point where many of the security processes are automated and are effectively handled by the development team.
-
-## DevSecOps Practices
-
-This section covers different tools, frameworks and resources allowing introduction of DevSecOps best practices to your project at early stages of development.
-Topics covered:
-
-1. [Credential Scanning](./secret-management/credential_scanning.md) - automatically inspecting a project to ensure that no secrets are included in the project's source code.
-1. [Secrets Rotation](./secret-management/secrets_rotation.md) - automated process by which the secret, used by the application, is refreshed and replaced by a new secret.
-1. [Static Code Analysis](./secret-management/static-code-analysis.md) - analyze source code or compiled versions of code to help find security flaws.
-1. [Penetration Testing](./penetration-testing/README.md) - a simulated attack against your application to check for exploitable vulnerabilities.
-1. [Container Dependencies Scanning](./dependency-container-scanning/README.md) - search for vulnerabilities in container operating systems, language packages and application dependencies.
-1. [Evaluation of Open Source Libraries](./evaluate-oss/README.md) - make it harder to apply open source supply chain attacks by evaluating the libraries you use.
diff --git a/docs/continuous-integration/dev-sec-ops/azure-devops/README.md b/docs/continuous-integration/dev-sec-ops/azure-devops/README.md
deleted file mode 100644
index a12e35e689..0000000000
--- a/docs/continuous-integration/dev-sec-ops/azure-devops/README.md
+++ /dev/null
@@ -1,8 +0,0 @@
-# Azure DevOps
-
-Write something about Azure DevOps here.
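The deleted `commit-example.yaml` and `pr-example.yaml` workflows above branch on `if ${COMMIT_VAR} == true; then`, which makes the shell run the variable's value as a command (with `== true` as its arguments). It only works by accident because the `true` and `false` commands ignore arguments; an empty or unexpected value would fail with "command not found". A sketch of the more robust string comparison (illustrative, not from the playbook):

```shell
#!/usr/bin/env bash
# `if ${COMMIT_VAR} == true; then` executes the value as a command; it
# happens to behave for "true"/"false" but breaks on any other value.
# Comparing the value as a string with [ ... ] behaves predictably:
set -euo pipefail

flag_from_var() {
  local commit_var="$1"
  if [ "${commit_var}" = "true" ]; then
    echo "flag=true"
  else
    echo "flag=false"
  fi
}

flag_from_var "true"   # prints flag=true
flag_from_var "false"  # prints flag=false
```

Inside a workflow step, the function body would write to `$GITHUB_ENV` exactly as the deleted examples did; only the conditional changes.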
-
-
-## Table of Contents
-
-- [Service connection security](./service-connection-security.md)
diff --git a/docs/continuous-integration/dev-sec-ops/secret-management/README.md b/docs/continuous-integration/dev-sec-ops/secret-management/README.md
deleted file mode 100644
index 85acf5b97c..0000000000
--- a/docs/continuous-integration/dev-sec-ops/secret-management/README.md
+++ /dev/null
@@ -1,27 +0,0 @@
-# Secret management
-
-Secret Management refers to the tools and practices used to manage digital authentication credentials (like API keys, tokens, passwords, and certificates). These secrets are used to protect access to sensitive data and services, making their management critical for security.
-
-## Importance of Secret Management
-
-In modern software development, applications often need to interact with other software components, APIs, and services. These interactions often require authentication, which is typically handled using secrets. If these secrets are not managed properly, they can be exposed, leading to potential security breaches.
-
-## Best Practices for Secret Management
-
-1. **Centralized Secret Storage:** Store all secrets in a centralized, encrypted location. This reduces the risk of secrets being lost or exposed.
-
-2. **Access Control:** Implement strict access control policies. Only authorized entities should have access to secrets.
-
-3. **Rotation of Secrets:** Regularly change secrets to reduce the risk if a secret is compromised.
-
-4. **Audit Trails:** Keep a record of when and who accessed which secret. This can help in identifying suspicious activities.
-
-5. **Automated Secret Management:** Automate the processes of secret creation, rotation, and deletion. This reduces the risk of human error.
-
-Remember, the goal of secret management is to protect sensitive information from unauthorized access and potential security threats.
-
-## Pages
-
-- [Credential Scanning](./credential_scanning.md)
-- [Secrets Rotation](./secrets_rotation.md)
-- [Static code analysis](./static-code-analysis.md)
diff --git a/docs/continuous-integration/dev-sec-ops/secret-management/recipes/README.md b/docs/continuous-integration/dev-sec-ops/secret-management/recipes/README.md
deleted file mode 100644
index 3125ae151d..0000000000
--- a/docs/continuous-integration/dev-sec-ops/secret-management/recipes/README.md
+++ /dev/null
@@ -1,4 +0,0 @@
-# Recipes
-
-- [Detect secrets](./detect-secrets.md)
-- [Detect secrets on Azure DevOps](./detect-secrets-ado.md)
\ No newline at end of file
diff --git a/docs/continuous-integration/devcontainers/README.md b/docs/continuous-integration/devcontainers/README.md
deleted file mode 100644
index dbdea5e97a..0000000000
--- a/docs/continuous-integration/devcontainers/README.md
+++ /dev/null
@@ -1,74 +0,0 @@
-# Reusing dev containers within a pipeline
-
-Given a repository with a local development container aka [dev container](../devcontainers/README.md) that contains all the tooling required for development, would it make sense to reuse that container for running the tooling in the Continuous Integration pipelines?
-
-## Options for building devcontainers within pipeline
-
-There are three ways to build devcontainers within pipeline:
-
-- With [GitHub - devcontainers/ci](https://github.com/devcontainers/ci) builds the container with the `devcontainer.json`. Example here: [devcontainers/ci · Getting Started](https://github.com/devcontainers/ci/blob/main/docs/github-action.md#getting-started).
-- With [GitHub - devcontainers/cli](https://github.com/devcontainers/cli), which is the same as the above, but using the underlying CLI directly without tasks.
-- Building the `DockerFile` with `docker build`. This option excludes all configuration/features specified within the `devcontainer.json`.
-
-## Considered Options
-
-- Run CI pipelines in native environment
-- Run CI pipelines in the dev container via building image locally
-- Run CI pipelines in the dev container with a container registry
-
-Here are below pros and cons for both approaches:
-
-### Run CI pipelines in native environment
-
-| Pros | Cons |
-|------|------|
-| Can use any pipeline tasks available | Need to keep two sets of tooling and their versions in sync |
-| No container registry | Can take some time to start, based on tools/dependencies required |
-| Agent will always be up to date with security patches | The dev container should always be built within each run of the CI pipeline, to verify the changes within the branch haven't broken anything |
-
-### Run CI pipelines in the dev container without image caching
-
-| Pros | Cons |
-|------|------|
-| Utilities scripts will work out of the box | Need to rebuild the container for each run, given that there may be changes within the branch being built |
-| Rules used (for linting or unit tests) will be the same on the CI | Not everything in the container is needed for the CI pipeline¹ |
-| No surprise for the developers, local outputs (of linting for instance) will be the same in the CI | Some pipeline tasks will not be available |
-| All tooling and their versions defined in a single place | Building the image for each pipeline run is slow² |
-| Tools/dependencies are already present | |
-| The dev container is being tested to include all new tooling in addition to not being broken | |
-
-> ¹: container size can be reduces by exporting the layer that contains only the tooling needed for the CI pipeline
->
-> ²: could be mitigated via adding image caching without using a container registry
-
-### Run CI pipelines in the dev container with image registry
-
-| Pros | Cons |
-|------|------|
-| Utilities scripts will work out of the box | Need to rebuild the container for each run, given that there may be changes within the branch being built |
-| No surprise for the developers, local outputs (of linting for instance) will be the same in the CI | Not everything in the container is needed for the CI pipeline¹ |
-| Rules used (for linting or unit tests) will be the same on the CI | Some pipeline tasks will not be available ² |
-| All tooling and their versions defined in a single place | Require access to a container registry to host the container within the pipeline³ |
-| Tools/dependencies are already present | |
-| The dev container is being tested to include all new tooling in addition to not being broken | |
-| Publishing the container built from `devcontainer.json` allows you to reference it in the cacheFrom in `devcontainer.json` (see [docs](https://containers.dev/implementors/json_reference/#image-specific)). By doing this, VS Code will use the published image as a layer cache when building | |
-
-> ¹: container size can be reduces by exporting the layer that contains only the tooling needed for the CI pipeline. This would require building the image without tasks
->
-> ²: using container jobs in AzDO you can use all tasks (as far as I can tell). Reference: [Dockerizing DevOps V2 - AzDO container jobs - DEV Community](https://dev.to/eliises/dockerizing-devops-v2-azdo-container-jobs-3hbf)
->
-> ³: within GH actions, the default Github Actions token can be used for accessing GHCR without setting up separate registry, see the example below.
-> **NOTE:** This does not build the `Dockerfile` together with the `devcontainer.json`
-
-```yaml
-    - uses: whoan/docker-build-with-cache-action@v5
-        id: cache
-        with:
-          username: $GITHUB_ACTOR
-          password: "${{ secrets.GITHUB_TOKEN }}"
-          registry: docker.pkg.github.com
-          image_name: devcontainer
-          dockerfile: .devcontainer/Dockerfile
-```
-
-
diff --git a/docs/design/design-patterns/README.md b/docs/design/design-patterns/README.md
index 3eedb58920..11ea801470 100644
--- a/docs/design/design-patterns/README.md
+++ b/docs/design/design-patterns/README.md
@@ -3,13 +3,3 @@

 The design patterns section recommends patterns of software and architecture design. This section provides a curated list of commonly used patterns from trusted sources. Rather than duplicate or replace the cited sources, this section aims to compliment them with suggestions, guidance, and learnings based on firsthand experiences.
-
-## Subsections
-
-* [Data Heavy Design Guidance](data-heavy-design-guidance.md)
-* [Object Oriented Design Reference](object-oriented-design-reference.md)
-* [Distributed System Design Reference](distributed-system-design-reference.md)
-* [REST API Design Guidance](rest-api-design-guidance.md)
-* [Cloud Resource Design Guidance](cloud-resource-design-guidance.md)
-* [Network Architecture Guidance for Azure](network-architecture-guidance-for-azure.md)
-* [Network Architecture Guidance for Hybrid](network-architecture-guidance-for-hybrid.md)
diff --git a/docs/design/design-patterns/cloud-resource-design-guidance.md b/docs/design/design-patterns/cloud-resource-design-guidance.md
index b508c5169c..a52609210c 100644
--- a/docs/design/design-patterns/cloud-resource-design-guidance.md
+++ b/docs/design/design-patterns/cloud-resource-design-guidance.md
@@ -2,11 +2,11 @@

 As cloud usage scales, considerations for subscription design, management groups, and resource naming/tagging conventions have an impact on governance, operations management, and adoption patterns.

-> **NOTE:** Always work with the relevant stakeholders to ensure that introducing new patterns provides the intended value.
+> **Note:** Always work with the relevant stakeholders to ensure that introducing new patterns provides the intended value.
 >
 > When working in an existing cloud environment, it is important to understand any current patterns and how they are used before making a change to them.
-## References
+## Resources

 The following references can be used to understand the latest best practices in organizing cloud resources:
diff --git a/docs/design/design-patterns/data-heavy-design-guidance.md b/docs/design/design-patterns/data-heavy-design-guidance.md
index d2296e1fa6..aee3b7bbc4 100644
--- a/docs/design/design-patterns/data-heavy-design-guidance.md
+++ b/docs/design/design-patterns/data-heavy-design-guidance.md
@@ -89,5 +89,5 @@ Monitor infrastructure, pipelines and data

 The [DataOps for the Modern Data Warehouse repo](https://github.com/Azure-Samples/modern-data-warehouse-dataops) contains both end-to-end and technology specific samples on how to implement DataOps on Azure.

-![CI/CD](images/CI_CD_process.png?raw=true "CI/CD")
+![CI/CD](./images/CI_CD_process.png?raw=true "CI/CD")
 Image: CI/CD for Data pipelines on Azure - from DataOps for the Modern Data Warehouse repo
diff --git a/docs/design/design-patterns/network-architecture-guidance-for-azure.md b/docs/design/design-patterns/network-architecture-guidance-for-azure.md
index 65ea7c06b6..e5deb34f97 100644
--- a/docs/design/design-patterns/network-architecture-guidance-for-azure.md
+++ b/docs/design/design-patterns/network-architecture-guidance-for-azure.md
@@ -1,13 +1,13 @@
 # Network Architecture Guidance for Azure

 The following are some best practices when setting up and working with network resources in Azure Cloud environments.

-> **NOTE:** When working in an existing cloud environment, it is important to understand any current patterns, and how they are used, before making a change to them. You should also work with the relevant stakeholders to make sure that any new patterns you introduce provide enough value to make the change.
+> **Note:** When working in an existing cloud environment, it is important to understand any current patterns, and how they are used, before making a change to them. You should also work with the relevant stakeholders to make sure that any new patterns you introduce provide enough value to make the change.

-## Networking and VNet setup
+## Networking and VNet Setup

-### Hub-and-spoke Topology
+### Hub-and-Spoke Topology

-![image](images/spoke-spoke-routing.png)
+![image](./images/spoke-spoke-routing.png)

 A hub-and-spoke network topology is a common architecture pattern used in Azure for organizing and managing network resources. It is based on the concept of a central hub that connects to various spoke networks. This model is particularly useful for organizing resources, maintaining security, and simplifying network management.
diff --git a/docs/design/design-patterns/network-architecture-guidance-for-hybrid.md b/docs/design/design-patterns/network-architecture-guidance-for-hybrid.md
index 8a0a72d771..f77b78b9ec 100644
--- a/docs/design/design-patterns/network-architecture-guidance-for-hybrid.md
+++ b/docs/design/design-patterns/network-architecture-guidance-for-hybrid.md
@@ -2,19 +2,19 @@

 The following are best practices around how to design and configure resources, used for Hybrid and Multi-Cloud environments.

-> **NOTE:** When working in an existing hybrid environment, it is important to understand any current patterns, and how they are used before making any changes.
+> **Note:** When working in an existing hybrid environment, it is important to understand any current patterns, and how they are used before making any changes.

-## Hub-and-spoke Topology
+## Hub-and-Spoke Topology

 The hub-and-spoke topology doesn't change much when using cloud/hybrid if configured correctly, The main different is that the hub VNet is peering to the on-prem network via a ExpressRoute and that all traffic from Azure might exit via the ExpressRoute and the on-prem internet connection.
-The generalized best practices are in [Network Architecture Guidance for Azure#Hub and Spoke topology](network-architecture-guidance-for-azure.md#hub-and-spoke-topology) +The generalized best practices are in [Network Architecture Guidance for Azure#Hub and Spoke topology](./network-architecture-guidance-for-azure.md#hub-and-spoke-topology) ### IP Allocation When working with Hybrid deployment, take extra care when planning IP allocation as there is a much greater risk of overlapping network ranges. -The general best practices are available in the [Network Architecture Guidance for Azure#ip-allocation](network-architecture-guidance-for-azure.md#ip-allocation) +The general best practices are available in the [Network Architecture Guidance for Azure#ip-allocation](./network-architecture-guidance-for-azure.md#ip-allocation) Read more about this in [Azure Best Practices Plan for IP Addressing](https://learn.microsoft.com/azure/cloud-adoption-framework/ready/azure-best-practices/plan-for-ip-addressing) @@ -30,7 +30,7 @@ Monitoring: Use Azure Monitor and Network Performance Monitor (NPM) to monitor t ### DNS -General best practices are available in [Network Architecture Guidance for Azure#dns](network-architecture-guidance-for-azure.md#dns) +General best practices are available in [Network Architecture Guidance for Azure#dns](./network-architecture-guidance-for-azure.md#dns) When using Azure DNS in a hybrid or multi-cloud environment it is important to ensure a consistent DNS and forwarding configuration which ensures that records are automatically updated and that all DNS servers are aware of each other and know which server is the authoritative for the different records. @@ -38,4 +38,4 @@ Read more about [Hybrid/Multi-Cloud DNS infrastructure](https://learn.microsoft. ### Resource Allocation -For resource allocation the best practices from [Cloud Resource Design Guidance](cloud-resource-design-guidance.md) should be followed. 
+For resource allocation the best practices from [Cloud Resource Design Guidance](./cloud-resource-design-guidance.md) should be followed. diff --git a/docs/design/design-patterns/non-functional-requirements-capture-guide.md b/docs/design/design-patterns/non-functional-requirements-capture-guide.md index 21f9e455c5..55949f5403 100644 --- a/docs/design/design-patterns/non-functional-requirements-capture-guide.md +++ b/docs/design/design-patterns/non-functional-requirements-capture-guide.md @@ -36,7 +36,7 @@ To support the process of capturing a project's _comprehensive_ non-functional r | [Availability](../../non-functional-requirements/availability.md) | System's uptime and accessibility to users. | - Uptime: Uptime measures the percentage of time that a system is operational and available for use. It is typically expressed as a percentage of total time (e.g., 99.9% uptime means the system is available 99.9% of the time). Common thresholds for uptime include:
99% uptime: The system is available 99% of the time, allowing for approximately 3.65 days of downtime per year.
99.9% uptime (three nines): The system is available 99.9% of the time, allowing for approximately 8.76 hours of downtime per year.
99.99% uptime (four nines): The system is available 99.99% of the time, allowing for approximately 52.56 minutes of downtime per year.
99.999% uptime (five nines): The system is available 99.999% of the time, allowing for approximately 5.26 minutes of downtime per year. | | [Data Integrity](../../non-functional-requirements/data-integrity.md) | Accuracy and consistency of data throughout its lifecycle. | - Error Rate: The proportion of data entries that contain errors or inaccuracies. \(\text{Error Rate} = \left( \frac{\text{Number of Errors}}{\text{Total Number of Entries}} \right) \times 100\)
- Accuracy Rate: The percentage of data entries that are correct and match the source of truth. \(\text{Accuracy Rate} = \left( \frac{\text{Number of Accurate Entries}}{\text{Total Number of Entries}} \right) \times 100\)
- Duplicate Record Rate: The percentage of data entries that are duplicates. \(\text{Duplicate Record Rate} = \left( \frac{\text{Number of Duplicate Entries}}{\text{Total Number of Entries}} \right) \times 100\) | | [Disaster recovery and business continuity](../../non-functional-requirements/disaster-recovery.md) | Determine the system's requirements for disaster recovery and business continuity, including backup and recovery procedures and disaster recovery testing. | - Backup and Recovery: The application must have a Backup and Recovery plan in place that includes regular backups of all data and configurations, and a process for restoring data and functionality in the event of a disaster or disruption.
- Redundancy: The application must have redundancy built into its infrastructure, such as redundant servers, network devices, and power supplies, to ensure high availability and minimize downtime in the event of a failure.
- Failover and High Availability: The application must be designed to support failover and high availability, such as by using load balancers or failover clusters, to ensure that it can continue to operate in the event of a system failure or disruption.
- Disaster Recovery Plan: The application must have a comprehensive disaster recovery plan that includes procedures for restoring data and functionality in the event of a major disaster, such as a natural disaster, cyber attack, or other catastrophic event.
- Testing and Maintenance: The application must be regularly tested and maintained to ensure that it can withstand a disaster or disruption, and that all systems, processes, and data can be quickly restored and recovered. | -| [Reliability](../../reliability/README.md) | System's ability to maintain functionality under varying conditions and failure scenarios. | - Mean Time Between Failures (MTBF): The system should achieve an MTBF of at least 1000 hours, indicating a high level of reliability with infrequent failures.
- Mean Time to Recover (MTTR): The system should aim for an MTTR of less than 1 hour, ensuring quick recovery and minimal disruption in the event of a failure.
- Redundancy Levels: The system should include redundancy mechanisms to achieve a redundancy level of N+1, ensuring high availability and fault tolerance. | +| [Reliability](../../non-functional-requirements/reliability.md) | System's ability to maintain functionality under varying conditions and failure scenarios. | - Mean Time Between Failures (MTBF): The system should achieve an MTBF of at least 1000 hours, indicating a high level of reliability with infrequent failures.
- Mean Time to Recover (MTTR): The system should aim for an MTTR of less than 1 hour, ensuring quick recovery and minimal disruption in the event of a failure.
- Redundancy Levels: The system should include redundancy mechanisms to achieve a redundancy level of N+1, ensuring high availability and fault tolerance. | ### Performance Requirements @@ -53,7 +53,7 @@ To support the process of capturing a project's _comprehensive_ non-functional r | [Compliance](../../non-functional-requirements/compliance.md) | Adherence to legal, regulatory, and industry standards and requirements. | See [Microsoft Purview Compliance Manager](https://aka.ms/ComplianceManager) | | [Privacy](../../privacy/README.md) | Protection of sensitive information and compliance with privacy regulations. | - Compliance with Privacy Regulations: Achieve full compliance with GDPR, CCPA and HIPAA.
- Data Anonymization: Implement anonymization techniques in protecting individual privacy while still allowing for data analysis.
- Data Encryption: Ensure that sensitive data is encrypted according to encryption standards and best practices.
- User Privacy Preferences: The ability to respect and accommodate user privacy preferences regarding data collection, processing, and sharing. | | [Security](../../security/README.md) | Establish the security requirements of the system, such as authentication, authorization, encryption, and compliance with industry or legal regulations. | See [Threat Modeling Tool](https://aka.ms/tmt) | -| [Sustainability](../../design/sustainability/readme.md) | Ability to operate over an extended period while minimizing environmental impact and resource consumption. | - Energy Efficiency: Kilowatt-hours/Transaction.
- Carbon Footprint: Tons of CO2 emissions per year. | +| [Sustainability](../sustainability/README.md) | Ability to operate over an extended period while minimizing environmental impact and resource consumption. | - Energy Efficiency: Kilowatt-hours/Transaction.
- Carbon Footprint: Tons of CO2 emissions per year. | ### System Maintainability Requirements @@ -68,6 +68,6 @@ To support the process of capturing a project's _comprehensive_ non-functional r | Quality | Attribute |Description | Common Metrics | | -- | -- | -- | -- | -| [Accessibility](../../accessibility/README.md) | The solution must be usable by people with disabilities. Compliance with accessibility standards. Support for assistive technologies | - Alternative Text for Images: All images and non-text content must have alternative text descriptions that can be read by screen readers.
- Color contrast: The application must use color schemes that meet the recommended contrast ratio between foreground and background colors to ensure visibility for users with low vision.
- Focus indicators: The application must provide visible focus indicators to highlight the currently focused element, which is especially important for users who rely on keyboard navigation.
- Captions and Transcripts: All audio and video content must have captions and transcripts, to ensure that users with hearing impairments can access the content.
- Language identification: The application must correctly identify the language of the content, to ensure that screen readers and other assistive technologies can read the content properly. | | +| [Accessibility](../../non-functional-requirements/accessibility.md) | The solution must be usable by people with disabilities. Compliance with accessibility standards. Support for assistive technologies | - Alternative Text for Images: All images and non-text content must have alternative text descriptions that can be read by screen readers.
- Color contrast: The application must use color schemes that meet the recommended contrast ratio between foreground and background colors to ensure visibility for users with low vision.
- Focus indicators: The application must provide visible focus indicators to highlight the currently focused element, which is especially important for users who rely on keyboard navigation.
- Captions and Transcripts: All audio and video content must have captions and transcripts, to ensure that users with hearing impairments can access the content.
- Language identification: The application must correctly identify the language of the content, to ensure that screen readers and other assistive technologies can read the content properly. | | | [Internationalization and Localization](../../non-functional-requirements/internationalization.md) | Adaptation of the software for use in different languages and cultures. Tailoring the software to meet the specific needs of different regions or locales. | - Language and Locale Support: The software's support for different languages, character sets, and locales. Portability requires internationalization and localization efforts to ensure that the software can be used effectively in different regions and cultures, with support for at least five major languages.
- Multi currency: The system's support for multiple currencies, allowing different symbols and conversion rates. | | -| [Usability](../../user-interface-engineering/usability.md) | Intuitiveness, ease of learning, and user satisfaction with the software interface. | - Task Completion Time: The average time it takes for users to complete specific tasks. A user must be able to complete an account settings in less than 2 minutes.
- Ease of Navigation: The ease with which users can navigate through the system and find the information they need. This can be measured by observing user interactions or conducting usability tests.
- User Satisfaction: User satisfaction can be measured using surveys, feedback forms, or satisfaction ratings. A satisfaction score of 70% or higher is typically considered satisfactory.
- Learnability: The ease with which new users can learn to use the system. This can be measured by the time it takes for users to perform basic tasks or by conducting usability tests with novice users. | | +| [Usability](../../non-functional-requirements/usability.md) | Intuitiveness, ease of learning, and user satisfaction with the software interface. | - Task Completion Time: The average time it takes for users to complete specific tasks. A user must be able to complete an account settings update in less than 2 minutes.
- Ease of Navigation: The ease with which users can navigate through the system and find the information they need. This can be measured by observing user interactions or conducting usability tests.
- User Satisfaction: User satisfaction can be measured using surveys, feedback forms, or satisfaction ratings. A satisfaction score of 70% or higher is typically considered satisfactory.
- Learnability: The ease with which new users can learn to use the system. This can be measured by the time it takes for users to perform basic tasks or by conducting usability tests with novice users. | | diff --git a/docs/design/design-patterns/rest-api-design-guidance.md b/docs/design/design-patterns/rest-api-design-guidance.md index ff311439e8..f4d81af133 100644 --- a/docs/design/design-patterns/rest-api-design-guidance.md +++ b/docs/design/design-patterns/rest-api-design-guidance.md @@ -77,14 +77,14 @@ Important Points to consider: * With Agile development, it is hard to ensure that definitions embedded in runtime code remain stable, especially across rounds of refactoring and when serving multiple concurrent API versions. * It might be useful to regularly generate the OpenAPI definition and store it in the version control system; otherwise, generating the OpenAPI definition at runtime might make it more complex in scenarios where that definition is required at development/CI time. -## How to Interpret and Apply The Guidelines +## How to Interpret and Apply the Guidelines The API guidelines document includes a section on [how to apply the guidelines](https://github.com/microsoft/api-guidelines/blob/vNext/Guidelines.md#4-interpreting-the-guidelines) depending on whether the API is new or existing. In particular, when working in an existing API ecosystem, be sure to align with stakeholders on a definition of what constitutes a [breaking change](https://github.com/microsoft/api-guidelines/blob/vNext/Guidelines.md#123-definition-of-a-breaking-change) to understand the impact of implementing certain best practices. > We do not recommend making a breaking change to a service that predates these guidelines simply for the sake of compliance. 
-## Additional Resources +## Resources * [Microsoft's Recommended Reading List for REST APIs](https://github.com/microsoft/api-guidelines/blob/vNext/Guidelines.md#31-recommended-reading) * [Documentation - Guidance - REST APIs](https://microsoft.github.io/code-with-engineering-playbook/documentation/guidance/rest-apis/) diff --git a/docs/design/design-reviews/README.md b/docs/design/design-reviews/README.md index 381bad60dc..abb79a3507 100644 --- a/docs/design/design-reviews/README.md +++ b/docs/design/design-reviews/README.md @@ -1,14 +1,5 @@ # Design Reviews -## Table of Contents - -- [Goals](#goals) -- [Measures](#measures) -- [Impact](#impact) -- [Participation](#participation) -- [Facilitation Guidance](#facilitation-guidance) -- [Technical Spike](#technical-spike) - ## Goals - Reduce technical debt for our customers @@ -53,9 +44,6 @@ There is also a healthy balancing act in supporting a healthy debate while not h The dev crew should always participate in all design review sessions -- [ISE](../../ISE.md) Engineering -- Customer Engineering - ### Domain Experts Domain experts should participate in design review sessions as needed @@ -70,7 +58,7 @@ Domain experts should participate in design review sessions as needed Please see our [Design Review Recipes](./recipes/README.md) for guidance on design process. 
-### Sync Design Reviews via in-person / virtual meetings +### Sync Design Reviews via In-Person / Virtual Meetings Joint meetings with dev crew, subject-matter experts (SMEs) and customer engineers @@ -85,8 +73,8 @@ A technical spike is most often used for evaluating the impact new technology ha ## Design Documentation - Document and update the architecture design in the project design documentation -- Track and document design decisions in a [decision log](decision-log/README.md) -- Document decision process in [trade studies](trade-studies/README.md) when multiple solutions exist for the given problem +- Track and document design decisions in a [decision log](./decision-log/README.md) +- Document decision process in [trade studies](./trade-studies/README.md) when multiple solutions exist for the given problem Early on in engagements, the team must decide where to land artifacts generated from design reviews. Typically, we meet the customer where they are at (for example, using their Confluence instance to land documentation if that is their preferred process). diff --git a/docs/design/design-reviews/decision-log/README.md b/docs/design/design-reviews/decision-log/README.md index 3ab557818b..c2692f2414 100644 --- a/docs/design/design-reviews/decision-log/README.md +++ b/docs/design/design-reviews/decision-log/README.md @@ -2,7 +2,7 @@ Not all requirements can be captured in the beginning of an agile project during one or more design sessions. The initial architecture design can evolve or change during the project, especially if there are multiple possible technology choices that can be made. Tracking these changes within a large document is in most cases not ideal, as one can lose track of which design changes were made at which point in time. Having to scan through a large document to find specific content takes time, and in many cases the consequences of a decision are not documented. 
-## Why is it important to track design decisions +## Why is it Important to Track Design Decisions Tracking an architecture design decision can have many advantages: @@ -11,7 +11,7 @@ Tracking an architecture design decision can have many advantages: - The context of a decision including the consequences for the team are documented with the decision. - It is easier to find the design decision in a log than having to read a large document. -## What is a recommended format for tracking decisions +## What is a Recommended Format for Tracking Decisions In addition to incorporating a design decision as an update of the overall design documentation of the project, the decisions could be tracked as [Architecture Decision Records](http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions) as Michael Nygard proposed in his blog. @@ -21,62 +21,63 @@ The effort invested in design reviews and discussions can be different throughou An architecture decision record has the structure -- **[Ascending number]. [Title of decision]** +**[Ascending number]. [Title of decision]** - *The title should give the reader the information on what was decided upon.* +The title should give the reader the information on what was decided upon. - Example: +Example: - > *001. App level logging with Serilog and Application Insights* +> *001. App level logging with Serilog and Application Insights* - Hint: +Hint: - When several developers regularly start ADRs in parallel, it becomes difficult to deal with conflicting ascending numbers. An easy way to overcome this is to give ADRs the ID of the work item they relate to. +When several developers regularly start ADRs in parallel, it becomes difficult to deal with conflicting ascending numbers. An easy way to overcome this is to give ADRs the ID of the work item they relate to. -- **Date:** +**Date:** - *The date the decision was made.* +The date the decision was made. 
-- **Status:** - Proposed/Accepted/Deprecated/Superseded +**Status:** - *A proposed design can be reviewed by the development team prior to accepting it. A previous decision can be superseded by a new one, or the ADR record marked as deprecated in case it is not valid anymore.* +[Proposed/Accepted/Deprecated/Superseded] -- **Context:** +A proposed design can be reviewed by the development team prior to accepting it. A previous decision can be superseded by a new one, or the ADR record marked as deprecated in case it is not valid anymore. - *The text should provide the reader an understanding of the problem, or as Michael Nygard puts it, a value-neutral [an objective] description of the forces at play.* +**Context:** - Example: +The text should provide the reader an understanding of the problem, or as Michael Nygard puts it, a value-neutral [an objective] description of the forces at play. - > *Due to the microservices design of the platform, we need to ensure consistency of logging throughout each service so tracking of usage, performance, errors etc. can be performed end-to-end. A single logging/monitoring framework should be used where possible to achieve this, whilst allowing the flexibility for integration/export into other tools at a later stage. The developers should be equipped with a simple interface to log messages and metrics.* +Example: - *If the development team had a data-driven approach to back the decision, i.e., a study that evaluates the potential choices against a set of objective criteria by following the guidance in [Trade Studies](../trade-studies/README.md), the study should be referred to in this section.* +> Due to the microservices design of the platform, we need to ensure consistency of logging throughout each service so tracking of usage, performance, errors etc. can be performed end-to-end. 
A single logging/monitoring framework should be used where possible to achieve this, whilst allowing the flexibility for integration/export into other tools at a later stage. The developers should be equipped with a simple interface to log messages and metrics. +> +> If the development team had a data-driven approach to back the decision, i.e., a study that evaluates the potential choices against a set of objective criteria by following the guidance in [Trade Studies](../trade-studies/README.md), the study should be referred to in this section. -- **Decision:** +**Decision:** - *The decision made, it should begin with 'We will...' or 'We have agreed to ...*. +The decision made; it should begin with 'We will...' or 'We have agreed to ...'. - Example: +Example: - > *We have agreed to utilize Serilog as the Dotnet Logging framework of choice at the application level, with integration into Log Analytics and Application Insights for analysis.* +> We have agreed to utilize Serilog as the Dotnet Logging framework of choice at the application level, with integration into Log Analytics and Application Insights for analysis. -- **Consequences:** +**Consequences:** - *The resulting context, after having applied the decision.* +The resulting context, after having applied the decision. - Example: +Example: - > *Sampling will need to be configured in Application Insights so that it does not become overly-expensive when ingesting millions of messages, but also does not prevent capture of essential information. The team will need to only log what is agreed to be essential for monitoring as part of design reviews, to reduce noise and unnecessary levels of sampling.* +> Sampling will need to be configured in Application Insights so that it does not become overly-expensive when ingesting millions of messages, but also does not prevent capture of essential information. 
The team will need to only log what is agreed to be essential for monitoring as part of design reviews, to reduce noise and unnecessary levels of sampling. -### Where to store ADRs +### Where to Store ADRs ADRs can be stored and tracked in any version control system such as git. As a recommended practice, ADRs can be added as pull requests in the *proposed* status to be discussed by the team until they are updated to *accepted* and merged with the main branch. They are usually stored in a folder structure *doc/adr* or *doc/arch*. Additionally, it can be useful to track ADRs in a `decision-log.md` to provide useful metadata in an obvious format. #### Decision Logs -A decision log is a Markdown file containing a table which provides executive summaries of the decisions contained in ADRs, as well as some other metadata. You can see a template table at [`doc/decision-log.md`](doc/decision-log.md). +A decision log is a Markdown file containing a table which provides executive summaries of the decisions contained in ADRs, as well as some other metadata. You can see a template table at [`doc/decision-log.md`](./doc/decision-log.md). -### When to track ADRs +### When to Track ADRs Architecture design decisions are usually tracked whenever significant decisions are made that affect the structure and characteristics of the solution or framework we are building. ADRs can also be used to document results of spikes when evaluating different technology choices. @@ -84,8 +85,8 @@ Architecture design decisions are usually tracked whenever significant decisions The first ADR could be the decision to use ADRs to track design decisions, -- [0001-record-architecture-decisions.md](doc/adr/0001-record-architecture-decisions.md), +- [0001-record-architecture-decisions.md](./doc/adr/0001-record-architecture-decisions.md), followed by actual decisions in the engagement as in the example used above, -- [0002-app-level-logging.md](doc/adr/0002-app-level-logging.md). 
+- [0002-app-level-logging.md](./doc/adr/0002-app-level-logging.md). diff --git a/docs/design/design-reviews/decision-log/doc/decision-log.md b/docs/design/design-reviews/decision-log/doc/decision-log.md index 63fb5ce1eb..5171b02301 100644 --- a/docs/design/design-reviews/decision-log/doc/decision-log.md +++ b/docs/design/design-reviews/decision-log/doc/decision-log.md @@ -3,6 +3,6 @@ This document is used to track key decisions that are made during the course of the project. This can be used at a later stage to understand why decisions were made and by whom. -| **Decision** | **Date** | **Alternatives Considered** | **Reasoning** | **Detailed doc** | **Made By** | **Work Required** | -|----------------------------------------------|-----------------------------|--------------------------------------------|---------------------------------------------------------------|-----------------------------------------------|-------------------------|---------------------------------------------| +| Decision | Date | Alternatives Considered | Reasoning | Detailed doc | Made By | Work Required | +| -- | -- | -- | -- | -- | -- | -- | | A one-sentence summary of the decision made. | Date the decision was made. | A list of the other approaches considered. | A two to three sentence summary of why the decision was made. | A link to the ADR with the format [Title] DR. | Who made this decision? | A link to the work item for the linked ADR. 
| diff --git a/docs/design/design-reviews/decision-log/examples/memory/Architecture/Data-Model.md b/docs/design/design-reviews/decision-log/examples/memory/Architecture/Data-Model.md index 3ff134e7b0..1ce7271a19 100644 --- a/docs/design/design-reviews/decision-log/examples/memory/Architecture/Data-Model.md +++ b/docs/design/design-reviews/decision-log/examples/memory/Architecture/Data-Model.md @@ -1,13 +1,4 @@ -# Data Model - -## Table of Contents - -- [Graph vertices and edges](#graph-vertices-and-edges) -- [Graph Properties](#graph-properties) -- [Vertex Descriptions](#vertex-descriptions) -- [Full Role JSON Example](#full-role-json-example) - -## Graph Model +# Graph Model ## Graph Vertices and Edges diff --git a/docs/design/design-reviews/decision-log/examples/memory/Deployment/Environments.md b/docs/design/design-reviews/decision-log/examples/memory/Deployment/Environments.md index b60ee2907c..fc49341587 100644 --- a/docs/design/design-reviews/decision-log/examples/memory/Deployment/Environments.md +++ b/docs/design/design-reviews/decision-log/examples/memory/Deployment/Environments.md @@ -2,8 +2,6 @@ The Memory application leverages [Azure DevOps](https://learn.microsoft.com/en-gb/azure/devops/index?view=azure-devops) for work item tracking as well as continuous integration (CI) and continuous deployment (CD). ---- - ## Environments The Memory project uses multiple environments to isolate and test changes before promoting releases to the global user base. @@ -32,8 +30,7 @@ The local environment is used by individual software engineers during the develo Engineers leverage some components from the deployed development environment that are not available on certain platforms or are unable to run locally. -- CosmosDB - - Emulator only exists for Windows +- CosmosDB (Emulator only exists for Windows) The local environment also does not use Azure Traffic Manager. 
The frontend web app directly communicates to the backend REST API typically running on a separate localhost port mapping. @@ -68,8 +65,6 @@ Changes to this environment are gated by manual approval by your product's leade - Central US (centralus) - East US (eastus) ---- - ## Environment Variable Group ### Infrastructure Setup (memory-common) diff --git a/docs/design/design-reviews/decision-log/examples/memory/README.md b/docs/design/design-reviews/decision-log/examples/memory/README.md index 9ad22e34ff..e1490e6c28 100644 --- a/docs/design/design-reviews/decision-log/examples/memory/README.md +++ b/docs/design/design-reviews/decision-log/examples/memory/README.md @@ -2,6 +2,6 @@ These examples were taken from the Memory project, an internal tool for tracking an individual's accomplishments. -The main example here is the [Decision Log](Decision-Log.md). +The main example here is the [Decision Log](./decision-log.md). Since this log was used from the start, the decisions are mostly based on technology choices made in the start of the project. All line items have a link out to the trade studies done for each technology choice. diff --git a/docs/design/design-reviews/decision-log/examples/memory/Decision-Log.md b/docs/design/design-reviews/decision-log/examples/memory/decision-log.md similarity index 50% rename from docs/design/design-reviews/decision-log/examples/memory/Decision-Log.md rename to docs/design/design-reviews/decision-log/examples/memory/decision-log.md index e0b40e3620..2551f224c9 100644 --- a/docs/design/design-reviews/decision-log/examples/memory/Decision-Log.md +++ b/docs/design/design-reviews/decision-log/examples/memory/decision-log.md @@ -3,14 +3,14 @@ This document is used to track key decisions that are made during the course of the project. This can be used at a later stage to understand why decisions were made and by whom. 
-| **Decision** | **Date** | **Alternatives Considered** | **Reasoning** | **Detailed doc** | **Made By** | **Work Required** | -|-----------------------------------|------------|-----------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------|-------------|-------------------| -| Use Architecture Decision Records | 01/25/2021 | Standard Design Docs | An easy and low cost solution of tracking architecture decisions over the lifetime of a project | Record Architecture Decisions | Dev Team | #21654 | -| Use ArgoCD | 01/26/2021 | FluxCD | ArgoCD is more feature rich, will support more scenarios, and will be a better tool to put in our tool belts. So we have decided at this point to go with ArgoCD | [GitOps Trade Study](Trade-Studies/GitOps.md) | Dev Team | #21672 | -| Use Helm | 01/28/2021 | Kustomize, Kubes, Gitkube, Draft | Platform maturity, templating, ArgoCD support | K8s Package Manager Trade Study | Dev Team | #21674 | -| Use CosmosDB | 01/29/2021 | Blob Storage, CosmosDB, SQL Server, Neo4j, JanusGraph, ArangoDB | CosmosDB has better Azure integration, managed identity, and the Gremlin API is powerful. 
| Graph Storage Trade Study and Decision | Dev Team | #21650 | -| Use Azure Traffic Manager | 02/02/2021 | Azure Front Door | A lightweight solution to route traffic between multiple k8s regional clusters | Routing Trade Study | Dev Team | #21673 -| Use Linkerd + Contour | 02/02/2021 | Istio, Consul, Ambassador, Traefik | A CNCF backed cloud native k8s stack to deliver service mesh, API gateway and ingress | Routing Trade Study | Dev Team | #21673 +| Decision | Date | Alternatives Considered | Reasoning | Detailed doc | Made By | Work Required | +| -- | -- | -- | -- | -- | -- | -- | +| Use Architecture Decision Records | 01/25/2021 | Standard Design Docs | An easy and low cost solution of tracking architecture decisions over the lifetime of a project | Record Architecture Decisions | Dev Team | #21654 | +| Use ArgoCD | 01/26/2021 | FluxCD | ArgoCD is more feature rich, will support more scenarios, and will be a better tool to put in our tool belts. So we have decided at this point to go with ArgoCD | [GitOps Trade Study](./trade-studies/gitops.md) | Dev Team | #21672 | +| Use Helm | 01/28/2021 | Kustomize, Kubes, Gitkube, Draft | Platform maturity, templating, ArgoCD support | K8s Package Manager Trade Study | Dev Team | #21674 | +| Use CosmosDB | 01/29/2021 | Blob Storage, CosmosDB, SQL Server, Neo4j, JanusGraph, ArangoDB | CosmosDB has better Azure integration, managed identity, and the Gremlin API is powerful. 
| Graph Storage Trade Study and Decision | Dev Team | #21650 | +| Use Azure Traffic Manager | 02/02/2021 | Azure Front Door | A lightweight solution to route traffic between multiple k8s regional clusters | Routing Trade Study | Dev Team | #21673 | +| Use Linkerd + Contour | 02/02/2021 | Istio, Consul, Ambassador, Traefik | A CNCF backed cloud native k8s stack to deliver service mesh, API gateway and ingress | Routing Trade Study | Dev Team | #21673 | | Use ARM Templates | 02/02/2021 | Terraform, Pulumi, Az CLI | Azure Native, Az Monitoring and incremental updates support | Automated Deployment Trade Study | Dev Team | #21651 | | Use 99designs/gqlgen | 02/04/2021 | graphql, graphql-go, thunder | Type safety, auto-registration and code generation | GraphQL Golang Trade Study | Dev Team | #21775 | | Create normalized role data model | 03/25/2021 | Career Stage Profiles (CSP), Microsoft Role Library | Requires a data model that support the data requirements of both role systems | Role Data Model Schema | Dev Team | #22035 | diff --git a/docs/design/design-reviews/decision-log/examples/memory/Trade-Studies/GitOps.md b/docs/design/design-reviews/decision-log/examples/memory/trade-studies/gitops.md similarity index 98% rename from docs/design/design-reviews/decision-log/examples/memory/Trade-Studies/GitOps.md rename to docs/design/design-reviews/decision-log/examples/memory/trade-studies/gitops.md index 6588146646..9d8fe7ea62 100644 --- a/docs/design/design-reviews/decision-log/examples/memory/Trade-Studies/GitOps.md +++ b/docs/design/design-reviews/decision-log/examples/memory/trade-studies/gitops.md @@ -1,10 +1,8 @@ # Trade Study: GitOps -Conducted by: Tess and Jeff - -Backlog Work Item: #21672 - -Decision Makers: Wallace, whole team +- Conducted by: Tess and Jeff +- Backlog Work Item: #21672 +- Decision Makers: Wallace, whole team ## Overview @@ -146,7 +144,7 @@ This section should contain a table that has each solution rated against each of ArgoCD is more feature 
rich, will support more scenarios, and will be a better tool to put in our tool belts. So we have decided at this point to go with ArgoCD. -## References +## Resources 1. [GitOps](https://www.gitops.tech/#:~:text=What%20is%20GitOps?%20GitOps%20is%20a%20way%20of,familiar%20with,%20including%20Git%20and%20Continuous%20Deployment%20tools.) 1. [Enforcement](https://learn.microsoft.com/en-us/azure/azure-arc/kubernetes/use-azure-policy) diff --git a/docs/design/design-reviews/recipes/README.md b/docs/design/design-reviews/recipes/README.md index 021f1b1808..d7b38ce2a9 100644 --- a/docs/design/design-reviews/recipes/README.md +++ b/docs/design/design-reviews/recipes/README.md @@ -20,13 +20,13 @@ Design reviews come in all shapes and sizes. There are also different items to c - Design should be more detailed than game plan - May require unique deployment, security and/or privacy characteristics from other milestones -### [Feature/story design review](./feature-story-design-review-template.md) +### [Feature / Story Design Review](./templates/feature-story-design-review.md) - Design for complex features or stories - Will reuse deployment, security and other characteristics defined within game plan or milestone - May require new libraries, OSS or patterns to accomplish goals -### [Task design review](./task-design-review-template.md) +### [Task Design Review](./templates/template-task-design-review.md) - Highly detailed design for a complex tasks with many unknowns - Will integrate into higher level feature/component designs diff --git a/docs/design/design-reviews/recipes/async-design-reviews.md b/docs/design/design-reviews/recipes/async-design-reviews.md index d9e79413c7..18dcda0646 100644 --- a/docs/design/design-reviews/recipes/async-design-reviews.md +++ b/docs/design/design-reviews/recipes/async-design-reviews.md @@ -33,7 +33,7 @@ Design documentation must live in a source control repository that supports pull 2. 
If the documentation represents code that lives in many different repositories, it may make more sense to keep the docs in their own repository. 3. Place the docs so that they do not trigger CI builds for the affected code (assuming the documentation was the only change). This can be done by placing them in an isolated directory should they live alongside the code they represent. See directory structure example below. -```text +```sh -root --src --docs <-- exclude from ci build trigger diff --git a/docs/design/design-reviews/recipes/engagement-process.md b/docs/design/design-reviews/recipes/engagement-process.md index 68999861e7..022865a0e7 100644 --- a/docs/design/design-reviews/recipes/engagement-process.md +++ b/docs/design/design-reviews/recipes/engagement-process.md @@ -21,7 +21,7 @@ During this time the team uncovers many unknowns, leveraging all new-found infor ## Sprint Planning -In many engagements Microsoft works with customers using a SCRUM agile development process which begins with sprint planning. [Sprint planning](../../../agile-development/basics/ceremonies.md#sprint-planning) is a great opportunity to dive deep into the next set of high priority work. Some key points to address are the following: +In many engagements Microsoft works with customers using a SCRUM agile development process which begins with sprint planning. [Sprint planning](../../../agile-development/ceremonies.md#sprint-planning) is a great opportunity to dive deep into the next set of high priority work. Some key points to address are the following: 1. Identify stories that require design reviews 1. 
Separate design from implementation for complex stories @@ -48,7 +48,7 @@ The team can follow the same steps from [sprint planning](#sprint-planning) to h ## Sprint Retrospectives -[Sprint retrospectives](../../../agile-development/basics/ceremonies.md#retrospectives) are a great time to check in with the dev team, identify what is working or not working, and propose changes to keep improving. +[Sprint retrospectives](../../../agile-development/ceremonies.md#retrospectives) are a great time to check in with the dev team, identify what is working or not working, and propose changes to keep improving. It is also a great time to check in on design reviews @@ -58,7 +58,7 @@ It is also a great time to check in on design reviews All design artifacts should be treated as a living document. As requirements change or uncover more unknowns the dev crew should retroactively update all design artifacts. Missing this critical step may cause the customer to incur future technical debt. Artifacts that are not up to date are `bugs` in the design. -> **Tip:** Keep your artifacts up to date by adding it to your teams [Definition of Done](../../../agile-development/advanced-topics/team-agreements/definition-of-done.md) for all user stories. +> **Tip:** Keep your artifacts up to date by adding it to your teams [definition of done](../../../agile-development/team-agreements/definition-of-done.md) for all user stories. 
## Sync Design Reviews diff --git a/docs/design/design-reviews/recipes/engineering-feasibility-spikes.md b/docs/design/design-reviews/recipes/engineering-feasibility-spikes.md index 26e7f2fb87..161b1af976 100644 --- a/docs/design/design-reviews/recipes/engineering-feasibility-spikes.md +++ b/docs/design/design-reviews/recipes/engineering-feasibility-spikes.md @@ -1,10 +1,10 @@ -# Engineering Feasibility Spikes: identifying and mitigating risk +# Engineering Feasibility Spikes: Identifying and Mitigating Risk ## Introduction Some engagements require more de-risking than others. Even after Architectural Design Sessions (ADS) an engagement may still have substantial technical unknowns. These types of engagements warrant an exploratory/validation phase where Engineering Feasibility Spikes can be conducted immediately after envisioning/ADS and before engineering sprints. -### Engineering feasibility spikes +### Engineering Feasibility Spikes - Are regimented yet collaborative time-boxed investigatory activities conducted in a feedback loop to capitalize on individual learnings to inform the team. - Increase the team’s knowledge and understanding while minimizing engagement risks. @@ -15,7 +15,7 @@ The following guidelines outline how Microsoft and the customer can incorporate A good way to gauge what engineering spikes to conduct is to do a pre-mortem. -### What is a pre-mortem? +### What is a Pre-Mortem? - A 90-minute meeting after envisioning/ADS that includes the entire team (and can also include the customer) which answers "Imagine the project has failed. What problems and challenges caused this failure?" - Allows the entire team to initially raise concerns and risks early in the engagement. @@ -24,7 +24,7 @@ This input is used to decide which risks to pursue as engineering spikes. ## Sharing Learnings & Current Progress -### Feedback loop +### Feedback Loop The key element from conducting the engineering feasibility spikes is sharing the outcomes in-flight. 
@@ -34,14 +34,14 @@ The key element from conducting the engineering feasibility spikes is sharing th The feedback loop is significantly tighter/shorter than in sprint-based agile process. Instead of using the Sprint as the forcing function to adjust/pivot/re-prioritize, the interim sharing sessions were the trigger. -### Re-prioritizing the next spikes +### Re-Prioritizing the Next Spikes After the team shares current progress, another round of planning is done. This allows the team to - Establish a very tight feedback loop. - Re-prioritize the next spike(s) because of the outcome from the current engineering feasibility spikes. -### Adjusting based on context +### Adjusting Based on Context During the sharing call, and when the team believes it has enough information, the team sometimes comes to the realization that the original spike acceptance criteria is no longer valid. The team pivots into another area that provides more value. @@ -55,7 +55,7 @@ The process is depicted in the diagram below. ## Benefits -### Creating code samples to prove out ideas +### Creating Code Samples to Prove Out Ideas It is important to note to be intentional about the spikes not aiming to produce production-level code. @@ -65,14 +65,14 @@ It is important to note to be intentional about the spikes not aiming to produce For example, supposed the team was exploring the API choreography of creating a Graph client with various Azure Active Directory (AAD) authentication flows and permissions. The code to demonstrate this is implemented in a console app, but it could have been done via an Express server, etc. The fact that it was a console app was not important, but rather the ability of the Graph client to be able to do operations against the Graph API endpoint with the minimal number of permissions is the main learning goal. -### Targeted conversations +### Targeted Conversations By sharing the progress of the spike, the team’s collective knowledge increases. 
- The spikes allow the team to drive succinct conversations with various Product Groups (PGs) and other subject matter experts (SMEs). - Rather than speaking at a hypothetical level, the team playbacks project/architecture concerns and concretely points out why something is a showstopper or not a viable way forward. -### Increased customer trust +### Increased Customer Trust This process leads to increased customer trust. @@ -83,10 +83,10 @@ Using this process, the team Conducting engineering feasibility spikes sets the team and the customer up for success, especially if it highlights technology learnings that help the customer fully understand the feasibility/viability of an engineering solution. -## Summary of key points +## Summary of Key Points - A pre-mortem can involve the whole team in surfacing business and technical risks. - The key purpose of the engineering feasibility spike is learning. - Learning comes from both conducting and sharing insights from spikes. - Use new spike infused learnings to revise, refine, re-prioritize, or create the next set of spikes. -- When spikes are completed, look for new weekly rhythms like adding a ‘risk’ column to the retro board or raising topics at [daily standup](../../../agile-development/basics/ceremonies.md#stand-up) to identify emerging risks. +- When spikes are completed, look for new weekly rhythms like adding a ‘risk’ column to the retro board or raising topics at [daily standup](../../../agile-development/ceremonies.md#stand-up) to identify emerging risks. diff --git a/docs/design/design-reviews/recipes/high-level-design-recipe.md b/docs/design/design-reviews/recipes/high-level-design-recipe.md index 89155d62b8..0389a5a493 100644 --- a/docs/design/design-reviews/recipes/high-level-design-recipe.md +++ b/docs/design/design-reviews/recipes/high-level-design-recipe.md @@ -1,12 +1,12 @@ # High Level / Game Plan Design Recipe -## Why is this valuable? +## Why is this Valuable? 
Design at macroscopic level shows the interactions between systems and services that will be used to accomplish the project. It is intended to ensure there is high level understanding of the plan for what to build, which off-the-shelf components will be used, and which external components will need to interact with the deliverable. -## Things to keep in mind +## Things to Keep in Mind -* As with all other aspects of the project, design reviews must provide a friendly and safe environment so that any team member feels comfortable proposing a design for review and can use the opportunity to grow and learn from the constructive / non-judgemental feedback from peers and subject-matter experts (see [Team Agreements](../../../agile-development/advanced-topics/team-agreements/README.md)). +* As with all other aspects of the project, design reviews must provide a friendly and safe environment so that any team member feels comfortable proposing a design for review and can use the opportunity to grow and learn from the constructive / non-judgemental feedback from peers and subject-matter experts (see [Team Agreements](../../../agile-development/team-agreements/)). * Attempt to illustrate different personas involved in the use cases and how/which boxes are their entry points. * Prefer pictures over paragraphs. The diagrams aren't intended to generate code, so they should be fairly high level. * Artifacts should indicate the direction of calls (are they outbound, inbound, or bidirectional?) and call out system boundaries where ports might need to be opened or additional infrastructure work may be needed to allow calls to be made. 
diff --git a/docs/design/design-reviews/recipes/milestone-epic-design-review-recipe.md b/docs/design/design-reviews/recipes/milestone-epic-design-review-recipe.md index 20c8589788..8e4b8a6313 100644 --- a/docs/design/design-reviews/recipes/milestone-epic-design-review-recipe.md +++ b/docs/design/design-reviews/recipes/milestone-epic-design-review-recipe.md @@ -1,12 +1,12 @@ # Milestone / Epic Design Review Recipe -## Why is this valuable? +## Why is this Valuable? Design at epic/milestone level can help the team make better decisions about prioritization by summarizing the value, effort, complexity, risks, and dependencies. This brief document can help the team align on the selected approach and briefly explain the rationale for other teams, subject-matter experts, project advisors, and new team members. -## Things to keep in mind +## Things to Keep in Mind -* As with all other aspects of the project, design reviews must provide a friendly and safe environment so that any team member feels comfortable proposing a design for review and can use the opportunity to grow and learn from the constructive / non-judgemental feedback from peers and subject-matter experts (see [Team Agreements](../../../agile-development/advanced-topics/team-agreements/README.md)). +* As with all other aspects of the project, design reviews must provide a friendly and safe environment so that any team member feels comfortable proposing a design for review and can use the opportunity to grow and learn from the constructive / non-judgemental feedback from peers and subject-matter experts (see [Team Agreements](../../../agile-development/team-agreements)). * Design reviews should be lightweight and should not feel like an additional process overhead. * Dev Lead can usually provide guidance on whether a given epic/milestone needs a design review and can help other team members in preparation. 
* This is *not* a strict template that must be followed and teams should not be bogged down with polished "design presentations". @@ -16,4 +16,4 @@ Design at epic/milestone level can help the team make better decisions about pri ## Template -You can download the **[Milestone/Epic Design Review Template](./milestone-epic-design-review-template.md)**, copy it into your project, and use it as described in the [async design review recipe](./async-design-reviews.md). +You can download the **[Milestone/Epic Design Review Template](./templates/milestone-epic-design-review.md)**, copy it into your project, and use it as described in the [async design review recipe](./async-design-reviews.md). diff --git a/docs/design/design-reviews/recipes/feature-story-design-review-template.md b/docs/design/design-reviews/recipes/templates/feature-story-design-review.md similarity index 88% rename from docs/design/design-reviews/recipes/feature-story-design-review-template.md rename to docs/design/design-reviews/recipes/templates/feature-story-design-review.md index d7014c81e8..d059689dbe 100644 --- a/docs/design/design-reviews/recipes/feature-story-design-review-template.md +++ b/docs/design/design-reviews/recipes/templates/feature-story-design-review.md @@ -1,4 +1,6 @@ -# Your Feature or Story Design Title Here (prefix with DRAFT/WIP to indicate level of completeness) +# Template: Feature / Story Design Review + +## [DRAFT/WIP] [Feature or Story Design Title] > Does the feature re-use or extend existing patterns / interfaces that have already been established for the project? > Does the feature expose new patterns or interfaces that will establish a new standard for new future development? @@ -18,9 +20,9 @@ ## Goals/In-Scope * List the goals that the feature/story will help us achieve that are most relevant for the design review discussion. 
-* This should include acceptance criteria required to meet [definition of done](../../../agile-development/advanced-topics/team-agreements/definition-of-done.md). +* This should include acceptance criteria required to meet [definition of done](../../../../agile-development/team-agreements/definition-of-done.md). -## Non-goals / Out-of-Scope +## Non-Goals / Out-of-Scope * List the non-goals for the feature/story. * This contains work that is beyond the scope of what the feature/component/service is intended for. @@ -61,6 +63,6 @@ > List any open questions/concerns here. -## Additional References +## Resources -> List any additional references here including links to backlog items, work items or other documents. +> List any additional resources here including links to backlog items, work items or other documents. diff --git a/docs/design/design-reviews/recipes/milestone-epic-design-review-template.md b/docs/design/design-reviews/recipes/templates/milestone-epic-design-review.md similarity index 88% rename from docs/design/design-reviews/recipes/milestone-epic-design-review-template.md rename to docs/design/design-reviews/recipes/templates/milestone-epic-design-review.md index 8c8a21a515..a30c37234d 100644 --- a/docs/design/design-reviews/recipes/milestone-epic-design-review-template.md +++ b/docs/design/design-reviews/recipes/templates/milestone-epic-design-review.md @@ -1,6 +1,8 @@ -# Your Milestone/Epic Design Title Here (prefix with DRAFT/WIP to indicate level of completeness) +# Template: Milestone / Epic Design Review -> Please refer to for things to keep in mind when using this template. +## [DRAFT/WIP] [Milestone/Epic Design Title] + +> Please refer to the [milestone/epic design review recipe](../milestone-epic-design-review-recipe.md) for things to keep in mind when using this template. 
* Milestone / Epic: [Name](http://link-to-work-item) * Project / Engagement: [Project Engagement] @@ -12,7 +14,7 @@ ## Goals / In-Scope -> List a few bullet points of goals that this milestone/epic will achieve and that are most relevant for the design review discussion. You may include acceptable criteria required to meet the [Definition of Done](../../../agile-development/advanced-topics/team-agreements/definition-of-done.md). +> List a few bullet points of goals that this milestone/epic will achieve and that are most relevant for the design review discussion. You may include acceptable criteria required to meet the [Definition of Done](../../../../agile-development/team-agreements/definition-of-done.md). ## Non-goals / Out-of-Scope @@ -69,6 +71,6 @@ > Include any open questions and concerns. -## Additional References +## Resources -> Include any additional references including links to work items or other documents. +> Include any additional resources including links to work items or other documents. diff --git a/docs/design/design-reviews/recipes/task-design-review-template.md b/docs/design/design-reviews/recipes/templates/template-task-design-review.md similarity index 81% rename from docs/design/design-reviews/recipes/task-design-review-template.md rename to docs/design/design-reviews/recipes/templates/template-task-design-review.md index d7ea572609..2942050092 100644 --- a/docs/design/design-reviews/recipes/task-design-review-template.md +++ b/docs/design/design-reviews/recipes/templates/template-task-design-review.md @@ -1,6 +1,8 @@ -# Your Task Design Title Here (prefix with DRAFT/WIP to indicate level of completeness) +# Template: Task Design Review -> When developing a design document for a new task, it should contain a detailed design proposal demonstrating how it will solve the goals outlined below. 
+## [DRAFT/WIP] [Task Design Title] + +> When developing a design document for a new task, it should contain a detailed design proposal demonstrating how it will solve the goals outlined below. > Not all tasks require a design review, but when they do it is likely that there many unknowns, or the solution may be more complex. > The design should include diagrams, pseudocode, interface contracts as needed to provide a detailed understanding of the proposal. @@ -19,7 +21,7 @@ ## Goals/In-Scope * List a few bullet points of what this task will achieve and that are most relevant for the design review discussion. -* This should include acceptance criteria required to meet the [definition of done](../../../agile-development/advanced-topics/team-agreements/definition-of-done.md). +* This should include acceptance criteria required to meet the [definition of done](../../../../agile-development/team-agreements/definition-of-done.md). ## Non-goals / Out-of-Scope @@ -28,8 +30,8 @@ ## Proposed Options * Describe the detailed design to accomplish the proposed task. -* What patterns & practices will be used and why were they chosen. -* Were any alternate proposals considered? +* What patterns & practices will be used and why were they chosen. +* Were any alternate proposals considered? * What new components are required to be developed? * Are there any existing components that require updates? * Relevant diagrams (e.g. sequence, component, context, deployment) should be included here. @@ -43,6 +45,6 @@ > List any open questions/concerns here. -## Additional References +## Resources -> List any additional references here including links to backlog items, work items or other documents. +> List any additional resources here including links to backlog items, work items or other documents. 
diff --git a/docs/design/design-reviews/recipes/sprint-spike-template.md b/docs/design/design-reviews/recipes/templates/template-technical-spike.md similarity index 95% rename from docs/design/design-reviews/recipes/sprint-spike-template.md rename to docs/design/design-reviews/recipes/templates/template-technical-spike.md index aac4072320..53eff228d5 100644 --- a/docs/design/design-reviews/recipes/sprint-spike-template.md +++ b/docs/design/design-reviews/recipes/templates/template-technical-spike.md @@ -1,4 +1,6 @@ -# Spike: {Name} +# Template: Technical Spike + +## Spike: [Spike Name] - **Conducted by:** {Names and at least one email address for follow-up questions} - **Backlog Work Item:** {Link to the work item to provide more context} diff --git a/docs/design/design-reviews/trade-studies/README.md b/docs/design/design-reviews/trade-studies/README.md index 250e873b16..03ea542b8e 100644 --- a/docs/design/design-reviews/trade-studies/README.md +++ b/docs/design/design-reviews/trade-studies/README.md @@ -6,9 +6,9 @@ of each solution. [Trade studies](https://en.wikipedia.org/wiki/Trade_study) are a concept from systems engineering that we adapted for software projects. Trade studies have proved to be a critical tool to drive alignment with the stakeholders, earn credibility while doing so and ensure our decisions -were backed by data and not bias. +were backed by data and not bias. -## When to use the tool +## When to Use Trade studies go hand in hand with high level architecture design. This usually occurs as project requirements are solidifying, before coding begins. 
Trade studies continue to be useful throughout the project any time there are multiple options that need diff --git a/docs/design/design-reviews/trade-studies/template.md b/docs/design/design-reviews/trade-studies/template.md index 8f2e453cd1..9523605370 100644 --- a/docs/design/design-reviews/trade-studies/template.md +++ b/docs/design/design-reviews/trade-studies/template.md @@ -4,7 +4,7 @@ This generic template can be used for any situation where we have a set of requi by multiple solutions. They can range in scope from choice of which open source package to use, to full architecture designs. -## Trade Study/Design: {study name goes here} +## Trade Study/Design: [Trade Study Name] - **Conducted by:** {Names of those that can answer follow-up questions and at least one email address} - **Backlog Work Item:** {Link to the work item to provide more context} @@ -36,7 +36,7 @@ The following section should establish the desired capabilities of the solution > **IMPORTANT** This is **not** intended to define outcomes for the design activity itself. It is intended to define the outcomes for the solution being designed. -As mentioned in the [User Interface](../../../user-interface-engineering/README.md) section, if the trade study is analyzing an application development solution, make use of the _persona stories_ to derive desired outcomes. For example, if a persona story exemplifies a certain accessibility requirement, the parallel desired outcome may be "The application must be accessible for people with vision-based disabilities". +As mentioned in the [User Interface](../../../UI-UX/README.md) section, if the trade study is analyzing an application development solution, make use of the _persona stories_ to derive desired outcomes. For example, if a persona story exemplifies a certain accessibility requirement, the parallel desired outcome may be "The application must be accessible for people with vision-based disabilities". 
### Evaluation Criteria @@ -70,7 +70,7 @@ If applicable, describe the boundaries from which we have to design the solution #### Accessibility -**Accessibility is never optional**. Microsoft has made a public commitment to always produce accessible applications. For more information visit the official [Microsoft accessibility site](https://www.microsoft.com/accessibility) and read the [Accessibility](../../../accessibility/README.md) page. +**Accessibility is never optional**. Microsoft has made a public commitment to always produce accessible applications. For more information visit the official [Microsoft accessibility site](https://www.microsoft.com/accessibility) and read the [Accessibility](../../../non-functional-requirements/accessibility.md) page. Consider the following prompts when determining application accessibility requirements: @@ -82,9 +82,9 @@ Consider the following prompts when determining application accessibility requir Enumerate the solutions that are believed to deliver the outcomes defined above. -> NOTE: Limiting the evaluated solutions to 2 or 3 potential candidates can help manage the time spent on the evaluation. If there are more than 3 candidates, prioritize what the team feels are the top 3. If appropriate, the eliminated candidates can be mentioned to capture why they were eliminated. Additionally, there should be at least two options compared, otherwise you didn't need a trade study. +> **Note:** Limiting the evaluated solutions to 2 or 3 potential candidates can help manage the time spent on the evaluation. If there are more than 3 candidates, prioritize what the team feels are the top 3. If appropriate, the eliminated candidates can be mentioned to capture why they were eliminated. Additionally, there should be at least two options compared, otherwise you didn't need a trade study. 
-### {Solution 1} - Short and easily recognizable name +### [Solution 1] Add a **brief** description of the solution and how its expected to produce the desired outcomes. If appropriate, illustrations/diagrams can be used to reduce the amount of text explanation required to describe the solution. @@ -118,11 +118,11 @@ Present the evidence collected during experimentation that supports the hypothes > NOTE: **Evidence is not required for every capability, metric, or constraint for the design to be considered done.** Instead, focus on presenting evidence that is most relevant and impactful towards supporting or eliminating the hypothesis. -### {Solution 2} +### [Solution 2] ... -### {Solution N} +### [Solution N] ... @@ -148,7 +148,7 @@ In the latter case, each question needs an action item and an assigned person fo In the first case, describe which solution was chosen and why. Summarize what evidence informed the decision and how that evidence mapped to the desired outcomes. -> **IMPORTANT**: Decisions should be made with the understanding that they can change as the team learns more. It's a starting point, not a contract. +> **Note:** Decisions should be made with the understanding that they can change as the team learns more. It's a starting point, not a contract. ## Next Steps diff --git a/docs/design/diagram-types/.pages b/docs/design/diagram-types/.pages deleted file mode 100644 index 5a2526c39e..0000000000 --- a/docs/design/diagram-types/.pages +++ /dev/null @@ -1,3 +0,0 @@ -nav: - - Design diagram templates: DesignDiagramsTemplates - - ... 
\ No newline at end of file diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/azureDeploymentDiagram.png b/docs/design/diagram-types/Images/azureDeploymentDiagram.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/azureDeploymentDiagram.png rename to docs/design/diagram-types/Images/azureDeploymentDiagram.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/deploymentDiagram.png b/docs/design/diagram-types/Images/deploymentDiagram.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/deploymentDiagram.png rename to docs/design/diagram-types/Images/deploymentDiagram.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/ecommerceSite.png b/docs/design/diagram-types/Images/ecommerceSite.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/ecommerceSite.png rename to docs/design/diagram-types/Images/ecommerceSite.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/facebookUserAuthentication.png b/docs/design/diagram-types/Images/facebookUserAuthentication.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/facebookUserAuthentication.png rename to docs/design/diagram-types/Images/facebookUserAuthentication.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/generalization-aggregation-association.png b/docs/design/diagram-types/Images/generalization-aggregation-association.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/generalization-aggregation-association.png rename to docs/design/diagram-types/Images/generalization-aggregation-association.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/orderingSystem.png b/docs/design/diagram-types/Images/orderingSystem.png similarity index 100% rename from 
docs/design/diagram-types/DesignDiagramsTemplates/Images/orderingSystem.png rename to docs/design/diagram-types/Images/orderingSystem.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/placeOrderScenario.png b/docs/design/diagram-types/Images/placeOrderScenario.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/placeOrderScenario.png rename to docs/design/diagram-types/Images/placeOrderScenario.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/realization.png b/docs/design/diagram-types/Images/realization.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/realization.png rename to docs/design/diagram-types/Images/realization.png diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/Images/withPersistenceAndSecurity.png b/docs/design/diagram-types/Images/withPersistenceAndSecurity.png similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/Images/withPersistenceAndSecurity.png rename to docs/design/diagram-types/Images/withPersistenceAndSecurity.png diff --git a/docs/design/diagram-types/README.md b/docs/design/diagram-types/README.md index 26adaac003..1055575012 100644 --- a/docs/design/diagram-types/README.md +++ b/docs/design/diagram-types/README.md @@ -20,10 +20,10 @@ Within each of these classes, there are many types of diagrams, each intended to This section contains educational material and examples for the following design diagrams: -- [Class Diagrams](DesignDiagramsTemplates/classDiagrams.md) - Useful to document the structural design of a codebase's relationship between classes, and their corresponding methods -- [Component Diagrams](DesignDiagramsTemplates/componentDiagrams.md) - Useful to document a high level structural overview of all the components and their direct "touch points" with other Components -- [Sequence Diagrams](DesignDiagramsTemplates/sequenceDiagrams.md) - Useful to 
document a behavior overview of the system, capturing the various "use cases" or "actions" that triggers the system to perform some business logic -- [Deployment Diagram](DesignDiagramsTemplates/deploymentDiagrams.md) - Useful in order to document the networking and hosting environments where the system will operate in +- [Class Diagrams](./class-diagrams.md) - Useful to document the structural design of a codebase's relationship between classes, and their corresponding methods +- [Component Diagrams](./component-diagrams.md) - Useful to document a high level structural overview of all the components and their direct "touch points" with other Components +- [Sequence Diagrams](./sequence-diagrams.md) - Useful to document a behavior overview of the system, capturing the various "use cases" or "actions" that triggers the system to perform some business logic +- [Deployment Diagram](./deployment-diagrams.md) - Useful in order to document the networking and hosting environments where the system will operate in ## Supplemental Resources diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/classDiagrams.md b/docs/design/diagram-types/class-diagrams.md similarity index 100% rename from docs/design/diagram-types/DesignDiagramsTemplates/classDiagrams.md rename to docs/design/diagram-types/class-diagrams.md diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/componentDiagrams.md b/docs/design/diagram-types/component-diagrams.md similarity index 98% rename from docs/design/diagram-types/DesignDiagramsTemplates/componentDiagrams.md rename to docs/design/diagram-types/component-diagrams.md index 28ccb7ac57..09fe33c385 100644 --- a/docs/design/diagram-types/DesignDiagramsTemplates/componentDiagrams.md +++ b/docs/design/diagram-types/component-diagrams.md @@ -35,7 +35,7 @@ Because Component Diagrams represent a high level overview of the entire system - the team won't be able to identify areas of improvement - the team or other necessary stakeholders won't 
have a full understanding on how the system works as it is being developed -Because of the inherent granularity of the system, the Component Diagrams won't have to be updated as often as [Class Diagrams](./classDiagrams.md). Things that might merit updating a Component Diagram could be: +Because of the inherent granularity of the system, the Component Diagrams won't have to be updated as often as [Class Diagrams](./class-diagrams.md). Things that might merit updating a Component Diagram could be: - A deletion or addition of a new Component into the system - A change to a system Component's interaction APIs diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/deploymentDiagrams.md b/docs/design/diagram-types/deployment-diagrams.md similarity index 97% rename from docs/design/diagram-types/DesignDiagramsTemplates/deploymentDiagrams.md rename to docs/design/diagram-types/deployment-diagrams.md index bc2a55b77f..162b01e90e 100644 --- a/docs/design/diagram-types/DesignDiagramsTemplates/deploymentDiagrams.md +++ b/docs/design/diagram-types/deployment-diagrams.md @@ -15,7 +15,7 @@ It is not supposed to inform about the data flow, the caller or callee responsib ## Essential Takeaways -The Deployment diagram should contain all Components identified in the [Component Diagram(s)](./componentDiagrams.md), but captured alongside the following elements: +The Deployment diagram should contain all Components identified in the [Component Diagram(s)](./component-diagrams.md), but captured alongside the following elements: - Firewalls - VNETs and subnets diff --git a/docs/design/diagram-types/DesignDiagramsTemplates/sequenceDiagrams.md b/docs/design/diagram-types/sequence-diagrams.md similarity index 97% rename from docs/design/diagram-types/DesignDiagramsTemplates/sequenceDiagrams.md rename to docs/design/diagram-types/sequence-diagrams.md index 6400f33971..2cd46e9abd 100644 --- a/docs/design/diagram-types/DesignDiagramsTemplates/sequenceDiagrams.md +++ 
b/docs/design/diagram-types/sequence-diagrams.md @@ -37,7 +37,7 @@ A Sequence Diagram should: It is okay for a single Sequence Diagram to have many different scenarios if they have some related context that merits them being grouped. -Another important thing to keep in mind, is that the **objects** involved in a Sequence Diagram should refer to existing Components from a [Component Diagram](./componentDiagrams.md). +Another important thing to keep in mind, is that the **objects** involved in a Sequence Diagram should refer to existing Components from a [Component Diagram](./component-diagrams.md). There are 2 areas where complexity can result in an overly "crowded" Sequence Diagram, making it costly to maintain. They are: @@ -73,7 +73,7 @@ early on are that: - the team will be unable to gain insights in time, from visualizing the various messages and requests sent between Components, in order to perform any potential refactoring - the team or other necessary stakeholders won't have a complete understanding of the request/message/data flow within the system -Because of the inherent granularity of the system, the Sequence Diagrams won't have to be updated as often as [Class Diagrams](./classDiagrams.md), but may require more maintenance than [Component Diagrams](./componentDiagrams.md). Things that might merit updating a Sequence Diagram could be: +Because of the inherent granularity of the system, the Sequence Diagrams won't have to be updated as often as [Class Diagrams](./class-diagrams.md), but may require more maintenance than [Component Diagrams](./component-diagrams.md). Things that might merit updating a Sequence Diagram could be: - A new request/message/data being sent across Components involved in a scenario - A change to one or several Components involved in a Sequence Diagram. 
Such as splitting a component into multiple ones, or consolidating many Components into a single one diff --git a/docs/design/exception-handling/readme.md b/docs/design/exception-handling.md similarity index 95% rename from docs/design/exception-handling/readme.md rename to docs/design/exception-handling.md index 0d71a6c301..87919314f9 100644 --- a/docs/design/exception-handling/readme.md +++ b/docs/design/exception-handling.md @@ -1,6 +1,6 @@ -# Exception handling +# Exception Handling -## Exception constructs +## Exception Constructs Almost all language platforms offer a construct of exception or equivalent to handle error scenarios. The underlying platform, used libraries or the authored code can "throw" exceptions to initiate an error flow. Some of the advantages of using exceptions are - @@ -17,7 +17,7 @@ Here is some guidance on exception handling in .Net [Handling exceptions in .Net](https://learn.microsoft.com/en-us/dotnet/standard/exceptions/#exceptions-vs-traditional-error-handling-methods) -## Custom exceptions +## Custom Exceptions Although the platform offers numerous types of exceptions, often we need custom defined exceptions to arrive at an optimal low level design for error handling. The advantages of using custom exceptions are - @@ -27,16 +27,16 @@ Although the platform offers numerous types of exceptions, often we need custom 4. Enrich the exception with more information about the data context of the error. E.g. RecordId in property in DatabaseWriteException which carries the Id of the record failed to update. 5. Define custom error message which is more business user friendly or support team friendly. -### Custom exception hierarchy +### Custom Exception Hierarchy Below diagram shows a sample hierarchy of custom exceptions. -1. It defines a BaseException class which derives from System.Exception class and parent of all custom exceptions. BaseException also has additional properties for ActionCode and ResultCode. 
ActionCode represents the "flow" in which the error happened. ResultCode represents the exact error that happened. These additional properties help in defining different error handling flows in the catch blocks. +1. It defines a BaseException class which derives from System.Exception class and parent of all custom exceptions. BaseException also has additional properties for ActionCode and ResultCode. ActionCode represents the "flow" in which the error happened. ResultCode represents the exact error that happened. These additional properties help in defining different error handling flows in the catch blocks. 2. Defines a number of System exceptions which derive from SystemException class. They will address all the errors generated by the technical aspects of the code. Like connectivity, read, write, buffer overflow etc 3. Defines a number of Business exceptions which derive from BusinessException class. They will address all the errors generated by the business aspects of the code. Like data validations, duplicate rows. ![ClassDiagram1](https://github.com/SudhirChandra/code-with-engineering-playbook/assets/23739807/1234e529-67ab-4a14-8f7d-fc5c41006015) -## Error details in API response +## Error Details in API Response When an error occurs in an API, it has to rendered as response with all the necessary fields. There can be custom response schema drafted for these purposes. But one of the popular formats is the problem detail structure - [Problem details](https://datatracker.ietf.org/doc/html/rfc7807) diff --git a/docs/design/readme.md b/docs/design/readme.md index 1079dea365..485b7792cd 100644 --- a/docs/design/readme.md +++ b/docs/design/readme.md @@ -11,18 +11,6 @@ This covers not only technical design of software, but also architecture design - Reference or define process or checklists to help ensure well-designed software. - Collate and point to reference sources (guides, repos, articles) that can help shortcut the learning process. 
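The custom exception hierarchy edited in the hunk above (a .NET `BaseException` carrying `ActionCode` and `ResultCode`, with system- and business-exception branches) could be sketched as follows. This is a minimal illustration in Python, not the playbook's own sample; the class and code names are hypothetical, and the root is named `BaseError` because Python reserves `BaseException`.

```python
class BaseError(Exception):
    """Root of the custom hierarchy (the diagram's BaseException).

    action_code identifies the "flow" in which the error happened;
    result_code identifies the exact error that occurred. Together they
    let catch blocks choose different error-handling flows.
    """

    def __init__(self, message, action_code, result_code):
        super().__init__(message)
        self.action_code = action_code
        self.result_code = result_code


class DatabaseWriteError(BaseError):
    """System-side error, enriched with data context (the failing record id)."""

    def __init__(self, record_id):
        super().__init__(f"Failed to write record {record_id}",
                         action_code="DB_WRITE", result_code="WRITE_FAILED")
        self.record_id = record_id


class DuplicateRowError(BaseError):
    """Business-side error raised by a data validation."""

    def __init__(self, row_key):
        super().__init__(f"Duplicate row: {row_key}",
                         action_code="VALIDATION", result_code="DUPLICATE_ROW")
        self.row_key = row_key


def handle(err: BaseError) -> str:
    # One handler can branch on the extra properties instead of on types.
    return f"{err.action_code}/{err.result_code}: {err}"
```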
-## Sections - -- [Diagram Types](diagram-types/README.md) -- [Design Patterns](design-patterns/README.md) -- [Design Reviews](design-reviews/README.md) -- [Non-Functional Requirements Guidance](design-patterns/non-functional-requirements-capture-guide.md) -- [Sustainable Software Engineering](sustainability/readme.md) - -## Recipes - -- [Design Recipes](design-reviews/recipes/README.md) - ## Code Examples - Folder Structure diff --git a/docs/design/sustainability/readme.md b/docs/design/sustainability/README.md similarity index 100% rename from docs/design/sustainability/readme.md rename to docs/design/sustainability/README.md diff --git a/docs/design/sustainability/sustainable-action-disclaimers.md b/docs/design/sustainability/sustainable-action-disclaimers.md index eca756e6d0..d311ba171e 100644 --- a/docs/design/sustainability/sustainable-action-disclaimers.md +++ b/docs/design/sustainability/sustainable-action-disclaimers.md @@ -1,8 +1,8 @@ # Disclaimers -The following disclaimers provide more details about how to consider the impact of particular actions recommended by the [Sustainable Engineering Checklist](readme.md#sustainable-engineering-checklist). +The following disclaimers provide more details about how to consider the impact of particular actions recommended by the [Sustainable Engineering Checklist](./README.md#sustainable-engineering-checklist). -## ACTION: Resize physical or virtual machines to improve utilization +## ACTION: Resize Physical or Virtual Machines to Improve Utilization Recommendations from cost-savings tools are usually aligned with carbon-reduction, but as sustainability is not the purpose of such tools, carbon-savings are not guaranteed. How a cloud provider or data center manages unused capacity is also a factor in determining how impactful this action may be. 
For example: @@ -10,23 +10,23 @@ The sustainable impact of using smaller VMs in the same family are typically ben The sustainable impact of changing VM families can be harder to reason about because the underlying hardware and reserved cores may be changing with them. -## ACTION: Migrate to a hyperscale cloud provider +## ACTION: Migrate to a Hyperscale Cloud Provider Carbon savings from hyperscale cloud providers are generally attributable to four key features: IT operational efficiency, IT equipment efficiency, data center infrastructure efficiency, and renewable electricity. Microsoft Cloud, for example, is between 22 and 93 percent more energy efficient than traditional enterprise data centers, depending on the specific comparison being made. When taking into account renewable energy purchases, the Microsoft Cloud is between 72 and 98 percent more carbon efficient. [Source (PDF)](https://download.microsoft.com/download/7/3/9/739BC4AD-A855-436E-961D-9C95EB51DAF9/Microsoft_Cloud_Carbon_Study_2018.pdf) -## ACTION: Consider running an edge device +## ACTION: Consider Running an Edge Device Running an edge device negates many of the benefits of hyperscale compute facilities, so considering the local energy grid mix and the typical timing of the workloads is important to determine if this is beneficial overall. The larger volume of data that needs to be transmitted, the more this solution becomes appealing. For example, sending large amounts of audio and video content for processing. -## ACTION: Consider physically shipping data to the provider +## ACTION: Consider Physically Shipping Data to the Provider Shipping physical items has its own carbon impact, depending on the mode of transportation, which needs to be understood before making this decision. The larger the volume of data that needs to be transmitted the more this options may be beneficial. 
-## ACTION: Consider the energy efficiency of languages +## ACTION: Consider the Energy Efficiency of Languages When selecting a programming language, the _most_ energy efficient programming language may not always be the best choice for development speed, maintenance, integration with dependent systems, and other project factors. But when deciding between languages that all meet the project needs, energy efficiency can be a helpful consideration. -## ACTION: Use caching policies +## ACTION: Use Caching Policies A cache provides temporary storage of resources that have been requested by an application. Caching can improve application performance by reducing the time required to get a requested resource. Caching can also improve sustainability by decreasing the amount of network traffic. @@ -34,6 +34,6 @@ While caching provides these benefits, it also increases the risk that the resou Additionally, caching may allow unauthorized users or processes to read sensitive data. An authenticated response that is cached may be retrieved from the cache without an additional authorization. Due to security concerns like this, caching is **not recommended** for middle tier scenarios. -## ACTION: Consider caching data close to end users with a CDN +## ACTION: Consider Caching Data Close to End Users with a CDN Including CDNs in your network architecture adds many additional servers to your software footprint, each with their own local energy grid mix. The details of CDN hardware and the impact of the power that runs it is important to determine if the carbon emissions from running them is lower than the emissions from sending the data over the wire from a more distant source. The larger the volume of data, distance it needs to travel, and frequency of requests, the more this solution becomes appealing. 
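The caching-policy action above trades reduced network traffic against staleness. One common policy is a time-to-live (TTL) bound; the sketch below, with hypothetical names and an injectable clock for testing, shows how a TTL caps how stale a cached resource can get.

```python
import time


class TTLCache:
    """Minimal time-to-live cache: entries expire after ttl_seconds,
    bounding the staleness risk the checklist disclaimer describes."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so tests can control time
        self._store = {}        # key -> (value, absolute expiry time)

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]    # expired: evict and report a miss
            return default
        return value
```

A caller would re-fetch (and re-authorize) the resource on a miss, which is why short TTLs are one mitigation for the cached-credentials concern noted above.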
diff --git a/docs/design/sustainability/sustainable-engineering-principles.md b/docs/design/sustainability/sustainable-engineering-principles.md index 13761a2978..a114505f61 100644 --- a/docs/design/sustainability/sustainable-engineering-principles.md +++ b/docs/design/sustainability/sustainable-engineering-principles.md @@ -1,6 +1,6 @@ # Sustainable Principles -The following principle overviews provide the foundations supporting specific actions in the [Sustainable Engineering Checklist](./readme.md#sustainable-engineering-checklist). More details about each principle can be found by following the links in the headings or visiting the [Principles of Green Software Engineering website](https://principles.green/). +The following principle overviews provide the foundations supporting specific actions in the [Sustainable Engineering Checklist](./README.md#sustainable-engineering-checklist). More details about each principle can be found by following the links in the headings or visiting the [Principles of Green Software Engineering website](https://principles.green/). ## [Electricity Consumption](https://principles.green/principles/electricity/) diff --git a/docs/developer-experience/README.md b/docs/developer-experience/README.md index 0b424bfd0b..e71a66ae87 100644 --- a/docs/developer-experience/README.md +++ b/docs/developer-experience/README.md @@ -45,7 +45,7 @@ How long does it take to make a change that can be verified/tested locally. A lo Providing a positive developer experience is a team effort. However, certain members can take ownership of different areas to help hold the entire team accountable. 
-### Dev Lead - Set the bar +### Dev Lead - Set the Bar The following are examples of how the Dev Lead might set the bar for dev experience @@ -232,4 +232,4 @@ class MyService { } ``` -The recipes section has a more complete discussion on [DI as part of a high productivity inner dev loop](client-app-inner-loop.md) +The recipes section has a more complete discussion on [DI as part of a high productivity inner dev loop](./client-app-inner-loop.md) diff --git a/docs/developer-experience/client-app-inner-loop.md b/docs/developer-experience/client-app-inner-loop.md index 864bc353ea..86a0d6e613 100755 --- a/docs/developer-experience/client-app-inner-loop.md +++ b/docs/developer-experience/client-app-inner-loop.md @@ -1,4 +1,4 @@ -# Separating client apps from the services they consume during development +# Separating Client Apps from the Services They Consume During Development Client Apps typically rely on remote services to power their apps. However, development schedules between the client app and the services don't always fully align. For a high velocity inner dev loop, client app development must be decoupled from the backend services while still allowing the app to "invoke" the services for local testing. @@ -47,7 +47,7 @@ public static void Bootstrap(IUnityContainer container) } ``` -#### Consuming mocks via Dependency Injection +#### Consuming Mocks via Dependency Injection The code consuming the interfaces will not notice the difference. @@ -87,12 +87,12 @@ This approach also enables full fidelity integration testing without spinning up Lower fidelity approaches run stub services, that could be generated from API specs, or run fake servers like JsonServer ([JsonServer.io: A fake json server API Service for prototyping and testing.](https://www.jsonserver.io/)) or Postman. All these services would respond with predetermined and configured JSON messages. 
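The Unity-container bootstrap shown in the C# hunk above swaps mocks for real services behind an interface. A language-neutral sketch of the same idea, here in Python with made-up service names and a module-level switch standing in for the container registration:

```python
from typing import Protocol


class OrderService(Protocol):
    def get_order(self, order_id: int) -> dict: ...


class RemoteOrderService:
    """Real implementation: would call the backend service (omitted here)."""

    def get_order(self, order_id: int) -> dict:
        raise NotImplementedError("requires the remote service")


class MockOrderService:
    """Embedded mock returning canned data for local development."""

    def get_order(self, order_id: int) -> dict:
        return {"id": order_id, "status": "shipped"}


class OrderView:
    """The consumer depends only on the interface, so it cannot tell
    whether it was wired up with the real service or the mock."""

    def __init__(self, service: OrderService):
        self.service = service

    def summary(self, order_id: int) -> str:
        order = self.service.get_order(order_id)
        return f"Order {order['id']}: {order['status']}"


USE_MOCKS = True  # plays the role of the registration in Bootstrap()


def bootstrap() -> OrderView:
    service = MockOrderService() if USE_MOCKS else RemoteOrderService()
    return OrderView(service)
```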
-## How to decide +## How to Decide -| | Pros | Cons | Example when developing for: | Example When not to Use | -|----------------|----------------------------------------|-----------------------------|---------------------------------|-----------------------------------------------| +|| Pros | Cons | Example when developing for: | Example When not to Use | +| -- | -- | -- | -- | -- | | Embedded Mocks | Simplifies the F5 developer experience | Tightly coupled with Client | More static type data scenarios | Testing (e.g. unit tests, integration tests) | -|| No external dependencies to manage | Hard coded data | Initial integration with services | +|| No external dependencies to manage | Hard coded data | Initial integration with services | | | | | Mocking via Dependency Injection can be a non-trivial effort | | | | High-Fidelity Local Services | Loosely Coupled from Client | Extra tooling required i.e. local infrastructure overhead | URL Routes | When API contract are not available | | | Easier to independently modify response | Extra setup and configuration of services | | | diff --git a/docs/developer-experience/cross-platform-tasks.md b/docs/developer-experience/cross-platform-tasks.md index 9f37abdbe1..22bb0bd6fd 100644 --- a/docs/developer-experience/cross-platform-tasks.md +++ b/docs/developer-experience/cross-platform-tasks.md @@ -5,7 +5,7 @@ There are several options to alleviate cross-platform compatibility issues. - Running tasks in a container - Using the tasks-system in VS Code which provides options to allow commands to be executed specific to an operating system. -## Docker or Container based +## Docker or Container Based Using containers as development machines allows developers to get started with minimal setup and abstracts the development environment from the host OS by having it run in a container. DevContainers can also help in standardizing the local developer experience across the team. 
@@ -15,9 +15,9 @@ The following are some good resources to get started with running tasks in DevCo - [Developing inside a container](https://code.visualstudio.com/docs/remote/containers). - [Tutorial on Development in Containers](https://code.visualstudio.com/docs/remote/containers-tutorial) - For samples projects and dev container templates see [VS Code Dev Containers Recipe](https://github.com/microsoft/vscode-dev-containers) -- [Dev Containers Library](devcontainers.md) +- [Dev Containers Library](./devcontainers-getting-started.md) -## Tasks in VS Code +## Tasks in VSCode ### Running Node.js @@ -66,6 +66,6 @@ Not all scripts or tasks can be auto-detected in the workspace. It may be necess The command here is a shell command and tells the system to run either the test.sh or test.cmd. By default, it will run test.sh with that given path. This example here also defines Windows specific properties and tells it execute test.cmd instead of the default. -### References +### Resources VS Code Docs - [operating system specific properties](https://vscode-docs.readthedocs.io/en/stable/editor/tasks/#operating-system-specific-properties) diff --git a/docs/developer-experience/devcontainers.md b/docs/developer-experience/devcontainers-getting-started.md similarity index 96% rename from docs/developer-experience/devcontainers.md rename to docs/developer-experience/devcontainers-getting-started.md index 8f4626a9eb..63b6018d4e 100644 --- a/docs/developer-experience/devcontainers.md +++ b/docs/developer-experience/devcontainers-getting-started.md @@ -9,7 +9,7 @@ If you are a developer and have experience with Visual Studio Code (VS Code) or - Experience with VS Code - Experience with Docker -## What are dev containers? +## What are Dev Containers? Development containers are a VS Code feature that allows developers to package a local development tool stack into the internals of a Docker container while also bringing the VS Code UI experience with them. 
Have you ever set a breakpoint inside a Docker container? Maybe not. Dev containers make that possible. This is all made possible through a VS Code extension called the [Remote Development Extension Pack](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack) that works together with Docker to spin-up a VS Code Server within a Docker container. The VS Code UI component remains local, but your working files are volume mounted into the container. The diagram below, taken directly from the [official VS Code docs](https://code.visualstudio.com/docs/remote/containers), illustrates this: @@ -19,7 +19,7 @@ If the above diagram is not clear, a basic analogy that might help you intuitive To set yourself up for the dev container experience described above, use your VS Code's Extension Marketplace to install the [Remote Development Extension Pack](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack). -## How can dev containers improve project collaboration? +## How can Dev Containers Improve Project Collaboration? VS Code dev containers have improved project collaboration between developers on recent team projects by addressing two very specific problems: @@ -54,7 +54,7 @@ $ tree vs-code-remote-try-python # main repo directory For a list of devcontainer.json configuration properties, visit VS Code documentation on [dev container properties](https://code.visualstudio.com/docs/remote/devcontainerjson-reference). -## How do I decide which dev container is right for my use case? +## How do I Decide Which Dev Container is Right for my Use Case? Fortunately, VS Code has a repo gallery of platform specific folders that host dev container definitions (.devcontainer directories) to make getting started with dev containers easier. 
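The `devcontainer.json` properties referenced above could look like the following minimal sketch. The image tag, extension id, port, and install command are illustrative assumptions, not values from this repo; the full property list is in the VS Code dev container reference linked in the diff.

```json
{
  "name": "python-sample",
  "image": "mcr.microsoft.com/devcontainers/python:3.11",
  "customizations": {
    "vscode": {
      "extensions": ["ms-python.python"]
    }
  },
  "forwardPorts": [8000],
  "postCreateCommand": "pip install -r requirements.txt"
}
```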
The code snippet below shows a list of gallery folders that come directly from the [VS Code dev container gallery repo](https://github.com/microsoft/vscode-dev-containers/tree/master/containers): @@ -79,4 +79,4 @@ Here are the final high-level steps it takes to build a dev container: ## Going further -There are use cases where you would want to go further in configuring your Dev Container. [More details here](going-further.md) +There are use cases where you would want to go further in configuring your Dev Container. [More details here](./devcontainers-going-further.md) diff --git a/docs/developer-experience/going-further.md b/docs/developer-experience/devcontainers-going-further.md similarity index 100% rename from docs/developer-experience/going-further.md rename to docs/developer-experience/devcontainers-going-further.md diff --git a/docs/developer-experience/execute-local-pipeline-with-docker.md b/docs/developer-experience/execute-local-pipeline-with-docker.md index 60d937c596..e9d8b2d992 100644 --- a/docs/developer-experience/execute-local-pipeline-with-docker.md +++ b/docs/developer-experience/execute-local-pipeline-with-docker.md @@ -1,4 +1,4 @@ -# Executing pipelines locally +# Executing Pipelines Locally ## Abstract @@ -18,7 +18,7 @@ Using the suggested method will allow us to: [Docker Compose](https://docs.docker.com/compose/) allows you to build push or run multi-container Docker applications. -### Method of work +### Method of Work 1. Dockerize your application(s), including a build step if possible. 2. Add a step in your docker file to execute unit tests. @@ -31,7 +31,7 @@ Using the suggested method will allow us to: 1. [Docker](https://www.docker.com/products/docker-desktop) 2. Optional: if you clone the sample app, you need to have [dotnet core](https://dotnet.microsoft.com/download) installed. 
-### Step by step with examples +### Step by Step with Examples For this tutorial we are going to use a [sample dotnet core api application](https://github.com/itye-msft/cse-engagement-template). Here is the docker file for the sample app: diff --git a/docs/developer-experience/fake-services-inner-loop.md b/docs/developer-experience/fake-services-inner-loop.md index 60a5423c8c..4e558d14b5 100644 --- a/docs/developer-experience/fake-services-inner-loop.md +++ b/docs/developer-experience/fake-services-inner-loop.md @@ -2,7 +2,7 @@ ## Introduction -Consumers of remote services often find that their development cycle is not in sync with development of remote services, leaving developers of these consumers waiting for the remote services to "catch up". One approach to mitigate this issue and improve the inner dev loop is by decoupling and using Mock Services. Various Mock Service options are detailed [here](client-app-inner-loop.md). +Consumers of remote services often find that their development cycle is not in sync with development of remote services, leaving developers of these consumers waiting for the remote services to "catch up". One approach to mitigate this issue and improve the inner dev loop is by decoupling and using Mock Services. Various Mock Service options are detailed [here](./client-app-inner-loop.md). This document will focus on providing an example using the Fake Services approach. 
@@ -33,7 +33,7 @@ In order to run Json-Server, it simply requires a source for data and will infer For our example, we will use the following data file, `db.json`: -```text +```json { "user": [ { diff --git a/docs/developer-experience/onboarding-guide-template.md b/docs/developer-experience/onboarding-guide-template.md index 962148d190..935d1d4c40 100644 --- a/docs/developer-experience/onboarding-guide-template.md +++ b/docs/developer-experience/onboarding-guide-template.md @@ -16,7 +16,7 @@ When developing an onboarding document for a team, it should contain details of ## Team Agreement and Code of Conduct * Include the team's code of conduct or agreement that defines a set of expectation from each team member and how the team has agreed to operate. -* Working Agreement Template - [working agreement](../agile-development/advanced-topics/team-agreements/working-agreements.md) +* Working Agreement Template - [working agreement](../agile-development/team-agreements/working-agreement.md) ## Dev Environment Setup @@ -28,7 +28,7 @@ When developing an onboarding document for a team, it should contain details of * This can include a more in depth description with different areas of the project to help increase the project understanding. * It can include different sections on the various components of the project including deployment, e2e testing, repositories. -## Helpful Resources and Links +## Resources * This can include any additional links to documents related to the project * It may include links to backlog items, work items, wiki pages or project history. 
diff --git a/docs/developer-experience/toggle-vnet-dev-environment.md b/docs/developer-experience/toggle-vnet-dev-environment.md index dec65f70e9..0c9f402ea2 100644 --- a/docs/developer-experience/toggle-vnet-dev-environment.md +++ b/docs/developer-experience/toggle-vnet-dev-environment.md @@ -1,4 +1,4 @@ -# Toggle VNet on and off for production and development environment +# Toggle VNet On and Off for Production and Development Environment ## Problem Statement diff --git a/docs/documentation/README.md b/docs/documentation/README.md index 0de7877b6b..78061f63c3 100644 --- a/docs/documentation/README.md +++ b/docs/documentation/README.md @@ -4,16 +4,6 @@ Every software development project requires documentation. [Agile Software Devel Documentation shouldn't be an afterthought. Different written documents and materials should be created during the whole life cycle of the project, as per the project needs. -## Table of Contents - -- [Goals](#goals) -- [Challenges](#challenges) -- [What documentation should exist?](#what-documentation-should-exist) -- [Best practices](#best-practices) -- [Tools](#tools) -- [Recipes](#recipes) -- [Resources](#resources) - ## Goals - Facilitate onboarding of new team members. @@ -50,7 +40,7 @@ When working in an engineering project, we typically encounter one or more of th - Key documents created several weeks into the project: onboarding, how to run the app, etc. - Documents created last minute just before the end of a project, forgetting that they also help the team while working on the project. 
-## What documentation should exist +## What Documentation Should Exist - [Project and Repositories](./guidance/project-and-repositories.md) - [Commit Messages](../source-control/git-guidance/README.md#commit-best-practices) @@ -60,7 +50,7 @@ When working in an engineering project, we typically encounter one or more of th - [REST APIs](./guidance/rest-apis.md) - [Engineering Feedback](./guidance/engineering-feedback.md) -## Best practices +## Best Practices - [Establishing and managing documentation](./best-practices/establish-and-manage.md) - [Creating good documentation](./best-practices/good-documentation.md) diff --git a/docs/documentation/README.md.orig b/docs/documentation/README.md.orig deleted file mode 100644 index 36a04b34be..0000000000 --- a/docs/documentation/README.md.orig +++ /dev/null @@ -1,91 +0,0 @@ -# Documentation - -Every software development project requires documentation. [Agile Software Development](https://agilemanifesto.org/) values *working software over comprehensive documentation*. Still, projects should include the key information needed to understand the development and the use of the generated software. - -Documentation shouldn't be an afterthought. Different written documents and materials should be created during the whole life cycle of the project, as per the project needs. - -## Table of Contents - -- [Goals](#goals) -- [Challenges](#challenges) -- [What documentation should exist?](#what-documentation-should-exist) -- [Best practices](#best-practices) -- [Tools](#tools) -- [Recipes](#recipes) -- [Resources](#resources) - -## Goals - -- Facilitate onboarding of new team members. -- Improve communication and collaboration between teams (especially when distributed across time zones). -- Improve the transition of the project to another team. - -## Challenges - -When working in an engineering project, we typically encounter one or more of these challenges related to documentation (including some examples): - -- **Non-existent**. 
- - No onboarding documentation, so it takes a long time to set up the environment when you join the project. - - No document in the wiki explaining existing repositories, so you cannot tell which of the 10 available repositories you should clone. - - No main README, so you don't know where to start when you clone a repository. - - No "how to contribute" section, so you don't know which is the branch policy, where to add new documents, etc. - - No code guidelines, so everyone follows different naming conventions, etc. -- **Hidden**. - - Impossible to find useful documentation as it’s scattered all over the place. E.g., no idea how to compile, run and test the code as the README is hidden in a folder within a folder within a folder. - - Useful processes (e.g., grooming process) explained outside the backlog management tool and not linked anywhere. - - Decisions taken in different channels other than the backlog management tool and not recorded anywhere else. -- **Incomplete**. - - No clear branch policy, so everyone names their branches differently. - - Missing settings in the "how to run this" document that are required to run the application. -- **Inaccurate**. - - Documents not updated along with the code, so they don't mention the right folders, settings, etc. -- **Obsolete**. - - Design documents that don't apply anymore, sitting next to valid documents. Which one shows the latest decisions? -- **Out of order (subject / date)**. - - Documents not organized per subject/workstream so not easy to find relevant information when you change to a new workstream. - - Design decision logs out of order and without a date that helps to determine which is the final decision on something. -- **Duplicate**. - - No settings file available in a centralized place as a single source of truth, so developers must keep sharing their own versions, and we end up with many files that might or might not work. -- **Afterthought**. 
- - Key documents created several weeks into the project: onboarding, how to run the app, etc. - - Documents created last minute just before the end of a project, forgetting that they also help the team while working on the project. - -## What documentation should exist - -- [Project and Repositories](./guidance/project-and-repositories.md) -- [Commit Messages](../source-control/README.md#commit-best-practices) -- [Pull Requests](./guidance/pull-requests.md) -- [Code](./guidance/code.md) -- [Work Items](./guidance/work-items.md) -- [REST APIs](./guidance/rest-apis.md) -- [Engineering Feedback](./guidance/engineering-feedback.md) - -## Best practices - -- [Establishing and managing documentation](./best-practices/establish-and-manage.md) -- [Creating good documentation](./best-practices/good-documentation.md) -- [Replacing documentation with automation](./best-practices/automation.md) - -## Tools - -- [Wikis](./tools/wikis.md) -- [Languages](./tools/languages.md) - - [markdown](./tools/languages.md#markdown) - - [mermaid](./tools/languages.md#mermaid) -- [How to automate simple checks](./tools/automation.md) -- [Integration with Teams/Slack](./tools/integrations.md) - -## Recipes - -- [How to sync a wiki between repositories](./recipes/sync-wiki-between-repos.md) -<<<<<<< HEAD -- [Using DocFx and Companion Tools to generate a Documentation website](./recipes/using-docfx-and-tools.md) -- [Deploy the DocFx Documentation website to an Azure Website automatically](./recipes/deploy-docfx-azure-website.md) -======= -- [How to create a static website for your documentation based on MkDocs and Material for MkDocs](./recipes/static-website-with-mkdocs.md) ->>>>>>> 73831954d1b6de34968eb9648243b3010849d8a0 - -## Resources - -- [Software Documentation Types and Best Practices](https://blog.prototypr.io/software-documentation-types-and-best-practices-1726ca595c7f) -- [Why is project documentation 
important?](https://www.greycampus.com/blog/project-management/why-is-project-documentation-important) diff --git a/docs/documentation/best-practices/README.md b/docs/documentation/best-practices/README.md deleted file mode 100644 index 6680e136c3..0000000000 --- a/docs/documentation/best-practices/README.md +++ /dev/null @@ -1,5 +0,0 @@ -# Best Practices - -- [Replacing Documentation with Automation](automation.md) -- [Establishing and Managing Documentation](establish-and-manage.md) -- [Creating Good Documentation](good-documentation.md) \ No newline at end of file diff --git a/docs/documentation/best-practices/automation.md b/docs/documentation/best-practices/automation.md index dce750d2b8..f77b66a84e 100644 --- a/docs/documentation/best-practices/automation.md +++ b/docs/documentation/best-practices/automation.md @@ -4,13 +4,13 @@ You can document how to set up your dev machine with the right version of the fr Some examples are provided below: -## Dev containers in Visual Studio Code +## Dev Containers in Visual Studio Code The [Visual Studio Code Remote - Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) lets you use a Docker container as a full-featured development environment. It allows you to open any folder inside (or mounted into) a container and take advantage of Visual Studio Code's full feature set. Additional information: [Developing inside a Container](https://code.visualstudio.com/docs/remote/containers). -## Launch configurations and Tasks in Visual Studio Code +## Launch Configurations and Tasks in Visual Studio Code [Launch configurations](https://code.visualstudio.com/Docs/editor/debugging#_launch-configurations) allows you to configure and save debugging setup details. 
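As a reminder of what such a saved debugging setup looks like, here is a minimal `.vscode/launch.json` sketch (a Node.js configuration is assumed; the program path is a placeholder):

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Launch Program",
      "type": "node",
      "request": "launch",
      "program": "${workspaceFolder}/src/index.js"
    }
  ]
}
```

Checking a launch configuration into source control documents how to debug the project without writing a separate how-to.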
diff --git a/docs/documentation/guidance/code.md b/docs/documentation/guidance/code.md index 3c4319dd55..dc3e44ce2a 100644 --- a/docs/documentation/guidance/code.md +++ b/docs/documentation/guidance/code.md @@ -4,7 +4,7 @@ You might have heard more than once that **you should write self-documenting cod There are two types of code comments, implementation comments and documentation comments. -## Implementation comments +## Implementation Comments They are used for internal documentation, and are intended for anyone who may need to maintain the code in the future, including your future self. @@ -42,7 +42,7 @@ if (sourceLength < charTextLength { ``` -## Documentation comments +## Documentation Comments Doc comments are a special kind of comment, added above the definition of any user-defined type or member, and are intended for anyone who may need to use those types or members in their own code. diff --git a/docs/documentation/guidance/project-and-repositories.md b/docs/documentation/guidance/project-and-repositories.md index 5541fc1c57..aec3cd664e 100644 --- a/docs/documentation/guidance/project-and-repositories.md +++ b/docs/documentation/guidance/project-and-repositories.md @@ -2,7 +2,7 @@ Every source code repository should include documentation that is specific to it (e.g., in a Wiki within the repository), while the project itself should include general documentation that is common to all its associated repositories (e.g., in a Wiki within the backlog management tool). -## Documentation specific to a repository +## Documentation Specific to a Repository - Introduction - Getting started @@ -21,7 +21,7 @@ Every source code repository should include documentation that is specific to it Some sections in the documentation of the repository might point to the project’s documentation (e.g., Onboarding, Working Agreement, Contributing Guide). 
-## Common documentation to all repositories +## Common Documentation to all Repositories - Introduction - Project @@ -31,16 +31,16 @@ Some sections in the documentation of the repository might point to the project - [Onboarding](../../developer-experience/onboarding-guide-template.md) - Repository guide - Production, Spikes -- [Team agreements](../../agile-development/advanced-topics/team-agreements/README.md) - - [Team Manifesto](../../agile-development/advanced-topics/team-agreements/team-manifesto.md) +- [Team agreements](../../agile-development/team-agreements/) + - [Team Manifesto](../../agile-development/team-agreements/team-manifesto.md) - Short summary of expectations around the technical way of working and supported mindset in the team. - E.g., ownership, respect, collaboration, transparency. - - [Working Agreement](../../agile-development/advanced-topics/team-agreements/working-agreements.md) + - [Working Agreement](../../agile-development/team-agreements/working-agreement.md) - How we work together as a team and what our expectations and principles are. - E.g., communication, work-life balance, scrum rhythm, backlog management, code management. - - [Definition of Done](../../agile-development/advanced-topics/team-agreements/definition-of-done.md) + - [Definition of Done](../../agile-development/team-agreements/definition-of-done.md) - List of tasks that must be completed to close a user story, a sprint, or a milestone. - - [Definition of Ready](../../agile-development/advanced-topics/team-agreements/definition-of-ready.md) + - [Definition of Ready](../../agile-development/team-agreements/definition-of-ready.md) - How complete a user story should be in order to be selected as candidate for estimation in the sprint planning. 
- Contributing Guide - Repo structure @@ -50,14 +50,14 @@ Some sections in the documentation of the repository might point to the project - [Pull Requests](./pull-requests.md) - [Code Review Process](../../code-reviews/README.md) - [Code Review Checklist](../../code-reviews/process-guidance/reviewer-guidance.md) - - [Language Specific Checklists](../../code-reviews/recipes/README.md) + - [Language Specific Checklists](../../code-reviews/recipes/) - [Project Design](../../design/design-reviews/README.md) - [High Level / Game Plan](../../design/design-reviews/recipes/high-level-design-recipe.md) - [Milestone / Epic Design Review](../../design/design-reviews/recipes/milestone-epic-design-review-recipe.md) - [Design Review Recipes](../../design/design-reviews/README.md#Recipes) - - [Milestone / Epic Design Review Template](../../design/design-reviews/recipes/milestone-epic-design-review-template.md) - - [Feature / Story Design Review Template](../../design/design-reviews/recipes/feature-story-design-review-template.md) - - [Task Design Review Template](../../design/design-reviews/recipes/task-design-review-template.md) + - [Milestone / Epic Design Review Template](../../design/design-reviews/recipes/templates/milestone-epic-design-review.md) + - [Feature / Story Design Review Template](../../design/design-reviews/recipes/templates/feature-story-design-review.md) + - [Task Design Review Template](../../design/design-reviews/recipes/templates/template-task-design-review.md) - [Decision Log Template](../../design/design-reviews/decision-log/doc/decision-log.md) - [Architecture Decision Record (ADR) Template](../../design/design-reviews/decision-log/README.md#architecture-decision-record-(ADR)) ([Example 1](../../design/design-reviews/decision-log/doc/adr/0001-record-architecture-decisions.md), [Example 2](../../design/design-reviews/decision-log/doc/adr/0002-app-level-logging.md)) diff --git a/docs/documentation/guidance/rest-apis.md 
b/docs/documentation/guidance/rest-apis.md index cc7f9c2759..b761dc0885 100644 --- a/docs/documentation/guidance/rest-apis.md +++ b/docs/documentation/guidance/rest-apis.md @@ -16,7 +16,7 @@ While the [OpenAPI-Specification (OAI)](https://github.com/OAI/OpenAPI-Specifica [Microsoft TypeSpec](https://github.com/Microsoft/typespec) is a widely adopted tool within Azure teams, particularly for generating OpenAPI Specifications in complex and interconnected APIs that span multiple teams. To ensure consistency across different parts of the API, teams commonly leverage shared libraries which contain reusable patterns. This makes it easier to follow best practices rather than deviating from them. By promoting highly regular API designs that adhere to best practices by construction, TypeSpec can help improve the quality and consistency of APIs developed within an organization. -## References +## Resources - [ASP.NET Core web API documentation with Swagger / OpenAPI](https://learn.microsoft.com/en-us/aspnet/core/tutorials/web-api-help-pages-using-swagger?view=aspnetcore-5.0). - [Microsoft TypeSpec](https://github.com/Microsoft/typespec).
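To make this concrete, here is a minimal sketch of what a TypeSpec definition looks like (the model, route, and operation names are invented for illustration):

```typespec
import "@typespec/http";

using TypeSpec.Http;

model Widget {
  id: string;
  name: string;
}

@route("/widgets")
interface Widgets {
  @get list(): Widget[];
  @get read(@path id: string): Widget;
}
```

From a definition like this, the TypeSpec compiler can emit an OpenAPI specification, so every team derives its API documentation from the same regular source.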
diff --git a/docs/documentation/recipes/README.md b/docs/documentation/recipes/README.md deleted file mode 100644 index 7b814a0c3c..0000000000 --- a/docs/documentation/recipes/README.md +++ /dev/null @@ -1,6 +0,0 @@ -# Recipes - -- [deploy-docfx-azure-website](./deploy-docfx-azure-website.md) -- [static-website-with-mkdocs](./static-website-with-mkdocs.md) -- [sync-wiki-between-repos](./sync-wiki-between-repos.md) -- [using-docfx-and-tools](./using-docfx-and-tools.md) \ No newline at end of file diff --git a/docs/documentation/recipes/deploy-docfx-azure-website.md b/docs/documentation/recipes/deploy-docfx-azure-website.md index 0e14eb24bb..5fe3a167fb 100644 --- a/docs/documentation/recipes/deploy-docfx-azure-website.md +++ b/docs/documentation/recipes/deploy-docfx-azure-website.md @@ -1,4 +1,4 @@ -# Deploy the DocFx Documentation website to an Azure Website automatically +# Deploy the DocFx Documentation Website to an Azure Website Automatically In the article [Using DocFx and Companion Tools to generate a Documentation website](using-docfx-and-tools.md) the process is described to generate content of a documentation website using DocFx. This document describes how to set up an Azure Website to host the content and automate the deployment to it using a pipeline in Azure DevOps. @@ -14,9 +14,9 @@ You can use tools like [Chocolatey](https://chocolatey.org/) to install Terrafor choco install terraform ``` -## 2. Set the proper variables +## 2. Set the Proper Variables -> **IMPORTANT:** Make sure you modify the value of the **app_name**, **rg_name** and **rg_location** variables. The *app_name* value is appended by **azurewebsites.net** and must be unique. Otherwise the script will fail that it cannot create the website. +> **Note:** Make sure you modify the value of the **app_name**, **rg_name** and **rg_location** variables. The *app_name* value is appended by **azurewebsites.net** and must be unique. Otherwise the script will fail because it cannot create the website.
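As a sketch of what those declarations in *infrastructure/variables.tf* might look like (the defaults below are placeholders, not the values shipped in the Quick Start):

```hcl
variable "app_name" {
  description = "App Service name; <app_name>.azurewebsites.net must be globally unique"
  type        = string
  default     = "my-unique-docs-site"
}

variable "rg_name" {
  description = "Name of the resource group to create"
  type        = string
  default     = "rg-docs-website"
}

variable "rg_location" {
  description = "Azure region to deploy into"
  type        = string
  default     = "westeurope"
}
```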
In the Quick Start, authentication is disabled. If you want that enabled, make sure you have created an *Application* in Azure AD and have the *client ID*. This client ID must be set as the value of the **client_id** variable in *variables.tf*. In the *main.tf* make sure you uncomment the authentication settings in the *app-service*. For more information see [Configure Azure AD authentication - Azure App Service](https://learn.microsoft.com/en-us/azure/app-service/configure-authentication-provider-aad). @@ -44,14 +44,11 @@ Export-PfxCertificate -cert $path -FilePath [FILENAME].pfx -Password $pwd The certificate needs to be stored in the common Key Vault. Go to `Settings > Certificates` in the left menu of the Key Vault and click `Generate/Import`. Provide these details: * Method of Certificate Creation: `Import` - * Certificate name: e.g. `ssl-certificate` - * Upload Certificate File: select the file on disc for this. - * Password: this is the [PASSWORD] we reference earlier. -### Custom domain registration +### Custom Domain Registration To use a custom domain, a few things need to be done. The process in the Azure portal is described in the article [Tutorial: Map an existing custom DNS name to Azure App Service](https://learn.microsoft.com/en-us/azure/app-service/app-service-web-tutorial-custom-domain). An important part is described under the header [Get a domain verification ID](https://learn.microsoft.com/en-us/azure/app-service/app-service-web-tutorial-custom-domain#get-a-domain-verification-id). This ID needs to be registered with the DNS description as a TXT record. @@ -61,11 +58,11 @@ Important to know is that this `Custom Domain Verification ID` is the same for a The Azure App Service needs to access the Key Vault to get the certificate. This is needed for the first run, but also when the certificate is renewed in the Key Vault. For this purpose the Azure App Service accesses the Key Vault with the App Service resource provided identity.
This identity can be found with the service principal name **abfa0a7c-a6b6-4736-8310-5855508787cd** or **Microsoft Azure App Service** and is of type **Application**. This ID is the same for all Azure subscriptions. It needs to have Get-permissions on secrets and certificates. For more information see this article [Import a certificate from Key Vault](https://learn.microsoft.com/en-us/azure/app-service/configure-ssl-certificate#import-a-certificate-from-key-vault). -### Add the custom domain and SSL certificate to the App Service +### Add the Custom Domain and SSL Certificate to the App Service Once we have the SSL certificate and there is a complete DNS registration as described, we can uncomment the code in the Terraform script from the Quick Start folder to attach this to the App Service. In this script you need to reference the certificate in the common Key Vault and use it in the custom hostname binding. The custom hostname is assigned in the script as well. The setting `ssl_state` needs to be `SniEnabled` if you're using an SSL certificate. Now the creation of the authenticated website with a custom domain is automated. -## 3. Deploy Azure resources from your local machine +## 3. Deploy Azure Resources from Your Local Machine Open up a command prompt. For the commands to be executed, you need to have a connection to your Azure subscription. This can be done using [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-windows?tabs=azure-cli). Type this command: @@ -107,14 +104,14 @@ When asked for approval, type "yes" and ENTER. You can also add the *-auto-appro The deployment using Terraform is not included in the pipeline from the Quick Start folder as described in the next step, as that asks for more configuration. But of course that can always be added. -## 4. Deploy the website from a pipeline +## 4. Deploy the Website from a Pipeline The best way to create the resources and deploy to them is to do this automatically in a pipeline.
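A minimal sketch of what such an Azure DevOps pipeline could contain (the steps and variable names here are illustrative, not a copy of the provided pipeline):

```yaml
trigger:
  branches:
    include:
      - master

pool:
  vmImage: ubuntu-latest

steps:
  # Generate the static website into the _site folder (assumes DocFx is on the agent).
  - script: docfx docfx.json
    displayName: Generate documentation website

  # Deploy the generated content to the App Service.
  - task: AzureWebApp@1
    inputs:
      azureSubscription: $(AzureConnectionName)
      appName: $(AzureAppServiceName)
      package: _site
```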
For this purpose the **.pipelines/documentation.yml** pipeline is provided. This pipeline is built for an Azure DevOps environment. Create a pipeline and reference this YAML file. -> **IMPORTANT:** the Quick Start folder contains a web.config that is needed for deployment to IIS or Azure App Service. This enables the use of the json file for search requests. If you don't have this in place, the search of text will never return anything and result in 404's under the hood. +> **Note:** The Quick Start folder contains a web.config that is needed for deployment to IIS or Azure App Service. This enables the use of the JSON file for search requests. If you don't have this in place, text searches will never return anything and will result in 404s under the hood. You have to create a Service Connection in your DevOps environment to connect to the Azure Subscription you want to deploy to. -> **IMPORTANT:** set the variables **AzureConnectionName** to the name of the Service Connection and the **AzureAppServiceName** to the name you determined in the *infrastructure/variables.tf*. +> **Note:** Set the variable **AzureConnectionName** to the name of the Service Connection and **AzureAppServiceName** to the name you determined in *infrastructure/variables.tf*. In the Quick Start folder the pipeline uses `master` as trigger, which means that any push being done to master triggers the pipeline. You will probably change this to another branch.
diff --git a/docs/documentation/recipes/static-website-with-mkdocs.md b/docs/documentation/recipes/static-website-with-mkdocs.md index 74f4669ee2..05c0026a7f 100644 --- a/docs/documentation/recipes/static-website-with-mkdocs.md +++ b/docs/documentation/recipes/static-website-with-mkdocs.md @@ -1,4 +1,4 @@ -# How to create a static website for your documentation based on mkdocs and mkdocs-material +# How to Create a Static Website for Your Documentation Based on mkdocs and mkdocs-material [MkDocs](https://www.mkdocs.org/) is a tool built to create static websites from raw markdown files. Other alternatives include [Sphinx](https://www.sphinx-doc.org/en/master/), and [Jekyll](https://jekyllrb.com/). @@ -27,7 +27,7 @@ Setting up locally is very easy. See [Getting Started with MkDocs](https://www.m For publishing the website, there's a [good integration with GitHub for storing the website as a GitHub Page](https://www.mkdocs.org/user-guide/deploying-your-docs/). -## Additional links +## Resources - [MkDocs Plugins](https://github.com/mkdocs/mkdocs/wiki/MkDocs-Plugins) - [The best MkDocs plugins and customizations](https://chrieke.medium.com/the-best-mkdocs-plugins-and-customizations-fc820eb19759) diff --git a/docs/documentation/recipes/sync-wiki-between-repos.md b/docs/documentation/recipes/sync-wiki-between-repos.md index 22348840ce..1a26d7aa8b 100644 --- a/docs/documentation/recipes/sync-wiki-between-repos.md +++ b/docs/documentation/recipes/sync-wiki-between-repos.md @@ -1,4 +1,4 @@ -# How to Sync a Wiki between Repositories +# How to Sync a Wiki Between Repositories This is a quick guide to mirroring a Project Wiki to another repository. 
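The guide's steps target real Azure DevOps repositories; the core mechanic, pushing a wiki branch to a second repository and tracking it as upstream, can be sketched with throwaway local repositories (all names below are stand-ins):

```shell
tmp=$(mktemp -d)

# Stand-in for the Project Wiki repository, with one page on branch wikiMaster.
git init -q -b wikiMaster "$tmp/wiki"
cd "$tmp/wiki"
echo "# Home" > Home.md
git add Home.md
git -c user.name=ci -c user.email=ci@example.com commit -qm "Add Home page"

# Stand-in for the target repository that mirrors the wiki.
git init -q --bare "$tmp/target.git"

# Push the wiki branch to the target, then track it as upstream
# (this mirrors the `git branch -u origin/wikiMaster` step in the guide).
git remote add mirror "$tmp/target.git"
git push -q mirror wikiMaster
git branch -u mirror/wikiMaster
```

After this, a plain `git push` from the wiki clone keeps the mirror up to date.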
@@ -33,7 +33,7 @@ git branch -u origin/wikiMaster Your output should look like this when run: -```powershell +```ps1 PS C:\Git\MyProject.wiki> git pull -v POST git-upload-pack (909 bytes) remote: Azure Repos diff --git a/docs/documentation/recipes/using-docfx-and-tools.md b/docs/documentation/recipes/using-docfx-and-tools.md index f2d62026a3..5a84676822 100644 --- a/docs/documentation/recipes/using-docfx-and-tools.md +++ b/docs/documentation/recipes/using-docfx-and-tools.md @@ -1,4 +1,4 @@ -# Using DocFx and Companion Tools to generate a Documentation website +# Using DocFx and Companion Tools to Generate a Documentation Website If you want an easy way to have a website with all your documentation coming from Markdown files and comments coming from code, you can use [DocFx](https://dotnet.github.io/docfx/). The website generated by DocFx also includes fast search capabilities. There are some gaps in the DocFx solution, but we've provided companion tools that help you fill those gaps. Also see the blog post [Providing quality documentation in your project with DocFx and Companion Tools](https://mtirion.medium.com/providing-quality-documentation-in-your-project-with-docfx-and-companion-tools-76aed42b1ddd) for more explanation about the solution. @@ -20,13 +20,13 @@ This document is followed best by cloning the sample from Now you can create a pipeline in your Azure DevOps project that uses the *.pipelines/documentation.yml* and run it. > -## Documents and projects folder structure +## Documents and Projects Folder Structure The easiest approach is to work with a [mono repository](https://mtirion.medium.com/monorepo-for-beginners-45d5059ab40e) where documentation and code live together. If that's not the case in your situation but you still want to combine multiple repositories into one documentation website, you'll have to clone all repositories first to be able to combine the information. In this recipe we'll assume a monorepo is used.
In the steps below we'll consider the generation of the documentation website from this content structure: -```xaml +```sh ├── .pipelines // Azure DevOps pipeline for automatic generation and deployment │ ├── docs // all documents @@ -64,7 +64,7 @@ A `.markdownlint.json` is included with the contents below. The [MD013 setting]( The contents of the **.pipelines** and **infrastructure** folders are explained in the recipe [Deploy the DocFx Documentation website to an Azure Website automatically](deploy-docfx-azure-website.md). -## Reference documentation from source code +## Reference Documentation from Source Code DocFx can generate reference documentation from code, where C# and TypeScript are supported best at the moment. In the QuickStart folder we only used C# projects. For DocFx to generate quality reference documentation, quality triple-slash comments are required. See [Triple-slash (///) Code Comments Support](https://dotnet.github.io/docfx/spec/triple_slash_comments_spec.html). To enforce this, it's a good idea to require the use of [StyleCop](https://github.com/DotNetAnalyzers/StyleCopAnalyzers). There are a few steps that will give you an easy start with this. @@ -87,7 +87,7 @@ To make sure developers are forced to add the triple-slash comments by throwing Now you are all set to generate documentation from your C# code. For more information about languages supported by DocFx and how to configure it, see [Introduction to Multiple Languages Support](https://dotnet.github.io/docfx/spec/metadata_format_spec.html#26-multiple-language-support). -> **NOTE:** You can also add a PropertyGroup definition with the two settings in *Directory.Build.props* to have that settings in all projects. But in that case it will also be inherited in your Test projects.
## 1. Install DocFx and markdownlint-cli @@ -104,7 +104,7 @@ choco install markdownlint-cli Configuration for DocFx is done in a `docfx.json` file. Store this file in the root of your repository. -> **NOTE:** You can store the docfx.json somewhere else in the hierarchy, but then you need to provide the path of the file as an argument to the docfx command so it can be located. +> **Note:** You can store the docfx.json somewhere else in the hierarchy, but then you need to provide the path of the file as an argument to the docfx command so it can be located. Below is a good configuration to start with, where documentation is in the **/docs** folder and the sources are in the **/src** folder: @@ -146,11 +146,11 @@ Below is a good configuration to start with, where documentation is in the **/do } ``` -## 3. Setup some basic documents +## 3. Setup Some Basic Documents We suggest starting with a basic documentation structure in the **/docs** folder. In the provided QuickStart folder we have a basic setup: -```xaml +```sh ├── docs │ ├── .attachments // All images and other attachments used by documents │ @@ -194,12 +194,11 @@ You can add specific links that are important to provide direct access. > Try not to duplicate the links on the top of the page, unless it really makes sense. To get started with the setup of this website, read the getting started document with the title [Using DocFx and Companion Tools](using-docfx-and-tools.md). - ``` -## 4. Compile the companion tools and run them +## 4. Compile the Companion Tools and Run Them -> **NOTE:** To explain each step, we'll be going through the various steps in the next few paragraphs. In the provided sample, a batch-file called **GenerateDocWebsite.cmd** is included. This script will take all the necessary steps to compile the tools, execute the checks, generate the table of contents and execute docfx to generate the website. 
+> **Note:** To explain each step, we'll go through them in the next few paragraphs. In the provided sample, a batch-file called **GenerateDocWebsite.cmd** is included. This script will take all the necessary steps to compile the tools, execute the checks, generate the table of contents and execute docfx to generate the website. To check for proper markdown formatting, the **markdownlint-cli** tool is used. The command takes its configuration from the `.markdownlint.json` file in the root of the project. To check all markdown files, simply execute this command: @@ -223,13 +222,13 @@ The **TocDocFxCreation** tool is needed to generate a table of contents for your TocDocFxCreation.exe -d ./docs -sri ``` -## 5. Run DocFx to generate the website +## 5. Run DocFx to Generate the Website Run the `docfx` command to generate the website, by default in the **_site** folder. > **TIP:** If you want to check the website in your local environment, provide the **--serve** option to either the *docfx* command or the *GenerateDocWebsite* script. A small webserver is launched that hosts your website, which is accessible on localhost. -### Style of the website +### Style of the Website If you started with the QuickStart folder, the website is generated using a custom theme based on [material design](https://ovasquez.github.io/docfx-material/) and the Microsoft logo. You can change this to your liking. For more information see [How-to: Create A Custom Template | DocFX website (dotnet.github.io)](https://dotnet.github.io/docfx/tutorial/howto_create_custom_template.html). @@ -237,7 +236,7 @@ If you started with the QuickStart folder, the website is generated using a cust After you have completed the steps, you should have a default website generated in the *_site* folder. But of course, you want this to be accessible for everyone.
So, the next step is to create, for instance, an Azure Website and have a process to automatically generate and deploy the contents to that website. That process is described in the recipe [Deploy the DocFx Documentation website to an Azure Website automatically](deploy-docfx-azure-website.md). -## References +## Resources * [DocFX - static documentation generator](https://dotnet.github.io/docfx/index.html) * [Deploy the DocFx Documentation website to an Azure Website automatically](deploy-docfx-azure-website.md) diff --git a/docs/documentation/tools/automation.md b/docs/documentation/tools/automation.md index 256cedcfff..afca787fec 100644 --- a/docs/documentation/tools/automation.md +++ b/docs/documentation/tools/automation.md @@ -7,7 +7,7 @@ If you want to automate some checks on your Markdown documents, there are severa - [markdown-link-check](https://github.com/tcort/markdown-link-check) to extract links from markdown texts and check whether each link is alive (200 OK) or dead. - [write-good](../../code-reviews/recipes/markdown.md#write-good) to check English prose. - [Docker image for node-markdown-spellcheck](https://github.com/tmaier/docker-markdown-spellcheck), a lightweight docker image to spellcheck markdown files. - - [static code analysis](../../continuous-integration/dev-sec-ops/secret-management/static-code-analysis.md) + - [static code analysis](../../CI-CD/dev-sec-ops/secrets-management/static-code-analysis.md) - [VS Code Extensions](../../code-reviews/recipes/markdown.md#vs-code-extensions) - [Write Good Linter](../../code-reviews/recipes/markdown.md#write-good-linter) to get grammar and language advice while editing a document. @@ -16,12 +16,12 @@ If you want to automate some checks on your Markdown documents, there are severa - Automation - [pre-commit](https://pre-commit.com/) to use Git hook scripts to identify simple issues before submitting our code or documentation for review.
- Check [Build validation](../../code-reviews/recipes/markdown.md#build-validation) to automate linting for PRs. - - Check [CI Pipeline for better documentation](../../continuous-integration/markdown-linting/README.md) for a sample pipeline with `markdownlint`, `markdown-link-check` and `write-good`. + - Check [CI Pipeline for better documentation](../../CI-CD/recipes/ci-pipeline-for-better-documentation.md) for a sample pipeline with `markdownlint`, `markdown-link-check` and `write-good`. Sample output: ![docs-checks](./images/docs-checks.png) -## On linting rules +## On Linting Rules The team needs to be clear what linting rules are required and shouldn't be overridden with tooling or comments. The team should have consensus on when to override tooling rules. diff --git a/docs/documentation/tools/integrations.md b/docs/documentation/tools/integrations.md index 8e226a2e76..ec5ef86a0b 100644 --- a/docs/documentation/tools/integrations.md +++ b/docs/documentation/tools/integrations.md @@ -3,5 +3,4 @@ Monitor your Azure repositories and receive notifications in your channel whenever code is pushed/checked in and whenever a pull request (PR) is created, updated, or a merge is attempted. - [Azure Repos with Microsoft Teams](https://learn.microsoft.com/en-us/azure/devops/repos/integrations/repos-teams?view=azure-devops) - - [Azure Repos with Slack](https://learn.microsoft.com/en-us/azure/devops/repos/integrations/repos-slack?view=azure-devops) diff --git a/docs/documentation/tools/languages.md b/docs/documentation/tools/languages.md index 8a86074faa..851fbca1f8 100644 --- a/docs/documentation/tools/languages.md +++ b/docs/documentation/tools/languages.md @@ -25,7 +25,7 @@ Mermaid lets you create diagrams using text definitions that can later be render Mermaid files (.mmd) can be source-controlled along with your code. It's also recommended to include image files (.png) with the rendered diagrams under source control. 
Your markdown files should link the image files, so they can be read without the need for a Mermaid rendering tool (e.g., during Pull Request review). -### Example Mermaid diagram +### Example Mermaid Diagram This is an example of a Mermaid flowchart diagram written as code. @@ -39,7 +39,7 @@ graph LR This is an example of how it can be rendered as an image. -![Example mermaid diagram](images/example-mermaid-diagram.png) +![Example mermaid diagram](./images/example-mermaid-diagram.png) More information: diff --git a/docs/documentation/tools/wikis.md b/docs/documentation/tools/wikis.md index 3352e9de62..af5af7a259 100644 --- a/docs/documentation/tools/wikis.md +++ b/docs/documentation/tools/wikis.md @@ -14,6 +14,6 @@ More information: - [Create a Wiki for your project](https://learn.microsoft.com/en-us/azure/devops/project/wiki/wiki-create-repo?view=azure-devops&tabs=browser). - [Manage wikis](https://learn.microsoft.com/en-us/azure/devops/project/wiki/manage-wikis?view=azure-devops). -## Wikis vs. digital notebooks (e.g., OneNote) +## Wikis vs. Digital Notebooks (e.g., OneNote) When you work on a project, you may decide to document relevant details or record important decisions about the project in a digital notebook. Tools like [OneNote](https://www.microsoft.com/en-us/microsoft-365/onenote/digital-note-taking-app) allow you to easily organize, navigate and search your notes. You can add typed, highlighted, or ink annotations to your notes. These notes can easily be shared and created together with others. Still, Wikis greatly facilitate the process of [establishing and managing documentation](../best-practices/establish-and-manage.md) by allowing us to source control the documentation.
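The automation guidance above recommends markdown-link-check for dead links; the relative links that this kind of repository restructuring tends to break are also cheap to sanity-check locally. Below is a minimal POSIX shell sketch, assuming GNU grep; the `check_md_links` helper is hypothetical, and a real pipeline should prefer markdown-link-check:

```sh
# check_md_links DIR: print relative ./*.md link targets that do not resolve.
# Hypothetical helper for illustration; prefer markdown-link-check in CI,
# which also validates external http(s) links.
check_md_links() {
    for f in $(find "$1" -name '*.md' 2>/dev/null); do
        dir=$(dirname "$f")
        # Extract targets of the form [text](./path.md); external links
        # and #anchors are ignored by the pattern.
        for target in $(grep -o '](\./[^)#]*\.md' "$f" | sed 's/^](//'); do
            [ -f "$dir/$target" ] || echo "broken link in $f: $target"
        done
    done
}
```

Run against the `docs` folder, it prints one line per unresolved relative `.md` link and nothing when all links resolve.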
diff --git a/docs/engineering-feedback/README.md b/docs/engineering-feedback/README.md index 9fc0699cf8..f6a66148a7 100644 --- a/docs/engineering-feedback/README.md +++ b/docs/engineering-feedback/README.md @@ -1,6 +1,6 @@ # Microsoft Engineering Feedback -## Why is it important to submit Microsoft Engineering Feedback +## Why is it Important to Submit Microsoft Engineering Feedback Engineering Feedback captures the "voice of the customer" and is an important mechanism to provide actionable insights and help Microsoft product groups continuously improve the platform and cloud services to enable all customers to be as productive as possible. @@ -8,13 +8,13 @@ Engineering Feedback captures the "voice of the customer" and is an important me Even if the feedback has already been raised directly with a product group or through online channels like GitHub or Stack Overflow, it is still important to raise it via Microsoft Engineering feedback, so it can be consolidated with other customer projects that have the same feedback to help with prioritization.
-## What is good and high-quality Engineering Feedback +## What is Good and High-quality Engineering Feedback Good engineering feedback provides enough information for those who are not part of the code-with engagement to understand the customer pain, the associated product issues, the impact and priority of these issues, and any potential workarounds that exist to minimize that impact. @@ -35,10 +35,10 @@ For example, here is an evolution of transforming a fictitious feedback with the | Adding **Specifics** | Customer scenario was to receive **a total of 250 messages/second from 50 producers with requirement for ordering per producer & minimum latency, using a Service Bus topic with sessions enabled for ordering. Batch receiving is not supported in Azure Functions Service Bus Trigger.** | | Making it **Actionable** | Customer scenario was to receive a total of 250 messages/second from 50 producers with requirement for ordering per producer & minimum latency, using a Service Bus topic with sessions enabled for ordering. **According to [Microsoft documentation](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-performance-improvements#prefetching-and-receivebatch), batch receiving is recommended for better performance but this is not currently supported in the Azure Functions Service Bus Trigger. The impact and workaround was choosing containers over Functions. The desired outcome is for Azure Functions to support Service Bus sessions with batch and non-batch processing for all Azure Functions GA languages.** | -For real-world examples please follow [Feedback Examples](feedback-examples.md). +For real-world examples please follow [Feedback Examples](./feedback-examples.md). -## How to submit Engineering Feedback +## How to Submit Engineering Feedback -Please follow the [Engineering Feedback Guidance](feedback-guidance.md) to ensure that you provide feedback that can be triaged and processed most efficiently. 
+Please follow the [Engineering Feedback Guidance](./feedback-guidance.md) to ensure that you provide feedback that can be triaged and processed most efficiently. -Please review the [Frequently Asked Questions](feedback-faq.md) page for additional information on the engineering feedback process. +Please review the [Frequently Asked Questions](./feedback-faq.md) page for additional information on the engineering feedback process. diff --git a/docs/engineering-feedback/feedback-examples.md b/docs/engineering-feedback/feedback-examples.md index 596e6798f0..ae3b4bb955 100644 --- a/docs/engineering-feedback/feedback-examples.md +++ b/docs/engineering-feedback/feedback-examples.md @@ -2,7 +2,7 @@ The following are real-world examples of Engineering Feedback that have led to product improvements and unblocked customers. -## Windows Server Container support for Azure Kubernetes Service +## Windows Server Container Support for Azure Kubernetes Service The Azure Kubernetes Service should have first class Windows container support so solutions that require Windows workloads can be deployed on a wildly popular container orchestration platform. The need was to be able to deploy Windows Server containers on AKS the managed Azure Kubernetes Service. According to [this FAQ](https://learn.microsoft.com/en-us/azure/aks/faq#can-i-run-windows-server-containers-on-aks) (and in parallel confirmation) it is not available yet. @@ -22,7 +22,7 @@ Customer scenario was to receive a total of 250 messages per second from 50 prod > This feedback was [created as a feedback](https://github.com/Azure/azure-functions-servicebus-extension/issues/15) with the Azure Functions product group and also followed up internally until addressed. -## Stream Analytics - No support for zero-downtime scale-down +## Stream Analytics - No Support for Zero-Downtime Scale-Down In order to update the Streaming Unit number in Stream Analytics you need to stop the service and wait for minutes for it to restart. 
This is unacceptable to customers who need near real-time analysis. In order to have a job re-started, up to 2 minutes are needed, which is not workable for a real-time streaming solution. It would also be optimal if scale-up and scale-down could be done automatically, by setting threshold values that, when reached, automatically increase or decrease the number of Streaming Units available. This feedback is for customers' request for zero down-time scale-down capability in Stream Analytics. @@ -36,7 +36,7 @@ Desired Experience: Partners should be able to update the Streaming Unit number Several customers already use Python as part of their workflow, and would like to be able to use Python for Azure Functions. This is especially true since many of them already have scripts running on other clouds and services. -In addition, Python support has been in Preview for a very long time, and it's missing a lot of functionality. +In addition, Python support has been in Preview for a very long time, and it's missing a lot of functionality. This feature request is one of the most asked for, and has huge upside potential to pull through Machine Learning (ML) based workloads. diff --git a/docs/engineering-feedback/feedback-faq.md b/docs/engineering-feedback/feedback-faq.md index 067442b74c..39fa642b6b 100644 --- a/docs/engineering-feedback/feedback-faq.md +++ b/docs/engineering-feedback/feedback-faq.md @@ -2,7 +2,7 @@ The questions below are common questions related to the feedback process. The answers are intended to help both Microsoft employees and customers. -## When should I submit feedback versus creating an issue on GitHub, UserVoice, or sending an email directly to a Microsoft employee? +## When Should I Submit Feedback vs. Creating an Issue on GitHub, UserVoice, or Sending an Email Directly to a Microsoft Employee? It is appropriate to do both. As a customer or Microsoft employee, you are empowered to create an issue or submit feedback via the medium appropriate for the service.
@@ -10,29 +10,29 @@ In addition to an issue on GitHub, feedback on UserVoice, or a personal email, M Submitting to ISE Feedback allows the ISE Feedback team to coalesce feedback across a wide range of sources, and thus create a unified case to submit to the appropriate Azure engineering team(s). -## How can a customer track the status of a specific feedback item? +## How can a Customer Track the Status of a Specific Feedback Item? At this time, customers are not able to directly track the status of feedback submitted via ISE Feedback. The ISE Feedback process is internal to Microsoft, and as such, available only to Microsoft employees. Customers may request an update from their ISE engineering partner or Microsoft account representative(s). Customers can also submit their feedback directly via GitHub or UserVoice (as appropriate for the specific service), and inform their ISE engineering partner. The ISE engineer should submit the feedback via the ISE Feedback process, and in doing so reference the previously created issue. Customers can follow the GitHub or UserVoice item to be alerted on updates. -## How can a Microsoft employee track the status of a specific feedback item? +## How can a Microsoft Employee Track the Status of a Specific Feedback Item? The easiest way for a Microsoft employee within ISE to track a specific feedback item is to [follow the feedback (a work item)](https://learn.microsoft.com/azure/devops/boards/work-items/follow-work-items?view=azure-devops) in Azure DevOps. -## As a Microsoft employee within ISE, if I submit a feedback and move to another dev crew engagement, how would my customer get an update on that feedback? +## As a Microsoft Employee Within ISE, if I Submit Feedback and Move to Another Dev Crew Engagement, how Would my Customer get an Update on that Feedback? If the feedback is also submitted via GitHub or UserVoice, the customer may elect to follow that item for publicly available updates.
The customer may also contact their Microsoft account representative to request an update. -## As a Microsoft employee within ISE, what should I expect/do after submitting feedback via ISE Feedback? +## As a Microsoft Employee Within ISE, what Should I Expect/Do After Submitting Feedback via ISE Feedback? After submitting the feedback, it is recommended to [follow the feedback (a work item)](https://learn.microsoft.com/azure/devops/boards/work-items/follow-work-items?view=azure-devops) in Azure DevOps. If you have configured Azure DevOps notifications to send an email on work item updates, you will receive an email when the feedback is updated. If more information about the feedback is needed, a member of the ISE Feedback team will contact you to gather more information. -## How/when are feedback aggregated? +## How/When is Feedback Aggregated? -Members of the CSE Feedback team will make a best effort to triage and review new CSE Feedback items within two weeks of the original submission date. +Members of the ISE Feedback team will make a best effort to triage and review new ISE Feedback items within two weeks of the original submission date. If there is similarity across multiple feedback items, a member of the ISE Feedback team may decide to create a new feedback item which is an aggregate of similar items. This is done to aid in the creation of a unified feedback item to present to the appropriate Microsoft engineering team. diff --git a/docs/engineering-feedback/feedback-guidance.md b/docs/engineering-feedback/feedback-guidance.md index ebe15c5ccb..de8ea4ff49 100644 --- a/docs/engineering-feedback/feedback-guidance.md +++ b/docs/engineering-feedback/feedback-guidance.md @@ -43,7 +43,7 @@ Select one of the following to describe the lifecycle stage of the engagement th Describe the impact on the customer and engagement that this feedback implies.
-### Time frame +### Time Frame Provide a time frame that this feedback item needs to be resolved within (if relevant). @@ -80,14 +80,14 @@ Provide a clear set of repeatable steps that will allow for this feedback to be Include items like architecture diagrams, screenshots, logs, traces, etc., which can help with understanding your notes and the feedback item. Also include details about the customer/partner scenario, verbatim as much as possible, in the main content. -### What didn't work +### What Didn't Work Describe what didn't work or what feature gap you identified. -### What was your expectation or the desired outcome +### What was Your Expectation or the Desired Outcome Describe what you expected to happen. What was the outcome that was expected? -### Describe the steps you took +### Describe the Steps you Took Provide a clear description of the steps taken and the outcome/description at each point. diff --git a/docs/ENG-FUNDAMENTALS-CHECKLIST.md b/docs/engineering-fundamentals-checklist.md similarity index 73% rename from docs/ENG-FUNDAMENTALS-CHECKLIST.md rename to docs/engineering-fundamentals-checklist.md index 6a7764a7cd..bc6e336bfb 100644 --- a/docs/ENG-FUNDAMENTALS-CHECKLIST.md +++ b/docs/engineering-fundamentals-checklist.md @@ -10,24 +10,24 @@ This checklist helps to ensure that our projects meet our Engineering Fundamenta - [ ] Commit history is consistent and commit messages are informative (what, why). - [ ] Consistent branch naming conventions. - [ ] Clear documentation of repository structure. -- [ ] Secrets are not part of the commit history or made public. (see [Credential scanning](continuous-integration/dev-sec-ops/secret-management/credential_scanning.md)) -- [ ] Public repositories follow the [OSS guidelines](source-control/README.md#creating-a-new-repository), see `Required files in default branch for public repositories`. +- [ ] Secrets are not part of the commit history or made public.
(see [Credential scanning](./CI-CD/dev-sec-ops/secrets-management/credential_scanning.md)) +- [ ] Public repositories follow the [OSS guidelines](./source-control/README.md#creating-a-new-repository), see `Required files in default branch for public repositories`. -More details on [source control](source-control/README.md) +More details on [source control](./source-control/README.md) ## Work Item Tracking - [ ] All items are tracked in AzDevOps (or similar). - [ ] The board is organized (swim lanes, feature tags, technology tags). -More details on [backlog management](agile-development/advanced-topics/backlog-management) +More details on [backlog management](./agile-development/backlog-management.md) ## Testing - [ ] Unit tests cover the majority of all components (>90% if possible). - [ ] Integration tests run to test the solution e2e. -More details on [automated testing](automated-testing/README.md) +More details on [automated testing](./automated-testing/README.md) ## CI/CD @@ -35,7 +35,7 @@ More details on [automated testing](automated-testing/README.md) - [ ] Project uses CD to manage deployments to a replica environment before PRs are merged. - [ ] Main branch is always shippable. -More details on [continuous integration](continuous-integration/README.md) and [continuous delivery](continuous-delivery/README.md) +More details on [continuous integration](./CI-CD/continuous-integration.md) and [continuous delivery](./CI-CD/continuous-delivery.md) ## Security @@ -44,7 +44,7 @@ More details on [continuous integration](continuous-integration/README.md) and [ - [ ] Data is encrypted in transit (and if necessary at rest) and passwords are hashed - [ ] Is the system split into logical segments with separation of concerns? This helps limit security vulnerabilities.
-More details on [security](security/README.md) +More details on [security](./security/README.md) ## Observability @@ -53,10 +53,10 @@ More details on [security](security/README.md) - [ ] Health of the system is monitored. - [ ] The client and server side observability data can be differentiated. - [ ] Logging configuration can be modified without code changes (e.g., verbose mode). -- [ ] [Incoming tracing context](observability/correlation-id.md) is propagated to allow for production issue debugging purposes. +- [ ] [Incoming tracing context](./observability/correlation-id.md) is propagated to allow for production issue debugging purposes. - [ ] GDPR compliance is ensured regarding PII (Personally Identifiable Information). -More details on [observability](observability/README.md) +More details on [observability](./observability/README.md) ## Agile/Scrum @@ -65,20 +65,20 @@ More details on [observability](observability/README.md)
- [ ] Discover all the reviews that the customer's processes require and plan for them. -- [ ] Clear non-functional requirements captured (see [Non-Functional Requirements Guidance](design/design-patterns/non-functional-requirements-capture-guide.md)) -- [ ] Risks and opportunities captured (see [Risk/Opportunity Management](agile-development/advanced-topics/backlog-management/risk-management.md)) +- [ ] Clear non-functional requirements captured (see [Non-Functional Requirements Guidance](./design/design-patterns/non-functional-requirements-capture-guide.md)) +- [ ] Risks and opportunities captured (see [Risk/Opportunity Management](./agile-development/advanced-topics/backlog-management/risk-management.md)) -More details on [design reviews](design/design-reviews/README.md) +More details on [design reviews](./design/design-reviews/README.md) ## Code Reviews @@ -88,7 +88,7 @@ More details on [design reviews](design/design-reviews/README.md) - [ ] Linters/Code Analyzers, unit tests and successful builds for PR merges are set up. - [ ] There is a process to enforce a quick review turnaround. -More details on [code reviews](code-reviews/README.md) +More details on [code reviews](./code-reviews/README.md) ## Retrospectives @@ -97,7 +97,7 @@ More details on [code reviews](code-reviews/README.md) - [ ] Experiments have owners and are added to project backlog. - [ ] The team conducts a longer retrospective for Milestones and project completion.
-More details on [retrospectives](agile-development/basics/ceremonies.md#retrospectives) +More details on [retrospectives](./agile-development/ceremonies.md#retrospectives) ## Engineering Feedback @@ -105,7 +105,7 @@ More details on [retrospectives](agile-development/basics/ceremonies.md#retrospe - [ ] Suggestions for improvements are incorporated in the solution - [ ] Feedback is detailed and repeatable -More details on [engineering feedback](engineering-feedback/README.md) +More details on [engineering feedback](./engineering-feedback/README.md) ## Developer Experience (DevEx) @@ -118,4 +118,4 @@ Developers on the team can: - [ ] Automatically install dependencies by pressing F5 (or equivalent) in their IDE. - [ ] Use local dev configuration values (i.e. .env, appsettings.development.json). -More details on [developer experience](developer-experience/README.md) +More details on [developer experience](./developer-experience/README.md) diff --git a/docs/machine-learning/README.md b/docs/machine-learning/README.md index b40a5454da..bf8f4cab39 100644 --- a/docs/machine-learning/README.md +++ b/docs/machine-learning/README.md @@ -8,34 +8,34 @@ This guideline documents the Machine Learning (ML) practices in ISE. ISE works w * Provide clarity on ML process and how it fits within a software engineering project. * Provide best practices for the different stages of an ML project. -## How to use these fundamentals +## How to use these Fundamentals * If you are starting a new ML project, consider reading through the [general guidance documents](#general-guidance). * For specific aspects of an ML project, refer to the guidelines for different [project phases](#ml-project-phases). -## ML Project phases +## ML Project Phases The diagram below shows different phases in an ideal ML project. 
Due to practical constraints and requirements, it might not always be possible to have a project structured in such a manner; however, best practices should be followed for each individual phase. -![Project flow](images/flow.png) +![Project flow](./images/flow.png) -* **[Envisioning](ml-problem-formulation-envisioning.md)**: Initial problem understanding, customer goals and objectives. -* **[Feasibility Study](ml-feasibility-study.md)**: Assess whether the problem in question is feasible to solve satisfactorily using ML with the available data. +* **[Envisioning](./envisioning-and-problem-formulation.md)**: Initial problem understanding, customer goals and objectives. +* **[Feasibility Study](./feasibility-studies.md)**: Assess whether the problem in question is feasible to solve satisfactorily using ML with the available data. * **Model Milestone**: There is a basic model that is achieving the minimum required performance, both in terms of ML performance and system performance. Using the knowledge gathered up to this milestone, define the scope, objectives, high-level architecture, definition of done and plan for the entire project. -* **[Model(s) experimentation](ml-experimentation.md)**: Tools and best practices for conducting successful model experimentation. +* **[Model(s) experimentation](./model-experimentation.md)**: Tools and best practices for conducting successful model experimentation. * **Model(s) Operationalization**: [Model readiness for production](ml-model-checklist.md) checklist.
-## General guidance +## General Guidance -* [ML Process Guidance](ml-proposed-process.md) -* [ML Fundamentals checklist](ml-fundamentals-checklist.md) -* [Data Exploration](ml-data-exploration.md) -* [Agile ML development](ml-project-management.md) -* [Testing Data Science and ML Ops code](ml-testing.md) -* [Profiling Machine Learning and ML Ops code](ml-profiling.md) -* [Responsible AI](responsible-ai.md) -* [Program Management for ML projects](ml-tpm-guidance.md) +* [ML Process Guidance](./proposed-ml-process.md) +* [ML Fundamentals checklist](./ml-fundamentals-checklist.md) +* [Data Exploration](./data-exploration.md) +* [Agile ML development](./agile-development-considerations-for-ml-projects.md) +* [Testing Data Science and ML Ops code](./testing-data-science-and-mlops-code.md) +* [Profiling Machine Learning and ML Ops code](./profiling-ml-and-mlops-code.md) +* [Responsible AI](./responsible-ai.md) +* [Program Management for ML projects](./tpm-considerations-for-ml-projects.md) -## References +## Resources * [Model Operationalization](https://github.com/Microsoft/MLOps) diff --git a/docs/machine-learning/ml-project-management.md b/docs/machine-learning/agile-development-considerations-for-ml-projects.md similarity index 66% rename from docs/machine-learning/ml-project-management.md rename to docs/machine-learning/agile-development-considerations-for-ml-projects.md index 002ee1c1cb..e5f0b6ae81 100644 --- a/docs/machine-learning/ml-project-management.md +++ b/docs/machine-learning/agile-development-considerations-for-ml-projects.md @@ -14,33 +14,33 @@ To learn more about how ISE runs the Agile process for software development team Within this framework, the team follows these Agile ceremonies: - [Backlog management](../agile-development/advanced-topics/backlog-management) -- [Retrospectives](../agile-development/basics/ceremonies.md#retrospectives) +- [Retrospectives](../agile-development/ceremonies.md#retrospectives) - [Scrum of 
Scrums](../agile-development/advanced-topics/effective-organization/scrum-of-scrums.md) (where applicable) -- [Sprint planning](../agile-development/basics/ceremonies.md#sprint-planning) -- [Stand-ups](../agile-development/basics/ceremonies.md#stand-up) -- [Working agreement](../agile-development/advanced-topics/team-agreements/working-agreements.md) +- [Sprint planning](../agile-development/ceremonies.md#sprint-planning) +- [Stand-ups](../agile-development/ceremonies.md#stand-up) +- [Working agreement](../agile-development/team-agreements/working-agreement.md) -### Notes on Agile process during exploration and experimentation +## Agile Process During Exploration and Experimentation 1. While acknowledging the fact that ML user stories and research spikes are less predictable than software development ones, we strive to have a deliverable for every user story in every sprint. -2. User stories and spikes are usually estimated using [T-shirt sizes](../agile-development/basics/ceremonies.md#estimation) or similar, and not in actual days/hours. +2. User stories and spikes are usually estimated using [T-shirt sizes](../agile-development/ceremonies.md#estimation) or similar, and not in actual days/hours. 3. ML design sessions should be included in each sprint. -#### Examples of ML deliverables for each sprint +### Examples of ML Deliverables for each Sprint - Working code (e.g. models, pipelines, exploratory code) - Documentation of new hypotheses, and the acceptance or rejection of previous hypotheses as part of a Hypothesis Driven Analysis (HDA). 
For more information see [Hypothesis Driven Development on Barry Oreilly's website](https://barryoreilly.com/explore/blog/how-to-implement-hypothesis-driven-development/) - Exploratory Data Analysis (EDA) results and learnings documented -## Notes on collaboration between ML team and software development team +## Collaboration Between Data Scientists and Software Developers -- The ML and Software Development teams work together on the project. The team uses one backlog and attend the same Agile ceremonies. In cases where the project has many participants, we will divide into working groups, but still have the entire team join the Agile ceremonies. +- Data scientists and software developers work together on the project. The team uses one backlog and attends the same Agile ceremonies. In cases where the project has many participants, we will divide into working groups, but still have the entire team join the Agile ceremonies. - If possible, feasibility study and initial model experimentation take place before the operationalization work kicks off. -- The ML team and dev team both share the accountability for the MLOps solution. +- Everyone shares the accountability for the MLOps solution. - The ML model interface (API) is determined as early as possible, to allow the developers to consider its integration into the production pipeline. -- MLOps artifacts are developed with a continuous collaboration and review of the ML team, to ensure the appropriate approaches for experimentation and +- MLOps artifacts are developed with continuous collaboration and review by the data scientists, to ensure the appropriate approaches for experimentation and productization are used. - Retrospectives and sprint planning are performed at the entire team level, and not at the specific work group level.
diff --git a/docs/machine-learning/ml-data-exploration.md b/docs/machine-learning/data-exploration.md similarity index 87% rename from docs/machine-learning/ml-data-exploration.md rename to docs/machine-learning/data-exploration.md index 2e08f21d5d..89195f2c40 100644 --- a/docs/machine-learning/ml-data-exploration.md +++ b/docs/machine-learning/data-exploration.md @@ -1,20 +1,16 @@ # Data Exploration -After [envisioning](./ml-problem-formulation-envisioning.md), and typically as part of the [ML feasibility study](./ml-feasibility-study.md), the next step is to confirm resource access and then dive deep into the available data through data exploration workshops. +After [envisioning](./envisioning-and-problem-formulation.md), and typically as part of the [ML feasibility study](./feasibility-studies.md), the next step is to confirm resource access and then dive deep into the available data through data exploration workshops. ## Purpose of the Data Exploration Workshop The purpose of the data exploration workshop is as follows: 1. Ensure the team can access the data and compute resources that are necessary for the ML feasibility study - -2. Ensure that the data provided is of quality and is relevant to the ML solution - -3. Make sure that the project team has a good understanding of the data - -4. Make sure that the SMEs (Subject Matter Experts) needed are present for Data Exploration Workshop - -5. List people needed for the data exploration workshop +1. Ensure that the data provided is of quality and is relevant to the ML solution +1. Make sure that the project team has a good understanding of the data +1. Make sure that the SMEs (Subject Matter Experts) needed are present for Data Exploration Workshop +1. 
List people needed for the data exploration workshop ## Accessing Resources diff --git a/docs/machine-learning/ml-problem-formulation-envisioning.md b/docs/machine-learning/envisioning-and-problem-formulation.md similarity index 95% rename from docs/machine-learning/ml-problem-formulation-envisioning.md rename to docs/machine-learning/envisioning-and-problem-formulation.md index 2aad087b3f..f3f73caea2 100644 --- a/docs/machine-learning/ml-problem-formulation-envisioning.md +++ b/docs/machine-learning/envisioning-and-problem-formulation.md @@ -2,7 +2,7 @@ Before beginning a data science investigation, we need to define a problem statement which the data science team can explore; this problem statement can have a significant influence on whether the project is likely to be successful. -## Envisioning goals +## Envisioning Goals The main goals of the envisioning process are: @@ -14,7 +14,7 @@ The main goals of the envisioning process are: The envisioning process usually entails a series of 'envisioning' sessions where the data science team work alongside subject-matter experts to formulate the problem in such a way that there is a shared understanding of the problem domain, a clear goal, and a predefined approach to evaluating a potential solution. -## Understanding the problem domain +## Understanding the Problem Domain Generally, before defining a project scope for a data science investigation, we must first understand the problem domain: @@ -23,7 +23,7 @@ Generally, before defining a project scope for a data science investigation, we * Does this problem require a machine learning solution? * How would a potential solution be used? -However, establishing this understanding can prove difficult, especially for those unfamiliar with the problem domain.
To ease this process, we can approach problems in a structured way by taking the following steps: +However, establishing this understanding can prove difficult, especially for those unfamiliar with the problem domain. To ease this process, we can approach problems in a structured way by taking the following steps: * Identify a measurable problem and define this in business terms. The objective should be clear, and we should have a good understanding of the factors that we can control - that can be used as inputs - and how they affect the objective. Be as specific as possible. * Decide how the performance of a solution should be measured and identify whether this is possible within the restrictions of this problem. Make sure it aligns with the business objective and that you have identified the data required to evaluate the solution. Note that the data required to evaluate a solution may differ from the data needed to create a solution. @@ -35,7 +35,7 @@ or disproved - to guide the exploration of the data science team. Where possible Once an understanding of the problem domain has been established, it may be necessary to break down the overall problem into smaller, meaningful chunks of work to maintain team focus and ensure a realistic project scope within the given time frame. -## Listening to the end user +## Listening to the End User These problems are complex and require understanding from a variety of perspectives. It is not uncommon for the stakeholders to not be the end user of the solution framework. In these cases, listening to the actual end users is critical to the success of the project. @@ -51,7 +51,7 @@ The following questions can help guide discussion in understanding the stakehold ## Envisioning Guidance -During envisioning sessions, the following may prove useful for guiding the discussion. Many of these points are taken directly, or adapted from, [[1]](#references) and [[2]](#references). 
+During envisioning sessions, the following may prove useful for guiding the discussion. Many of these points are taken directly, or adapted from, [[1]](#resources) and [[2]](#resources). ### Problem Framing @@ -81,9 +81,9 @@ During envisioning sessions, the following may prove useful for guiding the disc 7. Who would be responsible for maintaining a solution produced during this project? 8. Are there any restrictions on tooling that must/cannot be used? -## Example - a recommendation engine problem +## Example: A Recommendation Engine Problem -To illustrate how the above process can be applied to a tangible problem domain, as an example, consider that we are looking at implementing a recommendation engine for a clothing retailer. This example was, in part, inspired by [[3]](#references). +To illustrate how the above process can be applied to a tangible problem domain, as an example, consider that we are looking at implementing a recommendation engine for a clothing retailer. This example was, in part, inspired by [[3]](#resources). Often, the objective may be simply presented, in a form such as "to improve sales". However, whilst this is ultimately the main goal, we would benefit from being more specific here. Suppose that we were to deploy a solution in November and then observed a December sales surge; how would we be able to distinguish how much of this was as a result of the new recommendation engine, as opposed to the fact that December is a peak buying season? @@ -126,11 +126,11 @@ We suggest confirming that you have access to all necessary resources (including Below are the links to the exit document template and to some questions which may be helpful in confirming resource access. 
-* [Summary of Scope Exit Document Template](./ml-envisioning-summary-template.md) -* [List of Resource Access Questions](./ml-data-exploration.md) -* [List of Data Exploration Workshop Questions](./ml-data-exploration.md) +* [Summary of Scope Exit Document Template](./envisioning-summary-template.md) +* [List of Resource Access Questions](./data-exploration.md) +* [List of Data Exploration Workshop Questions](./data-exploration.md) -## References +## Resources Many of the ideas presented here - and much more - were inspired by, and can be found in the following resources; all of which are highly recommended. diff --git a/docs/machine-learning/ml-envisioning-summary-template.md b/docs/machine-learning/envisioning-summary-template.md similarity index 89% rename from docs/machine-learning/ml-envisioning-summary-template.md rename to docs/machine-learning/envisioning-summary-template.md index fa7bcc4ce7..d2c06b8690 100644 --- a/docs/machine-learning/ml-envisioning-summary-template.md +++ b/docs/machine-learning/envisioning-summary-template.md @@ -1,6 +1,6 @@ # Generic Envisioning Summary -## Purpose of this template +## Purpose of this Template This is an example of an envisioning summary completed after envisioning sessions have concluded. It summarizes the materials reviewed, application scenarios discussed and decided, and the next steps in the process. @@ -10,29 +10,26 @@ This is an example of an envisioning summary completed after envisioning session This document is to summarize what we have discussed in these envisioning sessions, and what we have decided to work on in this machine learning (ML) engagement. With this document, we hope that everyone can be on the same page regarding the scope of this ML engagement, and will ensure a successful start for the project. -### Materials Shared with the team +### Materials Shared with the Team **List materials shared with you here. The list below contains some examples. You will want to be more specific.** 1. 
Business vision statement - 2. Sample Data - 3. Current problem statement Also discuss: 1. How the current solution is built and implemented - 2. Details about the current state of the systems and processes. ### Application Scenarios that Can Help [People] Achieve [Task] The following application scenarios were discussed: -Scenario 1: +Scenario 1: -Scenario 2: +Scenario 2: > **Add more scenarios as needed** @@ -41,11 +38,8 @@ For each scenario, provide an appropriately descriptive name and then follow up For each scenario, discuss: 1. What problem statement was discussed - 2. How we propose to solve the problem (there may be several proposals) - 3. Who would use the solution - 4. What would it look like to use our solution? An example of how it would bring value to the end user. ### Selected Scenario for this ML Engagement @@ -56,25 +50,21 @@ Why was this scenario prioritised over the others? Will other scenarios be considered in the future? When will we revisit them / what conditions need to be met to pursue them? -### More Details of the Scope for Selected Scenario +### More Details of the Scope for Selected Scenario 1. What is in scope? - 2. What data is available? - -3. Which performance metric to use? - +3. Which performance metric to use? 4. Bar of performance metrics - 5. What are deliverables? ## What’s Next? -### Legal documents to be signed +### Legal Documents to be Signed State documents and timeline -### Responsible AI Review +### Responsible AI Review Plan when to conduct a responsible AI process. What are the prerequisites to start this process? @@ -83,26 +73,22 @@ Plan when to conduct a responsible AI process. What are the prerequisites to sta A data exploration workshop is planned for **DATE RANGE**. This data exploration workshop will be **X**-**Y** days, not including the time to gain access to resources. The purpose of the data exploration workshop is as follows: 1.
Ensure the team can access the data and compute resources that are necessary for the ML feasibility study - -2. Ensure that the data provided is of quality and is relevant to the ML solution - +2. Ensure that the data provided is of quality and is relevant to the ML solution 3. Make sure that the project team has a good understanding of the data - 4. Make sure that the SMEs (Subject Matter Experts) needed are present for Data Exploration Workshop - 5. List people needed for the data exploration workshop -## ML Feasibility Study till [date] +## ML Feasibility Study until [date] ### Objectives -State what we expect to be the objective in the feasibility study +State what we expect to be the objective in the feasibility study ### Timeline -Give a possible timeline for the feasibility study +Give a possible timeline for the feasibility study -### Personnel needed +### Personnel Needed What sorts of people/roles are needed for the feasibility study? diff --git a/docs/machine-learning/ml-feasibility-study.md b/docs/machine-learning/feasibility-studies.md similarity index 85% rename from docs/machine-learning/ml-feasibility-study.md rename to docs/machine-learning/feasibility-studies.md index a0046ad43c..82f2f3b051 100644 --- a/docs/machine-learning/ml-feasibility-study.md +++ b/docs/machine-learning/feasibility-studies.md @@ -9,7 +9,7 @@ The main goal of feasibility studies is to assess whether it is feasible to solv This effort ensures quality solutions backed by the appropriate, thorough amount of consideration and evidence. -## When are feasibility studies useful? +## When are Feasibility Studies Useful? Every engagement can benefit from a feasibility study early in the project. @@ -17,18 +17,18 @@ Architectural discussions can still occur in parallel as the team works towards Feasibility studies can last between 4-16 weeks, depending on specific problem details, volume of data, state of the data etc.
Starting with a 4-week milestone might be useful, during which it can be determined how much more time, if any, is required for completion. -## Who collaborates on feasibility studies? +## Who Collaborates on Feasibility Studies? Collaboration from individuals with diverse skill sets is desired at this stage, including data scientists, data engineers, software engineers, PMs, human experience researchers, and domain experts. It embraces the use of engineering fundamentals, with some flexibility. For example, not all experimentation requires full test coverage and code review. Experimentation is typically not part of a CI/CD pipeline. Artifacts may live in the `main` branch as a folder excluded from the CI/CD pipeline, or as a separate experimental branch, depending on customer/team preferences. -## What do feasibility studies entail? +## What do Feasibility Studies Entail? -### Problem definition and desired outcome +### Problem Definition and Desired Outcome * Ensure that the problem is complex enough that coding rules or manual scaling is unrealistic * Clear definition of the problem from business and technical perspectives -### Deep contextual understanding +### Deep Contextual Understanding Confirm that the following questions can be answered based on what was learned during the Discovery Phase of the project. For items that can not be satisfactorily answered, undertake additional investigation to answer. @@ -46,16 +46,16 @@ Confirm that the following questions can be answered based on what was learned d * Share what was uncovered and understood, and the implications thereof across the engagement team and relevant stakeholders. * If the above research was conducted during the Discovery phase, it should be reviewed, and any substantial knowledge gaps should be identified and filled by following the above process. 
-### Data access +### Data Access * Verify that the full team has access to the data * Set up a dedicated and/or restricted environment if required * Perform any required de-identification or redaction of sensitive information * Understand data access requirements (retention, role-based access, etc.) -### Data discovery +### Data Discovery -* Hold a [data exploration](ml-data-exploration.md) workshop and deep dive with domain experts +* Hold a [data exploration](./data-exploration.md) workshop and deep dive with domain experts * Understand data availability and confirm the team's access * Understand the data dictionary, if available * Understand the quality of the data. Is there already a data validation strategy in place? @@ -65,12 +65,12 @@ Confirm that the following questions can be answered based on what was learned d * Ideally obtain or create an entity relationship diagram (ERD) * Potentially uncover new useful data sources -### Architecture discovery +### Architecture Discovery * Clear picture of existing architecture * Infrastructure spikes -### Concept ideation and iteration +### Concept Ideation and Iteration * Develop value proposition(s) for users and stakeholders based on the contextual understanding developed through the discovery process (e.g. 
key elements of value, benefits) * As relevant, make use of @@ -80,7 +80,7 @@ Confirm that the following questions can be answered based on what was learned d * Identify the next set of hypotheses or unknowns to be tested (see concept testing) * Revisit and iterate on the concept throughout discovery as understanding of the problem space evolves -### Exploratory data analysis (EDA) +### Exploratory Data Analysis (EDA) * Data deep dive * Understand feature and label value distributions @@ -90,7 +90,7 @@ Confirm that the following questions can be answered based on what was learned d * Pave the way of further understanding of what techniques are applicable * Establish a mutual understanding of what data is in or out of scope for feasibility, ensuring that the data in scope is significant for the business -### Data pre-processing +### Data Pre-Processing * Happens during EDA and hypothesis testing * Feature engineering @@ -98,7 +98,7 @@ Confirm that the following questions can be answered based on what was learned d * Scaling and/or discretization * Noise handling -### Hypothesis testing +### Hypothesis Testing * Design several potential solutions using theoretically applicable algorithms and techniques, starting with the simplest reasonable baseline * Train model(s) @@ -107,7 +107,7 @@ Confirm that the following questions can be answered based on what was learned d * Iterate * Thoroughly document each step and outcome, plus any resulting hypotheses for easy following of the decision-making process -### Concept testing +### Concept Testing * Where relevant, to test the value proposition, concepts or aspects of the experience * Plan user, stakeholder and expert research @@ -117,7 +117,7 @@ Confirm that the following questions can be answered based on what was learned d * Ensure that the proposed solution and framing are compatible with and acceptable to affected people * Ensure that the proposed solution and framing is compatible with existing business goals and 
context -### Risk assessment +### Risk Assessment * Identification and assessment of risks and constraints @@ -128,17 +128,17 @@ Confirm that the following questions can be answered based on what was learned d * Testing AI concept and experience elements with users and stakeholders * Discussion and feedback from diverse perspectives around any responsible AI concerns -## Output of a feasibility study - -### Possible outcomes +## Output of a Feasibility Study The main outcome is a feasibility study report, with a recommendation on next steps: -- If there is not enough evidence to support the hypothesis that this problem can be solved using ML, as aligned with the pre-determined performance measures and business impact: - * We detail the gaps and challenges that prevented us from reaching a positive outcome - * We may scope down the project, if applicable - * We may look at re-scoping the problem taking into account the findings of the feasibility study - * We assess the possibility to collect more data or improve data quality +If there is not enough evidence to support the hypothesis that this problem can be solved using ML, as aligned with the pre-determined performance measures and business impact: + +* We detail the gaps and challenges that prevented us from reaching a positive outcome +* We may scope down the project, if applicable +* We may look at re-scoping the problem taking into account the findings of the feasibility study +* We assess the possibility to collect more data or improve data quality + +If there is enough evidence to support the hypothesis that this problem can be solved using ML -- If there is enough evidence to support the hypothesis that this problem can be solved using ML - * Provide recommendations and technical assets for moving to the operationalization phase +* Provide recommendations and technical assets for moving to the operationalization phase diff --git a/docs/machine-learning/ml-fundamentals-checklist.md 
b/docs/machine-learning/ml-fundamentals-checklist.md index 7a9ef4fd5f..0fbc37315b 100644 --- a/docs/machine-learning/ml-fundamentals-checklist.md +++ b/docs/machine-learning/ml-fundamentals-checklist.md @@ -31,7 +31,7 @@ This checklist helps ensure that our ML projects meet our ML Fundamentals. The i ## Model Baseline -- [ ] Well-defined baseline model exists and its performance is calculated. ([More details on well defined baselines](ml-model-checklist.md#is-there-a-well-defined-baseline-is-the-model-performing-better-than-the-baseline)) +- [ ] Well-defined baseline model exists and its performance is calculated. ([More details on well defined baselines](./ml-model-checklist.md#is-there-a-well-defined-baseline-is-the-model-performing-better-than-the-baseline)) - [ ] The performance of other ML models can be compared with the model baseline. ## Experimentation setup @@ -45,10 +45,10 @@ This checklist helps ensure that our ML projects meet our ML Fundamentals. The i ## Production -- [ ] [Model readiness checklist](ml-model-checklist.md) reviewed. +- [ ] [Model readiness checklist](./ml-model-checklist.md) reviewed. - [ ] Model reviews were performed (covering model debugging, reviews of training and evaluation approaches, model performance). - [ ] Data pipeline for inferencing, including an end-to-end tests. - [ ] SLAs requirements for models are gathered and documented. - [ ] Monitoring of data feeds and model output. - [ ] Ensure consistent schema is used across the system with expected input/output defined for each component of the pipelines (data processing as well as models). -- [ ] [Responsible AI](responsible-ai.md) reviewed. +- [ ] [Responsible AI](./responsible-ai.md) reviewed. 
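The checklist item about a consistent schema with expected input/output per component can be made concrete with a lightweight runtime check. The sketch below is illustrative only — the field names and the `validate_record` helper are hypothetical, not part of the playbook:

```python
def validate_record(record, schema):
    """Check one inference payload against an expected schema of
    {field: type}; returns a list of problems (empty means valid)."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems

# Hypothetical model-input schema for a scoring endpoint
SCHEMA = {"customer_id": str, "basket_value": float}
print(validate_record({"customer_id": "c1", "basket_value": 9.99}, SCHEMA))  # []
print(validate_record({"customer_id": 42}, SCHEMA))
```

Running the same check in both the data-processing and model components helps catch schema drift between pipeline stages early.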
diff --git a/docs/machine-learning/ml-model-checklist.md b/docs/machine-learning/ml-model-checklist.md index ae4d027b97..207d1f52d7 100644 --- a/docs/machine-learning/ml-model-checklist.md +++ b/docs/machine-learning/ml-model-checklist.md @@ -1,4 +1,4 @@ -# ML model production checklist +# ML Model Production Checklist The purpose of this checklist is to make sure that: @@ -25,7 +25,7 @@ Before putting an individual ML model into production, the following aspects sho Please note that there might be scenarios where it is not possible to check all the items on this checklist. However, it is advised to go through all items and make informed decisions based on your specific use case. -## Will your model performance be different in production than during training phase +## Will Your Model Performance be Different in Production than During the Training Phase Once deployed into production, the model might be performing much worse than expected. This poor performance could be a result of: @@ -33,7 +33,7 @@ Once deployed into production, the model might be performing much worse than exp - The feature engineering steps are different or inconsistent in production compared to the training process - The performance measure is not consistent (for example your test set covers several months of data where the performance metric for production has been calculated for one month of data) -### Is there a well-defined baseline? Is the model performing better than the baseline? +### Is there a Well-Defined Baseline? Is the Model Performing Better than the Baseline? A good way to think of a model baseline is the simplest model one can come up with: either a simple threshold, a random guess or a very basic linear model. This baseline is the reference point your model needs to outperform. A well-defined baseline is different for each problem type and there is no one size fits all approach. 
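To make the "simplest model" idea concrete, a most-frequent-value baseline takes only a few lines of plain Python. This is an illustrative sketch (the helper name and data are made up), not a prescribed implementation:

```python
from collections import Counter

def most_frequent_baseline(train_labels, test_labels):
    """Predict the most common training label for every test sample
    and report the resulting accuracy as the reference to outperform."""
    majority = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for y in test_labels if y == majority)
    return majority, correct / len(test_labels)

# Illustrative labels only
train = [0, 0, 0, 1, 0, 1, 0, 0]
test = [0, 1, 0, 0]
label, accuracy = most_frequent_baseline(train, test)
print(label, accuracy)  # majority label 0, baseline accuracy 0.75
```

Any candidate model that cannot beat this number has not yet demonstrated value over the baseline.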
@@ -49,14 +49,14 @@ Some questions to ask when comparing to a baseline: - How does your model performance compare to applying a simple threshold? - How does your model compare with always predicting the most common value? -**Note**: In some cases, human parity might be too ambitious as a baseline, but this should be decided on a case by case basis. Human accuracy is one of the available options, but not the only one. +> **Note**: In some cases, human parity might be too ambitious as a baseline, but this should be decided on a case by case basis. Human accuracy is one of the available options, but not the only one. Resources: - ["How To Get Baseline Results And Why They Matter" article](https://machinelearningmastery.com/how-to-get-baseline-results-and-why-they-matter/) - ["Always start with a stupid model, no exceptions." article](https://blog.insightdatascience.com/always-start-with-a-stupid-model-no-exceptions-3a22314b9aaa) -### Are machine learning performance metrics defined for both training and scoring? +### Are Machine Learning Performance Metrics Defined for Both Training and Scoring? The methodology of translating the training metrics to scoring metrics should be well-defined and understood. Depending on the data type and model, the model metrics calculation might differ in production and in training. For example, the training procedure calculates metrics for a long period of time (a year, a decade) with different seasonal characteristics while the scoring procedure will calculate the metrics for a restricted time interval (for example a week, a month, a quarter). Well-defined ML performance metrics are essential in production so that a decrease or increase in model performance can be accurately detected. @@ -68,11 +68,11 @@ Things to consider: - If sampling techniques (over-sampling, under-sampling) are used to train the model when classes are imbalanced, ensure the metrics used during training are comparable with the ones used in scoring.
- If the number of samples used for training and testing is small, the performance metrics might change significantly as new data is scored. -### Is the model benchmarked? +### Is the Model Benchmarked? The trained model to be put into production is well benchmarked if machine learning performance metrics (such as accuracy, recall, RMSE or whatever is appropriate) are measured on the train and test set. Furthermore, the train and test set split should be well documented and reproducible. -### Can ground truth be obtained or inferred in production? +### Can Ground Truth be Obtained or Inferred in Production? Without a reliable ground truth, the machine learning metrics cannot be calculated. It is important to identify if the ground truth can be obtained as the model is scoring new data by either manual or automatic means. If the ground truth cannot be obtained systematically, other proxies and methodology should be investigated in order to obtain some measure of model performance. @@ -84,7 +84,7 @@ For clarity, let's consider the following examples (by no means an exhaustive li - **Recommender systems**: For recommender system, obtaining the ground truth is a complex problem in most cases as there is no way of identifying the ideal recommendation. For a retail website for example, click/not click, buy/not buy or other user interaction with recommendation can be used as ground truth proxies. - **Object detection in images**: For an object detection model, as new images are scored, there are no new labels being generated automatically. One option to obtain the ground truth for the new images is to use people to manually label the images. Human labelling is costly, time-consuming and not 100% accurate, so in most cases, only a subset of images can be labelled. These samples can be chosen at random or by using active learning techniques of selecting the most informative unlabeled samples. 
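The benchmarking point above — that the train/test split should be documented and reproducible — can be sketched with a fixed random seed. The helper name and parameters are assumptions for illustration:

```python
import random

def reproducible_split(samples, test_fraction=0.2, seed=42):
    """Shuffle with a fixed, documented seed so that the exact same
    train/test split can be recreated whenever the benchmark is rerun."""
    rng = random.Random(seed)  # record the seed alongside reported metrics
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train, test = reproducible_split(list(range(10)))
train2, test2 = reproducible_split(list(range(10)))
assert (train, test) == (train2, test2)  # identical split on every run
```

Documenting the seed and the split logic together with the measured metrics keeps the benchmark reproducible by anyone on the team.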
-### Has the data distribution of training, testing and validation sets been analyzed? +### Has the Data Distribution of Training, Testing and Validation Sets Been Analyzed? The data distribution of your training, test and validation (if applicable) dataset (including labels) should be analyzed to ensure they all come from the same distribution. If this is not the case, some options to consider are: re-shuffling, re-sampling, modifying the data, more samples need to be gathered or features removed from the dataset. @@ -98,7 +98,7 @@ Resources: - ["Splitting into train, dev and test" tutorial](http://cs230.stanford.edu/blog/split/) -### Have goals and hard limits for performance, speed of prediction and costs been established, so they can be considered if trade-offs need to be made? +### Have Goals and Hard Limits for Performance, Speed of Prediction and Costs been Established, so they can be Considered if Trade-Offs Need to be Made? Some machine learning models achieve high ML performance, but they are costly and time-consuming to run. In those cases, a less performant and cheaper model could be preferred. Hence, it is important to calculate the model performance metrics (accuracy, precision, recall, RMSE etc), but also to gather data on how expensive it will be to run the model and how long it will take to run. Once this data is gathered, an informed decision should be made on what model to productionize. @@ -108,7 +108,7 @@ System metrics to consider: - Cost per prediction - Time taken to make a prediction -### How will the model be integrated into other systems, and what impact will it have? +### How Will the Model be Integrated into Other Systems, and what Impact will it Have? Machine Learning models do not exist in isolation, but rather they are part of a much larger system. These systems could be old, proprietary systems or new systems being developed as a result of the creation of a new machine learning model.
In both of those cases, it is important to understand where the actual model is going to fit in, what output is expected from the model and how that output is going to be used by the larger system. Additionally, it is essential to decide if the model will be used for batch and/or real-time inference as production paths might differ. @@ -120,7 +120,7 @@ Possible questions to assess model impact: - Is the system transparent that there is a model making a prediction and what data is used to make this prediction? - What is the cost of a wrong prediction? -### How will incoming data quality be monitored? +### How Will Incoming Data Quality be Monitored? As data systems become increasingly complex in the mainstream, it is especially vital to employ data quality monitoring, alerting and rectification protocols. Following data validation best practices can prevent insidious issues from creeping into machine learning models that, at best, reduce the usefulness of the model, and at worst, introduce harm. Data validation reduces the risk of data downtime (increasing headroom) and technical debt and supports long-term success of machine learning models and other applications that rely on the data. @@ -136,7 +136,7 @@ Resources: - ["Data Quality Fundamentals" by Moses et al.](https://www.oreilly.com/library/view/data-quality-fundamentals/9781098112035/) -### How will drift in data characteristics be monitored? +### How Will Drift in Data Characteristics be Monitored? Data drift detection uncovers legitimate changes in incoming data that are truly representative of the phenomenon being modeled, and are not erroneous (ex. user preferences change). It is imperative to understand if the new data in production will be significantly different from the data in the training phase. It is also important to check that the data distribution information can be obtained for any of the new data coming in.
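One common way to quantify the drift described above is the Population Stability Index (PSI) over binned feature values. The sketch below implements the standard PSI formula; the histograms are illustrative, and the quoted thresholds are the widely used rule of thumb rather than a playbook requirement:

```python
import math

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two histograms given as
    per-bin fractions (each list sums to 1). Rule of thumb:
    < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 significant drift."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # guard against empty bins
        total += (a - e) * math.log(a / e)
    return total

# Training-time distribution vs. two hypothetical production snapshots
stable = psi([0.25, 0.25, 0.25, 0.25], [0.24, 0.26, 0.25, 0.25])   # near 0
shifted = psi([0.25, 0.25, 0.25, 0.25], [0.05, 0.15, 0.30, 0.50])  # above 0.25
print(stable, shifted)
```

Computed per feature on a schedule, such a score can feed the alerting and retraining decisions discussed in this section.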
Drift monitoring can inform when changes are occurring and what their characteristics are (ex. abrupt vs gradual) and guide effective adaptation or retraining strategies to maintain performance. @@ -152,7 +152,7 @@ Resources: - ["Learning Under Concept Drift: A Review" by Lu et al.](https://arxiv.org/pdf/2004.05785.pdf) - [Understanding dataset shift](https://towardsdatascience.com/understanding-dataset-shift-f2a5a262a766) -### How will performance be monitored? +### How Will Performance be Monitored? It is important to define how the model will be monitored when it is in production and how that data is going to be used to make decisions. For example, deciding when a model needs retraining because its performance has degraded, and how to identify the underlying causes of that degradation, could be part of this monitoring methodology.
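As a minimal illustration of such a monitoring methodology, a rolling-window metric with an alert threshold can flag degradation. The function name, window size and threshold below are assumptions for the sketch, not prescribed values:

```python
def rolling_accuracy_alert(outcomes, window=50, threshold=0.8):
    """Compute accuracy over a sliding window of recent correctness
    flags (1 = correct prediction, 0 = incorrect) and collect the
    windows that fall below the agreed threshold, e.g. to trigger
    a retraining review."""
    alerts = []
    for end in range(window, len(outcomes) + 1):
        acc = sum(outcomes[end - window:end]) / window
        if acc < threshold:
            alerts.append((end, acc))
    return alerts

# Illustrative stream: the model degrades after the 60th prediction
stream = [1] * 60 + [0] * 20
print(rolling_accuracy_alert(stream, window=20, threshold=0.8))
```

In practice the correctness flags would come from delayed ground truth or a proxy, and the alerts would feed the warning and retraining decisions listed below in this section.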
To handle this uncertainty as much as possible, we propose a semi-structured process, balancing between engineering/research best practices and rapid model/data exploration. -## Model experimentation goals +## Model Experimentation Goals - **Performance**: Find the best performing solution - **Operationalization**: Keep an eye towards production, making sure that operationalization is feasible @@ -13,7 +13,7 @@ To handle this uncertainty as much as possible, we propose a semi-structured pro - **Reproducibility**: Keep research active by allowing experiment tracking and reproducibility - **Collaboration**: Foster the collaboration and joint work of multiple people on the team -## Model experimentation challenges +## Model Experimentation Challenges - **Trial and error process**: Difficult to plan and estimate durations and capacity. - **Quick and dirty**: We want to fail fast and get a sense of what’s working efficiently. @@ -28,7 +28,7 @@ while trusting the framework to do the rest. The following tools and guidelines are aimed at achieving experimentation goals as well as addressing the aforementioned challenges. -## Tools and guidelines for successful model experimentation +## Tools and Guidelines for Successful Model Experimentation - [Virtual environments](#virtual-environments) - [Source control and folder/package structure](#source-control-and-folder-or-package-structure) @@ -36,17 +36,17 @@ The following tools and guidelines are aimed at achieving experimentation goals - [Datasets and models abstractions](#datasets-and-models-abstractions) - [Model evaluation](#model-evaluation) -### Virtual environments +### Virtual Environments In languages like Python and R, it is always advised to employ virtual environments. Virtual environments facilitate reproducibility, collaboration and productization. Virtual environments allow us to be consistent across our local dev envs as well as with compute resources. 
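As a minimal illustration of the setup (names like `.venv` and `requirements.txt` are common conventions, not prescribed by the playbook), an isolated environment can be created and its dependency set snapshotted for reproducibility using the standard library:

```python
import subprocess
import sys
import venv
from pathlib import Path

# Create an isolated environment for the project (".venv" is a common convention).
venv.create(".venv", with_pip=True)

# The environment ships its own interpreter and pip, separate from the system ones.
bin_dir = "Scripts" if sys.platform == "win32" else "bin"
env_python = Path(".venv") / bin_dir / "python"

# Snapshot the installed dependency set so the environment can be rebuilt
# consistently on another machine.
frozen = subprocess.run([str(env_python), "-m", "pip", "freeze"],
                        capture_output=True, text=True, check=True).stdout
Path("requirements.txt").write_text(frozen)
```

In day-to-day use the same steps are usually run from the shell (`python -m venv .venv`, `pip freeze > requirements.txt`); the point is that the environment's configuration file travels with the repo.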
These environments' configuration files can be used to build the code from source in a consistent way. For more details on why we need virtual environments visit [this blog post](https://realpython.com/python-virtual-environments-a-primer/#why-the-need-for-virtual-environments). -#### Which virtual environment framework should I choose +#### Which Virtual Environment Framework Should I Choose All virtual environment frameworks create isolation; some also provide dependency management and additional features. The decision on which framework to use depends on the complexity of the development environment (dependencies and other required resources) and on the ease of use of the framework. -#### Types of virtual environments +#### Types of Virtual Environments In ISE, we often choose from either `venv`, `Conda` or `Poetry`, depending on the project requirements and complexity. @@ -54,44 +54,44 @@ In ISE, we often choose from th - [Conda](https://docs.conda.io/en/latest/) is a popular package, dependency and environment management framework. It supports multiple stacks (Python, R) and multiple versions of the same environment (e.g. multiple Python versions). `Conda` maintains its own package repository, therefore some packages might not be downloaded and managed directly through `Conda`. - [Poetry](https://python-poetry.org/) is a Python dependency management system which manages dependencies in a standard way using `pyproject.toml` files and `lock` files. Similar to `Conda`, `Poetry`'s dependency resolution process is sometimes slow (see [FAQ](https://python-poetry.org/docs/faq/#why-is-the-dependency-resolution-process-slow)), but in cases where dependency issues are common or tricky, it provides a robust way to create reproducible and stable environments. -#### Expected outcomes for virtual environments setup +#### Expected Outcomes for Virtual Environments Setup 1.
Documentation describing how to create the selected virtual environment and how to install dependencies. 2. Environment configuration files if applicable (e.g. `requirements.txt` for `venv`, [environment.yml](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file) for `Conda` or [pyproject.toml](https://python-poetry.org/docs/pyproject/) for `Poetry`). -#### Virtual environments benefits +#### Virtual Environments Benefits - Productization - Collaboration - Reproducibility -### Source control and folder or package structure +### Source Control and Folder or Package Structure Applied ML projects often contain source code, notebooks, devops scripts, documentation, scientific resources, datasets and more. We recommend coming up with an agreed folder structure to keep resources tidy. Consider deciding upon a generic folder structure for projects (e.g. which contains the folders `data`, `src`, `docs` and `notebooks`), or adopt popular structures like the [CookieCutter Data Science](https://drivendata.github.io/cookiecutter-data-science/) folder structure. [Source control](../source-control/README.md) should be applied to allow collaboration, versioning, code reviews, traceability and backup. In data science projects, source control should be used for code, and the storing and versioning of other artifacts (e.g. data, scientific literature) should be decided upon depending on the scenario. -#### Folder structure and source control expected outcomes +#### Folder Structure and Source Control Expected Outcomes - Defined folder structure for all users to use, pushed to the repo. - [.gitignore](https://git-scm.com/docs/gitignore) file determining which folders should be synced with `git` and which should be kept locally. For example, [this one](https://github.com/drivendata/cookiecutter-data-science/blob/master/%7B%7B%20cookiecutter.repo_name%20%7D%7D/.gitignore).
- Determine how notebooks are stored and versioned (e.g. [strip output from Jupyter notebooks](https://github.com/kynan/nbstripout)) -#### Source control and folder structure benefits +#### Source Control and Folder Structure Benefits - Collaboration - Reproducibility - Code quality -### Experiment tracking +### Experiment Tracking Experiment tracking tools allow data scientists and researchers to keep track of previous experiments for better understanding of the experimentation process and for the reproducibility of experiments or models. -#### Types of experiment tracking frameworks +#### Types of Experiment Tracking Frameworks Experiment tracking frameworks differ by the set of features they provide for collecting experiment metadata, and comparing and analyzing experiments. In ISE, we mainly use [MLFlow](https://mlflow.org/) on [Databricks](https://databricks.com/product/managed-mlflow) or [Azure ML Experimentation](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-track-experiments). Note that some experiment tracking frameworks require a deployment, while others are SaaS. -#### Experiment tracking outcomes +#### Experiment Tracking Outcomes 1. Decide on an experiment tracking framework 2. Ensure it is accessible to all users @@ -99,17 +99,16 @@ Experiment tracking frameworks differ by the set of features they provide for co 4. Define datasets and evaluation in a way which will allow the comparison of all experiments. **Consistency across datasets and evaluation is paramount for experiment comparison**. 5. Ensure full reproducibility by assuring that all required details are tracked (i.e. 
dataset names and versions, parameters, code, environment) -#### Experiment tracking benefits +#### Experiment Tracking Benefits - Model performance - Reproducibility - Collaboration - Code quality -### Datasets and models abstractions +### Datasets and Models Abstractions -By creating abstractions to building blocks (e.g., datasets, models, evaluators), -we allow the easy introduction of new logic into the experimentation pipeline while keeping the agreed upon experimentation flow intact. +By creating abstractions for building blocks (e.g., datasets, models, evaluators), we allow the easy introduction of new logic into the experimentation pipeline while keeping the agreed-upon experimentation flow intact. These abstractions can be created using different mechanisms. For example, we can use Object-Oriented Programming (OOP) solutions like abstract classes: @@ -117,14 +116,14 @@ For example, we can use Object-Oriented Programming (OOP) solutions like abstrac - [An example from scikit-learn describing the creation of new estimators compatible with the API](https://scikit-learn.org/stable/developers/develop.html). - [An example from PyTorch on extending the abstract `Dataset` class](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html#dataset-class). -#### Abstraction outcomes +#### Abstraction Outcomes 1. Different building blocks have defined APIs allowing them to be replaced or extended. 2. Replacing building blocks does not break the original experimentation flow. 3. Mock building blocks are used for unit tests. 4. APIs/mocks are shared with the engineering teams for integration with other modules.
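A minimal sketch of the abstract-class approach (the class names here are illustrative, not taken from scikit-learn or PyTorch): the building blocks share a small API, so a concrete implementation can be swapped without touching the experimentation flow:

```python
from abc import ABC, abstractmethod

class Dataset(ABC):
    """Common API every dataset implementation must provide."""
    @abstractmethod
    def load(self) -> list: ...

class Model(ABC):
    """Common API every model implementation must provide."""
    @abstractmethod
    def predict(self, samples: list) -> list: ...

class InMemoryDataset(Dataset):
    def __init__(self, samples):
        self.samples = samples
    def load(self):
        return self.samples

class MajorityClassModel(Model):
    """Trivial baseline; a real model would be a drop-in replacement."""
    def predict(self, samples):
        return [1] * len(samples)

def run_experiment(dataset: Dataset, model: Model) -> list:
    # The flow depends only on the abstractions, not on concrete classes.
    return model.predict(dataset.load())

assert run_experiment(InMemoryDataset([10, 20, 30]), MajorityClassModel()) == [1, 1, 1]
```

Because `run_experiment` only sees the abstract types, a new dataset or model plugs in without changing the pipeline, and mock implementations of the same interfaces can back the unit tests.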
-#### Abstraction benefits +#### Abstraction Benefits - Collaboration - Code quality @@ -132,7 +131,7 @@ For example, we can use Object-Oriented Programming (OOP) solutions like abstrac - Operationalization - Model performance -### Model evaluation +### Model Evaluation When deciding on the evaluation of the ML model/process, consider the following checklist: @@ -142,7 +141,7 @@ When deciding on the evaluation of the ML model/process, consider the following - [ ] Evaluation code is unit-tested and reviewed by all team members. - [ ] Evaluation flow facilitates further results and error analysis. -## Evaluation development process outcomes +## Evaluation Development Process Outcomes 1. Evaluation strategy is agreed upon by all stakeholders 2. Research and discussion on various evaluation methods and metrics is documented. @@ -150,7 +149,7 @@ When deciding on the evaluation of the ML model/process, consider the following 4. Documentation on how to apply evaluation is reviewed. 5. Performance metrics are automatically tracked into the experiment tracker. -## Evaluation development process benefits +## Evaluation Development Process Benefits - Model performance - Code quality diff --git a/docs/machine-learning/ml-profiling.md b/docs/machine-learning/profiling-ml-and-mlops-code.md similarity index 95% rename from docs/machine-learning/ml-profiling.md rename to docs/machine-learning/profiling-ml-and-mlops-code.md index 7d3ad4664e..9f112dddb5 100644 --- a/docs/machine-learning/ml-profiling.md +++ b/docs/machine-learning/profiling-ml-and-mlops-code.md @@ -11,7 +11,7 @@ Below are some common scenarios in MLOps/Data Science projects, along with sugge - [PyTorch model training profiling](#pytorch-model-training-profiling) - [Azure Machine Learning pipeline profiling](#azure-machine-learning-pipeline-profiling) -## Generic Python profiling +## Generic Python Profiling Usually an MLOps/Data Science solution contains plain Python code serving different purposes (e.g.
data processing) along with specialized model training code. Although many Machine Learning frameworks provide their own profiler, @@ -52,9 +52,9 @@ python -m cProfile [-o output_file] [-s sort_order] (-m module | myscript.py) > Note: one epoch of model training is usually enough for profiling. There's no need to run more epochs and incur additional cost. -Refer to [The Python Profilers](https://docs.python.org/3/library/profile.html) for further details. +Refer to [The Python Profilers](https://docs.python.org/3/library/profile.html) for further details. -## PyTorch model training profiling +## PyTorch Model Training Profiling PyTorch 1.8 includes an updated PyTorch [profiler](https://pytorch.org/blog/introducing-pytorch-profiler-the-new-and-improved-performance-tool/) @@ -92,7 +92,7 @@ More information on *PyTorch profiler*: - [PyTorch Profiler Recipe](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) - [Introducing PyTorch Profiler - the new and improved performance tool](https://pytorch.org/blog/introducing-pytorch-profiler-the-new-and-improved-performance-tool/) -## Azure Machine Learning pipeline profiling +## Azure Machine Learning Pipeline Profiling In our projects we often use [Azure Machine Learning](https://azure.microsoft.com/en-us/services/machine-learning/) pipelines to train Machine Learning models. Most of the profilers can also be used in conjunction with Azure Machine Learning. @@ -118,5 +118,5 @@ Boolean flag as a pipeline parameter The results can be found in the `Outputs+logs` tab, under `outputs/profiler_results` folder. 6. You might want to download the results and visualize them locally. -> Note: it's not recommended to run profilers simultaneously. Profiles also consume resources, therefore a simultaneous run +> **Note:** it's not recommended to run profilers simultaneously. Profilers also consume resources, therefore a simultaneous run
diff --git a/docs/machine-learning/ml-proposed-process.md b/docs/machine-learning/proposed-ml-process.md similarity index 86% rename from docs/machine-learning/ml-proposed-process.md rename to docs/machine-learning/proposed-ml-process.md index 29b79a3ee6..fcf934390d 100644 --- a/docs/machine-learning/ml-proposed-process.md +++ b/docs/machine-learning/proposed-ml-process.md @@ -35,18 +35,18 @@ The proposed ML development process consists of: * Deployment * Monitoring and Observability -### Version control +### Version Control -* During all stages of the process, it is suggested that artifacts should be version-controlled. Typically, the process is iterative and versioned artifacts can assist in traceability and reviewing. See more [here](ml-experimentation.md#source-control-and-folder-or-package-structure). +* During all stages of the process, it is suggested that artifacts should be [version-controlled](./model-experimentation.md#source-control-and-folder-or-package-structure). Typically, the process is iterative and versioned artifacts can assist in traceability and reviewing. -### Understanding the problem +### Understanding the Problem * Define the business problem for the ML project: * Agree on the success criteria with the customer. * Identify potential data sources and determine the availability of these sources. * Define performance evaluation metrics on ground truth data -* Conduct a [Responsible AI assessment](responsible-ai.md) to ensure development and deployment of the ML solution in a responsible manner. -* Conduct a feasibility study to assess whether the business problem is feasible to solve satisfactorily using ML with the available data. The objective of the feasibility study is to mitigate potential over-investment by ensuring sufficient evidence that ML is possible and would be the best solution. The study also provides initial indications of what the ML solution should look like. 
This ensures quality solutions supported by thorough consideration and evidence. Refer to [feasibility study](ml-feasibility-study.md). +* Conduct a [Responsible AI assessment](./responsible-ai.md) to ensure development and deployment of the ML solution in a responsible manner. +* Conduct a feasibility study to assess whether the business problem is feasible to solve satisfactorily using ML with the available data. The objective of the feasibility study is to mitigate potential over-investment by ensuring sufficient evidence that ML is possible and would be the best solution. The study also provides initial indications of what the ML solution should look like. This ensures quality solutions supported by thorough consideration and evidence. Refer to [feasibility study](./feasibility-studies.md). * Exploratory data analysis is performed and discussed with the team * **Typical output**: @@ -67,14 +67,14 @@ The proposed ML development process consists of: * **Typical output**: Rough Jupyter notebooks or scripts in Python or R, initial results from baseline model. -For more information on experimentation, refer to the [experimentation](ml-experimentation.md) section. +For more information on experimentation, refer to the [experimentation](./model-experimentation.md) section. ### Model Evaluation * Compare the effectiveness of different algorithms on the given problem. * **Typical output**: - * Evaluation flow is [fully set up](ml-experimentation.md#model-evaluation). + * Evaluation flow is [fully set up](./model-experimentation.md#model-evaluation). * Reproducible experiments for the different approaches experimented with. ### Model Operationalization @@ -88,22 +88,22 @@ For more information on experimentation, refer to the [experimentation](ml-exper * Training a model * CI/CD scripts. * Reproducibility steps for the model in production. - * See more [here](ml-model-checklist.md). + * See more [in the ML model checklist](./ml-model-checklist.md). 
#### Unit and Integration Testing * Ensuring that production code behaves in the way we expect it to, and that its results match those we saw during the Model Evaluation and Experimentation phases. -* Refer to [ML testing](ml-testing.md) post for further details. +* Refer to [ML testing](./testing-data-science-and-mlops-code.md) post for further details. * **Typical output**: Test suite with unit and end-to-end tests is created and completes successfully. #### Deployment -* [Responsible AI](responsible-ai.md) considerations such as bias and fairness analysis. Additionally, explainability/interpretability of the model should also be considered. +* [Responsible AI](./responsible-ai.md) considerations such as bias and fairness analysis. Additionally, explainability/interpretability of the model should also be considered. * It is recommended for a human-in-the-loop to verify the model and manually approve deployment to production. * Getting the model into production where it can start adding value by serving predictions. Typical artifacts are APIs for accessing the model and integrating the model to the solution architecture. * Additionally, certain scenarios may require training the model periodically in production. * Reproducibility steps of the production model are available. -* **Typical output**: [model readiness checklist](ml-model-checklist.md) is completed. +* **Typical output**: [model readiness checklist](./ml-model-checklist.md) is completed. #### Monitoring and Observability diff --git a/docs/machine-learning/responsible-ai.md b/docs/machine-learning/responsible-ai.md index 4d9dd8d8a3..ce89525900 100644 --- a/docs/machine-learning/responsible-ai.md +++ b/docs/machine-learning/responsible-ai.md @@ -32,6 +32,6 @@ The process begins as soon as we start a prospective project. We start to comple - Have measures for re-training been considered? - How can we address any concerns that arise, and how can we mitigate risk? 
-At this point we research available [tools and resources](https://www.microsoft.com/en-us/ai/responsible-ai-resources), such as [InterpretML](https://interpret.ml/) or [Fairlearn](https://github.com/fairlearn/fairlearn), that we may use on the project. We may change the project scope or re-define the [ML problem definition](ml-problem-formulation-envisioning.md) if necessary. +At this point we research available [tools and resources](https://www.microsoft.com/en-us/ai/responsible-ai-resources), such as [InterpretML](https://interpret.ml/) or [Fairlearn](https://github.com/fairlearn/fairlearn), that we may use on the project. We may change the project scope or re-define the [ML problem definition](./envisioning-and-problem-formulation.md) if necessary. -The Responsible AI review documents remain living documents that we re-visit and update throughout project development, through the [feasibility study](ml-feasibility-study.md), as the model is developed and prepared for production, and new information unfolds. The documents can be used and expanded once the model is deployed, and monitored in production. +The Responsible AI review documents remain living documents that we re-visit and update throughout project development, through the [feasibility study](./feasibility-studies.md), as the model is developed and prepared for production, and new information unfolds. The documents can be used and expanded once the model is deployed, and monitored in production. 
diff --git a/docs/machine-learning/ml-testing.md b/docs/machine-learning/testing-data-science-and-mlops-code.md similarity index 94% rename from docs/machine-learning/ml-testing.md rename to docs/machine-learning/testing-data-science-and-mlops-code.md index 51bf543c09..fd87410441 100644 --- a/docs/machine-learning/ml-testing.md +++ b/docs/machine-learning/testing-data-science-and-mlops-code.md @@ -12,11 +12,11 @@ Below are some common operations in MLOps or Data Science projects, along with s * [Data validation](#data-validation) * [Model testing](#model-testing) -## Saving and loading data +## Saving and Loading Data Reading and writing to csv, reading images or loading audio files are common scenarios encountered in MLOps projects. -### Example: Verify that a load function calls read_csv if the file exists +### Example: Verify that a Load Function Calls read_csv if the File Exists `utils.py` @@ -36,7 +36,7 @@ One way to do this would be to provide a sample file and call the function, and A much better way is to **mock** calls to `isfile` and `read_csv`. Instead of calling the real function, we will return a predefined return value, or call a stub that doesn't have any side effects. This way no files are needed in the repository to execute the test, and the test will always work the same, independent of what machine it runs on. -> Note: Below we mock the specific os and pd functions referenced in the utils file, any others are left unaffected and would run as normal. +> **Note:** Below we mock the specific os and pd functions referenced in the utils file; any others are left unaffected and would run as normal.
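The playbook's `utils.py`/`test_utils.py` bodies are elided from this diff, so here is a stdlib-only sketch of the same mocking pattern — the `load_data` function is hypothetical, and the `csv` module stands in for pandas to keep the example dependency-free:

```python
import csv
import os
from unittest import mock

def load_data(path):
    """Hypothetical stand-in for the playbook's utils function: check the
    file exists, then parse it (the real sample uses pandas.read_csv)."""
    if os.path.isfile(path):
        with open(path, newline="") as f:
            return list(csv.reader(f))
    return None

# Patch the collaborators: no real file is touched, and the test behaves
# the same on every machine.
with mock.patch("os.path.isfile", return_value=True) as m_isfile, \
     mock.patch("builtins.open", mock.mock_open(read_data="price,area\n100,1\n")):
    assert load_data("houses.csv") == [["price", "area"], ["100", "1"]]

m_isfile.assert_called_once_with("houses.csv")

# And the negative case: nothing is read when the file is absent.
with mock.patch("os.path.isfile", return_value=False):
    assert load_data("missing.csv") is None
```

The same pattern applies with pandas: patch `os.path.isfile` and `pandas.read_csv` and assert on call counts, exactly as the playbook's test does below.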
`test_utils.py` @@ -80,11 +80,11 @@ def test_load_data_does_not_call_read_csv_if_not_exists(mock_isfile, mock_read_c assert utils.pd.read_csv.call_count == 0 ``` -### Example: Using the same sample data for multiple tests +### Example: Using the Same Sample Data for Multiple Tests If more than one test will use the same sample data, fixtures are a good way to reuse this sample data. The sample data can be the contents of a json file, or a csv, or a DataFrame, or even an image. -> Note: The sample data is still hard coded if possible, and does not need to be large. Only add as much sample data as required for the tests to make the tests readable. +> **Note:** The sample data should still be hard-coded if possible, and does not need to be large. Only add as much sample data as required to keep the tests readable. Use the fixture to return the sample data, and add this as a parameter to the tests where you want to use the sample data. @@ -105,7 +105,7 @@ def test_extract_features_extracts_price_per_area(house_features_json): assert extracted_features['price_per_area'] == 100 ``` -## Transforming data +## Transforming Data For cleaning and transforming data, test fixed input and output, but try to limit each test to one verification. @@ -161,7 +161,7 @@ def test_resize_image_generates_the_correct_size(orig_height, orig_width, expect assert resized_image.shape[:2] == (expected_height, expected_width) ``` -## Model load or predict +## Model Load or Predict When **unit** testing we should mock model load and model predictions similarly to mocking file access.
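Continuing that idea, a small sketch of mocking the model object itself (the `classify` helper is hypothetical, not from the playbook): a `Mock` stands in for the trained model, so the unit test never loads real weights and exercises only our post-processing logic:

```python
from unittest import mock

def classify(model, features):
    """Hypothetical inference helper: delegate to the model, then
    post-process its raw score into a label."""
    score = model.predict(features)
    return "positive" if score >= 0.5 else "negative"

# The Mock replaces the trained model: no weights on disk are needed.
fake_model = mock.Mock()
fake_model.predict.return_value = 0.9

assert classify(fake_model, [1.0, 2.0]) == "positive"
fake_model.predict.assert_called_once_with([1.0, 2.0])

fake_model.predict.return_value = 0.2
assert classify(fake_model, [0.0]) == "negative"
```

Integration tests, by contrast, would load a real (possibly small) model to verify the end-to-end path.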
diff --git a/docs/machine-learning/ml-tpm-guidance.md b/docs/machine-learning/tpm-considerations-for-ml-projects.md similarity index 82% rename from docs/machine-learning/ml-tpm-guidance.md rename to docs/machine-learning/tpm-considerations-for-ml-projects.md index 570d044631..19a15b9b7a 100644 --- a/docs/machine-learning/ml-tpm-guidance.md +++ b/docs/machine-learning/tpm-considerations-for-ml-projects.md @@ -2,26 +2,26 @@ In this document, we explore some of the Program Management considerations for Machine Learning (ML) projects and suggest recommendations for Technical Program Managers (TPM) to effectively work with Data and Applied Machine Learning engineering teams. -## Determine the need for Machine Learning in the project +## Determine the Need for Machine Learning in the Project In Artificial Intelligence (AI) projects, the ML component is generally a part of an overall business problem and **NOT** the problem itself. Determine the overall business problem first and then evaluate if ML can help address a part of the problem space. A few considerations for identifying the right fit for the project: -- Engage experts in human experience and employ techniques such as [Design Thinking](https://www.microsoft.com/en-us/haxtoolkit/ai-guidelines/) and [Problem Formulation](ml-problem-formulation-envisioning.md) to **understand the customer needs** and human behavior first. Identify the right stakeholders from both business and technical leadership and invite them to these workshops. The outcome should be end-user scenarios and [personas](https://en.wikipedia.org/wiki/Persona_(user_experience)) to determine the real needs of the users. +- Engage experts in human experience and employ techniques such as [Design Thinking](https://www.microsoft.com/en-us/haxtoolkit/ai-guidelines/) and [Problem Formulation](./envisioning-and-problem-formulation.md) to **understand the customer needs** and human behavior first.
Identify the right stakeholders from both business and technical leadership and invite them to these workshops. The outcome should be end-user scenarios and [personas](https://en.wikipedia.org/wiki/Persona_(user_experience)) to determine the real needs of the users. - Focus on [System Design](https://learn.microsoft.com/en-us/azure/architecture/data-guide/big-data/ai-overview) principles to identify the architectural components, entities, interfaces, constraints. Ask the right questions early and explore design alternatives with the engineering team. - Think hard about the **costs of ML** and whether we are solving a repetitive problem at scale. Many times, customer problems can be solved with data analytics, dashboards, or rule-based algorithms as the first phase of the project. -### Set Expectations for high ambiguity in ML components +### Set Expectations for High Ambiguity in ML Components ML projects can be plagued with a phenomenon we can call the "**Death by Unknowns**". Unlike software engineering projects, ML-focused projects can result in quick success early (aka a sudden decrease in error rate), but this may flatten eventually. A few things to consider: - **Set clear expectations**: Identify the performance metrics and agree on a "good enough" prediction rate that will bring value to the business. An 80% "good enough" rate may save business costs and increase productivity, but going from 80% to 95% may require enormous cost and effort. Is it worth it? Can it be a progressive road map? -- Create a smaller team and **undertake a feasibility analysis** through techniques like [EDA](https://en.wikipedia.org/wiki/Exploratory_data_analysis) (Exploratory Data Analysis). A [feasibility study](ml-feasibility-study.md) is much cheaper to evaluate data quality, customer constraints and model feasibility. It allows a TPM to better understand customer use cases and current environment and can act as a fail-fast mechanism.
Note that feasibility should be shorter (in weeks) else it misses the point of saving costs. +- Create a smaller team and **undertake a feasibility analysis** through techniques like [EDA](https://en.wikipedia.org/wiki/Exploratory_data_analysis) (Exploratory Data Analysis). A [feasibility study](./feasibility-studies.md) is a much cheaper way to evaluate data quality, customer constraints and model feasibility. It allows a TPM to better understand customer use cases and the current environment, and can act as a fail-fast mechanism. Note that the feasibility study should be short (weeks rather than months), or it defeats the purpose of saving costs. -- As in any project, there will be new needs (additional data sources, technical constraints, hiring data labelers, business users time etc.). Incorporate [Agile](ml-project-management.md) techniques to fail fast and minimize cost and schedule surprises. +- As in any project, there will be new needs (additional data sources, technical constraints, hiring data labelers, business users' time, etc.). Incorporate [Agile](./agile-development-considerations-for-ml-projects.md) techniques to fail fast and minimize cost and schedule surprises. ### Notebooks != ML Production @@ -29,8 +29,8 @@ Notebooks are a great way to kick start Data Analytics and Applied Machine Learn - Understand the [end-end flow of data management](https://learn.microsoft.com/en-us/azure/architecture/data-guide/big-data/ai-overview), how data will be made available (ingestion flows), what's the frequency, storage, retention of data. Plan user stories and design spikes around these flows to ensure a robust ML pipeline is developed. -- Engineering team should follow the same rigor in building ML projects as in any software engineering project. We at ISE (Industry Solutions Engineering) have built a good set of resources from our learnings in our [ISE Engineering Playbook](../index.md).
-- Think about the how the model will be deployed, for example, are there technical constraints due to an edge device, or network constraints that will prevent updating the model. Understanding of the environment is critical, refer to the [Model Production Checklist](ml-model-checklist.md) as a reference to determine model deployment choices. +- The engineering team should follow the same rigor in building ML projects as in any software engineering project. We at ISE (Industry Solutions Engineering) have built a good set of resources from our learnings in our [ISE Engineering Playbook](../README.md). +- Think about how the model will be deployed; for example, are there technical constraints due to an edge device, or network constraints that will prevent updating the model? Understanding of the environment is critical; refer to the [Model Production Checklist](./ml-model-checklist.md) as a reference to determine model deployment choices. - ML-focused projects are not a "one-shot" release solution; they need to be nurtured, evolved, and improved over time. Plan for a continuous improvement lifecycle: the initial phases can be model feasibility and validation to get the good enough prediction rate, and the later phases can then be scaling and improving the models through feedback loops and fresh data sets. @@ -38,7 +38,7 @@ Notebooks are a great way to kick start Data Analytics and Applied Machine Learn Data quality is a major factor in affecting model performance and production roll-out, consider the following: -- Conduct a [data exploration](ml-data-exploration.md) workshop and **generate a report on data quality** that includes missing values, duplicates, unlabeled data, expired or not valid data, incomplete data (e.g., only having male representation in a people dataset).
+- Conduct a [data exploration](./data-exploration.md) workshop and **generate a report on data quality** that includes missing values, duplicates, unlabeled data, expired or invalid data, incomplete data (e.g., only having male representation in a people dataset). - **Identify data source reliability** to ensure data is coming from a production source (e.g., are the images from a production or industrial camera, or taken from an iPhone/Android phone?). @@ -50,11 +50,11 @@ Data quality is a major factor in affecting model performance and production rol An ML project has multiple stages, and each stage may require additional roles. For example, Design Research & Designers for Human Experience, a Data Engineer for Data Collection and Feature Engineering, a Data Labeler for labeling structured data, engineers for MLOps and model deployment, and the list can go on. As a TPM, factor in having these resources available at the right time to avoid any schedule risks. -### Feature Engineering and Hyperparameter tuning +### Feature Engineering and Hyperparameter Tuning Feature Engineering enables the transformation of data so that it becomes usable for an algorithm. Creating the right features is an art and may require experimentation as well as domain expertise. Allocate time for domain experts to help with improving and identifying the best features. For example, for a natural language processing engine for text extraction of financial documents, we may involve financial researchers and run a [relevance judgment](https://nlp.stanford.edu/IR-book/html/htmledition/information-retrieval-system-evaluation-1.html) exercise and provide a feedback loop to evaluate model performance. -### Responsible AI considerations +### Responsible AI Considerations Bias in machine learning could be the number one reason a model does not perform to its intended needs.
Plan to incorporate [Responsible AI principles](responsible-ai.md) from Day 1 to ensure fairness, security, privacy and transparency of the models. For example, for a person recognition algorithm, if the data source is only feeding a specific skin type, then production scenarios may not provide good results. diff --git a/docs/non-functional-requirements/accessibility.md b/docs/non-functional-requirements/accessibility.md index 06143f939b..ab90f489d4 100644 --- a/docs/non-functional-requirements/accessibility.md +++ b/docs/non-functional-requirements/accessibility.md @@ -12,15 +12,15 @@ Inclusive design is a methodology that embraces the full range of human diversit The Microsoft Inclusive Design methodology includes the following principles: -### Recognize exclusion +### Recognize Exclusion Designing for inclusivity not only opens up our products and services to more people, it also reflects how people really are. All humans grow and adapt to the world around them and we want our designs to reflect that. -### Solve for one, extend to many +### Solve for One, Extend to Many Everyone has abilities, and limits to those abilities. Designing for people with permanent disabilities actually results in designs that benefit people universally. Constraints are a beautiful thing. -### Learn from diversity +### Learn from Diversity Human beings are the real experts in adapting to diversity. Inclusive design puts people in the center from the very start of the process, and those fresh, diverse perspectives are the key to true insight. @@ -56,7 +56,7 @@ Before you get to testing, you can make some small changes in how you write code - When including images or diagrams, add alt text. This should never just be "Image" or "Diagram" (or similar). In your description, highlight the purpose of the image or diagram in the page and what it is intended to convey. - Prefer tabs to spaces when possible. 
This allows users to default to their preferred tab width, so users with a range of vision can all take in code easily. -## Additional Resources +## Resources * [Microsoft Accessibility Technology & Tools](https://www.microsoft.com/accessibility) * [Web Content Accessibility Guidelines (WCAG)](https://www.w3.org/TR/WCAG20/#intro) diff --git a/docs/non-functional-requirements/availability.md b/docs/non-functional-requirements/availability.md index 1e234ee076..88f482c348 100644 --- a/docs/non-functional-requirements/availability.md +++ b/docs/non-functional-requirements/availability.md @@ -28,7 +28,7 @@ Implementing availability involves various strategies and technologies designed - Scheduled Maintenance Windows: Planning and communicating scheduled maintenance periods during off-peak hours to minimize the impact on users. Systems can be designed to perform maintenance tasks without taking the entire service offline. - High Availability Software Architectures: Designing software with high availability in mind, using principles like microservices architecture, which isolates different functions of an application. This isolation ensures that a failure in one component doesn’t bring down the entire system. 
-## More information +## Resources - [Recommendations for highly available multi-region design](https://learn.microsoft.com/en-us/azure/well-architected/reliability/highly-available-multi-region-design) - [Recommendations for using availability zones and regions](https://learn.microsoft.com/en-us/azure/well-architected/reliability/regions-availability-zones) diff --git a/docs/non-functional-requirements/capacity.md b/docs/non-functional-requirements/capacity.md index e586655dfd..c291eedffa 100644 --- a/docs/non-functional-requirements/capacity.md +++ b/docs/non-functional-requirements/capacity.md @@ -27,6 +27,6 @@ Capacity is typically implemented through a combination of architectural design, - High Availability and Fault Tolerance: Implementing strategies such as redundant servers, failover mechanisms, and disaster recovery plans ensures that the system remains available and operational even in the event of hardware failures or other disruptions. - Capacity Planning: Conducting thorough capacity planning based on anticipated growth, usage patterns, and business requirements helps forecast resource needs and proactively scale the system to meet future demands. -## More information +## Resources - [Performance Testing](../automated-testing/performance-testing/README.md) diff --git a/docs/non-functional-requirements/compliance.md b/docs/non-functional-requirements/compliance.md index ff13bd007b..1e97bf9428 100644 --- a/docs/non-functional-requirements/compliance.md +++ b/docs/non-functional-requirements/compliance.md @@ -21,7 +21,7 @@ Implementing compliance involves a systematic approach that integrates regulator - Audit and Monitoring: Establishing mechanisms for continuous monitoring, auditing, and logging of activities within the software system to ensure compliance with regulatory requirements. This includes maintaining audit trails, generating compliance reports, and conducting regular security assessments. 
- Documentation and Record Keeping: Maintaining comprehensive documentation of compliance efforts, including policies, procedures, audit reports, risk assessments, and compliance certifications. -## More information +## Resources - [General Data Protection Regulation (GDPR)](https://en.wikipedia.org/wiki/General_Data_Protection_Regulation) - [Purview Compliance Manager](https://aka.ms/ComplianceManager) diff --git a/docs/non-functional-requirements/data-integrity.md b/docs/non-functional-requirements/data-integrity.md index 014f0661f9..bcc2400ec6 100644 --- a/docs/non-functional-requirements/data-integrity.md +++ b/docs/non-functional-requirements/data-integrity.md @@ -34,6 +34,6 @@ Database constraints: Utilize database constraints such as primary keys, foreign Regular data backups: Implement regular backups of data to prevent loss in case of system failures, errors, or security breaches. Ensure that backup procedures are automated, monitored, and regularly tested. -## More information +## Resources - [Great Expectations](https://greatexpectations.io/): A framework to build data validations and test the quality of your data. diff --git a/docs/non-functional-requirements/disaster-recovery.md b/docs/non-functional-requirements/disaster-recovery.md index c0186af9b7..d4699074be 100644 --- a/docs/non-functional-requirements/disaster-recovery.md +++ b/docs/non-functional-requirements/disaster-recovery.md @@ -36,6 +36,6 @@ Implementing disaster recovery (DR) involves a combination of strategies, techno - **Regular Testing and Drills**: Conduct regular simulation drills to test the effectiveness of the DR plan and to ensure that all team members are familiar with their roles. - **Comprehensive Documentation**: Develop run books with step-by-step instructions for executing the DR plan, tailored to specific scenarios and systems. 
-## More information +## Resources - [Azure Site Recovery](https://azure.microsoft.com/en-us/products/site-recovery/) diff --git a/docs/non-functional-requirements/internationalization.md b/docs/non-functional-requirements/internationalization.md index 1df0ae03c1..034be46b6b 100644 --- a/docs/non-functional-requirements/internationalization.md +++ b/docs/non-functional-requirements/internationalization.md @@ -4,7 +4,7 @@ Internationalization (i18n) and Localization (l10n) refer to the design and adap ## Characteristics -### Main characteristics of Internationalization +### Main Characteristics of Internationalization - Text Externalization: Moving all user-facing text to external resource files to facilitate easy translation. - Unicode Support: Using Unicode or another character encoding that supports all necessary scripts and characters. @@ -13,7 +13,7 @@ Internationalization (i18n) and Localization (l10n) refer to the design and adap - Locale-Sensitive Data Processing: Adapting data processing to respect locale-specific rules, such as sorting and case conversion. - Bidirectional Text Support: Supporting both left-to-right (LTR) and right-to-left (RTL) text orientations where necessary. -### Main characteristics of Localization +### Main Characteristics of Localization - Translation: Converting text and UI elements to the target language. - Cultural Adaptation: Adapting content and design elements to align with local cultural norms and expectations. 
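The text-externalization characteristic listed above can be sketched with Python's standard-library `gettext` module (a minimal illustration; the `locales` directory layout and the `messages` catalog name are assumptions, not part of this doc):

```python
import gettext

def get_translator(locale_dir: str, lang: str):
    """Load the translation catalog for a locale, falling back to the
    untranslated source strings when no catalog is installed."""
    return gettext.translation(
        "messages",            # hypothetical catalog (domain) name
        localedir=locale_dir,  # e.g. locales/de/LC_MESSAGES/messages.mo
        languages=[lang],
        fallback=True,         # NullTranslations when the catalog is missing
    )

_ = get_translator("locales", "de").gettext
print(_("Welcome"))  # the German string if a catalog exists, otherwise "Welcome"
```

Keeping every user-facing string behind `_()` is what lets translators work in external resource files without touching code.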
diff --git a/docs/non-functional-requirements/performance.md b/docs/non-functional-requirements/performance.md index 4624ded514..9143d182fc 100644 --- a/docs/non-functional-requirements/performance.md +++ b/docs/non-functional-requirements/performance.md @@ -30,6 +30,6 @@ Implementing performance involves a combination of architectural decisions, codi - Performance Testing: Performing rigorous performance tests to pinpoint bottlenecks, measure critical metrics like response time and throughput, and validate system performance across varying load scenarios. - Continuous Monitoring: Implementing ongoing monitoring of performance metrics to identify performance degradation. -## More information +## Resources - [Automated Testing](../automated-testing/README.md) diff --git a/docs/non-functional-requirements/portability.md b/docs/non-functional-requirements/portability.md index 0924debc61..4f6a779b6c 100644 --- a/docs/non-functional-requirements/portability.md +++ b/docs/non-functional-requirements/portability.md @@ -35,5 +35,6 @@ Portability refers to the ease with which software can be transferred and used i - Data Interchange Formats: Using common data formats like JSON, XML, or Protocol Buffers to ensure data can be exchanged and understood across different systems. ### Other Practices + - Debugging and Troubleshooting: Local debugging provides direct access to debugging tools and logs, making it easier to diagnose and resolve issues quickly. - CI/CD Integration: Implementing a CI/CD pipeline to automate the building, testing, and packaging of the solution enhances portability by ensuring consistent and reliable deployments across various platforms and environments. 
\ No newline at end of file diff --git a/docs/non-functional-requirements/reliability.md b/docs/non-functional-requirements/reliability.md index 7f727e1251..692903ddd8 100644 --- a/docs/non-functional-requirements/reliability.md +++ b/docs/non-functional-requirements/reliability.md @@ -75,7 +75,7 @@ We can build graceful failure (or graceful degradation) into our software stack No software service is complete without playbooks to navigate the developers through unfamiliar territory. Playbooks should be thorough and cover all known failure scenarios and mitigations. -### Run maintenance exercises +### Run Maintenance Exercises Take the time to fabricate scenarios, and run a D&D-style campaign to solve your issues. This can be as elaborate as spinning up a new environment and injecting errors, or as simple as asking the "players" to navigate to a dashboard and describing what they would see in the fabricated scenario (small amounts of imagination required). The playbooks should **easily** navigate the user to the correct solution/mitigation. If not, update your playbooks. @@ -92,8 +92,8 @@ Leverage automated chaos testing to see how things break. You can read this play * [Simmy](https://github.com/Polly-Contrib/Simmy) - A .NET library for chaos testing and fault injection integrated with the [Polly](https://github.com/App-vNext/Polly) library for resilience engineering. [This ISE dev blog post](https://devblogs.microsoft.com/ise/build-test-resilience-dotnet-functions/) provides code snippets as an example of how to use Polly and Simmy to implement a hypothesis-driven approach to resilience and chaos testing. -## Analyze all Failures +## Analyze All Failures Writing up a [post-mortem](https://en.wikipedia.org/wiki/Postmortem_documentation) is a great way to document the root causes, and action items for your failures. They're also a great way to track recurring issues, and create a strong case for prioritizing fixes.
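Simmy and Polly are .NET libraries, but the hypothesis-driven fault-injection approach they support is stack-agnostic. A minimal Python sketch (all names here are illustrative, not a real chaos library):

```python
import random

def chaos(probability: float, exc: Exception):
    """Decorator that injects a fault with the given probability,
    simulating a flaky dependency for resilience testing."""
    def wrap(fn):
        def inner(*args, **kwargs):
            if random.random() < probability:
                raise exc
            return fn(*args, **kwargs)
        return inner
    return wrap

@chaos(probability=0.25, exc=TimeoutError("injected: dependency timed out"))
def fetch_quote():
    return "All systems nominal"

def fetch_with_retry(attempts: int = 10):
    # Hypothesis under test: callers survive a 25% timeout rate via retry.
    for _ in range(attempts):
        try:
            return fetch_quote()
        except TimeoutError:
            continue
    raise RuntimeError("hypothesis falsified: retries exhausted")

random.seed(0)  # deterministic for the example
print(fetch_with_retry())
```

State the failure hypothesis first, inject the fault, then verify the mitigation holds, which is the same loop the chaos-testing playbook describes.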
-This can even be tied into your regular Agile [restrospectives](../agile-development/basics/ceremonies.md#retrospectives). +This can even be tied into your regular Agile [retrospectives](../agile-development/ceremonies.md#retrospectives). diff --git a/docs/user-interface-engineering/usability.md b/docs/non-functional-requirements/usability.md similarity index 95% rename from docs/user-interface-engineering/usability.md rename to docs/non-functional-requirements/usability.md index 64f20c7725..cafe0b3f61 100644 --- a/docs/user-interface-engineering/usability.md +++ b/docs/non-functional-requirements/usability.md @@ -2,7 +2,7 @@ Usability is a topic that is often used interchangeably with user experience (UX), but they are not the same thing. Usability is a subset of UX, focusing specifically on the ease of use and effectiveness of a product, i.e., it is the ease with which users can learn and use a product to achieve their goals. Usability is a key factor in determining the success of a product, as it directly impacts user satisfaction, productivity, and overall experience. A system that is difficult to use or understand can lead to frustration, errors, and ultimately, abandonment by users. -Closely coupled with usability and UX is the concept of accessibility, which you can read more about [here](../accessibility/README.md). +Closely coupled with usability and UX is the concept of [accessibility](./accessibility.md). ## Characteristics @@ -34,7 +34,8 @@ These evaluations can collect two key metrics: **quantitative data** and **quali ## Examples One example of usability in action is the design of a website. A website that is easy to navigate, with clear labels, intuitive menus, and a logical flow of information, is more likely to be successful than one that is cluttered, confusing, and difficult to use.
The latter website is likely to have a low rate of user engagement, high [bounce rates](https://backlinko.com/hub/seo/bounce-rate), and low conversion rates, as users will quickly become frustrated and abandon the site. -## More information + +## Resources - [GeeksForGeeks: What is Usability?](https://www.geeksforgeeks.org/what-is-usability/) - [Usability.gov](https://www.usability.gov/) diff --git a/docs/observability/README.md b/docs/observability/README.md index 832502de21..8143bbdf15 100644 --- a/docs/observability/README.md +++ b/docs/observability/README.md @@ -9,31 +9,31 @@ Building observable systems enables development teams at ISE to measure how well ## Pillars of Observability -- [Logs](pillars/logging.md) -- [Metrics](pillars/metrics.md) -- [Tracing](pillars/tracing.md) -- [Logs vs Metrics vs Traces](log-vs-metric-vs-trace.md) +- [Logs](./pillars/logging.md) +- [Metrics](./pillars/metrics.md) +- [Tracing](./pillars/tracing.md) +- [Logs vs Metrics vs Traces](./log-vs-metric-vs-trace.md) ## Insights -- [Dashboards and Reporting](pillars/dashboard.md) +- [Dashboards and Reporting](./pillars/dashboard.md) ## Tools, Patterns and Recommended Practices -- [Tooling and Patterns](tools/README.md) -- [Observability As Code](observability-as-code.md) -- [Recommended Practices](best-practices.md) -- [Diagnostics tools](diagnostic-tools.md) -- [OpenTelemetry](tools/OpenTelemetry.md) +- [Tooling and Patterns](./tools/README.md) +- [Observability As Code](./observability-as-code.md) +- [Recommended Practices](./best-practices.md) +- [Diagnostics tools](./diagnostic-tools.md) +- [OpenTelemetry](./tools/OpenTelemetry.md) ## Facets of Observability -- [Observability for Microservices](microservices.md) -- [Observability in Machine Learning](ml-observability.md) -- [Observability of CI/CD Pipelines](observability-pipelines.md) -- [Observability in Azure Databricks](observability-databricks.md) -- [Recipes](recipes-observability.md) +- [Observability for 
Microservices](./microservices.md) +- [Observability in Machine Learning](./ml-observability.md) +- [Observability of CI/CD Pipelines](./observability-pipelines.md) +- [Observability in Azure Databricks](./observability-databricks.md) +- [Recipes](./recipes-observability.md) -## Useful links +## Resources - [Non-Functional Requirements Guidance](../design/design-patterns/non-functional-requirements-capture-guide.md) diff --git a/docs/observability/best-practices.md b/docs/observability/best-practices.md index d8838b4c23..d4bfff8b6d 100644 --- a/docs/observability/best-practices.md +++ b/docs/observability/best-practices.md @@ -1,6 +1,6 @@ # Recommended Practices -1. **Correlation Id**: Include unique identifier at the start of the interaction to tie down aggregated data from various system components and provide a holistic view. Read more guidelines about using [correlation id](correlation-id.md). +1. **Correlation Id**: Include unique identifier at the start of the interaction to tie down aggregated data from various system components and provide a holistic view. Read more guidelines about using [correlation id](./correlation-id.md). 1. Ensure health of the services are **monitored** and provide insights into system's performance and behavior. 1. Ensure **dependent services** are monitored properly. Errors and exceptions in dependent services like Redis cache, Service bus, etc. should be logged and alerted. Also, metrics related to dependent services should be captured and logged. @@ -9,9 +9,9 @@ 1. **Faults, crashes, and failures** are logged as discrete events. This helps engineers identify problem area(s) during failures. 1. Ensure logging configuration (eg: setting logging to "verbose") can be controlled without code changes. 1. Ensure that **metrics** around latency and duration are collected and can be aggregated. -1. Start small and add where there is customer impact. 
[Avoiding metric fatigue](pitfalls.md#metric-fatigue) is very crucial to collecting actionable data. +1. Start small and add where there is customer impact. [Avoiding metric fatigue](./pitfalls.md#metric-fatigue) is crucial to collecting actionable data. 1. It is important that all collected data contains relevant and rich context. 1. Personally Identifiable Information or any other customer-sensitive information should never be logged. Special attention should be paid to any local data privacy regulations, and collected data must adhere to them (e.g., GDPR). 1. **Health checks**: Appropriate health checks should be added to determine if the service is healthy and ready to serve traffic. On a Kubernetes platform, different probe types (e.g., liveness, readiness, startup) can be used to determine the health and readiness of the deployed service. -Read more [here](pitfalls.md) to understand what to watch out for while designing and building an observable system. +Read more [here](./pitfalls.md) to understand what to watch out for while designing and building an observable system. diff --git a/docs/observability/correlation-id.md index 24eb52e5c6..b46143318e 100644 --- a/docs/observability/correlation-id.md +++ b/docs/observability/correlation-id.md @@ -35,7 +35,7 @@ A Correlation ID is a unique identifier that is added to the very first interact Log correlation is the ability to track disparate events through different parts of the application. Having a Correlation ID provides more context, making it easy to build rules for reporting and analysis. -### Secondary reporting/observer systems +### Secondary Reporting/Observer Systems Using Correlation ID helps secondary systems to correlate data without application context. Some examples: generating metrics based on tracing data, integrating runtime/system diagnostics, etc. For example, feeding AppInsights data and correlating it to infrastructure issues.
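The propagation described above can be sketched framework-agnostically in Python (a minimal illustration; the `X-Correlation-ID` header name is a common convention, assumed here rather than mandated by this doc):

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"

def handle_request(headers: dict) -> dict:
    # Reuse the caller's Correlation ID if present; mint one at the
    # very first interaction otherwise.
    corr_id = headers.get(CORRELATION_HEADER) or str(uuid.uuid4())
    # Attach the ID to every log line and metric emitted while handling
    # this request, then propagate it on all downstream calls.
    print(f"corr_id={corr_id} processing request")
    return {CORRELATION_HEADER: corr_id}

downstream_headers = handle_request({CORRELATION_HEADER: "abc-123"})
```

Because the same ID rides along on every downstream call, secondary systems can join logs, metrics, and traces without any application context.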
diff --git a/docs/observability/diagnostic-tools.md b/docs/observability/diagnostic-tools.md index 0700620093..99bfca2e5a 100644 --- a/docs/observability/diagnostic-tools.md +++ b/docs/observability/diagnostic-tools.md @@ -1,6 +1,6 @@ # Diagnostic tools -Besides [Logging](pillars/logging.md), [Tracing](pillars/tracing.md) and [Metrics](pillars/metrics.md), there are additional tools to help diagnose issues when applications do not behave as expected. In some scenarios, analyzing the memory consumption and drilling down into why a specific process takes longer than expected may require additional measures. In these cases, platform or programming language specific diagnostic tools come into play and are useful to debug a memory leak, profile the CPU usage, or the cause of delays in multi-threading. +Besides [Logging](./pillars/logging.md), [Tracing](./pillars/tracing.md) and [Metrics](./pillars/metrics.md), there are additional tools to help diagnose issues when applications do not behave as expected. In some scenarios, analyzing the memory consumption and drilling down into why a specific process takes longer than expected may require additional measures. In these cases, platform or programming language specific diagnostic tools come into play and are useful to debug a memory leak, profile the CPU usage, or the cause of delays in multi-threading. ## Profilers and Memory Analyzers @@ -20,7 +20,7 @@ Not all programming languages support instrumentation. Instrumentation is mostly Once you have your profiling data, there are multiple ways to visualize this information depending of the format you saved it. As an example for .NET (dotnet-trace), there are three available formats to save these traces: Chromium, NetTrace and SpeedScope. Select the output format depending on the tool you are going to use. [SpeedScope](https://www.speedscope.app/) is an online web application you can use to visualize and analyze traces, and you only need a modern browser. 
Be careful with online tools, as dumps/traces might contain confidential information that you don't want to share outside of your organization. -### Memory analyzers +### Memory Analyzers Memory analyzers and memory dumps are another set of diagnostic tools you can use to identify issues in your process. Normally these types of tools take the whole memory the process is using at a point in time and saves it in a file which can be analyzed. When using these types of tools, you want to stress your process as much as possible to amplify whatever deficiency you may have in terms of memory management. The memory dump should then be taken when the process is in this stressed state. @@ -41,15 +41,15 @@ There are a range of developer platform specific diagnostic tools which can be u - [Python debugging and profiling - version specific](https://docs.python.org/3/library/debug.html) - [Node.js Diagnostics working group](https://github.com/nodejs/diagnostics) -## Environment for profiling +## Environment for Profiling To create an application profile as close to production as possible, the environment in which the application is intended to run in production has to be considered and it might be necessary to perform a snapshot of the application state [under load](../automated-testing/performance-testing/README.md). -### Diagnostics in containers +### Diagnostics in Containers For monolithic applications, diagnostics tools can be installed and run on the VM hosting them. Most scalable applications are developed as [microservices](./microservices.md) and have complex interactions which require to install the tools in the containers running the process or to leverage a sidecar container (see [sidecar pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/sidecar)). Some platforms expose endpoints to interact with the application and return a dump. 
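For the Python tooling linked above, the standard-library `cProfile`/`pstats` pair gives a quick CPU profile without installing anything (a minimal sketch; `busy` is a stand-in for your real hot path):

```python
import cProfile
import io
import pstats

def busy(n: int) -> int:
    # Stand-in for the code path you suspect is slow.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
busy(100_000)
profiler.disable()

# Sort by cumulative time to see where the process spends its CPU;
# dump_stats() can persist the data for visualization tools instead.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

As with trace files from dotnet-trace, profile dumps can contain sensitive details, so apply the same caution before uploading them to online viewers.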
-Useful links: +#### Resources - [.NET Core diagnostics in containers](https://learn.microsoft.com/en-us/dotnet/core/diagnostics/diagnostics-in-containers) - [Experimental tool dotnet-monitor](https://devblogs.microsoft.com/dotnet/introducing-dotnet-monitor/), [What's new](https://devblogs.microsoft.com/dotnet/whats-new-in-dotnet-monitor/), [GItHub repository](https://github.com/dotnet/dotnet-monitor/tree/main/documentation) diff --git a/docs/observability/microservices.md b/docs/observability/microservices.md index 7abb2bce3a..260fbf48b9 100644 --- a/docs/observability/microservices.md +++ b/docs/observability/microservices.md @@ -12,7 +12,7 @@ This is a common issue. When calling other microservices, depending on the techn More important, we don't have any way to associate our Correlation Id to whatever happens inside that microservice. Therefore, is so important to have a plan in place to be able to extend your traceability and monitoring efforts, especially when using a microservice architecture. -## How to extend your tracing information between microservices +## How to Extend Your Tracing Information Between Microservices The W3C consortium is working on a [Trace Context](https://www.w3.org/TR/trace-context/) definition that can be applied when using HTTP as the protocol in a microservice architecture. But let's explain how we can implement this functionality in our software. diff --git a/docs/observability/ml-observability.md b/docs/observability/ml-observability.md index 0d3b98a96f..4821bde7f4 100644 --- a/docs/observability/ml-observability.md +++ b/docs/observability/ml-observability.md @@ -6,7 +6,7 @@ the code, the model and the data. 
We can distinguish two stages of such system lifespan: experimentation and production that require different approaches to observability as discussed below: -## Model experimentation and tuning +## Model Experimentation and Tuning Experimentation is a process of finding suitable machine learning model and its parameters via training and evaluating such models with one or more datasets. @@ -29,29 +29,29 @@ Application Insights can be used as an alternative sink to capture model metrics An extensive comparison of the four tools can be found as follows: -| | Azure ML | MLFlow | TensorBoard | Application Insights | -|---------------------------| ----------- | ----------- | ----------- | ----------- | -| **Metrics support** | Values, images, matrices, logs | Values, images, matrices and plots as files | Metrics relevant to DL research phase | Values, images, matrices, logs -| **Customizabile** | Basic | Basic | Very basic | High -| **Metrics accessible** | AML portal, AML SDK | MLFlow UI, Tracking service API | Tensorboard UI, history object | Application Insights -| **Logs accessible** | Rolling logs written to .txt files in blob storage, accessible via blob or AML portal. Not query-able | Rolling logs are not stored | Rolling logs are not stored | Application Insights in Azure Portal. Query-able with KQL -| **Ease of use and set up** | Very straightforward, only one portal | More moving parts due to remote tracking server | A bit over process overhead. 
Also depending on ML framework | More moving parts as a custom app needs to be maintained -| **Shareability** | Across people with access to AML workspace | Across people with access to remote tracking server | Across people with access to same directory | Across people with access to AppInsights +| | Azure ML | MLFlow | TensorBoard | Application Insights | +| -- | -- | -- | -- | -- | +| **Metrics support** | Values, images, matrices, logs | Values, images, matrices and plots as files | Metrics relevant to DL research phase | Values, images, matrices, logs | +| **Customizable** | Basic | Basic | Very basic | High | +| **Metrics accessible** | AML portal, AML SDK | MLFlow UI, Tracking service API | Tensorboard UI, history object | Application Insights | +| **Logs accessible** | Rolling logs written to .txt files in blob storage, accessible via blob or AML portal. Not query-able | Rolling logs are not stored | Rolling logs are not stored | Application Insights in Azure Portal. Query-able with KQL | +| **Ease of use and set up** | Very straightforward, only one portal | More moving parts due to remote tracking server | A bit of process overhead. Also depends on ML framework | More moving parts as a custom app needs to be maintained | +| **Shareability** | Across people with access to AML workspace | Across people with access to remote tracking server | Across people with access to same directory | Across people with access to AppInsights | -## Model in production +## Model in Production -The trained model can be deployed to production as container. Azure Machine Learning service provides SDK to deploy model as Azure Container Instance and publishes REST endpoint. You can monitor it using microservice observability methods( for more details -refer to [Recipes](README.md) section). MLFLow is an alternative way to deploy ML model as a service. +The trained model can be deployed to production as 
container. Azure Machine Learning service provides an SDK to deploy the model as an Azure Container Instance and publishes a REST endpoint. You can monitor it using microservice observability methods (for more details refer to the [Recipes](./README.md) section). MLFlow is an alternative way to deploy an ML model as a service. -## Training and re-training +## Training and Re-Training To automatically retrain the model you can use AML Pipelines or Azure Databricks. When re-training with AML Pipelines, you can monitor information about each run, including the output, logs, and various metrics, in the Azure portal experiment dashboard, or manually extract it using the AML SDK. -## Model performance over time: data drift +## Model Performance Over Time: Data Drift We re-train machine learning models to improve their performance and keep models aligned with data changing over time. However, in some cases model performance may degrade. This may happen if the data changes dramatically and no longer exhibits the patterns we observed during model development. This effect is called data drift. Azure Machine Learning Service has a preview feature to observe and report data drift. This [article](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-monitor-datasets) describes it in detail. -## Data versioning +## Data Versioning It is recommended practice to add a version to all datasets. You can create a versioned Azure ML Dataset for this purpose, or manually version it if using other systems. diff --git a/docs/observability/observability-as-code.md b/docs/observability/observability-as-code.md index afcdb24690..f4fa884b93 100644 --- a/docs/observability/observability-as-code.md +++ b/docs/observability/observability-as-code.md @@ -5,23 +5,16 @@ As much as possible, configuration and management of observability assets such a ## Examples of Observability as Code 1. Dashboards as Code - Monitoring Dashboards can be created as JSON or XML templates.
This template is source control maintained and any changes to the dashboards can be reviewed. Automation can be built for enabling the dashboard. [More about how to do this in Azure](https://learn.microsoft.com/en-us/azure/azure-portal/azure-portal-dashboards-create-programmatically). Grafana dashboard can also be [configured as code](https://grafana.com/blog/2020/02/26/how-to-configure-grafana-as-code/) which eventually can be source-controlled to be used in automation and pipelines. - 2. Alerts as Code - Alerts can be created within Azure by using Terraform or ARM templates. Such alerts can be source-controlled and be deployed as part of pipelines (Azure DevOps pipelines, Jenkins, GitHub Actions etc.). Few references of how to do this are: [Terraform Monitor Metric Alert](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/monitor_metric_alert). Alerts can also be created based on log analytics query and can be defined as code using [Terraform Monitor Scheduled Query Rules Alert](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/monitor_scheduled_query_rules_alert#example-usage). - 3. Automating Log Analytics Queries - There are several use cases where automation of log analytics queries may be needed. Example, Automatic Report Generation, Running custom queries programmatically for analysis, debugging etc. For these use cases to work, log queries should be source-controlled and automation can be built using [log analytics REST](https://learn.microsoft.com/en-us/rest/api/loganalytics/) or [azure cli](https://learn.microsoft.com/en-us/cli/azure/ext/log-analytics/monitor/log-analytics?view=azure-cli-latest). ## Why - It makes configuration repeatable and automatable. It also avoids manual configuration of monitoring alerts and dashboards from scratch across environments. 
- - Configured dashboards help troubleshoot errors during integration and deployment (CI/CD) - - We can audit changes and roll them back if there are any issues. - - Identify actionable insights from the generated metrics data across all environments, not just production. - - Configuration and management of observability assets like alert thresholds, durations, and configuration values using IaC helps us avoid configuration mistakes, errors, or oversights during deployment. - - When practicing observability as code, the changes can be reviewed by the team, similar to other code contributions. diff --git a/docs/observability/observability-databricks.md b/docs/observability/observability-databricks.md index c329834709..b657ac748c 100644 --- a/docs/observability/observability-databricks.md +++ b/docs/observability/observability-databricks.md @@ -57,7 +57,7 @@ subscription level (like provisioning of VM, Disk etc.) These logs can be enabled via Azure Monitor > Activity Logs and shipped to Log Analytics. -### Ganglia metrics +### Ganglia Metrics Ganglia metrics is a Cluster Utilization UI and is available on Azure Databricks. It is great for viewing live metrics of interactive clusters. Ganglia metrics is available by default and takes a snapshot of usage every 15 minutes. Historical metrics are stored as .png files, making it impossible to analyze the data. diff --git a/docs/observability/observability-pipelines.md b/docs/observability/observability-pipelines.md index fb65b3fae1..9733a8c5a0 100644 --- a/docs/observability/observability-pipelines.md +++ b/docs/observability/observability-pipelines.md @@ -7,19 +7,15 @@ applications. ## Benefits - Having proper instrumentation during build time helps gain insights into the various stages of the build and release process. - - Helps developers understand where the pipeline performance bottlenecks are, based on the data collected.
This helps in having data-driven conversations around identifying latency between jobs, performance issues, and artifact upload/download times, providing valuable insights into agent availability and capacity. - - Helps to identify trends in failures, thus allowing developers to quickly do root cause analysis. - - Helps to provide an organization-wide view of pipeline health to easily identify trends. ## Points to Consider - It is important to identify the Key Performance Indicators (KPIs) for evaluating a successful CI/CD pipeline. Where needed, additional tracing can be added to better record KPI metrics. For example, adding pipeline build tags to identify a 'Release Candidate' vs. 'Non-Release Candidate' helps in evaluating the end-to-end release process timeline. - - Depending on the tooling used (Azure DevOps, Jenkins, etc.), basic reporting on the pipelines is available out-of-the-box. It is important to evaluate these reports against the KPIs to understand if a custom reporting solution for the pipelines is needed. If required, custom dashboards can be built using diff --git a/docs/observability/pillars/README.md b/docs/observability/pillars/README.md deleted file mode 100644 index 0c9550200f..0000000000 --- a/docs/observability/pillars/README.md +++ /dev/null @@ -1,6 +0,0 @@ -# Pillars - -- [Logging](./logging.md) -- [Metrics](./metrics.md) -- [Tracing](./tracing.md) -- [Dashboards](./dashboard.md) diff --git a/docs/observability/pillars/logging.md b/docs/observability/pillars/logging.md index 448be772b0..24df687f38 100644 --- a/docs/observability/pillars/logging.md +++ b/docs/observability/pillars/logging.md @@ -49,11 +49,11 @@ This approach isn't without trade-offs: - Ensure personally identifiable information policies and restrictions are followed. - Ensure errors and exceptions in dependent services are captured and logged.
For example, if an application uses Redis cache, Service Bus or any other service, any errors/exceptions raised while accessing these services should be captured and logged. -### If there's sufficient log data, is there a need for instrumenting metrics? +### If There's Sufficient Log Data, Is There a Need for Instrumenting Metrics? [Logs vs Metrics vs Traces](../log-vs-metric-vs-trace.md) covers some high level guidance on when to utilize metric data and when to use log data. Both have a valuable part to play in creating observable systems. -### Having problems identifying what to log? +### Having Problems Identifying What to Log? **At application startup**: diff --git a/docs/observability/pillars/metrics.md b/docs/observability/pillars/metrics.md index d73b12e426..0287b946e3 100644 --- a/docs/observability/pillars/metrics.md +++ b/docs/observability/pillars/metrics.md @@ -30,11 +30,11 @@ Items of concern to some may include: ## Best Practices -### When should I use metrics instead of logs? +### When Should I Use Metrics Instead of Logs? [Logs vs Metrics vs Traces](../log-vs-metric-vs-trace.md) covers some high level guidance on when to utilize metric data and when to use log data. Both have a valuable part to play in creating observable systems. -### What should be tracked? +### What Should Be Tracked? System critical measurements that relate to the application/machine health, which are usually excellent alert candidates. Work with your engineering and devops peers to identify the metrics, but they may include: diff --git a/docs/observability/pitfalls.md b/docs/observability/pitfalls.md index e068bea4be..78a9894cf8 100644 --- a/docs/observability/pitfalls.md +++ b/docs/observability/pitfalls.md @@ -1,6 +1,6 @@ # Things to Watch for when Building Observable Systems -## Observability as an afterthought +## Observability as an Afterthought One of the design goals when building a system should be to enable monitoring of the system.
This helps in planning and thinking about application availability, logging, and metrics at the time of design and development. Observability also acts as a great debugging tool, providing developers a bird's-eye view of the system. By leaving instrumentation and metrics logging until the end, development teams lose valuable insights during development. @@ -20,4 +20,4 @@ All data logged must contain rich context, which is useful for getting an overal ## Personally Identifiable Information As a general rule, do not log any customer-sensitive or Personally Identifiable Information (PII). Ensure any pertinent privacy regulations are followed regarding PII (e.g., GDPR). -Read more [here](logs-privacy.md) on how to keep sensitive data out of logs. +Read more [here](./logs-privacy.md) on how to keep sensitive data out of logs. diff --git a/docs/observability/profiling.md b/docs/observability/profiling.md index 7dc49dce47..aa830435e5 100644 --- a/docs/observability/profiling.md +++ b/docs/observability/profiling.md @@ -10,11 +10,11 @@ Profiling is somewhat language dependent, so start off by searching for "profile Profiling does incur some cost, as it requires inspecting the call stack, and sometimes pausing the application altogether (i.e., to trigger a full GC in Java). It is recommended to continuously profile your services, say for 10s every 10 minutes. Consider the cost when deciding on tuning these parameters. -Different tools visualize profiles differently. Common CPU profiles might use a directed graph ![graph](images/pprof-dot.png) or a flame graph. ![flame](images/flame.png) +Different tools visualize profiles differently. Common CPU profiles might use a directed graph ![graph](./images/pprof-dot.png) or a flame graph. ![flame](./images/flame.png) Unfortunately, each profiler tool typically uses its own format for storing profiles, and comes with its own visualization.
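A quick way to see what a CPU profile contains is Python's built-in `cProfile`; the short sketch below profiles a deliberately slow function and prints the hottest call paths. This is illustrative only — a production service would use a continuous profiler such as the tools listed below rather than ad-hoc profiling.

```python
# Minimal on-demand CPU profiling sketch using Python's stdlib cProfile/pstats.
import cProfile
import io
import pstats

def busy_work(n: int) -> int:
    """A deliberately slow function so the profile has an obvious hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
busy_work(100_000)
profiler.disable()

# Render the profile sorted by cumulative time -- the same "hot path" that
# flame graphs and directed graphs visualize.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report.splitlines()[0])  # first line of the textual profile report
```

Sampling this way for a few seconds on a schedule approximates the "10s every 10 minutes" continuous-profiling pattern described above.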
-## Specific tools +## Tools - (Java, Go, Python, Ruby, eBPF) [Pyroscope](https://github.com/pyroscope-io/pyroscope) - continuous profiling out of the box. - (Java and Go) [Flame](https://github.com/VerizonMedia/kubectl-flame) - profiling containers in Kubernetes diff --git a/docs/observability/recipes-observability.md b/docs/observability/recipes-observability.md index 129d4387cb..2a18ea69fa 100644 --- a/docs/observability/recipes-observability.md +++ b/docs/observability/recipes-observability.md @@ -4,15 +4,15 @@ [GitHub Repo](https://github.com/Azure-Samples/application-insights-aspnet-sample-opentelemetry), [Article](https://devblogs.microsoft.com/aspnet/observability-asp-net-core-apps/). -## Application Insights/ASP.NET Core with distributed Trace Context propagation to Kafka +## Application Insights/ASP.NET Core with Distributed Trace Context Propagation to Kafka [GitHub Repo](https://github.com/MagdaPaj/application-insights-aspnet-sample-trace-context-propagation). -## Example: OpenTelemetry over a message oriented architecture in Java with Jaeger, Prometheus and Azure Monitor +## Example: OpenTelemetry Over a Message Oriented Architecture in Java with Jaeger, Prometheus and Azure Monitor [GitHub Repo](https://github.com/iamnicoj/OpenTelemetry-Async-Java-with-Jaeger-Prometheus-AzMonitor) -## Example: Setup Azure Monitor dashboards and alerts with Terraform +## Example: Setup Azure Monitor Dashboards and Alerts with Terraform [GitHub Repo](https://github.com/buzzfrog/azure-alert-dashboard-terraform) @@ -30,7 +30,7 @@ The [Azure DevOps Pipelines Report](https://github.com/Azure-Samples/powerbi-pipelines-report) This dashboard recipe provides observability for AzDO pipelines by displaying various metrics (e.g., average runtime, run outcome statistics) in a table. Additionally, the second page of the template visualizes pipeline success and failure trends using Power BI charts. Documentation and setup information can be found in the project README.
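Metrics like the average runtime and run outcome statistics surfaced by that dashboard can be computed from raw pipeline run records; a minimal sketch follows. The record shape and field names here are hypothetical — a real solution would pull run data from the Azure DevOps REST API.

```python
from statistics import mean

# Hypothetical pipeline run records; field names are illustrative only,
# not the Azure DevOps API schema.
runs = [
    {"pipeline": "ci", "duration_s": 310, "result": "succeeded"},
    {"pipeline": "ci", "duration_s": 295, "result": "succeeded"},
    {"pipeline": "ci", "duration_s": 402, "result": "failed"},
    {"pipeline": "cd", "duration_s": 120, "result": "succeeded"},
]

def pipeline_stats(records):
    """Aggregate average runtime and success rate per pipeline."""
    stats = {}
    for name in {r["pipeline"] for r in records}:
        group = [r for r in records if r["pipeline"] == name]
        stats[name] = {
            "avg_runtime_s": mean(r["duration_s"] for r in group),
            "success_rate": sum(r["result"] == "succeeded" for r in group) / len(group),
        }
    return stats

stats = pipeline_stats(runs)
```

The same aggregation is what a Power BI table or chart performs over the ingested run data.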
-## Python OpenTelemetry Examples +## Python Logger Class for Application Insights using OpenCensus The Azure SDK for Python contains an [Azure Monitor OpenTelemetry Distro client library for Python](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/monitor/azure-monitor-opentelemetry). You can view samples of how to use the library in this [GitHub Repo](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/monitor/azure-monitor-opentelemetry/samples). With this library you can easily collect traces, metrics, and logs. diff --git a/docs/observability/tools/KubernetesDashboards.md b/docs/observability/tools/KubernetesDashboards.md index 5341200342..dc16691b5d 100644 --- a/docs/observability/tools/KubernetesDashboards.md +++ b/docs/observability/tools/KubernetesDashboards.md @@ -33,6 +33,6 @@ There are currently several UI dashboards available to monitor your applications - [Lens](https://k8slens.dev/): Client-side desktop tool - [Thanos](https://github.com/thanos-io/thanos) and [Cortex](https://cortexmetrics.io/docs/): Multi-cluster implementations -## References +## Resources - [Alternatives to Kubernetes Dashboard](https://octopus.com/blog/alternative-kubernetes-dashboards) diff --git a/docs/observability/tools/OpenTelemetry.md b/docs/observability/tools/OpenTelemetry.md index 8c44c71dec..fc8a59f364 100644 --- a/docs/observability/tools/OpenTelemetry.md +++ b/docs/observability/tools/OpenTelemetry.md @@ -71,7 +71,7 @@ From the website: >Our goal is to provide a generally available, production quality release for the tracing data source across most OpenTelemetry components in the first half of 2021. Several components have already reached this milestone! We expect metrics to reach the same status in the second half of 2021 and are targeting logs in 2022.
-## What to watch out for +## What to Watch Out For As OpenTelemetry is a very recent project (the first GA version of some features was released in 2020), many features are still in beta, so due diligence needs to be done before using such features in production. Also, OpenTelemetry supports many popular languages, but features are not on par across all of them; some languages offer more features than others. It also needs to be called out that, since some features are not GA, there may be incompatibility issues with the tooling. That being said, OpenTelemetry is one of the most active projects of [CNCF](https://www.cncf.io), so it is expected that many more features will reach GA soon. @@ -82,10 +82,10 @@ Apart from the logging specification and implementation that are still marked as ## Integration Options with Azure Monitor ### Using the Azure Monitor OpenTelemetry Exporter Library - + This scenario uses the OpenTelemetry SDK as the core instrumentation library. Basically this means you will instrument your application using the OpenTelemetry libraries, but you will additionally use the Azure Monitor OpenTelemetry Exporter and add it as an exporter with the OpenTelemetry SDK. In this way, the OpenTelemetry traces your application creates will be pushed to your Azure Monitor instance. -### Using the Application Insights Agent Jar file - Java only +### Using the Application Insights Agent Jar File - Java Only Java OpenTelemetry instrumentation provides another way to integrate with Azure Monitor, by using the [Applications Insights Java Agent jar](https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-in-process-agent).
@@ -94,7 +94,7 @@ When configuring this option, the Applications Insights Agent file is added when OpenTelemetry Java Agent instrumentation supports many [libraries and frameworks and application servers](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md#supported-libraries-frameworks-application-servers-and-jvms). Application Insights Java Agent [enhances](https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-in-process-agent#auto-instrumentation) this list. Therefore, the main difference between running the OpenTelemetry Java Agent vs. the Application Insights Java Agent is the amount of telemetry logged in Azure Monitor. When running with the Application Insights Java agent, more telemetry is pushed to Azure Monitor. On the other hand, when running the solution in Application Insights agent mode, it is essential to highlight that nothing gets logged in Jaeger (or any other OpenTelemetry exporter). All traces will be pushed exclusively to Azure Monitor. However, both manual instrumentation done via the OpenTelemetry SDK and all automatic traces, dependencies, performance counters, and metrics instrumented by the Application Insights agent are sent to Azure Monitor. Although there is a rich amount of additional data automatically instrumented by the Application Insights agent, it is not necessarily OpenTelemetry compliant; only the traces logged by the manual instrumentation using the OpenTelemetry SDK are.
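Whichever agent pushes the telemetry, the individual spans are stitched into one end-to-end trace via the W3C trace-context `traceparent` header. A minimal stdlib sketch of generating and validating such a header follows — real code would rely on the OpenTelemetry SDK's propagators rather than hand-rolling this.

```python
import re
import secrets

def make_traceparent() -> str:
    """Build a W3C trace-context 'traceparent' header:
    version (00) - 16-byte trace-id - 8-byte parent/span-id - flags (01 = sampled)."""
    trace_id = secrets.token_hex(16)  # 32 hex chars
    span_id = secrets.token_hex(8)    # 16 hex chars
    return f"00-{trace_id}-{span_id}-01"

TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def parse_traceparent(header: str):
    """Split a traceparent header into (version, trace_id, span_id, flags), or None."""
    m = TRACEPARENT_RE.match(header)
    return m.groups() if m else None

header = make_traceparent()
version, trace_id, span_id, flags = parse_traceparent(header)
```

Storing such a header alongside a work-queue item is the essence of the manual trace context propagation discussed in this document.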
-#### OpenTelemetry vs Application Insights agents compared +#### OpenTelemetry vs Application Insights Agents Compared | Highlight | OpenTelemetry Agent | App Insights Agent | |--------------------------------------------------------------------------|---------------------|--------------------| @@ -116,34 +116,25 @@ Either way, instrumenting your code with OpenTelemetry seems the right approach Use the [Azure OpenTelemetry Tracing plugin library for Java](https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/core/azure-core-tracing-opentelemetry) to enable distributed tracing across Azure components through OpenTelemetry. -### Manual trace context propagation +### Manual Trace Context Propagation The trace context is stored in thread-local storage. When the application flow involves multiple threads (e.g., a multithreaded work queue or asynchronous processing), the traces won't get combined into one end-to-end trace chain with automatic [context propagation](https://opentelemetry.io/docs/concepts/signals/traces/#context-propagation). To achieve that, you need to manually propagate the trace context ([example in Java](https://opentelemetry.io/docs/instrumentation/java/manual/#context-propagation)) by storing the [trace headers](https://www.w3.org/TR/trace-context/#trace-context-http-headers-format) along with the work-queue item. -### Telemetry testing +### Telemetry Testing Mission-critical telemetry data should be covered by testing. You can test telemetry by mocking the telemetry collector web server. In an automated testing environment, the telemetry instrumentation can be configured to use the [OTLP exporter](https://opentelemetry.io/docs/reference/specification/protocol/exporter/) and point the [OTLP exporter endpoint](https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/autoconfigure/README.md#otlp-exporter-span-metric-and-log-exporters) to the collector web server. Using mock server libraries (e.g.,
MockServer or WireMock) can help verify the telemetry data pushed to the collector. -## References +## Resources * [OpenTelemetry Official Site](https://opentelemetry.io/) - * [Getting Started with dotnet and OpenTelemetry](https://opentelemetry.io/docs/languages/net/getting-started/) - * [Using OpenTelemetry Collector](https://opentelemetry.io/docs/collector/getting-started/) - * [OpenTelemetry Java SDK](https://github.com/open-telemetry/opentelemetry-java) - * [Manual Instrumentation](https://github.com/open-telemetry/opentelemetry-java-instrumentation#manually-instrumenting) - * [OpenTelemetry Instrumentation Agent for Java](https://github.com/open-telemetry/opentelemetry-java-instrumentation) - * [Application Insights Java Agent](https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-in-process-agent) - * [Azure Monitor OpenTelemetry Exporter client library for Java](https://github.com/Azure/azure-sdk-for-java/tree/3f31d68eed6fbe11516ca3afe3955c8840a6e974/sdk/monitor/azure-monitor-opentelemetry-exporter) - * [Azure OpenTelemetry Tracing plugin library for Java](https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/core/azure-core-tracing-opentelemetry) - * [Application Insights Agent's OpenTelemetry configuration](https://github.com/microsoft/ApplicationInsights-Java/wiki/OpenTelemetry-API-support-(3.0)) diff --git a/docs/observability/tools/Prometheus.md b/docs/observability/tools/Prometheus.md index 5005cc527b..ac6563b1ca 100644 --- a/docs/observability/tools/Prometheus.md +++ b/docs/observability/tools/Prometheus.md @@ -33,7 +33,7 @@ Prometheus' metrics format is supported by a wide array of tools and services in There are numerous [exporters](https://prometheus.io/docs/instrumenting/exporters/) which are used in exporting existing metrics from third-party databases, hardware, CI/CD tools, messaging systems, APIs and other monitoring systems. 
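Whatever exporter is used, the metrics ultimately surface in the Prometheus text exposition format. The sketch below renders a single counter in that format; it is illustrative only — real code would use an official client library such as `prometheus_client` instead of formatting strings by hand.

```python
def render_counter(name: str, help_text: str, value: float, labels: dict) -> str:
    """Render one counter in the Prometheus text exposition format:
    a # HELP line, a # TYPE line, then the sample with sorted labels."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} counter\n"
        f"{name}{{{label_str}}} {value}\n"
    )

sample = render_counter(
    "http_requests_total",
    "Total HTTP requests served.",
    1027,
    {"method": "get", "code": "200"},
)
print(sample)
```

This is the text a scrape of a `/metrics` endpoint returns, and the format the exporters above all emit.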
In addition to client libraries and exporters, there is a significant number of [integration points](https://prometheus.io/docs/operating/integrations/) for service discovery, remote storage, alerts and management. -## References +## Resources - [Prometheus Docs](https://prometheus.io/docs) - [Prometheus Best Practices](https://prometheus.io/docs/practices) diff --git a/docs/observability/tools/loki.md b/docs/observability/tools/loki.md index 259b0aa37c..5dc82046df 100644 --- a/docs/observability/tools/loki.md +++ b/docs/observability/tools/loki.md @@ -30,7 +30,7 @@ The main reason to use Loki instead of other log aggregation tools, is that Loki storage. It does that by following the same pattern as Prometheus, which indexes the labels and makes chunks of the log itself, using less space than just storing the raw logs. -## References +## Resources - [Loki Official Site](https://grafana.com/oss/loki/) - [Inserting logs into Loki](https://grafana.com/docs/loki/latest/getting-started/get-logs-into-loki/) diff --git a/docs/privacy/README.md b/docs/privacy/README.md index 64cf40fdf7..1978d4e74d 100644 --- a/docs/privacy/README.md +++ b/docs/privacy/README.md @@ -8,5 +8,5 @@ In general, developers working on [ISE](../ISE.md) projects should adhere to Mic The playbook currently contains two main parts: -1. [Privacy and Data](data-handling.md): Best practices for properly handling sensitive and private data. -2. [Privacy frameworks](privacy-frameworks.md): A list of frameworks which could be applied in private data scenarios. +1. [Privacy and Data](./data-handling.md): Best practices for properly handling sensitive and private data. +2. [Privacy frameworks](./privacy-frameworks.md): A list of frameworks which could be applied in private data scenarios.
diff --git a/docs/privacy/data-handling.md b/docs/privacy/data-handling.md index e381accc55..c5bc7dc6a3 100644 --- a/docs/privacy/data-handling.md +++ b/docs/privacy/data-handling.md @@ -18,30 +18,26 @@ Developers working on ISE projects should implement best practices and guidance - [Limited Data Protection Addendum](https://aka.ms/mpsldpa) - [Professional Services Data Protection Addendum](https://www.microsoft.com/licensing/docs/view/Microsoft-Products-and-Services-Data-Protection-Addendum-DPA) -## 5 W's of data handling +## 5 W's of Data Handling When working on an engagement it is important to address the following 5 **W**'s: - **Who** – gets access to and with whom will we share the data and/or models developed with the data? - - **What** – data is shared with us and under what expectations and understanding. Customers need to be explicit about how the data they share applies to the overarching effort. The understanding shouldn't be vague, and we shouldn't have access to a broad set of data if it is not necessary. - - **Where** – will the data be stored and what legal jurisdiction will preside over that data. This is particularly important in countries like Germany, where different privacy laws apply, but it is also important when it comes to responding to legal subpoenas for the data. - - **When** – will the access to data be provided and for how long? It is important not to leave lingering access to data once the engagement is completed, and to define the data retention policies a priori. - - **Why** – have you given access to the data? This is particularly important for clarifying the purpose and any restrictions on usage beyond the intended purpose. Please use the above guidelines to ensure the data is used only for intended purposes and thereby gain trust. It is important to be aware of data handling best practices and ensure the required clarity is provided to adhere to the above 5 W's.
-## Handling data in ISE engagements +## Handling Data in ISE Engagements Data should never leave customer-controlled environments. Contractors and/or other members in the engagement should never have access to complete customer data sets, but should instead use limited customer data sets, following these prioritized approaches: @@ -67,7 +63,7 @@ Customers should provide ISE with a copy of the requested data in a location man The customer should consider turning any logging capabilities on so they can clearly identify who has access and what they do with that access. ISE should notify the customer when they are done with the data and suggest the customer destroy copies of the data if they are no longer needed. -### Our guiding principles when handling data in an engagement +### Our Guiding Principles when Handling Data in an Engagement - Never directly access production data. - Explicitly state the intended purpose of data that can be used for engagement. @@ -77,7 +73,7 @@ the team should promptly work to clean up engagement copies of data. - Do not send any copies of the production data outside the customer-controlled environment. - Only use the minimal subset of the data needed for the purpose of the engagement. -### Questions to consider when working with data +### Questions to Consider when Working with Data - What data do we need? - What is the legal basis for processing this data?
-## Typical scenarios for leveraging a Privacy framework +## Typical Scenarios for Leveraging a Privacy Framework - Sharing data or results while preserving data subjects' privacy - Performing analysis or statistical modeling on private data - Developing privacy preserving ML models and data pipelines -## Privacy frameworks +## Privacy Frameworks Protecting private data involves the entire data lifecycle, from acquisition, through storage, processing, analysis, modeling and usage in reports or machine learning models. Proper safeguards and restrictions should be applied in each of these phases. @@ -22,19 +22,17 @@ We focus on four main use cases in the data lifecycle: 3. [Creating privacy preserving data and ML pipelines](#privacy-preserving-data-pipelines-and-ml) 4. [Data loss prevention](#data-loss-prevention) -### Obtaining non-sensitive data +### Obtaining Non-Sensitive Data In many scenarios, analysts, researchers and data scientists require access to a non-sensitive version or sample of the private data. In this section we focus on two approaches for obtaining non-sensitive data. **Note:** These two approaches do not guarantee that the outcome would not include private data, and additional measures should be applied. -#### Data de-identification +#### Data De-Identification -De-identification is the process of applying a set of transformations to a dataset, -in order to lower the risk of unintended disclosure of personal data. -De-identification involves the removal or substitution of both direct identifiers (such as name, or social security number) or quasi-identifiers, -which can be used for re-identification using additional external information. +De-identification is the process of applying a set of transformations to a dataset, in order to lower the risk of unintended disclosure of personal data. 
+De-identification involves the removal or substitution of both direct identifiers (such as name or social security number) and quasi-identifiers, which can be used for re-identification using additional external information. De-identification can be applied to different types of data, such as structured data, images and text. However, de-identification of non-structured data often involves statistical approaches which might result in undetected PII (Personally Identifiable Information) or non-private information being redacted or replaced. @@ -48,7 +46,7 @@ Here we outline several de-identification solutions available as open source: | [ARX](https://arx.deidentifier.org/) | Anonymization using statistical models, specifically k-anonymity, ℓ-diversity, t-closeness and δ-presence. Useful for validating the anonymization of aggregated data. Links: [Repo](https://github.com/arx-deidentifier/arx), [Website](https://arx.deidentifier.org/). Written in Java. | | [k-Anonymity](https://github.com/Nuclearstar/K-Anonymity) | GitHub repo with examples on how to produce k-anonymous datasets. k-anonymity protects the privacy of individual persons by pooling their attributes into groups of at least *k* people. [repo](https://github.com/Nuclearstar/K-Anonymity/blob/master/k-Anonymity.ipynb) | -#### Synthetic data generation +#### Synthetic Data Generation A synthetic dataset is a repository of data generated from actual data and has the same statistical properties as the real data. The degree to which a synthetic dataset is an accurate proxy for real data is a measure of utility. @@ -68,9 +66,9 @@ When determining the best method for creating synthetic data, it is essential fi | [Faker](https://github.com/joke2k/faker) | Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you.
| | [Plaitpy](https://github.com/plaitpy/plaitpy) | The idea behind plait.py is that it should be easy to model fake data that has an interesting shape. Currently, many fake data generators model their data as a collection of IID variables; with plait.py we can stitch together those variables into a more coherent model. | -### Trusted research and modeling environments +### Trusted Research and Modeling Environments -#### Trusted research environments +#### Trusted Research Environments Trusted Research Environments (TREs) enable organizations to create secure workspaces for analysts, data scientists and researchers who require access to sensitive data. @@ -86,7 +84,7 @@ We highlight several alternatives for Trusted Research Environments: | [Azure Trusted Research Environment](https://github.com/microsoft/azuretre) | An Open Source TRE for Azure. | | [Aridhia DRE](https://www.aridhia.com/) | | -#### Eyes-off machine learning +#### Eyes-Off Machine Learning In certain situations, Data Scientists may need to train models on data they are not allowed to see. In these cases, an "eyes-off" approach is recommended. An eyes-off approach provides a data scientist with an environment in which scripts can be run on the data but direct access to samples is not allowed. @@ -99,14 +97,14 @@ For example, a user would be able to submit a script which trains a model and in In addition to the eyes-off environment, this approach usually entails providing access to an "eyes-on" dataset, which is a representative, cleansed, sample set of data for model design purposes. The Eyes-on dataset is often a de-identified subset of the private dataset, or a synthetic dataset generated based on the characteristics of the private dataset. 
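Before releasing such an eyes-on sample, its de-identification can be sanity-checked. The sketch below verifies k-anonymity over a set of quasi-identifier columns; the record shape is hypothetical and illustrative only — tools like ARX provide rigorous validation of this property.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical de-identified "eyes-on" sample: ages bucketed into bands,
# ZIP codes truncated to three digits.
eyes_on = [
    {"age_band": "30-39", "zip3": "981", "diagnosis": "A"},
    {"age_band": "30-39", "zip3": "981", "diagnosis": "B"},
    {"age_band": "40-49", "zip3": "982", "diagnosis": "A"},
    {"age_band": "40-49", "zip3": "982", "diagnosis": "C"},
]

ok = is_k_anonymous(eyes_on, ["age_band", "zip3"], k=2)
```

Each (age band, zip3) pool covers at least two people here, so the sample is 2-anonymous with respect to those quasi-identifiers.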
-#### Private data sharing platforms +#### Private Data Sharing Platforms Various tools and systems allow different parties to share data with third parties while protecting private entities, and securely process data while reducing the likelihood of data exfiltration. These tools include [Secure Multi Party Computation (SMPC)](https://en.wikipedia.org/wiki/Secure_multi-party_computation) systems, [Homomorphic Encryption](#homomorphic-encryption) systems, [Confidential Computing](https://azure.microsoft.com/en-us/solutions/confidential-compute/), and private data analysis frameworks such as [PySyft](https://github.com/OpenMined/PySyft), among others. -### Privacy preserving data pipelines and ML +### Privacy Preserving Data Pipelines and ML Even when our data is secure, private entities can still be extracted when the data is consumed. Privacy preserving data pipelines and ML models focus on minimizing the risk of private data exfiltration during data querying or model predictions. @@ -141,7 +139,7 @@ Homomorphic Encryption frameworks: A list of additional OSS tools can be found [here](https://homomorphicencryption.org/introduction/). -#### Federated learning +#### Federated Learning Federated learning is a Machine Learning technique which allows the training of ML models in a decentralized way without having to share the actual data. Instead of sending data to the processing engine of the model, the approach is to distribute the model to the different data owners and perform training in a distributed fashion. @@ -154,7 +152,7 @@ Federated learning frameworks: | [FATE](https://fate.fedai.org/) | An OSS federated learning system with different options for deployment and different algorithms adapted for federated learning | | [IBM Federated Learning](https://github.com/IBM/federated-learning-lib) | A Python based federated learning framework focused on enterprise environments.
| -### Data loss prevention +### Data Loss Prevention Organizations have sensitive information under their control such as financial data, proprietary data, credit card numbers, health records, or social security numbers. To help protect this sensitive data and reduce risk, they need a way to prevent their users from inappropriately sharing it with people who shouldn't have it. @@ -162,7 +160,7 @@ This practice is called [data loss prevention (DLP)](https://learn.microsoft.com Below we focus on two aspects of DLP: Sensitive data classification and Access management. -#### Sensitive data classification +#### Sensitive Data Classification Sensitive data classification is an important aspect of DLP, as it allows organizations to track, monitor, secure and identify sensitive and private data. Furthermore, different sensitivity levels can be applied to different data items, facilitating proper governance and cataloging. @@ -190,7 +188,7 @@ Additional resources: - [Example guidelines for data classification](https://www.cmu.edu/iso/governance/guidelines/data-classification.html) - [Learn about sensitivity levels](https://learn.microsoft.com/en-us/microsoft-365/compliance/sensitivity-labels?view=o365-worldwide) -#### Access management +#### Access Management Access control is an important component of privacy by design and falls into overall data lifecycle protection. Successful access control will restrict access only to authorized individuals that should have access to data. diff --git a/docs/resources/templates/CONTRIBUTING.md b/docs/resources/templates/CONTRIBUTING.md deleted file mode 100644 index 70bc8024d2..0000000000 --- a/docs/resources/templates/CONTRIBUTING.md +++ /dev/null @@ -1,28 +0,0 @@ -# Contributing - -We love pull requests from everyone. 
By participating in this project, you -agree to abide by the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/) - -Fork, then clone the repo - -Make sure the tests pass - -Make your change. Add tests for your change. Make the tests pass - -Push to your fork and [submit a pull request][pr]. - -[pr]: https://github.com/xyz - -At this point you're waiting on us. We like to at least comment on pull requests -within three business days (and, typically, one business day). We may suggest -some changes or improvements or alternatives. - -Some things that will increase the chance that your pull request is accepted: - -* Write tests. -* Follow our [engineering playbook][playbook] and the [style guide][style] for this project. -* Write a [good commit message][commit]. - -[playbook]: https://github.com/cloudbeatsch/code-with-engineering-playbook -[style]: https://github.com/xyz -[commit]: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html diff --git a/docs/resources/templates/LICENSE b/docs/resources/templates/LICENSE deleted file mode 100644 index 1df9cf4020..0000000000 --- a/docs/resources/templates/LICENSE +++ /dev/null @@ -1,21 +0,0 @@ -MIT License - -Copyright (c) 2017 Microsoft Corporation - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. 
- -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. \ No newline at end of file diff --git a/docs/resources/templates/README.md b/docs/resources/templates/README.md deleted file mode 100644 index 12c2fb6a57..0000000000 --- a/docs/resources/templates/README.md +++ /dev/null @@ -1,17 +0,0 @@ - -# project-xyz - -Description of the project - -## Deploying to Azure - -## Getting started - -## Dependencies - -## Run it locally - -## Code of conduct - -By participating in this project, you -agree to abide by the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/) diff --git a/docs/security/README.md b/docs/security/README.md index 5eace3cbfa..2c98169d73 100644 --- a/docs/security/README.md +++ b/docs/security/README.md @@ -6,26 +6,26 @@ Developers working on projects should adhere to industry-recommended standard pr ## Requesting Security Reviews -When requesting a security review for your application, please make sure you have familiarized yourself with the [Rules of Engagement](rules-of-engagement.md). This will help you to prepare the application for testing, as well as understand the scope limits of the test. +When requesting a security review for your application, please make sure you have familiarized yourself with the [Rules of Engagement](./rules-of-engagement.md). This will help you to prepare the application for testing, as well as understand the scope limits of the test. 
-## Quick References +## Quick Resources - [Secure Coding Practices Quick Reference](https://owasp.org/www-pdf-archive/OWASP_SCP_Quick_Reference_Guide_v2.pdf) - [Web Application Security Quick Reference](https://owasp.org/www-pdf-archive//OWASP_Web_Application_Security_Quick_Reference_Guide_0.3.pdf) - [Security Mindset/Creating a Security Program Quick Start](https://github.com/OWASP/Quick-Start-Guide/blob/master/OWASP%20Quick%20Start%20Guide.pdf?raw=true) -- [Credential Scanning / Secret Detection](../continuous-integration/dev-sec-ops/secret-management/credential_scanning.md) +- [Credential Scanning / Secret Detection](../CI-CD/dev-sec-ops/secrets-management/credential_scanning.md) - [Threat Modelling](./threat-modelling.md) ## Azure DevOps Security - [Security Engineering DevSecOps Practices](https://www.microsoft.com/en-us/securityengineering/devsecops) - [Azure DevOps Data Protection Overview](https://learn.microsoft.com/en-us/azure/devops/organizations/security/data-protection?view=azure-devops) - [Security and Identity in Azure DevOps](https://learn.microsoft.com/en-us/azure/devops/organizations/security/about-security-identity?view=azure-devops) - [Security Code Analysis](https://secdevtools.azurewebsites.net/) ## DevSecOps -Introduce security to your project at early stages. The [DevSecOps section](../continuous-integration/dev-sec-ops/README.md) covers security practices, automation, tools and frameworks as part of the application CI. +Introduce security to your project at early stages. The [DevSecOps section](../CI-CD/dev-sec-ops/README.md) covers security practices, automation, tools and frameworks as part of the application CI. ## OWASP Cheat Sheets @@ -93,6 +93,6 @@ Check out the list of tools to help enable security in your projects.
- [cert-manager](https://github.com/jetstack/cert-manager) for easy certificate provisioning and automatic rotation. - [Quickly enable mTLS between your microservices with Linkerd](https://linkerd.io/2/features/automatic-mtls/). -## Useful links +## Resources - [Non-Functional Requirements Guidance](../design/design-patterns/non-functional-requirements-capture-guide.md) diff --git a/docs/security/rules-of-engagement.md b/docs/security/rules-of-engagement.md index 8919c0c230..215fa5f4b6 100644 --- a/docs/security/rules-of-engagement.md +++ b/docs/security/rules-of-engagement.md @@ -1,15 +1,15 @@ -# Rules of Engagement +# Application Security Analysis: Rules of Engagement When performing application security analysis, it is expected that the tester follow the *Rules of Engagement* as laid out below. This is to standardize the scope of application testing and provide a concrete awareness of what is considered "out of scope" for security analysis. -## Rules of Engagement - For those requesting review +## Rules of Engagement - For Those Requesting Review * Web Application Firewalls can be up and configured, but do not enable any automatic blocking. This can greatly slow down the person performing the test. * Similarly, if a service is running on a virtual machine, ensure services such as `fail2ban` are disabled. * You cannot make changes to the running application until the test is complete. This is to prevent accidentally breaking an otherwise valid attack in progress. * Any review results are not considered as "final". A security review should always be performed by a security team orchestrated by the customer prior to moving an application into production. If a customer requires further assistance, they can engage Premier Support. -## Rules of Engagement - For those performing tests +## Rules of Engagement - For Those Performing Tests * Do not attempt to perform Denial-of-Service attacks or otherwise crash services. 
Heavy active scanning is tolerated (and is assumed to be somewhat of a load test) but deliberate takedowns are not permitted. * Do not interact with human beings. Phishing credentials or other such client-side attacks are off-limits. Detailing XSS and similar attacks is encouraged as a part of the test, but do not leverage these against internal users or customers. diff --git a/docs/security/threat-modelling-example.md b/docs/security/threat-modelling-example.md index 6e58978e6a..a7b9ea2489 100644 --- a/docs/security/threat-modelling-example.md +++ b/docs/security/threat-modelling-example.md @@ -1,11 +1,11 @@ -# Overview +# Threat Modelling Example -This document covers the threat models for a sample project which takes video frames from video camera and process these frames on IoTEdge device and send them to Azure Cognitive Service to get the audio output. +This document covers the threat models for a sample project which takes video frames from a video camera, processes these frames on an IoT Edge device, and sends them to Azure Cognitive Services to get the audio output. These models can be considered a reference template that shows how to construct a threat modeling document. Each of the labeled entities in the figures below are accompanied by meta-information which describe the threats, recommended mitigations, and the associated [security principle or goal](#security-principles).
## Architecture Diagram -![Graphical user interface, application Description automatically generated](images/arch_diagram.png) +![Graphical user interface, application Description automatically generated](./images/arch_diagram.png) ## Assets @@ -34,7 +34,7 @@ This document covers the threat models for a sample project which takes video fr ## Threat List -![Diagram Description automatically generated](images/threat_list.png) +![Diagram Description automatically generated](./images/threat_list.png) ## Assumptions @@ -59,7 +59,7 @@ This document covers the threat models for a sample project which takes video fr ## Threat Model -![A picture containing text, map, indoor Description automatically generated](images/threat_model.png) +![A picture containing text, map, indoor Description automatically generated](./images/threat_model.png) ## Threat Properties diff --git a/docs/security/threat-modelling.md b/docs/security/threat-modelling.md index 7b5603fca5..2f38c72b80 100644 --- a/docs/security/threat-modelling.md +++ b/docs/security/threat-modelling.md @@ -4,23 +4,23 @@ Threat modeling is an effective way to help secure your systems, applications, n ## Threat Modeling Phases -1. *Diagram* +1. *Diagram* Capture all requirements for your system and create a data-flow diagram -2. *Identify* - Apply a threat-modeling framework to the data-flow diagram and find potential security issues. Here we can use [STRIDE framework](https://learn.microsoft.com/en-us/training/modules/tm-use-a-framework-to-identify-threats-and-find-ways-to-reduce-or-eliminate-risk/1b-threat-modeling-framework) to identify the threats. -3. *Mitigate* - Decide how to approach each issue with the appropriate combination of security controls. -4. *Validate* +2. *Identify* + Apply a threat-modeling framework to the data-flow diagram and find potential security issues. 
Here we can use [STRIDE framework](https://learn.microsoft.com/en-us/training/modules/tm-use-a-framework-to-identify-threats-and-find-ways-to-reduce-or-eliminate-risk/1b-threat-modeling-framework) to identify the threats. +3. *Mitigate* + Decide how to approach each issue with the appropriate combination of security controls. +4. *Validate* Verify requirements are met, issues are found, and security controls are implemented. -Example of these phases is covered in the [threat modelling example.](./threat-modelling-example.md) +Example of these phases is covered in the [threat modelling example.](./threat-modelling-example.md) More details about these phases can be found at [Threat Modeling Security Fundamentals.](https://learn.microsoft.com/en-us/training/paths/tm-threat-modeling-fundamentals/) ## Threat Modeling Example [Here is an example](./threat-modelling-example.md) of a threat modeling document which talks about the architecture and different phases involved in the threat modeling. This document can be used as reference template for creating threat modeling documents. -## References +## Resources * [Threat Modeling](https://www.microsoft.com/en-us/securityengineering/sdl/threatmodeling) * [Microsoft Threat Modeling Tool](https://learn.microsoft.com/en-us/azure/security/develop/threat-modeling-tool) diff --git a/docs/source-control/README.md b/docs/source-control/README.md index 0d9676da93..26ca35e681 100644 --- a/docs/source-control/README.md +++ b/docs/source-control/README.md @@ -2,14 +2,6 @@ There are many options when working with Source Control. In [ISE](../ISE.md) we use [AzureDevOps](https://azure.microsoft.com/en-us/services/devops/) for private repositories and [GitHub](https://github.com/) for public repositories. 
-## Sections within Source Control - -* [Merge Strategies](merge-strategies.md) -* [Branch Naming](naming-branches.md) -* [Versioning](component-versioning.md) -* [Working with Secrets](secrets-management.md) -* [Git Guidance](git-guidance/README.md) - ## Goal * Following industry best practice to work in geo-distributed teams which encourage contributions from all across [ISE](../ISE.md) as well as the broader OSS community @@ -18,23 +10,23 @@ There are many options when working with Source Control. In [ISE](../ISE.md) we ## General Guidance -Consistency is important, so agree to the approach as a team before starting to code. Treat this as a design decision, so include a design proposal and review, in the same way as you would document all design decisions (see [Working Agreements](../agile-development/advanced-topics/team-agreements/working-agreements.md) and [Design Reviews](../design/design-reviews/README.md)). +Consistency is important, so agree to the approach as a team before starting to code. Treat this as a design decision, so include a design proposal and review, in the same way as you would document all design decisions (see [Working Agreements](../agile-development/team-agreements/working-agreement.md) and [Design Reviews](../design/design-reviews/README.md)). -## Creating a new repository +## Creating a New Repository When creating a new repository, the team should at least do the following * Agree on the **branch**, **release** and **merge strategy** -* Define the merge strategy ([linear or non-linear](merge-strategies.md)) +* Define the merge strategy ([linear or non-linear](./merge-strategies.md)) * Lock the default branch and merge using [pull requests (PRs)](../code-reviews/pull-requests.md) -* Agree on [branch naming](naming-branches.md) (e.g. `user/your_alias/feature_name`) +* Agree on [branch naming](./naming-branches.md) (e.g. 
`user/your_alias/feature_name`) * Establish [branch/PR policies](../code-reviews/pull-requests.md) * For public repositories the default branch should contain the following files: - * [LICENSE](../resources/templates/LICENSE) - * [README.md](../resources/templates/README.md) - * [CONTRIBUTING.md](../resources/templates/CONTRIBUTING.md) + * LICENSE + * README.md + * contributing.md -## Contributing to an existing repository +## Contributing to an Existing Repository When working on an existing project, `git clone` the repository and ensure you understand the team's branch, merge and release strategy (e.g. through the projects [CONTRIBUTING.md file](https://blog.github.com/2012-09-17-contributing-guidelines/)). diff --git a/docs/source-control/component-versioning.md b/docs/source-control/component-versioning.md index 9dcb9855b9..c0f069207a 100644 --- a/docs/source-control/component-versioning.md +++ b/docs/source-control/component-versioning.md @@ -6,7 +6,7 @@ Larger applications consist of multiple components that reference each other and To achieve the goal of loosely coupled applications, each component should be versioned independently hence allowing developers to detect breaking changes or seamless updates just by looking at the version number. -## Version Numbers and Versioning schemes +## Version Numbers and Versioning Schemes For developers or other components to detect breaking changes the version number of a component is important. @@ -58,13 +58,13 @@ Version updates happen through: * Branch names (e.g. develop, release/..) for Alpha / Beta / RC * Otherwise: Number of commits (+12, ...) -## Semantic Versioning within a Monorepo +## Semantic Versioning Within a Monorepo A monorepo, short for "monolithic repository", is a software development practice where multiple related projects, components, or modules are stored within a single version-controlled repository as opposed to maintaining them in separate repositories. 
-### Challenges with Versioning in a monorepo structure +### Challenges with Versioning in a Monorepo Structure Versioning in a monorepo involves making decisions about how to assign version numbers to different projects and components contained within the repository. @@ -73,7 +73,7 @@ Assigning a single version number to all projects in a monorepo can lead to freq Ideally, we would want each project within the monorepo to have its own version number. Changes in one project shouldn't necessarily trigger version changes in others. This strategy allows projects to evolve at their own pace, without forcing all projects to adopt the same version number. It aligns well with the differing release cadences of distinct projects. -### semantic-release package for versioning +### semantic-release Package for Versioning [semantic-release](https://github.com/semantic-release/semantic-release) simplifies the entire process of releasing a package, which encompasses tasks such as identifying the upcoming version number, producing release notes, and distributing the package. This process severs the direct link between human sentiments and version identifiers. Instead, it rigorously adheres to the Semantic Versioning standards and effectively conveys the significance of alterations to end users. @@ -102,7 +102,7 @@ In order to avoid version collisions, generated git tags are namespaced using the ![monorepo-git-tags](./assets/monorepo-git-tags.png) -### semantic-release configurations +### semantic-release Configurations `semantic-release`’s options, mode and plugins can be set via either: diff --git a/docs/source-control/git-guidance/README.md b/docs/source-control/git-guidance/README.md index f198c45275..027c4630c3 100644 --- a/docs/source-control/git-guidance/README.md +++ b/docs/source-control/git-guidance/README.md @@ -6,7 +6,7 @@ Git is a distributed version control system.
This means that - unlike SVN or CVS For example: -```plain +```sh repo 1: A -> B -> C -> D -> HEAD repo 2: A -> B -> HEAD repo 3: X -> Y -> Z -> HEAD @@ -27,7 +27,7 @@ A recommended installation is the [Git Lens extension for Visual Studio Code](ht You can use these commands as well to configure your Git for Visual Studio Code as an editor for merge conflicts and diff tool. -```cmd +```sh git config --global user.name [YOUR FIRST AND LAST NAME] git config --global user.email [YOUR E-MAIL ADDRESS] @@ -38,11 +38,11 @@ git config --global diff.tool vscode git config --global difftool.vscode.cmd "code --wait --diff $LOCAL $REMOTE" ``` -## Basic workflow +## Basic Workflow A basic Git workflow is as follows; you can find more information on the specific steps below. -```cmd +```sh # pull the latest changes git pull @@ -72,7 +72,7 @@ git push --set-upstream origin feature/123-add-git-instructions Whenever you want to make a change to a repository, you need to first clone it. Cloning a repository pulls down a full copy of all the repository data, so that you can work on it locally. This copy includes all versions of every file and folder for the project. -```cmd +```sh git clone https://github.com/username/repo-name ``` @@ -84,7 +84,7 @@ To avoid adding code that has not been peer reviewed to the main branch (ex. `de Pull the latest changes and create a new branch for your work based on the trunk (in this case `develop`). -```cmd +```sh git pull git checkout -b feature/feature-name develop ``` @@ -96,10 +96,9 @@ At any point, you can move between the branches with `git checkout ` as To avoid losing work, it is good to commit often in small chunks. This allows you to revert only the last changes if you discover a problem and also neatly explains exactly what changes were made and why. 1. Make changes to your branch - 2. 
Check what files were changed - ```cmd + ```sh > git status On branch feature/271-basic-commit-info Changes not staged for commit: @@ -110,19 +109,19 @@ To avoid losing work, it is good to commit often in small chunks. This allows yo 3. Track the files you wish to include in the commit. To track all modified files: - ```cmd + ```sh git add --all ``` Or to track only specific files: - ```cmd + ```sh git add source-control/git-guidance/README.md ``` 4. Commit the changes to your local branch with a descriptive [commit message](#commit-best-practices) - ```cmd + ```sh git commit -m "add basic git instructions" ``` @@ -130,13 +129,13 @@ To avoid losing work, it is good to commit often in small chunks. This allows yo When you are done working, push your changes to a branch in the remote repository using: -```cmd +```sh git push ``` The first time you push, you first need to set an upstream branch as follows. After the first push, the --set-upstream parameter and branch name are not needed anymore. -```cmd +```sh git push --set-upstream origin feature/feature-name ``` @@ -152,7 +151,7 @@ The Pull Request (PR) process in [Azure DevOps](https://learn.microsoft.com/en-u If multiple people make changes to the same files, you may need to resolve any conflicts that have occurred before you can merge. -```cmd +```sh # check out the develop branch and get the latest changes git checkout develop git pull @@ -189,7 +188,7 @@ And here is another line that is cleanly resolved or unmodified When this process is completed, make sure you test the result by executing build, checks, test to validate this merged result. -```cmd +```sh # conclude the merge git merge --continue @@ -202,11 +201,11 @@ git push If no other conflicts appear, the PR can now be merged, and your branch deleted. Use `squash` to reduce your changes into a single commit, so the commit history can be within an acceptable size. 
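The conflict-resolution flow above can be rehearsed safely before it matters on a real PR. Below is a minimal sketch in a throwaway repository — the user identity, the `develop` and `feature/conflict-demo` branch names, and the `notes.txt` file are all illustrative, not part of the playbook:

```sh
# Rehearse a merge conflict end-to-end in a disposable repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b develop
git config user.name demo
git config user.email demo@example.com
echo "line one" > notes.txt
git add notes.txt
git commit -q -m "initial commit"
git checkout -q -b feature/conflict-demo
echo "feature version" > notes.txt
git commit -q -am "edit notes on the feature branch"
git checkout -q develop
echo "develop version" > notes.txt
git commit -q -am "edit notes on develop"
# This merge stops with conflict markers (<<<<<<< ======= >>>>>>>) in notes.txt
git merge feature/conflict-demo || true
# Resolve by writing the intended final content, stage it, then conclude the merge
echo "merged version" > notes.txt
git add notes.txt
GIT_EDITOR=true git merge --continue
```

If `git merge --continue` is unavailable in an older Git, a plain `git commit` concludes the merge the same way.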
-### Stashing changes +### Stashing Changes `git stash` is super handy if you have un-committed changes in your working directory, but you want to work on a different branch. You can run `git stash`, save the un-committed work, and revert to the HEAD commit. You can retrieve the saved changes by running `git stash pop`: -```cmd +```sh git stash … git stash pop @@ -214,21 +213,21 @@ git stash pop Or you can move the current state into a new branch: -```cmd +```sh git stash branch ``` -### Recovering lost commits +### Recovering Lost Commits If you "lost" a commit that you want to return to, for example to revert a `git rebase` where your commits got squashed, you can use `git reflog` to find the commit: -```cmd +```sh git reflog ``` Then you can use the reflog reference (`HEAD@{}`) to reset to a specific commit before the rebase: -```cmd +```sh git reset HEAD@{2} ``` @@ -245,7 +244,7 @@ A commit combines changes into a logical unit. Adding a descriptive commit messa You can specify the default git editor, which allows you to write your commit messages using your favorite editor. The following command makes Visual Studio Code your default git editor: -```bash +```sh git config --global core.editor "code --wait" ``` @@ -271,26 +270,26 @@ For more information on commit message conventions, see: * [Information in commit messages](https://wiki.openstack.org/wiki/GitCommitMessages#Information_in_commit_messages) * [On commit messages](http://who-t.blogspot.com/2009/12/on-commit-messages.html) -## Managing remotes +## Managing Remotes A local git repository can have one or more backing remote repositories. 
You can list the remote repositories using `git remote` - by default, the remote repository you cloned from will be called origin -```cmd +```sh > git remote -v origin https://github.com/microsoft/code-with-engineering-playbook.git (fetch) origin https://github.com/microsoft/code-with-engineering-playbook.git (push) ``` -### Working with forks +### Working with Forks You can set multiple remotes. This is useful for example if you want to work with a forked version of the repository. For more info on how to set upstream remotes and syncing repositories when working with forks see GitHub's [Working with forks documentation](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/working-with-forks). -### Updating the remote if a repository changes names +### Updating the Remote if a Repository Changes Names If the repository is changed in some way, for example a name change, or if you want to switch between HTTPS and SSH you need to update the remote -```cmd +```sh # list the existing remotes > git remote -v origin https://hostname/username/repository-name.git (fetch) @@ -305,9 +304,9 @@ origin https://hostname/username/new-repository-name.git (fetch) origin https://hostname/username/new-repository-name.git (push) ``` -## Rolling back changes +## Rolling Back Changes -### Reverting and deleting commits +### Reverting and Deleting Commits To "undo" a commit, run the following two commands: `git revert` and `git reset`. `git revert` creates a new commit that undoes commits while `git reset` allows deleting commits entirely from the commit history. @@ -341,7 +340,7 @@ The above command will open an interactive session in an editor (for example vim Running rebase will locally modify the history, after this one can use `force` to push the changes to remote without the deleted commit. 
-## Using submodules +## Using Submodules Submodules can be useful in more complex deployment and/or development scenarios @@ -360,21 +359,19 @@ git submodule foreach git checkout master git submodule foreach git pull origin ``` -## Working with images, video and other binary content +## Working with Images, Video and Other Binary Content Avoid committing frequently changed binary files, such as large images, video or compiled code to your git repository. Binary content is not diffed like text content, so cloning or pulling from the repository may pull each revision of the binary file. -One solution to this problem is `Git LFS (Git Large File Storage)` - an open source Git extension for versioning large files. You can find more information on Git LFS in the [Git LFS and VFS document](git-lfs-and-vfs.md). +One solution to this problem is `Git LFS (Git Large File Storage)` - an open source Git extension for versioning large files. You can find more information on Git LFS in the [Git LFS and VFS document](./git-lfs-and-vfs.md). -## Working with large repositories +## Working with Large Repositories -When working with a very large repository of which you don't require all the files, you can use `VFS for Git` - an open source Git extension that virtualize the file system beneath your Git repository, so that you seem to work in a regular working directory but while VFS for Git only downloads objects as they are needed. You can find more information on VFS for Git in the [Git LFS and VFS document](git-lfs-and-vfs.md). +When working with a very large repository of which you don't require all the files, you can use `VFS for Git` - an open source Git extension that virtualizes the file system beneath your Git repository, so that you seem to work in a regular working directory while VFS for Git downloads objects only as they are needed. You can find more information on VFS for Git in the [Git LFS and VFS document](./git-lfs-and-vfs.md).
## Tools * Visual Studio Code is a cross-platform powerful source code editor with built in git commands. Within Visual Studio Code editor you can review diffs, stage changes, make commits, pull and push to your git repositories. You can refer to [Visual Studio Code Git Support](https://code.visualstudio.com/docs/editor/versioncontrol#_git-support) for documentation. - * Use a shell/terminal to work with Git commands instead of relying on [GUI clients](https://git-scm.com/downloads/guis/). - * If you're working on Windows, [posh-git](https://github.com/dahlbyk/posh-git) is a great PowerShell environment for Git. Another option is to use [Git bash for Windows](http://www.techoism.com/how-to-install-git-bash-on-windows/). On Linux/Mac, install git and use your favorite shell/terminal. diff --git a/docs/source-control/git-guidance/git-lfs-and-vfs.md b/docs/source-control/git-guidance/git-lfs-and-vfs.md index 5b1bbe3ba6..afef82c02b 100644 --- a/docs/source-control/git-guidance/git-lfs-and-vfs.md +++ b/docs/source-control/git-guidance/git-lfs-and-vfs.md @@ -1,4 +1,4 @@ -# Using Git LFS and VFS for Git introduction +# Using Git LFS and VFS for Git Introduction **Git LFS** and **VFS for Git** are solutions for using Git with (large) binary files and large source trees. @@ -59,7 +59,7 @@ With these commands a `.gitattribute` file is created which contains these setti From here on you just use the standard Git commands to work in the repository. The rest will be handled by Git and Git LFS. -### Common LFS commands +### Common LFS Commands Install Git LFS @@ -147,7 +147,7 @@ gvfs unmount This will stop the process and unregister it, after that you can safely remove the folder. 
-### References +### Resources * [Git LFS getting started](https://git-lfs.github.com/) * [Git LFS manual](https://github.com/git-lfs/git-lfs/tree/master/docs) diff --git a/docs/source-control/merge-strategies.md b/docs/source-control/merge-strategies.md index 14616feffe..1cd12f2b0f 100644 --- a/docs/source-control/merge-strategies.md +++ b/docs/source-control/merge-strategies.md @@ -1,15 +1,15 @@ -# Merge strategies +# Merge Strategies Agree if you want a linear or non-linear commit history. There are pros and cons to both approaches: * Pro linear: [Avoid messy git history, use linear history](https://dev.to/bladesensei/avoid-messy-git-history-3g26) * Con linear: [Why you should stop using Git rebase](https://medium.com/@fredrikmorken/why-you-should-stop-using-git-rebase-5552bee4fed1) -## Approach for non-linear commit history +## Approach for Non-Linear Commit History Merging `topic` into `main` -```md +```sh A---B---C topic / \ D---E---F---G---H main @@ -19,9 +19,9 @@ git checkout main git merge topic ``` -## Two approaches to achieve a linear commit history +## Two Approaches to Achieve a Linear Commit History -### Rebase topic branch before merging into main +### Rebase Topic Branch Before Merging into Main Before merging `topic` into `main`, we rebase `topic` with the `main` branch: @@ -38,7 +38,7 @@ git rebase origin/main Create a PR topic --> main in Azure DevOps and approve using the squash merge option -### Rebase topic branch before squash merge into main +### Rebase Topic Branch Before Squash Merge into Main [Squash merging](https://learn.microsoft.com/en-us/azure/devops/repos/git/merging-with-squash?view=azure-devops) is a merge option that allows you to condense the Git history of topic branches when you complete a pull request. Instead of adding each commit on `topic` to the history of `main`, a squash merge takes all the file changes and adds them to a single new commit on `main`. 
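The squash behaviour described above is easy to verify locally: `git merge --squash` stages the combined changes from the topic branch without recording a merge, so a single ordinary commit lands on `main`. A sketch in a disposable repository — branch, file and user names are illustrative:

```sh
# Sketch: squash-merging a topic branch so main gains one new commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "initial commit"
git checkout -q -b topic
echo "step 1" > feature.txt
git add feature.txt
git commit -q -m "start feature"
echo "step 2" >> feature.txt
git commit -q -am "finish feature"
git checkout -q main
# --squash stages the combined changes from topic; it creates no commit itself
git merge --squash topic >/dev/null
git commit -q -m "add feature (squashed from topic)"
```

Because the squash result is an ordinary commit, the history on `main` stays linear (no merge commit), and the `topic` branch can then be deleted.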
diff --git a/docs/source-control/naming-branches.md b/docs/source-control/naming-branches.md index 532b418dd0..792bd28b14 100644 --- a/docs/source-control/naming-branches.md +++ b/docs/source-control/naming-branches.md @@ -1,4 +1,4 @@ -# Naming branches +# Naming Branches When contributing to existing projects, look for and stick with the agreed branch naming convention. In open source projects this information is typically found in the contributing instructions, often in a file named `CONTRIBUTING.md`. @@ -6,13 +6,13 @@ In the beginning of a new project the team agrees on the project conventions inc Here's an example of a branch naming convention: -```plaintext +```sh <user alias>/[feature/bug/hotfix]/<work item ID>_<title> ``` Which could translate to something as follows: -```plaintext +```sh dickinson/feature/271_add_more_cowbell ``` diff --git a/docs/source-control/secrets-management.md b/docs/source-control/secrets-management.md index d1ffcc91e9..685f7ccadd 100644 --- a/docs/source-control/secrets-management.md +++ b/docs/source-control/secrets-management.md @@ -8,6 +8,6 @@ E.g. the following pattern will exclude all files with the extension `.private.c *.private.config ``` -For more details on proper management of credentials and secrets in source control, and handling an accidental commit of secrets to source control, please refer to the [Secrets Management](../continuous-delivery/secrets-management/README.md) document which has further information, split by language as well. +For more details on proper management of credentials and secrets in source control, and handling an accidental commit of secrets to source control, please refer to the [Secrets Management](../CI-CD/dev-sec-ops/secrets-management/README.md) document which has further information, split by language as well. -As an extra security measure, apply [credential scanning](../continuous-integration/dev-sec-ops/secret-management/credential_scanning.md) in your CI/CD pipeline.
+As an extra security measure, apply [credential scanning](../CI-CD/dev-sec-ops/secrets-management/credential_scanning.md) in your CI/CD pipeline. diff --git a/docs/SPRINT-STRUCTURE.md b/docs/the-first-week-of-an-ise-project.md similarity index 64% rename from docs/SPRINT-STRUCTURE.md rename to docs/the-first-week-of-an-ise-project.md index 4433a1dd42..64629d547b 100644 --- a/docs/SPRINT-STRUCTURE.md +++ b/docs/the-first-week-of-an-ise-project.md @@ -1,4 +1,4 @@ -# Structure of a Sprint +# The First Week of an ISE Project The purpose of this document is to: @@ -6,26 +6,25 @@ The purpose of this document is to: - Provide content in a logical structure which reflects the engineering process - Extensible hierarchy to allow teams to share deep subject-matter expertise -## The first week of an ISE Project -### Before starting the project +## Before Starting the Project - [ ] Discuss and start writing the Team Agreements. Update these documents with any process decisions made throughout the project - - [Working Agreement](agile-development/advanced-topics/team-agreements/working-agreements.md) - - [Definition of Ready](agile-development/advanced-topics/team-agreements/definition-of-ready.md) - - [Definition of Done](agile-development/advanced-topics/team-agreements/definition-of-done.md) - - [Estimation](agile-development/basics/ceremonies.md#estimation) + - [Working Agreement](agile-development/team-agreements/working-agreement.md) + - [Definition of Ready](agile-development/team-agreements/definition-of-ready.md) + - [Definition of Done](agile-development/team-agreements/definition-of-done.md) + - [Estimation](agile-development/ceremonies.md#estimation) - [ ] [Set up the repository/repositories](source-control/README.md#creating-a-new-repository) - Decide on repository structure/s - - Add [README.md](resources/templates/README.md), [LICENSE](resources/templates/LICENSE), [CONTRIBUTING.md](resources/templates/CONTRIBUTING.md), .gitignore, etc + - Add README.md, LICENSE, 
CONTRIBUTING.md, .gitignore, etc - [ ] [Build a Product Backlog](agile-development/advanced-topics/backlog-management) - Set up a project in your chosen project management tool (ex. Azure DevOps) - [INVEST](https://en.wikipedia.org/wiki/INVEST_(mnemonic)) in good User Stories and Acceptance Criteria - [Non-Functional Requirements Guidance](design/design-patterns/non-functional-requirements-capture-guide.md) -### Day 1 +## Day 1 -- [ ] [Plan the first sprint](agile-development/basics/ceremonies.md#sprint-planning) +- [ ] [Plan the first sprint](agile-development/ceremonies.md#sprint-planning) - Agree on a sprint goal, and how to measure the sprint progress - Determine team capacity - Assign user stories to the sprint and split user stories into tasks @@ -35,41 +34,41 @@ The purpose of this document is to: - Agree on how to separate unit tests from integration, load and smoke tests - Design the first test cases - [ ] [Decide on branch naming](source-control/naming-branches.md) -- [ ] [Discuss security needs and verify that secrets are kept out of source control](continuous-delivery/gitops/secret-management/azure-devops-secret-management-per-branch.md) +- [ ] [Discuss security needs and verify that secrets are kept out of source control](./CI-CD/dev-sec-ops/secrets-management/README.md) -### Day 2 +## Day 2 - [ ] [Set up Source Control](source-control/README.md) - Agree on [best practices for commits](source-control/git-guidance/README.md#commit-best-practices) -- [ ] [Set up basic Continuous Integration with linters and automated tests](continuous-integration/README.md) -- [ ] [Set up meetings for Daily Stand-ups and decide on a Process Lead](agile-development/basics/ceremonies.md#stand-up) + - [ ] [Set up basic Continuous Integration with linters and automated tests](./CI-CD/continuous-integration.md) + - [ ] [Set up meetings for Daily Stand-ups and decide on a Process Lead](agile-development/ceremonies.md#stand-up) - Discuss purpose, goals, participants and 
facilitation guidance - Discuss timing, and how to run an efficient stand-up - [ ] [If the project has sub-teams, set up a Scrum of Scrums](agile-development/advanced-topics/effective-organization/scrum-of-scrums.md) -### Day 3 +## Day 3 - [ ] [Agree on code style](code-reviews/README.md) and on [how to assign Pull Requests](code-reviews/pull-requests.md) -- [ ] [Set up Build Validation for Pull Requests (2 reviewers, linters, automated tests)](code-reviews/README.md) and agree on [Definition of Done](agile-development/advanced-topics/team-agreements/definition-of-done.md) -- [ ] [Agree on a Code Merging strategy](source-control/merge-strategies.md) and update the [CONTRIBUTING.md](resources/templates/CONTRIBUTING.md) +- [ ] [Set up Build Validation for Pull Requests (2 reviewers, linters, automated tests)](code-reviews/README.md) and agree on [Definition of Done](agile-development/team-agreements/definition-of-done.md) +- [ ] [Agree on a Code Merging strategy](source-control/merge-strategies.md) and update the CONTRIBUTING.md - [ ] [Agree on logging and observability frameworks and strategies](observability/README.md) -### Day 4 +## Day 4 -- [ ] [Set up Continuous Deployment](continuous-delivery/README.md) +- [ ] [Set up Continuous Deployment](./CI-CD/continuous-delivery.md) - Determine what environments are appropriate for this solution - For each environment discuss purpose, when deployment should trigger, pre-deployment approvers, sign-off for promotion.
- [ ] [Decide on a versioning strategy](source-control/component-versioning.md) - [ ] Agree on how to [Design a feature and conduct a Design Review](design/design-reviews/README.md) -### Day 5 +## Day 5 -- [ ] Conduct a [Sprint Demo](agile-development/basics/ceremonies.md#sprint-demo) -- [ ] Conduct a [Retrospective](agile-development/basics/ceremonies.md#retrospectives) +- [ ] Conduct a [Sprint Demo](agile-development/ceremonies.md#sprint-demo) +- [ ] Conduct a [Retrospective](agile-development/ceremonies.md#retrospectives) - Determine required participants, how to capture input (tools) and outcome - Set a timeline, and discuss facilitation, meeting structure etc. - [ ] [Refine the Backlog](agile-development/advanced-topics/backlog-management) - Determine required participants - - Update the [Definition of Ready](agile-development/advanced-topics/team-agreements/definition-of-ready.md) - - Update estimates, and the [Estimation](agile-development/basics/ceremonies.md#estimation) document + - Update the [Definition of Ready](agile-development/team-agreements/definition-of-ready.md) + - Update estimates, and the [Estimation](agile-development/ceremonies.md#estimation) document - [ ] [Submit Engineering Feedback for issues encountered](engineering-feedback/README.md) diff --git a/docs/user-interface-engineering/stability.md b/docs/user-interface-engineering/stability.md deleted file mode 100644 index 89a58395ad..0000000000 --- a/docs/user-interface-engineering/stability.md +++ /dev/null @@ -1,3 +0,0 @@ -# Stability - -> Coming soon! 
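The Day 4 item "Decide on a versioning strategy" is often resolved with annotated semver tags. A hedged sketch of that option in a throwaway repository — the version number and messages are illustrative, not something the playbook prescribes:

```shell
# Sketch: one possible versioning strategy — annotated semver git tags.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name "Demo"
git commit -q --allow-empty -m "initial"
git tag -a v0.1.0 -m "first sprint increment"
git describe --tags        # prints the nearest tag: v0.1.0
```

`git describe` then gives builds a human-readable version derived from the nearest tag, which CI/CD pipelines can embed into artifacts.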
diff --git a/linkcheck.json b/linkcheck.json index 47bc45face..fda8c05f1b 100755 --- a/linkcheck.json +++ b/linkcheck.json @@ -65,9 +65,13 @@ "https://www.pluralsight.com/courses/", "https://www.gartner.com/en/information-technology/glossary/citizen-developer", "https://www.onetrust.com/blog/principles-of-privacy-by-design/", - "https://docs.github.com/en/rest/commits/statuses" + "https://docs.github.com/en/rest/commits/statuses", + "https://blog.twitter.com/engineering/en_us/a/2015/diffy-testing-services-without-writing-tests.html", + "https://blog.github.com/2012-09-17-contributing-guidelines/", + "https://www.fast.design/docs/integrations/react", + "http://unitycontainer.org/articles/introduction.html" ], "only_errors": true, "cache_duration": "24h", - "cache_output_path": "/github/workspace/megalinter-reports/linchcheck-cache" + "cache_output_path": "/github/workspace/megalinter-reports/linkcheck-cache" } \ No newline at end of file diff --git a/package-lock.json b/package-lock.json index 3c8fe1dc5c..0e16e3c4e7 100644 --- a/package-lock.json +++ b/package-lock.json @@ -2285,11 +2285,11 @@ } }, "node_modules/micromatch": { - "version": "4.0.5", - "resolved": "https://registry.npmjs.org/micromatch/-/micromatch-4.0.5.tgz", - "integrity": "sha512-DMy+ERcEW2q8Z2Po+WNXuw3c5YaUSFjAO5GsJqfEl7UjvtIuFKO6ZrKvcItdy98dwFI2N1tg3zNIdKaQT+aNdA==", + "version": "4.0.8", + "resolved": "https://registry.npmjs.org/micromatch/-/micromatch-4.0.8.tgz", + "integrity": "sha512-PXwfBhYu0hBCPw8Dn0E+WDYb7af3dSLVWKi3HGv84IdF4TyFoC0ysxFd0Goxw7nSv4T/PzEJQxsYsEiFCKo2BA==", "dependencies": { - "braces": "^3.0.2", + "braces": "^3.0.3", "picomatch": "^2.3.1" }, "engines": { @@ -6312,11 +6312,11 @@ "integrity": "sha512-8q7VEgMJW4J8tcfVPy8g09NcQwZdbwFEqhe/WZkoIzjn/3TGDwtOCYtXGxA3O8tPzpczCCDgv+P2P5y00ZJOOg==" }, "micromatch": { - "version": "4.0.5", - "resolved": "https://registry.npmjs.org/micromatch/-/micromatch-4.0.5.tgz", - "integrity": 
"sha512-DMy+ERcEW2q8Z2Po+WNXuw3c5YaUSFjAO5GsJqfEl7UjvtIuFKO6ZrKvcItdy98dwFI2N1tg3zNIdKaQT+aNdA==", + "version": "4.0.8", + "resolved": "https://registry.npmjs.org/micromatch/-/micromatch-4.0.8.tgz", + "integrity": "sha512-PXwfBhYu0hBCPw8Dn0E+WDYb7af3dSLVWKi3HGv84IdF4TyFoC0ysxFd0Goxw7nSv4T/PzEJQxsYsEiFCKo2BA==", "requires": { - "braces": "^3.0.2", + "braces": "^3.0.3", "picomatch": "^2.3.1" } },