Observability doc drafts #3515

Open
wants to merge 65 commits into
base: observability-doc-outlines
012f851
Fix: Prevent unintentional table schema changes during evaluation (#3…
izeigerman Dec 9, 2024
ab3af75
Chore: fix faulty test (#3490)
erindru Dec 9, 2024
bee88a4
Chore: Address flaky Redshift test (#3491)
erindru Dec 9, 2024
f91a9ca
feat(cli): ability to set config CLI params via env vars (#3493)
kelsin Dec 10, 2024
7e42ea3
fix!: make signals serializable (#3480)
tobymao Dec 10, 2024
7f23d3d
chore: document clearing out history of scd type 2 model (#3495)
eakmanrq Dec 10, 2024
3b404fe
Add Snowflake Tracking (#3492)
sungchun12 Dec 10, 2024
3af1f17
feat: make log output more human-friendly (#3496)
plaflamme Dec 10, 2024
6b13cf7
Fix: quote the table produced by _resolve_table (#3494)
georgesittas Dec 10, 2024
a404175
Fix(bigquery): Pass catalog when checking for clustering key changes …
erindru Dec 10, 2024
5ad4a39
chore: cleanup
tobymao Dec 10, 2024
ec2210c
Docs: cloud overview page (#3473)
mesmith027 Dec 10, 2024
02c0618
Docs: use image with white background on tcloud landing page (#3499)
treysp Dec 10, 2024
a637248
Chore: update readthedocs python from 3.8 to 3.10 (#3504)
treysp Dec 11, 2024
c512e63
Refactor!: make when_matched syntax compatible with merge syntax (#3497)
georgesittas Dec 12, 2024
434be40
Fix: Always treat forward-only models as non-deployable (#3510)
izeigerman Dec 12, 2024
2ac8894
Feat: support AzureSQL (#3509)
treysp Dec 14, 2024
af6ec80
Check that test_adapter exist before closing (#3518)
Kayrnt Dec 16, 2024
3634518
Fix: Adapt evaluator test to prevent duplicate macro possibility (#3520)
themisvaltinos Dec 16, 2024
b2b87d8
Fix: propagate dialect to extract call for script loading (#3521)
georgesittas Dec 16, 2024
cc0330e
Chore: switch tests from freezegun to time-machine (#3516)
treysp Dec 16, 2024
96efab4
chore: add missing utc timezone to test (#3526)
eakmanrq Dec 17, 2024
48680db
Feat: allow macro functions in when_matched property (#3527)
georgesittas Dec 17, 2024
2102cc1
Chore: fix arg name in Node Field definition (#3530)
treysp Dec 17, 2024
d6be17c
Feat: add this_model property in the macro evaluator to return a stri…
georgesittas Dec 18, 2024
3a89d9e
Feat!: Support 'optimize' flag in model defs (#3512)
VaggelisD Dec 18, 2024
54f49ee
Fix: include column descriptions in optimized query cache key (#3532)
georgesittas Dec 18, 2024
25df941
feat: add multiple catalogs functionality to MotherDuck connection (…
naoyak Dec 18, 2024
a3e3382
Refactor: remove freezegun dependency in favor of time-machine (#3533)
georgesittas Dec 18, 2024
756eb70
Feat: Add support for auto-restatements (#3529)
izeigerman Dec 18, 2024
ca109fb
Fix!: reject string model names (#3534)
georgesittas Dec 19, 2024
b0d4627
Chore: fix bigquery integration test (#3536)
georgesittas Dec 19, 2024
9aedcf7
fix: ensure that restatements in prod also trigger restatements in de…
erindru Dec 19, 2024
2d2878c
Fix: A flaky auto-restatement test (#3541)
izeigerman Dec 19, 2024
86f3dec
Chore: fix tests that didnt work in non-UTC timezone (#3542)
erindru Dec 19, 2024
e2d32e4
Fix: Adapt evaluator test to use unique model names (#3538)
themisvaltinos Dec 19, 2024
289d5e5
feat!: exclusions in selector powered by full parsing (#3535)
tobymao Dec 19, 2024
d12ddbb
chore: remove deprecation warning from sushi (#3543)
tobymao Dec 19, 2024
2adf5a1
Feat: do recursive glob matching for ignore_patterns (#3539)
georgesittas Dec 19, 2024
4f4f9f4
Feat: Add ability to exclude dependencies in the requirements lock fi…
izeigerman Dec 20, 2024
1f71537
Feat: improve message when no models are ready to run (#3513)
treysp Dec 20, 2024
0e51dd8
Feat!: Add support for merge_filter and dbt incremental_predicates fo…
themisvaltinos Dec 20, 2024
f0d2e0f
Chore!: deprecate pydantic v1 (#3548)
georgesittas Dec 20, 2024
864e01b
Chore: Fix the lock file name in docs (#3549)
izeigerman Dec 20, 2024
adbf098
Chore: Remove the unused variable in the run_merged_intervals method …
izeigerman Dec 20, 2024
ee3be51
Feat(dlt): Add support for generation of nested dlt tables (#3547)
themisvaltinos Dec 20, 2024
774cff7
Chore: Move application of pending restatement intervals into the Sna…
izeigerman Dec 20, 2024
f438960
Chore: Doc outlines for tcloud observability
crericha Dec 11, 2024
ae0dfde
Add glightbox back into plugins
crericha Dec 11, 2024
1ea7c5c
revealing the observability and freshness pages
mesmith027 Dec 13, 2024
33e1da2
overview draft
mesmith027 Dec 13, 2024
94cd086
model freshness draft
mesmith027 Dec 13, 2024
1b0e446
prod copy
mesmith027 Dec 13, 2024
53677b4
adding dev env copy
mesmith027 Dec 19, 2024
bb13b87
plan copy and extra image
mesmith027 Dec 19, 2024
3e3f824
run draft and extra image
mesmith027 Dec 19, 2024
035e1d5
minor image and line addition to plan
mesmith027 Dec 19, 2024
ce30357
adding images to runs lower section
mesmith027 Dec 19, 2024
902927a
model draft and images
mesmith027 Dec 19, 2024
5c32f55
adding plan type options
mesmith027 Dec 19, 2024
1313e48
model loaded interval update
mesmith027 Dec 19, 2024
d5aec66
added plan history section
mesmith027 Dec 19, 2024
57f16c7
forgot hyperlink to plan page
mesmith027 Dec 19, 2024
7cbd213
Fix: remove os-specific strftime format character (#3559)
treysp Dec 23, 2024
ba7f3cc
Merge branch 'TobikoData:main' into observability-doc-drafts
treysp Dec 23, 2024
32 changes: 0 additions & 32 deletions .circleci/continue_config.yml
@@ -81,37 +81,6 @@ jobs:
       - store_test_results:
           path: test-results

-  style_and_slow_tests_pydantic_v1:
-    docker:
-      - image: cimg/python:3.10
-    resource_class: large
-    environment:
-      PYTEST_XDIST_AUTO_NUM_WORKERS: 8
-    steps:
-      - halt_unless_core
-      - checkout
-      - run:
-          name: Install OpenJDK
-          command: sudo apt-get update && sudo apt-get install default-jdk
-      - run:
-          name: Install ODBC
-          command: sudo apt-get install unixodbc-dev
-      - run:
-          name: Install SQLMesh and dbt adapter dependencies
-          command: make install-cicd-test
-      - run:
-          name: Install Pydantic v1
-          command: pip install --upgrade "pydantic<2.0.0" && pip uninstall pydantic_core -y
-      - run:
-          name: Fix Git URL override
-          command: git config --global --unset url."ssh://git@github.com".insteadOf
-      - run:
-          name: Run linters and code style checks
-          command: make py-style
-      - run:
-          name: Run slow tests
-          command: make cicd-test
-
   migration_test:
     docker:
       - image: cimg/python:3.10
@@ -309,7 +278,6 @@ workflows:
                 - "3.10"
                 - "3.11"
                 - "3.12"
-      - style_and_slow_tests_pydantic_v1
       - airflow_docker_tests:
           requires:
             - style_and_slow_tests
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -3,7 +3,7 @@ version: 2
 build:
   os: ubuntu-22.04
   tools:
-    python: "3.8"
+    python: "3.10"
   jobs:
     pre_build:
       - pip install -e .
47 changes: 45 additions & 2 deletions docs/cloud/cloud_index.md
@@ -1,3 +1,46 @@
# Tobiko Cloud

Coming Soon!
# Welcome to Tobiko Cloud

[Tobiko Cloud](https://tobikodata.com/product.html) is a data transformation platform that enhances the ease and efficiency of managing data pipelines with SQLMesh.

Tobiko Cloud is designed for companies who want to:

- Host SQLMesh on a robust, reliable platform without building and maintaining it themselves
- Understand the status, activity, and performance of data pipelines at a glance
- Rapidly detect and debug problems with their pipelines
- Monitor cloud costs over time, by model (BigQuery and Snowflake engines only)

![Tobiko Cloud](./cloud_index/tobiko-cloud.png)

## How is Tobiko Cloud different from SQLMesh?

Tobiko Cloud complements SQLMesh, supporting companies that need enterprise-level features like scalability, observability, and cost optimization.

Here’s a comparison:

1. **Deployment**: Tobiko Cloud simplifies SQLMesh deployment by hosting it on our infrastructure.

It provides enterprise-grade hosting and scalability for complex data transformations, freeing teams from managing infrastructure themselves.

2. **Observability and Insights**: Tobiko Cloud integrates deeply with SQLMesh, providing instant visibility into pipeline versions, code changes, and errors.

This allows teams to monitor their pipelines, detect changes in pipeline behavior, and rapidly trace the root causes of data issues.

3. **Efficiency**: SQLMesh's built-in features like virtual data environments and automatic change classification reduce computational costs and improve processing speeds.

Tobiko Cloud's enhanced change classification identifies even more scenarios where code changes don't require rerunning downstream models.

4. **Cost monitoring**: Tobiko Cloud automatically tracks costs per model execution for BigQuery and Snowflake.

This allows teams to rapidly detect anomalous spending and to identify the models driving cloud costs.

## Learn more

Ready to unlock a faster, smarter, and more efficient way to manage your data pipelines? Book a call with the Tobiko Cloud team today!

Discover how Tobiko's managed SQLMesh platform will empower your team to scale effortlessly, optimize costs, and deliver accurate data faster — all while freeing your team from infrastructure headaches.

Whether you're a data engineer or a decision-maker, Tobiko Cloud gives you data transformation without the waste. Let's talk!

<div class="calendly-inline-widget" data-url="https://calendly.com/d/cp8k-4jm-m6p/tobiko-cloud-intros" style="min-width:320px;height:630px;"></div>
<script type="text/javascript" src="https://assets.calendly.com/assets/external/widget.js"></script>
Binary file added docs/cloud/cloud_index/tobiko-cloud.png
2 changes: 2 additions & 0 deletions docs/cloud/features/debugger_view.md
@@ -27,6 +27,8 @@ The rest of this page describes those tabs.

> Pro tip: you can toggle whether timestamps are in UTC or your local timezone in the page's upper right corner.

## Debugger View Tabs

### Overview

See a summary of the model's characteristics and behavior during current and historical runs.
Binary file added docs/cloud/features/find_model_freshness.png
25 changes: 24 additions & 1 deletion docs/cloud/features/model_freshness.md
@@ -1,3 +1,26 @@
# Model Freshness

Coming Soon!
The Tobiko Cloud homepage includes a chart of historical model freshness.

![tcloud model freshness](find_model_freshness.png)

**Data model freshness** refers to how timely a model's data is: whether it reflects the most current and accurate state of the underlying system or domain. In other words, it measures how up-to-date and synchronized the model is with real-world data.

The model freshness chart shows how fresh the models in your data warehouse are relative to each model's configured cron.

![tcloud model freshness](tcloud_model_freshness.png)

The chart displays historical data, showing freshness levels across time (shown on the `x-axis`). This historical view helps when troubleshooting reported data issues: you can quickly check whether a problem was caused by delayed runs or by another underlying issue.

The chart uses three colors to show the percentage of models in different states:

1. Models that have run for all previous cron periods are "complete" (green).
- All green indicates the data warehouse is fully up-to-date with model crons.
2. Models that haven't run for the most recent cron period are "pending" (yellow).
3. Models that haven't run for multiple previous cron periods are "behind" (red).
- Red signals potential issues that need investigation.
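
The three states above can be sketched as a small classification rule. This is only an illustration, not Tobiko Cloud's implementation: the function names and the idea of tracking a count of missed cron periods per model are assumptions made for the example.

```python
from collections import Counter

STATES = ("complete", "pending", "behind")


def classify_freshness(missed_cron_periods: int) -> str:
    """Map a model's number of missed cron periods to a freshness state.

    Hypothetical rule derived from the descriptions above:
    0 missed -> complete, 1 missed -> pending, 2 or more -> behind.
    """
    if missed_cron_periods < 0:
        raise ValueError("missed_cron_periods must be non-negative")
    if missed_cron_periods == 0:
        return "complete"
    if missed_cron_periods == 1:
        return "pending"
    return "behind"


def freshness_percentages(missed_by_model: dict[str, int]) -> dict[str, float]:
    """Return the percentage of models in each state, as the chart would plot it."""
    counts = Counter(classify_freshness(n) for n in missed_by_model.values())
    total = len(missed_by_model)
    return {
        state: (100.0 * counts[state] / total if total else 0.0)
        for state in STATES
    }


# Example with made-up model names and missed-period counts.
snapshot = {"orders": 0, "customers": 0, "payments": 1, "refunds": 3}
percentages = freshness_percentages(snapshot)
```

Computing one such snapshot per point in time would give the stacked percentages the chart displays along its `x-axis`.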

The chart is interactive: click any point in time to see which specific models were complete, pending, or behind.

<NEED a screenshot of this at minimum>

25 changes: 25 additions & 0 deletions docs/cloud/features/observability/development_environment.md
@@ -0,0 +1,25 @@
# Development Environment

From the Environments section, users can open an overview of each development environment.
![tcloud environment page](environments.png)

The overview summarizes the most recent plan executed in the development environment, including:

- plan execution interval timing and frequency
- a count of all models in the environment
- a pie chart that shows the model changes in the development environment
    - directly modified models are shown in blue
    - added models are shown in green
    - removed models are shown in red

![tcloud development environment](tcloud_development_environment.png)

## Differences from Prod section

This section lists the models that differ between the development and production environments. Users can filter the list to show either directly modified models or models affected only by indirect changes.

## Plan history information

The plan applications table is a calendar view of every plan executed in this environment. Each executed plan appears as a green bar on the chart's timeline, so execution history can be tracked at a glance.

Hovering over a green bar opens a panel with key metrics and details for that plan execution. From the panel, users can click through to the full [plan view](plan.md) for more detail.
33 changes: 33 additions & 0 deletions docs/cloud/features/observability/model.md
@@ -0,0 +1,33 @@
# Model

From the main environments list, you can open an individual model to see its observability features and summary information:

- Click the environment you want to explore in the environments list
![tcloud environment page](environments.png)

- In the Models section, click "Explore" to view the available models
![environment explore models](model_explore.png)

- Select a model from the model list to open its details
![environment model list](model_list.png)

Each model page presents a summary made up of the following sections:

![tcloud model 1](tcloud_model_1.png)

- Current status: freshness indicators and daily execution graphs that show model health
- Model details: tabs that display summary statistics, the model's source code, and an interactive model lineage visualization

![tcloud model 2](tcloud_model_2.png)

- Version history: a chronological view of all model versions, including:
    - The timestamp when each version was promoted
    - The impact of the change (breaking or non-breaking)
    - Direct access to the complete implementation plan code
- Loaded intervals: the time spans between consecutive cron executions, which mark the boundaries of each data processing cycle
    - the table shows which model version generated the data for each cron interval
    - helps track forward-only model changes by keeping a chronological record of modifications
    - shows which data processing operations have completed, so you can verify that scheduled runs succeeded
- Recent activity: a log of version executions and version audits
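
The relationship between cron intervals and model versions described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a fixed daily step instead of a real cron expression, and it assumes an interval is attributed to the latest version promoted at or before the interval's end. The function names and version labels are hypothetical and do not come from SQLMesh or Tobiko Cloud.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def cron_intervals(start, end, step=timedelta(days=1)):
    """Return (interval_start, interval_end) pairs for complete steps in [start, end].

    A fixed step stands in for a cron schedule (assumption for this sketch).
    """
    intervals = []
    current = start
    while current + step <= end:
        intervals.append((current, current + step))
        current += step
    return intervals


def version_for_interval(interval_end, promotions):
    """Return the version assumed to have loaded the interval ending at interval_end.

    `promotions` is a list of (promoted_at, version_id) tuples sorted by
    promoted_at. The latest version promoted at or before interval_end is
    chosen; None is returned if no version existed yet.
    """
    promoted_times = [promoted_at for promoted_at, _ in promotions]
    index = bisect_right(promoted_times, interval_end) - 1
    return promotions[index][1] if index >= 0 else None


# Made-up promotion history: v2 is a forward-only change promoted on Dec 3.
promotions = [
    (datetime(2024, 12, 1), "v1"),
    (datetime(2024, 12, 3, 12), "v2"),
]
loaded = [
    (start, end, version_for_interval(end, promotions))
    for start, end in cron_intervals(datetime(2024, 12, 1), datetime(2024, 12, 5))
]
```

In this sketch the intervals that ended before the forward-only change keep their original version, while later intervals are attributed to the new one. That is the kind of record the loaded intervals table makes visible.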


Binary file added docs/cloud/features/observability/model_list.png
22 changes: 22 additions & 0 deletions docs/cloud/features/observability/overview.md
@@ -0,0 +1,22 @@
# Overview

Tobiko Cloud provides observability by capturing detailed metadata throughout your data project's lifecycle. While plans and runs execute, it collects information about data transformations, model performance, and system health to give you visibility into your project's operations.

This observability serves multiple purposes:

- Monitors the health and performance of your data pipelines
- Tracks the status of current and historical runs
- Maintains detailed version history of your models and transformations
- Enables creation of custom visualizations and metrics
- Facilitates troubleshooting and optimization

Observability features are integrated throughout the Tobiko Cloud interface, making it simple to monitor and understand your data project's behavior. Instead of digging through complex logs or piecing together information from multiple sources, you can access relevant insights from any part of your project. This includes real-time monitoring, historical analysis, and performance metrics that help you maintain reliable and efficient data operations.

The list below links to each integrated observability feature and summarizes what it shows.

- [Prod Environment](prod_environment.md) - Health and recent activity
- [Development Environments](development_environment.md) - Differences from prod environment and recent activity
- [Plan](plan.md) - Overall status and detailed model execution data
- [Run](run.md) - Overall status and detailed model execution data
- [Model](model.md) - Status and version history
- [Dashboards](measures_dashboards.md) - Custom visualizations of observability data
31 changes: 31 additions & 0 deletions docs/cloud/features/observability/plan.md
@@ -0,0 +1,31 @@
# Plan

From the Recent Activity section of an environment overview (production or development), you can open the status and metadata of any individual plan by clicking its blue ID hash in the table.
![tcloud plan information](plan_info.png)

This opens the detailed plan overview page:

![tcloud plan](tcloud_plan.png)

The top section provides an at-a-glance overview of the plan:

- the current plan status: complete, in progress, or failed
- timing information: when the plan started and how long it ran before completing or failing
- the plan type; in this example, an environment update. The possible types are:
    - Environment Update: the models themselves have changed
    - Restatement: an existing model is re-run to refresh its data
    - System: the Tobiko Data Cloud team has made an upgrade to your system. None of your models or data are affected (general housekeeping)
- the model backfill dates covered by the plan
- a graph of all model changes, using the standard color-coding: directly modified models in blue, added models in green, and removed models in red

## Plan changes

The middle section summarizes all plan changes. Users can filter the list by change type: added models, directly modified models, metadata-only modified models, and deleted models. The screenshot below shows a plan containing directly modified models, metadata-only changes, and added models.

![plan example](plan_example.png)

## Execution and Audits section

The final section shows execution statuses and audit results. It includes details about individual model executions (with direct links to their logs), virtual updates, and a chronological list of all audits that were performed.

![tcloud plan audits section](tcloud_plan_zoom.png)
Binary file added docs/cloud/features/observability/plan_info.png
33 changes: 33 additions & 0 deletions docs/cloud/features/observability/prod_environment.md
@@ -0,0 +1,33 @@
# Prod Environment

For the production environment, three main observability features are visible directly on the Tobiko Cloud homepage:

1. [Model Freshness](../model_freshness.md)
2. Weekly runs and plans chart
3. Recent activity table

![tcloud prod env](tcloud_prod_environment_labelled.png)

!!! Note

    Model freshness has its own feature page. Please use the link above for more information.

## Weekly Runs and Plans Chart

The weekly runs and plans chart appears to the left of the model freshness chart and gives a quick view of system activity.

![tcloud weekly runs](weekly_runs.png)

Like the model freshness chart, this chart uses color to convey run status at a glance: completed runs are green, failed runs are red, and runs in progress are gray.

The chart shows days along its `x-axis`, and each vertical bar represents a run. The height of each bar corresponds to the run's duration, so you can quickly assess execution times. For example, the screenshot shows a single 20-second run that executed successfully on November 26.

The chart also shows plan executions as purple horizontal bars spanning the days on which they ran. Hover over a plan bar to see which models were executed as part of that plan.

## Recent Activity Table

The recent activity table lists both runs and plans in chronological order. For each entry, you can view its status, estimated cost, duration, start and completion timestamps, and unique ID hash.

![tcloud recent activity](recent_activity.png)

To help you find specific entries, the table includes a filter in its top right corner. Use it to narrow the displayed activities by various criteria.