Commit
Merge branch 'm-kovalsky/cleaning'
m-kovalsky committed Oct 1, 2024
2 parents 2f2fcd5 + 6ed7b3c commit ed8c95b
Showing 43 changed files with 1,684 additions and 625 deletions.
28 changes: 24 additions & 4 deletions README.md
@@ -9,9 +9,29 @@
[Read the documentation on ReadTheDocs!](https://semantic-link-labs.readthedocs.io/en/stable/)
---

This is a python library intended to be used in [Microsoft Fabric notebooks](https://learn.microsoft.com/fabric/data-engineering/how-to-use-notebook). This library was originally intended to solely contain functions used for [migrating semantic models to Direct Lake mode](https://github.com/microsoft/semantic-link-labs?tab=readme-ov-file#direct-lake-migration). However, it quickly became apparent that functions within such a library could support many other useful activities in the realm of semantic models, reports, lakehouses and really anything Fabric-related. As such, this library contains a variety of functions ranging from running [Vertipaq Analyzer](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.import_vertipaq_analyzer) or the [Best Practice Analyzer](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.run_model_bpa) against a semantic model to seeing if any [lakehouse tables hit Direct Lake guardrails](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.lakehouse.html#sempy_labs.lakehouse.get_lakehouse_tables) or accessing the [Tabular Object Model](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.tom.html) and more!

Instructions for migrating import/DirectQuery semantic models to Direct Lake mode can be found [here](https://github.com/microsoft/semantic-link-labs?tab=readme-ov-file#direct-lake-migration).
Semantic Link Labs is a Python library designed for use in [Microsoft Fabric notebooks](https://learn.microsoft.com/fabric/data-engineering/how-to-use-notebook). This library extends the capabilities of [Semantic Link](https://learn.microsoft.com/fabric/data-science/semantic-link-overview), offering additional functionality that integrates seamlessly alongside it. The goal of Semantic Link Labs is to simplify technical processes, empowering people to focus on higher-level activities and allowing tasks that are better suited for machines to be handled efficiently without human intervention.

## Featured Scenarios
* Semantic Models
* [Migrating an import/DirectQuery semantic model to Direct Lake](https://github.com/microsoft/semantic-link-labs?tab=readme-ov-file#direct-lake-migration)
* [Model Best Practice Analyzer (BPA)](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.run_model_bpa)
* [Vertipaq Analyzer](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.vertipaq_analyzer)
* [Tabular Object Model (TOM)](https://github.com/microsoft/semantic-link-labs/blob/main/notebooks/Tabular%20Object%20Model.ipynb)
* [Translate a semantic model's metadata](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.translate_semantic_model)
* [Check Direct Lake Guardrails](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.lakehouse.html#sempy_labs.lakehouse.get_lakehouse_tables)
* [Refresh](https://github.com/microsoft/semantic-link-labs/blob/main/notebooks/Semantic%20Model%20Refresh.ipynb), [clear cache](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.clear_cache), [backup](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.backup_semantic_model), [restore](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.restore_semantic_model), [copy backup files](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.copy_semantic_model_backup_file), [move/deploy across workspaces](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.deploy_semantic_model)
* [Run DAX queries which impersonate a user](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.html#sempy_labs.evaluate_dax_impersonation)
* Reports
* [Report Best Practice Analyzer (BPA)](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.report.html#sempy_labs.report.run_report_bpa)
* [View report metadata](https://github.com/microsoft/semantic-link-labs/blob/main/notebooks/Report%20Analysis.ipynb)
* [Rebind reports](https://semantic-link-labs.readthedocs.io/en/stable/sempy_labs.report.html#sempy_labs.report.report_rebind)
* Capacities
* [Migrating a Power BI Premium capacity (P sku) to a Fabric capacity (F sku)](https://github.com/microsoft/semantic-link-labs/blob/main/notebooks/Capacity%20Migration.ipynb)
* APIs
* Wrapper functions for [Power BI](https://learn.microsoft.com/rest/api/power-bi/), [Fabric](https://learn.microsoft.com/rest/api/fabric/articles/using-fabric-apis), and [Azure](https://learn.microsoft.com/rest/api/azure/?view=rest-power-bi-embedded-2021-01-01) APIs


### Check out the starter [notebooks](https://github.com/microsoft/semantic-link-labs/tree/main/notebooks) to get started!

If you encounter any issues, please [raise a bug](https://github.com/microsoft/semantic-link-labs/issues/new?assignees=&labels=&projects=&template=bug_report.md&title=).
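
As a quick illustration of how the scenarios above are used in a Fabric notebook, here is a minimal sketch (it assumes the package is installed in the notebook session; `run_model_bpa` is linked above, and the semantic model name is hypothetical):

```python
# Install the library into the notebook session, then run the
# Best Practice Analyzer against a semantic model.
%pip install semantic-link-labs

import sempy_labs as labs

# 'Sales Model' is a hypothetical semantic model name; the workspace
# defaults to the notebook's workspace when not specified.
labs.run_model_bpa(dataset="Sales Model")
```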

@@ -55,7 +75,7 @@ An even better way to ensure the semantic-link-labs library is available in your
2. Select your newly created environment within the 'Environment' drop down in the navigation bar at the top of the notebook

## Version History
* [0.8.0](https://github.com/microsoft/semantic-link-labs/releases/tag/0.8.0) (September 24, 2024)
* [0.8.0](https://github.com/microsoft/semantic-link-labs/releases/tag/0.8.0) (September 25, 2024)
* [0.7.4](https://github.com/microsoft/semantic-link-labs/releases/tag/0.7.4) (September 16, 2024)
* [0.7.3](https://github.com/microsoft/semantic-link-labs/releases/tag/0.7.3) (September 11, 2024)
* [0.7.2](https://github.com/microsoft/semantic-link-labs/releases/tag/0.7.2) (August 30, 2024)
3 changes: 2 additions & 1 deletion docs/requirements.txt
@@ -11,4 +11,5 @@ anytree
IPython
polib
azure.mgmt.resource
jsonpath_ng
jsonpath_ng
deltalake
2 changes: 1 addition & 1 deletion notebooks/Tabular Object Model.ipynb

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion pyproject.toml
@@ -29,6 +29,7 @@ dependencies = [
"polib",
"azure.mgmt.resource",
"jsonpath_ng",
"deltalake",
]

[tool.setuptools.packages.find]
@@ -46,7 +47,7 @@ test = [
Repository = "https://github.com/microsoft/semantic-link-labs.git"

[[tool.mypy.overrides]]
module = "sempy.*,Microsoft.*,System.*,anytree.*,powerbiclient.*,synapse.ml.services.*,polib.*,azure.mgmt.resource.*,jsonpath_ng.*"
module = "sempy.*,Microsoft.*,System.*,anytree.*,powerbiclient.*,synapse.ml.services.*,polib.*,azure.mgmt.resource.*,jsonpath_ng.*,deltalake.*"
ignore_missing_imports = true

[tool.flake8]
80 changes: 79 additions & 1 deletion src/sempy_labs/__init__.py
@@ -1,3 +1,44 @@
from sempy_labs._ml_models import (
    list_ml_models,
    create_ml_model,
    delete_ml_model,
)
from sempy_labs._ml_experiments import (
    list_ml_experiments,
    create_ml_experiment,
    delete_ml_experiment,
)
from sempy_labs._warehouses import (
    create_warehouse,
    list_warehouses,
    delete_warehouse,
)
from sempy_labs._data_pipelines import (
    list_data_pipelines,
    create_data_pipeline,
    delete_data_pipeline,
)
from sempy_labs._eventhouses import (
    create_eventhouse,
    list_eventhouses,
    delete_eventhouse,
)
from sempy_labs._eventstreams import (
    list_eventstreams,
    create_eventstream,
    delete_eventstream,
)
from sempy_labs._kql_querysets import (
    list_kql_querysets,
    create_kql_queryset,
    delete_kql_queryset,
)
from sempy_labs._kql_databases import (
    list_kql_databases,
    create_kql_database,
    delete_kql_database,
)
from sempy_labs._mirrored_warehouses import list_mirrored_warehouses
from sempy_labs._environments import (
    create_environment,
    delete_environment,
@@ -109,6 +150,10 @@
    list_lakehouses,
    list_warehouses,
    create_warehouse,
    list_dashboards,
    list_datamarts,
    list_lakehouses,
    list_sql_endpoints,
    update_item,
)
from sempy_labs._helper_functions import (
@@ -235,7 +280,6 @@
"cancel_dataset_refresh",
"translate_semantic_model",
"vertipaq_analyzer",
# 'visualize_vertipaq',
"import_vertipaq_analyzer",
"list_semantic_model_objects",
"list_shortcuts",
@@ -297,4 +341,38 @@
"migrate_access_settings",
"migrate_delegated_tenant_settings",
"convert_to_friendly_case",
"resume_fabric_capacity",
"suspend_fabric_capacity",
"update_fabric_capacity",
"delete_fabric_capacity",
"check_fabric_capacity_name_availablility",
"delete_embedded_capacity",
"delete_premium_capacity",
"list_mirrored_warehouses",
"list_kql_databases",
"create_kql_database",
"delete_kql_database",
"create_warehouse",
"list_warehouses",
"delete_warehouse",
"create_eventhouse",
"list_eventhouses",
"delete_eventhouse",
"list_data_pipelines",
"create_data_pipeline",
"delete_data_pipeline",
"list_eventstreams",
"create_eventstream",
"delete_eventstream",
"list_kql_querysets",
"create_kql_queryset",
"delete_kql_queryset",
"list_ml_models",
"create_ml_model",
"delete_ml_model",
"list_ml_experiments",
"create_ml_experiment",
"delete_ml_experiment",
"list_sql_endpoints",
"list_datamarts",
]
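
With these `__all__` additions, the new item-management functions are importable directly from the package root rather than from their private modules. A minimal sketch using the data pipeline functions added later in this commit (the pipeline and workspace names are hypothetical):

```python
from sempy_labs import (
    create_data_pipeline,
    list_data_pipelines,
    delete_data_pipeline,
)

# Create a pipeline, list the workspace's pipelines, then clean up.
create_data_pipeline(name="Sales ETL", workspace="Analytics")
print(list_data_pipelines(workspace="Analytics"))
delete_data_pipeline(name="Sales ETL", workspace="Analytics")
```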
2 changes: 0 additions & 2 deletions src/sempy_labs/_ai.py
@@ -1,8 +1,6 @@
import sempy
import sempy.fabric as fabric
import pandas as pd
from synapse.ml.services.openai import OpenAICompletion
from pyspark.sql.functions import col
from pyspark.sql import SparkSession
from typing import List, Optional, Union
from IPython.display import display
118 changes: 118 additions & 0 deletions src/sempy_labs/_data_pipelines.py
@@ -0,0 +1,118 @@
import sempy.fabric as fabric
import pandas as pd
import sempy_labs._icons as icons
from typing import Optional
from sempy_labs._helper_functions import (
    resolve_workspace_name_and_id,
    lro,
    pagination,
)
from sempy.fabric.exceptions import FabricHTTPException


def list_data_pipelines(workspace: Optional[str] = None) -> pd.DataFrame:
    """
    Shows the data pipelines within a workspace.

    Parameters
    ----------
    workspace : str, default=None
        The Fabric workspace name.
        Defaults to None which resolves to the workspace of the attached lakehouse
        or if no lakehouse attached, resolves to the workspace of the notebook.

    Returns
    -------
    pandas.DataFrame
        A pandas dataframe showing the data pipelines within a workspace.
    """

    df = pd.DataFrame(columns=["Data Pipeline Name", "Data Pipeline ID", "Description"])

    (workspace, workspace_id) = resolve_workspace_name_and_id(workspace)

    client = fabric.FabricRestClient()
    response = client.get(f"/v1/workspaces/{workspace_id}/dataPipelines")
    if response.status_code != 200:
        raise FabricHTTPException(response)

    responses = pagination(client, response)

    for r in responses:
        for v in r.get("value", []):
            new_data = {
                "Data Pipeline Name": v.get("displayName"),
                "Data Pipeline ID": v.get("id"),
                "Description": v.get("description"),
            }
            df = pd.concat([df, pd.DataFrame(new_data, index=[0])], ignore_index=True)

    return df


def create_data_pipeline(
    name: str, description: Optional[str] = None, workspace: Optional[str] = None
):
    """
    Creates a Fabric data pipeline.

    Parameters
    ----------
    name: str
        Name of the data pipeline.
    description : str, default=None
        A description of the data pipeline.
    workspace : str, default=None
        The Fabric workspace name.
        Defaults to None which resolves to the workspace of the attached lakehouse
        or if no lakehouse attached, resolves to the workspace of the notebook.
    """

    (workspace, workspace_id) = resolve_workspace_name_and_id(workspace)

    request_body = {"displayName": name}

    if description:
        request_body["description"] = description

    client = fabric.FabricRestClient()
    response = client.post(
        f"/v1/workspaces/{workspace_id}/dataPipelines", json=request_body
    )

    lro(client, response, status_codes=[201, 202])

    print(
        f"{icons.green_dot} The '{name}' data pipeline has been created within the '{workspace}' workspace."
    )


def delete_data_pipeline(name: str, workspace: Optional[str] = None):
    """
    Deletes a Fabric data pipeline.

    Parameters
    ----------
    name: str
        Name of the data pipeline.
    workspace : str, default=None
        The Fabric workspace name.
        Defaults to None which resolves to the workspace of the attached lakehouse
        or if no lakehouse attached, resolves to the workspace of the notebook.
    """

    (workspace, workspace_id) = resolve_workspace_name_and_id(workspace)

    item_id = fabric.resolve_item_id(
        item_name=name, type="DataPipeline", workspace=workspace
    )

    client = fabric.FabricRestClient()
    response = client.delete(f"/v1/workspaces/{workspace_id}/dataPipelines/{item_id}")

    if response.status_code != 200:
        raise FabricHTTPException(response)

    print(
        f"{icons.green_dot} The '{name}' data pipeline within the '{workspace}' workspace has been deleted."
    )
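
For context, `list_data_pipelines` above relies on the `pagination` helper to walk multi-page API responses. Its implementation lives in `_helper_functions.py` and is not part of this diff; below is a hedged sketch of the continuation pattern Fabric's list APIs follow (the `continuationUri` field name is an assumption, and the real helper may differ):

```python
def pagination(client, response):
    # Accumulate every page of a Fabric list API response by following
    # continuation URIs until the service stops returning one.
    # 'continuationUri' is assumed from the Fabric REST convention.
    payloads = [response.json()]
    next_uri = payloads[-1].get("continuationUri")
    while next_uri:
        response = client.get(next_uri)
        payloads.append(response.json())
        next_uri = payloads[-1].get("continuationUri")
    return payloads
```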
