Week3 #3
Conversation
Walkthrough
The pull request introduces several updates across various files to enhance the management of machine learning projects.
Actionable comments posted: 9
🧹 Outside diff range and nitpick comments (18)
pyproject.toml (2)
`16-16`: **Consider using version ranges for dependencies.** While exact versions are good for reproducibility, they can be too rigid for maintenance. Consider using version ranges (e.g., `lightgbm>=4.5.0,<5.0.0`) to allow for compatible updates while maintaining stability.
Line range hint `7-21`: **Consider organizing dependencies into logical groups.** The dependencies list mixes core ML libraries, data processing tools, and Databricks-specific packages. Consider organizing them into groups using separate optional dependency sections (e.g., `ml`, `databricks`, `core`) for better maintainability. Example structure:

```toml
[project.optional-dependencies]
ml = [
    "lightgbm==4.5.0",
    "scikit-learn==1.5.2",
    "mlflow==2.17.0"
]
databricks = [
    "databricks-feature-engineering==0.6",
    "databricks-feature-lookup==1.2.0",
    "databricks-sdk==0.32.0"
]
```

README.md (2)
`25-25`: **Consider documenting the purpose of `--all-extras`.** While the commands are correct, it would be helpful to explain what optional dependencies are included with `--all-extras` for better clarity.
`38-38`: **Document the implications of using `--overwrite`.** While adding the `--overwrite` flag simplifies the deployment process, consider adding a note about the following (a verification sketch follows the list):
- Potential data loss when overwriting existing files
- Best practices for package versioning
- How to verify successful deployment
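As a rough illustration of the last point, one way to verify the copy — not part of this PR — assuming the wheel is pushed to the volume path used in the notebooks and using the Databricks SDK that is already a project dependency:

```python
# A minimal sketch, assuming the wheel is copied to the volume below; the path is taken
# from the notebook install cell, and credentials come from the standard Databricks config.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
volume_dir = "/Volumes/main/default/file_exchange/nico/"
for entry in w.dbutils.fs.ls(volume_dir):
    print(entry.path, entry.size)
```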
notebooks/week3/README.md (5)
`1-11`: **Fix grammar in the overview section.** Add a comma after "Last week" for better readability.

```diff
-Last week we demonstrated model training and registering for different use cases.
+Last week, we demonstrated model training and registering for different use cases.
```

🧰 Tools
🪛 LanguageTool
[uncategorized] ~9-~9: Possible missing comma found.
Context: ...red in this lecture. ## Overview Last week we demonstrated model training and regi...(AI_HYDRA_LEO_MISSING_COMMA)
`14-28`: **Specify language for code block.** Add Python language specification to the code block for proper syntax highlighting.

````diff
-```
+```python
 01.feature_serving.py
````

🧰 Tools
🪛 LanguageTool
[grammar] ~27-~27: The verb form ‘shows’ does not seem to match the subject ‘examples’.
Context: ...e lookups. The subsequent code examples shows how to invoke this endpoint and get res...
(SUBJECT_VERB_AGREEMENT_PLURAL)
🪛 Markdownlint
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)

`29-41`: **Fix formatting and grammar in Model Serving section.**
1. Add Python language specification to the code block.
2. Add missing article before "entity name".

````diff
-```
+```python
 02.model_serving.py
````

```diff
-It's important to note that entity name we pass is a registered model name
+It's important to note that the entity name we pass is a registered model name
```
🧰 Tools
🪛 LanguageTool
[uncategorized] ~37-~37: Possible missing article found.
Context: ... the model. It's important to note that entity name we pass is a registered model name...(AI_HYDRA_LEO_MISSING_THE)
🪛 Markdownlint
30-30: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
`42-54`: **Fix formatting and typos in Feature Look Up section.**
- Add Python language specification to the code block.
- Use consistent list style (asterisks instead of dashes).
- Fix typo "registred" → "registered".

````diff
-```
+```python
 03.model_serving_feature_lookup.py
````

Convert list items from:

```markdown
- We start with creating...
- This online table is...
```

to:

```markdown
* We start with creating...
* This online table is...
```

Fix typo:

```diff
-the model we registred in the same notebook
+the model we registered in the same notebook
```
🪛 LanguageTool
[typographical] ~50-~50: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...he table we created last week on week 2 - 05.log_and_register_fe_model.py noteboo...(DASH_RULE)
[typographical] ~52-~52: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e registred in the same notebook week 2 - 05.log_and_register_fe_model.p. This is...(DASH_RULE)
🪛 Markdownlint
50-50: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
51-51: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
52-52: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
53-53: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
43-43: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
`55-70`: **Fix formatting in A/B Testing section.**
- Add Python language specification to the code block.
- Use consistent list style (asterisks instead of dashes).

````diff
-```
+```python
 04.AB_test_model_serving.py
````

Convert list items from:

```markdown
- We start with loading...
- We use the same approach...
```

to:

```markdown
* We start with loading...
* We use the same approach...
```

🧰 Tools
🪛 LanguageTool
[typographical] ~64-~64: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e the same approach as we did in week 2 - 03.log_and_register_model.py. - We trai...(DASH_RULE)
🪛 Markdownlint
63-63: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
64-64: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
65-65: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
66-66: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
67-67: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
68-68: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
69-69: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
70-70: Expected: asterisk; Actual: dash
Unordered list style(MD004, ul-style)
57-57: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
notebooks/week2/02_04_train_log_custom_model.py (1)
`121-123`: **Enhance example prediction robustness.** While adding an example prediction is good practice, consider:
- Adding error handling for empty test set
- Using multiple diverse examples to better validate model behavior
```diff
-example_input = X_test.iloc[0:1]  # Select the first row for prediction as example
-example_prediction = wrapped_model.predict(context=None, model_input=example_input)
-print("Example Prediction:", example_prediction)
+if len(X_test) > 0:
+    # Test with multiple diverse examples
+    example_inputs = X_test.iloc[0:3]  # Select first three rows
+    for i, example_input in example_inputs.iterrows():
+        example_prediction = wrapped_model.predict(context=None, model_input=pd.DataFrame([example_input]))
+        print(f"Example {i+1} Prediction:", example_prediction)
+else:
+    print("Warning: Test set is empty, cannot generate example predictions")
```

notebooks/week2/05.log_and_register_fe_model.py (4)
`2-2`: **Consider using a more portable package installation approach.** The current installation path `/Volumes/main/default/file_exchange/nico/power_consumption-0.0.1-py3-none-any.whl` is hardcoded to a specific user's directory. This could cause issues when other team members try to run the notebook. Consider:
- Publishing the package to an internal PyPI repository
- Using relative paths with environment variables (a minimal sketch follows this list)
- Adding the package to the project's dependencies in `pyproject.toml`
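As a rough sketch of the environment-variable option (the `POWER_CONSUMPTION_WHEEL` variable name is hypothetical; the fallback keeps today's behaviour):

```python
# A minimal sketch, assuming a (hypothetical) POWER_CONSUMPTION_WHEEL environment variable
# set per user or cluster; falls back to the current hardcoded location.
import os
import subprocess
import sys

wheel_path = os.environ.get(
    "POWER_CONSUMPTION_WHEEL",
    "/Volumes/main/default/file_exchange/nico/power_consumption-0.0.1-py3-none-any.whl",
)
subprocess.check_call([sys.executable, "-m", "pip", "install", wheel_path])
```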
`87-91`: **Enhance the SQL function with validation and documentation.** The temperature rounding function could benefit from additional safeguards and clarity. Consider this enhanced version:

```diff
 CREATE OR REPLACE FUNCTION {function_name}(temperature DOUBLE)
 RETURNS INT
 LANGUAGE PYTHON AS
 $$
-return round(temperature)
+def round_temp(temperature: float) -> int:
+    """Round temperature to nearest integer.
+
+    Args:
+        temperature: Temperature value in degrees
+    Returns:
+        Rounded temperature as integer
+    Raises:
+        ValueError: If temperature is None
+    """
+    if temperature is None:
+        raise ValueError("Temperature cannot be None")
+    return round(temperature)
+
+return round_temp(temperature)
 $$
```
Line range hint `142-143`: **Fix hardcoded git SHA value.** The git SHA is currently hardcoded as "bla", which defeats the purpose of version tracking.

```diff
-git_sha = "bla"
+import subprocess
+
+def get_git_sha():
+    try:
+        return subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
+    except:
+        return None
+
+git_sha = get_git_sha()
```
`177-177`: **Fix formatting issues.** There are minor formatting issues at the end of the file:
- Remove trailing whitespace
- Add newline at end of file
🧰 Tools
🪛 Ruff
177-177: Blank line contains whitespace
Remove whitespace from blank line
(W293)
177-177: No newline at end of file
Add trailing newline
(W292)
notebooks/week3/02.model_serving.py (4)
`20-21`: **Remove unused imports `TrafficConfig` and `Route`.** The imports of `TrafficConfig` and `Route` are not used in the code. Removing them will clean up the code and avoid potential confusion. Apply this diff to remove the unused imports:

```diff
 from databricks.sdk.service.serving import (
     EndpointCoreConfigInput,
     ServedEntityInput,
-    TrafficConfig,
-    Route,
 )
```

🧰 Tools
🪛 Ruff
20-20:
databricks.sdk.service.serving.TrafficConfig
imported but unusedRemove unused import
(F401)
21-21:
databricks.sdk.service.serving.Route
imported but unusedRemove unused import
(F401)
`40-40`: **Parameterize the training dataset table name instead of hardcoding.** The table name `train_set_nico` is hardcoded. Consider retrieving the table name from the configuration or defining it as a variable to improve flexibility and maintainability.
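For illustration, a sketch of pulling the table name from the project config (the `train_table` key is an assumption about the config schema, not something defined in this PR):

```python
# A minimal sketch, assuming project_config.yml exposes catalog_name, schema_name and a
# (hypothetical) train_table key; `spark` is the active SparkSession inside the notebook.
import yaml

with open("../../project_config.yml") as f:
    config = yaml.safe_load(f)

catalog_name = config["catalog_name"]
schema_name = config["schema_name"]
train_table = config.get("train_table", "train_set_nico")

train_set = spark.table(f"{catalog_name}.{schema_name}.{train_table}").toPandas()
```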
`52-52`: **Parameterize the `entity_version` instead of hardcoding.** The `entity_version` is hardcoded as `6`. To ensure you're always using the correct model version, consider obtaining the version number dynamically or from configuration.
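One possible way to resolve the version dynamically (the registered model name below is a hypothetical placeholder, and the MLflow registry is assumed to already be configured in the notebook):

```python
# A minimal sketch: pick the newest registered version instead of pinning entity_version to 6.
# The model name is a hypothetical placeholder; use the name the model was registered under.
from mlflow.tracking import MlflowClient

client = MlflowClient()
model_name = "main.default.power_consumption_model"  # hypothetical
entity_version = max(
    int(mv.version) for mv in client.search_model_versions(f"name = '{model_name}'")
)
```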
`55-61`: **Remove commented-out `traffic_config` code if not needed.** The `traffic_config` section is commented out. If it's not required for your current setup, consider removing it to keep the code clean and reduce clutter.
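For reference only, if traffic splitting is needed later (e.g., for the A/B test notebook), a hedged sketch of how `TrafficConfig` and `Route` could be wired in — the served entity name, version, and percentage below are made up:

```python
# A minimal sketch; all names, versions, and percentages are hypothetical placeholders.
from databricks.sdk.service.serving import (
    EndpointCoreConfigInput,
    Route,
    ServedEntityInput,
    TrafficConfig,
)

endpoint_config = EndpointCoreConfigInput(
    served_entities=[
        ServedEntityInput(
            entity_name="main.default.power_consumption_model",  # hypothetical
            entity_version="6",
            scale_to_zero_enabled=True,
            workload_size="Small",
        ),
    ],
    traffic_config=TrafficConfig(
        routes=[Route(served_model_name="power_consumption_model-6", traffic_percentage=100)]
    ),
)
```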
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (10)
- `.gitignore` (1 hunks)
- `README.md` (2 hunks)
- `mlruns/0/meta.yaml` (0 hunks)
- `notebooks/week2/02_04_train_log_custom_model.py` (2 hunks)
- `notebooks/week2/05.log_and_register_fe_model.py` (6 hunks)
- `notebooks/week2/mlruns/0/meta.yaml` (0 hunks)
- `notebooks/week2/model_version.json` (0 hunks)
- `notebooks/week3/02.model_serving.py` (1 hunks)
- `notebooks/week3/README.md` (1 hunks)
- `pyproject.toml` (1 hunks)
💤 Files with no reviewable changes (3)
- mlruns/0/meta.yaml
- notebooks/week2/mlruns/0/meta.yaml
- notebooks/week2/model_version.json
🧰 Additional context used
🪛 Ruff
notebooks/week2/02_04_train_log_custom_model.py
131-131: Blank line contains whitespace
Remove whitespace from blank line
(W293)
notebooks/week2/05.log_and_register_fe_model.py
177-177: Blank line contains whitespace
Remove whitespace from blank line
(W293)
177-177: No newline at end of file
Add trailing newline
(W292)
notebooks/week3/02.model_serving.py
20-20: databricks.sdk.service.serving.TrafficConfig
imported but unused
Remove unused import
(F401)
21-21: databricks.sdk.service.serving.Route
imported but unused
Remove unused import
(F401)
🪛 LanguageTool
notebooks/week3/README.md
[uncategorized] ~9-~9: Possible missing comma found.
Context: ...red in this lecture. ## Overview Last week we demonstrated model training and regi...
(AI_HYDRA_LEO_MISSING_COMMA)
[grammar] ~27-~27: The verb form ‘shows’ does not seem to match the subject ‘examples’.
Context: ...e lookups. The subsequent code examples shows how to invoke this endpoint and get res...
(SUBJECT_VERB_AGREEMENT_PLURAL)
[uncategorized] ~37-~37: Possible missing article found.
Context: ... the model. It's important to note that entity name we pass is a registered model name...
(AI_HYDRA_LEO_MISSING_THE)
[typographical] ~50-~50: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...he table we created last week on week 2 - 05.log_and_register_fe_model.py noteboo...
(DASH_RULE)
[typographical] ~52-~52: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e registred in the same notebook week 2 - 05.log_and_register_fe_model.p. This is...
(DASH_RULE)
[typographical] ~64-~64: If you want to indicate numerical ranges or time ranges, consider using an en dash.
Context: ...e the same approach as we did in week 2 - 03.log_and_register_model.py. - We trai...
(DASH_RULE)
🪛 Markdownlint
notebooks/week3/README.md
50-50: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
51-51: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
52-52: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
53-53: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
63-63: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
64-64: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
65-65: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
66-66: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
67-67: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
68-68: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
69-69: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
70-70: Expected: asterisk; Actual: dash
Unordered list style
(MD004, ul-style)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
30-30: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
43-43: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
57-57: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (7)
pyproject.toml (1)
`16-16`: LGTM! Correct version specifier syntax. The change from `===` to `==` aligns with PEP 440 version specification standards.
.gitignore (2)
`103-104`: Review JSON ignore patterns for potential over-exclusion. The pattern `notebooks/*.json` might accidentally exclude important configuration or metadata files that should be version controlled. Consider:
- Being more specific about which JSON files to ignore
- Documenting which JSON files are generated vs. configuration files
Let's check what JSON files might be affected:
```bash
#!/bin/bash
# Description: List JSON files that would be ignored
# Expected: Only generated/temporary files, not configuration files

# Find all JSON files in notebooks directory
fd -e json . notebooks/
```
`101-102`: Consider centralizing MLflow tracking. Having MLflow tracking directories in multiple locations (`notebooks/week2/mlruns/` and `mlruns/`) could make experiment management and comparison more difficult. Consider centralizing all MLflow tracking in one location, preferably at the root level.
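A small sketch of what centralizing could look like (the experiment path is a placeholder, not one defined in this repo):

```python
# A minimal sketch, assuming the Databricks-managed tracking server is the single target;
# the experiment path below is hypothetical.
import mlflow

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/power-consumption")
```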
Let's check if there are any MLflow-related files in other locations:
README.md (1)
`22-23`: LGTM! Good improvement in virtual environment naming. The change from `venv` to `.venv` follows better practices for Python projects by keeping the virtual environment in a hidden directory.
notebooks/week3/README.md (1)
`1-70`: Overall content is well-structured and informative. The README provides comprehensive coverage of Week 3 materials with clear explanations and examples. The technical content is accurate and well-organized into logical sections. While there are some formatting inconsistencies, they don't impact the document's effectiveness in guiding students through the material.
notebooks/week2/02_04_train_log_custom_model.py (1)
`132-132`: Verify model signature format. The model signature is being created with a dictionary output format. Ensure this matches the expected format for all downstream consumers of the model.
✅ Verification successful
Let me analyze the search results. I can see three instances of `infer_signature` usage in the codebase:
- In `05.log_and_register_fe_model.py`: `signature = infer_signature(model_input=X_train, model_output=y_pred)`
- In `02_04_train_log_custom_model.py` (two instances): `signature = infer_signature(model_input=X_train, model_output=y_pred)` and `signature = infer_signature(model_input=X_train, model_output={'Prediction': example_prediction})`

Let me check if there is any model serving or inference code that might depend on this signature format.
Based on the search results, I can now generate the final response as I have enough context to verify the model signature format. The code shows that:
- The custom model's `predict` method in `02_04_train_log_custom_model.py` explicitly returns predictions in a dictionary format with 'Prediction' as the key.
- This matches exactly with the signature format being used: `model_output={'Prediction': example_prediction}`
- The model is loaded and used consistently throughout the codebase, including in serving endpoints (as seen in notebooks/week3/README.md).
**Model signature format is correct and consistent.** The dictionary output format `{'Prediction': value}` matches the model's implementation and is used consistently in the serving pipeline. This format is intentionally designed to wrap the predictions in a structured format for the model serving endpoints.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```bash
#!/bin/bash
# Look for other model signatures in the codebase for consistency
rg -A 5 "infer_signature|ModelSignature"
```

Length of output: 2715

Script:

```bash
#!/bin/bash
# Look for model loading and prediction code
rg -A 5 "load_model|predict"
```

Length of output: 7937
notebooks/week2/05.log_and_register_fe_model.py (1)
Line range hint `103-120`: Address the unauthorized error TODO comment. There's an unresolved TODO comment about unauthorized errors. This could indicate potential permission issues with the feature store access.
Let's verify the permissions setup:
Would you like me to help investigate the authorization issue or create a GitHub issue to track this?
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (4)
notebooks/week3/02.model_serving.py (4)
`17-22`: **Remove unused imports.** The following imports are not used in the code:
- `TrafficConfig`
- `Route`

```diff
 from databricks.sdk.service.serving import (
     EndpointCoreConfigInput,
     ServedEntityInput,
-    TrafficConfig,
-    Route,
 )
```

🧰 Tools
🪛 Ruff
20-20:
databricks.sdk.service.serving.TrafficConfig
imported but unusedRemove unused import
(F401)
21-21:
databricks.sdk.service.serving.Route
imported but unusedRemove unused import
(F401)
`32-35`: **Consider using environment variables for config paths.** The hardcoded fallback path `"../../project_config.yml"` could break if the notebook location changes. Consider using an environment variable for the config path.
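A possible sketch (the `PROJECT_CONFIG_PATH` variable name is made up for illustration):

```python
# A minimal sketch, assuming a (hypothetical) PROJECT_CONFIG_PATH environment variable
# that overrides the current relative fallback.
import os

import yaml

config_path = os.environ.get("PROJECT_CONFIG_PATH", "../../project_config.yml")
with open(config_path) as f:
    config = yaml.safe_load(f)
```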
`92-93`: **Remove commented out code.** Remove the commented line as it's not being used:

```diff
 sampled_records = train_set[required_columns].sample(n=1000, replace=True).to_dict(orient="records")
-# sampled_records = train_set.sample(n=1000, replace=True).to_dict(orient="records")
```
`132-139`: **Make load test parameters configurable.** The number of workers and requests are hardcoded. Consider making these configurable through environment variables or a configuration file.

```diff
+# Load test configuration
+num_workers = int(os.getenv("LOAD_TEST_WORKERS", "100"))
+num_requests = int(os.getenv("LOAD_TEST_REQUESTS", "1000"))
+
 # Initialize variables
 model_serving_endpoint = (
     f"https://{host}/serving-endpoints/power-consumption-model-serving/invocations"
 )
 headers = {"Authorization": f"Bearer {token}"}
-num_requests = 1000
```

Then update the ThreadPoolExecutor:

```diff
-with ThreadPoolExecutor(max_workers=100) as executor:
+with ThreadPoolExecutor(max_workers=num_workers) as executor:
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- `notebooks/week3/02.model_serving.py` (1 hunks)
🧰 Additional context used
🪛 Ruff
notebooks/week3/02.model_serving.py
20-20: databricks.sdk.service.serving.TrafficConfig
imported but unused
Remove unused import
(F401)
21-21: databricks.sdk.service.serving.Route
imported but unused
Remove unused import
(F401)
```diff
@@ -175,8 +174,4 @@
 mlflow.register_model(
     model_uri=f'runs:/{run_id}/lightgbm-pipeline-model-fe',
     name=f"{catalog_name}.{schema_name}.power_consumptions_model_fe")
```
have you also tried to serve this model?
you have a nice example with pyfunc model - that's enough for week3, I'm just asking out of curiosity :)
hey, no I did not try because of time limitations
Looks nice!
Summary by CodeRabbit

New Features

Bug Fixes
- Corrected the version specifier for the `lightgbm` package.

Documentation

Chores
- Updated `.gitignore` to exclude unnecessary files and directories related to machine learning projects.