Research QA Test Set
There are two main types of questions and answers: open-ended questions with long-form reference answers, and multiple-choice questions with answer keys.
What is MLflow and what is it used for?
"MLflow is an open-source platform designed to manage the end-to-end machine learning (ML) lifecycle. It is used by MLOps teams and data scientists. It provides tools for tracking experiments, packaging code into reproducible runs, and sharing and deploying models. Here's a breakdown of its main components and functionalities:
- MLflow Tracking:
  - Experiment Tracking: Helps in logging and querying experiments. It records and organizes parameters, metrics, and other metadata from ML experiments.
  - UI and API: Provides a web-based UI and API for viewing and querying the logged data, which helps in comparing different runs and understanding the performance of models.
- MLflow Projects:
  - Packaging Code: Defines a format for packaging data science code in a reusable and reproducible way.
  - Environment Management: Supports various environments (e.g., Conda, Docker) to ensure consistency across different runs.
- MLflow Models:
  - Model Packaging: Facilitates packaging ML models in a standard format that can be shared and deployed across different platforms.
  - Serving: Provides tools to deploy models as REST API endpoints, making it easy to integrate them into applications.
  - Multiple Flavors: Supports different "flavors" (e.g., Python, R, Java) to ensure models can be used with various tools and frameworks.
- MLflow Registry:
  - Model Versioning: Manages the lifecycle of a model, including stages like staging, production, and archiving.
  - Collaboration: Enables collaboration by tracking model lineage, versioning, and annotations, facilitating model governance.

Use Cases
- Experiment Management: Track experiments to compare different approaches and understand which models perform best under various conditions.
- Reproducibility: Package code and environment configurations to ensure experiments and models can be reproduced reliably across different environments.
- Model Deployment: Seamlessly deploy models into production, ensuring that they can be served as APIs or integrated into applications.
- Collaboration and Governance: Collaborate with team members by sharing models and tracking their versions, ensuring proper governance and management of ML assets.
- Scalability: Integrate with existing infrastructure and scale up from prototyping on a local machine to running on distributed cloud environments.

Integration

MLflow integrates well with popular ML frameworks and libraries such as TensorFlow, PyTorch, Scikit-learn, and XGBoost, as well as LLM-specific libraries and frameworks like Hugging Face Transformers, the OpenAI SDK, and LangChain. By providing a unified interface for managing experiments, projects, and models, MLflow helps data scientists and ML engineers streamline their workflows, enhance productivity, and ensure the reliability and reproducibility of their ML solutions."
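As a minimal illustration of the tracking workflow described above (a sketch only, assuming MLflow and scikit-learn are installed locally), autologging captures parameters, metrics, and the model without explicit logging calls:

import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

mlflow.autolog()  # patches supported libraries to log params, metrics, and models

# Any subsequent fit() call on a supported framework is recorded as an MLflow run.
X, y = load_iris(return_X_y=True)
LogisticRegression(max_iter=1000).fit(X, y)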
**How to quickstart MLflow Tracking?**
"Step 1: Get MLflow
Installing Stable Release
If you don’t already have it installed on your system, you can install it with:
pip install mlflow
Step 2: Start a Tracking Server
Using a Managed MLflow Tracking Server
For details on options for using a managed MLflow Tracking Server, including how to create a free Databricks Community Edition account with managed MLflow, see the guide for tracking server options.
(Optional) Run a local Tracking Server
We’re going to start a local MLflow Tracking Server, which we will connect to for logging our data for this quickstart. From a terminal, run:
mlflow server --host 127.0.0.1 --port 8080
Note: You can choose any port that you would like, provided that it’s not already in use.
Set the Tracking Server URI (if not using a Databricks Managed MLflow Tracking Server)
If you’re using a managed MLflow Tracking Server that is not provided by Databricks, or if you’re running a local tracking server, ensure that you set the tracking server’s URI using:
import mlflow
mlflow.set_tracking_uri(uri="http://<host>:<port>")
If this is not set within your notebook or runtime environment, the runs will be logged to your local file system.
Step 3 - Train a model and prepare metadata for logging
In this section, we’re going to log a model with MLflow. A quick overview of the steps:
- Load and prepare the Iris dataset for modeling.
- Train a Logistic Regression model and evaluate its performance.
- Prepare the model hyperparameters and calculate metrics for logging.
import mlflow
from mlflow.models import infer_signature
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# Load the Iris dataset
X, y = datasets.load_iris(return_X_y=True)
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Define the model hyperparameters
params = {
    "solver": "lbfgs",
    "max_iter": 1000,
    "multi_class": "auto",
    "random_state": 8888,
}
# Train the model
lr = LogisticRegression(**params)
lr.fit(X_train, y_train)
# Predict on the test set
y_pred = lr.predict(X_test)
# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)
Step 4 - Log the model and its metadata to MLflow
In this next step, we’re going to use the model that we trained, the hyperparameters that we specified for the model’s fit, and the loss metrics that were calculated by evaluating the model’s performance on the test data to log to MLflow.
The steps that we will take are:
- Initiate an MLflow run context to start a new run that we will log the model and metadata to.
- Log model parameters and performance metrics.
- Tag the run for easy retrieval.
- Register the model in the MLflow Model Registry while logging (saving) the model.
Note: While it can be valid to wrap the entire code within the start_run block, this is not recommended. If there is an issue with the training of the model or any other portion of code that is unrelated to MLflow-related actions, an empty or partially-logged run will be created, which will necessitate manual cleanup of the invalid run. It is best to keep the training execution outside of the run context block to ensure that the loggable content (parameters, metrics, artifacts, and the model) is fully materialized prior to logging.
# Set our tracking server uri for logging
mlflow.set_tracking_uri(uri="http://127.0.0.1:8080/")
# Create a new MLflow Experiment
mlflow.set_experiment("MLflow Quickstart")
# Start an MLflow run
with mlflow.start_run():
    # Log the hyperparameters
    mlflow.log_params(params)
    # Log the loss metric
    mlflow.log_metric("accuracy", accuracy)
    # Set a tag that we can use to remind ourselves what this run was for
    mlflow.set_tag("Training Info", "Basic LR model for iris data")
    # Infer the model signature
    signature = infer_signature(X_train, lr.predict(X_train))
    # Log the model
    model_info = mlflow.sklearn.log_model(
        sk_model=lr,
        artifact_path="iris_model",
        signature=signature,
        input_example=X_train,
        registered_model_name="tracking-quickstart",
    )
Step 5 - Load the model as a Python Function (pyfunc) and use it for inference
After logging the model, we can perform inference by:
- Loading the model using MLflow’s pyfunc flavor.
- Running predict on new data using the loaded model.
Note: The Iris training data that we used was a NumPy array. However, we can submit a Pandas DataFrame to the predict method as well, as shown below.
# Load the model back for predictions as a generic Python Function model
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)
predictions = loaded_model.predict(X_test)
iris_feature_names = datasets.load_iris().feature_names
result = pd.DataFrame(X_test, columns=iris_feature_names)
result["actual_class"] = y_test
result["predicted_class"] = predictions
result[:4]
Step 6 - View the Run in the MLflow UI
To see the results of our run, we can navigate to the MLflow UI. Since we have already started the Tracking Server at http://localhost:8080/, we can simply navigate to that URL in our browser."
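As a complementary sketch (not part of the quickstart itself), the registered model can also be loaded from the Model Registry by name and version; this assumes the quickstart above was run, so that "tracking-quickstart" exists as version 1:

import mlflow

mlflow.set_tracking_uri(uri="http://127.0.0.1:8080")
# The models:/<name>/<version> URI resolves a model from the Model Registry.
registry_model = mlflow.pyfunc.load_model("models:/tracking-quickstart/1")
print(registry_model.predict(X_test[:4]))  # X_test from the quickstart above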
What other CNCF projects integrate with MLflow, and which CNCF projects can be used in MLflow deployments?
"The Cloud Native Computing Foundation (CNCF) hosts numerous projects that can complement and integrate with MLflow to enhance the machine learning lifecycle, particularly in cloud-native environments. Here are some key CNCF projects that integrate well with MLflow and can be used to create robust ML pipelines:
- Kubeflow:
  - Integration: Kubeflow is a platform for deploying, managing, and scaling machine learning workloads on Kubernetes. It can integrate with MLflow for experiment tracking, model management, and serving.
  - Usage: Kubeflow Pipelines can be used to orchestrate ML workflows, where MLflow handles experiment tracking and model logging (see the sketch below).
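A hedged sketch of that pattern using the Kubeflow Pipelines (KFP) v2 SDK; the tracking-server URL, base image, and metric are illustrative assumptions, not a prescribed setup:

from kfp import dsl

@dsl.component(base_image="python:3.10", packages_to_install=["mlflow", "scikit-learn"])
def train(tracking_uri: str) -> float:
    # Each pipeline step logs to MLflow; KFP handles the orchestration.
    import mlflow
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    mlflow.set_tracking_uri(tracking_uri)
    X, y = load_iris(return_X_y=True)
    with mlflow.start_run():
        model = LogisticRegression(max_iter=1000).fit(X, y)
        accuracy = model.score(X, y)
        mlflow.log_metric("accuracy", accuracy)
    return accuracy

@dsl.pipeline(name="mlflow-tracking-demo")
def pipeline(tracking_uri: str = "http://mlflow.example.com:8080"):  # assumed URL
    train(tracking_uri=tracking_uri)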
- Argo:
  - Integration: Argo Workflows is a container-native workflow engine for orchestrating parallel jobs on Kubernetes.
  - Usage: Argo can be used to create complex ML pipelines that include steps for data preprocessing, training, and model evaluation, with MLflow handling the tracking and logging aspects.
- Knative:
  - Integration: Knative provides a set of components to build, deploy, and manage modern serverless workloads. It can be used for model serving in a scalable and serverless manner.
  - Usage: After training a model with MLflow, Knative can deploy the model as a serverless function, automatically scaling based on demand.
- Prometheus:
  - Integration: Prometheus is a monitoring and alerting toolkit.
  - Usage: Prometheus can be used to monitor the health and performance of ML workloads, including those managed by MLflow, by collecting metrics from various components of the ML pipeline (see the sketch below).
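A minimal sketch of exposing custom ML-pipeline metrics in Prometheus format with the prometheus_client library; the metric names, values, and port are illustrative assumptions:

import random
import time

from prometheus_client import Gauge, start_http_server

model_accuracy = Gauge("model_accuracy", "Accuracy of the most recently trained model")
inference_latency = Gauge("inference_latency_seconds", "Latency of the last inference call")

start_http_server(9100)  # Prometheus can now scrape http://<host>:9100/metrics

while True:
    # In a real pipeline these values would come from evaluation and serving code.
    model_accuracy.set(random.uniform(0.90, 0.99))
    inference_latency.set(random.uniform(0.01, 0.05))
    time.sleep(15)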
- Grafana:
  - Integration: Grafana is an open-source platform for monitoring and observability.
  - Usage: Grafana can be used in conjunction with Prometheus to visualize the metrics collected from ML workloads and MLflow experiments.
- Envoy:
  - Integration: Envoy is a high-performance edge and service proxy.
  - Usage: Envoy can be used to manage traffic for ML models served through MLflow, providing features like load balancing, security, and observability.
- Istio:
  - Integration: Istio is a service mesh that provides traffic management, security, and observability for microservices.
  - Usage: Istio can manage the network traffic for ML models and services in a Kubernetes cluster, ensuring secure and efficient communication between components.
- NATS:
  - Integration: NATS is a connective technology for real-time data streaming and messaging.
  - Usage: NATS can be used for messaging between different components of the ML pipeline, facilitating real-time data processing and communication (see the sketch below).
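A hedged sketch using the nats-py client: publishing a "model registered" event that other pipeline components could react to; the server address, subject name, and payload are illustrative assumptions:

import asyncio
import json

import nats

async def main():
    nc = await nats.connect("nats://127.0.0.1:4222")  # assumes a local NATS server
    event = {"model": "tracking-quickstart", "version": 1, "stage": "staging"}
    await nc.publish("ml.models.registered", json.dumps(event).encode())
    await nc.drain()  # flush pending messages and close the connection

asyncio.run(main())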
While MLflow itself is not a CNCF project, it can leverage several CNCF technologies for deployment, scaling, and management. Key CNCF projects that can be used in conjunction with MLflow include:
- Kubernetes:
  - Usage: Kubernetes is the foundation for deploying and managing MLflow in a cloud-native environment. MLflow components (tracking server, model server) can run on Kubernetes clusters.
- Helm:
  - Usage: Helm can be used to package and deploy MLflow on Kubernetes, simplifying the installation and management of MLflow components.
- MinIO:
  - Usage: MinIO is a high-performance, S3-compatible object storage that can be used as a backend store for MLflow artifacts (see the sketch below).
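A minimal sketch of pointing MLflow artifact storage at a MinIO bucket through its S3-compatible API; the endpoint, credentials, bucket, and tracking-server URL are illustrative assumptions:

import os

import mlflow

# MLflow's S3 artifact client reads these standard environment variables.
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://minio.example.com:9000"  # assumed endpoint
os.environ["AWS_ACCESS_KEY_ID"] = "minio-access-key"      # assumed credentials
os.environ["AWS_SECRET_ACCESS_KEY"] = "minio-secret-key"

# Assuming the tracking server was started with an artifact root such as
# s3://mlflow-artifacts, logged artifacts land in the MinIO bucket.
mlflow.set_tracking_uri("http://mlflow.example.com:8080")
with mlflow.start_run():
    mlflow.log_param("artifact_store", "minio")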
- Apache Kafka:
  - Usage: Kafka, while not a CNCF project, integrates well with many CNCF technologies and can be used for real-time data streaming in ML workflows managed by MLflow.

In summary:
- Kubeflow: Integration for ML workflows on Kubernetes.
- Argo: Workflow orchestration.
- Knative: Serverless model serving.
- Prometheus & Grafana: Monitoring and visualization.
- Envoy & Istio: Traffic management and service mesh.
- NATS: Messaging and real-time data streaming.
- Kubernetes & Helm: Deployment and management.
- MinIO: Object storage for artifacts.
By leveraging these CNCF projects, MLflow users can build scalable, robust, and efficient machine learning pipelines in cloud-native environments."
Which of these features does Cilium provide? Select all answers that apply.
- a. Network policy enforcement
- b. Visibility into traffic at L3/L4 and L7
- c. Networking between Kubernetes pods
- d. Kubernetes secrets management
Answer: a, b, c
What is the name of the Cilium component that provides visibility into network traffic?
- a. Hubble
- b. Hobble
- c. Grafana
Answer: a
Cilium can generate Prometheus metrics showing you information about network performance and latency.
Answer: True
Which of these are characteristics of eBPF?
- a. It allows dynamic changes to the kernel
- b. It can be used to drop network packets that are forbidden by the network policy
- c. It enables high-performance networking with security and observability built-in
- d. All of the above
Answer: d
What are the different tools for installing Cilium components into a Kubernetes cluster? Select all that apply.
- a. Helm
- b. The Cilium CLI
- c. Curl
Answer: a, b
A Cilium agent runs on every node in a Kubernetes cluster.
True/False
- a. True
- b. False
Answer: a
How many instances of the Hubble Relay run on each cluster?
- a. One per node
- b. One per pod
- c. One per cluster
Answer: c
What is the Cilium CLI command for checking that connectivity and policy enforcement is working correctly?
- a. kubectl connectivity test
- b. cilium connectivity test
- c. cilium policy test
Answer: b
Which network policy resource type does Cilium support?
- a. Standard Kubernetes NetworkPolicy
- b. CiliumNetworkPolicy
- c. CiliumClusterwideNetworkPolicy
- d. All of the above
Answer: d
It’s possible to export Layer 3 network policies created in networkpolicy.io’s visual policy editor as CiliumNetworkPolicy YAML files.
True/False
- a. True
- b. False
Answer: a
Which of the following is a true statement? Select all that apply.
- a. The CiliumNetworkPolicy resource supports service name-based egress policy for internal cluster communications
- b. The standard Kubernetes NetworkPolicy resource supports L7 HTTP protocol rules limiting access to specific HTTP paths
- c. The CiliumNetworkPolicy resource supports both TCP and ICMP egress policy
Answer: a
Hubble flows include packet dumps. True/False
- a. True
- b. False
Answer: b
You can filter by packet verdict in the Hubble UI service map. True/False
- a. True
- b. False
Answer: a
Which of the following statements are TRUE? Select all that apply.
- a. The Hubble Relay service provides cluster-wide network observability
- b. It's not possible to filter flows by namespace using the Hubble CLI tool
- c. Hubble flows include information about traffic direction
- d. Hubble is optional and needs to be enabled when installing Cilium
Answer: a, c, d
What is the syntax of an API element?
- a. The reasons for making calls
- b. The relationship of the element to other elements
- c. The rules for making calls and passing data
Answer: c
True or False? Information about reasons and relationships adds value to a syntax description.
- a. True
- b. False
Answer: a
Which of the following statements about Kubernetes label selectors is correct?
a. An empty label selector matches no objects.
b. A null label selector matches all objects.
c. matchExpressions and matchLabels are ORed together.
d. matchLabels is a map of {key, value} pairs.
Answer: d
Which of the following best describes the purpose of the Patch type in Kubernetes?
a. To define a label selector for Kubernetes resources.
b. To specify the body of a PATCH request.
c. To create a new resource in Kubernetes.
d. To delete a resource in Kubernetes.
Answer: b
Which of the following kubectl commands creates a Kubernetes resource from a URL?
a. kubectl apply -f ./my-manifest.yaml
b. kubectl apply -f ./dir
c. kubectl apply -f https://example.com/manifest.yaml
d. kubectl create deployment nginx --image=nginx
Answer: c
Which of the following statements is true about configuring kubeadm using a YAML configuration file?
a. A kubeadm config file can contain multiple configuration types separated by three dashes (---).
b. All configuration options must be provided in the kubeadm config file, as there are no defaults.
c. The kubeadm config file does not support overriding default values.
d. Providing unexpected configuration types in the config file will cause kubeadm to fail without any warnings.
Answer: a
Which of the following is a valid phase executed by the kubeadm init command?
a. Run post-flight checks
b. Create network policies
c. Generate static Pod manifest files for control plane components
d. Install custom addons
Answer: c
Which of the following statements about StatefulSets in Kubernetes is true?
a. StatefulSets do not provide stable network identities to their pods.
b. The serviceName field in StatefulSetSpec is optional.
c. Pods in a StatefulSet have unique identities based on their ordinal index.
d. The only allowed template.spec.restartPolicy value for a StatefulSet is "Never".
Answer: c
Which of the following is true about container runtimes in Kubernetes?
a. Kubernetes supports only containerd as its container runtime.
b. Containers are intended to be stateful and mutable.
c. RuntimeClass allows running different Pods with different container runtimes or different settings.
d. Container images should be modified directly to reflect any code changes.
Answer: c
What is a RuntimeClass in Kubernetes and how is it used?
"RuntimeClass is a feature introduced in Kubernetes v1.20 [stable] that allows the selection of container runtime configurations to run a Pod's containers. It is used to provide different runtime configurations for different Pods, which can help balance performance and security needs. Here's a breakdown of its main components and functionalities:
Motivation:
- Different RuntimeClass configurations can be set between different Pods to balance performance and security.
- For example, Pods requiring high security can be scheduled to run in a container runtime that uses hardware virtualization for extra isolation, at the cost of some overhead.
Setup:
- Configure the Container Runtime Interface (CRI) implementation on nodes.
- Create the corresponding RuntimeClass resources.
Configuration:
- The RuntimeClass resource includes the name (metadata.name) and the handler (handler) fields.
- RuntimeClass resources are non-namespaced, and their names must be valid DNS subdomain names.
Usage:
- Once configured, a RuntimeClass can be specified in the Pod spec via the runtimeClassName field.
- If the specified RuntimeClass does not exist or cannot run the corresponding handler, the Pod will fail to run.
CRI Configuration:
- Different CRI runtimes (e.g., containerd, CRI-O) have specific configuration methods for setting up runtime handlers.
Scheduling:
- The scheduling field can be used to ensure Pods with a specific RuntimeClass are scheduled on nodes that support it.
- This involves setting node labels and selectors to match the RuntimeClass requirements.
Use Cases:
- Performance vs. Security: Different runtime configurations for balancing the needs of various workloads.
- Consistency: Ensure consistent runtime configurations across a cluster or heterogeneous node configurations.
- Resource Allocation: Efficiently allocate resources by specifying appropriate runtime settings.
By utilizing RuntimeClass, Kubernetes users can fine-tune the execution environment of their Pods to meet specific performance and security requirements, making it a versatile tool for managing containerized workloads."
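For illustration, a minimal sketch of creating and referencing a RuntimeClass with the official Kubernetes Python client; the "gvisor" handler name is a hypothetical example and must match a handler actually configured in the cluster's CRI runtime:

from kubernetes import client, config

config.load_kube_config()  # load credentials from ~/.kube/config

# Create a RuntimeClass that maps to a CRI handler configured on the nodes.
runtime_class = client.V1RuntimeClass(
    api_version="node.k8s.io/v1",
    kind="RuntimeClass",
    metadata=client.V1ObjectMeta(name="gvisor"),  # non-namespaced, DNS subdomain name
    handler="gvisor",  # hypothetical handler; must exist in the CRI configuration
)
client.NodeV1Api().create_runtime_class(body=runtime_class)

# Reference the RuntimeClass from a Pod spec via runtimeClassName.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="sandboxed-nginx"),
    spec=client.V1PodSpec(
        runtime_class_name="gvisor",
        containers=[client.V1Container(name="nginx", image="nginx")],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)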
Question:
Which of the following best describes the purpose and functionality of RuntimeClass in Kubernetes?
A. A feature for managing network policies between Pods to enhance security and performance.
B. A feature for selecting container runtime configurations to balance performance and security needs for different Pods.
C. A tool for automating the deployment of Kubernetes clusters across multiple cloud providers.
D. A service for monitoring and logging container metrics and events in a Kubernetes cluster.
Answer:
B. A feature for selecting container runtime configurations to balance performance and security needs for different Pods.