This repository contains a Python script for analyzing Kubernetes pod logs. It extracts logs from pods selected by label, checks for container restarts, flags error messages, and saves the results to a log file.
- Fetches logs of pods within a specified namespace using a label selector.
- Monitors restart counts and flags containers exceeding the restart threshold.
- Searches logs for error messages or failure-related keywords such as "Error", "Failed", or "Crashloopback".
- Logs findings (restart counts and flagged errors) into a specified output file.
Before running the script, ensure the following:
- Python 3.6 or later.
- A running Kubernetes cluster. You can use any of the following options:
  - Minikube: A local Kubernetes cluster setup tool.
  - Kubeadm: A tool for managing a Kubernetes cluster, typically used for setting up production environments.

  If you don't have a Kubernetes cluster running, you can set up Minikube or Kubeadm. Minikube is a great option for a local cluster for testing purposes.
  - Minikube setup: Install Minikube
  - Kubeadm setup: Install Kubeadm
- Kubernetes cluster access and kubeconfig file set up (~/.kube/config).
- Required Python libraries: kubernetes.
You can install the required Python libraries using pip:
pip install kubernetes
The script can be run from the command line with the following syntax:
python kubernetes_log_analyzer.py --namespace <namespace> --label-selector <label-selector> --restart-threshold <restart-threshold> --output-file <output-file-path>
- --namespace: The Kubernetes namespace of the pods to analyze. (Required)
- --label-selector: The label selector to filter pods. (Required)
- --restart-threshold: The threshold for restarts at which a warning will be logged. (Required)
- --output-file: The file path where logs and analysis results will be stored. (Required)
python kubernetes_log_analyzer.py --namespace my-namespace --label-selector app=my-app --restart-threshold 5 --output-file /path/to/output.log
This will analyze all pods in the my-namespace namespace with the label app=my-app, flag containers that have restarted more than 5 times, and save the results in /path/to/output.log.
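As a rough illustration of how the options above might be parsed, here is a minimal argparse sketch. The function name `parse_args` and the integer type for the threshold are assumptions made for this example; the actual script may differ.

```python
import argparse


def parse_args(argv=None):
    """Parse the four required options (hypothetical helper, not the script's own)."""
    parser = argparse.ArgumentParser(
        description="Analyze Kubernetes pod logs and restart counts."
    )
    parser.add_argument("--namespace", required=True,
                        help="Namespace of the pods to analyze.")
    parser.add_argument("--label-selector", required=True,
                        help="Label selector used to filter pods, e.g. app=my-app.")
    # Assumption: the threshold is read as an integer.
    parser.add_argument("--restart-threshold", required=True, type=int,
                        help="Restart count above which a warning is logged.")
    parser.add_argument("--output-file", required=True,
                        help="File where logs and analysis results are written.")
    # argparse turns dashes into underscores: args.label_selector, and so on.
    return parser.parse_args(argv)
```

Passing an explicit list, such as `parse_args(["--namespace", "my-namespace", "--label-selector", "app=my-app", "--restart-threshold", "5", "--output-file", "/path/to/output.log"])`, makes the parser easy to try without touching the real command line. If a required option is missing, argparse prints a usage message and exits.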
- The argument-parsing function uses argparse to handle command-line arguments, ensuring that the required parameters are provided for the script to run.
- A helper function formats log messages with timestamps and writes them to the output file.
- The main function:
  - Loads the Kubernetes configuration.
  - Fetches all pods in the specified namespace with the provided label selector.
  - Checks if any container in the pods has exceeded the restart threshold.
  - Retrieves logs for each container and flags any lines containing error keywords.
  - Writes the log data and any flagged errors to the specified output file.
  - Handles potential errors gracefully, such as issues with fetching pod data or logs, by logging error messages in the output file.
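The main function's steps can be sketched roughly as follows. This is not the script's actual code: the names `containers_over_threshold`, `flag_error_lines`, `analyze`, and the `report` callback are hypothetical, case-insensitive keyword matching is an assumption, and the message wording is modeled on the sample output. The Kubernetes calls used here (`load_kube_config`, `list_namespaced_pod`, `read_namespaced_pod_log`) come from the official Python client.

```python
from types import SimpleNamespace

ERROR_KEYWORDS = ("Error", "Failed", "Crashloopback")  # keywords named in this README


def containers_over_threshold(container_statuses, threshold):
    """Return (container_name, restart_count) pairs whose count exceeds threshold."""
    # container_statuses can be None for pods that have not started yet.
    return [
        (status.name, status.restart_count)
        for status in container_statuses or []
        if status.restart_count > threshold
    ]


def flag_error_lines(log_text, keywords=ERROR_KEYWORDS):
    """Return log lines containing any keyword (case-insensitive: an assumption)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        line for line in log_text.splitlines()
        if any(keyword in line.lower() for keyword in lowered)
    ]


def analyze(namespace, label_selector, restart_threshold, report):
    """Walk matching pods and pass each finding to the `report` callable."""
    # Imported lazily so the helpers above work without the kubernetes package.
    from kubernetes import client, config
    from kubernetes.client.rest import ApiException

    config.load_kube_config()  # reads ~/.kube/config by default
    v1 = client.CoreV1Api()
    try:
        pods = v1.list_namespaced_pod(namespace, label_selector=label_selector)
    except ApiException as exc:
        report(f"Failed to list pods: {exc}")
        return

    for pod in pods.items:
        pod_name = pod.metadata.name
        statuses = pod.status.container_statuses
        for name, restarts in containers_over_threshold(statuses, restart_threshold):
            report(f"Pod: {pod_name}, Container: {name} - Restarts: {restarts} "
                   f"has exceeded the restart threshold of {restart_threshold}")
        for container in pod.spec.containers:
            try:
                logs = v1.read_namespaced_pod_log(
                    name=pod_name, namespace=namespace, container=container.name
                )
            except ApiException as exc:
                report(f"Pod: {pod_name}, Container: {container.name} - "
                       f"could not fetch logs: {exc}")
                continue
            for line in flag_error_lines(logs):
                report(f"Flagged: {line}")


# Stand-in status objects so the pure helpers can be tried without a cluster.
fake_statuses = [
    SimpleNamespace(name="my-container", restart_count=6),
    SimpleNamespace(name="sidecar", restart_count=0),
]
print(containers_over_threshold(fake_statuses, 5))
print(flag_error_lines("starting\nError: Connection refused\nready"))
```

Keeping the restart check and the keyword search as plain functions, separate from the API calls, lets them be exercised without a cluster. Only `analyze` needs the kubernetes package and a reachable cluster.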
2024-12-19 12:34:56 - Pod my-pod, Container: my-container - Restarts: 6 has exceeded the restart threshold of 5
2024-12-19 12:35:01 - Pod: my-pod, Container: my-container - Error: Connection refused
2024-12-19 12:35:05 - Flagged: Error: Connection refused
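Timestamped lines like the ones above could be produced by a helper along these lines. This is a sketch under assumptions: the names `format_log_line` and `log_message` are hypothetical, and opening the output file in append mode is a guess about the script's behavior.

```python
from datetime import datetime


def format_log_line(message, now=None):
    """Prefix a message with a 'YYYY-MM-DD HH:MM:SS - ' timestamp."""
    now = now or datetime.now()
    return f"{now.strftime('%Y-%m-%d %H:%M:%S')} - {message}"


def log_message(message, output_file, now=None):
    """Append one timestamped line to output_file (append mode is an assumption)."""
    with open(output_file, "a", encoding="utf-8") as handle:
        handle.write(format_log_line(message, now) + "\n")
```

Accepting an optional `now` argument keeps the formatting deterministic when it is checked against a fixed timestamp.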