Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log install preflight errors/warnings in both the CLI and kotsadm pod #4046

Merged
merged 16 commits into from
Oct 9, 2023
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
53 changes: 42 additions & 11 deletions cmd/kots/cli/install.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@ import (
"encoding/json"
"fmt"
"io"
"io/ioutil"
"mime/multipart"
"net/http"
"net/url"
Expand All @@ -33,6 +32,7 @@ import (
"github.com/replicatedhq/kots/pkg/kurl"
"github.com/replicatedhq/kots/pkg/logger"
"github.com/replicatedhq/kots/pkg/metrics"
"github.com/replicatedhq/kots/pkg/print"
"github.com/replicatedhq/kots/pkg/pull"
"github.com/replicatedhq/kots/pkg/replicatedapp"
"github.com/replicatedhq/kots/pkg/store/kotsstore"
Expand All @@ -47,6 +47,10 @@ import (
"k8s.io/client-go/kubernetes"
)

var client = &http.Client{
Copy link
Member Author

@banjoh banjoh Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, there was no timeout. Just being defensive here.

Timeout: time.Second * 30,
}

func InstallCmd() *cobra.Command {
cmd := &cobra.Command{
Use: "install [upstream uri]",
Expand Down Expand Up @@ -321,7 +325,7 @@ func InstallCmd() *cobra.Command {

log.ActionWithoutSpinner("Extracting airgap bundle")

airgapRootDir, err := ioutil.TempDir("", "kotsadm-airgap")
airgapRootDir, err := os.MkdirTemp("", "kotsadm-airgap")
if err != nil {
return errors.Wrap(err, "failed to create temp dir")
}
Expand Down Expand Up @@ -426,6 +430,7 @@ func InstallCmd() *cobra.Command {
case err := <-errChan:
if err != nil {
log.Error(err)
// TODO: Why is this a negative exit code?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question... I think this end up resulting in a 255 exit code:

$ echo $?
255

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's fix this and set it to 255

os.Exit(-1)
}
case <-stopCh:
Expand All @@ -446,7 +451,13 @@ func InstallCmd() *cobra.Command {
log.ActionWithSpinner("Waiting for preflight checks to complete")
if err := ValidatePreflightStatus(deployOptions, authSlug, apiEndpoint); err != nil {
log.FinishSpinnerWithError()
return errors.Wrap(err, "failed to validate preflight results")
log.Errorf("Preflight checks contain warnings or errors. The application was not deployed")
perr := preflightError{}
if errors.As(err, &perr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any benefit to handling specific error types this way vs. how do it elsewhere in the codebase? For example: https://github.com/replicatedhq/kots/blob/main/pkg/handlers/snapshots.go#L202

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any benefit to handling specific error types this way vs. how do it elsewhere in the codebase? For example: https://github.com/replicatedhq/kots/blob/main/pkg/handlers/snapshots.go#L202

They both achieve the same result I think. errors.Cause() is from github.com/pkg/errors library while errors.As() is part of go standard library

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they achieve the same thing, let's just stick to errors.Cause() for consistency?

print.PreflightErrors(log, perr.Results)
cmd.SilenceErrors = true // Stop Cobra from printing the error, we print it ourselves
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of silencing the errors, can we just return/print a generic error message?

Copy link
Member Author

@banjoh banjoh Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of silencing the errors, can we just return/print a generic error message?

That extra duplicated line after the formatted preflight check results at the end did not feel right. This is how it ends up looking like

evans@evans-vm3:~$ ./kots install thanos-reloaded -n default --shared-password password --license-file ~/thanos_license.yaml --kubeconfig /etc/rancher/k3s/k3s.yaml
  • Deploying Admin Console
    • Creating namespace ✓
    • Waiting for datastore to be ready ✓
  • Waiting for Admin Console to be ready ✓
  • Waiting for installation to complete ✓
  • Waiting for preflight checks to complete ✓
  • There are preflight check warnings for the application. The app was not deployed.

STATE    TITLE                          MESSAGE
PASS     Required Kubernetes Version    Your cluster meets the recommended and required versions of Kubernetes.
PASS     Container Runtime              containerd container runtime was found.
WARN     Kubernetes Distribution        Unable to determine the distribution of Kubernetes.
WARN     Total CPU Cores                The cluster should contain at least 4 cores.

Error: There are preflight check warnings for the application. The app was not deployed.

NOTE: Any other errors will get printed, we just silence the errors when we format the preflight checks table since we are printing our own error message.

Copy link
Member Author

@banjoh banjoh Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it is as an image to show the colours

image

}
return err
}
log.FinishSpinner()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should leave this as we need to stop the spinner in the case where preflights pass too. Otherwise we are left with an indefinite spinner:

  • Deploying Admin Console
    • Creating namespace ✓  
    • Waiting for datastore to be ready ✓  
  • Waiting for Admin Console to be ready ✓  
  • Waiting for installation to complete ✓  
  • Waiting for preflight checks to complete ⠇
  • Press Ctrl+C to exit
  • Go to http://localhost:8800 to access the Admin Console

  • Waiting for preflight checks to complete ⠏

case storetypes.VersionPendingConfig:
Expand Down Expand Up @@ -692,7 +703,7 @@ func getIngressConfig(v *viper.Viper) (*kotsv1beta1.IngressConfig, error) {

ingressConfig := kotsv1beta1.IngressConfig{}
if ingressConfigPath != "" {
content, err := ioutil.ReadFile(ingressConfigPath)
content, err := os.ReadFile(ingressConfigPath)
if err != nil {
return nil, errors.Wrap(err, "failed to read ingress service config file")
}
Expand All @@ -719,7 +730,7 @@ func getIdentityConfig(v *viper.Viper) (*kotsv1beta1.IdentityConfig, error) {

identityConfig := kotsv1beta1.IdentityConfig{}
if identityConfigPath != "" {
content, err := ioutil.ReadFile(identityConfigPath)
content, err := os.ReadFile(identityConfigPath)
if err != nil {
return nil, errors.Wrap(err, "failed to read identity service config file")
}
Expand Down Expand Up @@ -910,7 +921,7 @@ func getAutomatedInstallStatus(url string, authSlug string) (*kotsstore.TaskStat
return nil, errors.Errorf("unexpected status code %d", resp.StatusCode)
}

b, err := ioutil.ReadAll(resp.Body)
b, err := io.ReadAll(resp.Body)
if err != nil {
return nil, errors.Wrap(err, "failed to read response body")
}
Expand Down Expand Up @@ -964,7 +975,7 @@ func getPreflightResponse(url string, authSlug string) (*handlers.GetPreflightRe
newReq.Header.Add("Content-Type", "application/json")
newReq.Header.Add("Authorization", authSlug)

resp, err := http.DefaultClient.Do(newReq)
resp, err := client.Do(newReq)
if err != nil {
return nil, errors.Wrap(err, "failed to execute request")
}
Expand All @@ -974,7 +985,7 @@ func getPreflightResponse(url string, authSlug string) (*handlers.GetPreflightRe
return nil, errors.Errorf("unexpected status code %d", resp.StatusCode)
}

b, err := ioutil.ReadAll(resp.Body)
b, err := io.ReadAll(resp.Body)
if err != nil {
return nil, errors.Wrap(err, "failed to read response body")
}
Expand Down Expand Up @@ -1005,6 +1016,17 @@ func checkPreflightsComplete(response *handlers.GetPreflightResultResponse) (boo
return true, nil
}

type preflightError struct {
Msg string
Results []*preflight.UploadPreflightResult
}

func (e preflightError) Error() string {
return e.Msg
}

func (e preflightError) Unwrap() error { return fmt.Errorf(e.Msg) }

func checkPreflightResults(response *handlers.GetPreflightResultResponse) (bool, error) {
if response.PreflightResult.Result == "" {
return false, nil
Expand All @@ -1027,14 +1049,23 @@ func checkPreflightResults(response *handlers.GetPreflightResultResponse) (bool,
}

if isWarn && isFail {
return false, errors.New("There are preflight check failures and warnings for the application. The app was not deployed.")
return false, preflightError{
Msg: "There are preflight check failures and warnings for the application. The app was not deployed.",
Results: results.Results,
}
}

if isWarn {
return false, errors.New("There are preflight check warnings for the application. The app was not deployed.")
return false, preflightError{
Msg: "There are preflight check warnings for the application. The app was not deployed.",
Results: results.Results,
}
}
if isFail {
return false, errors.New("There are preflight check failures for the application. The app was not deployed.")
return false, preflightError{
Msg: "There are preflight check failures for the application. The app was not deployed.",
Results: results.Results,
}
}

return true, nil
Expand Down
8 changes: 4 additions & 4 deletions cmd/kots/cli/install_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -223,7 +223,7 @@ var _ = Describe("Install", func() {
})

It("returns an error if the preflight collection results cannot be parsed", func() {
invalidPreflightCollection, err := json.Marshal(handlers.GetPreflightResultResponse{
invalidPreflightCollection, _ := json.Marshal(handlers.GetPreflightResultResponse{
PreflightProgress: `{invalid: json}`,
PreflightResult: preflighttypes.PreflightResult{},
})
Expand All @@ -239,13 +239,13 @@ var _ = Describe("Install", func() {
),
)

err = ValidatePreflightStatus(validDeployOptions, authSlug, server.URL())
err := ValidatePreflightStatus(validDeployOptions, authSlug, server.URL())
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("failed to unmarshal collect progress for preflights"))
})

It("returns an error if the upload preflight results are invalid", func() {
invalidUploadPreflightResponse, err := json.Marshal(handlers.GetPreflightResultResponse{
invalidUploadPreflightResponse, _ := json.Marshal(handlers.GetPreflightResultResponse{
PreflightProgress: "",
PreflightResult: preflighttypes.PreflightResult{
Result: "{invalid: json}",
Expand All @@ -262,7 +262,7 @@ var _ = Describe("Install", func() {
),
)

err = ValidatePreflightStatus(validDeployOptions, authSlug, server.URL())
err := ValidatePreflightStatus(validDeployOptions, authSlug, server.URL())
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("failed to unmarshal upload preflight results"))
})
Expand Down
3 changes: 2 additions & 1 deletion cmd/kots/cli/set-config.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import (
"bytes"
"encoding/json"
"fmt"
"io"
"io/ioutil"
"net/http"
"net/url"
Expand Down Expand Up @@ -127,7 +128,7 @@ func SetConfigCmd() *cobra.Command {
}
defer resp.Body.Close()

respBody, err := ioutil.ReadAll(resp.Body)
respBody, err := io.ReadAll(resp.Body)
if err != nil {
log.FinishSpinnerWithError()
return errors.Wrap(err, "failed to read server response")
Expand Down
2 changes: 1 addition & 1 deletion pkg/handlers/preflight.go
Original file line number Diff line number Diff line change
Expand Up @@ -201,7 +201,7 @@ func (h *Handler) StartPreflightChecks(w http.ResponseWriter, r *http.Request) {
return
}

archiveDir, err := ioutil.TempDir("", "kotsadm")
archiveDir, err := os.MkdirTemp("", "kotsadm")
if err != nil {
logger.Error(errors.Wrap(err, "failed to create temp dir"))
w.WriteHeader(http.StatusInternalServerError)
Expand Down
3 changes: 1 addition & 2 deletions pkg/kotsutil/troubleshoot.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@ package kotsutil

import (
"context"
"io/ioutil"
"os"
"path/filepath"

Expand All @@ -22,7 +21,7 @@ func LoadTSKindsFromPath(dir string) (*troubleshootloader.TroubleshootKinds, err
return nil
}

contents, err := ioutil.ReadFile(path)
contents, err := os.ReadFile(path)
if err != nil {
return errors.Wrap(err, "failed to read file")
}
Expand Down
3 changes: 2 additions & 1 deletion pkg/kurl/kurl_nodes.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ import (
"io/ioutil"
"math"
"net/http"
"os"
"strconv"
"time"

Expand Down Expand Up @@ -140,7 +141,7 @@ func getNodeMetrics(nodeIP string) (*statsv1alpha1.Summary, error) {
port := 10255

// only use mutual TLS if client cert exists
_, err := ioutil.ReadFile("/etc/kubernetes/pki/kubelet/client.crt")
_, err := os.ReadFile("/etc/kubernetes/pki/kubelet/client.crt")
if err == nil {
cert, err := tls.LoadX509KeyPair("/etc/kubernetes/pki/kubelet/client.crt", "/etc/kubernetes/pki/kubelet/client.key")
if err != nil {
Expand Down
13 changes: 9 additions & 4 deletions pkg/preflight/execute.go
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ func setPreflightResult(appID string, sequence int64, preflightResults *types.Pr
// execute will execute the preflights using spec in preflightSpec.
// This spec should be rendered, no template functions remaining
func execute(appID string, sequence int64, preflightSpec *troubleshootv1beta2.Preflight, ignorePermissionErrors bool) (*types.PreflightResults, error) {
logger.Debug("executing preflight checks",
logger.Info("executing preflight checks",
zap.String("appID", appID),
zap.Int64("sequence", sequence))

Expand All @@ -65,7 +65,12 @@ func execute(appID string, sequence int64, preflightSpec *troubleshootv1beta2.Pr
if err, ok := msg.(error); ok {
logger.Errorf("error while running preflights: %v", err)
} else {
logger.Infof("preflight progress: %v", msg)
switch m := msg.(type) {
case preflight.CollectProgress:
logger.Infof("preflight progress: %s", m.String())
default:
logger.Infof("preflight progress: %+v", msg)
}
}

progress, ok := msg.(preflight.CollectProgress)
Expand Down Expand Up @@ -121,7 +126,7 @@ func execute(appID string, sequence int64, preflightSpec *troubleshootv1beta2.Pr
KubernetesRestConfig: restConfig,
}

logger.Debug("preflight collect phase")
logger.Info("preflight collect phase")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this to info cause they show progress when performing an installation, which is critical functionality IMO to always log

collectResults, err := troubleshootpreflight.Collect(collectOpts, preflightSpec)
if err != nil && !isPermissionsError(err) {
preflightRunError = err
Expand All @@ -147,7 +152,7 @@ func execute(appID string, sequence int64, preflightSpec *troubleshootv1beta2.Pr
}
uploadPreflightResults.Errors = rbacErrors
} else {
logger.Debug("preflight analyze phase")
logger.Info("preflight analyze phase")
analyzeResults := collectResults.Analyze()

// the typescript api added some flair to this result
Expand Down
32 changes: 30 additions & 2 deletions pkg/preflight/preflight.go
Original file line number Diff line number Diff line change
Expand Up @@ -160,13 +160,26 @@ func Run(appID string, appSlug string, sequence int64, isAirgap bool, archiveDir
preflight.Spec.Collectors = collectors

go func() {
logger.Debug("preflight checks beginning")
logger.Info("preflight checks beginning")
uploadPreflightResults, err := execute(appID, sequence, preflight, ignoreRBAC)
if err != nil {
logger.Error(errors.Wrap(err, "failed to run preflight checks"))
return
}
logger.Debug("preflight checks completed")

// Log the preflight results if there are any warnings or errors
// The app may not get installed so we need to see this info for debugging
if GetPreflightState(uploadPreflightResults) != "pass" {
logger.Warnf("Preflight checks completed with warnings or errors. The application will not get deployed")
for _, result := range uploadPreflightResults.Results {
if result == nil {
continue
}
logger.Infof("preflight state=%s title=%q message=%q", preflightState(*result), result.Title, result.Message)
}
} else {
logger.Info("preflight checks completed")
}

go func() {
err := reporting.GetReporter().SubmitAppInfo(appID) // send app and preflight info when preflights finish
Expand Down Expand Up @@ -216,6 +229,21 @@ func Run(appID string, appSlug string, sequence int64, isAirgap bool, archiveDir
return nil
}

func preflightState(p troubleshootpreflight.UploadPreflightResult) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use the existing GetPreflightState function`?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state from GetPreflightState is of the entire list of preflights i.e if there is an error in any, the state == error. preflightState gets the state of each preflight for logging purposes.

if p.IsFail {
return "fail"
}

if p.IsWarn {
return "warn"
}

if p.IsPass {
return "pass"
}
return "unknown"
}

// maybeDeployFirstVersion will deploy the first version if preflight checks pass
func maybeDeployFirstVersion(appID string, sequence int64, preflightResults *types.PreflightResults) (bool, error) {
if sequence != 0 {
Expand Down
24 changes: 24 additions & 0 deletions pkg/print/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (

configtypes "github.com/replicatedhq/kots/pkg/kotsadmconfig/types"
"github.com/replicatedhq/kots/pkg/logger"
"github.com/replicatedhq/troubleshoot/pkg/preflight"
)

func ConfigValidationErrors(log *logger.CLILogger, groupValidationErrors []configtypes.ConfigGroupValidationError) {
Expand All @@ -26,3 +27,26 @@ func ConfigValidationErrors(log *logger.CLILogger, groupValidationErrors []confi
log.FinishSpinnerWithError()
log.Errorf(sb.String())
}

func PreflightErrors(log *logger.CLILogger, results []*preflight.UploadPreflightResult) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not following the existing pattern of printing?

func printRestoresTable(restores []velerov1.Restore) {
w := NewTabWriter()
defer w.Flush()
fmtColumns := "%s\t%s\t%s\t%s\t%s\t%s\t%s\n"
fmt.Fprintf(w, fmtColumns, "NAME", "BACKUP", "STATUS", "STARTED", "COMPLETED", "ERRORS", "WARNINGS")
for _, r := range restores {
var startedAt *time.Time
if r.Status.StartTimestamp != nil && !r.Status.StartTimestamp.Time.IsZero() {
startedAt = &r.Status.StartTimestamp.Time
}
var completedAt *time.Time
if r.Status.CompletionTimestamp != nil && !r.Status.CompletionTimestamp.Time.IsZero() {
completedAt = &r.Status.CompletionTimestamp.Time
}
phase := r.Status.Phase
if phase == "" {
phase = "New"
}
fmt.Fprintf(w, fmtColumns, r.ObjectMeta.Name, r.Spec.BackupName, phase, startedAt, completedAt, fmt.Sprintf("%d", r.Status.Errors), fmt.Sprintf("%d", r.Status.Warnings))
}
}

that would help align the column titles with the values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not following the existing pattern of printing?

No particular reason. I'll try to adopt the same approach

var s strings.Builder
s.WriteString("\nPreflight check results (state - title - message)")
for _, result := range results {
s.WriteString(fmt.Sprintf("\n%s - %q - %q", preflightState(result), result.Title, result.Message))
}
log.Info(s.String())
}

func preflightState(r *preflight.UploadPreflightResult) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member Author

@banjoh banjoh Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll DRY this out in favour of preflights.GetPreflightCheckState which I'll rename from the existing preflights.preflightState

if r.IsFail {
return "FAIL"
}
if r.IsWarn {
return "WARN"
}
if r.IsPass {
return "PASS"
}
// We should never get here
return "UNKNOWN"
}
7 changes: 6 additions & 1 deletion pkg/reporting/distribution.go
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,12 @@ func distributionFromServerGroupAndResources(clientset kubernetes.Interface) Dis

func distributionFromProviderId(clientset kubernetes.Interface) Distribution {
nodes, err := clientset.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
logger.Infof("Found %d nodes", len(nodes.Items))
nodeCount := len(nodes.Items)
if nodeCount > 1 {
logger.Infof("Found %d nodes", nodeCount)
} else {
logger.Infof("Found %d node", nodeCount)
}
if err != nil {
logger.Infof("got error listing node: %v", err.Error())
}
Expand Down
Loading