Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance monitor package #2046

Merged
merged 10 commits into from
Sep 26, 2023
9 changes: 9 additions & 0 deletions cmds/modules/noded/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ import (
"github.com/threefoldtech/zos/pkg/environment"
"github.com/threefoldtech/zos/pkg/events"
"github.com/threefoldtech/zos/pkg/monitord"
"github.com/threefoldtech/zos/pkg/perf"
"github.com/threefoldtech/zos/pkg/registrar"
"github.com/threefoldtech/zos/pkg/stubs"
"github.com/threefoldtech/zos/pkg/utils"
Expand Down Expand Up @@ -280,6 +281,14 @@ func action(cli *cli.Context) error {
return err
}

log.Info().Msg("Start Perf scheduler")
performanceMonitor := perf.NewPerformanceMonitor("/var/run/redis.sock")
performanceMonitor.InitScheduler()
err = performanceMonitor.RunScheduler(ctx)
if err != nil {
log.Error().Err(err).Msg("fails in scheduler")
}

log.Debug().Msg("start message bus")
for {
err := runMsgBus(ctx, sk, env.SubstrateURL, env.RelayURL, msgBrokerCon)
Expand Down
5 changes: 5 additions & 0 deletions pkg/perf/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# Perf

Perf is a performance test module than is scheduled to run and cache those tests results in redis which can be retrieved later over RMB call.

Perf tests are monitored by `noded` service from zos modules.
43 changes: 43 additions & 0 deletions pkg/perf/cache.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
package perf

import (
"context"
"encoding/json"
"time"

"github.com/pkg/errors"
)

// TestResultData the result test schema
type TestResultData struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now after checking the rest of the files, this should be TaskResult struct only task name and the result

TestName string
TestNumber uint64
Result interface{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fields missing json tags

}

// CacheResult set result in redis
func (pm *PerformanceMonitor) CacheResult(ctx context.Context, resultKey string, resultData TestResultData) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a public method. I don't expect anyone should ever call this method by himself. this should an internal method. I see many attributes and methods that are public for no obvious reason. It make the code subtle to abuse and bad usage. And allows bad design

data, err := json.Marshal(resultData)
if err != nil {
return errors.Wrap(err, "Error marshaling data to JSON")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errors should be lowercased

}

return pm.RedisClient.Set(resultKey, data, 10*time.Second).Err()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract the expiration into an unexported constant resultTTL, also 10 is too low imo,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was gonna say that data should never expire, but instead have the timestamp of when the test was run. but never expire. Some tests run every 6 hours our once per day. Hence expiration is not needed. so 0 is the right value

}

// GetCachedResult get data from redis
func (pm *PerformanceMonitor) GetCachedResult(resultKey string) (TestResultData, error) {
var res TestResultData

data, err := pm.RedisClient.Get(resultKey).Result()
if err != nil {
return res, errors.Wrap(err, "Failed getting the cached result")
}

err = json.Unmarshal([]byte(data), &res)
if err != nil {
return res, errors.Wrap(err, "Failed unmarshal data from json")
}

return res, nil
}
88 changes: 88 additions & 0 deletions pkg/perf/monitor.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
package perf

import (
"context"
"time"

"github.com/go-redis/redis"
"github.com/jasonlvhit/gocron"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
)

// Task is the task method signature
type Task func(ctx context.Context) (interface{}, error)

// PerformanceMonitor holds the module data
type PerformanceMonitor struct {
Scheduler *gocron.Scheduler
RedisClient *redis.Client
Intervals map[string]time.Duration // interval in seconds
Tasks map[string]Task
ExecutionCounts map[string]uint64
}

// NewPerformanceMonitor returns PerformanceMonitor instance
func NewPerformanceMonitor(redisAddr string) *PerformanceMonitor {
redisClient := redis.NewClient(&redis.Options{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already using a redis package in zos so no need to add a new dependency. Please move to github.com/gomodule/redigo you can already create a new pool directly by calling this util function here https://github.com/threefoldtech/zos/blob/main_perf_package/pkg/utils/redis.go#L11

where the url is just unix://path/to/socket

Network: "unix",
Addr: redisAddr,
})

redisClient.Ping()
log.Info().Msg("redis passed")

scheduler := gocron.NewScheduler()

return &PerformanceMonitor{
Scheduler: scheduler,
RedisClient: redisClient,
Tasks: make(map[string]Task),
Intervals: make(map[string]time.Duration),
ExecutionCounts: make(map[string]uint64),
}
}

// AddTask a simple helper method to add new tasks
func (pm *PerformanceMonitor) AddTask(taskName string, interval time.Duration, task Task) {
pm.Intervals[taskName] = interval
pm.Tasks[taskName] = task
pm.ExecutionCounts[taskName] = 0
}

// InitScheduler adds all the test to the scheduler queue
func (pm *PerformanceMonitor) InitScheduler() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this

pm.AddTask("TestLogging", 5, TestLogging)
}

// RunScheduler adds the tasks to the corn queue and start the scheduler
func (pm *PerformanceMonitor) RunScheduler(ctx context.Context) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name RunScheduler shouldn't be introduced, a simple Run or Start should have been enough, now whoever is using this API will need to know it's using go-cron scheduler

for key, task := range pm.Tasks {
err := pm.Scheduler.Every(uint64(pm.Intervals[key])).Seconds().Do(func() error {
testResult, err := task(ctx)
if err != nil {
return errors.Wrapf(err, "failed running test: %s", key)
}

count := pm.ExecutionCounts[key]
pm.ExecutionCounts[key]++

err = pm.CacheResult(ctx, key, TestResultData{
TestName: key,
TestNumber: count + 1,
Result: testResult,
})
if err != nil {
return errors.Wrap(err, "failed setting cache")
}

return nil
})
if err != nil {
return errors.Wrapf(err, "failed scheduling the job: %s", key)
}
}

pm.Scheduler.Start()
return nil
}
38 changes: 38 additions & 0 deletions pkg/perf/tasks.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
package perf

import (
"context"
"time"

"github.com/rs/zerolog/log"
)

type PerfTest func() (interface{}, error)

// TestLogging simple helper method that ensure scheduler is working
func TestLogging(ctx context.Context) (interface{}, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove this one, there're other logs we can see.

log.Info().Msgf("TestLogging: time is: %v", time.Now().Hour())
return nil, nil
}

// should run every 5 min
// ping 12 point, each 3 times and make average result
// iperf 12 grid node with ipv4
// iperf 12 grid node with ipv6
// uses the iperf binary
func TestNetworkPerformance(ctx context.Context) (interface{}, error) {
return nil, nil
}

// should run every 6min
// upload/download a 1 MB file to any point
func TestNetworkLoading(ctx context.Context) (interface{}, error) {
return nil, nil
}

// should run every 6 hours
// measure cpu, mem, disk performance and usage
// uses some tool that do the mentoring
func TestResourcesPerformance(ctx context.Context) (interface{}, error) {
return nil, nil
}