Enhance SkillTreeViewModel with Benchmark Runs Tracking and Leaderboard Submission

This commit extends the SkillTreeViewModel so that it tracks the runs of the current benchmark and can submit them to the leaderboard. A new list, `currentBenchmarkRuns`, stores each benchmark run object produced during a benchmark session. The list is reset to empty when a new benchmark starts and is cleared again after a submission.

Changes made:
- Introduced `currentBenchmarkRuns` to hold the runs of the benchmark in progress, so each result is available as soon as its run has been evaluated.
- Updated `runBenchmark` to reset `currentBenchmarkRuns` at the start of a benchmark and to append each decoded `BenchmarkRun` as the benchmark progresses.
- Implemented `submitToLeaderboard`, which accepts `teamName`, `repoUrl`, and `agentGitCommitSha`, writes them to every run, and submits the runs one at a time. All runs in a submission share a single v4 UUID generated at the start of the submission.

With these changes, the run data for a benchmark session is collected in one place and tagged consistently before it is submitted to the leaderboard, and each run's outcome is available to the UI as the benchmark progresses.
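The flow described above (reset, append per run, tag and submit with one shared ID, then clear) can be sketched in isolation as follows. This is a minimal, self-contained approximation and not the code from the commit: `RunSketch`, `FakeLeaderboard`, `BenchmarkSession`, and `_randomId` are hypothetical stand-ins for `BenchmarkRun`, `leaderboardService`, the view model, and `const Uuid().v4()`, and the test names, team name, and URL are placeholders.

```dart
import 'dart:math';

// Hypothetical, simplified stand-in for the app's BenchmarkRun model.
class RunSketch {
  RunSketch(this.testName, this.success);

  final String testName;
  final bool success;

  String? teamName;
  String? repoUrl;
  String? agentGitCommitSha;
  String? runId;
}

// Hypothetical stand-in for the leaderboard service: records what it receives.
class FakeLeaderboard {
  final List<RunSketch> received = [];

  Future<void> submitReport(RunSketch run) async {
    received.add(run);
  }
}

class BenchmarkSession {
  BenchmarkSession(this.leaderboard);

  final FakeLeaderboard leaderboard;
  List<RunSketch> currentRuns = [];

  // Called at the start of a new benchmark: drop runs from any earlier session.
  void start() {
    currentRuns = [];
  }

  // Called once per evaluated node.
  void record(RunSketch run) {
    currentRuns.add(run);
  }

  Future<void> submit(String teamName, String repoUrl, String sha) async {
    // One ID for the whole submission, so the runs can be grouped later.
    final runId = _randomId();
    for (final run in currentRuns) {
      run
        ..teamName = teamName
        ..repoUrl = repoUrl
        ..agentGitCommitSha = sha
        ..runId = runId;
      await leaderboard.submitReport(run);
    }
    currentRuns.clear();
  }
}

// Random 128-bit hex string; a placeholder for `const Uuid().v4()`, not a real UUID.
String _randomId() {
  final random = Random.secure();
  return List.generate(
      16, (_) => random.nextInt(256).toRadixString(16).padLeft(2, '0')).join();
}

Future<void> main() async {
  final leaderboard = FakeLeaderboard();
  final session = BenchmarkSession(leaderboard)..start();
  session.record(RunSketch('TestWriteFile', true));
  session.record(RunSketch('TestReadFile', false));

  await session.submit('example-team', 'https://example.com/repo', 'abc1234');

  final ids = leaderboard.received.map((run) => run.runId).toSet();
  print('submitted: ${leaderboard.received.length}, distinct run IDs: ${ids.length}');
  print('pending after submit: ${session.currentRuns.length}');
}
```

In this sketch every submitted run carries the same ID and the pending list is empty afterwards. Note that the list is only cleared once the loop completes, so if a `submitReport` call threw partway through, the runs would remain pending; how the real view model should handle that case is not addressed by the commit.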
hunteraraujo committed Sep 27, 2023
1 parent 9c1b55b commit e4d84da
Showing 1 changed file with 28 additions and 1 deletion.
29 changes: 28 additions & 1 deletion frontend/lib/viewmodels/skill_tree_viewmodel.dart
@@ -17,6 +17,7 @@ import 'package:collection/collection.dart';
import 'package:flutter/foundation.dart';
import 'package:flutter/services.dart';
import 'package:graphview/GraphView.dart';
import 'package:uuid/uuid.dart';

class SkillTreeViewModel extends ChangeNotifier {
// TODO: Potentially move to task queue view model when we create one
@@ -26,8 +27,11 @@ class SkillTreeViewModel extends ChangeNotifier {
// TODO: Potentially move to task queue view model when we create one
bool isBenchmarkRunning = false;
// TODO: Potentially move to task queue view model when we create one
// TODO: clear when clicking a new node
Map<SkillTreeNode, BenchmarkTaskStatus> benchmarkStatusMap = {};

List<BenchmarkRun> currentBenchmarkRuns = [];

List<SkillTreeNode> _skillTreeNodes = [];
List<SkillTreeEdge> _skillTreeEdges = [];
SkillTreeNode? _selectedNode;
@@ -156,6 +160,9 @@ class SkillTreeViewModel extends ChangeNotifier {
// Clear the benchmarkStatusMap
benchmarkStatusMap.clear();

// Reset the current benchmark runs list to be empty at the start of a new benchmark
currentBenchmarkRuns = [];

// Create a new TestSuite object with the current timestamp
final testSuite =
TestSuite(timestamp: DateTime.now().toIso8601String(), tests: []);
@@ -215,11 +222,15 @@ class SkillTreeViewModel extends ChangeNotifier {
// Decode the evaluationResponse into a BenchmarkRun object
BenchmarkRun benchmarkRun = BenchmarkRun.fromJson(evaluationResponse);

// Add the benchmark run object to the list of current benchmark runs
currentBenchmarkRuns.add(benchmarkRun);

// Update the benchmarkStatusMap based on the evaluation response
bool successStatus = benchmarkRun.metrics.success;
benchmarkStatusMap[node] = successStatus
? BenchmarkTaskStatus.success
: BenchmarkTaskStatus.failure;
// await Future.delayed(Duration(seconds: 2));
notifyListeners();

// If successStatus is false, break out of the loop
Expand All @@ -243,5 +254,21 @@ class SkillTreeViewModel extends ChangeNotifier {
 }

 // TODO: Move to task queue view model
-Future<void> submitToLeaderboard() async {}
+Future<void> submitToLeaderboard(
+String teamName, String repoUrl, String agentGitCommitSha) async {
+// Create a UUID.v4 for our unique run ID
+String uuid = const Uuid().v4();
+
+for (var run in currentBenchmarkRuns) {
+run.repositoryInfo.teamName = teamName;
+run.repositoryInfo.repoUrl = repoUrl;
+run.repositoryInfo.agentGitCommitSha = agentGitCommitSha;
+run.runDetails.runId = uuid;
+
+await leaderboardService.submitReport(run);
+}
+
+// Clear the currentBenchmarkRuns list after submitting to the leaderboard
+currentBenchmarkRuns.clear();
+}
 }
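The diff empties `currentBenchmarkRuns` in two different ways: `runBenchmark` reassigns the field to a new list, while `submitToLeaderboard` calls `clear()` on the existing one. The two behave differently for any code that still holds a reference to the old list. Whether anything in the app holds such a reference is not visible in this diff, so the following is only a sketch of the general Dart behavior, with hypothetical function names and placeholder values:

```dart
// Returns [length seen through the alias, length seen through `runs`]
// after `runs` is reassigned to a fresh list.
List<int> lengthsAfterReassign() {
  var runs = <String>['run-1', 'run-2'];
  final alias = runs; // second reference to the same list
  runs = <String>[]; // `runs` now names a new list; `alias` still sees the old one
  return [alias.length, runs.length];
}

// Same measurement, but the list is emptied in place with clear().
List<int> lengthsAfterClear() {
  final runs = <String>['run-1', 'run-2'];
  final alias = runs;
  runs.clear(); // empties the single shared list
  return [alias.length, runs.length];
}

void main() {
  print(lengthsAfterReassign()); // [2, 0]
  print(lengthsAfterClear()); // [0, 0]
}
```

Reassignment leaves earlier holders with the old contents, whereas `clear()` empties the list for every holder at once.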
