Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Baseline Experiments. #345

Closed
wants to merge 51 commits into from
Closed

Baseline Experiments. #345

wants to merge 51 commits into from

Conversation

amlatyrngom
Copy link
Contributor

No description provided.

@amlatyrngom amlatyrngom requested a review from geoffxy November 2, 2023 14:02
Copy link
Member

@geoffxy geoffxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @amlatyrngom. Can you test this out on the 100 GB dataset with the scale down workload? experiments/15-e2e-scenarios-v2/scale_down/run_workload.sh. The script assumes it is being run by a tool called Conductor. The easiest way to run it without the tool for testing is to set the COND_OUT environment variable to a path where you want to store the results.

Comment on lines +159 to +161
@pytest.mark.skip(
reason="TODO(Amadou): This is failing even I haven't changed it. Flaky test?"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should be fixed now - you can remove this decorator and the test should pass.

@@ -0,0 +1,100 @@
# See workloads/cross_db_benchmark/benchmark_tools/tidb/README.md

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this file into workloads/IMDB_extended

cursor = self._cursor
# Exec
cursor.execute(query)
if cursor.rowcount is None or cursor.rowcount <= 0 or not(query.strip().lower().startswith("SELECT")):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query.strip().lower().startswith("SELECT") is always going to be false?

Comment on lines +40 to +48
# Get cursor
if self._cursor is None:
had_cursor = False
cursor = self._conn.cursor()
else:
had_cursor = True
cursor = self._cursor
# Exec
cursor.execute(query)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these changes to support a MySQL connector? If so, please just create another Database class that is specialized for MySQL instead of modifying this class.

if cursor.rowcount is None or cursor.rowcount <= 0 or not(query.strip().lower().startswith("SELECT")):
rows = []
else:
print(f"Rows: {cursor.rowcount}. Q: {query}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please use logger.debug() instead of print(). Otherwise this will get very verbose during a workload run.

dataset_type = "20gb"
else:
dataset_type = "original"
worker = TransactionWorker(worker_idx, args.seed ^ worker_idx, args.scale_factor, dataset_type=dataset_type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops - thanks for catching this. Instead of setting dataset_type here based on flags, I'd suggest we make it an argparse argument instead.

parser.add_argument(
"--scale-factor",
type=int,
default=1,
default=6,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please keep this at 1 for now.

Comment on lines +98 to +100
if __name__ == "__main__":
yaml_main()
sys.exit(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed?

Comment on lines 220 to 225
parser.add_argument(
"--output-dir",
type=str,
default=".",
help="Environment variable that stores the output directory of tidb bench",
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still used? Also it seems like the help string is not correct since the default is a path and not an environment variable?

@geoffxy
Copy link
Member

geoffxy commented Sep 11, 2024

Closing since I think the intention was to not merge given the code divergence.

@geoffxy geoffxy closed this Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants