[DISCUSSION] More SqlLogicTest test coverage for queries, including join queries #13470

findepi · 2024-11-18T16:12:14Z

During SDF's upgrade to DataFusion 43 we found a bug #13425 . This was possible thanks to extensive test coverage for certain query shapes that we have internally, which is good. The bad part is that the bug could be caught at the PR stage, or before the release, should similar tests exist in DataFusion project.

Would it be useful if there were more SLT query tests, including the ones that allowed to catch #13425?

cc @alamb @schulte-lukas @gliga

alamb · 2024-11-19T11:34:13Z

Yes, 100% I think adding additional test coverage in all areas would be super helpful

We / I also regularly hit bugs that are caught by InfluxData's test suite during upgrades of DataFusion but were not caught during the DataFusion PR

Other Areas that would benefit from improved testing from my experience:

Metadata handling (like attaching field level metadata, etc)
UNION coverage (which we use extensively)
Dictionary handling (which we also use a lot)

Dandandan · 2024-11-22T16:32:53Z

One way to extend join coverage is to have some more "answer checking" for different benchmarks, such as:

Verify TPC-DS answers #13073

(Earlier doing this for TPC-H found many issues in joins handling, decimal issues, etc.)

We can do the same for the imdb join order benchmark, clickbench, etc.

Omega359 · 2024-11-24T16:10:30Z

I have been looking over the sqllogictests this weekend and I think first of all that they need a good reorganization to clean them up, split them into smaller chunks testing just specific portions of functionality and to reduce the size of some of the files.

I also have started looking at other projects's tests to see how they've done their testing. I am unsure about actually incorporating some of those tests directly into DF (duckdb for example, MIT license vs Apache) but they could be used as a basis for seeing where DF's test suite is falling short and expanding the test suite some more.

Omega359 · 2024-11-29T16:45:06Z

I came across this paper which I found interesting for join testing. It won't work for DF yet since DF doesn't support query hints (at least as best as I know).

schulte-lukas · 2024-12-01T16:01:51Z

I would also love to see a separate test-runner (maybe nightly) of the official SQLLite SLTs (where applicable).

At this point DF seems to be able to pass the majority without modification (except delete). Since they are (a) pre-vetted by sqllite and (b) test foundational capability it would make sense to me that they are included in datafusion

Omega359 · 2024-12-01T16:51:31Z

I would also love to see a separate test-runner (maybe nightly) of the official SQLLite SLTs (where applicable).

At this point DF seems to be able to pass the majority without modification (except delete). Since they are (a) pre-vetted by sqllite and (b) test foundational capability it would make sense to me that they are included in datafusion

I actually am part way through porting those slt's to something that df's slt test runner can run

alamb · 2024-12-02T16:50:59Z

Can't wait to see what happens here

comphead · 2024-12-02T16:58:17Z

We / I also regularly hit bugs that are caught by InfluxData's test suite during upgrades of DataFusion but were not caught during the DataFusion PR

I'm wondering what are test strategies in Influx? is it something we can reuse in DF as well to avoid regression

alamb · 2024-12-02T17:24:24Z

I'm wondering what are test strategies in Influx? is it something we can reuse in DF as well to avoid regression

We have a bunch of our own basic tests for end to end functionality in our system (some is SQL based, some is "InfluxQL" based)

We have often seen issues where features we use heavily (metadata, and UnionExec, for example) are not as heavily covered in the existing DataFusion test suites 🤔

comphead · 2024-12-02T17:42:16Z

I'm wondering what are test strategies in Influx? is it something we can reuse in DF as well to avoid regression

We have a bunch of our own basic tests for end to end functionality in our system (some is SQL based, some is "InfluxQL" based)

We have often seen issues where features we use heavily (metadata, and UnionExec, for example) are not as heavily covered in the existing DataFusion test suites 🤔

is it real world use cases ?

if its not I found something used for MariaDB and get investigate it https://mariadb.com/kb/en/random-query-generator-tests/

Omega359 · 2024-12-02T17:47:52Z

After the slt work I was thinking of taking a look at the work that @2010YOUY01 has done with sqlancer to see if there are areas that I could add into it. I noticed that the datafusion-contrib for that was merged into sqlancer main a few months back - https://github.com/sqlancer/sqlancer/tree/main/src/sqlancer/datafusion.

sqlancer has quite a few options wrt test oracles that might be useful to look at. The one I was looking at (DQP) I think would need support for query hints (or corresponding set options) to function but if we could get it to work would allow a lot of join testing to happen.

alamb · 2024-12-02T18:38:23Z

I agree -- I think using sqlancer is a great idea -- I suspect it simply needs someone to maintain and file tickets

Omega359 · 2024-12-03T23:19:10Z

I would also love to see a separate test-runner (maybe nightly) of the official SQLLite SLTs (where applicable).
At this point DF seems to be able to pass the majority without modification (except delete). Since they are (a) pre-vetted by sqllite and (b) test foundational capability it would make sense to me that they are included in datafusion

I actually am part way through porting those slt's to something that df's slt test runner can run

I've gotten many of them to run so far after removing or changing troublesome features (create index, valuesort, etc). There are differences with sqllite vs DataFusion wrt how sqllite handles integers vs real numbers that I'm still working through slowly. I do see a number of failures that I believe are real or at least point to limitations of DF. I'm hoping to have a full 'clean' run sometime in the next week that only fails on likely valid queries.

Here is an example of something that is currently failing:

CREATE TABLE tab0(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
INSERT INTO tab0 VALUES(0,1164,1149.64,'eyxra',1147,1174.8,'fpigl');
INSERT INTO tab0 VALUES(1,1165,1150.72,'zqxbf',1148,1175.52,'zmukh');
INSERT INTO tab0 VALUES(2,1166,1154.12,'itglg',1149,1176.1,'cfmkm');
INSERT INTO tab0 VALUES(3,1167,1156.44,'xmaca',1150,1177.85,'ggpiv');
INSERT INTO tab0 VALUES(4,1168,1157.98,'lewrx',1151,1178.59,'hjscv');
INSERT INTO tab0 VALUES(5,1169,1158.81,'enubf',1152,1180.13,'apmjh');
INSERT INTO tab0 VALUES(6,1171,1159.96,'olnov',1153,1181.89,'smeyx');
INSERT INTO tab0 VALUES(7,1172,1160.51,'yngfz',1154,1182.90,'vakuy');
INSERT INTO tab0 VALUES(8,1174,1161.72,'awjqq',1155,1183.93,'qzqaw');
SELECT pk, col0 FROM tab0 WHERE col0 IN (SELECT col3 FROM tab0 WHERE col4 > 73.18) ORDER BY 2 DESC;

SanityCheckPlan
caused by
Error during planning: Plan: ["SortPreservingMergeExec: [col0@1 DESC]", "  CoalesceBatchesExec: target_batch_size=8192", "    HashJoinExec: mode=Partitioned, join_type=RightSemi, on=[(col3@0, col0@1)]", "      CoalesceBatchesExec: target_batch_size=8192", "        RepartitionExec: partitioning=Hash([col3@0], 20), input_partitions=20", "          RepartitionExec: partitioning=RoundRobinBatch(20), input_partitions=1", "            CoalesceBatchesExec: target_batch_size=8192", "              FilterExec: CAST(col4@1 AS Float64) > 73.18, projection=[col3@0]", "                MemoryExec: partitions=1, partition_sizes=[9]", "      SortExec: expr=[pk@0 DESC], preserve_partitioning=[true]", "        CoalesceBatchesExec: target_batch_size=8192", "          RepartitionExec: partitioning=Hash([col0@1], 20), input_partitions=1", "            MemoryExec: partitions=1, partition_sizes=[9]"] does not satisfy order requirements: [col0@1 DESC]. Child-0 order: [[pk@0 DESC]]

gliga · 2024-12-04T19:16:44Z

It would be great if there was a common infrastructure across
databases for slt. High level shown below.

------------     --------------   ----------------
|  DF slt  |     | SQLite slt |   | whatever slt |
------------     --------------   ----------------
    |                 |                 |
 ----------------------------------------------    --------------
 |     slt transpiler                         |<---| slt bank   |
 ----------------------------------------------    --------------
    |                 |                 |
    v                 v                 v
------------     --------------   ----------------
|  DF slt  |     | SQLite slt |   | whatever slt |
------------     --------------   ----------------

Boxes on the top are existing slt tests available in various
databases/repositories, e.g., SQLite (which is mostly discussed in
this thread).

slt bank is a set of slt tests independent of any dialect (or even
potentially dialect specific) that is contributed by third party
groups, e.g., researchers.

slt transpiler takes existing slt tests and transpiles to the
desired target (e.g., DF). Translation steps could be:

Take one record at a time
Transpile SQL to the target dialect (using one of the existing tools)
Adjust output (likely most challenging)
Output record: if it works keep it if it does not comment out in the output

The transpiler would also be a repository of links to existing slts.

Boxes at the bottom are the resulting slt tests.

Ideally, the transpilation would not be a one off process, but it
would be done for example every time slt tests are run. Benefit of
such an approach would be to ensure that any updates made to existing
tests (e.g., new tests added to SQLite) are reflected in the target
run (e.g., DF runs).

gliga · 2024-12-04T19:17:36Z

@Omega359 I see you went quite far and already had some good findings.
Would you need any help at this point?

Omega359 · 2024-12-04T19:27:42Z

@Omega359 I see you went quite far and already had some good findings. Would you need any help at this point?

Right now I'm just working through issues with the sqlite test files. My biggest issue actually is that the test runner seems to get very very very very slow sometimes (most of the time actually). I don't know why atm - I forked off the rust sqllogictest repo and tried a few things to try and automate converting value-oriented results to row-oriented results and at one point it ran super fast but otherwise it's 'fire off a run and come back in 1/2 hour'.

I'll push up my changes to my repo on github in the next day or so. I suspect it's a good few days of solid work going through all the tests and fixing/changing things.

Omega359 · 2024-12-04T19:48:37Z

It would be great if there was a common infrastructure across databases for slt.

Boxes on the top are existing slt tests available in various databases/repositories, e.g., SQLite (which is mostly discussed in this thread).

slt bank is a set of slt tests independent of any dialect (or even potentially dialect specific) that is contributed by third party groups, e.g., researchers.

slt transpiler takes existing slt tests and transpiles to the desired target (e.g., DF).

The transpiler would also be a repository of links to existing slts.

Ideally, the transpilation would not be a one off process, but it would be done for example every time slt tests are run. Benefit of such an approach would be to ensure that any updates made to existing tests (e.g., new tests added to SQLite) are reflected in the target run (e.g., DF runs).

I am absolutely not against this at all - but it's a pretty large undertaking.

Currently the definition of what the format of a slt file is .. fluid. The original format from sqlite isn't directly compatible with the rust crate which has been a large part of my work. Fixing that would be step 1.

Having a base set of slt tests that work around most every db would be nice however given what I've seen so far those would be very basic tests. Even things like what the field type of a Float field isn't stable across db's (or at least the sqllogictest runner impl's). Support for create index .. not in DF (yet). Mysql using DIV for integer division vs / (mysql/oracle treat / as always float result iirc). Just a few basic examples

2010YOUY01 · 2024-12-08T18:10:02Z

After the slt work I was thinking of taking a look at the work that @2010YOUY01 has done with sqlancer to see if there are areas that I could add into it. I noticed that the datafusion-contrib for that was merged into sqlancer main a few months back - https://github.com/sqlancer/sqlancer/tree/main/src/sqlancer/datafusion.

sqlancer has quite a few options wrt test oracles that might be useful to look at. The one I was looking at (DQP) I think would need support for query hints (or corresponding set options) to function but if we could get it to work would allow a lot of join testing to happen.

I think adding test oracles like DQP to sqllogictests is also a project worth to do.

SQLancer is a fuzzer implementation with query generator + test oracles (for property tests), and these two component are loosely coupled: test oracles don't necessarily have to be implemented with SQLancer's query generator, the invariants can be tested as long as there is a valid query, maybe the best strategy is to implement them both in existing test cases(slt) and SQLancer's randomly generated queries.

For DQP, I remember there is a configuration we can turn off 🤔, and there will be no join-reordering, so join order can be controlled by writing table order in SQL queries differently.

alamb mentioned this issue Nov 22, 2024

[DISCUSSION] Making it easier to use DataFusion (lessons from GlareDB) #13525

Open

10 tasks

alamb mentioned this issue Dec 2, 2024

Nov 20. 2024: This week in DataFusion #13503

Closed

3 tasks

alamb mentioned this issue Dec 3, 2024

Dec 3. 2024: This week in DataFusion #13630

Open

3 tasks

Omega359 mentioned this issue Dec 4, 2024

Upgrade from 40 to 43 causes utf8 timestamp queries to fail #13625

Open

alamb mentioned this issue Dec 4, 2024

[EPIC] A collection of items to improve DataFuson stability (reduce effort required to upgrade) #13648

Open

10 tasks

findepi mentioned this issue Dec 5, 2024

[DISCUSS] More extensive pre-release testing #13661

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DISCUSSION] More SqlLogicTest test coverage for queries, including join queries #13470

[DISCUSSION] More SqlLogicTest test coverage for queries, including join queries #13470

findepi commented Nov 18, 2024

alamb commented Nov 19, 2024

Dandandan commented Nov 22, 2024 •

edited by alamb

Loading

Omega359 commented Nov 24, 2024

Omega359 commented Nov 29, 2024

schulte-lukas commented Dec 1, 2024

Omega359 commented Dec 1, 2024

alamb commented Dec 2, 2024

comphead commented Dec 2, 2024

alamb commented Dec 2, 2024

comphead commented Dec 2, 2024

Omega359 commented Dec 2, 2024

alamb commented Dec 2, 2024

Omega359 commented Dec 3, 2024

gliga commented Dec 4, 2024

gliga commented Dec 4, 2024

Omega359 commented Dec 4, 2024

Omega359 commented Dec 4, 2024

2010YOUY01 commented Dec 8, 2024

[DISCUSSION] More SqlLogicTest test coverage for queries, including join queries #13470

[DISCUSSION] More SqlLogicTest test coverage for queries, including join queries #13470

Comments

findepi commented Nov 18, 2024

alamb commented Nov 19, 2024

Dandandan commented Nov 22, 2024 • edited by alamb Loading

Omega359 commented Nov 24, 2024

Omega359 commented Nov 29, 2024

schulte-lukas commented Dec 1, 2024

Omega359 commented Dec 1, 2024

alamb commented Dec 2, 2024

comphead commented Dec 2, 2024

alamb commented Dec 2, 2024

comphead commented Dec 2, 2024

Omega359 commented Dec 2, 2024

alamb commented Dec 2, 2024

Omega359 commented Dec 3, 2024

gliga commented Dec 4, 2024

gliga commented Dec 4, 2024

Omega359 commented Dec 4, 2024

Omega359 commented Dec 4, 2024

2010YOUY01 commented Dec 8, 2024

Dandandan commented Nov 22, 2024 •

edited by alamb

Loading