Performance issues in simple ANY queries with OR. #686

Closed
hrach opened this issue Oct 29, 2024 · 3 comments · Fixed by #687

@hrach
Member

hrach commented Oct 29, 2024

First, thank you for moving this further. I have a few observations; they are not necessarily fatal errors, but they might create issues later. For now, just one thing:

I was able to update my project from ORM 4 to 5, finally getting my complex query to show correct results. However, the performance drop is significant, probably due to all the HAVING clauses and ON clause filters. It occurs even when using only simple "any" aggregations that worked in version 4.

$this->model->books->findBy([
	ICollection::OR,
	['id' => 3],
	['tags->id' => 1],
]);

ORM 4:

SELECT "books".* 
FROM "books" AS "books" 
LEFT JOIN "books_x_tags" AS "books_x_tags" ON ("books"."id" = "books_x_tags"."book_id") 
LEFT JOIN "tags" AS "tags" ON ("books_x_tags"."tag_id" = "tags"."id") 
WHERE (("books"."id" = 3)) OR (("tags"."id" = 1)) 
GROUP BY "books"."id"

ORM 5:

SELECT "books".* 
FROM "books" AS "books" 
LEFT JOIN "books_x_tags" AS "books_x_tags_any" ON ("books"."id" = "books_x_tags_any"."book_id") 
LEFT JOIN "tags" AS "tags_any" ON (("books_x_tags_any"."tag_id" = "tags_any"."id") AND "tags_any"."id" = 1) 
GROUP BY "books"."id", "books"."title" 
HAVING ((("books"."id" = 3)) OR ((COUNT("tags_any"."id") > 0)))

This is just a simple query, but in complex queries the drop is from milliseconds to seconds. The results are correct in both versions. While I am eager to use the new aggregation features, I also need my current queries to stay fast without changes. I don't know how hard it would be to optimize this, or whether it is even achievable. So as not to delay the release further, my suggestion is a "legacy mode" that would allow users to keep the old ORM 4 queries (possibly by overriding AnyAggregator::aggregateExpression to always just return $expression;).

Originally posted by @stepapo in #666


@hrach hrach added the bug label Oct 29, 2024
@hrach
Member Author

hrach commented Oct 29, 2024

@stepapo thank you for an awesome report.

The problem is that we do not know upfront whether some other conditions in the expression tree will "always require" a HAVING clause.

A similar "problem" arose with an OR located outside the specific expression, but that is known during processing, so we could pass it down to let the expression know.

The only sane solution seems to be to delay processing and do two passes: first determine the requirements, then let the expression produce the correct variant.
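
To make the two-pass idea concrete, here is a minimal, self-contained sketch. It is not the actual Nextras ORM implementation: the classes Column, AnyMatch, Aggregate and Junction and the functions requiresHaving, compile and buildClause are hypothetical names invented for illustration, and the rule "use HAVING for the whole tree if any node needs it" is a simplifying assumption. The first pass only inspects the tree; the second pass emits SQL and lets an "any" condition pick the cheap WHERE form when nothing else forces a HAVING clause.

```php
<?php
// Hypothetical node types for illustration only; these are not Nextras ORM classes.
final class Column { // plain comparison on the main table, e.g. "books"."id" = 3
    public function __construct(public string $sql) {}
}
final class AnyMatch { // "any" condition over a joined relationship
    public function __construct(public string $column, public int $value) {}
}
final class Aggregate { // condition that can only live in HAVING, e.g. COUNT(...) >= 2
    public function __construct(public string $sql) {}
}
final class Junction { // AND/OR node holding child expressions
    public function __construct(public string $operator, public array $children) {}
}

// Pass 1: walk the whole tree and report whether any node forces a HAVING clause.
function requiresHaving(object $node): bool
{
    if ($node instanceof Aggregate) {
        return true;
    }
    if ($node instanceof Junction) {
        foreach ($node->children as $child) {
            if (requiresHaving($child)) {
                return true;
            }
        }
    }
    return false;
}

// Pass 2: emit SQL, choosing the plain or the aggregated variant for "any" nodes.
function compile(object $node, bool $useHaving): string
{
    if ($node instanceof Junction) {
        $parts = array_map(fn (object $child): string => compile($child, $useHaving), $node->children);
        return '(' . implode(" {$node->operator} ", $parts) . ')';
    }
    if ($node instanceof AnyMatch) {
        return $useHaving
            ? "COUNT({$node->column}) > 0" // assumes the value filter moved into the JOIN's ON clause
            : "{$node->column} = {$node->value}";
    }
    return $node->sql; // Column and Aggregate carry ready-made SQL
}

function buildClause(object $tree): string
{
    $useHaving = requiresHaving($tree);
    return ($useHaving ? 'HAVING ' : 'WHERE ') . compile($tree, $useHaving);
}

// Only a plain column and an "any" condition: the cheap WHERE form is enough.
$simple = new Junction('OR', [new Column('"books"."id" = 3'), new AnyMatch('"tags"."id"', 1)]);
echo buildClause($simple), "\n";
// WHERE ("books"."id" = 3 OR "tags"."id" = 1)

// A sibling aggregate forces HAVING, so the "any" condition switches to its COUNT form.
$mixed = new Junction('OR', [new AnyMatch('"tags"."id"', 1), new Aggregate('COUNT("reviews"."id") >= 2')]);
echo buildClause($mixed), "\n";
// HAVING (COUNT("tags"."id") > 0 OR COUNT("reviews"."id") >= 2)
```

The point of the sketch is only the ordering: the decision about WHERE versus HAVING is made once the full tree is known, rather than while each expression is being processed.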

@hrach
Member Author

hrach commented Oct 29, 2024

Btw, I realized this need earlier but I wanted to avoid it. But it seems here we are. Let me take a quick look at how difficult this would be.

@hrach hrach linked a pull request Oct 29, 2024 that will close this issue
@stepapo

stepapo commented Oct 30, 2024

This solution definitely fixed performance, thank you.

hrach added a commit that referenced this issue Oct 31, 2024
To provide smarter SQL rewrites, we need to know if the expression
itself is in AND/OR junction and if other parts of the junction
require a HAVING clause. This is possible only after getting the full
expression tree. Then we collect the actual expressions.

[closes #690]
[closes #686]
@hrach hrach closed this as completed in ad6dfe0 Oct 31, 2024
@hrach hrach added this to the v5.0 milestone Oct 31, 2024