You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
SELECT
id,
(SELECT COUNT(*) FROM tbl2 WHERE tbl2.parent_id = tbl1.id)
FROM tbl1; -- + WHERE, ...
The query plain currently looks somewhat like this:
Foreign Scan on tbl1
Output: tbl1.id, (SubPlan 1)
Remote query: SELECT `id` FROM `s`.`tbl1` -- + WHERE, ORDER BY, ...
SubPlan 1
-> Foreign Scan
Output: (count(tbl2.parent_id))
Relations: Aggregate on (s.tbl2)
Remote query: SELECT count(`id`) FROM `s`.`tbl2` WHERE ((`parent_id` = ?))
I.e. for each row of tbl1 a separate query is send to the remote database.
Similar case (worse performance):
SELECT
id,
EXISTS (SELECT FROM tbl2 WHERE tbl2.parent_id = tbl1.id)
FROM tbl1;
Query Plan:
Foreign Scan on tbl1
Output: tbl1.id, (hashed SubPlan 2)
Remote query: SELECT `id` FROM `s`.`tbl1` -- + WHERE, ORDER BY, ...
SubPlan 2
-> Foreign Scan on s.tbl2
Output: tbl2.parent_id
Remote query: SELECT `parent_id` FROM `s`.`tbl2`
This unfortunately tries to download the complete tbl2!
Another version (might be more difficult to implement?):
SELECT
id,
r1.*
FROM tbl1,
LATERAL (SELECT COUNT(*) FROM tbl2 WHERE tbl2.parent_id = tbl1.id) r1;
Query Plan:
Nested Loop
Output: tbl1.id, (count(tbl2.parent_id))
-> Foreign Scan on s.tbl1
Output: tbl1.id, ... more fields...
Remote query: SELECT `id` FROM `s`.`tbl1` -- + WHERE, ORDER BY, ...
-> Foreign Scan
Output: (count(*))
Relations: Aggregate on ("s".tbl2)
Remote query: SELECT count(*) FROM `s`.`tbl2` WHERE ((`parent_id` = ?))
This is basically the same as 1.
Partial workaround:
SELECT
tbl1.id,
COUNT(tbl2.id)
FROM tbl1
LEFT JOIN tbl2 ON tbl2.parent_id = tbl1.id
GROUP BY 1
Query Plan: Single Foreign Scan, everything is pushed down
However, this no longer works when additional fields, e.g. tbl1.*, should be retrieved.
One might try this:
WITH t0 AS ( -- NOT MATERIALIZED does not help
SELECT
tbl1.id,
COUNT(tbl2.id)
FROM table
LEFT JOIN tbl2 ON tbl2.parent_id = tbl1.id
GROUP BY 1
ORDER BY 1
)
SELECT
tbl1.*,
count
FROM t0
JOIN tbl1 ON tbl1.id = t0.id
But the resulting query plan (Merge Join between Foreign Scan of tbl1 and Materialize of Foreign Scan from t0) is not feasible for bigger tables, although SELECT * FROM t0 and SELECT * FROM tbl1 are quite fast by themselves...
Expected Result
The aggregation is evaluated on the remote mysql server without requiring a separate query for each row of tbl1.
Also, EXISTS should probably keep the WHERE, instead of downloading the whole table.
The text was updated successfully, but these errors were encountered:
Motivation
Examples:
The query plain currently looks somewhat like this:
I.e. for each row of
tbl1
a separate query is send to the remote database.Query Plan:
This unfortunately tries to download the complete
tbl2
!Query Plan:
This is basically the same as 1.
Partial workaround:
Query Plan: Single Foreign Scan, everything is pushed down
However, this no longer works when additional fields, e.g.
tbl1.*
, should be retrieved.One might try this:
But the resulting query plan (Merge Join between Foreign Scan of tbl1 and Materialize of Foreign Scan from t0) is not feasible for bigger tables, although
SELECT * FROM t0
andSELECT * FROM tbl1
are quite fast by themselves...Expected Result
The aggregation is evaluated on the remote mysql server without requiring a separate query for each row of tbl1.
Also, EXISTS should probably keep the WHERE, instead of downloading the whole table.
The text was updated successfully, but these errors were encountered: