-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't send dynamic map keys to the database #4327
Don't send dynamic map keys to the database #4327
Conversation
I requested a review because I wasn't sure if this had slipped by you or if you didn't have time for it right now. If the latter please ignore me. |
Thank you for requesting it. It definitely slipped through. I think we just need to check this is also sane for unions/intersections, if you haven't yet! |
@fuelen you could try this out to see how it works for you :) |
Just checked, it works fine. Thank you! |
@josevalim Actually sorry there is something about combinations I didn't consider. I was hoping to get your opinion. If there is a combination and the user interpolates a string map key then the string literal will be sent to the adapter. Then it will be up to the adapter to escape string properly to avoid something like this union_query = from(c in Post, select: %{"id" => c.id})
outer_map_key = "id'; drop table posts; select 'id"
from(c in Post, select: %{^outer_map_key => c.id}, union: ^union_query) Adapters are suppose to be escaping string literals properly. Otherwise even literals defined by the dev and not the user won't work properly. But it seems extreme to open them to SQL injection from outside users if they fail to do that. This means if we want to close this hole, we have to continue to have interpolated map keys get changed to query parameters. And for atoms we need to have an escape hatch. There are two mechanisms we have today that kind of fit but not exactly:
Based on the above, I can think of a few options:
I am personally leaning towards option 1 but this feels hairy so not super confident. WDYT? |
The adapters have to escape string literals, no? Otherwise someone could write this: So maybe we are not escaping strings in this particular scenario? |
Oh yeah sorry what I meant is that adapters still need to escape string literals properly. And the current ones do. I was asking more from the angle of new adapters that might have mistakes in their implementation at the beginning. |
Yeah, they have to. Otherwise all kinds of weird stuff break! |
Ok phew =). Thanks for confirming! |
@greg-rychlewski are we good to release a new patch version or would you like to leave this particular feature out just in case? |
I'm good to release. I don't see a release to hold this feature back. |
@greg-rychlewski I am reconsidering this one. Here we are using |
Alternatively, if we don't send it to the database, then we are still not passing it as parameter either. Do we do it anywhere else besides literal? or unsafe_fragment? |
I believe the only two are literal and unsafe_fragment. But they are not really matching this purpose too well. For most queries map keys won't be sent to the database but if combinations exist then they will. So there is definitely the gap you are saying in that scenario. I'm pretty pessimistic that we can make combinations work without sending literals to the database. For example: select "string1" union select "string2" Something like this would be returned as ["select1", "select1"] from Ecto if the actual values are not sent to the database. Since we don't know where the main query ends/combination query starts in the results from the driver we can't really extract the literals and place them back in the right place after. Given the above, I can only think of two options. Either scrap this entirely or introduce something like unsafe_literal in selects. Maybe the world unsafe is not needed here. But reusing literal for a different behaviour might be confusing. |
It might also be possible in the planner to modify this part so that defp collect_kv([{key, value} | elems], fields, from, query, take, keep_literals?, acc) do
{key, fields, from} = collect_fields(key, fields, from, query, take, keep_literals?, %{})
{value, fields, from} = collect_fields(value, fields, from, query, take, keep_literals?, %{})
collect_kv(elems, fields, from, query, take, keep_literals?, [{key, value} | acc])
end |
No the above will break existing code for combinations. For example: from c in Comment,
select: %{"outer_title" => c.title},
union: ^from(c in Comment, select: %{"inner_title" => c.title}) If the map key is sent to the database then it will return correctly [%{"outer_title" => "title"}, %{"inner_title" => "title"}] But if it is not sent to the database it will return this [%{"outer_title" => "title"}, %{"outer_title" => "title"}] |
I remember now what I was thinking. Because of literal/unsafe_fragment I was treating the pin operator as signal the user knows they want us to handle dynamic input rather than a signal to use a query parameter. Then depending on where the operator is used we handle it safely. And because map keys can only ever enter the query as literals that are used to populate their values during postprocessing, it is safe how we are handling it. I'm not sure how much of a stretch the above is. It makes some kind of sense to me but maybe i'm too tired. |
Actually, I just realized that |
Ah that is a really good example! Completely forgot about it. |
selected_as 1 and 2, as well |
I was reading over the issue from this morning about setting dynamic map keys (#4326). I think I have a decent way to solve it.
The idea is that any interpolated map key doesn't need to be sent to the database because it won't affect the query:
This means instead of making map keys query parameters we can simply validate them on the spot and save them into the query struct.
There is a limitation where we can really only interpolate map keys for basic literals like atoms, strings and numbers. Composite literals like tuples and maps would require recursive escaping, which we lose the ability to do because we are at runtime.
I'm also restricting this proposal to just map keys even though I think there might be a case to extend it to anywhere in select. But it is more complicated (i.e. the distinction between outer and inner query becomes important) and more tests break, so I thought I would save it for later.