-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Statement.add() is Awkward #259
Comments
With Cloud Spanner, we allowed trailing adds before the specification got clarified to forbid them, and never updated the implementation to be stricter. |
I agree that with the target audience being mainly tool vendors, correctness and predictability is more important than usability. I was the one requesting the specification here: #229 In particular, the comparison with JDBC doesn't hold, because JDBC unambiguously distinguishes between I agree the status quo is awkward, but it already was, before this specification request. Forbidding trailing adds was the path to least incompatible changes, as removing awkwardness from the current design would have required more significant re-design (which can still be debated, just wanted to give some background). |
There's indeed inconvenience when calling statement.addBinding(binder -> binder.bind(0, …))
.addBinding(binder -> binder.bind(0, …))
.execute(); which omits the need for calling You could have also something like: conn.createStatement("UPDATE person SET count = count + 1").add().add().execute() While the statement is not parametrized, how often does it run? With a lenient approach, it would run only once, with a strict approach it would run twice. I'm happy to sort out the issue before 1.0 as we seem to have entered a phase where we gather learnings and it makes sense to polish up the SPI. |
@mp911de Would this be offered as a convenience |
There's some tension between "make it convenient to use" and "keep it as is and don't break things". I think the cost of breaking the SPI would need to be outweighed by the benefit it provides. I clearly see the case for direct SPI users but as you already mentioned, the SPI is intended for libraries/frameworks as users. If we find that we can live with it as things are (although strange to use), then I'd vote to keep things as they are. If we find that we cannot live with it then we need to accept the fact of breaking a pretty core area of query execution code paths. |
I think a vote would depend on seeing actual drafts of the improvement, though I currently can't imagine they'll pull their weight. I think the core problem here is that R2DBC inherited JDBC's flaw of re-using the same type for 2 purposes:
In JDBC, I imagine this was done to avoid re-declaring the 500 |
What if we specify that after calling add(), a Statement ignores the current set of binds, unless those are added as well? Would this resolve the ambiguity? // Executes 1 INSERT (add is not called)
connection.createStatement("INSERT INTO t VALUES (0)")
.execute();
// Executes 1 INSERT (add is not called)
connection.createStatement("INSERT INTO t VALUES (?)")
.bind(0, 0)
.execute();
// Executes 1 INSERT (add is called 1 time)
connection.createStatement("INSERT INTO t VALUES (0)")
.add()
.execute();
// Executes 1 INSERT (add is called 1 time)
connection.createStatement("INSERT INTO t VALUES (?)")
.bind(0, 0).add()
.execute();
// Executes 2 INSERTs (add is called 2 times)
connection.createStatement("INSERT INTO t VALUES (0)")
.add()
.add()
.execute();
// Executes 2 INSERTs (add is called 2 times)
connection.createStatement("INSERT INTO t VALUES (?)")
.bind(0, 0).add()
.bind(0, 1).add()
.execute();
// Throws a error: "You forgot to add() your binds!"
connection.createStatement("INSERT INTO t VALUES (?)")
.bind(0, 0).add()
.bind(0, 1).add()
.bind(0, 2)
.execute(); |
@Michael-A-McMahon There's now a different kind of awkwardness (or rather, inconsistency) between these two: // Executes 1 INSERT (add is not called)
connection.createStatement("INSERT INTO t VALUES (?)")
.bind(0, 0)
.execute();
// Throws a error: "You forgot to add() your binds!"
connection.createStatement("INSERT INTO t VALUES (?)")
.bind(0, 0).add()
.bind(0, 1).add()
.bind(0, 2)
.execute(); How come a trailing The current specification is, in fact, the most logical one if single statements and batched statements are to share the same API. You have to call // 1 operand, no + (add) operator
1
// 2 operands, 1 + operator
1 + 2
// 3 operands, 2 + operators
1 + 2 + 3 I don't think anyone would suggest requiring a trailing // Both are legal
1
1 +
// Now, trailing + operators are required
1 + 2 +
1 + 2 + 3 + The awkwardness criticised in the original comment stems from the fact that this isn't about binary operators, but about a set of chained method calls. If you could But perhaps, the underlying idea could be the solution here? |
I now understand what has happened: There are at least two mental models of what add() does, and those models are incompatible. In one model, add() is a mathematical operator, so add() goes between the operands. In another model (the model that I use) add() pushes a set of binds into a queue, so add() always follows after binds have been set. https://en.wikipedia.org/wiki/Principle_of_least_astonishment Under the queue model, it is astonishing that a trailing add() results in an error. I'm not sure what to do about this :) What mental model do we think is more intuitive for most programmers? |
Yes, and I'm quite impartial to which one is chosen in the end, as long as there is not a single ambiguity left for the not-so-edge-case of executing a single query. Again, my point here is, if this were to be fixed, then the best option I see is to separate the single execution API from the batch API. |
Splitting parametrized batches into a different API would change the programming model significantly and break not only drivers but also the calling code. We would reduce confusion greatly by giving the
|
I think add() is already a good name. It adds the current set of binds to the batch, so the verb "add" seems correct. If there is another name that screams "Don't call this after adding your last of binds!", that might be better. Verbs like "concat" or "append" might express that, however I personally like the name as it is. To me, the most natural behavior of add() would require it to be trailing, as shown in my previous examples. If you have a single set of binds, then you don't call add(). If have (potentially) more than one set of binds, then you always call add(). This seems simple to understand, and is easier to work with as it eliminates the if(isFirst) condition for calling add(). But I think the issues with this idea are:
I hope I'm understanding the issues correctly; Please let me know if I'm missing something. |
The general SPI approach seems to require rethinking in the context of usability and how parametrized batches should be supported. Right now, we do not have a strong demand to change the SPI as a proper approach requires more thought and we do not want to add something else that we would remove in the future. Rescheduling the ticket for a future release. |
Yes, I agree that
class StatementWrapper {
private Statement statement;
@Getter
private int rowCounter = 0;
private StatementWrapper(@NonNull Connection connection, @NonNull String sql) {
this.statement = connection.createStatement(sql);
}
public static StatementWrapper of(@NonNull Connection connection, @NonNull String sql) {
return new StatementWrapper(connection, sql);
}
/**
* Unlike {@link Statement#bind(String, Object)}, this method supports both non-{@code null} as well as {@code null} values.
*/
public <T> StatementWrapper bind(String name, T value, Class<T> type) {
if (nonNull(value)) {
statement = statement.bind(name, value);
} else {
statement = statement.bindNull(name, type);
}
return this;
}
/**
* Unlike {@link Statement#add()}, this method is loop-friendly.
*/
public StatementWrapper add() {
if (rowCounter++ > 0) { // the first call is ignored
statement = statement.add();
}
return this;
}
public Publisher<? extends Result> execute() {
return statement.execute();
}
} The usage is then much more straightforward: public Mono<Integer> insertBooks(Connection connection, List<Book> books) {
final StatementWrapper statement = StatementWrapper(connection, "INSERT INTO book (author, title) VALUES ($1, $2)");
books.forEach(book ->
.add() // the first call is simply ignored
.bind("$1", book.getAuthor(), String.class)
.bind("$1", book.getTitle(), String.class)
);
return Flux.from(statement.execute())
.flatMap(Result::getRowsUpdated)
.reduce(0, Integer::sum);
} |
I'm finding it a bit awkward to use add() correctly. It requires coding in a branch to have the first set of binds are handled differently, like this:
I think the SPI would be easier to use and less error prone if it allowed the trailing add(). As a JDBC programmer, I also find the trailing add() to be more intuitive, because this is how JDBC's PreparedStatement.addBatch() works.
With a trailing add(), the code above could be written as:
Am I the only one who thinks this? I'd like to see what others think now, before the 1.0 release solidifies the behavior of add().
The text was updated successfully, but these errors were encountered: