Refactor tiny_tds to avoid sharing DBPROCESS #571

Draft: wants to merge 4 commits into base: master
172 changes: 51 additions & 121 deletions README.md
@@ -116,8 +116,8 @@ opts[:message_handler] = Proc.new { |m| puts m.message }
client = TinyTds::Client.new opts
# => Changed database context to 'master'.
# => Changed language setting to us_english.
client.execute("print 'hello world!'").do
# => hello world!
client.do("print 'hello world!'")
# => -1 (no affected rows)
```

Use the `#active?` method to determine if a connection is good. The implementation of this method may change but it should always guarantee that a connection is good. Currently it checks for either a closed or dead connection.
@@ -147,169 +147,99 @@ Send a SQL string to the database and return a TinyTds::Result object.
result = client.execute("SELECT * FROM [datatypes]")
```

## Sending queries and receiving results

## TinyTds::Result Usage
The client implements three different methods to send queries to a SQL server.

A result object is returned by the client's execute command. It is important that you either return the data from the query, most likely with the #each method, or that you cancel the results before asking the client to execute another SQL batch. Failing to do so will yield an error.

Calling #each on the result will lazily load each row from the database.
`client.insert` will execute the query and return the last identifier.

```ruby
result.each do |row|
# By default each row is a hash.
# The keys are the fields, as you'd expect.
# The values are pre-built Ruby primitives mapped from their corresponding types.
end
client.insert("INSERT INTO [datatypes] ([varchar_50]) VALUES ('text')")
# => 363
```

A result object has a `#fields` accessor. It can be called before the result rows are iterated over. Even if no rows are returned, #fields will still return the column names you expected. Any SQL that does not return columned data will always return an empty array for `#fields`. It is important to remember that if you access the `#fields` before iterating over the results, the columns will always follow the default query option's `:symbolize_keys` setting at the client's level and will ignore the query options passed to each.
`client.do` will execute the query and tell you how many rows were affected.

```ruby
result = client.execute("USE [tinytdstest]")
result.fields # => []
result.do

result = client.execute("SELECT [id] FROM [datatypes]")
result.fields # => ["id"]
result.cancel
result = client.execute("SELECT [id] FROM [datatypes]")
result.each(:symbolize_keys => true)
result.fields # => [:id]
client.do("DELETE FROM [datatypes] WHERE [varchar_50] = 'text'")
# => 1
```

You can cancel a result object's data from being loaded by the server.
Neither `do` nor `insert` serializes any results sent by the SQL server, which makes them fast and memory-efficient for large operations.

`client.execute` will execute the query and return a `TinyTds::Result` object.

```ruby
result = client.execute("SELECT * FROM [super_big_table]")
result.cancel
client.execute("SELECT [id] FROM [datatypes]")
# =>
# #<TinyTds::Result:0x000057d6275ce3b0
# @fields=["id"],
# @return_code=nil,
# @rows=
# [{"id"=>11},
# {"id"=>12},
# {"id"=>21},
# {"id"=>31},
```
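
As a rough illustration of how the three methods divide the work, the sketch below routes a statement to `insert`, `do`, or `execute` based on its leading keyword. The `run_statement` helper and the `RecordingClient` stand-in are hypothetical and not part of tiny_tds; the stand-in only records which method was called so the snippet runs without a SQL Server connection.

```ruby
# Hypothetical helper (not part of tiny_tds): pick the client method that
# matches the kind of statement being sent.
def run_statement(client, sql)
  case sql
  when /\A\s*INSERT\b/i
    client.insert(sql)   # documented to return the last identifier
  when /\A\s*(?:UPDATE|DELETE)\b/i
    client.do(sql)       # documented to return the number of affected rows
  else
    client.execute(sql)  # documented to return a TinyTds::Result
  end
end

# Stand-in client for illustration only. It records the method name instead
# of talking to a server; a real TinyTds::Client would be passed in practice.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def insert(sql)
    @calls << [:insert, sql]
  end

  def do(sql)
    @calls << [:do, sql]
  end

  def execute(sql)
    @calls << [:execute, sql]
  end
end

client = RecordingClient.new
run_statement(client, "INSERT INTO [datatypes] ([varchar_50]) VALUES ('text')")
run_statement(client, "DELETE FROM [datatypes] WHERE [varchar_50] = 'text'")
run_statement(client, "SELECT [id] FROM [datatypes]")
called_methods = client.calls.map(&:first)
```

Matching on the first keyword is deliberately naive: batches that mix statement types, or statements that start with a comment or a CTE, would need a more careful rule.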

You can use results cancelation in conjunction with results lazy loading, no problem.
A result object has a `fields` accessor. Even if no rows are returned, `fields` will still return the column names you expected. Any SQL that does not return columned data will always return an empty array for `fields`.

```ruby
result = client.execute("SELECT * FROM [super_big_table]")
result.each_with_index do |row, i|
break if i > 10
end
result.cancel
result = client.execute("USE [tinytdstest]")
result.fields # => []

result = client.execute("SELECT [id] FROM [datatypes]")
result.fields # => ["id"]
```

If the SQL executed by the client returns affected rows, you can easily find out how many.
You can retrieve the results by accessing the `rows` property on the result.

```ruby
result.each
result.affected_rows # => 24
result.rows
# =>
# [{"id"=>11},
# {"id"=>12},
# {"id"=>21},
# ...
```

This pattern is so common for UPDATE and DELETE statements that the #do method cancels any need for loading the result data and returns the `#affected_rows`.
The result object also has `affected_rows`, which usually corresponds to the number of items in `rows`. If you run a `DELETE` statement with `execute`, however, `rows` is likely empty while `affected_rows` still reports how many rows were deleted.

```ruby
result = client.execute("DELETE FROM [datatypes]")
result.do # => 72
# => #<TinyTds::Result:0x00005efc024d9f10 @affected_rows=75, @fields=[], @return_code=nil, @rows=[]>
result.count
# => 0
result.affected_rows
# => 75
```

Likewise for `INSERT` statements, the #insert method cancels any need for loading the result data and executes a `SCOPE_IDENTITY()` for the primary key.

```ruby
result = client.execute("INSERT INTO [datatypes] ([xml]) VALUES ('<html><br/></html>')")
result.insert # => 420
```
As mentioned earlier, prefer `do` when you are only interested in `affected_rows`.

The result object can handle multiple result sets from batched SQL or stored procedures. It is critical to remember that calling each with a block for the first time will yield each "row" of each result set. Calling each a second time with a block will yield each "set".
The result object can handle multiple result sets from batched SQL or stored procedures.

```ruby
sql = ["SELECT TOP (1) [id] FROM [datatypes]",
"SELECT TOP (2) [bigint] FROM [datatypes] WHERE [bigint] IS NOT NULL"].join(' ')

set1, set2 = client.execute(sql).each
set1, set2 = client.execute(sql).rows
set1 # => [{"id"=>11}]
set2 # => [{"bigint"=>-9223372036854775807}, {"bigint"=>9223372036854775806}]

result = client.execute(sql)

result.each do |rowset|
# First time data loading, yields each row from each set.
# 1st: {"id"=>11}
# 2nd: {"bigint"=>-9223372036854775807}
# 3rd: {"bigint"=>9223372036854775806}
end

result.each do |rowset|
# Second time over (if columns cached), yields each set.
# 1st: [{"id"=>11}]
# 2nd: [{"bigint"=>-9223372036854775807}, {"bigint"=>9223372036854775806}]
end
```
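
To make the shape of `rows` for a multi-set batch concrete, here is a small sketch that works on plain Ruby data copied from the example above, so it runs without a database. The `sets` literal is a stand-in for `client.execute(sql).rows`, and flattening by one level is just one possible way to get a single list of rows, comparable to what the first pass of the old `each` yielded.

```ruby
# Stand-in for `client.execute(sql).rows` when the batch returns two sets.
sets = [
  [{"id" => 11}],
  [{"bigint" => -9223372036854775807}, {"bigint" => 9223372036854775806}]
]

# Destructure the outer array to address each result set separately.
set1, set2 = sets

# Walk the sets and report how many rows and which columns each one has.
sets.each_with_index do |set, index|
  columns = set.flat_map(&:keys).uniq
  puts "set #{index + 1}: #{set.length} row(s), columns: #{columns.join(', ')}"
end

# Flatten one level to treat all rows from all sets as a single list.
all_rows = sets.flatten(1)
```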

Use the `#sqlsent?` and `#canceled?` query methods on the client to determine whether an active SQL batch still needs to be processed and whether data results were canceled from the last result object. These values reset to true and false respectively for the client at the start of each `#execute` and new result object. If all rows are processed normally, `#sqlsent?` will return false. To demonstrate, let's assume we have 100 rows in the result object.

```ruby
client.sqlsent? # = false
client.canceled? # = false

result = client.execute("SELECT * FROM [super_big_table]")

client.sqlsent? # = true
client.canceled? # = false

result.each do |row|
# Assume we break after 20 rows with 80 still pending.
break if row["id"] > 20
end

client.sqlsent? # = true
client.canceled? # = false

result.cancel

client.sqlsent? # = false
client.canceled? # = true
```

It is possible to get the return code after executing a stored procedure from either the result or client object.

```ruby
client.return_code # => nil

result = client.execute("EXEC tinytds_TestReturnCodes")
result.do
result.return_code # => 420
client.return_code # => 420
```


## Query Options

Every `TinyTds::Result` object can pass query options to the #each method. The defaults are defined and configurable by setting options in the `TinyTds::Client.default_query_options` hash. The default values are:

* :as => :hash - Object for each row yielded. Can be set to :array.
* :symbolize_keys => false - Row hash keys. Defaults to shared/frozen string keys.
* :cache_rows => true - Successive calls to #each returns the cached rows.
* :timezone => :local - Local to the Ruby client or :utc for UTC.
* :empty_sets => true - Include empty result sets in queries that return multiple result sets.
You can pass query options to `execute`. The defaults are defined and configurable by setting options in the `TinyTds::Client.default_query_options` hash. The default values are:

Each result gets a copy of the default options you specify at the client level and can be overridden by passing an options hash to the #each method. For example
* `as: :hash` - Object for each row yielded. Can be set to `:array`.
* `empty_sets: true` - Include empty result sets in queries that return multiple result sets.
* `timezone: :local` - Local to the Ruby client, or `:utc` for UTC.

```ruby
result.each(:as => :array, :cache_rows => false) do |row|
# Each row is now an array of values ordered by #fields.
# Rows are yielded and forgotten about, freeing memory.
end
result = client.execute("SELECT [datetime2_2] FROM [datatypes] WHERE [id] = 74", as: :array, timezone: :utc, empty_sets: true)
# => #<TinyTds::Result:0x000061e841910600 @affected_rows=1, @fields=["datetime2_2"], @return_code=nil, @rows=[[9999-12-31 23:59:59.12 UTC]]>
```
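
One way to picture how per-call options relate to the defaults is a plain hash merge, sketched below. This is an assumption about the behaviour, not a copy of the tiny_tds implementation: `DEFAULT_QUERY_OPTIONS` and `effective_query_options` are hypothetical names standing in for `TinyTds::Client.default_query_options` and whatever `execute` does internally.

```ruby
# Hypothetical stand-in for TinyTds::Client.default_query_options,
# using the default values listed above.
DEFAULT_QUERY_OPTIONS = { as: :hash, empty_sets: true, timezone: :local }.freeze

# Hypothetical helper: options passed per call win over the defaults,
# and the defaults themselves are left untouched.
def effective_query_options(overrides = {})
  DEFAULT_QUERY_OPTIONS.merge(overrides)
end

per_call = effective_query_options(as: :array, timezone: :utc)
untouched = effective_query_options
```

Under this assumption, `per_call` keeps `empty_sets: true` from the defaults while `as` and `timezone` take the values passed for that one query.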

Besides the standard query options, the result object can take one additional option. Using `:first => true` will only load the first row of data and cancel all remaining results.

```ruby
result = client.execute("SELECT * FROM [super_big_table]")
result.each(:first => true) # => [{'id' => 24}]
```


## Row Caching

By default row caching is turned on because the SQL Server adapter for ActiveRecord would not work without it. I hope to find some time to create some performance patches for ActiveRecord that would allow it to take advantage of lazily created yielded rows from result objects. Currently only TinyTDS and the Mysql2 gem allow such a performance gain.


## Encoding Error Handling

TinyTDS takes an opinionated stance on how we handle encoding errors. First, we treat errors differently on reads vs. writes. Our opinion is that if you are reading bad data due to your client's encoding option, you would rather just find `?` marks in your strings vs being blocked with exceptions. This is how things would work via ODBC or SMS. On the other hand, writes will raise an exception. In this case we raise the SYBEICONVO/2402 error message which has a description of `Error converting characters into server's character set. Some character(s) could not be converted.`. Even though the severity of this message is only a `4` and TinyTDS will automatically strip/ignore unknown characters, we feel you should know that you are inserting bad encodings. In this way, a transaction can be rolled back, etc. Remember, any database write that has bad characters due to the client encoding will still be written to the database, but it is up to you to roll back said write if needed. Most ORMs like ActiveRecord handle this scenario just fine.