Runner #41
base: master
Conversation
@nbigaouette Any thoughts about this? This has been open for a month now...
Sorry about the delay, personal time is quite short these days and I was hoping the holidays would make things easier. I'm taking a look right now.
Love the zero-copy aspects. Right now, if I understand correctly, the zero-copy path is an addition to the API. I would like to have a single API instead, and thus get rid of the original one. I'll play around a bit with the new one.
The zero-copy aspect is issue #11.
Yea, this was kind of the main point. In my use case it's absolutely prohibitive to spend any time (even if it's microseconds) on cloning arrays, re-validating what's already been validated, or doing any FFI calls aside from the actual execute(). I've added it as a separate API so as not to touch the original one, but I think it does fully cover the original one, at least in the single-threaded case. Note that this is not thread-safe (I think), as it holds pointers to some mutable arrays that you are supposed to fill out. You would probably have to create one runner per thread if you wanted multi-threading (#38)? Would sure be up for any discussion, anyway :) cc @marshallpierce as well, since they've shown interest in the original issue.
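The one-runner-per-thread pattern mentioned above can be sketched with a hypothetical stand-in (the `Runner` type and its `execute()` body here are placeholders, not this crate's actual API):

```rust
use std::thread;

// Hypothetical stand-in for a runner that holds mutable input buffers and is
// therefore not shareable across threads; a real one would wrap session state.
struct Runner {
    input: Vec<f32>,
}

impl Runner {
    fn new(input: Vec<f32>) -> Self {
        Runner { input }
    }

    // Placeholder "inference": just sums the inputs.
    fn execute(&mut self) -> f32 {
        self.input.iter().sum()
    }
}

fn main() {
    // One runner per thread, since a single runner cannot be shared.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            thread::spawn(move || {
                let mut runner = Runner::new(vec![i as f32; 3]);
                runner.execute()
            })
        })
        .collect();

    let total: f32 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    println!("{}", total); // sums 0 + 3 + 6 + 9
}
```

Each thread owns its runner outright, so no synchronization is needed around the mutable input buffers.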
I was about to suggest merging the runner into the session and getting rid of the original API, but I think it makes sense to keep both, actually.
Public items will have to be documented when the API is settled.
    self,
) -> Result<Runner<'s, TIn, DIn, TOut, DOut>> {
    Runner::new(self.session, self.input_arrays)
}
This is meant to be used like this:
let mut runner = session
.make_runner(input_tensor_values)
.with_output::<f32, Ix4>()?;
Since the model (and thus the session) knows the shape of the output, can this be leveraged? Or would that require const generics (probably...)?
Ok, looking at the following with_output_dyn(): it answers the comment above, please ignore.
        .collect()
}

fn compute_output_shapes<TIn, DIn: Dimension>(
I'd prefer the trait bounds to use where clauses, to be consistent with the rest of the crates.
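The requested style can be illustrated on the compute_output_shapes signature; everything below (the Dimension trait, the body) is a simplified stand-in for the crate's real definitions, just to show the where-clause form:

```rust
// Simplified stand-ins for the crate's generics, only to show the style.
trait Dimension {
    fn ndim() -> usize;
}

struct Ix2;
impl Dimension for Ix2 {
    fn ndim() -> usize {
        2
    }
}

// Instead of an inline bound (`fn compute_output_shapes<TIn, DIn: Dimension>(...)`),
// the bound is moved into a where clause:
fn compute_output_shapes<TIn, DIn>() -> usize
where
    DIn: Dimension,
{
    DIn::ndim()
}

fn main() {
    println!("{}", compute_output_shapes::<f32, Ix2>());
}
```

With several type parameters and bounds, the where-clause form keeps the parameter list readable and matches the rest of the codebase.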
Forget my comment about the input reference; I see that it's made available for modification.
I don't want to push to the branch, but I did modify the 'sample' example. Here's the patch if you don't mind applying it:

From ade58d96d16703edb15ad806b8a37d64475e5d54 Mon Sep 17 00:00:00 2001
From: Nicolas Bigaouette <[email protected]>
Date: Sat, 26 Dec 2020 14:44:50 -0500
Subject: [PATCH] Adapt 'sample' to runner API
---
onnxruntime/examples/sample.rs | 35 +++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
diff --git a/onnxruntime/examples/sample.rs b/onnxruntime/examples/sample.rs
index fff0772..3d5ee51 100644
--- a/onnxruntime/examples/sample.rs
+++ b/onnxruntime/examples/sample.rs
@@ -1,9 +1,11 @@
#![forbid(unsafe_code)]
-use ndarray::Array;
+use ndarray::{Array, Ix4, IxDyn};
use onnxruntime::{
- environment::Environment, tensor::OrtOwnedTensor, GraphOptimizationLevel, LoggingLevel,
+ environment::Environment,
+ runner::{Outputs, Runner},
+ GraphOptimizationLevel, LoggingLevel,
};
use tracing::Level;
use tracing_subscriber::FmtSubscriber;
@@ -62,12 +64,31 @@ fn run() -> Result<(), Error> {
.unwrap();
let input_tensor_values = vec![array];
- let outputs: Vec<OrtOwnedTensor<f32, _>> = session.run(input_tensor_values)?;
+ // You can simply run the session with the input to get the output...
+ // let outputs: Vec<OrtOwnedTensor<f32, _>> = session.run(input_tensor_values)?;
- assert_eq!(outputs[0].shape(), output0_shape.as_slice());
- for i in 0..5 {
- println!("Score for class [{}] = {}", i, outputs[0][[0, i, 0, 0]]);
- }
+ // Or, you can build a runner to pre-allocate the output
+ let mut runner = session
+ .make_runner(input_tensor_values)
+ .with_output::<f32, Ix4>()?;
+ runner.execute()?;
+
+ print_runner_outputs(&runner);
+
+ // Since the runner now owns the input and keep it alive, we can access it
+ // and modify it without reallocation.
+ *(&mut runner.inputs()[0]) *= 2.0f32;
+ runner.execute()?;
+
+ print_runner_outputs(&runner);
Ok(())
}
+
+fn print_runner_outputs(runner: &Runner<f32, IxDyn, f32, Ix4>) {
+ let outputs: Outputs<f32, Ix4> = runner.outputs();
+ let output = &outputs[0];
+ for i in 0..5 {
+ println!("Score for class [{}] = {}", i, output[[0, i, 0, 0]]);
+ }
+}
--
2.29.2
Yea, inputs are available for modification, of course. Here's the use case that's not covered: you have an ... It would probably be possible to cover this use case as well, if the runner had something like ...
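The pattern under discussion, refilling a runner-owned input buffer in place from a borrowed slice, can be sketched generically (the `Runner` struct and `input_mut()` are hypothetical names, not this crate's API):

```rust
// Hypothetical runner that owns a preallocated input buffer.
struct Runner {
    input: Vec<f32>,
}

impl Runner {
    // Expose the buffer mutably so callers can refill it without reallocating.
    fn input_mut(&mut self) -> &mut [f32] {
        &mut self.input
    }
}

fn main() {
    let mut runner = Runner { input: vec![0.0; 4] };

    // Data arriving as a borrowed slice (e.g. decoded from a network request)
    // is copied straight into the runner-owned buffer; no new allocation.
    let incoming: &[f32] = &[1.0, 2.0, 3.0, 4.0];
    runner.input_mut().copy_from_slice(incoming);

    println!("{:?}", runner.input);
}
```

The caller never hands ownership of its slice to the runner; it only writes through the mutable view, which keeps per-call cost at a single memcpy.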
FWIW, this style of API seems like it would be a good fit for my use case: an HTTP service that receives ONNX tensor protobuf requests, runs them through onnxruntime via this crate, and emits the output tensor as the response. The model doesn't change in the lifetime of one process, so I would ideally need only one ...
I have updated the branch with the latest changes from master (sorry if I pushed to your fork? not sure how GitHub handled this...). The only things remaining are documentation and trivial clippy comments.
@nbigaouette Apologies, I was a bit busy and was planning to get back to this on Friday/the weekend. (As a minor note, it's generally a bad idea to merge master back into a feature branch that will itself be merged back into master. I'll see if I can rebase this branch on master and force-push.)
I can try handling the documentation. What else do you think should be done, overall?
I agree a rebase should be done instead of merging master. GitHub's UI is not always clear about the result of an action... What remains to be done:
After that I'd like to merge this and release a new version to crates.io so it can be experimented with more.
        .map(|(jdx, dim)| match dim {
            None => input_arrays[idx].shape()[jdx],
            Some(d) => *d as usize,
        })
This makes a bad assumption about the output shape corresponding to the input shape (see #56).
Is there an ORT API that can be used to get the output shape before execution? If not, can we allow users to specify the output shape?
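One way the second suggestion could look is a builder-style override; this is purely a hypothetical sketch (with_output_shape and the surrounding types are illustrative, not the crate's API):

```rust
// Hypothetical builder that lets the caller override the inferred output shape
// when the model's output does not mirror the input.
struct RunnerBuilder {
    output_shape: Option<Vec<usize>>,
}

impl RunnerBuilder {
    fn new() -> Self {
        RunnerBuilder { output_shape: None }
    }

    // Explicit override for models whose output shape cannot be derived
    // from the input (the problem described above).
    fn with_output_shape(mut self, shape: &[usize]) -> Self {
        self.output_shape = Some(shape.to_vec());
        self
    }

    // Fall back to the shape inferred from the input when no override is set.
    fn resolve_shape(&self, inferred_from_input: &[usize]) -> Vec<usize> {
        self.output_shape
            .clone()
            .unwrap_or_else(|| inferred_from_input.to_vec())
    }
}

fn main() {
    // Illustrative case: image-shaped input, but a flat score vector out.
    let builder = RunnerBuilder::new().with_output_shape(&[1, 1000]);
    println!("{:?}", builder.resolve_shape(&[1, 3, 224, 224]));
}
```

Keeping the override optional preserves the current behavior for models where the input-derived shape happens to be correct.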
No more updates on this branch? I think a lot of people could use a faster version; why was it never merged?
@nbigaouette This is somewhat WIP in that if this is deemed acceptable, then some docs should be added, things may be cleaned up a bit, etc., but it's fully functional. Basically I wanted to push this asap to see what you think.

Default is currently required for the output type. As long as we're sure we'll never use strings, we might add Copy to the element type requirements, and then Default can be dropped (so that the output array can be zero-initialized).

Can be used like this: