Conversation
| struct SessionPrivateData<'a> { | ||
| session: &'a (dyn Session + Send + Sync), | ||
| runtime: Option<Handle>, | ||
| } |
There was a problem hiding this comment.
This structure is different than all of the rest in this crate. We are carrying around borrowed data to the session. The reason for this is that TableProvider::scan passes &dyn Session.
I need to think more about this. It is possible we need this to be 'static instead, but that also would not work with how we intend to use it in the FFI Table Providers that need to implement the TableProvider trait.
I am not 100% confident that we can guarantee the lifetime safety of this structure. Maybe we need a piece of dummy data just so we can carry a lifetime on FFI_Session?
There was a problem hiding this comment.
If you were able to start with an Arc<dyn Session> you could store a weak reference to the session (which would let you error if it was handled improperly). If you were able to remove the pub you could ensure the lifetimes are handled correctly because it would all be internal (do you need this to be exposed or is it just used as parts of other FFI implementations?). You could also rename it to FFI_SessionRef and/or add some strongly worded documentation and hope for the best 🙂
There was a problem hiding this comment.
Yes, I think the two options are:
- update
TableProviderto pass anArc<dyn Session>. This is a breaking change that would require all implementers to support. Maybe that is ok? - Your approach of removing
puband restricting this to the crate where we know the two use cases have already guaranteed the lifetime of the Session during their methods.
I'm tempted to go with the second. It becomes a small implementation problem because the code that uses this is not in this PR so it generates clippy errors for unused functions once they are no longer pub. I think for this PR I will add some strongly worded documentation and then in the follow on PR that updates TableProvider we can revisit if we want to remove the pub
|
@milenkovicm @paleolimbot @renato2099 @comphead If you aren't tired of these yet, here is the last of the building blocks for the FFI renovation. The PRs after this are just pulling them all together and then removing the core repo dependency! |
|
will have a look later, thanks @timsaucer |
datafusion/ffi/src/session/mod.rs
Outdated
| // TODO: Expand scope of FFI to include runtime environment | ||
| // pub runtime_env: unsafe extern "C" fn(&Self) -> FFI_RuntimeEnv, | ||
|
|
||
| // pub execution_props: unsafe extern "C" fn(&Self) -> FFI_ExecutionProps, |
There was a problem hiding this comment.
Are these tracked in a follow-up issue that can be linked instead of this commented out code?
| /// Utility to identify when FFI objects are accessed locally through | ||
| /// the foreign interface. | ||
| pub library_marker_id: extern "C" fn() -> usize, |
There was a problem hiding this comment.
I don't think you should change how this is handled, but I usually use the address of the release callback for this. Perhaps in Rust the address of the release callback is less predictable than in C or C++.
| struct SessionPrivateData<'a> { | ||
| session: &'a (dyn Session + Send + Sync), | ||
| runtime: Option<Handle>, | ||
| } |
There was a problem hiding this comment.
If you were able to start with an Arc<dyn Session> you could store a weak reference to the session (which would let you error if it was handled improperly). If you were able to remove the pub you could ensure the lifetimes are handled correctly because it would all be internal (do you need this to be exposed or is it just used as parts of other FFI implementations?). You could also rename it to FFI_SessionRef and/or add some strongly worded documentation and hope for the best 🙂
| fn table_options_mut(&mut self) -> &mut TableOptions { | ||
| log::warn!("Mutating table options is not supported via FFI. Changes will not have an effect."); | ||
| &mut self.table_options | ||
| } |
There was a problem hiding this comment.
Can/Should Session be updated to return Option<&mut TableOptions>?
comphead
left a comment
There was a problem hiding this comment.
Thanks @timsaucer
for the SessionPrivateData does that mean every session contain its own configuration set and can be updated without affecting other sessions?
… about the dyn Session lifetimes
Yes! This was one of the things that I was trying to capture rather than doing something like keeping a singleton around. In my usage of datafusion-python I quite commonly create new session contexts. Often they're identical, but I can imagine other users besides myself who do have differences in their sessions. This code basically allows them to remain independent but also gives most of the session traits. |
|
Thank you everyone for the extremely helpful discussion! |
Which issue does this PR close?
Addresses part of #18671 but does not close it.
Rationale for this change
One of the reasons we are unable to remove
SessionContextfrom the FFI crate is that we rely on building a session in theTableProvider::scanfunction. This is because our current implementation usesFFI_SessionConfigand passes around a hashmap of String->String values of config options. Then in the other side of the FFI boundary we build a temporary session. This is not ideal because those table provider cannot utilize the actual current session.By implementing a FFI safe version of the
Sessiontrait we can overcome this limitation.What changes are included in this PR?
FFI_SessionThis PR does not use these structures in the current code. That is coming as part of a later PR in an effort to keep the size of the PRs small for effective code review.
Are these changes tested?
Unit tests are added. Coverage report:
Are there any user-facing changes?
No.