Skip to content

Implement FFI_Session#19223

Merged
timsaucer merged 2 commits intoapache:mainfrom
timsaucer:feat/4-ffi-implement-session
Dec 11, 2025
Merged

Implement FFI_Session#19223
timsaucer merged 2 commits intoapache:mainfrom
timsaucer:feat/4-ffi-implement-session

Conversation

@timsaucer
Copy link
Member

Which issue does this PR close?

Addresses part of #18671 but does not close it.

Rationale for this change

One of the reasons we are unable to remove SessionContext from the FFI crate is that we rely on building a session in the TableProvider::scan function. This is because our current implementation uses FFI_SessionConfig and passes around a hashmap of String->String values of config options. Then in the other side of the FFI boundary we build a temporary session. This is not ideal because those table provider cannot utilize the actual current session.

By implementing a FFI safe version of the Session trait we can overcome this limitation.

What changes are included in this PR?

  • Implement FFI_Session

This PR does not use these structures in the current code. That is coming as part of a later PR in an effort to keep the size of the PRs small for effective code review.

Are these changes tested?

Unit tests are added. Coverage report:

Screenshot 2025-12-09 at 8 35 56 AM

Are there any user-facing changes?

No.

@github-actions github-actions bot added the ffi Changes to the ffi crate label Dec 9, 2025
Comment on lines +128 to +131
struct SessionPrivateData<'a> {
session: &'a (dyn Session + Send + Sync),
runtime: Option<Handle>,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This structure is different than all of the rest in this crate. We are carrying around borrowed data to the session. The reason for this is that TableProvider::scan passes &dyn Session.

I need to think more about this. It is possible we need this to be 'static instead, but that also would not work with how we intend to use it in the FFI Table Providers that need to implement the TableProvider trait.

I am not 100% confident that we can guarantee the lifetime safety of this structure. Maybe we need a piece of dummy data just so we can carry a lifetime on FFI_Session?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you were able to start with an Arc<dyn Session> you could store a weak reference to the session (which would let you error if it was handled improperly). If you were able to remove the pub you could ensure the lifetimes are handled correctly because it would all be internal (do you need this to be exposed or is it just used as parts of other FFI implementations?). You could also rename it to FFI_SessionRef and/or add some strongly worded documentation and hope for the best 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think the two options are:

  • update TableProvider to pass an Arc<dyn Session>. This is a breaking change that would require all implementers to support. Maybe that is ok?
  • Your approach of removing pub and restricting this to the crate where we know the two use cases have already guaranteed the lifetime of the Session during their methods.

I'm tempted to go with the second. It becomes a small implementation problem because the code that uses this is not in this PR so it generates clippy errors for unused functions once they are no longer pub. I think for this PR I will add some strongly worded documentation and then in the follow on PR that updates TableProvider we can revisit if we want to remove the pub

@timsaucer timsaucer marked this pull request as ready for review December 9, 2025 13:44
@timsaucer
Copy link
Member Author

@milenkovicm @paleolimbot @renato2099 @comphead If you aren't tired of these yet, here is the last of the building blocks for the FFI renovation. The PRs after this are just pulling them all together and then removing the core repo dependency!

@milenkovicm
Copy link
Contributor

will have a look later, thanks @timsaucer

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

Comment on lines +94 to +97
// TODO: Expand scope of FFI to include runtime environment
// pub runtime_env: unsafe extern "C" fn(&Self) -> FFI_RuntimeEnv,

// pub execution_props: unsafe extern "C" fn(&Self) -> FFI_ExecutionProps,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these tracked in a follow-up issue that can be linked instead of this commented out code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +120 to +122
/// Utility to identify when FFI objects are accessed locally through
/// the foreign interface.
pub library_marker_id: extern "C" fn() -> usize,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should change how this is handled, but I usually use the address of the release callback for this. Perhaps in Rust the address of the release callback is less predictable than in C or C++.

Comment on lines +128 to +131
struct SessionPrivateData<'a> {
session: &'a (dyn Session + Send + Sync),
runtime: Option<Handle>,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you were able to start with an Arc<dyn Session> you could store a weak reference to the session (which would let you error if it was handled improperly). If you were able to remove the pub you could ensure the lifetimes are handled correctly because it would all be internal (do you need this to be exposed or is it just used as parts of other FFI implementations?). You could also rename it to FFI_SessionRef and/or add some strongly worded documentation and hope for the best 🙂

Comment on lines +546 to +549
fn table_options_mut(&mut self) -> &mut TableOptions {
log::warn!("Mutating table options is not supported via FFI. Changes will not have an effect.");
&mut self.table_options
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/Should Session be updated to return Option<&mut TableOptions>?

Copy link
Contributor

@milenkovicm milenkovicm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Tim

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @timsaucer

for the SessionPrivateData does that mean every session contain its own configuration set and can be updated without affecting other sessions?

@timsaucer
Copy link
Member Author

Thanks @timsaucer

for the SessionPrivateData does that mean every session contain its own configuration set and can be updated without affecting other sessions?

Yes! This was one of the things that I was trying to capture rather than doing something like keeping a singleton around. In my usage of datafusion-python I quite commonly create new session contexts. Often they're identical, but I can imagine other users besides myself who do have differences in their sessions. This code basically allows them to remain independent but also gives most of the session traits.

@timsaucer
Copy link
Member Author

Thank you everyone for the extremely helpful discussion!

@timsaucer timsaucer added this pull request to the merge queue Dec 11, 2025
Merged via the queue into apache:main with commit 39a1f75 Dec 11, 2025
30 checks passed
@timsaucer timsaucer deleted the feat/4-ffi-implement-session branch December 11, 2025 09:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ffi Changes to the ffi crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants