Should Statement be a token? #346


Closed
sfackler opened this issue Apr 23, 2018 · 11 comments

Comments

@sfackler
Owner

A statement is currently modeled as a type that borrows the connection and has methods defined to query/execute it. This is conceptually nice, but has a few issues in practice:

  • Since it borrows the connection, it's hard to save off. We have a prepare_cached method, but if that doesn't work for your use case you have to write unsafe code to package the statement alongside its connection.
  • It forces all methods to take &self and the connection to have an internal RefCell. This means that we have to dynamically check that the connection/transaction you're using is the "active" one.

An alternate approach is for the statement to be a "token" which does not borrow the connection. It has no methods of its own; instead it is provided to methods on the connection for use. It holds onto a channel sender, and on drop it enqueues itself for cleanup. The connection periodically checks the cleanup list and closes all dead statements. This is the approach taken in tokio-postgres.
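The drop-enqueue mechanism described above can be sketched with a std::sync::mpsc channel. This is a minimal illustration, not the actual tokio-postgres implementation; all names (Statement, Connection, drain_cleanup) are hypothetical stand-ins:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

// A statement token holds only its server-side name and a channel sender.
// Dropping it enqueues the name; the connection closes it later.
struct Statement {
    name: String,
    cleanup: Sender<String>,
}

impl Drop for Statement {
    fn drop(&mut self) {
        // If the connection is already gone, there is nothing to clean up.
        let _ = self.cleanup.send(self.name.clone());
    }
}

struct Connection {
    cleanup_tx: Sender<String>,
    cleanup_rx: Receiver<String>,
}

impl Connection {
    fn new() -> Connection {
        let (tx, rx) = channel();
        Connection { cleanup_tx: tx, cleanup_rx: rx }
    }

    fn prepare(&mut self, name: &str) -> Statement {
        Statement { name: name.to_string(), cleanup: self.cleanup_tx.clone() }
    }

    // Called periodically (e.g. at the start of each query): collect the
    // names of all dropped statements so they can be closed server-side.
    fn drain_cleanup(&mut self) -> Vec<String> {
        self.cleanup_rx.try_iter().collect()
    }
}

fn main() {
    let mut conn = Connection::new();
    let stmt = conn.prepare("s0");
    drop(stmt);
    assert_eq!(conn.drain_cleanup(), vec!["s0".to_string()]);
    println!("cleanup queue drained: ok");
}
```

Because the token only holds a Sender, it doesn't borrow the connection, which is what removes the lifetime problems described above.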

A high level API sketch:

pub struct Connection { ... }

impl Connection {
    pub fn prepare(&mut self, query: &str) -> Result<Statement> { ... }

    pub fn query(&mut self, statement: &Statement, params: &[&ToSql]) -> Result<Rows> { ... }

    pub fn execute(&mut self, statement: &Statement, params: &[&ToSql]) -> Result<u64> { ... }

    pub fn query_once(&mut self, query: &str, params: &[&ToSql]) -> Result<Rows> { ... }

    pub fn execute_once(&mut self, query: &str, params: &[&ToSql]) -> Result<u64> { ... }

    pub fn transaction<'a>(&'a mut self) -> Result<Transaction<'a>> { ... }
}

pub struct Statement { ... }

pub struct Transaction<'a> { ... }

impl<'a> Transaction<'a> {
    // same methods as Connection
}
  • Pros
    • We no longer need a RefCell inside of Connection.
    • You're statically prevented from using a non-active transaction by the borrow checker.
    • prepare_cached can go away since you can save off statements however you see fit. r2d2-postgres will probably need some logic to let you attach statements to the pooled connection.
    • Slightly less network traffic since we'll delay statement closing until the next use of the connection after drop.
  • Cons
    • We introduce the possibility of people trying to use a statement with the wrong connection, particularly when there's a connection pool involved. We can either panic immediately or just ensure that statement names are globally unique and let the DB reject it.
    • The "simple" Connection::query and Connection::execute methods get a bit longer. We could overload a single set of methods with a trait bound that takes both query strings and statements?
    pub fn query<T>(&mut self, statement: &T, params: &[&ToSql]) -> Result<Rows>
    where
        T: Query + ?Sized,
    { ... }

    pub trait Query: Sealed {}

    impl Query for str {}
    impl Query for Statement {}
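For the "globally unique statement names" option in the cons above, a process-wide atomic counter would suffice. A sketch, with a hypothetical helper name:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter so every prepared statement gets a distinct name.
// If a statement is used on the wrong connection, the server simply won't
// know the name and will reject the request.
static NEXT_STMT_ID: AtomicU64 = AtomicU64::new(0);

fn next_statement_name() -> String {
    let id = NEXT_STMT_ID.fetch_add(1, Ordering::Relaxed);
    format!("s{}", id)
}

fn main() {
    let a = next_statement_name();
    let b = next_statement_name();
    assert_ne!(a, b); // every call yields a fresh name, e.g. "s0", "s1"
    println!("{} {}", a, b);
}
```

This trades an immediate panic for a deferred database error, which is the "let the DB reject it" option discussed below.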
@sfackler
Owner Author

cc @jwilm

@sfackler
Owner Author

Hmm, portals still imply a need for internal mutability :(

@sfackler
Owner Author

We can drop LazyRows in favor of a lower level portal interface:

impl<'a> Transaction<'a> {
    pub fn bind(&mut self, statement: &Statement, params: &[&ToSql]) -> Result<Portal> { ... }

    pub fn query_portal(&mut self, portal: &Portal, rows: u32) -> Result<Rows> { ... }
}

The portal will be automatically closed when the active transaction/savepoint closes; as with statements after the connection closes, we lose the static prohibition against using it afterward.
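The chunked-consumption pattern this API enables might look like the following. This is a sketch against mocked types (the real signatures are only outlined above), using plain Vec<i32> in place of Rows and params:

```rust
// Mocked stand-ins for the proposed portal API, just to show the usage
// pattern: bind once, then pull rows in fixed-size chunks until exhausted.
struct Portal {
    remaining: Vec<i32>,
}

struct Transaction {
    data: Vec<i32>,
}

impl Transaction {
    // Stands in for Transaction::bind(statement, params).
    fn bind(&mut self) -> Portal {
        Portal { remaining: self.data.clone() }
    }

    // Stands in for query_portal: returns up to `rows` rows; an empty
    // result means the portal is done.
    fn query_portal(&mut self, portal: &mut Portal, rows: u32) -> Vec<i32> {
        let n = (rows as usize).min(portal.remaining.len());
        portal.remaining.drain(..n).collect()
    }
}

fn main() {
    let mut tx = Transaction { data: (1..=5).collect() };
    let mut portal = tx.bind();
    let mut seen = Vec::new();
    loop {
        let chunk = tx.query_portal(&mut portal, 2);
        if chunk.is_empty() {
            break;
        }
        seen.extend(chunk);
        // The caller could also break here early, leaving the rest unsent.
    }
    assert_eq!(seen, vec![1, 2, 3, 4, 5]);
    println!("fetched {} rows in chunks", seen.len());
}
```

The point of the design is the early-break option in the loop: the caller decides when (and whether) to ask the server for more rows.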

@jwilm
Contributor

jwilm commented May 18, 2018

Have you put any thought into having Statement hold an Rc<RefCell<Connection>>? That would solve some of the borrowing awkwardness and prevent any "wrong connection" problems under the current proposal.

I'm thinking about this purely from the non-tokio perspective; I haven't really thought about that side of the library at all.

@sfackler
Owner Author

Rc doesn't really work, since you wouldn't be able to save statements across uses of a pooled connection due to the lack of a Send bound.
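The Send constraint can be checked at compile time. A sketch with a dummy Connection type: a pooled connection crosses threads, so anything packaged with it must be Send, which Rc<RefCell<...>> is not:

```rust
use std::sync::{Arc, Mutex};

// Dummy stand-in for the real connection type.
struct Connection;

// Compile-time check: only types that are Send can be passed here.
fn assert_send<T: Send>() {}

fn main() {
    // Arc<Mutex<Connection>> is Send, so it could travel with a pooled
    // connection across threads...
    assert_send::<Arc<Mutex<Connection>>>();

    // ...but Rc<RefCell<Connection>> is not: uncommenting the line below
    // fails to compile, which is exactly the problem with the Rc proposal.
    // assert_send::<std::rc::Rc<std::cell::RefCell<Connection>>>();

    println!("Send check passed");
}
```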

@jwilm
Contributor

jwilm commented May 18, 2018

Ah right, connection poolers...

I don't see the cons to your approach as particularly negative. The overloading approach makes sense and aligns with my own thoughts about it.

We introduce the possibility of people trying to use a statement with the wrong connection, particularly when there's a connection pool involved. We can either panic immediately or just ensure that statement names are globally unique and let the DB reject it.

Panicking seems strong. A globally unique statement name seems like a nice solution.

@jwilm
Contributor

jwilm commented May 20, 2018

There's also the possibility that if someone puts a connection pooler between this library and the server, the statement wouldn't be on the same connection anyhow.

@rmanoka

rmanoka commented Dec 3, 2018

Quite like the API proposed; I think the static-checking advantages of &mut outweigh the cons.

One comment on LazyRows: I personally feel it duplicates the functionality of the "DECLARE CURSOR" and "FETCH" commands already provided by PostgreSQL (and already working nicely via rust-postgres). Also, the concern of reading a large number of rows is a sync-only concern, as tokio-postgres makes it a Stream anyway. Are portals aimed only at the same use case as LazyRows?

@sfackler
Owner Author

sfackler commented Dec 3, 2018

One comment on LazyRows: I personally feel it duplicates the functionality of the "DECLARE CURSOR" and "FETCH" commands already provided by PostgreSQL (and already working nicely via rust-postgres).

You can achieve the same thing via SQL-level cursors, for sure. LazyRows doesn't use those, but rather a feature of the postgres protocol that I think should be a bit more efficient in terms of network transfer.

Also, the concern of reading a large number of rows is a sync-only concern, as tokio-postgres makes it a Stream anyway. Are portals aimed only at the same use case as LazyRows?

Portals are a bit more powerful than that: you have control over when you want more rows to be sent. That means, for example, that if you don't know how many rows you'll need, you can just stop receiving them with a portal, rather than having to read and discard them all as with a normal query.

@rmanoka

rmanoka commented Dec 3, 2018

You can achieve the same thing via SQL-level cursors, for sure. LazyRows doesn't use those, but rather a feature of the postgres protocol that I think should be a bit more efficient in terms of network transfer.

Nice! Would you say it needs fewer round-trips? With CURSOR + FETCH, we have about n+1 round-trips if I want to read the total output in n chunks.

Portals are a bit more powerful than that: you have control over when you want more rows to be sent. That means, for example, that if you don't know how many rows you'll need, you can just stop receiving them with a portal, rather than having to read and discard them all as with a normal query.

Ah, I see: even with tokio-postgres, a long query cannot be aborted midway without reading all the data the server is sending. I didn't realize that.

@sfackler
Owner Author

This is now implemented on the master branch.
