Support of java.sql.ResultSet and spring-jdbc RowMapper #1731

Open
Lysergid opened this issue Feb 21, 2019 · 24 comments
@Lysergid

Lysergid commented Feb 21, 2019

What about supporting JDBC call results in mappers?

Here is an example of how it might look:

@Mapper
public interface CarMapper {

    CarMapper INSTANCE = Mappers.getMapper( CarMapper.class );

    @JdbcMapping(column = "NUMBER_OF_SEATS", target = "seatCount")
    CarEntity carRsToCarEntity(ResultSet rs);
}

A new annotation isn't strictly necessary, but I think it would improve readability.

CarMapper might also extend spring-jdbc RowMapper, so that the mapping implementation ends up in CarMapperImpl.mapRow().

I just want to know whether this could be in scope for MapStruct, so I won't go into details yet.

@sjaakd
Contributor

sjaakd commented Feb 22, 2019

I'm not sure how the other MapStruct members feel about this one.. But I think it's too much geared towards one purpose. @filiphr : WDYT?

@filiphr
Member

filiphr commented Feb 24, 2019

I think that the purpose is not that bad honestly. java.sql.ResultSet is part of Java itself so it can be optionally used (due to modules).

The option with the spring-jdbc RowMapper will come for free, with the only exception that it would need to override the mapRow with

@Mapper
public interface CarMapper extends RowMapper<CarEntity> {

    CarMapper INSTANCE = Mappers.getMapper( CarMapper.class );

    @JdbcMapping(column = "NUMBER_OF_SEATS", target = "seatCount")
    CarEntity carRsToCarEntity(ResultSet rs);

    @Override
    default CarEntity mapRow(ResultSet rs, int rowNum) {
        return carRsToCarEntity(rs);
    }
}

I think that this is really similar to #1075. Similar in the sense that it allows using the target field name in order to get a value for it.

I would not add a new custom @JdbcMapping in our API as we can just reuse:

@Mapping(target = "seatCount", source = "NUMBER_OF_SEATS")

@sjaakd I think that if someone from the community wants to work on this we should help them out.

@sjaakd
Contributor

sjaakd commented Feb 24, 2019

@sjaakd I think that if someone from the community wants to work on this we should help them out.

Ok... @Lysergid what would the generated code look like? If we were able to define an API that covers both cases (so also #1075) without causing harm to the existing implementation, I'm in as well. I'm also in favour of using the current @Mapping instead of building a new one. A lot of stuff in there would make sense out of the box (constants, ignore, expressions, etc).. @filiphr : I'm not so sure whether the complete mapping can be annotated as @BeanMapping. That would not make sense. We would need something special for that. WDYT?
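
As a rough illustration of the question about generated code, an implementation could plausibly read each mapped column from the ResultSet and call the matching setter. The sketch below is hand-written, not MapStruct output: `CarEntity`, `CarMapperSketch`, the `MAKE` column, and the rule that an unannotated property falls back to a column named after it are all assumptions made for illustration.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical target bean; the property names are assumptions for this sketch.
class CarEntity {
    private int seatCount;
    private String make;

    public int getSeatCount() { return seatCount; }
    public void setSeatCount(int seatCount) { this.seatCount = seatCount; }
    public String getMake() { return make; }
    public void setMake(String make) { this.make = make; }
}

// Hand-written stand-in for what a generated CarMapperImpl might contain.
class CarMapperSketch {

    // Would correspond to @Mapping(target = "seatCount", source = "NUMBER_OF_SEATS").
    static CarEntity carRsToCarEntity(ResultSet rs) throws SQLException {
        if (rs == null) {
            return null;
        }
        CarEntity car = new CarEntity();
        // The ResultSet getter (getInt vs. getString) is chosen from the target property type.
        car.setSeatCount(rs.getInt("NUMBER_OF_SEATS"));
        // Assumed convention: no explicit mapping, so the column is named after the property.
        car.setMake(rs.getString("MAKE"));
        return car;
    }
}
```

A spring-jdbc `RowMapper` variant would only need a `mapRow(ResultSet, int)` method that delegates to `carRsToCarEntity`.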

@filiphr
Member

filiphr commented Feb 24, 2019

@filiphr : I'm not so sure whether the complete mapping can be annotated as @BeanMapping. That would not make sense. We would need something special for that. WDYT?

Well, I am not sure if we would need something specific from @BeanMapping. Most probably we would need #1735, as some of the @BeanMapping methods are not only needed for bean mapping and can be applied here (like all the different strategies).

@sjaakd
Contributor

sjaakd commented Feb 24, 2019

Well I am not sure if we would need something specific from @BeanMapping.

A bean mapping is mapping a bean to a bean.. This is by definition not a bean mapping but something different.. Possibly with its own attributes in the future. But we also need to refer to it in the documentation. So a good name would be appropriate to have.

@chris922
Member

Of course it is not exactly the same as a bean mapping, but at least really similar?

Imho there is not a big difference if we call bean.getFoobar() or row.get("foobar").

So maybe we can find a way to make it more general how the "getter" will be called. For beans it is bean.getFoobar(), if the input is ResultSet it is result.getString("foobar"), for Map<String, ?> it is map.get("foobar") ... all calls are more or less doing the same: reading a value. Just "how" the call must be made (/written) is a bit different.

Of course there are a few things that will need to be considered, e.g. do we have to use row.getString(...) or row.getInt(...)? What if "0" is given as source? row.getString(0) (by column index) or row.getString("0") (by column label)?
Maybe we could just extend the @Mapping annotation with something like sourceType=String.class to solve the first issue. This could be used not just to know whether row.getString("foobar") or row.getInt("foobar") is correct, but also for beans to do something like (String) bean.getObject() [which definitely must be used with care]

Personally I think #1075 has a higher prio, but maybe both tickets can be solved together or one after another.
And once it works when explicitly using @Mapping annotations, it could be extended to also support automatic mappings... something like generating code for every target property not mentioned in a @Mapping
if(map.containsKey("targetProperty")) { target.setTargetProperty(map.get("targetProperty")); }
(for ResultSet it would be a bit harder as we have to use the ResultSetMetaData)

But yeah.. we definitely need a solid concept here. I guess there are a few more pitfalls.
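
The idea of generalising how a source value is read could be sketched with a small reader abstraction. `PropertyReader`, `MapReader`, `SeatTarget` and `ReaderMappingSketch` are hypothetical names, not MapStruct API; a ResultSet-backed reader would implement the same interface by delegating to getString/getInt and would have to consult ResultSetMetaData for the presence check.

```java
import java.util.Map;

// Hypothetical abstraction over "read a named value from some source".
interface PropertyReader {
    boolean has(String name);
    <T> T read(String name, Class<T> type);
}

// Map-backed reader: map.get("foobar") plays the role of bean.getFoobar().
class MapReader implements PropertyReader {
    private final Map<String, ?> source;

    MapReader(Map<String, ?> source) { this.source = source; }

    @Override
    public boolean has(String name) { return source.containsKey(name); }

    @Override
    public <T> T read(String name, Class<T> type) {
        // Class.cast throws ClassCastException if the stored value has another type.
        return type.cast(source.get(name));
    }
}

// Hypothetical target bean.
class SeatTarget {
    private Integer seatCount;
    private String make;

    public Integer getSeatCount() { return seatCount; }
    public void setSeatCount(Integer seatCount) { this.seatCount = seatCount; }
    public String getMake() { return make; }
    public void setMake(String make) { this.make = make; }
}

// Stand-in for generated code: one guarded read per target property.
class ReaderMappingSketch {
    static SeatTarget map(PropertyReader reader) {
        SeatTarget target = new SeatTarget();
        if (reader.has("seatCount")) {
            target.setSeatCount(reader.read("seatCount", Integer.class));
        }
        if (reader.has("make")) {
            target.setMake(reader.read("make", String.class));
        }
        return target;
    }
}
```

Passing the expected type to `read` is one way to settle the getString-versus-getInt question without a separate sourceType attribute, since the type can be taken from the target property.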

@sjaakd
Contributor

sjaakd commented Mar 3, 2019

Why don't we generate a wrapper for the source object..? Like my answer here: https://stackoverflow.com/questions/54945661/how-to-map-values-from-map-to-object-based-on-containskey .. then we can use a regular bean mapping method..
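
A minimal sketch of the wrapper idea, assuming a Map source: the wrapper exposes bean-style getters plus `hasXxx()` methods, which MapStruct treats as source presence checkers, so a regular bean mapping method could consume it. `CarMapWrapper`, `CarDto` and the map keys are made up for illustration, and `WrapperMappingSketch` is a hand-written approximation of what a bean mapping method over the wrapper would do.

```java
import java.util.Map;

// Hypothetical wrapper that makes a Map look like a bean.
class CarMapWrapper {
    private final Map<String, Object> values;

    CarMapWrapper(Map<String, Object> values) { this.values = values; }

    // Presence checks: only keys that exist in the map are mapped.
    public boolean hasSeatCount() { return values.containsKey("seatCount"); }
    public Integer getSeatCount() { return (Integer) values.get("seatCount"); }

    public boolean hasMake() { return values.containsKey("make"); }
    public String getMake() { return (String) values.get("make"); }
}

// Hypothetical target bean with a default that survives when the key is absent.
class CarDto {
    private Integer seatCount = 5;
    private String make;

    public Integer getSeatCount() { return seatCount; }
    public void setSeatCount(Integer seatCount) { this.seatCount = seatCount; }
    public String getMake() { return make; }
    public void setMake(String make) { this.make = make; }
}

// Approximation of a bean mapping method: CarDto toDto(CarMapWrapper source).
class WrapperMappingSketch {
    static CarDto toDto(CarMapWrapper source) {
        if (source == null) {
            return null;
        }
        CarDto dto = new CarDto();
        if (source.hasSeatCount()) {
            dto.setSeatCount(source.getSeatCount());
        }
        if (source.hasMake()) {
            dto.setMake(source.getMake());
        }
        return dto;
    }
}
```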

@sjaakd
Contributor

sjaakd commented Mar 3, 2019

Of course, we can do something similar on the target side.. we could use a parameter annotation to control this

@sjaakd
Contributor

sjaakd commented Mar 7, 2019

Anyone working on this currently? I'd like to start with it..

@agentgt

agentgt commented Apr 25, 2019

Just some incoherent thoughts/findings I thought I would share:

When I was looking into doing this I was actually surprised to find almost zero annotation processor libraries to map JDBC stuff to beans. Using reflection is actually a serious performance issue at mass scale as well as at micro scale such as Android. If JDBC ever supports streaming clients that support pipelining then it will become an even bigger problem. This project shows some benchmarked code: https://github.com/aaberg/sql2o just to give you an idea. (I mention this to provide motivation to all that it would be a very useful feature :) ).

Anyway I was working on my own annotation processor to do this (not ready for opensource) and thus is not specific to MapStruct.

I could look into putting it into MapStruct but it looks like the code generation is done with lots and lots of FTL, which I don't mind, but I would have to figure out how it's all connected to make sure I reuse macros, API etc. I was using square/JavaPoet for mine because it's an internal (not yet open source) processor.

One thing ours does is take JPA annotations into consideration for mapping. Lots of other JDBC wrapping projects do this but use reflection instead of code generation. e.g. jOOQ, JDBI, sql2o, and probably some Spring JDBC addon libs as well.

Speaking of jOOQ... it actually does code generation from the database and will even generate plain POJOs but for some strange reason it does not generate mapping code for the POJOs (even though those POJOs are one-to-one with the tables). @lukaseder would know best but jOOQ has a very specific API and I'm not even sure you can get the ResultSet and thus would require another MapStruct extension (is that the right word?).

Thus I think @chris922 might be right about first fixing the general problem of needing Map -> Bean. That is, most of the JDBC wrappers express a single row/resultset/record more or less as a Map<String,?> and an adapter could easily be made.

@lukaseder

This project shows some benchmarked code: https://github.com/aaberg/sql2o just to give you an idea.

Ah, that benchmark. While it did help discover a few issues in jOOQ at the time, the numbers have never been updated.

I doubt they're still accurate. For example, I really doubt that JdbcTemplate is that slow if used properly. As far as jOOQ is concerned, 90% of jOOQ users use the code generator, in case of which many of the slowdowns will not be experienced (ad hoc mapping of JDBC ResultSetMetaData to reflection logic).

Speaking of jOOQ... it actually does code generation from the database and will even generate plain POJOs but for some strange reason it does not generate mapping code for the POJOs (even though those POJOs are one-to-one with the tables). @lukaseder would know best but jOOQ has a very specific API

While it would be possible to auto map a well known record type to a well known POJO type through generated code (and I think there's a low prio feature request for this), the overhead from always blindly running a SELECT * just because that's more convenient, is going to totally eclipse any gain you could get from replacing reflection usage by generated code in the mapping logic.

Note that there's a reflection cache that will remember source and target types and reuse cached Method instances, so the big overhead in reflection (which is looking up method names) is effectively avoided.
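
The general technique of caching reflective lookups can be sketched as follows. This is not jOOQ's implementation; `SetterCache` and `SeatBean` are hypothetical, and the sketch only shows how the name-based method lookup can be paid once per (type, property) pair instead of once per row.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical bean used to exercise the cache.
class SeatBean {
    private int seatCount;

    public int getSeatCount() { return seatCount; }
    public void setSeatCount(int seatCount) { this.seatCount = seatCount; }
}

// Caches setter lookups so the scan over method names runs once per property.
class SetterCache {
    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    static Method setter(Class<?> type, String property) {
        String key = type.getName() + "#" + property;
        return CACHE.computeIfAbsent(key, k -> lookup(type, property));
    }

    // The expensive part: scanning the public methods for a matching setter name.
    private static Method lookup(Class<?> type, String property) {
        String name = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 1) {
                return m;
            }
        }
        throw new IllegalArgumentException("No setter for " + property + " on " + type.getName());
    }
}
```

Each row still pays for `Method.invoke`, so the cache removes the lookup cost rather than the reflective call itself.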

If Java had anonymous types like C#, that might be a different story, because then, a "projection POJO" could be generated on the fly for the use of a single query, using an arbitrary projection and the mapping logic could be generated as well for that single query.

Having said so, jOOQ's generated POJOs are overrated, but I believe we've had this discussion a few times elsewhere :)

and I'm not even sure you can get the ResultSet and thus would require another MapStruct extension (is that the right word?).

You can use Result.intoResultSet() for API compatibility with JDBC. The original JDBC ResultSet will already have been closed at this time, so this method returns an in-memory buffer.

Alternatively, when using ResultQuery.fetchLazy(), you can get a resourceful Cursor on which you can call Cursor.resultSet(). This returns a proxy of the actual JDBC ResultSet, which you could use more efficiently. You'd have to close either the Cursor or the ResultSet proxy. Both will close the underlying ResultSet and the originating PreparedStatement.

@arnaudroger

@lukaseder on the benchmark it's not JdbcTemplate that is slow but BeanPropertyRowMapper, which was really bad; I doubt it has changed much.
From the doc: "Please note that this class is designed to provide convenience rather than high performance. For best performance, consider using a custom RowMapper implementation."

@lukaseder

@arnaudroger I understand, but this mapper is instantiated every time in the benchmark. Perhaps it, too, caches some things if reused? In a fair benchmark, it should be reused.

@agentgt

agentgt commented Apr 26, 2019

@lukaseder Yes, this is a valid point for the performance case of blindly selecting. Also I wasn't trying to single out jOOQ on bad performance (on the contrary the performance is great!). My point in mentioning jOOQ is that MapStruct might not even be the best library for this thing since there are lots of JDBC-like libraries to consider.

I think a better library for this is probably @arnaudroger's sfm library, but code generation was abandoned there for some reason.

Note that there's a reflection cache that will remember source and target types and reuse cached Method instances, so the big overhead in reflection (which is looking up method names) is effectively avoided.

Now one could argue that code generation in a Java server environment is not exactly that much more performant than properly cached reflection and bean access (sfm, and various other libraries) but for certain platforms like Android code generation is usually far superior in performance.

Also, I find debugging generated code slightly easier for understanding what's going on, but that is just a minor preference.

You can use Result.intoResultSet() for API compatibility with JDBC. The original JDBC ResultSet will already have been closed at this time, so this method returns an in-memory buffer

Yeah for some reason I was confusing binding statements with ResultSet and the general problem of CRUD-ing. I think sfm handles that. To be honest I was pretty tired when I wrote that.

@arnaudroger

@agentgt the difficulty of code generation at preprocessing time is that there is no information available about the query. It might be possible to have something that extracts the query and manages to get the metadata from the DB, but it's not trivial.
When I benchmarked sfm vs roma https://serkan-ozal.github.io/spring-jdbc-roma/ - that uses preprocessing - surprisingly sfm ended up being faster because it can use number lookup instead of column name lookup, which requires a name-to-number translation by the ResultSet on each get.

Having said that, one needs to keep in mind the relative cost of the mapping compared to other costs. For example, you will not find a ResultSet mapping library that has a benchmark with a date; the reason is that the date conversion from the incoming data by the JDBC driver ends up dominating the benchmark.

Overall it is not obvious to me that using a pre-processor would lead to a noticeable increase in perf after the JIT kicks in compared to using asm generation.
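
The index-versus-name point can be illustrated with plain JDBC: resolve the column index once via `ResultSet.findColumn`, then read by index for every row. `SeatCountReader` and the `NUMBER_OF_SEATS` column are assumptions for this sketch, and whether the difference is measurable depends on the driver.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that reads one int column from every row.
class SeatCountReader {

    static List<Integer> readSeatCounts(ResultSet rs) throws SQLException {
        // One name-to-index translation for the whole result set...
        int seatColumn = rs.findColumn("NUMBER_OF_SEATS");
        List<Integer> seats = new ArrayList<>();
        while (rs.next()) {
            // ...instead of rs.getInt("NUMBER_OF_SEATS") on every row.
            seats.add(rs.getInt(seatColumn));
        }
        return seats;
    }
}
```

Code generated at compile time only knows the column names from the annotations, so a step like `findColumn` at the start of each result set would be needed to get index-based access.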

@agentgt

agentgt commented Apr 26, 2019

@lukaseder and @arnaudroger

While it would be possible to auto map a well known record type to a well known POJO type through generated code (and I think there's a low prio feature request for this), the overhead from always blindly running a SELECT * just because that's more convenient, is going to totally eclipse any gain you could get from replacing reflection usage by generated code in the mapping logic.

Note that there's a reflection cache that will remember source and target types and reuse cached Method instances, so the big overhead in reflection (which is looking up method names) is effectively avoided.

As I mentioned in a previous comment I sort of internally got querying mixed up with CRUD-ing but we actually did have some minor performance issues mapping from Bean -> jOOQ Record for insert. For that case pure reflection is used.

To give you an idea how it affects us we are talking more than 1k-10k large records/rows being inserted every second and we found a 20% increase in speed by eliminating the DSL.newRecord(TABLE, bean). Consequently we just manually map the data over to the record type.

Just to be clear, I understand how selecting isn't going to make that much of a difference (based on both of your comments) but I do think it will make some difference with CRUD-ing. My original comment was to sort of steer MapStruct away from doing this and/or at least fix the map->bean feature first.

I haven't run a performance test using sfm since we are doing it with static code now, but I will say @arnaudroger I'm a big fan of the library.

@arnaudroger

@agentgt would you know a way to run benchmark for Android? I don't even know how that would work... which db do you usually use?

@agentgt

agentgt commented Apr 26, 2019

@arnaudroger

When I benchmarked sfm vs roma https://serkan-ozal.github.io/spring-jdbc-roma/ - that uses preprocessing - surprisingly sfm ended up being faster because it can use number lookup instead of column name lookup, which requires a name-to-number translation by the ResultSet on each get.

I don't think Roma uses a preprocessor. Are you sure they do? I had looked at it as I tried to do an extensive search for any libraries that might be doing annotation-processing code generation; I was trying to avoid manually mapping Bean -> Insert Statement as previously mentioned (I realize roma doesn't do that but I happened to run across it). My search is actually how I found your library as well.

As for the performance of retrieving ResultSet columns by index instead of by name (by using the ResultSetMetaData first), I bet it varies by client. I saw your orm-benchmark project and I'm not sure if you test PostgreSQL, but I don't find retrieving by column name to be that slow (it's also notable that your library performs faster than raw static JDBC at times, so at that point I think this is just JIT fluctuations).

@agentgt would you know a way to run benchmark for Android? I don't even know how that would work... which db do you usually use?

Sadly I don't touch the Android code but I believe the database is SQLite. I'm super super rusty on Android development... I try to avoid it :)

@arnaudroger

@agentgt you might be right, it's been a long time since I looked at that... that's prob a conversation to take off this thread anyway.

@lukaseder

@agentgt Also I wasn't trying to single out jOOQ on bad performance (on the contrary the performance is great!)

That wasn't my point. It just caught my attention more than other things :)

[...] we actually did have some minor performance issues mapping from Bean -> jOOQ Record for insert. For that case pure reflection is used. [...] we found a 20% increase in speed by eliminating the DSL.newRecord(TABLE, bean).

Would love to see details on the jOOQ tracker if you find some time. Surely, there's more room for improvement!

In any case, I'll try to find the relevant issue on Monday and increase its priority. Mapping the generated POJOs to the generated Records using generated mapping code certainly doesn't hurt.

Just to be clear, I understand how selecting isn't going to make that much of a difference (based on both of your comments) but I do think it will make some difference with CRUD-ing

My point also applies here. Updating 50 columns is usually worse than updating 3 columns.

@sjaakd
Contributor

sjaakd commented Apr 26, 2019

Mapping the generated POJOs to the generated Records using generated mapping code certainly doesn't hurt.

Ah.. and now we are back to the essence of MapStruct. Interesting discussion above. From our side I started working on a solution. Initially the idea is to map to and from a regular Map. But the solution should be easily extendable to other types like the ResultSet. Perhaps it is interesting to set up an example in our examples repo on how to use it in combination with jOOQ in a later phase.

@agentgt

agentgt commented Apr 26, 2019

@lukaseder I was reluctant to file an issue as I hadn’t yet set up an isolated performance test and I wanted to be sensitive to your time (particularly because we are not customers). Furthermore I did post on the mailing list to see if there was interest but didn’t get a response (not a complaint).

Updating 50 columns...

As for updating columns... I’m batch inserting into Postgresql database.

Update is so slow in Postgres that even trying to update less columns matters little because of MVCC.

But in the general case yes.

@sjaakd :

From our side I have started working on a solution

Ignoring simple ResultSets there is actually a fair amount of data conversion and it can vary by database.

These other jdbc libraries like sfm, jooq, and jdbi are often aware of this. Thus it might be easier to map to their abstractions.

@lukaseder

@lukaseder I was reluctant to file an issue as I hadn’t yet set up an isolated performance test and I wanted to be sensitive to your time (particularly because we are not customers).

Sounds like a good reason to become customers and then make best use of our time! :)

Furthermore I did post on the mailing list to see if there was interest but didn’t get a response (not a complaint).

Well, I don't think this is a big issue for most users. That doesn't mean it's not an issue, though.

Update is so slow in Postgres that even trying to update less columns matters little because of MVCC.

Sounds like I have to write a blog post now, to benchmark the performance of this! :)

@robertobsc

Hello, is there any news about this feature? I am interested in using it!
