Java Bean Mapping is wrong, let's fix it!

When it comes to Bean Mapping, it is surprising to see how many tools and frameworks are available. Some are as old and famous as Dozer, others are more recent and innovative, such as Selma (see this highly referenced post if you need a list).

But no matter how different the underlying technical paradigms of these tools are, they have one thing in common: they take Bean Mapping code out of the application, either as XML or annotation configuration plus reflection, or as generated bytecode or, more recently, as generated classes.

In my opinion and experience, this is the wrong approach to Bean Mapping and the cause of many problems in the life of an application.

Bean Mapping code can look simple, boring, obvious and tedious to write, but it still holds a lot of business logic and as such must be a plain part of the application code. We shouldn’t take that code away or hide it.

A little bit about my experience with the subject

I have worked for 3 years on a big online application for a major French telco as a senior developer and then as a technical lead.

This application integrates with nearly 100 web services (or other remote services), a bunch of business components and a database, and it sends DTOs to the client user interface through 150+ exposed methods. Internally, the application is made of several software layers. Overall, Bean Mapping occurs in many places and is a major aspect of the application.

Developers on this application tried several Bean Mapping solutions: Dozer, fully hand-coded mapping written however each developer felt like writing it, extensive use of Guava’s Function, and other exotic approaches. I saw how they behaved as time went by and the application evolved, and I learned a lot.

After three years, I must say that I didn’t see any one solution to rule them all. I still have nightmares about the time I lost with some of them.

So I started thinking about what Bean Mapping is really about and the problems I had, and concluded that Bean Mapping should be done very differently.

Existing tools were created to write mapping code

The quick-start descriptions of Bean Mapping frameworks are all pretty much the same:

  1. include our jar/declare a Maven dependency/some other kind of quick setup step
  2. add this config file/this annotation/whatever to your application (optional)
  3. with these 2/3/4 lines of code, tadah! Class A is transformed into class B

OK, good. This is simple enough and it solves the initial problem of the developer: writing the mapping code from one bean to another.

As a bonus, it also seems to work for trees of beans and offers customization possibilities.

Then why should I be unhappy with this solution?

Because writing the Bean Mapping code is far from the only concern of the developer, and even less so of the software architect: there is also maintainability, stability, readability, support for debugging, the learning curve for new developers, etc.

The practical problems of hidden mapping code

As I said earlier, these frameworks remove the mapping code from the application and this introduces many practical problems as the application lives, grows (hopefully) and as developers come and go.

Bean Mapping code is not source code in your application

Therefore it is not in SCM: you cannot tell when or where a change occurred, or who made it. You can’t tell either when some mapping code was added, or whether it was added at all.

Bean Mapping code is not stable

The generated code/bytecode can change when upgrading the mapping framework and reflection-based mapping is even worse on that point.

Sure, you are supposed to have unit tests to ensure stability but, supposing unit tests break, you will waste time fixing pieces of code which had no reason to change in the first place since they were working.

You can’t leverage the power of the IDE

To find out where and how class X is instantiated or property foo is set or read, you are on your own.

But when it is time to find out where a problem is coming from, believe me, you will curse the guy who decided to use a Bean Mapping tool instead of just writing dumb b.setFoo(a.getFoo()) lines.
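To make this concrete, here is a minimal sketch of the kind of "dumb" hand-written mapping I am talking about. The classes Customer, CustomerDto and CustomerToCustomerDto are hypothetical names made up for this example. Since every property access is an ordinary method call, the IDE's "find usages" and rename refactoring see all of it.

```java
// Hypothetical source bean, invented for this illustration.
class Customer {
    private String name;
    private String email;

    String getName() { return name; }
    void setName(String name) { this.name = name; }
    String getEmail() { return email; }
    void setEmail(String email) { this.email = email; }
}

// Hypothetical target bean.
class CustomerDto {
    private String name;
    private String email;

    String getName() { return name; }
    void setName(String name) { this.name = name; }
    String getEmail() { return email; }
    void setEmail(String email) { this.email = email; }
}

// Plain Java mapping: boring, but searchable, debuggable and refactorable.
class CustomerToCustomerDto {
    CustomerDto map(Customer source) {
        if (source == null) {
            return null;
        }
        CustomerDto target = new CustomerDto();
        target.setName(source.getName());
        target.setEmail(source.getEmail());
        return target;
    }
}
```

Searching for the callers of setEmail in the IDE leads straight to the mapper, and a breakpoint can be put on any of these lines.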

Debugging is usually not easy

Hidden mapping code acts as a black box: when mapping a single bean to another, that is acceptable; when mapping a tree of beans, it is not.

Also, forget about putting a breakpoint in generated bytecode or in the code of reflection-based frameworks…

No direct access to the mapping code for new developers

When a new developer joins the project and has to fix a bug or implement a change, she will need to learn some tool just to find out about some dumb 1-to-1 mapping, where it happens and how.

Need to customize? Say goodbye to compiler feedback

When the time comes to customize the mapping, you generally lose compile-time feedback and type safety because you end up using strings to designate properties and/or you are required to add some XML configuration.

You can then forget about refactoring your bean classes and having the mapping code updated consistently by your IDE. Also forget about the compiler telling you that, for example, by changing the type of this property, your Bean Mapping code now fails to execute.
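The sketch below illustrates the problem with properties designated by strings. StringBasedCopier is a deliberately naive stand-in that I made up for this example, not the API of any real framework, and Invoice/InvoiceDto are hypothetical beans. It only assumes that the tool resolves getters and setters by name through reflection.

```java
import java.lang.reflect.Method;

// Hypothetical beans, used only for this illustration.
class Invoice {
    private long amountInCents;
    public long getAmountInCents() { return amountInCents; }
    public void setAmountInCents(long amountInCents) { this.amountInCents = amountInCents; }
}

class InvoiceDto {
    private long amountInCents;
    public long getAmountInCents() { return amountInCents; }
    public void setAmountInCents(long amountInCents) { this.amountInCents = amountInCents; }
}

// Naive string-configured copier (an assumption for the example, not a real library).
class StringBasedCopier {
    static void copy(Object source, Object target, String property) {
        String suffix = Character.toUpperCase(property.charAt(0)) + property.substring(1);
        try {
            Method getter = source.getClass().getMethod("get" + suffix);
            Method setter = target.getClass().getMethod("set" + suffix, getter.getReturnType());
            setter.invoke(target, getter.invoke(source));
        } catch (ReflectiveOperationException e) {
            // A typo or a renamed property is only discovered here, at runtime.
            throw new IllegalStateException("Cannot map property '" + property + "'", e);
        }
    }
}
```

A call such as StringBasedCopier.copy(invoice, dto, "amountInCent") compiles fine and only fails when it runs, whereas the hand-written dto.setAmountInCents(invoice.getAmountInCent()) would not even compile.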

Some tools, such as ModelMapper, provide a solution to this problem, but at the cost of very complex and verbose technical solutions. It is way simpler to just write the Bean Mapping code from the beginning. In addition, everyone will understand it just by looking at it.

Immutability is not a first-class citizen

Designing immutable beans wherever possible can be tough but it solves many issues in the long run.

Unfortunately, immutable beans are not well supported by Bean Mapping frameworks, notably because:

  • immutable beans do not have setters (a basic requirement of property-based frameworks)
    • immutable beans only have constructors to initialize their state or, better, builders
    • both constructors and builders can hardly be mapped automatically
  • mapping a tree of immutable beans requires mapping beans bottom-up instead of the usual top-down way
    • children of immutable beans must be created before their parent
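In plain Java, the bottom-up order comes for free. Here is a small sketch, assuming hypothetical immutable beans Book and Author and their DTO counterparts (all names are made up for the example):

```java
// Hypothetical immutable source beans: final fields, constructors, no setters.
final class Author {
    private final String name;
    Author(String name) { this.name = name; }
    String getName() { return name; }
}

final class Book {
    private final String title;
    private final Author author;
    Book(String title, Author author) { this.title = title; this.author = author; }
    String getTitle() { return title; }
    Author getAuthor() { return author; }
}

// Hypothetical immutable target beans.
final class AuthorDto {
    private final String name;
    AuthorDto(String name) { this.name = name; }
    String getName() { return name; }
}

final class BookDto {
    private final String title;
    private final AuthorDto author;
    BookDto(String title, AuthorDto author) { this.title = title; this.author = author; }
    String getTitle() { return title; }
    AuthorDto getAuthor() { return author; }
}

final class BookMapping {
    private BookMapping() {}

    static AuthorDto toDto(Author author) {
        return new AuthorDto(author.getName());
    }

    static BookDto toDto(Book book) {
        // Bottom-up: the child DTO must exist before the parent can be constructed.
        AuthorDto author = toDto(book.getAuthor());
        return new BookDto(book.getTitle(), author);
    }
}
```

Nothing here needs setters: the constructor call is the mapping, and the compiler checks its arguments.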

You can’t investigate how mapping actually occurs

This applies especially to reflection-based or bytecode generation-based mapping. Good luck when you want to make sure the problem is not at the mapping level.

You can’t tell dead code apart

With most tools, it can be hard to tell whether mapping configuration and/or customization is actually dead code:

  • is that line of configuration still required for mapping to occur?
  • is that custom whatever object still used?

Unless you have extensive test coverage and remove the suspicious part, sometimes you just cannot tell.

Little (or no) control over the mapping of bean trees

When it comes to mapping trees of beans, you either do not have control over it or need to customize the framework, and you end up with code that is not that simple any more:

  • want to write one mapper for each node of a tree of beans, so you can really unit test? Good luck
  • want to keep type dependency between mappers so you can easily tell how your code behaves? For some frameworks, that’s just impossible
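With hand-written code, both wishes are ordinary Java design. The sketch below assumes a hypothetical two-node tree (Car and Engine) with one mapper per node; the parent mapper declares its dependency on the child mapper in its constructor, so the dependency is visible in the types.

```java
import java.util.Objects;

// Hypothetical source tree.
class Engine {
    private final int horsepower;
    Engine(int horsepower) { this.horsepower = horsepower; }
    int getHorsepower() { return horsepower; }
}

class Car {
    private final String model;
    private final Engine engine;
    Car(String model, Engine engine) { this.model = model; this.engine = engine; }
    String getModel() { return model; }
    Engine getEngine() { return engine; }
}

// Hypothetical target tree.
class EngineView {
    private final String power;
    EngineView(String power) { this.power = power; }
    String getPower() { return power; }
}

class CarView {
    private final String model;
    private final EngineView engine;
    CarView(String model, EngineView engine) { this.model = model; this.engine = engine; }
    String getModel() { return model; }
    EngineView getEngine() { return engine; }
}

// One mapper per node; the interface keeps the dependency typed and replaceable.
interface EngineToEngineView {
    EngineView map(Engine engine);
}

class EngineToEngineViewImpl implements EngineToEngineView {
    @Override
    public EngineView map(Engine engine) {
        return new EngineView(engine.getHorsepower() + " hp");
    }
}

class CarToCarView {
    private final EngineToEngineView engineMapper;

    CarToCarView(EngineToEngineView engineMapper) {
        this.engineMapper = Objects.requireNonNull(engineMapper);
    }

    CarView map(Car car) {
        return new CarView(car.getModel(), engineMapper.map(car.getEngine()));
    }
}
```

In a unit test of CarToCarView, the child mapper can be replaced by a stub, for example new CarToCarView(engine -> new EngineView("stub")), so each node is tested in isolation.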

Performance …

Performance is a big issue with reflection-based mapping tools: they are slow and CPU/memory intensive compared to plain Java code. In addition, reflective code benefits far less from compiler and JVM optimizations.

Frameworks based on other technical paradigms always compare themselves to each other on that subject but, no matter what, I don’t think they can beat plain Java code (unless you just write shitty Java code, but that’s not a Bean Mapping issue).

The downside of Bean Mapping code in source

Naturally, each point made above is addressed when the developer has direct access to the code. Most of them can then be dealt with like any regular Java coding problem, and any solution can be used.

But I am also aware of the main reasons to hide Bean Mapping code in the first place:

  • Bean Mapping code does not add much value, no need to have it in the application
    • as I explain below, I think this statement is wrong
  • Bean Mapping code written in source does not adapt easily to change
    • I think that using the IDE refactoring capabilities is a better way to adapt to change than having code generated with each build or at runtime
    • if it isn’t, it only means that we need new tools

Bean mapping code is business code

After much thought, I found that the most important problem with Bean Mapping code being out of the application is that it removes business code from the application.

Yes, Bean Mapping code is business code.

Even exact 1-to-1 mapping is business logic. This code could have been different. Some properties could have been nullified or hardcoded to a specific value on purpose. The fact it is not the case should be written in code. It will save any question in the future.
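Here is a small sketch of what I mean by business decisions written in mapping code. The beans Account and AccountSummary, and the two rules shown (never exposing the password hash, hardcoding the channel to "WEB"), are hypothetical and only serve the illustration.

```java
// Hypothetical server-side bean.
class Account {
    private String login;
    private String passwordHash;
    private boolean newsletterOptIn;

    String getLogin() { return login; }
    void setLogin(String login) { this.login = login; }
    String getPasswordHash() { return passwordHash; }
    void setPasswordHash(String passwordHash) { this.passwordHash = passwordHash; }
    boolean isNewsletterOptIn() { return newsletterOptIn; }
    void setNewsletterOptIn(boolean newsletterOptIn) { this.newsletterOptIn = newsletterOptIn; }
}

// Hypothetical bean sent to the client.
class AccountSummary {
    private String login;
    private String passwordHash;
    private boolean newsletterOptIn;
    private String channel;

    String getLogin() { return login; }
    void setLogin(String login) { this.login = login; }
    String getPasswordHash() { return passwordHash; }
    void setPasswordHash(String passwordHash) { this.passwordHash = passwordHash; }
    boolean isNewsletterOptIn() { return newsletterOptIn; }
    void setNewsletterOptIn(boolean newsletterOptIn) { this.newsletterOptIn = newsletterOptIn; }
    String getChannel() { return channel; }
    void setChannel(String channel) { this.channel = channel; }
}

class AccountToAccountSummary {
    AccountSummary map(Account source) {
        AccountSummary target = new AccountSummary();
        // Plain 1-to-1 copies: a decision too, and now a visible one.
        target.setLogin(source.getLogin());
        target.setNewsletterOptIn(source.isNewsletterOptIn());
        // Hypothetical business rule: the hash is nullified on purpose.
        target.setPasswordHash(null);
        // Hypothetical business rule: this mapper only serves the web channel.
        target.setChannel("WEB");
        return target;
    }
}
```

Anyone reading this class knows which properties are copied as is, which are dropped and which are forced, without opening any documentation.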

Also, the very facts that so many bugs occur at the Bean Mapping level and that such big parts of documentation are about mapping are proof that Bean Mapping is business logic.

Bean Mapping code is not some technical problem that a framework can hide or remove from the application.

But I agree, writing Bean Mapping code is a technical problem.

Bean Mapping is dead, long live Bean Mapping

Starting from the hypothesis that Bean Mapping code should be part of the application code as any other piece of business code, two questions arise:

  1. Shouldn’t that code be organized a bit? If everyone starts writing Bean Mapping code without guidelines of some kind, the code will just end up being a mess and it will be worse than before
  2. Some mapping code is just tedious to write and feels like a waste of time. Shouldn’t there be some tool to help the developer?

And one strong constraint: the developer must always keep control over the code and the tool must stay out of the way.

The answer to both questions is “obviously, yes!” and the constraint drove my search for a solution.

I think that what we need is not one tool, but two, very much complementary:

  1. a framework to structure Bean Mapping code and handle the wiring with the rest of the application and with other pieces of Bean Mapping code
    • let’s call it a Bean Mapping Wiring Framework
  2. a tool/plugin at the IDE level to generate mapping code from one class to another
    • let’s call it a Bean Mapping Code Generator

Those two tools would solve very different problems and could be used together or separately. In addition, they would never become mandatory once you start using them: they would step out of the way at any time. This way the developer keeps control over the code.

Still, the ultimate goal is for these tools to be so convenient that they end up in the coding guidelines of the team.

Bean Mapping Wiring Framework

The Bean Mapping Wiring Framework is about letting the developer write the Bean Mapping code in a class and giving her the power to use that class’s code as easily as possible:

  • using interfaces for loose coupling
    • developer writes the implementation, the interface which will be used in other classes to call this implementation will be generated by the framework
  • complying with Separation of Concerns
    • using a class for each mapping from one class to another
    • side effect, unit testing is much easier
  • integrating with Dependency Injection frameworks
    • for example, Spring integration would be about generating annotations on classes or XML configuration files or configuration classes
  • providing coding patterns to help mapping complex structures of beans or solve common Bean Mapping problems
    • using Guava’s Function to easily convert collections of beans or integrating with Java 8 lambdas
    • using mapper factories when creating a bean from more than one source bean
    • etc.
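To give a rough idea of these patterns, here is a sketch in plain Java 8, using java.util.function.Function rather than Guava's. Everything in it is hypothetical: the beans, the ProductToProductDto interface (hand-written here as a stand-in for an interface a wiring framework could generate) and the factory. It is not the DAMapping API.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical source beans: a DTO is built from a Product and a Shop.
class Product {
    private final String name;
    private final long priceInCents;
    Product(String name, long priceInCents) { this.name = name; this.priceInCents = priceInCents; }
    String getName() { return name; }
    long getPriceInCents() { return priceInCents; }
}

class Shop {
    private final String currencyCode;
    Shop(String currencyCode) { this.currencyCode = currencyCode; }
    String getCurrencyCode() { return currencyCode; }
}

// Hypothetical target bean.
class ProductDto {
    private final String name;
    private final String displayPrice;
    ProductDto(String name, String displayPrice) { this.name = name; this.displayPrice = displayPrice; }
    String getName() { return name; }
    String getDisplayPrice() { return displayPrice; }
}

// Hand-written stand-in for an interface that a wiring framework could generate.
interface ProductToProductDto extends Function<Product, ProductDto> {
}

// Mapper factory: the second source bean is fixed when the mapper is created.
class ProductToProductDtoFactory {
    ProductToProductDto forShop(Shop shop) {
        return product -> new ProductDto(
                product.getName(),
                formatPrice(product.getPriceInCents(), shop.getCurrencyCode()));
    }

    private static String formatPrice(long cents, String currencyCode) {
        return String.format(Locale.ROOT, "%d.%02d %s", cents / 100, cents % 100, currencyCode);
    }
}

class ProductListing {
    // Any Function can convert a whole collection of beans.
    static List<ProductDto> toDtos(List<Product> products, Function<Product, ProductDto> mapper) {
        return products.stream().map(mapper).collect(Collectors.toList());
    }
}
```

Callers only depend on the interface, so the implementation can be swapped or stubbed, and the same mapper works for one bean or for a list.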

To my knowledge, there is no such framework at the moment, except DAMapping, whose development started several months ago.

Bean Mapping Code Generator

The primary goal of this tool is to provide convenient generation of Bean Mapping code, from one class to another, inside the application’s source code. The IDE is the best place for this, as we could rely on an interactive UI to generate the source code while leaving options to the developer.

The second goal of this tool is integration with the Bean Mapping Wiring Framework. This integration would give the developer the option to generate not only Bean Mapping code but also Bean Mapping classes. This would be a convenient way of creating the code for trees of beans without being intrusive about it.

Since the logic of generating mapping code already exists, developing this tool would really be a matter of IDE integration and UI design.

Generating the initial Bean Mapping code is relatively easy but this tool will obviously need to complete partial mapping code to be successful. And that’s a little harder to do.

Conclusion

People keep on creating new Bean Mapping tools, changing the technical approach. But they tend to keep the same underlying paradigm which in my opinion is the root cause of their unhappiness with the solution they had before: hidden Bean Mapping code.

This article is the first of a series on this new approach to Bean Mapping. Other articles will follow, diving deeper into the theory and implementation of those new tools. The next article will be on the Bean Mapping Wiring Framework implementation, since I already have initial working results with DAMapping.

I know many people who are (very) unhappy with Dozer and similar frameworks. I’m interested in their opinions, maybe their contributions, or in existing work I am not aware of.

Do not hesitate to comment below or contact me on Twitter.