On DTOs
DTOs, or data-transfer objects, are commonly used. What is not so commonly known is that they originate from DDD (domain-driven design). There it makes a lot of sense – domain objects have state, identity and business logic, while DTOs have only state.
But many projects today use the anemic data model approach (my opinion) and still use DTOs. They are used whenever an object “leaves” the service layer or “leaves” the system (through web services, RMI, etc.). There are three approaches:
- Every entity has at least one corresponding DTO. Usually more than one, for different scenarios in the view layer. When you display a user in a list you have one DTO; when you display it in a “user details” window you need a more extended DTO. I am not in favour of this approach because in too many cases the DTO and the entity have exactly the same structure, and as a result there’s a lot of duplicated code + redundant mapping. Another thing is the variability of multiple DTOs. Even if they differ from the entity, they differ from one another by only one or two fields. Why is duplication a bad thing? Because changes have to be made in two places, issues are harder to trace when data passes through multiple objects, and because... it is duplication. Copy & paste within the same project is a sin.
- DTOs are only created when their structure significantly differs from that of the entity. In all other cases the entity itself is used. Cases where you don’t want to show some fields (especially when exposing them via web services to 3rd parties) exist, but are not that common. This can sometimes be handled via the serialization mechanism – mark them as @JsonIgnore or @XmlTransient, for example – but in other cases the structures are just different. In these cases a DTO is due. For example you have a User and UserDetails, where UserDetails holds the details + the relations of the currently logged-in user to the given user. The latter has nothing to do with the entity, so you create a DTO. However, in the case of a DirectMessage you have sender, recipient, text and datetime both in the DB and in the UI. No need to have a DTO.
One caveat with this approach (as well as with the next one): anemic entities usually come with an ORM (JPA in the case of Java). Whenever they exit the service layer they may be invalid, because lazy collections require an open session. You have two options here:
- Use OpenSessionInView / OpenEntityManagerInView – thus your session stays open until you are finished preparing the response. This is easy to configure but is not my preferred option – it violates layer boundaries in a subtle way, and this sometimes leads to problems, especially for novice developers
- Don’t use lazy collections. Lazy collections are unneeded. Either make them eager, if they are supposed to hold a small list of items (for example, the list of roles for a user), or, if the data is likely to grow, use queries. You are not going to show 1000 records in one go anyway; you will have to page them. Without lazy associations (@*ToOne associations are eager by default) you won’t have invalid objects when the session is closed
- Don’t use DTOs at all. Applicable as long as there aren’t significantly varying structures. For smaller projects this is usually a good way to go. Everything mentioned in the above point applies here.
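To make the “middle way” more concrete, here is a minimal sketch of the User / UserDetails case in plain Java. Everything beyond the two class names is an assumption made for illustration: the follower set, the field names and the accessor names are hypothetical, and a real entity would carry ORM annotations.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity; in a real project this would be ORM-mapped.
class User {
    private final long id;
    private final String username;
    // Ids of the users following this one (assumed relation, kept small for the sketch).
    private final Set<Long> followerIds = new HashSet<Long>();

    User(long id, String username) {
        this.id = id;
        this.username = username;
    }

    long getId() { return id; }
    String getUsername() { return username; }
    void addFollower(long userId) { followerIds.add(userId); }
    boolean isFollowedBy(long userId) { return followerIds.contains(userId); }
}

// A DTO that is justified: it carries viewer-specific data the entity does not have.
class UserDetails {
    private final long id;
    private final String username;
    private final boolean followedByCurrentUser;

    UserDetails(User user, User currentUser) {
        this.id = user.getId();
        this.username = user.getUsername();
        // Relation of the currently logged-in user to the displayed user.
        this.followedByCurrentUser = user.isFollowedBy(currentUser.getId());
    }

    long getId() { return id; }
    String getUsername() { return username; }
    boolean isFollowedByCurrentUser() { return followedByCurrentUser; }
}
```

A DirectMessage, by contrast, would have the same fields in both shapes, so under this approach the entity itself would be returned.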
So my preferred approach is the “middle way”. But it requires a lot of consideration in each case, which may not be feasible for bigger and/or less experienced teams. So one of the two “extremes” should be picked. Since the “no DTOs” approach also requires consideration (what to make @Transient, how lazy collections affect the flow, etc.), the “all DTOs” approach is usually chosen. But even though it is seemingly the safest approach, it has many pitfalls.
First, how do you map from DTOs to entities and vice-versa? Four options:
- dedicated mapper classes
- constructors – the DTO constructor takes the entity and fills itself, and vice-versa (remember to also provide a default constructor)
- declarative mapping (e.g. Dozer). This is practically the same as the first option – it externalizes the mapping. It can even be used together with a dedicated mapper class
- map them in-line (whenever needed). This can generate unmaintainable code and is not preferred
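The constructor option from the list above could look roughly like this. It is only a sketch: Customer and CustomerDto are hypothetical names, and the two fields are chosen arbitrarily.

```java
// Hypothetical entity (ORM annotations omitted).
class Customer {
    private Long id;
    private String name;

    Customer() {
    }

    // The "vice-versa" direction: the entity fills itself from the DTO.
    Customer(CustomerDto dto) {
        this.id = dto.getId();
        this.name = dto.getName();
    }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

// Hypothetical DTO: state only, no identity semantics and no business logic.
class CustomerDto {
    private Long id;
    private String name;

    // Default constructor, typically needed by serialization frameworks.
    CustomerDto() {
    }

    // The DTO constructor takes the entity and fills itself.
    CustomerDto(Customer entity) {
        this.id = entity.getId();
        this.name = entity.getName();
    }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}
```

The mapping lives next to the fields it copies, so adding a field means touching the DTO class only once; the price is that the DTO and the entity know about each other.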
I prefer the constructor approach, if only because fewer classes are created. But they are essentially the same (DTOs are not famous for encapsulation, so all of your properties are exposed anyway). Here is a list of guidelines when using DTOs and any of the “mapping” approaches:
- Don’t generate too much redundant code. If two scenarios require slightly different DTOs, reuse. No need to create a new DTO for a difference of one or two fields
- Don’t put presentation logic in mappers/constructors. For example
if (entity.isActive()) dto.setStatus("Active");
This should happen in the view layer
- Don’t sneak entities in together with DTOs. DTOs should not have members which are entities. Generally, entities should not be used outside the service layer (this is a bit extreme, but if we use DTOs everywhere we should be consistent and stick to that practice)
- Don’t use the mappers/entity-to-dto constructors in controllers, use them in the service layer. The reason DTOs are used in the first place is that entities may be ORM-bound, and they may not be valid outside a session (i.e. outside the service layer).
- If using mappers, prefer static mapper methods. Mappers don’t have state, so no need for them to be instantiated. (And they don’t have to be mocked, wrapped, etc).
- If using mappers, there’s no need for a separate mapper for each entity (+ its multiple DTOs). Related entities can be grouped in one mapper. For example Company, CompanyProfile, CompanySubsidiary can use the same mapper class
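A rough sketch of the last few guidelines, assuming the mapper option was chosen: one stateless mapper with static methods covering related entities, and no presentation logic inside it. The DTO class names, the fields and the method name toDto are hypothetical; package-private fields are used only to keep the sketch short.

```java
// Hypothetical entities (ORM annotations omitted).
class Company {
    String name;
    boolean active;

    Company(String name, boolean active) {
        this.name = name;
        this.active = active;
    }
}

class CompanyProfile {
    String description;

    CompanyProfile(String description) {
        this.description = description;
    }
}

// Hypothetical DTOs: plain state holders with no entity members.
class CompanyDto {
    String name;
    boolean active; // stays a boolean; turning it into "Active" is the view's job
}

class CompanyProfileDto {
    String description;
}

// One mapper for the related entities; static methods, never instantiated.
final class CompanyMapper {
    private CompanyMapper() {
    }

    static CompanyDto toDto(Company company) {
        CompanyDto dto = new CompanyDto();
        dto.name = company.name;
        dto.active = company.active;
        return dto;
    }

    static CompanyProfileDto toDto(CompanyProfile profile) {
        CompanyProfileDto dto = new CompanyProfileDto();
        dto.description = profile.description;
        return dto;
    }
}
```

The service layer would call CompanyMapper.toDto(...) before returning, so the controller only ever sees the DTOs.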
Just make sure you make all these decisions at the beginning of a project and figure out which is applicable in your scenario (team size and experience, project size, domain complexity).
In fact, this really depends on your core design. Here it seems implicit that your entities are mapped to a database with some ORM.
It also seems implicit that you use a statically typed language.
In the general case, one might want to retrieve only the information necessary for the transaction at hand. If you retrieve complete entities from the database, you are already returning too much. Moreover, you are database-dependent: if you end up splitting one table into two or doing some other refactoring, your access layer changes.
A perfectly viable model, then, is to have a stored procedure that returns only the relevant data and to map it directly to what you call a DTO. The DTO is then transmitted as is, up to the UI.
I’ll add that if you use a dynamically typed language (or a map in a statically typed one), you don’t need to define these objects at all. After all, all you are doing is returning records from the database.
This is not to say that your design is bad or unsuited. Just that you already have a well-defined and constrained design in your head: a design that (in my humble opinion) is good if you intend to do a lot in the business layer and you embrace static typing.
We speak about duplicated code… Static typing badly violates the DRY and KISS principles. But one might prefer to manage that and get easier refactoring and “free unit tests”. DTOs, and to an extent DDD or an ORM like JPA, are the consequence of this static typing choice.
One might say that if your final target doesn’t benefit from static typing (as when rendering HTML or JSON) and your business layer doesn’t do much more than route the data, all of this DTO/entity/JPA machinery brings very little to the table.
Even with stored procedures, you will have a “database-tied” object – i.e. one that reflects the DB structure, which might not be exactly what the UI wants. And a different stored procedure for each UI scenario would be wrong.
Btw, I don’t see how static typing violates DRY and KISS. Yes, people using statically typed languages violate them more often, so we may say they are more prone to these violations, but you can keep to both principles in a statically typed language without any problems.
On the final paragraph – the view is in most cases dynamic (JSP and other Java view technologies). But this doesn’t mean the rest of the code doesn’t benefit from static typing.
Hi, very interesting post, thank you! Indeed most if not all projects I have worked on used an anemic domain model, so the question of why we need DTOs still puzzles me…
Here is a related and very interesting post of another blogger, http://bit.ly/rjUJf3
He mentions 3 reasons why DTOs are needed:
– security: you might not want to send some fields, such as the password, to the client.
– performance: you want to be sure that a huge object graph does NOT get serialized and sent over the wire just to display a small set of fields. For example, a developer comes along and makes a collection on the domain model eagerly loaded; this triggers an unexpected performance problem in the GUI that is very hard to anticipate.
– maintainability: preventing the domain model from reaching the UI layer will help ensure that no business logic is created in the presentation layer. It will not completely prevent this problem though, just reduce its likelihood.
All in all I am not 100% convinced for or against DTOs. It’s one of those patterns that just somehow feels funny, like something is missing somewhere and that’s why we need it, maybe at the language level.
We all have our view on the best way to do things, and there is no true solution that always works.
For a very basic application, maybe one producing reports against a very stable database structure, embedding SQL into the view is OK. This is what many report libraries embrace.
You may require a little more abstraction and define an access-layer interface that acts as a facade over the different requests that can be made against the database. Yes, you can share some of them or not. There is no absolute; it depends. This interface can do very little work: call the request and wrap the result in a generic structure. In a dynamic language, this is just an object whose members and value types don’t need to be declared but will be discovered at runtime.
If you think that you’ll do extensive processing with these values inside the business layer, you can add type information to them. So you get free checks… but more boilerplate too… After all, you must define the types.
And as a last step you may want to dissociate what your access layer gets, what your business layer keeps and what your view gets… and define as many as 3 different objects (+ their converters) for any query.
You decided that for your tastes and needs, one type of solution was good for you. But it is not the only one… And it is not universal either.
That’s what I said – that I prefer one solution, but in different scenarios different solutions apply. The idea is to point out problems and things to consider so that people don’t choose a “path” without understanding why.
Really interesting post.
I found myself having lots of design questions about DTOs in DDD. I still have some:
– Is it a good practice to keep a “primary key” attribute as a part of the DTO?
– And the most important question that I ever wanted to know: is it OK to define a DTO with collections of other DTOs or is it an aberration? Complex Service layer operations may need some DTO with collections of DTOs. Is it wrong? If so, how can we achieve, for example, registering complex relations between entities without having a huge list of methods to make those associations?
Thanks a lot!!
– yes – it’s usually the autoincrement ID of the entity, and it’s often needed outside the service layer (ids of the target entities)
– yes, it’s ok. Mostly, though, that’s used when getting data out of the service layer. When passing a DTO as parameter, it’s usually a better option to have just List
Thanks for your answer, Bozho! I will now publish a link to this post in my website. It’s a really interesting article on DTOs, a part that is sometimes obscured when it comes to crossing the limits of the service layer because it’s often not deeply explained.
I hardly ever leave a reply, but after reading through a large number of comments on this page of Bozho’s tech blog, On DTOs, I have a few questions for you, if that is all right. Is it just me, or do some of the replies come across as if they were left by brain-dead visitors? :-P And, if you are writing on other social sites, I would like to keep up with anything new you post. Could you make a list of all your social networking pages, such as your Facebook page, Twitter feed, or LinkedIn profile?
Hi Bozho
“Don’t use the mappers/entity-to-dto constructors in controllers, use them in the service layer. ”
From my point of view, if the service layer is your business layer, then the objects that compose the API of this layer are de facto your domain business objects. If you call them “DTOs” it means that these DTOs are your business objects, which does not make much sense in architecture terms. I agree though that if the Hibernate entities are your domain business objects then you might have issues with sessions and so on. So although your approach seems practical in this technology stack, it seems it does not follow clean architecture guidelines.
Since this is already an 8-year-old topic, has your opinion on this evolved over the years?
Best,
Angel
@Angel Gruev well, the point of that is to not leak entities into the controller layer. If your business objects are rich domain objects, then the controller interacts with them, but it can again be the domain object that emits a DTO of itself. I don’t see that as a violation. In that case, however, you’d do the conversion inside the controller (e.g. businessObject.toSomeDto()).
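A rough illustration of that last idea, a rich domain object that emits a DTO of itself. Order, OrderSummaryDto and toSummaryDto are hypothetical names invented for this sketch, standing in for businessObject.toSomeDto().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rich domain object: it owns its state and its business rules.
class Order {
    private final List<Integer> linePricesInCents = new ArrayList<Integer>();

    void addLine(int priceInCents) {
        if (priceInCents < 0) {
            throw new IllegalArgumentException("price must not be negative");
        }
        linePricesInCents.add(priceInCents);
    }

    int totalInCents() {
        int total = 0;
        for (int price : linePricesInCents) {
            total += price;
        }
        return total;
    }

    // The domain object emits a state-only snapshot of itself.
    OrderSummaryDto toSummaryDto() {
        return new OrderSummaryDto(linePricesInCents.size(), totalInCents());
    }
}

// Hypothetical DTO: immutable state, no behaviour.
class OrderSummaryDto {
    final int lineCount;
    final int totalInCents;

    OrderSummaryDto(int lineCount, int totalInCents) {
        this.lineCount = lineCount;
        this.totalInCents = totalInCents;
    }
}
```

A controller would then call order.toSummaryDto() and hand only the DTO to the view.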