Anemic Objects Are OK

I thought for a while that object-oriented purism had died off. But it hasn’t – every now and then there’s an article that tries to tell us how evil setters and getters are, how bad (Java) annotations are, how horrible and anti-object-oriented the anemic data model is (when functionality-only services act upon data-only objects), and eventually how dependency injection is ruining software.

Several years ago I tried to counter these arguments, saying that setters and getters are not evil per se, and that the anemic data model is mostly fine, but I believe I was worse at writing then, so maybe I didn’t get to the core of the problem.

This summer we had a short twitter discussion with Yegor Bugayenko and Vlad Mihalcea on the matter and a few arguments surfaced. I’ll try to summarize them:

  • Our metaphors are often wrong. An actual book doesn’t know how to print itself. Its full contents are given to a printer, which knows how to print a book. Therefore it doesn’t make sense to put logic for printing (to JSON/XML), or persisting to a database in the Book class. It belongs elsewhere.
  • The advice to use (embedded) printers instead of getters is impractical even if a Book should know how to print itself – how do you transform your objects to other formats (JSON, XML, database rows, etc.)? With Jackson, JAXB, or an ORM you simply add a few annotations, if any at all, and it works. With “printers” you have to manually implement the serialization logic. Even with Xembly you still have to write a tedious, potentially huge method full of add()’s and up()’s. And when you add or remove a field, change a field definition, or add a new serialization format, it gets way more tedious to support. Another approach mentioned in the twitter thread is having separate subclasses for each format/database. And an example can be seen here. I really don’t find that easy to read or support. And even if that’s adopted in a project I’m working on, I’d be the first to replace that manual adding with reflection, however impure that may be. (Even Uncle Bob’s Fitnesse project has getters or even public fields where that makes sense in terms of the state space.)
  • Having too much logic/behaviour in an object may be seen as breaking the Single responsibility principle. In fact, this article argues that the anemic approach is actually SOLID, unlike the rich business object approach. The SRP may actually be understood in multiple ways, but I’ll get to that below.
  • Dependency injection containers are fine. The blunt example of how the code looks without them is here. No amount of theoretical object-oriented programming talk can make me write that piece of code. I guess one can get used to it, but (excuse my appeal to emotion fallacy here) – it feels bad. And when you consider the case of dependency injection containers – whether you invoke a constructor from a main method, or your main starts a context that performs automatic constructor (or setter) injection makes no real difference – your objects are still composed of their dependencies, and their dependencies are set externally. Except the latter is more practical, and after a few weeks of nested instantiation you’ll feel inclined to write your own semi-automated mechanism to do that.
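
To make the last point concrete, here is a minimal sketch of hand-written constructor injection. All type names (BookRepository, InMemoryBookRepository, BookService) are hypothetical and not taken from any project mentioned above. The service receives its dependency from outside either way; a container would only automate the two lines in main.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only.
interface BookRepository {
    void save(String title);
    List<String> findAll();
}

final class InMemoryBookRepository implements BookRepository {
    private final List<String> titles = new ArrayList<>();

    @Override
    public void save(String title) {
        titles.add(title);
    }

    @Override
    public List<String> findAll() {
        // Return a copy so callers cannot modify the internal list.
        return new ArrayList<>(titles);
    }
}

final class BookService {
    private final BookRepository repository;

    // The dependency is supplied externally; the service never instantiates it.
    BookService(BookRepository repository) {
        this.repository = repository;
    }

    void register(String title) {
        repository.save(title.trim());
    }

    int count() {
        return repository.findAll().size();
    }
}

final class ManualWiringDemo {
    public static void main(String[] args) {
        // Hand-written composition: the part a DI container would do for you.
        BookRepository repository = new InMemoryBookRepository();
        BookService service = new BookService(repository);

        service.register("  Refactoring ");
        System.out.println(service.count()); // prints 1
    }
}
```

With two classes this is trivial. The nesting only becomes painful once the object graph grows, which is where the semi-automated mechanism starts to look attractive.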

But these are all arguments derived from a common root – encapsulation. Your side in the above arguments depends on how you view and understand encapsulation. I see the purpose of encapsulation as a way to protect the state space of a class – an object of a given class is only valid if it satisfies certain conditions. If you expose the data via getters and setters, then the state space constraints can be violated – everyone can invalidate your object. For example, if you were able to set the size of an ArrayList without adding the corresponding element to the backing array, you’d break the behaviour of an ArrayList object – it would report its size inconsistently and the code that depends on the List contract would not always work.
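
The ArrayList point can be sketched with a much-simplified, hypothetical IntList (this is not how java.util.ArrayList is actually written). The size field and the backing array only ever change together inside add(), and there is deliberately no setter for size.

```java
import java.util.Arrays;

// Hypothetical, simplified list of ints used only to illustrate an invariant.
final class IntList {
    private int[] elements = new int[4];
    private int size = 0; // invariant: 0 <= size <= elements.length

    // The only way to grow the list: size and backing array change together.
    void add(int value) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return elements[index];
    }

    int size() {
        return size;
    }

    // Deliberately no setSize(int): it would let callers claim
    // elements that were never added.
}

final class IntListDemo {
    public static void main(String[] args) {
        IntList list = new IntList();
        for (int i = 1; i <= 5; i++) {
            list.add(i * 10);
        }
        System.out.println(list.size()); // prints 5
        System.out.println(list.get(4)); // prints 50
    }
}
```

Here encapsulation earns its keep: a public setter for size would let any caller put the object into a state where size() and get() disagree.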

But in practical terms encapsulation still allows for the distinction between “data objects” and “business objects”. The data object has no constraints on its state – any combination of the values of its fields is permitted. Or in some cases it isn’t, but that is enforced outside the currently running program (e.g. via database constraints when persisting an object with an ORM). In these cases, where the state space is not constrained, encapsulation is useless. And forcing it upon your software blindly results in software that, I believe, is harder to maintain and extend. Or at least you gain nothing – testing isn’t easier that way (you can have a perfectly well tested anemic piece of software), deployment is not impacted, and tracing problems doesn’t seem much different.

I’m even perfectly fine with getting rid of the getters and exposing the state directly (as the aforementioned LogData class from Fitnesse).
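
As a minimal sketch of that style, assuming a hypothetical BookData class (it is not modelled on Fitnesse’s LogData) and a hypothetical BookSummaryService: the data object exposes public fields because no combination of values is invalid, and the behaviour lives in a separate service.

```java
// Hypothetical data-only class: any combination of field values is acceptable,
// so there is no invariant for accessors to guard.
final class BookData {
    public String title;
    public String author;
    public int pageCount;
}

// Functionality-only service that acts on the data object.
final class BookSummaryService {
    String summarize(BookData book) {
        return book.title + " by " + book.author + " (" + book.pageCount + " pages)";
    }
}

final class AnemicDemo {
    public static void main(String[] args) {
        BookData book = new BookData();
        book.title = "Dune";
        book.author = "Frank Herbert";
        book.pageCount = 412; // illustrative value

        System.out.println(new BookSummaryService().summarize(book));
        // prints: Dune by Frank Herbert (412 pages)
    }
}
```

If a constraint ever appears (say, pageCount must be positive), that is the moment to make the field private and guard it, not before.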

And most often, in business applications, websites, and the general type of software out there, most objects don’t need to enforce any constraints on their state. Because their state is just data, used somewhere else, in whatever ways the business needs it to be used. To get back to the single responsibility principle – these data objects have only one reason to change – their … data has changed. The data itself is irrelevant. It will become relevant at a later stage – when it’s fetched via a web service (after it’s serialized to JSON), or after it’s fetched from the database by another part of the application or a completely different system. And that’s important – encapsulation cannot be enforced across several systems that all work with a piece of data.

In the whole debate I haven’t seen a practical argument against the getter/setter/anemic style. The only thing I see is “it’s not OOP” and “it breaks encapsulation”. Well, I think it should be settled now that encapsulation should not always be there. It should be there only when you need it, to protect the state of your object from interference.

So, don’t feel bad about continuing to use your ORMs, DI frameworks and automatic JSON and XML serializers.

7 thoughts on “Anemic Objects Are OK”

  1. Bozhidar, thanks for the analysis, but I disagree with this: “Because their state is just data, used somewhere else, in whatever ways the business needs it to be used”. I don’t like that “somewhere else” part, which literally means that the object loses control over its data and becomes a mere container for something. That’s what leads to poor maintainability. You may find this article interesting too: http://www.yegor256.com/2016/11/21/naked-data.html

  2. Of course, I would not drop the power of JAXB and others, but where it is possible, I think it’s very advisable to avoid getters and setters, mainly because it makes you think about and visualize your objects better, as ‘living organisms’, as Yegor says

    You don’t look at it as if it were a bag of data or configurable object anymore.

    I really came to understand this after starting to practice it and I would love to see a serialization lib that doesn’t rely on get/set prefixes.

  3. Data is consumed “somewhere else”, that’s the nature of it being “data”, rather than “internal state”. Your points are perfectly valid when building a commons/utility library, for example. But equating data with internal state adds unnecessary overhead.

    http://culttt.com/2014/04/30/difference-entities-value-objects/ Even DDD allows for value objects. I don’t think we should scratch that option.

  4. This is interesting, the book metaphor might actually be correct, but people are then implementing the objects wrong. The book doesn’t know how to print itself; in fact, the “Creator” usually sends the “Printer” a PDF (or Postscript?), and this “PDF” contains metadata about how to render it for printing. The PDF is text plus images (content) plus metadata. So arguably data + annotations sent to a renderer (Jackson) isn’t incorrect. Now I have also used the Data Transfer Object (DTO) pattern, and sometimes those can be created directly from said object without an assembler; this is usually to have a wildly different representation (perhaps to represent a difference between a draft and a final presentation).

    That said I think the problem with anemic model objects, vs rich domain objects is a different one. It doesn’t necessarily have anything to do with Dependency Injection or Serialization. I think people are confusing issues, in part because they are using their Domain objects as their DTO’s. That’s not necessarily wrong, but it makes it harder to understand what your domain model is. Also many people only have, or need CRUD, which lends itself to anemic design, and doesn’t need Domain Driven Design.

    We have a large project (400k lines) and we have made the mistake in the past of treating our DTO’s (Local DTO’s mostly) as anything other than anemic, encoding all of our logic outside of entities into services, or even worse the UI. Generally never using any design other than service has business logic (if devs aren’t being lazy), and entities are anemic. DTO’s and interfaces are a 1 to 1 reflection of entities and their services, meaning no abstraction. In short it was a mess.

    Now we’re making DTO’s anemic, so that they only contain presentation data for the view that contains them. Business logic specific to a view is written into a view model, and shared business logic is being distributed between entities, services, and other objects, e.g. we have encoded some complex permission logic into strategy patterns. We are designing the objects to fit the problems and the needs of the consumers, as well as ensuring testability. Services are starting to gain, in some cases, more generic interfaces so they can be substituted for one another, thus taking advantage of polymorphism.

    Fundamentally what I’m saying is, if you’re dealing with CRUD many of the problems that call for separation aren’t as big of a concern and are overkill. That said, beware of thinking that you’re just doing CRUD, try to recognize when you need multiple representations, or permission logic more simple than, is owner? can read/write is shared? can read. If you have a complex model, look to DDD and patterns designed to handle complexity. Don’t make things more complex than they need to be.

    sorry for the book, I mostly agree with some disagreements.

  5. I agree with you. I think that in most cases, runtime data and behavior should be put in separate places. If we make our data “objects” immutable, then we mostly fix the problem of having them vulnerable to illegal changes.

    The code base becomes a set of mostly-stateless “function-objects” that act on immutable “data-objects”.

    This is basically like functional programming. And there is nothing wrong in using a language designed for OOP to do functional programming. This is basically how I write code.

    I disagree with you though on the usage of DI containers. I think that there is a serious problem with using a DI container (although I love DI). They basically limit composability. See this article here for more details: http://criticalsoftwareblog.com/index.php/2015/08/23/why-di-containers-fail-with-complex-object-graphs/

    I compose my “objects” in the composition root manually (via Pure DI). See this article here for more details: http://criticalsoftwareblog.com/index.php/2016/05/26/clean-composition-roots-with-pure-di/

  6. Objects come in different flavors. You can have anemic Value objects like DTOs, or entity objects whose goal is Persistence.

    You can have Service objects that are meant to define transaction boundaries and encapsulate business logic. You can have Rest Controllers which have the responsibility of service requests and generating responses.

    With this in mind, I don’t mind if some Objects are Anemic as long as they adhere to the Single Responsibility Principle, and they are thoroughly tested.
