We Don’t Need That Documentation

July 1, 2013

“We must write more documentation”. Have you heard that? I have, many times in many companies. Most people feel guilty for not writing documentation and agree. I don’t.

There are two types of documentation – in-code and external. In-code documentation consists of javadoc (or whatever language/tool you are using to describe what classes and their methods do) and comments. External documentation consists of documents or intranet articles describing what the product does.

Rule number one of external documentation: it goes out of date. And keeping it in sync is a tedious and time-consuming task.
Rule number two of external documentation: nobody actually uses it.

Programmers work with code. They are good at reading code, and code makes perfect sense to them. Documenting the core classes and any non-trivial use-case within a method is part of coding best practices, and good programmers do that. It makes understanding the code much easier for other people, and together with self-documenting code it is entirely sufficient documentation. And it doesn’t need keeping in sync (well, apart from changing a comment above a snippet when the snippet changes).
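To show what I mean by documenting a non-trivial case inside a method, here is a minimal sketch. The class, the method and the rounding rule are invented for this example and are not taken from any real project:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/**
 * Calculates the amount due on an invoice line.
 * (Hypothetical class, invented for this illustration.)
 */
class InvoiceLineCalculator {

    /**
     * Returns the net price multiplied by the quantity, plus tax.
     *
     * @param netPrice price of a single unit, without tax
     * @param quantity number of units
     * @param taxRate  tax rate as a fraction, e.g. 0.20 for 20%
     * @return the gross amount, rounded to two decimal places
     */
    static BigDecimal grossAmount(BigDecimal netPrice, int quantity, BigDecimal taxRate) {
        BigDecimal net = netPrice.multiply(BigDecimal.valueOf(quantity));
        BigDecimal gross = net.add(net.multiply(taxRate));
        // Round only once, at the end: rounding the tax on its own first
        // can leave the total a cent away from the unrounded result.
        return gross.setScale(2, RoundingMode.HALF_UP);
    }
}
```

The Javadoc states what the method promises, and the inline comment explains the one decision a reader could not guess from the code itself. Both live next to the code they describe, so they are changed together with it.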

So do you need external documentation? Well, yes, some of it. It is important to have a very succinct description of the overall architecture, so that the pieces of code make sense in a broader context. What the modules are and how they interact – that’s enough, one page. It will also go out of sync, but at least it’s easy to maintain, and it can be the duty of the team lead / architect to verify its validity every now and then.

And now, if you are a QA or a manager, you might say that you don’t understand the code and you need the documentation in order to do your work. I might sound a bit rude here, but if you need something, do it yourself. Developers are not technical writers, and keeping external documentation is not something they enjoy or need. And if you don’t want to do the documentation, but you still insist you need to know what’s happening, then just ask the developers. They’ll be happy to read the code and explain the behaviour. And what if there’s a mismatch between the behaviour of the code and the expected behaviour? Then check the issue tracking system and see what has been requested. You don’t need external documents describing what the code does.

Of course, software that is to be used by end users would need some sort of user manual in addition to the in-code documentation, but that can be kept to a minimum as well and should be written by other people, if possible.

So, next time someone insists on programmers writing documents describing what the code is doing, do not agree with them. Insist that it’s a waste of effort and that code and comments are sufficient. If they aren’t, then you need better programming standards and habits, not documentation.


12 Responses to “We Don’t Need That Documentation”

  1. In general I am against comment-type documentation as well. Comments can rot the same way external documentation can rot. Ideally method names, types and other language constructs would be the documentation. Of course in practice things are rarely ideal and then we do need comments but we need to recognize that the need for comments is a symptom of bad code/architecture and not the right way to do things.

  2. True that, comments are needed only for very specific, convoluted use-cases that would be harder to understand by simply being expressed in code.

  3. I think some comments are essential. Programs are almost never about programming. So there are always foreign concepts that cannot be conveniently represented by simple code. Example:

    class PartialItemType102 { … }

    This name may be the best for this class, but only for those that already know wtf a partial item of type 102 is. A comment like:

    /* Represents a partial item of type 102. */

    Is not useful. But something like

    /* A partial item of type 102 is a partial item (one that will be completed sometime in the future) that cannot be sold to people, only companies, and has rules described on federal law 198/1994. */

    Is a lot more useful.

  4. I agree with you to some extent. What you are saying is true only for simple CRUD websites. I strongly disagree when it comes to very complex code that has a lot of context embedded into it. I will give you a prime example – SpiderMonkey is the JavaScript engine in Firefox. Trying to pick that code up without any documentation and comments is extremely challenging. Reading the code and understanding what it does is never the difficult part. But understanding why the code does a lot of the stuff is extremely difficult. A lot of the context is outside the code itself. For example, SpiderMonkey has weird branches which are there in order to optimize special conditions and edge cases, so that specific JavaScript statements would run faster in the browser. However, one wouldn’t be able to learn that by just reading what the code does. So yes, you are right for 90% of the simple stuff that most people write; however, if one writes code where a lot of architectural decisions were made through meetings, discussions and background knowledge, picking that up from the code is almost impossible. You will learn what the code does, but you will have no idea why it does it this way, and you might introduce bugs by making modifications without fully understanding the reasoning behind the decisions.

  5. I think this is the first time I disagree with one of your posts.

    In a general context, more documentation is always better.

    The process of writing documentation actually serves a purpose – it can clarify the business processes and interactions, and highlight problems.

    The documentation allows non-programmers to comprehend functionality without looking at the code. And there can be many purposes to the documentation that programmers are not aware of.

    It doesn’t take long to write, it helps me understand what I have done and it helps others. I enjoy writing and I don’t understand why so many people dislike it.

  6. Indeed the biggest problem is that the documentation is rarely up-to-date. This may work if a project is frozen. But for a moving target it will be hard. I wrote a similar post about comments some while ago: http://javarizon.wordpress.com/2011/12/03/code-comments-gone-wrong/

  7. I think it’s the wrong question to ask whether we need to write external documentation.

    The right one is: what should be put into external documentation and what into source code comments?

    The major problem I have met with source code comments is that sometimes they just translate the source code into a natural language (such as English). They should focus on the “business meaning” instead.

    And external documents should be about high level infrastructure and business structure.

  8. Agreed. I’ve linked another article of mine about what should be contained in comments, and obviously, translating the code is not desired.

  9. Your linked article sums up my opinion about comments to the dot. Very good one. I didn’t put this comment there, because well… it’s 2 years old :-)

  10. I can see your point, but perhaps it is too radical and can only work on a small scale. Big systems are usually developed over years, people join and leave the project team and the code base continuously grows in size and complexity. Documentation can actually prevent the chaos which these factors often wreak on development. Not only does it make it easier for new developers to start working on the system, but it also helps maintenance and usage. Last, but not least, it cuts the dependency on the actual developers who wrote the system, so they can move on to working on other projects or for another company without damaging further system development and maintenance.

    You have actually touched on the main issue – documentation goes out of sync really quickly. While this is true, the solution is not to get rid of it. It has to be managed carefully, because it is indeed easy to slip into a loop where developers waste too much time writing and correcting documentation. My impression, based on the majority of the developers I have worked with, is that most people cannot really distinguish what is useful to document, or how to write documentation in a way that it will stay in sync for the maximum amount of time and will be easy to manage. It is quite common to read unhelpful documentation which is indeed badly out of sync, isn’t it? So, how do we counter that?

    Here’s a list of things I think should always be documented in a project, and I’ve paired them with the questions they must answer, in order to avoid introducing ambiguity:

    1. Development environment – how do I set up a developer machine in order to contribute to the project?
    2. Deployment details – how do I run it?
    3. Configuration details – how do I configure it to run properly; what are the available options?
    3. Usage guide – how can I use this awesome system to achieve a desired result/goal?
    4. Architecture design – why the heck have those decisions been made?
    5. Active environments – what is running where and what is the purpose that it serves?
    6. Component description – what does this component do, where is the code located, how does it fit in the big picture and how to use it?

    If you can bear with me, I’ll now go into more details for each point.

    1. Development environment
    This is very important for developers: setting up your environment must be easy, so that you can start working on the essentials as quickly as possible. Yes, you do this once, but it is also true that your motivation quickly starts to sink down the toilet if it takes you 2 days to set up your machine. So, this part of the documentation should contain all the relevant information one would need in order to get going – code base location, brief build instructions, all the 3rd party tools which are used (e.g. webapp containers, build tools, databases AND how to configure these in case the defaults are not used). Not all people working on a project are guaranteed to be highly skilled or to have vast amounts of experience with the tool set, so think for them. Also think for yourself – you want them to start quickly and you also want to minimize the time you spend answering the same questions over and over again. In such a case, you should consider taking the time to write the documentation down. These things don’t change so frequently, hence the maintenance cost is kept low.

    2. Deployment details
    Now that you have set up your machine and know how to build the system, it is time to deploy it and run it for the first time. Distributed systems are not a single package that runs standalone, and there usually are many dependencies, such as a database filled with some data according to a schema (including “NoSQL” and “triple” stores), external services and particular containers, just to name a few. Deployment can be both quite simple and pretty complex at the same time. It depends on the point of view and on the person who’s trying to deploy it, but since we’re not all equal, documentation really saves time and serves as a good reference. For example, imagine that you have successfully released a project and a bug is reported a year after it has been deployed to production. Can you swear that you can remember all the details – which webapp container and what version was used, how to set up the database, etc.? In the perfect case, you will have all these things automated and the documentation will be very short. You will have scripts to initialize the database, you will have everything packaged as RPM/deb packages, you will have a single button to click which triggers the whole deployment on a desired environment. And if you, or someone else (like me), is using Windows, you’ll write down for them where to find these scripts and where in the code/scripts to look to see how to deploy it manually on their local machine. That’s why many companies invest a lot in infrastructure, because this is not their focus and it makes it simple to do and document, but I digress…

    3. Configuration details
    I’d try to make the whole system run with reasonable defaults, but sometimes this is not possible. So, the least you should provide in the configuration details part of the documentation is a list of all properties files, their location and what their impact on the system is. In order to reduce the risk of going out of sync and the maintenance costs, you can keep the comments inside the properties files, which is a well-adopted practice commonly seen in open source projects. The Maven plug-ins’ documentation is a great example of simplicity and clarity, and on top of that it is automatically generated, so developers are able to construct it directly in the code and POMs and thus improve accuracy and save time.
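
    As a rough sketch of what I mean by keeping the comments inside the properties files: the keys, the values and the `CommentedConfig` helper below are all made up for illustration, and in a real project the text would live in a properties file on disk rather than in a string.

    ```java
    import java.io.IOException;
    import java.io.StringReader;
    import java.io.UncheckedIOException;
    import java.util.Properties;

    class CommentedConfig {

        // Stand-in for the contents of a properties file (hypothetical keys).
        // Each option is explained right where it is set.
        static final String EXAMPLE = String.join("\n",
            "# Maximum number of pooled database connections.",
            "# Raise it only if the database server can handle the extra load.",
            "db.pool.size=20",
            "",
            "# Cache time-to-live in seconds; 0 disables caching.",
            "cache.ttl.seconds=300");

        // Parses properties text; lines starting with '#' are skipped as comments.
        static Properties load(String text) {
            Properties props = new Properties();
            try {
                props.load(new StringReader(text));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return props;
        }

        // Reads an integer option, falling back to a default when the key is absent.
        static int intProperty(Properties props, String key, int defaultValue) {
            return Integer.parseInt(props.getProperty(key, String.valueOf(defaultValue)));
        }
    }
    ```

    Whoever changes a value sees its explanation in the same place, and the external document only needs to say where the file is.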

    4. Usage guide
    Now that the system is up and running, what do I do with it? Obviously, a usage guide is aimed more at clients of the system, but developers sometimes also need a reference. Think of some sort of RESTful API which one part of the system exposes and other parts or external clients need to use. What the available services are and how to communicate with them should be well documented. A good user guide is also a necessity when your system exposes some kind of syntax, for instance.

    5. Active environments
    What is running and where. Are there test servers which can be used, or is the project in production? Are there public demo servers? What’s the status of these servers? What version of the software are they running? In the software business things tend to crash very often. When something crashes you’d want to be able to react as quickly as possible, and you don’t want dependencies on particular people who might be long gone from the company, or on holiday, on sick leave, dead or whatever. So, you’ll want this information written down. This is very powerful especially when you’re in an agile team and are working in some kind of DevOps style, because every member of the team can react immediately (or after being awakened in the middle of the night).

    6. Component description
    This is very flexible and varies from project to project, but a short description is always a must. What are the main functions of this component and how does it fit in the system? You should also provide a list of features and a public API description, so that people can work with it. Let’s try to imagine that you’re developing a cool library which another team in the company or an external client wants to use. You won’t tell them – “OK, here’s the dependency, have a good time looking at the code trying to figure out how to use it”. So, the point is again to keep it as minimal as possible, but not at the expense of fundamental information. A unified, flexible structure for the documentation helps a lot.

    To sum up, documentation goes out of sync fast and is sometimes hard to maintain. We should do our best to keep it brief, but not at the expense of clarity or skipping the basics. Parts of it could be automatically generated by the nightly builds of the code base, other parts could be left in the code base with the documentation helpfully pointing to them.

    P.S. Let me have your comments, I may refine this and try to write it neatly in a blog entry, since it became a much longer answer than I initially intended to give.

  11. The post touches a raw nerve.

    I find the idea that one page of documentation is sufficient simplistic, for a number of reasons (I wrote a post – just click on the link which goes with my name). I also agree with points raised by other posters (sb, Stefan Enev).

    From my point of view the documentation is all about shortening the time needed by new developers to acquire the knowledge necessary to join and contribute productively to a project. Even without the documentation a skilled developer will be able to learn from the code and the test cases – but it will take time. If the documentation cuts this time, then it is useful, as it saves costs. If it doesn’t, then it isn’t.

    At the same time, producing documentation costs time. So producing the documentation makes economic sense if the costs it saves are at least equal to the costs it takes. This is, essentially, the equation that has to be solved by the project manager when deciding the level of detail for the documentation.

    And, yes, unless there is a dedicated technical writer (which I have never experienced in my work), the documentation has to be produced by the developers themselves.

  12. A short answer to Stefan’s comment:

    1 & 3 are sort-of the same thing, and they can be a very succinct document, or even better – they can be a versioned HowTo text file with a few steps.

    6 should ideally be javadoc – which is easier to keep in sync.

    2 & 5 are for the Ops people. Only if the DevOps “approach” is used (or if there are simply no Ops people) is it the developers’ responsibility.

    4 will partly be covered by javadoc, and partly by the overall architecture document that I mentioned is essential.

    To sum it up – I agree that _some_ documentation is indeed required, but it should be way less than generally requested (I’ve participated in projects where changing 20 lines of code required a 30-page document).
