Writing comments for some small code with rather large background [duplicate]

https://softwareengineering.stackexchange.com/questions/296867

10-10-2020

Question

So I had to write some code related to splitting Bezier curves into parts. I read through several references and particularly referred to this rather detailed one.

The resulting code, however, is only around 20-30 LOC. But without this kind of background, it would be really difficult for someone to figure out what the code is doing.

Explaining it in detail would require me to write too many comments as the function's explanation.

Putting a link to this document into the comments did not seem like a very nice idea (links might break in the future).

Q. Should I rather write this up as a doc, keep it locally with the project docs, and give a reference to it in the comments?

Q. Is there any other, nicer way in general to comment on a rather complex/large area of work associated with a particular piece of functionality?

P.S. I don't want somebody reading this code later to curse me for what it is, so, you see :p

Solution

Having a big comment section explaining the "whys" and "hows" of a complicated algorithm is a good idea. And it is better to have it close to the code, so that a developer does not need to switch context to read about it (or, even worse, switch back and forth between the algorithm and a document).

Just remember to include a sort of TL;DR at the top of the lengthy comment, for those who only need the idea/outline without the implementation details.

P.S. I was porting a project with such comment blocks a few months ago - they were very helpful.

Other tips

Code is read far more often than it’s written, so carefully written comments are worth their weight in gold.

Distill the relevant details of the article you referenced while writing the code. Include a URL, even if it may be broken in the future; there is always the Internet Archive. And above all, make specific references to the theoretical results you used, such as De Casteljau’s algorithm.

It’s okay for code to be opaque to someone unfamiliar with the domain—in this case, splines and numerical analysis—as long as the reader can find detailed references for learning how to understand the code.

I personally like it if there is a comment with enough information to understand the code. If you store it anywhere else, there is always a chance it will have been lost by the time someone tries to understand the code. Still, put a link there (it might work, or else it may be in the web archive), and if there are any papers, put the title and authors there so they can be found.

But still, for someone who knows the field, it should be enough to read the comment to understand what's going on.

I don't know exactly what you did and had never heard of de Casteljau before, but maybe something like this would be good:

Splitting the curves into parts because [that's more awesome|I like it|faster|...] using de Casteljau's algorithm.
The following differs from the usual de Casteljau's algorithm:
 a. all control points are cats
 b. we calculate using roman numerals
Maybe a reason why this differs.

A detailed description is at http://pomax.github.io/bezierinfo/#decasteljau
Also the paper "Towards the use of cats as curve points" by "Meow et al." is of great use (link to paper maybe)

If you want, more detailed description

This should give enough information to either understand the code or (if one doesn't know the theory behind it) find sources for explanation.
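As a rough illustration of that template applied to real code, here is a minimal Java sketch of splitting a cubic Bezier curve at a parameter t with de Casteljau's algorithm. The class name `BezierSplit`, the method names, and the `{x, y}` array layout are assumptions made for this example; nothing here is taken from the asker's actual code.

```java
/** Hypothetical helper for illustration; names and data layout are assumptions. */
final class BezierSplit {

    /**
     * TL;DR: splits one cubic Bezier curve into two cubic curves that
     * together trace the same path and meet at parameter t.
     *
     * How: de Casteljau's algorithm. Interpolate between neighbouring
     * control points at ratio t, three rounds in a row. The first point
     * of every round forms the left curve; the last point of every round,
     * in reverse order, forms the right curve.
     *
     * Deviation from the textbook version: none.
     *
     * Reference: http://pomax.github.io/bezierinfo/#decasteljau
     *
     * @param p four control points, each {x, y}
     * @param t split parameter, expected in [0, 1]
     * @return {left, right}, each an array of four control points
     */
    static double[][][] split(double[][] p, double t) {
        // Round 1: three points between the four original control points.
        double[] a = lerp(p[0], p[1], t);
        double[] b = lerp(p[1], p[2], t);
        double[] c = lerp(p[2], p[3], t);
        // Round 2: two points between the three points of round 1.
        double[] d = lerp(a, b, t);
        double[] e = lerp(b, c, t);
        // Round 3: the single point that lies on the curve at t.
        double[] f = lerp(d, e, t);
        return new double[][][] {
            { p[0], a, d, f },  // left half
            { f, e, c, p[3] }   // right half
        };
    }

    /** Linear interpolation between u and v at ratio t. */
    static double[] lerp(double[] u, double[] v, double t) {
        return new double[] {
            u[0] + (v[0] - u[0]) * t,
            u[1] + (v[1] - u[1]) * t
        };
    }
}
```

A reader who knows the field can stop after the TL;DR; one who does not has the algorithm's name and a reference to start from.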

The only disadvantages of longer comments are file size and the need to scroll past the comment. Neither is really a problem: file size only matters today for embedded systems and the like, comments do not end up in the binaries, and people who complain about having to scroll past a 20-line comment should get an IDE.

Think about this: If someone has to change parts of that code in a few years, what would cost the company employing him more:

He needs to ignore a large comment while editing the code he understands
He needs to search for information on how this code works, or even rewrite it, because the file describing it in the docs folder got lost and the links are dead

Note: an argument for why documentation should live in the code is made in the section at the bottom of this answer.

I would encourage you to include the details of the algorithm as a comment, and more.

Why this algorithm?

Before explaining the algorithm itself, you must first explain why this algorithm was chosen, rather than an alternative.

The explanation can be as simple as "We will use Bezier Curves, no other alternative was explored as they performed well enough". However, if you did test alternatives, please explain why they were rejected.

The idea here is that a maintainer coming along later may have to explore again (looking for better performance or accuracy, for example). If you already did the work with algorithms X and Y, and the reasons they were rejected still apply, then that maintainer can move straight on to another algorithm rather than repeating your experiments.

Which algorithm?

While this might seem silly, seeing as you said "Bezier Curves", bear in mind that sometimes there are slight variations of algorithms/data structures/... (B-Tree, B+-Tree, B*-Tree for example). Therefore, specifying with as much precision as possible which algorithm was selected, and which source it was pulled from (preferably one available online...), can help align the readers' expectations with reality.

Also, if this is a variant of the general textbook version, make sure it is clearly labelled as such, lest readers wonder why it seems to deviate.

How does this algorithm work?

This really depends on the team you work with. If Bezier Curves are the bread and butter of the team, then a simple one-liner might be sufficient; however, if the code may end up being read/maintained by people to whom this sounds like a surf figure, describing the algorithm sounds best.

Another advantage of describing the algorithm in comments is that it makes it easy to split the description, literate-programming style. That is, you start with a general outline of the algorithm that only identifies the sections, and then each section gets a comment block with the associated code immediately under it. This makes it easier to check that the code is consistent with the comments.

Finally, a last advantage of describing the algorithm in comments is that it makes it possible to annotate the algorithm itself. You may note shortcuts you took (a single round of approximation rather than two is sufficient for your accuracy needs, for example), or on the contrary explain why a given step is necessary (and what the consequences of removing it are); you may even fix the algorithm (if you pointed to a printed version, for example, there might be errata...).
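A minimal Java sketch of that outline-then-sections style, using point evaluation by de Casteljau's algorithm as the example. The class name `BezierEval`, the method `pointAt`, and the `{x, y}` array layout are hypothetical, and the sketch assumes at least one control point is passed in.

```java
/** Hypothetical example class; names and data layout are assumptions. */
final class BezierEval {

    /** Returns the point at parameter t on a Bezier curve of any degree. */
    static double[] pointAt(double[][] controlPoints, double t) {
        /*
         * Outline (de Casteljau's algorithm):
         *   1. Copy the control points into a scratch buffer.
         *   2. Repeatedly replace neighbours by their interpolation at t.
         *   3. The single remaining point lies on the curve.
         */

        // 1. Scratch copy. Annotation: the usual description of the
        //    algorithm does not mention a copy; it is added here so the
        //    caller's array is left untouched.
        int n = controlPoints.length;
        double[][] work = new double[n][];
        for (int i = 0; i < n; i++) {
            work[i] = controlPoints[i].clone();
        }

        // 2. Each round shortens the list of live points by one.
        //    work[i + 1] still holds the previous round's value when
        //    work[i] is updated, so updating in place is safe.
        for (int round = 1; round < n; round++) {
            for (int i = 0; i < n - round; i++) {
                work[i][0] += (work[i + 1][0] - work[i][0]) * t;
                work[i][1] += (work[i + 1][1] - work[i][1]) * t;
            }
        }

        // 3. work[0] is now the point on the curve.
        return work[0];
    }
}
```

Because every numbered step in the outline has a matching comment directly above its code, a reviewer can check the two against each other section by section.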

Where?

Since this is an implementation detail, it should not get in the way of the caller. Implementation details should be documented inside the function (not outside it) and, depending on your documentation generator, written as "pure" comments, since they do not need to appear in the generated documentation.
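In Java, for instance, this could look like the sketch below: the Javadoc comment carries only what the caller needs, while the implementation note is a plain `//` comment, which the Javadoc tool does not pick up. The class `QuadraticBezier` and its method are hypothetical names chosen for illustration.

```java
/** Hypothetical example class; names and data layout are assumptions. */
final class QuadraticBezier {

    /**
     * Returns the point at the middle of the parameter range (t = 0.5).
     *
     * @param p three control points, each {x, y}
     * @return the point on the curve as {x, y}
     */
    static double[] midpoint(double[][] p) {
        // Implementation note (plain comment, so it stays out of the
        // generated Javadoc): at t = 0.5 the two interpolation rounds of
        // de Casteljau's algorithm collapse to the weighted average
        // (P0 + 2*P1 + P2) / 4.
        return new double[] {
            (p[0][0] + 2 * p[1][0] + p[2][0]) / 4,
            (p[0][1] + 2 * p[1][1] + p[2][1]) / 4
        };
    }
}
```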

Why document in the code?

I will argue that code documentation should live as close as possible to the code. The reason is simple:

the bug tracking system might change, and even if the import tool works the IDs may change
the source control system might change
the repository might be re-organized, split into several, etc...
...

Any time such an event occurs, there is a risk that links across (new) boundaries are severed.

The code is the only constant!

Indeed:

without the code, you do not need the documentation
with the code, you get the comments, and thus the documentation within

Documentation is therefore more likely to be available if it is included in the code.

There is a limit, obviously. For example, text files cannot contain images, so graphics/diagrams are hard to include in comments (ASCII art should not be underestimated, and yes, I am serious). Still, text goes a long way, so at the very least you can provide a good explanation in place, and nothing prevents you from also providing a more in-depth explanation elsewhere and linking to it.

It seems like all the other answers are "put it into the comments!"

I'm going to go against the grain and recommend keeping it in a separate document alongside the code instead, for this reason: it looks like your code is doing something that requires complicated math to understand. Math can be hard enough to understand in the best of cases; it can be even harder to understand without proper standard notation, and code comments are not a very easy place to put fancy mathematical notation. ASCII-art can only get you so far.

Instead, I would briefly state the algorithm in the code comments and refer to the document which you create in an appropriate tool that can display the fancy math for details. Make sure this document is placed into the same repository as the code, and store it in a location unlikely to be omitted from any future moves of the code from one repository to another, or from one version control system to another. These things happen!

Where I work, every official document associated with a project gets a part number which can be used to retrieve the document later. If you have similar processes where this application is being developed, consider getting a number assigned to the document, and reference that number in the code comments so there is always a way to track down that document even if it gets lost.

Instead of describing the complex algorithm in the source code, I prefer to add a reference to the ticket number in the bug tracking system. Example:

 /** Split a Bezier into several parts.
  * See #18864 for details */
 public BezierParts[] splittBezier(Bezier aBezier) {
    ...
 }

18864 is the ticket number. The ticket has all the information, including

an HTTP link to the algorithm
an attachment with a PDF copy of the linked page, in case the link disappears.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange