Methodology for exploring APIs in dynamic languages

https://softwareengineering.stackexchange.com/questions/296454

10-10-2020

Question

As a regular user of Standard ML and, to a lesser extent, Haskell, the following pattern is deeply ingrained into my "instinctive" approach to navigating and learning new APIs:

1. Understand the types.
2. Derive and use free theorems.
3. Hoogle and Djinn are my friends.
4. Only if and when I'm not getting any more information from 1, 2 and 3, use test cases (either handcrafted or generated by a QuickCheck-like tool).

Obviously, this pattern doesn't carry over to dynamic languages, so I need a different methodology. The Python REPL has the function help(), which provides documentation for a function, provided I already know its name. The Lisp REPL is somewhat smarter: in addition to the describe function (which is analogous to Python's help()), it also provides the apropos function, which lists all functions whose names contain a given string. But what if I'm confronted with a totally new library, which I must learn starting from the basics? How do I determine what has to be learned first? Even if "start with the types" doesn't work, there must be something else that does. What is it?
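
For comparison, a rough Python analogue of apropos can be sketched with dir(). This is only an illustration: the helper name apropos is made up here, it searches a single already-imported module rather than everything that is loaded, and the json module is just a convenient target.

```python
import json


def apropos(module, fragment):
    """Return the sorted public names in `module` that contain `fragment`."""
    fragment = fragment.lower()
    return sorted(
        name
        for name in dir(module)
        # Names with a leading underscore are treated as private and skipped.
        if not name.startswith("_") and fragment in name.lower()
    )


# List everything in the json module whose name mentions "dump".
print(apropos(json, "dump"))
```

Each name it returns can then be passed to help() in the usual way.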

Solution

Read the documentation. When you have no compiler to do the “understanding” for you, you'll have to put in that effort yourself.

Ideally, a project has various kinds of documentation:

example code snippets
introductions, tutorials, how-tos
reference documentation

I find a bit of example code extremely helpful to get a cursory overview of a module's capabilities. Here is an example from a Perl module I recently looked at:

SYNOPSIS

use Term::ProgressBar;

$progress = Term::ProgressBar->new ({count => $count});
$progress->update ($so_far);

This is not a complete program, but it illustrates the most important facets of its API. Given such a synopsis, I can often make a quick judgement whether that code will be useful for me. If so, I continue reading with overview-level or tutorial-style documentation.

The overview and tutorial give me a broad picture of the parts of the system and introduce the most commonly used ones. For small and simple APIs, a tutorial might not even be necessary, as you can simply look at the example code and skim the whole reference documentation in a couple of minutes. Once I have an overview, I can often intuit where some functionality I want to use might be implemented, e.g. in which class or which namespace.

When I then try to use that module, the reference documentation becomes more important for me. Ideally, this documentation has an entry for each type/class/function/method/variable/symbol that is part of the API. Each entry should describe the item's meaning in the context of its system, and clearly describe its type. For example, the entry for a class should describe all its methods, its invariants, its initialization, and so on. The entry for a function or method should describe the meaning of each parameter and the accepted types, what the function will do with that data, what it will return, and whether it will throw an exception. Here is an example from a project I am currently working on:

next_token
my ($type, $value) = $self->next_token;
Retrieves the next token from the input, applying special commands as needed.

This method is the main parsing driver. It implements various special syntax and delegates to command_*() functions when a command is encountered. Note that whitespace and comments are directly written to the currently selected buffer rather than returning them as a token.

Returns a two-value list: $type is the token type ID of the found token. This should be only used for comparisons with the token type constants. To get the name of the token type, use $self->TOKEN_NAMES->{$type}. $value is the arbitrary value of the token, which in general would be a string.

Throws on illegal syntax.

Stability: The interface of this method is not expected to change, though new behaviour might be added in a backwards-compatible manner.

Example:
my ($type, $value) = $self->next_token;
if ($type != $self->IDENT) {  # assumes the token type constants are methods on $self
    die "expected IDENT";
}

The types are still there, though they are obscured by the plain-text description. I personally find that satisfactory, but the corresponding type signature for a pure-functional version would be something like

next_token :: ParserState -> Try (ParserState, TokenId, String)
-- where type Try a = Either Exception a
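
The same idea can be sketched in a dynamic language such as Python, where the "types" live in a docstring and optional annotations and can be read back at the REPL. The function next_word below is a made-up stand-in for illustration, not part of the project described above.

```python
import inspect
from typing import Tuple


def next_word(text: str, start: int = 0) -> Tuple[str, int]:
    """Return the next whitespace-delimited word and the position after it.

    Raises ValueError if only whitespace remains from `start` onwards.
    """
    end = len(text)
    # Skip leading whitespace.
    while start < end and text[start].isspace():
        start += 1
    if start == end:
        raise ValueError("no word left in input")
    # Consume characters up to the next whitespace or the end of the text.
    stop = start
    while stop < end and not text[stop].isspace():
        stop += 1
    return text[start:stop], stop


# The signature and docstring can be read back without opening the source.
print(inspect.signature(next_word))
print(inspect.getdoc(next_word))
```

Nothing checks the annotations at run time, so they are documentation in much the same sense as the plain-text description, only in a more uniform place.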

Finally, if the reference documentation is not sufficiently precise, I will look at the source code. Many programmers do not take writing documentation seriously, and will forget to mention basic information, such as what kind of input is accepted, and what will be done on errors.
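
In Python, for instance, the source can often be pulled up straight from the REPL with the standard inspect module. This is a minimal sketch, assuming the object is implemented in Python and its source file is installed; the helper name head_of_source is invented for this example, and json.dumps is just a convenient target.

```python
import inspect
import json


def head_of_source(obj, max_lines=5):
    """Return the first `max_lines` lines of the source code of `obj`."""
    # inspect.getsource raises TypeError for built-in objects and OSError
    # when the source file cannot be found.
    lines = inspect.getsource(obj).splitlines()
    return "\n".join(lines[:max_lines])


# Show where the definition lives on disk, then the start of its source.
print(inspect.getsourcefile(json.dumps))
print(head_of_source(json.dumps))
```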

Obviously, starting with the reference documentation, or starting to use a system without knowing anything about it, will always be confusing. Therefore, built-in help systems such as Python's help(symbol) are not generally sufficient to build an initial understanding of a system. Fortunately, many projects host their documentation on their website so that you can comfortably read their overview docs.

Licensed under: CC-BY-SA with attribution
