Question

I am in the process of writing a requirements spec, and I have a dilemma in phrasing a piece of the requirements.

Scenario: We download files from a website, and the downloaded files need to be attached to an item in our CM tool. The file names may use ASCII, ISO-8859-1, Japanese characters, etc.

In the phrasing below, does "non-ASCII" cover all situations?

The downloaded file name may contain non-ASCII characters and processing of this shall not crash the application


Solution

The requirement, as stated, is fuzzy to me.

The first question I would have is: how many character encodings need to be supported? Possible interpretations include:

  1. Every encoding ever devised, including single-byte (e.g. ISO-8859-15), multibyte (e.g. Big5, Shift-JIS, HZ), and rare/weird ones (e.g. UTF-7, Punycode, EBCDIC).
  2. That's obviously extreme. How about just the minimum support, namely ISO-8859-1?
  3. Just ISO-8859-1 seems weaselly. How about just supporting modern best practices, namely Unicode as UTF-8?

If you don't specify which encodings you mean, then when an encoding-specific bug occurs, you and the implementor could have a fight and you'd both be right. That is, by definition, the consequence of a fuzzy spec.

Going further, what does the software need to do with the filename, besides not crashing? Should it…

  1. Preserve the filename in its original encoding, byte-for-byte?
  2. Normalize everything to Unicode? If so, does it need to auto-detect the source encoding? By what mechanism?
  3. Store both the Unicode form and the original, just in case the normalization fails?
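To make the third option concrete, here is a minimal Python sketch of a record that keeps the original bytes alongside a Unicode form. The names `StoredFilename` and `store_filename` are hypothetical, and the sketch assumes the source encoding is already known; auto-detection is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoredFilename:
    raw: bytes               # original bytes, kept byte-for-byte (option 1)
    text: Optional[str]      # Unicode form, or None if decoding failed (option 2)
    encoding: Optional[str]  # encoding that produced `text`, if any


def store_filename(raw: bytes, encoding: str) -> StoredFilename:
    """Keep the original bytes and, when possible, a Unicode form (option 3)."""
    try:
        return StoredFilename(raw, raw.decode(encoding), encoding)
    except (UnicodeDecodeError, LookupError):
        # Undecodable bytes or an unknown encoding name: keep the bytes only.
        return StoredFilename(raw, None, None)
```

With this shape, a failed normalization loses nothing, because the raw bytes are still available for a later retry with a different encoding.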

A better version of your requirement would be

The downloader must support filenames in various encodings, including at least ASCII, ISO-8859-1, ISO-8859-15, KOI8-R, UTF-8, Shift-JIS, EUC-JP, GB2312, and Big5. If the web server response specifies an encoding, it must be respected. (If the encoding is unspecified, ISO-8859-1 may be assumed, or a better guess may be made.) Filenames shall be normalized to a Unicode representation in the content management system.

The specific examples of required encodings are essential for devising acceptance criteria. The added sentences state what the software needs to do, beyond not crashing.
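As a rough illustration of how that list turns into acceptance criteria, the Python sketch below pairs each named encoding with a sample filename and checks that a decoder recovers it. Everything here is an assumption for illustration: the function names, the sample filenames, and the choice of NFC as the "Unicode representation" (the requirement text does not name a normalization form). The encoding keys are Python codec names.

```python
import unicodedata
from typing import Callable, List, Optional

# Test vectors: Python codec name -> a filename representable in that encoding.
CASES = {
    "ascii": "report.txt",
    "iso-8859-1": "résumé.txt",
    "iso-8859-15": "prix-€.txt",
    "koi8-r": "отчёт.txt",
    "utf-8": "日本語-résumé.txt",
    "shift_jis": "日本語.txt",
    "euc_jp": "日本語.txt",
    "gb2312": "中文.txt",
    "big5": "中文.txt",
}


def normalize_filename(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode a filename and return it as NFC-normalized Unicode."""
    # Respect the declared encoding; fall back to ISO-8859-1 when none is given.
    encoding = declared or "iso-8859-1"
    text = raw.decode(encoding)  # raises on undecodable bytes or unknown names
    # NFC is an assumption; the requirement only says "a Unicode representation".
    return unicodedata.normalize("NFC", text)


def failing_encodings(decode: Callable[[bytes, Optional[str]], str]) -> List[str]:
    """Return the encodings for which `decode` does not recover the expected name."""
    failures = []
    for encoding, expected in CASES.items():
        raw = expected.encode(encoding)
        try:
            ok = decode(raw, encoding) == expected
        except (UnicodeError, LookupError):
            ok = False
        if not ok:
            failures.append(encoding)
    return failures


print(failing_encodings(normalize_filename))  # an empty list means every case passed
```

A real acceptance test would feed these byte strings through the downloader and the CM tool rather than a local function, but the table of encodings and expected names is the part the requirement makes possible.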

Other tips

The requirement that you've written doesn't have the characteristics of a good requirement. Specifically, it's not cohesive, it's not atomic, and it's not unambiguous. Because of the lack of these characteristics, it's also not easily verifiable.

Your initially stated requirement is:

The downloaded file name may contain non-ASCII characters and processing of this shall not crash the application

I would recommend removing the "...and processing of this shall not crash the application". If you have a requirement that a piece of software needs to do something, I think it's OK to make the assumption that it should do it without crashing the software.

This transforms the requirement into:

The downloaded file name may contain non-ASCII characters

Now, you have a cohesive and atomic requirement. However, I'm not sure that it's unambiguous. In your question, you mention a number of different formats. There are a few options.

Some would recommend a separate and unique requirement for each file name encoding that must be supported. This would best support cohesive, atomic, traceable, unambiguous, and verifiable requirements. It would also make it easier to specify the importance of each requirement: perhaps support for some encodings is more important or needed sooner.

Others may recommend a table of supported formats and this requirement would link to a table. It would be less complete (you have a textual sentence and a table to be maintained), but they would be in the same document or database. However, if you were going to perform linking in a requirements management tool, they could be linked together so that changes to one would highlight the linked requirement. It would also allow the text to flow to other software packages as is, but with a different table for different encodings.

How you document the requirements does depend on your specific needs, though.

There are a couple of issues with your wording that weaken the requirement:

1) You should express the requirement in positive terms, rather than in terms of what it should not do. How does one test for "not crashing"?

2) The phrase "The downloaded file name may contain..." is vague.

A suggested alternative wording (purely subjective, of course) might be:

The application shall support downloaded file names containing non-ASCII characters.

(The word "support" is still a little vague and could be changed to be more concrete when taken in concert with other requirements for your application.)

The problem with the spec as written is that it doesn't say what the application should do with "interesting" filenames. I've encountered one program which would replace any filename characters it didn't understand with _, with the effect that when asked to copy a directory which contained two files whose names were identical except in characters the utility didn't understand, the second file written to the directory would overwrite the first. Such behavior would qualify as "not crashing", but that shouldn't imply that it's acceptable absent an explicit spec saying so.
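The overwrite problem is easy to reproduce. The Python sketch below is not the program mentioned above, just a hypothetical `sanitize` function that treats every non-ASCII character as "not understood" and replaces it with an underscore.

```python
def sanitize(name: str) -> str:
    """Replace every non-ASCII character with '_' (the lossy behaviour described above)."""
    return "".join(ch if ch.isascii() else "_" for ch in name)


names = ["日本.txt", "中文.txt"]

# Simulate writing files into a directory keyed by the sanitized name.
written = {}
for name in names:
    written[sanitize(name)] = name  # the second write silently replaces the first

print(written)
```

Two distinct input names collapse to the same output name, so only one entry survives, and nothing crashed along the way.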

I would suggest that a good spec should affirmatively specify what should happen, or else note what courses of action are acceptable, e.g. "If a file name contains unrecognized characters, the system should generate a new GUID for the overall operation, and generate a filename which combines that GUID, an index number, and any portion of the original filename that can readily be accommodated; it should produce a table mapping the old and new filenames" or "If a file name contains unrecognized characters, the system may form a new name by concatenating the characters it recognizes; if two file names end up becoming identical through such transformation, either one may arbitrarily be declared the 'winner'".
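The first of those two example policies could look roughly like the Python sketch below. It is only one possible reading: `rename_batch` is a hypothetical name, "unrecognized" is assumed to mean non-ASCII, and the returned dictionary plays the role of the mapping table.

```python
import uuid
from typing import Dict, List


def rename_batch(names: List[str]) -> Dict[str, str]:
    """Map each original name to a safe name; the dict is the old-to-new table."""
    operation_id = uuid.uuid4().hex  # one GUID for the overall operation
    mapping = {}
    for index, name in enumerate(names):
        if name.isascii():
            # Nothing unrecognized: keep the name as it is.
            mapping[name] = name
            continue
        # Keep whatever part of the original name can be accommodated.
        kept = "".join(ch for ch in name if ch.isascii())
        mapping[name] = f"{operation_id}-{index}-{kept}"
    return mapping
```

Because the index differs for every file in the batch, two names that differ only in unrecognized characters still end up with different new names.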

Licensed under: CC-BY-SA with attribution