Question

I am in a situation right now where a solution has been proposed that uses a CSV file. The proposed file structure essentially contains three atomic values:

  • ID
  • Thing 1
  • Thing 2

Fine. As we were discussing this solution, someone mentioned that some customers might use our ID while other customers might use their own, different ID. Development's suggestion was to simply add a fourth field and have the customer fill in one or the other, i.e.:

  • Internal ID
  • External ID
  • Thing 1
  • Thing 2

In this way, the system can account for either situation. The response to this suggestion is what is troubling me. Our implementation team came back and said to stick with the original three-field file, and to have development add an internal designator elsewhere in the system that consumes the CSV file. That designator would tell the system to treat the ID field as either internal or external, depending on which customer submitted the file. The argument was that having two ID fields is confusing to the customer and could lead to problems.

Alright - finally my question. I feel rather strongly that we should have four fields, but I cannot find any basic software premise to back up my insistence. The CSV file is essentially a table, so I keep looking to DB normalization rules for an answer (1NF keeps coming to mind), but I don't think that's quite right either.

What rule is being broken by using a field/variable for multiple purposes? This has got to be in a few basic coding books, white papers, lists of do's and don'ts, right? Anyone have anything I might be able to point to?

Thanks so much in advance!

Marshall

Solution

Well, a single field shouldn't do two different things, but in your case, it's not a single field. The id field from one customer's file is a separate field from the id field from another customer's file. They just happen to share a name at the moment. If you were mixing internal and external ids in a single column in a single file, that's when you would have a problem.
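As a rough sketch of that idea, the consuming system could keep one setting per customer and tag every row of a file with it. The customer names, the `CUSTOMER_ID_KIND` table, and the `read_rows_by_customer` function are all made up for illustration; only the three column names come from the question.

```python
import csv
import io

# Hypothetical per-customer setting: which kind of ID each customer submits.
CUSTOMER_ID_KIND = {
    "acme": "internal",
    "globex": "external",
}


def read_rows_by_customer(customer, csv_text):
    """Parse a three-column file and tag each row with the customer's ID kind."""
    id_kind = CUSTOMER_ID_KIND[customer]  # one kind for the whole file
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "id_kind": id_kind,
            "id": row["ID"],
            "thing_1": row["Thing 1"],
            "thing_2": row["Thing 2"],
        }
        for row in reader
    ]
```

Because the kind is looked up once per file, the ID column never mixes meanings within a single file.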

If you really want to be able to decode a file without any external flag about which kind of id it uses, I would change the header name to say so, like:

  • <your company name> id
  • Thing 1
  • Thing 2

and

  • <customer name> id
  • Thing 1
  • Thing 2

This makes it clear which id is in use for the entire column, and also avoids the confusion that your "external" id is your customer's "internal" id. I would only put both ids in one file if a customer wants to supply both.
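Here is a minimal sketch of how a reader could pick the id kind from the header alone. The header `OurCo id` and the function name are placeholders I made up, and for simplicity it assumes exactly one id column and rejects anything else, so the both-ids case would need extra handling.

```python
import csv
import io

# Hypothetical header for your own ids; substitute your real company name.
INTERNAL_HEADER = "OurCo id"


def read_rows_by_header(csv_text):
    """Decide the ID kind from the header name instead of an external flag."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    id_headers = [h for h in headers if h.lower().endswith(" id")]
    if len(id_headers) != 1:
        # Assumption of this sketch: exactly one id column per file.
        raise ValueError(f"expected exactly one id column, found {id_headers}")
    id_header = id_headers[0]
    id_kind = "internal" if id_header == INTERNAL_HEADER else "external"
    return [
        {
            "id_kind": id_kind,
            "id": row[id_header],
            "thing_1": row["Thing 1"],
            "thing_2": row["Thing 2"],
        }
        for row in reader
    ]
```

Any id header other than your own is treated as the customer's, so the file describes itself and no per-customer setting is needed.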

Licensed under: CC-BY-SA with attribution