What's the idea behind naming classes with “Info” suffix, for example: “SomeClass” and “SomeClassInfo”?

https://softwareengineering.stackexchange.com/questions/283316

08-10-2020

Question

I'm working on a project that deals with physical devices, and I've been confused as to how to properly name some classes in this project.

Considering the actual devices (sensors and receivers) are one thing, and their representation in software is another, I am thinking about naming some classes with the "Info" suffix name pattern.

For example, while a Sensor would be a class to represent the actual sensor (when it is actually connected to some working device), SensorInfo would be used to represent only the characteristics of such sensor. For example, upon file save, I would serialize a SensorInfo to the file header, instead of serializing a Sensor, which sort of wouldn't even make sense.

But now I am confused, because there is a middle ground in the objects' lifecycle where I cannot decide whether I should use one or the other, how to get one from the other, or even whether both variants should be collapsed into a single class.

Also, the all too common example Employee class obviously is just a representation of the real person, but nobody would suggest to name the class EmployeeInfo instead, as far as I know.

The platform I am working with is .NET, and this naming pattern seems to be common throughout the framework, for example with these classes:

Directory and DirectoryInfo classes;
File and FileInfo classes;
ConnectionInfo class (with no corresponding Connection class);
DeviceInfo class (with no corresponding Device class);

So my question is: is there a common rationale about using this naming pattern? Are there cases where it makes sense to have pairs of names (Thing and ThingInfo) and other cases where there should only exist the ThingInfo class, or the Thing class, without its counterpart?

Solution

I think "info" is a misnomer. Objects have state and actions: "info" is just another name for "state" which is already baked into OOP.

What are you really trying to model here? You need an object that represents the hardware in software so other code can use it.

That is easy to say but as you found out, there is more to it than that. "Representing hardware" is surprisingly broad. An object that does that has several concerns:

Low-level device communication, whether it be talking to the USB interface, a serial port, TCP/IP, or a proprietary connection.
Managing state. Is the device turned on? Ready to talk to software? Busy?
Handling events. The device produced data: now we need to generate events to pass to other classes that are interested.

Certain devices such as sensors will have fewer concerns than say a printer/scanner/fax multifunction device. A sensor likely just produces a bit stream, while a complex device may have complex protocols and interactions.

Anyway, back to your specific question, there are several ways to do this depending on your specific requirements as well as the complexity of the hardware interaction.

Here is an example of how I would design the class hierarchy for a temperature sensor:

ITemperatureSource: interface that represents anything that can produce temperature data: a sensor, a file wrapper, or even hard-coded data (think: mock testing).
Acme4680Sensor: ACME model 4680 sensor (great for detecting when the Roadrunner is nearby). This may implement multiple interfaces: perhaps this sensor detects both temperature and humidity. This object contains program-level state such as "is the sensor connected?" and "what was the last reading?"
Acme4680SensorComm: used solely for communicating with the physical device. It does not maintain much state. It is used for sending and receiving messages. It has a C# method for each of the messages the hardware understands.
HardwareManager: used for getting devices. This is essentially a factory that caches instances: there should only be one instance of a device object for each hardware device. It has to be smart enough to know that if thread A requests the ACME temperature sensor and thread B requests the ACME humidity sensor, these are actually the same object and should be returned to both threads.
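A minimal sketch of the first three pieces, written in Python rather than C# for brevity. Only the class names come from the list above; the wire commands (`TEMP?`, `HUM?`), the `fake_transport` function, and every method name are assumptions made for illustration.

```python
from typing import Protocol


class ITemperatureSource(Protocol):
    """Anything that can produce a temperature reading in degrees Celsius."""

    def read_celsius(self) -> float: ...


class IHumiditySource(Protocol):
    """Anything that can produce a relative humidity reading in percent."""

    def read_humidity_percent(self) -> float: ...


class Acme4680SensorComm:
    """Low-level messaging only: one method per message the device understands.

    The transport is a hypothetical callable that takes a command as bytes
    and returns the raw reply, standing in for USB, serial, or TCP/IP.
    """

    def __init__(self, transport):
        self._transport = transport

    def query_temperature(self) -> bytes:
        return self._transport(b"TEMP?")

    def query_humidity(self) -> bytes:
        return self._transport(b"HUM?")


class Acme4680Sensor:
    """Program-level view of the device; satisfies both source interfaces."""

    def __init__(self, comm: Acme4680SensorComm):
        self._comm = comm
        self.last_temperature = None  # "what was the last reading?"

    def read_celsius(self) -> float:
        # Translate the raw reply into a plain float for callers.
        reply = self._comm.query_temperature()
        self.last_temperature = float(reply.decode("ascii"))
        return self.last_temperature

    def read_humidity_percent(self) -> float:
        return float(self._comm.query_humidity().decode("ascii"))


def fake_transport(command: bytes) -> bytes:
    """Stand-in for real hardware so the sketch runs anywhere."""
    return {b"TEMP?": b"21.5", b"HUM?": b"40.0"}[command]


sensor = Acme4680Sensor(Acme4680SensorComm(fake_transport))
print(sensor.read_celsius())           # 21.5
print(sensor.read_humidity_percent())  # 40.0
```

Callers see only floats; the byte-level protocol stays inside the Comm class.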

At the top level you will have interfaces for each hardware type. They describe actions your C# code would take on the devices, using C# data types (not e.g. byte arrays which the raw device driver might use).

At the same level you have an enumeration class with one instance for each hardware type. Temperature sensor might be one type, humidity sensor another.

One level below this are the actual classes that implement those interfaces: each represents one device, similar to the Acme4680Sensor I described above. Any particular class may implement multiple interfaces if the device can perform multiple functions.

Each device class has its own private Comm (communication) class that handles the low-level task of talking to the hardware.

Outside of the hardware module, the only layer that is visible is the interfaces/enum plus the HardwareManager. The HardwareManager class is the factory abstraction that handles the instantiation of device classes, caching instances (you really do not want two device classes talking to the same hardware device), etc. A class that needs a particular type of sensor asks the HardwareManager for the device matching a particular enum value. The HardwareManager then works out whether that device is already instantiated and, if not, how to create and initialize it.
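The caching factory could look roughly like this in Python. The `HardwareType` enum members, the `get_device` method name, and the type-to-class table are assumptions; the sketch also assumes one physical unit per device class, which a real system would replace with a proper device identifier.

```python
import enum
import threading


class HardwareType(enum.Enum):
    TEMPERATURE_SENSOR = "temperature"
    HUMIDITY_SENSOR = "humidity"


class Acme4680Sensor:
    """Placeholder device class; one physical unit serves both hardware types."""


class HardwareManager:
    """Factory that hands out exactly one device object per physical device."""

    # Hypothetical wiring: which device class serves which hardware type.
    _DEVICE_FOR_TYPE = {
        HardwareType.TEMPERATURE_SENSOR: Acme4680Sensor,
        HardwareType.HUMIDITY_SENSOR: Acme4680Sensor,
    }

    def __init__(self):
        self._lock = threading.Lock()
        self._instances = {}  # device class -> cached instance

    def get_device(self, hardware_type: HardwareType):
        device_class = self._DEVICE_FOR_TYPE[hardware_type]
        with self._lock:  # two threads must not both create the device
            if device_class not in self._instances:
                self._instances[device_class] = device_class()
            return self._instances[device_class]


manager = HardwareManager()
temperature = manager.get_device(HardwareType.TEMPERATURE_SENSOR)
humidity = manager.get_device(HardwareType.HUMIDITY_SENSOR)
print(temperature is humidity)  # True: both requests share one device object
```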

The goal here is to decouple business logic from low-level hardware logic. When you are writing code that prints sensor data to the screen, that code should not care what type of sensor you have. That holds only if this decoupling is in place, and the decoupling centers on those hardware interfaces.
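To illustrate the decoupling, here is a small Python sketch in which the display code depends only on the interface. `FixedTemperatureSource`, `format_reading`, and `read_celsius` are hypothetical names; the fixed source plays the role of the hard-coded data mentioned earlier.

```python
from typing import Protocol


class ITemperatureSource(Protocol):
    def read_celsius(self) -> float: ...


class FixedTemperatureSource:
    """Hard-coded data, useful as a test double."""

    def __init__(self, celsius: float):
        self._celsius = celsius

    def read_celsius(self) -> float:
        return self._celsius


def format_reading(source: ITemperatureSource) -> str:
    """Display logic that knows only the interface, not the sensor model."""
    return f"{source.read_celsius():.1f} C"


print(format_reading(FixedTemperatureSource(21.46)))  # 21.5 C
```

Swapping in a real device class would not require touching `format_reading`.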

[Figure: UML class diagram example showing the design described in this answer]

Note: there are associations between the HardwareManager and each device class that I did not draw because the diagram would have turned into arrow soup.

OTHER TIPS

It may be a little difficult to find a single unifying convention here because these classes are spread out over a number of namespaces (ConnectionInfo seems to be in CrystalDecisions, and DeviceInfo in Microsoft.Reporting.WebForms).

Looking at these examples, though, there seem to be two distinct uses of the suffix:

1. Distinguishing a class providing static methods from a class providing instance methods. This is the case for the System.IO classes, as underlined by their descriptions:

Directory:

Exposes static methods for creating, moving, and enumerating through directories and subdirectories. This class cannot be inherited.

DirectoryInfo:

Exposes instance methods for creating, moving, and enumerating through directories and subdirectories. This class cannot be inherited.

Info seems like a slightly odd choice here, but it does make the difference relatively clear: a Directory class could reasonably either represent a particular directory or provide general directory-related helper methods without holding any state, whereas DirectoryInfo could only really be the former.

2. Emphasising that the class only holds information and does not provide behaviour that might reasonably be expected from the un-suffixed name.

I think that the last part of that sentence might be the piece of the puzzle that distinguishes, say, ConnectionInfo from EmployeeInfo. If I had a class called Connection, I'd reasonably expect it to actually provide me with the functionality that a connection has: I'd be looking for methods like void Open(), etc. However, nobody in their right mind would expect that an Employee class could actually do what a real Employee does, or look for methods like void DoPaperwork() or bool TryDiscreetlyBrowseFacebook().
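The second use can be sketched in Python as follows. Both classes are hypothetical and are not the CrystalDecisions type mentioned above; the host name and port are made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionInfo:
    """Describes a connection; it cannot open or close anything."""

    host: str
    port: int


class Connection:
    """Offers the behaviour a caller expects from a real connection."""

    def __init__(self, info: ConnectionInfo):
        self.info = info
        self.is_open = False

    def open(self) -> None:
        # A real implementation would create a socket here.
        self.is_open = True

    def close(self) -> None:
        self.is_open = False


info = ConnectionInfo(host="db.example.test", port=5432)
connection = Connection(info)
connection.open()
print(connection.is_open)     # True
print(hasattr(info, "open"))  # False: the Info class is data only
connection.close()
```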

In general, an Info object encapsulates information about an object's state at some moment in time. If I ask the system to look at a file and give me a FileInfo object associated with its size, I would expect that object to report the size of the file at the time the request was given (or, to be more precise, the size of the file at some moment between when the call was made and when it returned). If the size of the file changes between the time the request returns and a time the FileInfo object is examined, I would not expect such a change to be reflected in the FileInfo object.

Note that this behavior would be very different from that of a File object. If a request to open a disk file in non-exclusive mode yields a File object which has a Size property, I would expect the value returned thereby to change when the size of the disk file changes, since the File object doesn't merely represent the state of a file--it represents the file itself.
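A Python analogy for the snapshot idea: the result of `Path.stat()` is a record captured at one moment, while a fresh call asks the file system again. This is only an illustration of the distinction; it makes no claim about how .NET's FileInfo behaves internally. The file name is arbitrary.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "log.txt"
    path.write_bytes(b"hello")

    snapshot = path.stat()        # state captured at this moment
    with path.open("ab") as handle:
        handle.write(b" world")   # the file grows after the snapshot

    snapshot_size = snapshot.st_size  # still the old size
    live_size = path.stat().st_size   # asks the file system again

print(snapshot_size, live_size)  # 5 11
```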

In many cases, objects that attach to a resource must be cleaned up when their services are no longer needed. Because *Info objects do not attach to resources, they do not require cleanup. As a consequence, in cases where an Info object will satisfy a client's requirements, it may be better to have code use one than to use an object which would represent the underlying resource, but whose connection to that resource would have to be cleaned up.
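A small Python sketch of the cleanup difference, under the assumption that the `attached` flag stands in for a real port or handle. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorInfo:
    """Plain data: holds no resource, so there is nothing to release."""

    model: str
    serial_number: str


class Sensor:
    """Attached to a (simulated) device, so it must be closed when done."""

    def __init__(self, info: SensorInfo):
        self.info = info
        self.attached = True  # stands in for an open port or handle

    def close(self) -> None:
        self.attached = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        self.close()


info = SensorInfo(model="Acme4680", serial_number="SN-0001")

with Sensor(info) as sensor:  # cleanup is guaranteed on exit
    kept_info = sensor.info   # the Info object can outlive the attachment

print(sensor.attached)  # False
print(kept_info.model)  # Acme4680
```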

"Considering the actual devices (sensors and receivers) are one thing, and their representation in software is another, I am thinking about naming some classes with the "Info" suffix name pattern.

"For example, while a Sensor would be a class to represent the actual sensor (when it is actually connected to some working device), SensorInfo would be used to represent only the characteristics of such sensor. For example, upon file save, I would serialize a SensorInfo to the file header, instead of serializing a Sensor, which sort of wouldn't even make sense."

I don't like this distinction. All objects are "representation[s] in software." That's what the word "object" means.

Now, it might make sense to separate information about a peripheral from the actual code that interfaces with the peripheral. So, for instance, a Sensor has-a SensorInfo, which contains most of the instance variables, along with some methods that don't require hardware, while the Sensor class is responsible for actually interacting with the physical sensor. You don't have-a Sensor unless your computer has a sensor, but you could plausibly have-a SensorInfo.
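The has-a arrangement might be sketched like this in Python. The field names, the JSON header format, and the `COM3` port are assumptions for illustration, not part of the question.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SensorInfo:
    model: str
    serial_number: str
    sample_rate_hz: int

    def to_header(self) -> str:
        """Serialize the characteristics for a file header."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_header(cls, header: str) -> "SensorInfo":
        return cls(**json.loads(header))


class Sensor:
    """Talks to the physical device; has-a SensorInfo describing it."""

    def __init__(self, info: SensorInfo, port: str):
        self.info = info
        self._port = port  # hypothetical hardware address


live = Sensor(SensorInfo("Acme4680", "SN-0001", 50), port="COM3")
header = live.info.to_header()

# Later, possibly on a machine with no sensor attached:
restored = SensorInfo.from_header(header)
print(restored == live.info)  # True
```

Reading the header back needs only SensorInfo, so no Sensor (and no hardware) is involved.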

The trouble is that this kind of design can be generalized to (almost) any class. So you have to be careful. You obviously wouldn't have a SensorInfoInfo class, for example. And if you have a Sensor variable, you may find yourself violating the law of Demeter by interacting with its SensorInfo member. None of this is fatal, of course, but API design isn't just for library authors. If you keep your own API clean and simple, your code will be more maintainable.

Filesystem resources like directories, in my opinion, are very close to this edge. There are some situations in which you want to describe a directory which is not locally accessible, true, but the average developer is probably not in one of those situations. Complicating the class structure in this fashion is, in my opinion, unhelpful. Contrast Python's approach in pathlib: There is a single class which "is most likely what you need" and various auxiliary classes which most developers can safely ignore. If you really need them, however, they provide largely the same interface, just with concrete methods stripped out.
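For reference, a short look at that split in pathlib: pure paths only manipulate the text of a path, while concrete paths add the methods that touch the local file system. The example path is made up.

```python
from pathlib import Path, PurePosixPath

# A pure path can describe a location on another machine or operating
# system, because it never accesses the file system.
remote_log = PurePosixPath("/var/log/app/server.log")
print(remote_log.name)                 # server.log
print(remote_log.parent)               # /var/log/app
print(remote_log.with_suffix(".bak"))  # /var/log/app/server.bak

# Only the concrete class offers file system operations such as exists().
print(hasattr(PurePosixPath, "exists"))  # False
print(hasattr(Path, "exists"))           # True
```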

I would say that context/domain matters, since we have high-level business logic code, low-level models, architecture components, and so on.

'Info', 'Data', 'Manager', 'Object', 'Class', 'Model', 'Controller' etc. can be smelly suffixes, especially on a lower level: every object holds some information or data, so the suffix tells the reader nothing new.

Class names in the business domain should match the way the stakeholders talk about it, even if a term sounds weird or is not 100% correct language.

Good suffixes for data structures are e.g. 'List' and 'Map', and for hinting at patterns 'Decorator' or 'Adapter', if you think it's necessary.

As for your sensor scenario, I would expect a SensorSpec, not a SensorInfo, to store what your sensor is. Info, in my opinion, suggests derived information: a FileInfo holds things like the size, which you don't save yourself, or the file path, which is built from the path and the filename.

Another point:

There are only two hard things in Computer Science: cache invalidation and naming things.

-- Phil Karlton

That always reminds me to think about a name for just a few seconds; if I can't find a good one, I use a weird name and mark it with a 'TODO'. I can always change it later since my IDE provides refactoring support. It's not a company slogan which must be good, but just some code we can change any time we want. Keep that in mind.

ThingInfo can serve as a great read-only Proxy for the Thing.

see http://www.dofactory.com/net/proxy-design-pattern

Proxy: "Provide a surrogate or placeholder for another object to control access to it."

Typically the ThingInfo will have public properties with no setters. These classes and the methods on the class are safe to use and will not commit any changes to the backing data, the object, or any other objects. No state changes or other side effects will occur. These can be used for reporting and web services or anywhere that you need information about the object but want to limit access to the actual object itself.

Use the ThingInfo whenever possible and limit the use of the actual Thing to the times you actually need to change the Thing object. It makes reading and debugging considerably faster when you get used to using this pattern.
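A minimal Python sketch of such a read-only proxy, using properties without setters. `Thing`, its fields, and the sample values are hypothetical.

```python
class Thing:
    """Mutable object whose state changes should be tightly controlled."""

    def __init__(self, name: str, count: int):
        self.name = name
        self.count = count


class ThingInfo:
    """Read-only proxy: exposes a Thing's data through getter-only properties."""

    def __init__(self, thing: Thing):
        self._thing = thing

    @property
    def name(self) -> str:
        return self._thing.name

    @property
    def count(self) -> int:
        return self._thing.count


thing = Thing("widget", 3)
info = ThingInfo(thing)

thing.count = 4
print(info.count)  # 4: the proxy reads through to the Thing

try:
    info.count = 99  # no setter is defined, so this is rejected
except AttributeError:
    print("read-only")
```

Code that receives only the ThingInfo can read the data but has no path to modify the Thing.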

So far nobody in this question seems to have picked up on the real reason for this naming convention.

A DirectoryInfo is not the directory. It is a DTO with data about the directory. There can be many such instances describing the same directory. It is not an entity. It is a throw-away value object. A DirectoryInfo does not represent the actual directory. You can also think of it as a handle or controller for a directory.

Contrast that with a class named Employee. This might be an ORM entity object and it is the single object describing that employee. If it were a value object without identity, it should be called EmployeeInfo. An Employee does indeed represent the actual employee. A value-like DTO class called EmployeeInfo would clearly not represent the employee but rather describe it or store data about it.
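The value-object versus entity distinction shows up in how equality behaves. In this Python sketch the field names are assumptions, and the entity simply relies on default identity comparison; a real ORM entity would more likely compare by its key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmployeeInfo:
    """Value object: two instances with equal fields are interchangeable."""

    name: str
    department: str


class Employee:
    """Entity: identity matters, so default equality compares object identity."""

    def __init__(self, employee_id: int, name: str):
        self.employee_id = employee_id
        self.name = name


print(EmployeeInfo("Ada", "R&D") == EmployeeInfo("Ada", "R&D"))  # True
print(Employee(1, "Ada") == Employee(1, "Ada"))                  # False
```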

There's actually an example in the BCL where both classes exist: A ServiceController is a class that describes a Windows Service. There can be any number of such controllers for each service. A ServiceBase (or a derived class) is the actual service and it does not conceptually make sense to have multiple instances of it per distinct service.

Licensed under: CC-BY-SA with attribution
