Using .NET 4.5.1, how do I use some of the non-intuitive properties provided by ETW?

Question

Jason,

I decided this was to verbose to answer here, so I did it in a blog post you can find here: http://msmvps.com/blogs/kathleen/archive/2014/01/24/how-are-event-parameters-best-used-to-create-an-intuitive-custom-evnetsourcetrace.aspx

Kathleen Dollard

Content From Blog Post

Channel

ETW supports four basic channels and the potential for custom channels. EventSource does not support custom channels (if you have a user story, contact me or the team). The default channel and the only one currently supporting in-line manifests is the Debug channel.

The Channel parameter exists only in the NuGet version and only for the purpose of accessing the additional channels, primarily the admin channel to access EventViewer for messages to admins. I was one of the people that fought for this capability, but it is for a very limited set of cases. Almost all events logically write to your channel – the default channel – the Debug channel.

To write to EventViewer, you need to write to the Admin channel and install a manifest on the target computer. This is documented in the specification, in my video, and I’m sure a couple of blog posts. Anything written to the admin channel is supposed to be actionable by ETW (Windows) guidelines.

Use Operational and Analytic channels only if it is part of your app requirements or you are supporting a specific tool.

In almost all cases, ignore the Channel parameter on the Event attribute and allow trace events to go to the Debug channel.

Level

For the Admin Channel

If you are writing to the admin channel, it should be actionable. Information is rarely actionable. Use warning when you wish to tell them (not you, not a later dev, but ops) that you want them to be concerned. Perhaps that response times are nearing the tolerances of the SLA. Use error to tell them to do something. Perhaps that someone in the organization is trying to do something they aren’t allowed to do. Tell them only what they need to know. Few messages, but relatively verbose and very clear on what’s happening, probably including response suggestions. This is “Danger, danger Will Robinson” time.

For the Debug Channel

This is your time-traveling mind meld with a future developer or a future version of yourself.

I’m lucky enough to have sat down several times with Vance, Dan and Cosmin and this is one of the issues they had to almost literally beat into my head. The vast majority of the time, your application can, and probably should run with the default information turned on.

If you’re looking at an event that clearly represents a concern you have as a developer – something you want to scare a later developer because it scares you – like a serious failed assert – use warning. If someone is holding a trace file with ten thousand entries, what are the three things or the ten things you think tell them where the problem is? If they are running at the warning (not informational level) what do they really, truly need to know?

If it’s an error, use the error level.

If it’s a massively frequent, rarely interesting event, use verbose. Massively frequent is thousands of times a second.

In most cases, use the default informational level for the Level parameter of the Event attribute. Depending on team philosophy, ignore it or record it.

Keywords

If you have verbose events, they need to be turned on and off in an intelligent fashion. Groups of verbose events need keywords to allow you to do this.

Warnings and Error levels do not need keywords. They should be on, and the reader wants all of them.

The danger of missing an event so vastly outweighs the cost of collecting events that informational events should be turned on without concern for keywords. If keywords aren’t going to be used to filter collection, their only value is filtering the trace output. There are so many other ways to filter the trace, keywords are not that helpful.

In most cases, use the Keywords parameter of the Event attribute only for verbose events and use them to group verbose events that are likely to be needed together. Use Keywords to describe the anticipated debugging task where possible. Events can include several Keywords.

Task

On the roller coaster of life, we just entered one of the scary tunnels – the murky world of ETW trace event naming. As far as ETW is concerned, your event is identified with a numeric ID. Period.

Consumers of your trace events have a manifest – either because it’s in-line (default for Debug channel, supported by PerfView and gradually being supported by WPR/WPA) or installed on the computer where the trace is consumed. The manifest does not contain an event name that is used by consumers.

Consumers, by convention, make a name from your Task and Opcode.

EventSource exists to hide the weirdness (and elegance) of ETW. So it takes the name of your method and turns it into a task. Unless you specify a task. Then it uses your task as the task and ignores the name of your method. Got it?

In almost all cases, do not specify a Task parameter for the Event attribute, but consider the name of your method to be the Task name (see Opcode for exception).

Opcode

I wish I could stop there, but Jason points out a key problem. The Start and Stop opcodes can be very important to evaluating traces because they allow calculation of elapsed time. When you supply these opcodes, you want to supply the Task to ensure proper naming.

And please consider the humans. They see the name of the method, they think it’s the name displayed in the consumer. For goodness sakes make it so. If you specify a task and opcode, ensure that the method name is the concatenation. Please

This is messy. I’m working on some IDE generation shortcuts to simplify EventSource creation and this is a key reason. I think it will help, but it will require the next public release of Roslyn.

Almost never use an Opcode parameter other than Start/Stop.

When using Start/Stop Opcodes, also supply a Task and ensure the name of the method is the Task concatenated with the Opcode for the sake of the humans.

Version

The version parameter of the Event attribute is available for you and consumers to communicate about whether the right version of the manifest is available. Versioning is not ETW’s strength – events rarely changed before we devs got involved and now we have in-line manifests (to the Debug channel). You can use it, and the particular consumer you’re using might do smart things with it. And even so, the manifest is or is not correctly installed on any machines where installed manifests are used.

Overall, I see some pain down this route.

The broad rule for versioning ETW events is don’t. That is do not change them except to add additional data at the end (parameters to your method and WriteEvent call). In particular, never rearrange in a way that could give different meaning to values. If you must remove a value, force a default or marker value indicating missing. If you must otherwise alter the trace output, create a new event. And yes, that advice sucks. New events with “2” at the end suck. As much as possible, do up front planning (including confidentiality concerns) to avoid later changes to payload structure.

Initially ignore the Version parameter of the Event attribute (use default), but increment as you alter the event payload. But only add payload items at the end unless you can be positive that no installed manifests exist (and I don’t think you can).

Message

Did you notice that so far I said, rarely use any of the parameters on the Event attribute? Almost never use them.

The Message parameter, on the other hand, is your friend.

The most important aspect of EventSource is documenting what the event needs from the caller of the code. It’s the declaration of the Event method. Each item passed should be as small as possible, non-confidential, and have a blazingly clear parameter name.

The guy writing against your event sees an available log method declaration like “IncomingDataRequest(string Entity, string PrimaryKey).” Exactly how long does it take him to get that line of code in place? “IncomingRequest(string msg)” leaves the dev wondering what the message is or whether it’s even the correct method. I’ve got some stuff in my upcoming video on using generics to make it even more specific.

Not only does special attention to Event method parameters pay off by speeding the writing of code that will call the Event method (removing all decision making from the point of the call), but (most) consumers see this data as individual columns. They will lay this out in a very pretty fashion. Most consumers allow sorting and filtering by any column. Sweet!

This is what Strongly Typed Events are all about.

Parameters to your method like “msg” do not cut it. Period.

In addition to the clarity issues, strings are comparatively enormous to be sticking into event payloads. You want to be able to output boatloads of events – you don’t want big event payloads filling your disks. Performance starts sucking pretty quickly if you also use String.Format to prepare a message that might never be output.

Sometimes the meaning of the parameter is obvious from the name of the event. Often it is not. The contents of the Message parameter is included in the manifest and allows consumers to display a friendly text string that contains your literals and whatever parts of the event payload seem interesting. Sort of like String.Format() – the “Message” parameter is actually better described as a “format” parameter. Since it’s in the manifest, it should contain all the repeatable parts. Let the strongly typed data contain only what’s unique about that particular call to the trace event.

The Message parameter uses curly braces so you feel warm and fuzzy. That’s nice. But the actual string you type in the parameter is passed to the consumer, with the curly braces replaced with ETW friendly percent signs. Do not expect the richness of String.Format() to be recognized by consumers. At least not today’s consumers.

By splitting the data into strongly typed chunks and providing a separate Message parameter, the person evaluating your trace can both sort by columns and read your message. The event payload contains only data, the manifest allows your nice wordy message. Having your beer and drinking it too.

Not sold yet? If you’re writing to a channel that uses installed manifests, you can also localize the message. This can be important if you are writing to the admin channel for use in EventViewer.

Almost always use Message so consumers can provide a human friendly view of your strongly typed event payload.

Summary

There are four basic rules for EventSource usage:

Give good Event method names
Provide strongly typed payload data – consider confidentiality – and work to get payload contents right the first time (small where possible)
Use the Message parameter of the event attribute for a nice human friendly message
For every other Event attribute parameter – simplify, simplify, simplify. Use the defaults unless you are trying to do something the defaults don’t allow