NDIS filter driver's FilterReceiveNetBufferLists handler isn't called

Question

Below is the email I received from Jeffrey. I think it is the best answer to this question :)

The packet filter works differently for LWFs (lightweight filters) than for protocols. Let me give you some background. You’ll already know some of this, I’m sure, but it’s always helpful to review the basics, so we can be sure that we’re both on the same page. The NDIS datapath is organized like a tree:

[Image: the NDIS datapath drawn as a tree, with the protocols at the top and the miniport at the bottom]

Packet filtering happens at two places in this stack:

(a) once in the miniport hardware, and

(b) at the top of the stack, just below the protocols.

NDIS tracks each protocol’s packet filter separately, for efficiency. If one protocol asks to see ALL packets (promiscuous mode), the other protocols don’t have to sort through all that traffic. So really, there are (P+1) different packet filters in the system, where P is the number of protocols (one per protocol, plus the one in the miniport hardware):

[Image: the same stack, showing one packet filter per protocol plus one in the miniport hardware]

Now if there are all these different packet filters, how does an OID_GEN_CURRENT_PACKET_FILTER actually work? NDIS tracks each protocol’s packet filter, but also merges the filters at the top of the miniport stack. So suppose protocol0 requests a packet filter of A+B, and protocol1 requests a packet filter of C, and protocol2 requests a packet filter of B+D:

[Image: protocol0, protocol1, and protocol2 with packet filters A+B, C, and B+D]

Then at the top of the stack, NDIS merges the packet filters to A+B+C+D. This is what gets sent down the filter stack, and eventually to the miniport.
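
To make the merge concrete, here is a minimal sketch in C, assuming the merge behaves like a plain bitwise OR of every protocol's request. The FLAG_* constants and merge_protocol_filters are hypothetical names for illustration only; they are not real NDIS_PACKET_TYPE_* values or NDIS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the abstract flags A..E used in the text.
   These are NOT real NDIS_PACKET_TYPE_* values. */
#define FLAG_A 0x01u
#define FLAG_B 0x02u
#define FLAG_C 0x04u
#define FLAG_D 0x08u
#define FLAG_E 0x10u

/* Combine every protocol's requested filter into the single value that
   would be sent down the filter stack (assumed here to be a bitwise OR). */
uint32_t merge_protocol_filters(const uint32_t *filters, size_t count)
{
    assert(filters != NULL || count == 0);

    uint32_t merged = 0;
    for (size_t i = 0; i < count; i++) {
        merged |= filters[i];
    }
    return merged;
}
```

With the three requests from the example (A+B, C, and B+D), this returns A+B+C+D. Flag B appears only once in the result even though two protocols asked for it.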

Because of this merging process, no matter what protocol2 sets as its packet filter, protocol2 cannot affect the other protocols. So protocols don’t have to worry about “sharing” the packet filter. However, the same is not true for a LWF. If LWF1 decides to set a new packet filter, it does not get merged:

[Image: LWF1 replacing the merged packet filter A+B+C+D with C+E]

In the above picture, LWF1 decided to change the packet filter to C+E. This overwrote the protocols’ packet filter of A+B+C+D, meaning that flags A, B, and D will never make it to the hardware. If the protocols were relying on flags A, B, or D, then the protocols’ functionality will be broken.
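
The damage done by an overwrite can be computed directly. The sketch below is only an illustration under the assumption that packet filters are plain bitmasks; FLAG_* and flags_lost_by_overwrite are hypothetical names, not NDIS definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the abstract flags A..E used in the text.
   These are NOT real NDIS_PACKET_TYPE_* values. */
#define FLAG_A 0x01u
#define FLAG_B 0x02u
#define FLAG_C 0x04u
#define FLAG_D 0x08u
#define FLAG_E 0x10u

/* Flags the protocols asked for that no longer reach the hardware
   once a filter replaces the merged value with its own. */
uint32_t flags_lost_by_overwrite(uint32_t protocols_filter, uint32_t lwf_filter)
{
    uint32_t lost = protocols_filter & ~lwf_filter;

    /* By construction, nothing the LWF passes down counts as lost. */
    assert((lost & lwf_filter) == 0);
    return lost;
}
```

For a protocols’ filter of A+B+C+D and an LWF filter of C+E, the result is A+B+D, which matches the flags that never make it to the hardware in the example.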

This is by design – LWFs have great power, and they can do anything to the stack. They are designed to have the power to veto the packet filters of all other protocols. But in your case, you don’t want to mess with other protocols; you want your filter to have minimal effects on the rest of the system.

So what you want to do is to always keep track of what the packet filter is, and never remove flags from the current packet filter. That means that you should query the packet filter when your filter attaches, and update your cached value whenever you see an OID_GEN_CURRENT_PACKET_FILTER come down from above.

If your usermode app needs more flags than what the current packet filter has, you can issue the OID and add additional flags. This means that the hardware’s packet filter will have more flags. But no protocol’s packet filter will change, so the protocols will still see the same stuff.

[Image: LWF1 passing down A+B+C+D+E, the protocols’ flags plus its own flag E]

In the above example, the filter LWF1 is playing nice. Even though LWF1 only cares about flag E, LWF1 has still passed down all flags A, B, C, and D too, since LWF1 knows that the protocols above it want those flags to be set.

The code to manage this isn’t too bad, once you get the idea of what needs to be done to manage the packet filter:

Always track the latest packet filter from protocols above.
Never let the NIC see a packet filter that has fewer flags than the protocols’ packet filter.
Add in your own flags as needed.
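
The three rules above can be sketched as a small piece of bookkeeping. This is a simplified model, not driver code: the lwf_filter_state struct and the lwf_* functions are hypothetical names, the FLAG_* values are placeholders, and how the state is wired into the filter's OID request handling (querying on attach, intercepting OID_GEN_CURRENT_PACKET_FILTER, issuing the OID) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the abstract flags A..E used in the text. */
#define FLAG_A 0x01u
#define FLAG_B 0x02u
#define FLAG_C 0x04u
#define FLAG_D 0x08u
#define FLAG_E 0x10u

/* Hypothetical per-instance bookkeeping for one filter module. */
typedef struct {
    uint32_t upper_filter; /* latest packet filter requested from above */
    uint32_t own_flags;    /* extra flags this filter wants for itself  */
} lwf_filter_state;

/* Rule 1: remember the latest packet filter that came down from above. */
void lwf_on_upper_filter_set(lwf_filter_state *s, uint32_t requested)
{
    s->upper_filter = requested;
}

/* Rule 3: record the extra flags the filter (or its usermode app) needs. */
void lwf_set_own_flags(lwf_filter_state *s, uint32_t flags)
{
    s->own_flags = flags;
}

/* Rules 2 and 3: the value to pass down is the protocols' filter plus
   our own flags, so the NIC never sees fewer flags than the protocols
   asked for. */
uint32_t lwf_filter_to_send_down(const lwf_filter_state *s)
{
    uint32_t down = s->upper_filter | s->own_flags;

    /* Every flag requested from above must survive. */
    assert((down & s->upper_filter) == s->upper_filter);
    return down;
}
```

Because the value sent down is recomputed from the cached upper filter each time, a later change from the protocols (say, from A+B+C+D to just A) is reflected immediately, while the filter's own flag E stays set.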

Ok, hopefully that gives you a good idea of what the packet filter is and how to manage it. The next question is how to map “promiscuous mode” and “non-promiscuous mode” into actual flags? Let’s define these two modes carefully:

Non-promiscuous mode: The capture tool only sees the receive traffic that the operating system would normally have received. If the hardware filters out traffic, then we don’t want to see that traffic. The user wants to diagnose the local operating system in its normal state.

Promiscuous mode: Give the capture tool as many receive packets as possible – ideally every bit that is transferred on the wire. It doesn’t matter whether the packet was destined for the local host or not. The user wants to diagnose the network, and so wants to see everything happening on the network.

I think when you look at it that way, the consequences for the packet filter flags are fairly straightforward. For non-promiscuous mode, do not change the packet filter. Just let the hardware packet filter be whatever the operating system wants it to be. Then for promiscuous mode, add in the NDIS_PACKET_TYPE_PROMISCUOUS flag, and the NIC hardware will give you everything it possibly can.
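
As a rough sketch, the mapping from capture mode to flags could look like the following. The capture_mode enum and both functions are hypothetical names. The numeric value of NDIS_PACKET_TYPE_PROMISCUOUS is defined locally only so the snippet compiles outside the WDK, and is an assumption about what ntddndis.h contains; a real driver should take the definition from the header.

```c
#include <assert.h>
#include <stdint.h>

/* Local definition so the sketch builds on its own. Assumed to match
   ntddndis.h; in a real driver, include the header instead. */
#ifndef NDIS_PACKET_TYPE_PROMISCUOUS
#define NDIS_PACKET_TYPE_PROMISCUOUS 0x00000020u
#endif

/* Hypothetical capture modes offered by the capture tool. */
typedef enum {
    CAPTURE_NON_PROMISCUOUS,
    CAPTURE_PROMISCUOUS
} capture_mode;

/* Extra flags the capture filter adds on top of the OS's packet filter. */
uint32_t capture_extra_flags(capture_mode mode)
{
    return (mode == CAPTURE_PROMISCUOUS) ? NDIS_PACKET_TYPE_PROMISCUOUS : 0u;
}

/* Packet filter to pass down: the OS's filter, untouched in
   non-promiscuous mode, with the promiscuous flag added otherwise. */
uint32_t capture_filter_to_send_down(uint32_t os_filter, capture_mode mode)
{
    uint32_t down = os_filter | capture_extra_flags(mode);

    /* Flags are only ever added, never removed. */
    assert((down & os_filter) == os_filter);
    return down;
}
```

In non-promiscuous mode the function returns the OS's filter unchanged, which is the whole point: the capture tool sees whatever the operating system would normally have received.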

So if it’s that simple for a LWF, why did the old protocol-based NPF driver need so many more flags? The old protocol-based driver had a couple of problems:

It can’t get “non-promiscuous mode” perfectly correct
It can’t easily capture the send-packets of other protocols

The first problem with NPF-protocol is that it can’t easily implement our definition of “non-promiscuous mode” correctly. If NPF-the-protocol wants to see receive traffic just as the OS sees it, then what packet filter should it use? If it sets a packet filter of zero, then NPF won’t see any traffic. So NPF can set a packet filter of Directed|Broadcast|Multicast. But that’s only an assumption of what TCPIP and other protocols are setting. If TCPIP decided to set a Promiscuous flag (certain socket flags cause this to happen), then NPF would actually be seeing fewer packets than what TCPIP would see, which is wrong. But if NPF sets the Promiscuous flag, then it will see more traffic than TCPIP would see, which is also wrong. So it’s tough for a capturing protocol to decide which flags to set so that it sees exactly the same packets that the rest of the OS sees. LWFs don’t have that problem, since LWFs get to see the combined OID after all protocols’ filters are merged.
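
The guessing problem can be illustrated by comparing a guessed filter against what another protocol actually set. This sketch compares flag sets only, which is a simplification of comparing real traffic. The visibility enum and compare_guess_to_os are hypothetical names, and the NDIS_PACKET_TYPE_* values are defined locally on the assumption that they match ntddndis.h.

```c
#include <assert.h>
#include <stdint.h>

/* Local definitions so the sketch builds on its own. Assumed to match
   ntddndis.h; in a real driver, include the header instead. */
#ifndef NDIS_PACKET_TYPE_DIRECTED
#define NDIS_PACKET_TYPE_DIRECTED    0x00000001u
#define NDIS_PACKET_TYPE_MULTICAST   0x00000002u
#define NDIS_PACKET_TYPE_BROADCAST   0x00000008u
#endif
#ifndef NDIS_PACKET_TYPE_PROMISCUOUS
#define NDIS_PACKET_TYPE_PROMISCUOUS 0x00000020u
#endif

/* How a guessed filter relates to the filter another protocol set. */
typedef enum {
    SEES_SAME,      /* identical flag sets                    */
    SEES_FEWER,     /* guess is missing flags the OS has set  */
    SEES_MORE,      /* guess has flags the OS did not set     */
    SEES_DIFFERENT  /* both missing and extra flags           */
} visibility;

visibility compare_guess_to_os(uint32_t guessed, uint32_t os_filter)
{
    uint32_t missing = os_filter & ~guessed;
    uint32_t extra   = guessed & ~os_filter;

    /* A flag cannot be both missing from and extra in the guess. */
    assert((missing & extra) == 0);

    if (missing == 0 && extra == 0) return SEES_SAME;
    if (extra == 0)                 return SEES_FEWER;
    if (missing == 0)               return SEES_MORE;
    return SEES_DIFFERENT;
}
```

A guess of Directed|Broadcast|Multicast comes out as SEES_FEWER when TCPIP has also set Promiscuous, and adding Promiscuous to the guess comes out as SEES_MORE when TCPIP has not. A protocol has no reliable way to land on SEES_SAME, whereas an LWF sees the merged value directly.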

The second problem with NPF-protocol is that it needed loopback mode to capture sent-packets. LWFs don’t need loopback -- in fact, it would be actively harmful. Let’s use the same diagram to see why. Here’s NPF capturing the receive path in promiscuous mode:

[Image: tcpip and npf bound above the stack, with npf capturing the receive path in promiscuous mode]

Now let’s see what happens when a unicast packet is received:

[Image: a received unicast packet travelling up the stack to both tcpip and npf]

Since the packet matches the hardware’s filter, the packet comes up the stack. Then when the packet gets to the protocol layer, NDIS gives the packet to both protocols, tcpip and npf, since both protocols’ packet filters match the packet. So that works well enough.

But now the send path is tricky:

[Image: a packet sent by tcpip travelling down the stack without ever reaching npf]

tcpip sent a packet, but npf never got a chance to see it! To solve this problem, NDIS added the notion of a “loopback” packet filter flag. This flag is a little bit special, since it doesn’t go to the hardware. Instead, the loopback packet filter tells NDIS to bounce all send-traffic back up the receive path, so that diagnostics tools like npf can see the packets. It looks like this:

[Image: the loopback path bouncing the packet sent by tcpip back up the receive path to npf]

Now the loopback path is really only used for diagnostics tools, so we haven’t spent much time optimizing it. And, since it means that all send packets travel across the stack twice (once for the normal send path, and again in the receive path), it has at least double the CPU cost. This is why I said that an NDIS LWF would be able to capture at a higher throughput than a protocol, since LWFs don’t need the loopback path.

Why not? Why don’t LWFs need loopback? Well if you go back and look at the last few diagrams, you’ll see that all of our LWFs saw all the traffic – both send and receive – without any loopback. So the LWF meets the requirements of seeing all traffic, without needing to bother with loopback. That’s why a LWF should normally never set any loopback flags.

Ok, that email got longer than I wanted, but I hope that clears up some of the questions around the packet filter, the loopback path, and how LWFs are different from protocols. Please let me know if anything wasn’t clear, or if the diagrams didn’t come through.