You definitely need DirectShow. I'm not positive that SDL is required, though...
DirectShow provides the streaming interface to video capture sources in Windows. Although DirectShow itself is primarily a set of user-mode components, the supplied video capture filter (typically KsProxy.ax) communicates with kernel-level capture drivers through a defined set of interfaces and properties. Some camera manufacturers instead provide their own user-mode DirectShow capture filter with private (generally kernel) interfaces to their hardware. Either way, DirectShow is the common access point for all Windows video capture devices.
SDL is a cross-platform library that gives quick access to the display. PJSIP uses it to get decoded video onto the screen.
It looks like PJSIP may also support using DirectShow as the renderer. See: http://svn.pjsip.org/repos/pjproject/trunk/pjmedia/src/pjmedia-videodev/dshow_dev.c
The DirectShow renderer is switched off at line 52 of that file, and it's unclear from the source whether it would work if enabled:
/* Temporarily disable DirectShow renderer (VMR) */
#define HAS_VMR 0
Since the renderer is compiled out by this define (and the comment says "temporarily"), I would assume the code wasn't fully completed, which is why all the examples also require SDL.
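To show what a flag like that does at build time, here is a minimal standalone sketch. It is not taken from dshow_dev.c: the function name and the device strings are made up for illustration, and only the HAS_VMR define mirrors the one quoted above.

```c
#include <stddef.h>

/* Mirrors the define quoted above; 0 compiles the renderer path out. */
#ifndef HAS_VMR
#define HAS_VMR 0
#endif

/*
 * Hypothetical helper (not a PJSIP function): fills `names` with the
 * devices this build would expose and returns how many were added.
 */
int list_registered_devices(const char **names, int max)
{
    int n = 0;

    if (names == NULL || max <= 0)
        return 0;

    /* The capture side is always built. */
    if (n < max)
        names[n++] = "dshow capture";

#if HAS_VMR
    /* Only present when building with HAS_VMR set to 1. */
    if (n < max)
        names[n++] = "dshow renderer (VMR)";
#endif

    return n;
}
```

With HAS_VMR left at 0 the renderer entry never makes it into the binary, so something else (SDL in the PJSIP examples) has to do the drawing. Building with -DHAS_VMR=1 would add the second entry in this sketch, but whether the real renderer code in PJSIP works when re-enabled is a separate question that only testing would answer.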