The functionality you need will most likely always be tightly coupled with packet dissection, because good protocol dissectors are needed to extract the required information. So my suggestion is to use the best open source tool available: Wireshark (wireshark.org).
It provides "Follow TCP Stream" functionality, which reassembles and displays the payload of a single TCP conversation.
It doesn't look like you can easily extract just part of Wireshark's dissection logic, but at least there is a good example in packet-tcp:
    typedef struct _tcp_flow_t {
        guint32 base_seq;        /* base seq number (used by relative sequence numbers)
                                  * or 0 if not yet known.
                                  */
        tcp_unacked_t *segments;
        guint32 fin;             /* frame number of the final FIN */
        guint32 lastack;         /* last seen ack */
        nstime_t lastacktime;    /* Time of the last ack packet */
        guint32 lastnondupack;   /* frame number of last seen non dupack */
        guint32 dupacknum;       /* dupack number */
        guint32 nextseq;         /* highest seen nextseq */
        guint32 maxseqtobeacked; /* highest seen continuous seq number (without hole in the stream) from the fwd party,
                                  * this is the maximum seq number that can be acked by the rev party in normal case.
                                  * If the rev party sends an ACK beyond this seq number it indicates TCP_A_ACK_LOST_PACKET condition */
        guint32 nextseqframe;    /* frame number for segment with highest
                                  * sequence number
                                  */
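To make the role of fields such as base_seq and nextseq more concrete, here is a minimal sketch of per-direction sequence tracking. It is not Wireshark code: flow_state, seq_gt and flow_update are hypothetical names, and the sketch uses an explicit base_set flag where the structure above uses 0 to mean "not yet known". It only illustrates relative sequence numbers and wraparound-safe comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-direction state (not Wireshark's tcp_flow_t). */
typedef struct {
    uint32_t base_seq;  /* first sequence number seen in this direction */
    int      base_set;  /* non-zero once base_seq is known */
    uint32_t nextseq;   /* highest seen (seq + len), i.e. next expected byte */
} flow_state;

/* Wraparound-safe "a is after b" for 32-bit TCP sequence numbers. */
static int seq_gt(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/* Record one segment and return its sequence number relative to base_seq. */
static uint32_t flow_update(flow_state *f, uint32_t seq, uint32_t len)
{
    uint32_t end;

    assert(f != NULL);
    if (!f->base_set) {
        /* First segment in this direction defines the base. */
        f->base_seq = seq;
        f->base_set = 1;
        f->nextseq  = seq;
    }
    end = seq + len;            /* unsigned arithmetic wraps modulo 2^32 */
    if (seq_gt(end, f->nextseq)) {
        f->nextseq = end;       /* segment extends the stream */
    }
    return seq - f->base_seq;   /* relative sequence number */
}
```

Calling flow_update once per captured segment, in capture order, keeps nextseq at the highest byte seen so far; a segment that does not advance nextseq is a candidate retransmission or out-of-order arrival.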
Basically, the conversation extraction logic is separate from the TCP dissector itself. Note the find_conversation usage:
    /* Attach process info to a flow */
    /* XXX - We depend on the TCP dissector finding the conversation first */
    void
    add_tcp_process_info(guint32 frame_num, address *local_addr, address *remote_addr, guint16 local_port, guint16 remote_port, guint32 uid, guint32 pid, gchar *username, gchar *command) {
        conversation_t *conv;
        struct tcp_analysis *tcpd;
        tcp_flow_t *flow = NULL;

        conv = find_conversation(frame_num, local_addr, remote_addr, PT_TCP, local_port, remote_port, 0);
        if (!conv) {
            return;
        }
The actual logic is well documented and available here:
    /*
     * Given two address/port pairs for a packet, search for a conversation
     * containing packets between those address/port pairs. Returns NULL if
     * not found.
     *
     * We try to find the most exact match that we can, and then proceed to
     * try wildcard matches on the "addr_b" and/or "port_b" argument if a more
     * exact match failed.
     * ...
     */
    conversation_t *
    find_conversation(const guint32 frame_num, const address *addr_a, const address *addr_b, const port_type ptype,
                      const guint32 port_a, const guint32 port_b, const guint options)
    {
        conversation_t *conversation;

        /*
         * First try an exact match, if we have two addresses and ports.
         */
        if (!(options & (NO_ADDR_B|NO_PORT_B))) {
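The "most exact match first, then wildcards" idea described in that comment can be sketched in a few lines of standalone C. This is only an illustration under simplifying assumptions, not the EPAN implementation: sk_conv, sk_match, sk_find and the SK_NO_* flags are hypothetical names, addresses are plain IPv4 integers, the table is a linear array rather than hash tables, and the order in which the partial wildcards are tried is my assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_NO_ADDR_B 0x1u  /* hypothetical: entry accepts any second address */
#define SK_NO_PORT_B 0x2u  /* hypothetical: entry accepts any second port */

typedef struct {
    uint32_t addr_a, addr_b;  /* IPv4 addresses as integers */
    uint16_t port_a, port_b;
    unsigned wild;            /* combination of SK_NO_* flags */
} sk_conv;

/* Does entry c match endpoints (aa:pa) -> (ab:pb) in this orientation? */
static int sk_match(const sk_conv *c, uint32_t aa, uint16_t pa,
                    uint32_t ab, uint16_t pb)
{
    if (c->addr_a != aa || c->port_a != pa) return 0;
    if (!(c->wild & SK_NO_ADDR_B) && c->addr_b != ab) return 0;
    if (!(c->wild & SK_NO_PORT_B) && c->port_b != pb) return 0;
    return 1;
}

/* Return the most exact matching entry, or NULL if none matches. */
static const sk_conv *sk_find(const sk_conv *tab, size_t n,
                              uint32_t aa, uint16_t pa,
                              uint32_t ab, uint16_t pb)
{
    /* Passes ordered from exact match to fully wildcarded second endpoint. */
    static const unsigned order[] = {
        0, SK_NO_ADDR_B, SK_NO_PORT_B, SK_NO_ADDR_B | SK_NO_PORT_B
    };
    size_t pass, i;

    assert(tab != NULL || n == 0);
    for (pass = 0; pass < sizeof order / sizeof order[0]; pass++) {
        for (i = 0; i < n; i++) {
            if (tab[i].wild != order[pass]) continue;
            /* A conversation is bidirectional: try both orientations. */
            if (sk_match(&tab[i], aa, pa, ab, pb) ||
                sk_match(&tab[i], ab, pb, aa, pa))
                return &tab[i];
        }
    }
    return NULL;
}
```

With a table holding both an exact entry for 10.0.0.1:80 <-> 10.0.0.2:40000 and a wildcard entry for 10.0.0.1:80, a packet in either direction of the first connection resolves to the exact entry, while a packet from any other client to 10.0.0.1:80 falls through to the wildcard entry.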
So what I'm actually suggesting is to use the EPAN library, Wireshark's packet analysis engine. It is possible to extract this library and use it independently. Please be careful with the license: Wireshark is released under the GNU GPL.