[Ns-developers] Node "protocol handlers": limitations

Gustavo Carneiro gjcarneiro at gmail.com
Wed May 28 10:41:30 PDT 2008


On 28/05/2008, Mathieu Lacage <mathieu.lacage at sophia.inria.fr> wrote:
>
>
> On Wed, 2008-05-28 at 15:29 +0100, Gustavo Carneiro wrote:
> > In Linux "netfilter hooks", each hook is allowed more flexibility to drop or
> > allow a packet to continue:
> >
> > http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO-3.html
> >
> > From the article:
> >
> > > """Kernel modules can register to listen at any of these hooks. A module
> > > that registers a function must specify the priority of the function within
> > > the hook; then when that netfilter hook is called from the core networking
> > > code, each module registered at that point is called in the order of
> > > priorites, and is free to manipulate the packet. The module can then tell
> > > netfilter to do one of five things:
> > >
> > >
>
> > >    1. NF_ACCEPT: continue traversal as normal.
> > >    2. NF_DROP: drop the packet; don't continue traversal.
> > >    3. NF_STOLEN: I've taken over the packet; don't continue traversal.
> > >    4. NF_QUEUE: queue the packet (usually for userspace handling).
> > >    5. NF_REPEAT: call this hook again.
>
> > >
> > > """
> > >
> >
> > NS-3's hooks are relatively more simplistic:
> >
> >   /**
> >    * A protocol handler
> >    */
> >   typedef Callback<void, Ptr<NetDevice>, Ptr<Packet>, uint16_t, const Address &> ProtocolHandler;
> >   /**
> >    * \param handler the handler to register
> >    * \param protocolType the type of protocol this handler is
> >    *        interested in. This protocol type is a so-called
> >    *        EtherType, as registered here:
> >    *        http://standards.ieee.org/regauth/ethertype/eth.txt
> >    *        the value zero is interpreted as matching all
> >    *        protocols.
> >    * \param device the device attached to this handler. If the
> >    *        value is zero, the handler is attached to all
> >    *        devices on this node.
> >    */
> >   void RegisterProtocolHandler (ProtocolHandler handler,
> >                                 uint16_t protocolType,
> >                                 Ptr<NetDevice> device);
> >
> > Unfortunately each "protocol handler" simply receives a copy of the packet,
> > but has no power to veto the original packet traversal up the stack (read,
> > to the remaining hooks).
> >
> > I am a bit unhappy about this situation.  Things have been mostly working so
> > far, but I can easily imagine a situation where things might break because a
> > packet is received by my hook but it continues up the stack and IP also has
> > a route for it, and so two copies of the packet may end up being forwarded.
> >
> > I am not asking to support a full netfilter framework in NS-3 (not yet,
> > anyway ;), but I would be willing to cook up a patch to make "protocol
> > handlers" return an enumeration telling the node what to do with the packet:
> > continue or drop.
> >
> > Would this sound OK?
>
>
> Not unless you are willing to spend the time necessary to think about
> the larger picture of netfilter hooks or you have a clear use-case where
> the current API breaks for you. i.e., it would be nice to either have a
> good reason to do this or to avoid breaking the API again later when we
> figure out what we want from these netfilter hooks.
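
To make the original proposal concrete, here is a rough sketch of what
"return an enumeration: continue or drop" could look like.  This is not NS-3
code; HandlerResult, VetoHandler and DispatchWithVeto are names made up for
illustration, and the packet is reduced to an integer id.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result type: what the node should do after a handler runs.
enum class HandlerResult { CONTINUE, DROP };

// Hypothetical handler signature; the packet is reduced to an integer id.
using VetoHandler = std::function<HandlerResult(int packetId)>;

// Calls the handlers in registration order and stops at the first one that
// returns DROP.  Returns how many handlers were actually invoked.
std::size_t DispatchWithVeto(const std::vector<VetoHandler> &handlers,
                             int packetId) {
  std::size_t invoked = 0;
  for (const auto &handler : handlers) {
    assert(handler);  // an empty std::function would throw when called
    ++invoked;
    if (handler(packetId) == HandlerResult::DROP) {
      break;  // vetoed: the remaining handlers never see the packet
    }
  }
  return invoked;
}
```

With this scheme, a hook registered ahead of IP could return DROP and IP
would never see the packet.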


Taking a deeper look at this, in Linux the equivalent of NS-3's "protocol
handlers" is dev_add_pack() (packet type handler):

http://lxr.linux.no/linux/+code=dev_add_pack

It uses a structure, packet_type:
http://lxr.linux.no/linux/+code=packet_type

The structure has a function pointer of type:

        int (*func) (struct sk_buff *,
                     struct net_device *,
                     struct packet_type *,
                     struct net_device *);


The code that uses these packet handlers is:
http://lxr.linux.no/linux/+code=netif_receive_skb

The important section:

        type = skb->protocol;
        list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type)&15], list) {
                if (ptype->type == type &&
                    (!ptype->dev || ptype->dev == skb->dev)) {
                        if (pt_prev)
                                ret = deliver_skb(skb, pt_prev, orig_dev);
                        pt_prev = ptype;
                }
        }

        if (pt_prev) {
                ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
        } else {
                kfree_skb(skb);
                /* Jamal, now you will not able to escape explaining
                 * me how you were going to use this. :-)
                 */
                ret = NET_RX_DROP;
        }

out:
        rcu_read_unlock();
        return ret;
}

So it seems that every matching protocol handler is always invoked,
regardless of what the other handlers return.  Although there is an integer
return value, it is mostly ignored.  Just like NS-3.
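
Stripped of the kernel details, the behaviour amounts to something like the
sketch below.  It is only an illustration: Device, Packet, HandlerEntry and
Dispatch are made-up names, not the Linux or NS-3 types, and the wildcard
rules (protocol 0 and a null device match everything) follow the
RegisterProtocolHandler documentation quoted earlier.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins for a net device and a received packet.
struct Device { int id; };
struct Packet { uint16_t protocol; const Device *dev; };

// One registered handler plus its match criteria.
struct HandlerEntry {
  std::function<int(const Packet &)> func;
  uint16_t protocol;     // 0 matches every protocol
  const Device *device;  // nullptr matches every device
};

// Invokes every matching handler.  A handler's return value is discarded,
// so no handler can stop the ones after it.  Returns the number invoked.
int Dispatch(const std::vector<HandlerEntry> &handlers, const Packet &p) {
  int invoked = 0;
  for (const auto &h : handlers) {
    assert(h.func);  // an empty std::function would throw when called
    const bool protoMatch = (h.protocol == 0 || h.protocol == p.protocol);
    const bool devMatch = (h.device == nullptr || h.device == p.dev);
    if (protoMatch && devMatch) {
      (void)h.func(p);  // result deliberately ignored
      ++invoked;
    }
  }
  return invoked;
}
```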

It still makes me kind of uncomfortable, but if Linux gets away with it all
this time, maybe it's not so bad.

My trigger use case was that I was debugging code, so I wrote a class that
traces all packet drops and measures packet drop rate at each point in the
network.  The same code uses virtual tunnels and some interfaces that are
not registered with IP.  However, NS-3's IPv4 implementation adds a hook
that receives from all NetDevices, even the ones that should be hidden from
IPv4.  Therefore, packets are received by a packet socket, but at the same
time forwarded to IPv4, where they are dropped.  These packet drops should
not happen because IPv4 should not receive them at all.

Yes, I know it's hard to understand the use case; also hard to explain :-)

I would be happy to change this code in IPv4 (internet-stack.cc):

  node->RegisterProtocolHandler (MakeCallback (&Ipv4L3Protocol::Receive, PeekPointer (ipv4)),
                                 Ipv4L3Protocol::PROT_NUMBER, 0);
  node->RegisterProtocolHandler (MakeCallback (&ArpL3Protocol::Receive, PeekPointer (arp)),
                                 ArpL3Protocol::PROT_NUMBER, 0);

That 0 means "all NetDevices".  IMHO it should listen only on the NetDevices
that are explicitly IP-enabled, not on all of them.
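
To illustrate the difference, here is a small self-contained model.  It does
not use the real NS-3 classes: Device, Node, Register, Receive and
SpuriousDrops are hypothetical names, and "IP-enabled" is reduced to a flag
on the device.  The handler counts a drop whenever it is handed a packet from
a device that IP does not know about, which is roughly what I am seeing.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical device: ipEnabled is false for e.g. a hidden tunnel endpoint.
struct Device { bool ipEnabled; };

// Hypothetical node with a RegisterProtocolHandler-like interface.
class Node {
public:
  using Handler = std::function<void(const Device &)>;

  // A null device means "all devices", like the 0 in the code above.
  void Register(Handler h, uint16_t protocol, const Device *dev) {
    m_entries.push_back({h, protocol, dev});
  }

  // Delivers a packet of the given protocol to every matching handler.
  void Receive(const Device &from, uint16_t protocol) const {
    for (const auto &e : m_entries) {
      if ((e.protocol == 0 || e.protocol == protocol) &&
          (e.device == nullptr || e.device == &from)) {
        e.handler(from);
      }
    }
  }

private:
  struct Entry { Handler handler; uint16_t protocol; const Device *device; };
  std::vector<Entry> m_entries;
};

// Counts the drops the IPv4-like handler records when one packet arrives on
// an IP-enabled device and one on a hidden tunnel device.
int SpuriousDrops(bool perDevice) {
  Device eth{true};
  Device tunnel{false};
  const Device *all[] = {&eth, &tunnel};
  const uint16_t kIpv4 = 0x0800;  // EtherType for IPv4

  int drops = 0;
  auto ipv4Receive = [&drops](const Device &d) {
    if (!d.ipEnabled) ++drops;  // no IP interface for this device
  };

  Node node;
  if (perDevice) {
    for (const Device *d : all) {
      if (d->ipEnabled) node.Register(ipv4Receive, kIpv4, d);
    }
  } else {
    node.Register(ipv4Receive, kIpv4, nullptr);  // wildcard, as today
  }

  node.Receive(tunnel, kIpv4);
  node.Receive(eth, kIpv4);
  return drops;
}
```

In this model the wildcard registration records a drop for the tunnel packet,
while the per-device registration never sees that packet at all.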

Regards,

-- 
Gustavo J. A. M. Carneiro
INESC Porto, Telecommunications and Multimedia Unit
"The universe is always one step beyond logic." -- Frank Herbert

