[Ns-developers] A question about downcast netdevice pointer.

Gustavo Carneiro gjcarneiro at gmail.com
Sun Sep 28 07:37:20 PDT 2008


*sigh* why do you always have to write such long replies... :P

2008/9/27 <craigdo at ee.washington.edu>

> > Craig, consider this:
> >
> > class Csma : NetDevice;
> > class Wifi : NetDevice;
> >
> > Now you create a mutant netdevice object instance with Csma
> > aggregated:
> >
> > Ptr<Wifi> wifi = CreateObject<Wifi> ();  // 0x123 address
> > Ptr<Csma> csma = CreateObject<Csma> ();  // 0x456 address
> > wifi->AggregateObject (csma);
> >
> > Now, given the 'wifi' pointer, what should
> > wifi->GetObject<NetDevice> ()
> > return?  0x456 or 0x123?  They are both NetDevices...
>
> This is a great example to sharpen what I'm talking about.  I think this
> should be a programming error to do this for exactly the reason you give
> above.  But allow me to use this same idea to construct a slightly more
> real
> example to examine in some detail.  Hopefully this will clarify what I am
> talking about.
>
> Let's say, for the sake of argument, that I don't like the ARP
> implementation in ns-3 for some reason.  I decide that it needs to be
> extended or changed in some way.  A simple way to do this would be to make
> ArpL3Protocol::Lookup virtual, inherit from ArpL3Protocol and create a
> CraigArpL3Protocol that did the lookup differently.  Let's say I want to
> add
> a knob to control the behavior so I define a TypeId ns3::CraigArpL3Protocol
> and add some attributes or methods to the class.  I go into the
> internet-stack.cc and add a function AddCraigArpStack to be symmetrical and
> then go to the InternetStackHelper and add an if .. else to call
> AddCraigArpStack in some cases.
>
> What I expect to have happened, is that I have substituted my ARP for the
> ns-3 ARP.  What I have really done is to have introduced a *very* hard to
> understand bug, and ns-3 will have happily accepted what I just did.
> Really, I don't think there are more than two or three ns-3 developers that
> can predict how this will behave and why; and I know that I have to think
> very deeply about the situation to try and predict what will happen.  This
> is bad (TM).  I don't think it is very arguable that it would be fairly
> simple to catch this problem if you treat the objects involved as having
> presented different discoverable Object interfaces to the system when, in
> fact, they really did.  Let me explain ...
>
> Here's what happened.  ArpL3Protocol inherits from Object and therefore can
> be treated as a discoverable interface.  By that I mean that
> node->GetObject<ArpL3Protocol>() will work.  CraigArpL3Protocol inherits
> from ArpL3Protocol and has a TypeId so
> node->GetObject<CraigArpL3Protocol>()
> will work.  What I did by adding the if .. else in the InternetStackHelper
> was to create the same situation you illustrated above.  The system
> accepted
> the situation because the "last" TypeIds of the two objects (ArpL3Protocol
> and CraigArpL3Prorocol) were different.  The system makes this a problem
> because it walks the inheritance tree in GetObject but doesn't test for
> duplicates in the inheritance tree in AggregateObject -- because the two
> cases are looked at as (very) subtly different.
>
> I'll end up with an aggregation that looks like the illustration below
> (ignoring the Object parts) and is actually implemented as a circular list.
> This is perfectly legal in our existing system since we really do have that
> dynamnic_cast versus Aggregate fuzziness we are focusing on.
>
>  +-> Ptr<Node> -> Ptr<ArpL3Protocol> -> Ptr<ArpL3Protocol +
> CraigArpL3Protocol> -+
>  |
> |
>
>
> +---------------------------------------------------------------------------
> ----+
>
> My question is:  "is there an easily understandable behavior for this
> construction that you can explain to a normal person in a few sentences."
>  I
> submit the answer is no, and you really have to understand how the data
> structures that implement the aggregation work, how they interact with the
> TypeId data structures, how the actual Objects were implemented, from which
> object a GetObject search was started, and the search algorithm to
> completely understand how this aggregation will behave.  This is even
> beyond
> the issue of "once you have a pointer to an object, should you use a cast
> to
> move up and down its inheritance hierarchy."  This tells me we have a real
> conceptual problem here.  You need to be able to document the behavior of
> the API without saying read the implementation ;-)
>
> Let me point out just a few of the (IMO) bizarre aspects of this situation
> (but feel free to read object.cc on your own and write some code):
>
> If you arranged the Aggregation order such that the original ArpL3Protocol
> was aggregated to the node first, a GetObject<ArpL3Protocol> will return a
> pointer to the original ARP Object.  A GetObject<CraigArpL3Protocol> will
> return a pointer to an entirely different Object, the second
> CraigArpL3Protocol.  A GetObject<ArpL3Protocol> on the returned
> CraigArpL3Protocol will return a pointer to the ArpL3Protocol associated
> with the Craig object, not the original ArpL3Protocol object because of the
> search order change.  The end result is that it is quite easy to have
> different pieces of the system believing they are working on the same ARP
> object when, in reality, there are two completely different objects
> underneath.  Now, if you arranged the aggregation order such that the
> CraigArpL3Protocol object was aggregated first, you may get an entirely
> different behavior.  Do you begin to see the litany of entirely different
> behaviors that could be exhibited based on different implementations?
>
> This could all be prevented by looking at the inheritance hierarchy of the
> aggregated objects as the collection of Objects that they are really
> presenting themselves to the system as being; and not trying to look at the
> situation as inheritance from one angle and object aggregation from
> another.
> I think this duality is the cause of the problems.
>
> If the situation where there are multiple ArpL3Protocol objects accessible
> in the aggregation was detected and flagged, the situation above wouldn't
> have happened.  Seeing and preventing this is all much more obvious if you
> start thinking about the ArpL3Protocol + CraigsArpL3Protocol situation as
> two Objects; and as a corollary, it makes sense to look at the NetDevice +
> CsmaNetDevice as two Objects accessible via GetObject and *not* as a
> separate orthogonal situation where dynamic_cast makes more sense.
>
> It is clear to me, at least, that the whole situation is much more easily
> understood and explained if you explicitly *don't* get dynamic_cast
> involved
> in the presence of aggregatable Objects.  So I think we should make the
> situation Gustavo outlines above explicitly a programming error and
> encourage people to look at collections of Objects as aggregations even if
> the implementation is via inheritance.  You don't actually lose anything;
> it's mostly a change of conceptual model.
>
> > I know  this example doesn't make much sense,
> > but I am still convinced GetObject () is wrong
> > for type casting.
>
> I think that this is a slippery slope that can get you into trouble if what
> you are type casting are ns-3 Objects.  I think if Objects are involved you
> should use GetObject.  If Objects are not involved you should use
> dynamic_cast.  I think your suggestion for a DynamicCast template would be
> useful and correct for Ptr-to-things that are not Objects.
>
> > If I understand what you're saying, you think it would be simpler if
> > developers pretended NetDevice and CsmaNetDevice were two
> > separate objects
> > which are aggregated.  I emphatically disagree.  Developers
> > *must know* that
> > NetDevice is a simple base class for CsmaNetDevice, not just another
> > object.  If the developer was not supposed to know it then he would be
> > writing lots of GetObject () calls before calling any
> > NetDevice method;
> > clearly silly and a runtime penalty.
>
> No.  The NetDevice Object "interface" is a generic object and has methods
> defined by the class declaration:
>
>  N1
>  N2
>  etc
>
> The CsmaNetDevice Object has *additional* methods for its "interface":
>
>  N1
>  N2
>  C1
>  C2
>
> If you have a NetDevice wand need to call a "C" method, you do
> GetObject<CsmaNetDevice> you get all of the methods you need.  Why would
> you
> GetObject back and forth any more than you would dynamic_cast back and
> forth
> to call NetDevice->N1 when you have a CsmaNetDevice->N1?
>
> >And it would be more
> > complicated than
> > it needs to: inheritance is a concept all C++ programmers
> > already know and
> > are used to, there's no point in taking it away from them.
>
> I'm not talking about taking inheritance away from anybody.  I'm saying
> that
> adding Object to the mix actually does add something to the mix that needs
> to be considered.  Using these Objects as if there was only inheritance in
> play is, IMO, dangerous and you can fairly easily shoot yourself in the
> foot
> in a very hard to understand way.  I think the only good way to create an
> API model that is understandable by mere mortals (without having to take in
> a lot of underlying implementation detail) is to *not* treat inheritance of
> Objects differently than a generic aggregation of objects.
>
> Outside of adding the one test I spoke about in AggregateObject, and
> preferring GetObject to dynamic cast, it is really only a conceptual
> (mindset) change I'm talking about.  _There_are_no_API_changes_.  I don't
> think anybody loses *anything* and it makes understanding the *whole*
> system
> easier once you answer for yourself the question that started this whole
> exchange: how can there appear to be two Objects aggregated when only one
> C++ object is involved ...


Here's what I think (and I wish you could be as objective as I try to be):

  1- You point out problems that AggregateObject should be detecting and
warning the user, not GetObject.  Swapping dynamic_cast for GetObject would
not help in any additional way that proper assertions in AggregateObject
can't detect;

  2- I am all in favor of AggregateObject detecting errors; in fact, I was
the first to suggest it: http://www.nsnam.org/bugzilla/show_bug.cgi?id=162

  3- I am not aware of any situation where a downcast via dynamic_cast would
not work 100% correctly unless there is already an underlying aggregation
error which would/should be detected by AggregateObject instead;

  4- On one hand you say developers should pretend a subclass is completely
independent from its base class, like they're two different objects
aggregated together.  In that case, how do you explain the code (e.g.) in
packet-socket.cc:

int
PacketSocket::Connect(const Address &ad)
{
[...]
  NotifyConnectionSucceeded ();
  return 0;
 error:
  NotifyConnectionFailed ();
  return -1;
}

  Should we be forced to write GetObject<Socket>
()->NotifyConnectionSucceeded () ?

  5- Finally, you claim there is no downside to using GetObject instead of
dynamic cast.  That is debatable for two reasons.  One, because developers
already know C++ and know how to use dynamic_cast much better than
GetObject, and two because I am not aware of performance comparison between
dynamic_cast and GetObject.  GetObject might be much slower.  When in doubt
I prefer to use dynamic_cast, as it is generated by a compiler with decades
of of experience, hence likely to be better optimized.

  Regards,

-- 
Gustavo J. A. M. Carneiro
INESC Porto, Telecommunications and Multimedia Unit
"The universe is always one step beyond logic." -- Frank Herbert


More information about the Ns-developers mailing list