[Ns-developers] A Reason for Slowness of NS3

George Riley riley at ece.gatech.edu
Fri Feb 13 13:01:04 PST 2009


Hi Adrian,
Thanks for posting this. I recall that early in the design discussions
there was a lengthy debate about the merits of "pass by value" versus
"pass by reference". The decision was, I believe, to use "const
reference" where possible, precisely for the reason you describe below.
The downside is, of course, an additional indirection when reading
values passed by reference. However, we did decide that pass by
reference was generally the best approach. Clearly, this was not always
done, but I would support your suggested change. We may want to make a
more thorough pass looking for any other high-overhead object copies
caused by pass by value.

Tom, feel free to comment.  Is my memory correct on this?

George

--------------------------------------------------
George Riley
Associate Professor
Georgia Tech
Electrical and Computer Engineering
riley at ece.gatech.edu
404-894-4767
Klaus Advanced Computing Building
Room 3360
266 Ferst Drive
Atlanta, Georgia 30332-0765

ECE4110 Web page:
http://users.ece.gatech.edu/~riley/ece4110/

ECE4112 Web page:
http://users.ece.gatech.edu/~riley/ece4112/


On Feb 13, 2009, at 2:58 PM, Adrian Sai-Wah TAM wrote:

> Hi,
>
> Recently I did some profiling of NS3. I compiled everything with the
> optimized build and measured the run time on my laptop running
> Debian. The results follow.
>
> $ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::MapScheduler
> real    0m2.805s
> user    0m2.692s
> sys     0m0.028s
> $ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::Ns2CalendarScheduler
> real    0m2.863s
> user    0m2.740s
> sys     0m0.040s
>
> That does not look bad. However, when I compiled everything into a
> single binary (i.e. not using libns3.so) with the "-pg -ggdb" options
> and ran it through gprof, I found the following:
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  1.81      1.60     0.12 14590560     0.00     0.00  ns3::TypeId::TypeId(ns3::TypeId const&)
>  1.66      1.83     0.11 15729313     0.00     0.00  ns3::TypeId::~TypeId()
>
> What caught my eye is the number of calls. I studied the code and
> found that there are a lot of cases using TypeId and whenever a
> function call is returning TypeId or taking TypeId as a *read only*
> parameter, it is done as follows:
>
> TypeId GetTypeId() const;
> bool function(TypeId in);
> TypeId tid = GetTypeId();
>
> I spent a day changing all such functions to:
>
> const TypeId& GetTypeId() const;
> bool function(const TypeId& in);
> const TypeId& tid = GetTypeId();
>
> That is, I tried to avoid invoking the copy constructor as much as
> possible by using references. After that, the regression tests still
> pass and the new running time is as follows:
>
> $ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::MapScheduler
> real    0m2.788s
> user    0m2.660s
> sys     0m0.052s
> $ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::Ns2CalendarScheduler
> real    0m2.858s
> user    0m2.748s
> sys     0m0.024s
>
> Not a huge improvement, but observable. And what I see in the gprof
> output is:
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  0.49      3.32     0.03  2503869     0.00     0.00  ns3::TypeId::~TypeId()
>  0.33      3.85     0.02  1365059     0.00     0.00  ns3::TypeId::TypeId(ns3::TypeId const&)
>
> which is roughly an order of magnitude fewer calls.
>
> This suggests that the performance of NS3 can be improved by avoiding
> copy constructors wherever a reference can be used instead. TypeId is
> not the only case. I can provide the diff file of my changes, to see
> whether other people agree that this change is needed to make the
> code nicer.
>
> - Adrian.
