[Ns-developers] A Reason for Slowness of NS3

Adrian Sai-Wah TAM adrian at ieaa.org
Fri Feb 13 11:58:39 PST 2009


Hi,

Recently I did some profiling of NS3. I compiled everything with the
optimized build and measured the running time on my laptop, which runs
Debian. The results are as follows.

$ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::MapScheduler
real    0m2.805s
user    0m2.692s
sys     0m0.028s
$ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::Ns2CalendarScheduler
real    0m2.863s
user    0m2.740s
sys     0m0.040s

That does not look bad. However, when I compiled everything into a
single binary (i.e. without libns3.so) with the "-pg -ggdb" options and
ran it through gprof, I found the following:

 %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 1.81      1.60     0.12 14590560     0.00     0.00  ns3::TypeId::TypeId(ns3::TypeId const&)
 1.66      1.83     0.11 15729313     0.00     0.00  ns3::TypeId::~TypeId()

What caught my eye is the number of calls. I studied the code and
found that TypeId is used in many places, and whenever a function
returns a TypeId or takes a TypeId as a *read only* parameter, it is
written as follows:

TypeId GetTypeId() const;
bool function(TypeId in);
TypeId tid = GetTypeId();

I spent a day changing all such functions into:

const TypeId& GetTypeId() const;
bool function(const TypeId& in);
const TypeId& tid = GetTypeId();

That is, I tried to avoid invoking the copy constructor as much as
possible by using references. After that, the regression tests still
pass and the new running time is as follows:

$ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::MapScheduler
real    0m2.788s
user    0m2.660s
sys     0m0.052s
$ time LD_LIBRARY_PATH=~/ns-3-dev/build/optimized ./csma-star --SchedulerType=ns3::Ns2CalendarScheduler
real    0m2.858s
user    0m2.748s
sys     0m0.024s

Not a huge improvement, but observable. The gprof output now shows:

 %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 0.49      3.32     0.03  2503869     0.00     0.00  ns3::TypeId::~TypeId()
 0.33      3.85     0.02  1365059     0.00     0.00  ns3::TypeId::TypeId(ns3::TypeId const&)

which is roughly an order of magnitude fewer calls (about 10.7 times
fewer copy constructions and about 6.3 times fewer destructions).
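
The effect on the call counts can be reproduced in isolation with a
small sketch. It does not use the real ns3::TypeId; CountedId, Holder,
MatchesByValue, MatchesByRef, CopiesByValue and CopiesByRef are all
hypothetical names made up for illustration. The copy constructor
increments a counter, so the two calling styles can be compared directly:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a small handle class such as TypeId.
class CountedId
{
public:
  explicit CountedId (unsigned uid) : m_uid (uid) {}
  // Count every copy construction.
  CountedId (const CountedId &o) : m_uid (o.m_uid) { ++s_copies; }
  unsigned GetUid () const { return m_uid; }
  static std::size_t s_copies; // copy-constructor calls so far
private:
  unsigned m_uid;
};
std::size_t CountedId::s_copies = 0;

// Hypothetical owner that keeps an id as a member.
class Holder
{
public:
  explicit Holder (unsigned uid) : m_id (uid) {}
  CountedId GetIdByValue () const { return m_id; }       // copies m_id
  const CountedId &GetIdByRef () const { return m_id; }  // no copy
private:
  CountedId m_id;
};

// Read-only parameter taken by value: copies the argument.
bool MatchesByValue (CountedId in, unsigned uid) { return in.GetUid () == uid; }
// Read-only parameter taken by const reference: no copy.
bool MatchesByRef (const CountedId &in, unsigned uid) { return in.GetUid () == uid; }

// Number of copies made by n iterations of the by-value style.
std::size_t CopiesByValue (std::size_t n)
{
  Holder h (7);
  CountedId::s_copies = 0;
  for (std::size_t i = 0; i < n; ++i)
    {
      CountedId tid = h.GetIdByValue ();
      bool ok = MatchesByValue (tid, 7);
      assert (ok);
      (void) ok;
    }
  return CountedId::s_copies;
}

// Number of copies made by n iterations of the by-reference style.
std::size_t CopiesByRef (std::size_t n)
{
  Holder h (7);
  CountedId::s_copies = 0;
  for (std::size_t i = 0; i < n; ++i)
    {
      const CountedId &tid = h.GetIdByRef ();
      bool ok = MatchesByRef (tid, 7);
      assert (ok);
      (void) ok;
    }
  return CountedId::s_copies;
}
```

CopiesByRef makes no copies at all, while CopiesByValue makes at least
one copy per iteration. Note that returning a const reference is only
safe when the referenced object outlives the call, as the member m_id
does here; a function that builds its result in a local variable still
has to return by value.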

The lesson is that the performance of NS3 can be improved by avoiding
copy constructors wherever a reference can be used instead. TypeId is
not the only case. I can provide the diff of my changes, to see whether
other people agree that this is a worthwhile change to make the code nicer.

- Adrian.

