[Ns-developers] [ns3] Statistical framework (draft)

Tom Henderson tomh at tomh.org
Sun Apr 27 22:15:06 PDT 2008


Vincent Gauthier wrote:
> Hi all,
> 
> I am proposing to start a statistical framework implementation. The aim 
> of this framework is to provide easy access to all the pre-defined 
> variables in each layer of the simulator, offering users a set of the 
> most commonly used methods for statistical analysis (means, medians, 
> confidence intervals) and giving them a friendly way to analyze the 
> outputs.

Vincent,
Glad to hear of your interest in this topic, and thanks for proposing 
some plans.

> 
> The proposal is driven by three ideas:
> - All available statistical variables should be defined in each layer of 
> the simulator,

Can you clarify what you mean by a statistical variable here, and how 
you envision it working?  For instance, suppose that I have a variable 
such as a TCP congestion window value.  It takes on different discrete 
values over time.  Are you talking about this low-level variable, or 
about some other class that wraps this such as, e.g.,

class TcpSndCwndStatistics : public Statistics
{
public:
   // Record one observed congestion window value.
   void AddSample (uint32_t cwnd);
   double GetMean () const;
   double GetStandardDeviation () const;
   ...
};
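
To make the wrapper idea a bit more concrete, here is a minimal standalone 
sketch of how such a class might keep a running mean and standard deviation 
in constant memory, using Welford's update. The empty Statistics base class, 
the double-valued sample, and the GetCount() accessor are assumptions made 
for illustration only; none of this is existing ns-3 code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the Statistics base class named above.
class Statistics
{
public:
  virtual ~Statistics () = default;
};

// Running mean and standard deviation in O(1) memory (Welford's update).
class TcpSndCwndStatistics : public Statistics
{
public:
  void AddSample (double cwnd)
  {
    ++m_count;
    double delta = cwnd - m_mean;
    m_mean += delta / m_count;
    m_m2 += delta * (cwnd - m_mean);
  }
  uint64_t GetCount () const { return m_count; }
  double GetMean () const
  {
    assert (m_count > 0);
    return m_mean;
  }
  // Sample standard deviation (divides by n - 1).
  double GetStandardDeviation () const
  {
    assert (m_count > 1);
    return std::sqrt (m_m2 / (m_count - 1));
  }
private:
  uint64_t m_count = 0;
  double m_mean = 0.0;
  double m_m2 = 0.0; // sum of squared deviations from the running mean
};
```

The point of the sketch is only that the per-sample cost is a handful of 
arithmetic operations and no samples are stored.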

> - The statistical framework helps to define the variables properly and 
> display them (give output),
> - The variables are processed at the end of the simulation: the 
> framework is in charge of gathering all the statistical variables, 
> processing them, and pushing the output to another interface that is 
> more meaningful for the user (GUI, file, etc.).

It would also be nice if the framework could terminate the simulation 
when certain conditions are met, rather than wait for the user-specified 
end of simulation.
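
One possible stopping condition, sketched here purely as an assumption, is 
"stop once the confidence interval of some statistic is tight enough". The 
RunningSummary struct and PrecisionReached() function below are hypothetical 
names, and the 1.96 quantile assumes a roughly normal mean estimate from many 
approximately independent samples, which simulation output does not always 
satisfy.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Summary of a running statistic; field names are illustrative.
struct RunningSummary
{
  uint64_t count;
  double mean;
  double stddev; // sample standard deviation
};

// Returns true once the approximate 95% confidence interval half-width is
// within relTol of the mean, and at least minSamples have been collected.
bool PrecisionReached (const RunningSummary &s, double relTol, uint64_t minSamples)
{
  assert (relTol > 0.0);
  if (s.count < minSamples || s.count < 2 || s.mean == 0.0)
    {
      return false;
    }
  double halfWidth = 1.96 * s.stddev / std::sqrt (static_cast<double> (s.count));
  return halfWidth <= relTol * std::fabs (s.mean);
}
```

A framework could evaluate such a predicate periodically and ask the 
simulator to stop when it returns true.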

> 
> Consequently, it will be easier for any add-on to the simulator to 
> add its own set of variables to the ones previously defined in other 
> layers/modules. At the end of the simulation, end users will have 
> access to the full set of information in a raw format.

Do you have suggestions on what this format might look like?  Are there 
existing formats that would be beneficial for existing post-processing 
tools?

> 
> Two main sorts of statistical analysis can be performed by the framework.
> The first needs no memory beyond the space needed to store the 
> variable itself:
> - Continuous mean (counter + begin and end time of the measurement),
> - Discrete mean (counter + number of samples),
> - Counter (simple variable),
> - Min/Max (store the minimum and maximum values of a variable over the 
> simulation time),
> - Confidence intervals,
> - Among others (to be defined).
> 
> The second needs extra memory (and incurs some overhead), for example:
> - Continuous evolution of the variable X (e.g., the real-time throughput 
> of flow #Y over the period T),
> - Discrete evolution of the variable X (e.g., the real-time throughput of 
> TCP over a sample time dt),
> - More to be defined.
> 
> Because of their different impact on simulator performance, each group 
> must be handled differently. The first type adds essentially no 
> simulation overhead (no more than updating a variable); in contrast, 
> the second type could add memory overhead and slow the simulation. We 
> propose to perform the statistical analysis for all variables of the 
> first group in every simulation, without the need to enable or disable 
> the computation (except for the output). For the second type, we should 
> perform the task only on demand, to avoid extra overhead when the end 
> user doesn't use or need it.
> 
> I would appreciate your comments and feedback; feel free to include 
> any ideas I didn't mention (how to properly integrate it into the 
> existing NS-3 structure, and so on).
> 
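
The split between the two groups above can be illustrated with a small 
standalone sketch. TimeWeightedMean is one guess at what the "continuous 
mean" could look like for a piecewise-constant variable (an accumulated area 
plus begin and end times), and TimeSeries is an opt-in recorder whose memory 
grows with the number of samples. Both class names and their interfaces are 
hypothetical, and times are plain doubles rather than simulator time objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Group 1: time-weighted ("continuous") mean of a piecewise-constant
// variable, kept in O(1) memory.
class TimeWeightedMean
{
public:
  TimeWeightedMean (double startTime, double initialValue)
    : m_start (startTime), m_last (startTime), m_value (initialValue) {}

  // Record that the variable changed to newValue at time now.
  void Update (double now, double newValue)
  {
    assert (now >= m_last);
    m_area += m_value * (now - m_last);
    m_last = now;
    m_value = newValue;
  }
  // Mean over [start, now], counting the current value up to now.
  double GetMean (double now) const
  {
    assert (now > m_start && now >= m_last);
    return (m_area + m_value * (now - m_last)) / (now - m_start);
  }
private:
  double m_start;
  double m_last;
  double m_value;
  double m_area = 0.0;
};

// Group 2: full time series. Memory grows with the number of samples,
// so nothing is recorded unless the user explicitly enables it.
class TimeSeries
{
public:
  void Enable () { m_enabled = true; }
  void Record (double now, double value)
  {
    if (m_enabled)
      {
        m_samples.push_back ({now, value});
      }
  }
  std::size_t Size () const { return m_samples.size (); }
private:
  struct Sample { double time; double value; };
  bool m_enabled = false;
  std::vector<Sample> m_samples;
};
```

With this shape, a disabled TimeSeries costs one branch per update, which 
seems close to the "on demand" behavior described for the second group.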

I think it will be helpful if people interested in this could try to 
develop a strawman API, use cases, and implementation framework for this 
feature.

As background, I think you will want to become familiar with a few 
topics:  ns-3 tracing, ns-3 attributes, and ns-3 object aggregation.

I have been assuming that the ns-3 tracing framework could be exploited 
to compute statistics.  The tracing framework is callback-based, and 
would allow users to hook a statistics-collecting object to the callback 
rather than (or in addition to) hooking a tracing sink that prints out 
traces.
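
As a rough sketch of that hookup, the standalone code below connects a 
statistics collector to a callback-based source. CwndTraceSource and 
CwndMaxCollector are hypothetical stand-ins built on std::function; they are 
not the ns-3 tracing classes, and the real connection mechanism in ns-3 looks 
different.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical trace source: notifies every connected callback with the
// old and new value. This is a stand-in, not the ns-3 tracing API.
class CwndTraceSource
{
public:
  using Callback = std::function<void (uint32_t oldValue, uint32_t newValue)>;
  void Connect (Callback cb)
  {
    assert (cb != nullptr);
    m_callbacks.push_back (std::move (cb));
  }
  void Set (uint32_t newValue)
  {
    for (const auto &cb : m_callbacks)
      {
        cb (m_value, newValue);
      }
    m_value = newValue;
  }
private:
  uint32_t m_value = 0;
  std::vector<Callback> m_callbacks;
};

// Statistics-collecting sink: counts changes and tracks the maximum seen.
class CwndMaxCollector
{
public:
  void OnChange (uint32_t /*oldValue*/, uint32_t newValue)
  {
    ++m_changes;
    if (newValue > m_max)
      {
        m_max = newValue;
      }
  }
  uint32_t GetMax () const { return m_max; }
  uint32_t GetChanges () const { return m_changes; }
private:
  uint32_t m_max = 0;
  uint32_t m_changes = 0;
};
```

A collector would be attached with something like 
src.Connect ([&stats] (uint32_t o, uint32_t n) { stats.OnChange (o, n); }), 
and a printing sink could be connected to the same source alongside it.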

Secondly, the issue of documentation is important, and for this we are 
providing the attribute system that allows variables and trace sources 
to be exported and documented in an organized fashion.  For instance, 
the list of current trace sources (to which one might hook 
statistics-collecting objects) can be found at:
http://www.nsnam.org/doxygen/group___trace_source_list.html

It has been our hope that users will be able to edit the core and 
convert other objects that they find interesting into their own custom 
trace sources and export them, but we should try to provide the commonly 
used sources without requiring any core recompilation.

Third, potentially one will end up with a lot of statistics objects and 
even low-level counters that may be optionally included in a simulation. 
We have a facility in ns-3 (object aggregation) that might be useful 
to "hang" statistics objects from the regular objects.  For instance, 
consider the statistics found in a TCP MIB or EStats MIB.  Maybe not 
every simulation carries this object around but when statistics are 
enabled, a TCP MIB object is aggregated to the Node and the TCP code 
calls out to this MIB object if it happens to be there.
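
The "call out if it happens to be there" pattern might look roughly like 
the standalone sketch below. The Node class here is a minimal stand-in with 
hypothetical Aggregate() and Get<T>() methods, and the TcpMib fields are 
invented for illustration; this is not the ns-3 object aggregation code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical MIB-like counter block; field names are illustrative.
struct TcpMib
{
  uint64_t segmentsSent = 0;
  uint64_t retransmits = 0;
};

// Minimal stand-in for a node that can carry optional aggregated objects,
// keyed by their type.
class Node
{
public:
  template <typename T>
  void Aggregate (std::shared_ptr<T> obj)
  {
    assert (obj != nullptr);
    m_objects[std::type_index (typeid (T))] = obj;
  }
  // Returns nullptr when no object of type T has been aggregated.
  template <typename T>
  std::shared_ptr<T> Get () const
  {
    auto it = m_objects.find (std::type_index (typeid (T)));
    if (it == m_objects.end ())
      {
        return nullptr;
      }
    return std::static_pointer_cast<T> (it->second);
  }
private:
  std::unordered_map<std::type_index, std::shared_ptr<void>> m_objects;
};

// TCP code path: update the MIB only if one happens to be attached.
void OnSegmentSent (const Node &node, bool isRetransmit)
{
  if (auto mib = node.Get<TcpMib> ())
    {
      ++mib->segmentsSent;
      if (isRetransmit)
        {
          ++mib->retransmits;
        }
    }
}
```

Simulations that never attach a TcpMib pay only for the lookup, while those 
that do get the counters without any change to the TCP call site.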

I don't know how many of these features are ultimately going to be used 
in the statistics framework but I wanted to call them out for some 
consideration.

Regards,
Tom

