[Ns-developers] [ns3] Statistical framework (draft)
Vincent Gauthier
vglist at mac.com
Mon Apr 28 09:00:36 PDT 2008
Hi Tom,
On 28 Apr 2008, at 07:15, Tom Henderson wrote:
> Vincent Gauthier wrote:
>> Hi all,
>> I am proposing to start a statistical framework implementation.
>> The aim of this framework is to provide easy access to all the
>> predefined variables in each layer of the simulator, to offer
>> users a set of the most commonly used methods for statistical
>> analysis (means, medians, confidence intervals), and to give
>> users a friendly way to analyze the outputs.
>
> Vincent,
> Glad to hear of your interest in this topic, and thanks for
> proposing some plans.
>
>> The proposal is driven by three ideas:
>> - All available statistical variables should be defined in each
>> layer of the simulator,
>
> Can you clarify what you mean by a statistical variable here, and
> how you envision it working? For instance, suppose that I have a
> variable such as a TCP congestion window value. It takes on
> different discrete values over time. Are you talking about this
> low-level variable, or about some other class that wraps it, such
> as, e.g.,
>
> class TcpSndCwndStatistics : public Statistics
> {
> public:
> AddSample ();
> GetMean();
> GetStandardDeviation();
> ...
> }
What I meant by a statistics variable is more like a class that wraps
other classes, as in the following examples. The statistical framework
takes the low-level values and processes them according to their type.
The type of the variable, e.g. RxWindowsSizeStats, determines the
processing done by the "update" method. If the end user wants the
window size of a TCP flow as a function of time, we need to store each
window-size sample together with its timestamp in a vector. But in
many other cases, as in the second example, we don't need to do that:
the end user may just want a mean or a total, and we don't have to
save all the samples.
--------
class TcpStatistics : public Statistics
{
public:
  Ptr<StatisticsVar> RxWindowsSizeStats;
  .....
};

class Tcp : public Statistics
{
public:
  Tcp ()
  {
    // Gets the type of every variable defined in stats (counter, mean,
    // discrete sample)
    stats->initialize ();
  }
  ~Tcp ()
  {
    // Sends all the data to a general statistical repository for
    // further processing
    stats->sendToRepository ();
  }
  void UpdateWindowsSize ()
  {
    ...
    // Gets the type of all the variables defined in the configuration
    // files or command file (counter, mean, discrete sample, etc.)
    stats->RxWindowsSizeStats->update (m_rxWindowSize, Simulator::Now ());
  }
private:
  ...
  uint32_t m_rxWindowSize;
  Ptr<TcpStatistics> stats;
};
-------
// In the code now:
class Queue : public Object
{
  .....
private:
  uint32_t m_nBytes;
  uint32_t m_nTotalReceivedBytes;
  uint32_t m_nPackets;
  uint32_t m_nTotalReceivedPackets;
  uint32_t m_nTotalDroppedBytes;
  uint32_t m_nTotalDroppedPackets;
};

// Could look like this.
// Here we just update the variables m_nBytes and m_nPackets, and we
// can compute the average or just the total number of bytes received
// during the simulation without storing every sample.
class Queue : public Object
{
  .....
public:
  bool Enqueue (Ptr<Packet> p)
  {
    stats->m_nBytes += p->GetSize ();
    stats->m_nPackets++;
    return true;
  }
private:
  Ptr<QueueStatistics> stats;
};
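
To make the difference between the two kinds of variables concrete, here
is a minimal, self-contained sketch. It does not use any ns-3 classes:
the name StatisticsVar is borrowed from the example above, but the
interface (Update, GetMean, GetSamples) and the MeanVar / TimeSeriesVar
classes are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical base class: one sample is a value plus the simulation
// time (in seconds) at which it was observed.
class StatisticsVar
{
public:
  virtual ~StatisticsVar () {}
  virtual void Update (double value, double t) = 0;
};

// Constant memory: keeps only a running sum and a sample count.
class MeanVar : public StatisticsVar
{
public:
  void Update (double value, double /* t */) override
  {
    m_sum += value;
    ++m_count;
  }
  double GetMean () const { return m_count ? m_sum / m_count : 0.0; }
  uint64_t GetCount () const { return m_count; }
private:
  double m_sum = 0.0;
  uint64_t m_count = 0;
};

// Memory grows with the run length: keeps every (time, value) pair.
class TimeSeriesVar : public StatisticsVar
{
public:
  void Update (double value, double t) override
  {
    // Samples are expected in non-decreasing time order.
    assert (m_samples.empty () || t >= m_samples.back ().first);
    m_samples.emplace_back (t, value);
  }
  const std::vector<std::pair<double, double>> &GetSamples () const
  {
    return m_samples;
  }
private:
  std::vector<std::pair<double, double>> m_samples;
};
```

With something like this, the Tcp class would call the same Update
method in both cases, and the concrete type chosen for
RxWindowsSizeStats would decide whether samples are kept or only
accumulated.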
>
>
>> - The statistical framework helps to define the variables
>> properly and to display them (produce output),
>> - The variables are processed at the end of the simulation: the
>> framework is in charge of gathering all the statistical variables,
>> processing them, and pushing the output to another interface
>> more meaningful to the user (GUI, file, etc.).
>
> It would also be nice if the framework could terminate the
> simulation when certain conditions are met, rather than wait for
> the user-specified end of simulation.
That should not be very hard to implement. We just need to create a
new type of variable (similar to the confidence interval), specified by
the user, that sends a callback to the simulator when the requested
confidence interval precision is reached.
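
As a rough sketch of what such a variable could look like: the class
below keeps a running mean and variance (Welford's method) and fires a
user-supplied callback once the half-width of an approximate 95%
confidence interval drops below a target. The class name, the
constructor parameters, the normal approximation (factor 1.96) and the
minimum sample count are all assumptions for illustration, not an
existing ns-3 API.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <functional>
#include <limits>
#include <utility>

// Hypothetical variable that signals when the mean is known precisely enough.
class ConfidenceIntervalVar
{
public:
  ConfidenceIntervalVar (double targetHalfWidth,
                         std::function<void ()> onReached,
                         uint64_t minSamples = 30)
    : m_target (targetHalfWidth),
      m_onReached (std::move (onReached)),
      m_minSamples (minSamples)
  {
    assert (targetHalfWidth > 0.0);
  }

  void Update (double x)
  {
    // Welford's running update of the mean and the sum of squared deviations.
    ++m_n;
    double delta = x - m_mean;
    m_mean += delta / m_n;
    m_m2 += delta * (x - m_mean);

    // Fire the callback once, the first time the precision is reached.
    if (!m_fired && m_n >= m_minSamples && GetHalfWidth () <= m_target)
      {
        m_fired = true;
        if (m_onReached)
          {
            m_onReached ();
          }
      }
  }

  double GetMean () const { return m_mean; }

  // Half-width of an approximate 95% interval (normal approximation).
  double GetHalfWidth () const
  {
    if (m_n < 2)
      {
        return std::numeric_limits<double>::infinity ();
      }
    double variance = m_m2 / (m_n - 1);
    return 1.96 * std::sqrt (variance / m_n);
  }

private:
  double m_target;
  std::function<void ()> m_onReached;
  uint64_t m_minSamples;
  uint64_t m_n = 0;
  double m_mean = 0.0;
  double m_m2 = 0.0;
  bool m_fired = false;
};
```

In the simulator, the callback could for instance call
Simulator::Stop (), so that the run ends as soon as the requested
precision is reached rather than at the user-specified end time.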
>
>
>> Consequently, it will be easier for any add-on to the simulator to
>> add its own set of variables to the ones previously defined
>> in other layers/modules. At the end of the simulation, end
>> users will have access to the whole set of information in a raw format.
>
> Do you have suggestions on what this format might look like? Are
> there existing formats that would be beneficial for existing post-
> processing tools?
The format of the main statistical output could be something
hierarchical like this (from the networking point of view):
Network Layer, protocol name, id, variable name, type, value, units:
Mac Layer, IEEE 802.11, MAC_ADDR, RxBytes, mean, XXXX, Bytes/s
Network layer, IPv4, IP_ADDR, NbPacketsReceived, Total, XXX, none
....etc
or something more meaningful from the software's point of view:
/Statistics/NodeList/0/DeviceList/0/TxQueue/NbTxPackets XXX
...etc
or an XML file, but I am not sure that would be the easiest format for
most users to parse.
For a variable like the TCP window size as a function of time, we
create a file in which we put all the samples in columns.
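
To illustrate the first two layouts, here is a small sketch of how one
record could be turned into a line of either form. The StatRecord
struct, the two function names and the sample values are made up for the
example; only the field order and the path layout come from the formats
described above.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical record holding the fields of the comma-separated layout.
struct StatRecord
{
  std::string layer;
  std::string protocol;
  std::string id;
  std::string name;
  std::string type;
  double value;
  std::string units;
};

// "layer, protocol, id, variable name, type, value, units"
std::string ToCsvLine (const StatRecord &r)
{
  assert (!r.name.empty ());
  std::ostringstream os;
  os << r.layer << ", " << r.protocol << ", " << r.id << ", "
     << r.name << ", " << r.type << ", " << r.value << ", " << r.units;
  return os.str ();
}

// "/Statistics/NodeList/<n>/DeviceList/<d>/<object>/<variable> <value>"
std::string ToPathLine (uint32_t node, uint32_t device,
                        const std::string &object,
                        const std::string &name, double value)
{
  std::ostringstream os;
  os << "/Statistics/NodeList/" << node << "/DeviceList/" << device
     << "/" << object << "/" << name << " " << value;
  return os.str ();
}
```

Both forms are one record per line, so they stay easy to process with
line-oriented tools; the path form has the advantage of matching the way
objects are addressed inside the simulator.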
>
>
>> Two main sorts of statistical analysis can be performed by the
>> framework.
>> The first sort needs no extra memory beyond the space needed to
>> store the variable itself:
>> - Continuous mean (counter + begin and end time of the measurement),
>> - Discrete mean (counter + number of samples),
>> - Counter (simple variable),
>> - Min/Max (store the minimum or maximum value of a variable over the
>> simulation time),
>> - Confidence intervals,
>> - Among others (to be defined).
>> The second sort needs extra memory space (and some
>> overhead), for example:
>> - Continuous evolution of the variable X (e.g. the real-time
>> throughput of flow #Y over the period T),
>> - Discrete evolution of the variable X (e.g. the real-time
>> throughput of TCP over a sample time dt),
>> - More to be defined.
>> Due to their different impact on simulator performance, each
>> group must be handled differently. The first type doesn't raise
>> any issue of simulation overhead (no more than updating a
>> variable); on the contrary, the second type of variable could
>> lead to a certain amount of memory overhead and slow down the
>> simulation. We propose to perform the statistical analysis for
>> all variables of the first group in all simulations, without
>> the need to enable or disable the computation (except for the
>> output). For the second type, we should perform the task on demand
>> to avoid extra overhead if the end user doesn't use or need it.
>> I would appreciate your comments and feedback, and feel free to
>> add ideas I didn't mention (how to properly integrate it into
>> the existing NS-3 structure and so on).
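
As an illustration of the "continuous mean" from the quoted list, the
sketch below keeps a time-weighted average in constant memory: it only
stores the accumulated area under the variable, the last value and a
few timestamps. The class name and its interface are assumptions made
for this example, and times are plain seconds rather than ns-3 Time
objects.

```cpp
#include <cassert>

// Hypothetical constant-memory, time-weighted ("continuous") mean.
class TimeWeightedMean
{
public:
  // Record that the variable took the value newValue at time t.
  void Update (double newValue, double t)
  {
    if (m_started)
      {
        assert (t >= m_lastTime);
        // The previous value was held from m_lastTime until t.
        m_area += m_lastValue * (t - m_lastTime);
      }
    else
      {
        m_startTime = t;
        m_started = true;
      }
    m_lastValue = newValue;
    m_lastTime = t;
  }

  // Mean over [start, endTime], holding the last value until endTime.
  double GetMean (double endTime) const
  {
    if (!m_started || endTime <= m_startTime)
      {
        return 0.0;
      }
    assert (endTime >= m_lastTime);
    double area = m_area + m_lastValue * (endTime - m_lastTime);
    return area / (endTime - m_startTime);
  }

private:
  bool m_started = false;
  double m_startTime = 0.0;
  double m_lastTime = 0.0;
  double m_lastValue = 0.0;
  double m_area = 0.0;
};
```

For example, a queue length of 2 from t = 0 to t = 1 and 4 from t = 1
to t = 4 has a time-weighted mean of (2*1 + 4*3) / 4 = 3.5, whereas the
discrete mean of the two samples is 3. Both can be maintained with a
handful of scalars, which is why they belong to the first group.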
>
> I think it will be helpful if people interested in this could try to
> develop a strawman API, use cases, and implementation framework for
> this feature.
Right, that seems to be a good way forward.
>
>
> As background, I think you will want to become familiar with a few
> topics: ns-3 tracing, ns-3 attributes, and ns-3 object aggregation.
>
> I have been assuming that the ns-3 tracing framework could be
> exploited to compute statistics. The tracing framework is callback-
> based, and would allow users to hook a statistics-collecting object
> to the callback rather than (or in addition to) hooking a tracing
> sink that prints out traces.
>
> Secondly, the issue of documentation is important, and for this we
> are providing the attribute system that allows variables and trace
> sources to be exported and documented in an organized fashion. For
> instance, the list of current trace sources (to which one might hook
> statistics-collecting objects) can be found at:
> http://www.nsnam.org/doxygen/group___trace_source_list.html
>
> It has been our hope that users will be able to edit the core and
> convert other objects that they find interesting into their own
> custom trace sources and export them, but we should try to provide
> the commonly used sources without requiring any core recompilation.
>
> Third, potentially one will end up with a lot of statistics objects
> and even low-level counters that may be optionally included in a
> simulation. We have a facility in ns-3 (object aggregation) that
> might be useful to "hang" statistics objects from the regular
> objects. For instance, consider the statistics found in a TCP MIB
> or EStats MIB. Maybe not every simulation carries this object
> around but when statistics are enabled, a TCP MIB object is
> aggregated to the Node and the TCP code calls out to this MIB
> object if it happens to be there.
>
> I don't know how many of these features are ultimately going to be
> used in the statistics framework but I wanted to call them out for
> some consideration.
>
> Regards,
> Tom
Thanks for your advice. I am going to look into your comments
and the things you pointed out, and get back to you.
Regards,
Vincent