[Ns-developers] [ns3] Statistical framework (draft)
Joseph Kopena
tjkopena at cs.drexel.edu
Mon Apr 28 07:20:47 PDT 2008
On Fri, Apr 25, 2008 at 7:39 AM, Vincent Gauthier <vglist at mac.com> wrote:
> I am proposing to start a statistical framework implementation. The aim of
> this framework is to provide an easy access to all the pre-defined variables
> in each layer of the simulator, offering to the users the most used a set of
> methods to perform statistical analysis (means, medians, confidence
> intervals) and giving to users a friendly way to analyze the outputs.
Hi Vincent,
This all sounds good to me, and it's definitely a great thing to work
on. I could certainly use it! I'm also interested in helping with
this task.
Comments so far:
- How are you envisioning the framework being bound to the variables?
My conception was that the bulk of the statistics framework would be a
generic box collecting input and producing numbers, without
particularly caring what those inputs are. Sim writers can then
easily use it for their own metrics, an important goal, as well as
connect it to "internal" simulation data, such as buffer drops, frames
generated, etc.
For example, in simulations I wrote a little bit ago, I wanted to stop
logging everything as that was my biggest slowdown and I did not need
all the data. Instead, I made a simple (global) object that keeps a
set of counts, keyed by a tag. For events I care about, I write some
code updating the count for that tag. For example, "frame" gets updated
by 1 every time a frame is sent by a netdevice, "bytes" gets updated with
the bytes for each frame, "registration" whenever a service is heard
by a broker, etc. At the end of the simulation, the stat counter
dumps all the counts for all the fields, which is recorded and then
collated by a set of scripts to aggregate multiple trials at multiple
node densities, etc.
That's obviously a very simple "stats framework," but that's the basic
approach I've been thinking of. The key point is not binding to
predefined variables, and only collecting for things you're interested
in. The big next steps from that would be to utilize the tracing
framework instead of some ad hoc approach, and have more options than
simply counting (timing, means, etc.).
In all, it sounds similar to what you've described; I'm just not really
clear on how you see variables getting defined, etc.
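
A minimal sketch of the tagged counter described above, written in Python for brevity rather than as ns-3 code. The class name StatCounter, the dump format, and the on_frame_sent hook are all assumptions made for illustration; they are not part of ns-3 or of the original simulations.

```python
from collections import Counter


class StatCounter:
    """Counts keyed by a free-form tag; nothing is predefined."""

    def __init__(self):
        self._counts = Counter()

    def update(self, tag, amount=1):
        # Tags are created on first use, so only events of interest are tracked.
        self._counts[tag] += amount

    def dump(self):
        # One "tag value" line per tag, sorted so the output order is stable.
        return "\n".join(
            f"{tag} {value}" for tag, value in sorted(self._counts.items())
        )


# A single shared instance, mirroring the global object described above.
stats = StatCounter()


def on_frame_sent(frame_bytes):
    # Hypothetical hook; in a real simulation a trace source would drive this.
    stats.update("frame")
    stats.update("bytes", frame_bytes)
```

Calling stats.dump() at the end of a run yields the text that the collation scripts would later pick up. Because the counter does not care where its updates come from, the same object serves user-defined metrics and internal simulation events alike.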
- I was going to say that actual statistic calculation in many cases
could be done offline by scripts included in the stats package, but I
like Tom's idea of being able to stop simulations based on hitting
some condition. That could also be used to provide output to a
loosely coupled realtime GUI visualization, but I personally am more
interested in output for papers than GUIs.
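
One way the stop-on-condition idea could look, as a sketch only: statistics are updated online (Welford's algorithm here) and the run ends once a confidence interval is tight enough. The names RunningStat and should_stop, the 1.96 normal quantile, and the min_samples guard are assumptions for illustration, not an existing ns-3 interface.

```python
import math


class RunningStat:
    """Online mean and variance (Welford's algorithm); no samples are stored."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance; undefined for fewer than two samples.
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")

    def half_width(self, z=1.96):
        # Normal-approximation interval; a t quantile is better for small n.
        if self.n < 2:
            return float("inf")
        return z * math.sqrt(self.variance() / self.n)


def should_stop(stat, tolerance, min_samples=30):
    """Hypothetical stop condition: interval half-width is within tolerance."""
    return stat.n >= min_samples and stat.half_width() <= tolerance
```

A simulation loop would call add() on each observation and check should_stop() periodically. The same running values could be pushed to a loosely coupled visualization without changing the collector.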
- A large part of the statistical package that would greatly benefit
users is management of data over multiple runs: both repeated trials
and changing setups. In those sims I was talking about above,
all that stat data simply goes to text files. A bunch of Perl scripts
(yes, yes, moving on...) manage running the simulation, looping
through repeats, changing command line variables, etc., and producing
stat files under a given naming convention. Other scripts then use
that naming convention to collect all the data and collate it into
gnuplot-ready form, produce confidence intervals, etc. Something like
that would be a workable approach. XML output would be trivial to
add, and slightly more formal. Database integration to do that
storage, collection, and manipulation, which has been discussed for
ns-3 before, would also be a good idea. In my experience, this data
management is the most cumbersome part of working with simulation
statistics, so it bears at least as much focus as the "stat framework"
inside the simulation.
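
The collation step could be sketched roughly as follows. The file naming convention (density-<N>_trial-<K>.stats), the "tag value" file format, and every function name here are assumptions for illustration; the original Perl scripts are not shown in this message and may work quite differently.

```python
import math
import re
import statistics
from collections import defaultdict
from pathlib import Path

# Assumed naming convention (hypothetical): density-<N>_trial-<K>.stats
NAME_RE = re.compile(r"density-(\d+)_trial-(\d+)\.stats$")


def parse_stats(text):
    """Parse 'tag value' lines, as a stat counter might dump them."""
    pairs = (line.split() for line in text.splitlines() if line.strip())
    return {tag: float(value) for tag, value in pairs}


def collate(named_texts, tag):
    """Group one tag's values by node density across repeated trials."""
    by_density = defaultdict(list)
    for name, text in named_texts:
        match = NAME_RE.search(name)
        if match is None:
            continue  # not a stat file under this convention
        values = parse_stats(text)
        if tag in values:
            by_density[int(match.group(1))].append(values[tag])
    return by_density


def summarize(by_density, z=1.96):
    """Return rows of (density, mean, CI half-width), ready for plotting."""
    rows = []
    for density in sorted(by_density):
        samples = by_density[density]
        mean = statistics.mean(samples)
        if len(samples) > 1:
            # Normal approximation; with few trials a t quantile fits better.
            half = z * statistics.stdev(samples) / math.sqrt(len(samples))
        else:
            half = float("nan")
        rows.append((density, mean, half))
    return rows


def read_dir(directory):
    """Yield (filename, contents) for every file in a results directory."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            yield path.name, path.read_text()
```

Something like summarize(collate(read_dir("results"), "frame")) would then give one row per node density, which can be written out as whitespace-separated columns for gnuplot's errorbar plots. Swapping read_dir for a database query would leave the rest unchanged.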
Thx!
--
- joe kopena
right here and now