[Ns-developers] [RFC] Worker Infrastructure - Parallelization in Large

Mathieu Lacage mathieu.lacage at sophia.inria.fr
Tue Jan 6 02:41:02 PST 2009


hi hagen,

Sorry for the very late answer,

On Mon, 2008-12-22 at 14:00 +0100, Hagen Paul Pfeifer wrote:
> under http://code.nsnam.org/pfeifer/ns-3-worker I published a high level
> parallelization approach. Beside a component approach, where some/several
> ns-3 subsystems work in parallel, this worker approach parallelizes
> complete ns-3 instances. The major goal is to speed up a *typical
> simulation* with more than one successive simulation run.

I really like the idea, and I have always wanted something like this, so
it would be really nice if we could try to merge it into ns-3 itself (in
contrib or core).

> The next few lines show a minimal "hello world" worker application. That
> example is generic enough to utilize all available cores at a system at
> runtime

I went through the example code, API, and implementation. Comments
below.

1) I can't see why ChunkStart/End and GetNumCores are inline, so I would
move them to worker.cc.

2) I think that the API is very focused on parallelizing a single for
loop with a fixed number of iterations and the same code executing every
time. It would be nice to try to generalize that. Here is a proposal.

class WorkManager
{
public:
  // register a callback which wants to do work.
  void AddWork (Callback<void> worker);
  // run one worker per core on every core, until all workers have run.
  void Run (void);
  // run one worker per core on n cores, until all workers have run.
  void Run (uint32_t n);
  static uint32_t GetNCores (void);
};
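
To make the intended semantics of Run concrete, here is a rough sketch of
how such a WorkManager could behave. It is only an illustration under my
own assumptions: std::function stands in for Callback<void>, threads stand
in for whatever the worker code really uses to occupy a core, and the
function bodies are hypothetical, not taken from the ns-3-worker tree.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the proposed WorkManager (assumption:
// std::function replaces Callback<void>, threads replace processes).
class WorkManager
{
public:
  // register a callback which wants to do work.
  void AddWork (std::function<void ()> worker)
  {
    m_work.push_back (worker);
  }
  // run one worker per core on n cores, until all workers have run.
  void Run (uint32_t n)
  {
    assert (n > 0);
    uint32_t next = 0; // index of the next worker nobody has taken yet
    std::mutex lock;
    auto drain = [&] () {
      for (;;)
        {
          std::function<void ()> job;
          {
            std::lock_guard<std::mutex> guard (lock);
            if (next == m_work.size ())
              {
                return; // nothing left to pick up
              }
            job = m_work[next++];
          }
          job (); // run outside the lock so that jobs overlap
        }
    };
    std::vector<std::thread> threads;
    for (uint32_t i = 0; i < n; i++)
      {
        threads.emplace_back (drain);
      }
    for (auto &t : threads)
      {
        t.join ();
      }
    m_work.clear ();
  }
  // run one worker per core on every core, until all workers have run.
  void Run (void)
  {
    Run (GetNCores ());
  }
  static uint32_t GetNCores (void)
  {
    // hardware_concurrency may return 0 when it cannot tell.
    return std::max (1u, std::thread::hardware_concurrency ());
  }
private:
  std::vector<std::function<void ()>> m_work;
};
```

The point of the shared index is that a core which finishes early simply
picks up the next pending worker, so workers with uneven run times still
keep all n cores busy.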

class LoopWorkManager
{
public:
  // register the loop callback and the total number of iterations.
  void SetLoop (Callback<void,uint32_t,uint32_t> worker,
                uint32_t iterations);
  // run the loop callback on each core until all iterations have run
  void Run (void);
  // run the loop callback on n cores, until all iterations have run
  void Run (uint32_t n);
  static uint32_t GetNCores (void);
private:
  // implement the loop manager with the work manager.
  WorkManager m_manager;
};
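
A rough sketch of the chunking such a loop manager would need, again
under my own assumptions: I take the two uint32_t arguments of the loop
callback to be the start and end of a [start, end) chunk of iterations,
std::function stands in for the Callback type, and SplitIterations and
RunLoop are hypothetical names. A real version would hand each chunk to
the work manager with AddWork rather than start threads itself.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: split [0, iterations) into at most n contiguous
// [start, end) chunks whose sizes differ by at most one.
std::vector<std::pair<uint32_t, uint32_t>>
SplitIterations (uint32_t iterations, uint32_t n)
{
  assert (n > 0);
  std::vector<std::pair<uint32_t, uint32_t>> chunks;
  uint32_t base = iterations / n;
  uint32_t extra = iterations % n; // the first 'extra' chunks get one more
  uint32_t start = 0;
  for (uint32_t i = 0; i < n && start < iterations; i++)
    {
      uint32_t size = base + (i < extra ? 1 : 0);
      chunks.push_back (std::make_pair (start, start + size));
      start += size;
    }
  return chunks;
}

// Hypothetical loop runner: call worker (start, end) once per chunk,
// one thread per chunk, and wait until every chunk is done.
void
RunLoop (std::function<void (uint32_t, uint32_t)> worker,
         uint32_t iterations, uint32_t n)
{
  std::vector<std::thread> threads;
  for (const auto &chunk : SplitIterations (iterations, n))
    {
      threads.emplace_back (worker, chunk.first, chunk.second);
    }
  for (auto &t : threads)
    {
      t.join ();
    }
}
```

With 10 iterations on 3 cores this gives the chunks [0,4), [4,7) and
[7,10), and with fewer iterations than cores it simply creates fewer
chunks instead of empty ones.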

Note: I picked WorkManager over Worker because, for me, Worker means a
single guy doing his job while here, the goal is to manage a bunch of
guys doing their job. It could also be WorkerManager, but WorkManager
seemed shorter and sufficiently descriptive.

3) You would be my personal hero if you would modify the waf
build/regression system to run the regression suite using a small C++
program based on the WorkManager when it is available on the target
platform.


thanks again for your code,
Mathieu


