The Task Manager 2

The Task Manager 2 is an almost complete rewrite of the original Task Manager, which BABAR has used successfully for skim production since January 2004. Why take the radical step of a rewrite instead of an upgrade?

First, the original Task Manager is continuously being upgraded and has been adapted to various changes in the skim production procedure. But it also has some significant design flaws. When the Task Manager was designed, BABAR's Computing Model 2 was still under development. Many features of the current skim production emerged only during the transition from the Objectivity-based Computing Model 1, and the Task Manager had to be adjusted to take them into account. The biggest design flaw (for which I take full responsibility) is the treatment of skimming and merging as two separate production steps based on the same production framework. In theory skimming and merging are very similar: both take data collections as input, do some processing on them, and produce new data collections. In practice the differences are significant enough to make squeezing both processing steps into one framework inconvenient and inelegant. Perhaps the biggest disadvantage, however, is that the merging step depends directly on the skimming.

The Task Manager 2 takes a different approach. Skimming and merging are treated as two steps of the same processing. This does lead to some replication in the database layout.

As can be seen in the database table layout, the tables required for skimming and for merging, and the joins between them, are almost identical. The same is true of the class design. However, this fits naturally into an object-oriented design, since the relevant classes for skims and merges extend the same base class. For example, the classes representing the individual jobs submitted to the batch system are called BbkTMSkimJob and BbkTMMergeJob respectively, and both extend the BbkTMJobs class. The base class contains everything common to the jobs, whereas the derived classes contain the differences.
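The inheritance pattern described above can be sketched as follows. This is only an illustration in Python (the actual Task Manager classes are Perl): the three class names come from the text, while every attribute and method (job_id, input_collections, status, skim_release, target_stream, kind, describe) is a hypothetical placeholder and not the real BbkTM interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BbkTMJobs:
    """Base class: state and behaviour shared by every batch job.

    All fields here are hypothetical stand-ins for the real metadata.
    """
    job_id: int
    input_collections: List[str] = field(default_factory=list)
    status: str = "prepared"

    def kind(self) -> str:
        # Each derived class says what type of job it is.
        raise NotImplementedError

    def describe(self) -> str:
        # Common logic lives once, in the base class.
        return (f"{self.kind()} job {self.job_id}: "
                f"{len(self.input_collections)} input collection(s), "
                f"status={self.status}")


@dataclass
class BbkTMSkimJob(BbkTMJobs):
    """Skim-specific differences only (hypothetical field)."""
    skim_release: str = "unknown"

    def kind(self) -> str:
        return "skim"


@dataclass
class BbkTMMergeJob(BbkTMJobs):
    """Merge-specific differences only (hypothetical field)."""
    target_stream: str = "unknown"

    def kind(self) -> str:
        return "merge"
```

Code that only needs the common behaviour can then accept any BbkTMJobs instance without knowing whether it is a skim or a merge job.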

The Job Wrapper Package

The second big difference between the original Task Manager and the Task Manager 2 can be found in the Job Wrapper package. The Task Manager wraps the skimming and merging applications in a Perl script, which prepares the run environment before processing and handles job validation and cleanup after the actual processing.
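The wrapper life cycle described above (prepare, run, validate, clean up) might look roughly like the sketch below. It is written in Python for illustration, whereas the real wrappers are Perl scripts; the function name run_wrapped, the scratch-directory approach, and the "output file exists and is non-empty" validation rule are all assumptions, not the actual BABAR procedure.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_wrapped(command, expected_outputs):
    """Run `command` in a private scratch directory and validate it.

    Returns True only if the command exits with status 0 and every
    file named in `expected_outputs` exists and is non-empty.
    """
    # 1. Prepare: create an isolated working directory for the job.
    workdir = Path(tempfile.mkdtemp(prefix="jobwrap_"))
    try:
        # 2. Run: execute the wrapped application inside it.
        result = subprocess.run(command, cwd=workdir)
        if result.returncode != 0:
            return False
        # 3. Validate: check that the expected outputs were produced.
        return all(
            (workdir / name).is_file()
            and (workdir / name).stat().st_size > 0
            for name in expected_outputs
        )
    finally:
        # 4. Clean up: remove the scratch area whatever happened.
        shutil.rmtree(workdir, ignore_errors=True)


if __name__ == "__main__":
    # Toy "application": a one-line Python program that writes a file.
    demo = [sys.executable, "-c", "open('out.txt', 'w').write('data')"]
    print(run_wrapped(demo, ["out.txt"]))
```

Putting the cleanup in a finally clause means the scratch area is removed even when the application fails, which matters when many jobs share a batch node.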

The original Task Manager had very little support for the job wrappers. But with the requirement to run the applications in an increasingly autonomous and isolated (local) environment for the sake of scalability, the job wrappers had to take on more responsibility, and the amount of code grew without a proper base. The Task Manager 2 includes an entire package (BbkJobWrappers) of utilities required to run the jobs and support processing in the batch queue.

The figure above illustrates the dependencies among the main classes (simple utility classes are not shown). The base of the package is a set of Perl wrappers around the commands commonly used to interact with the event store: obtaining information on data collections, moving data around, and checksumming it. The intermediate-level classes resemble the metadata objects stored in the database. Finally, the top-level classes are built around collections of the intermediate-level classes and deal with the metadata created by the individual jobs.
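The three layers described above can be illustrated with a small sketch. It is in Python, not the Perl of BbkJobWrappers, and every name in it (compute_checksum, CollectionRecord, JobOutput and their fields) is hypothetical. The use of SHA-256 is also an assumption; the text does not say which checksum the real wrappers use.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def compute_checksum(data: bytes) -> str:
    """Bottom layer: stand-in for a wrapped checksum command."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class CollectionRecord:
    """Intermediate layer: mirrors one metadata object in the database."""
    name: str
    n_events: int
    digest: str


class JobOutput:
    """Top layer: the collections produced by one job, as a group."""

    def __init__(self, records: List[CollectionRecord]):
        self.records = records

    def total_events(self) -> int:
        # Aggregate metadata across all collections of the job.
        return sum(r.n_events for r in self.records)

    def verify(self, payloads: Dict[str, bytes]) -> bool:
        # Re-checksum each payload and compare with the stored digest.
        return all(
            compute_checksum(payloads[r.name]) == r.digest
            for r in self.records
        )
```

Each layer depends only on the one below it: JobOutput works with CollectionRecord objects, and only the bottom-layer function touches the data itself.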

Summary

The Task Manager 2 is in the final stage of development and should be launched at the end of 2005 or in early 2006.