The Simple Job Manager

The Simple Job Manager (SJM) is a small but very powerful job management program written in Python. What makes SJM stand out is its strict reliance on the tools provided by BABAR's Computing Model 2 (CM2). The input data for SJM has to be listed in the BABAR dataset bookkeeping and has to be available in the main event store. SJM then acts as a bridge between the event store, the bookkeeping, and the offline analysis framework.

The user provides the name of the dataset (or datasets) to be used in the analysis, the configuration files required to run the offline analysis application, and some configuration options for SJM itself, such as the names of those configuration files and the size of the individual jobs. SJM then creates all individual jobs and prepares them for submission to the batch system. If requested by the user, SJM starts a daemon (background process) that monitors the number of jobs in the batch system, checks completed jobs, and submits new jobs if required. If everything runs fine, i.e. the analysis application does not crash and no other computing problems occur, the only thing the user ever needs to do is start the daemon and pick up the completed output from their analysis jobs after a couple of days (though it certainly doesn't hurt to check from time to time).
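The two core ideas in this workflow, cutting a dataset into jobs of a fixed size and topping up the batch queue on each monitoring pass, can be illustrated with a short Python sketch. This is not SJM's actual code: the function names, the notion of a list of collection names, and the max_in_batch limit are all assumptions made for illustration.

```python
from typing import List, Sequence


def split_into_jobs(collections: Sequence[str], per_job: int) -> List[List[str]]:
    """Group collection names into consecutive jobs of at most per_job entries.

    Hypothetical helper: SJM's real job-size option may be expressed differently.
    """
    if per_job < 1:
        raise ValueError("per_job must be at least 1")
    return [
        list(collections[i:i + per_job])
        for i in range(0, len(collections), per_job)
    ]


def jobs_to_submit(pending: int, in_batch: int, max_in_batch: int) -> int:
    """Number of pending jobs a single monitoring pass may submit.

    Hypothetical policy: keep at most max_in_batch jobs in the batch system.
    """
    free_slots = max(0, max_in_batch - in_batch)
    return min(pending, free_slots)
```

Under these assumptions, a daemon would call something like jobs_to_submit at regular intervals and hand that many prepared jobs to the batch system, so the queue stays filled without exceeding the chosen limit.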

SJM does not require a database; instead, job accounting is managed through a set of directories. This makes it easier for users to get started, since it avoids the typical reservations about databases ("What if I mess up my database?"). The job accounting in SJM is totally transparent to users: everything is stored in one directory tree, and if something really does get messed up, one can simply delete the whole tree and start over.
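As a rough illustration of directory-based accounting, the sketch below keeps one subdirectory per job state and moves an empty marker file between them as a job progresses. The state names and the layout are assumptions chosen for this example; SJM's actual directory structure is described in its own documentation and may differ.

```python
from pathlib import Path
from typing import Dict

# Hypothetical state names; SJM's real directory layout may differ.
STATES = ("prepared", "submitted", "done", "failed")


def init_tree(root: Path) -> None:
    """Create one subdirectory per job state under root."""
    for state in STATES:
        (root / state).mkdir(parents=True, exist_ok=True)


def add_job(root: Path, job_name: str) -> None:
    """Register a new job by creating an empty marker file in 'prepared'."""
    (root / "prepared" / job_name).touch()


def move_job(root: Path, job_name: str, old: str, new: str) -> None:
    """Record a state change by moving the marker file between directories."""
    (root / old / job_name).rename(root / new / job_name)


def summary(root: Path) -> Dict[str, int]:
    """Count the jobs in each state by listing the state directories."""
    return {state: sum(1 for _ in (root / state).iterdir()) for state in STATES}
```

With a layout like this, the state of every job can be inspected with ordinary tools such as ls, and resetting the accounting amounts to deleting the root directory and recreating it.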

For more on SJM, see the SJM User Guide.