Today, the CERN batch system is a facility with more than 10,000 compute nodes. The CERN IT batch service manages this facility and provides computing resources to CERN users and to the Worldwide LHC Computing Grid. In total, the batch system manages around 100,000 CPU cores.
In the past, the CERN batch system ran IBM Platform LSF. Part of the batch farm has now migrated to HTCondor, which will be the future workload management software for the CERN batch system.
Recent versions of HTCondor come with new features, including the ability to run jobs inside Docker containers. We would like to test the Docker support in the batch service to address two issues:
● Batch submitters cannot define their own execution environment without the intervention of a system administrator.
● The operating system on the worker nodes is tied to the job environment, so updating the OS may break software compatibility.
In this report, we outline and document the deployment of Docker on the HTCondor worker nodes running CERN CentOS 7, the setup of the Docker universe, and the creation of job routes that transform incoming grid jobs into Docker jobs to be executed in containers. The project also includes the subtask of creating a Scientific Linux CERN 6 (SLC6) Docker image for the grid jobs.
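To give a first idea of what a Docker universe job looks like from the submitter's side, the following is a minimal sketch in Python that renders an HTCondor submit description. It is an illustration only, not the configuration used in this project: the helper name docker_submit_description, the image name example/slc6:latest and the output file names are hypothetical. The submit commands universe, docker_image, executable, arguments, output, error, log and queue are standard HTCondor submit commands.

```python
def docker_submit_description(image, executable, arguments=""):
    """Render a minimal HTCondor submit description for the Docker universe.

    The image name and file names are placeholders chosen for illustration.
    """
    lines = [
        "universe = docker",          # ask HTCondor to run the job in a container
        f"docker_image = {image}",    # image that provides the job environment
        f"executable = {executable}", # program to run inside the container
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        # $(Cluster) and $(Process) are expanded by HTCondor at submit time
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical SLC6 image; prints a description that could be saved to a file
    print(docker_submit_description("example/slc6:latest",
                                    "/bin/cat", "/etc/redhat-release"))
```

The resulting text could be written to a file and passed to condor_submit. The point of the sketch is that the job environment is selected by the docker_image line rather than by the operating system installed on the worker node.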