Scheduling monitoring tasks

As mentioned earlier, our goal is to schedule the allocated monitoring tasks among M parallel monitoring processes with the aim of minimizing the total delay between the ideal time instants and the actual scheduled time instants when a monitoring task must be executed.

Let page i be allocated x_i number of monitoring tasks in an optimal resource allocation. Also, suppose the time instants at which these x_i monitoring tasks should be employed are t₁, t₂, t₃,..., t_xi, as identified in the resource allocation phase. Let $\ensuremath{\mathrm{fetch}}_i$ be the average fetching time for the i^th page. The scheduling problem can be easily mapped to parallel shop scheduling problem. In this problem, each job has to be processed on exactly one of M identical machines. Each monitoring task could be regarded as a job whereas the monitoring processes are equivalent to machines. Suppose there are a total of n such jobs. In scheduling problems, the time at which a job becomes available for processing is called the release time ( $\ensuremath{\mathrm{rel}}_j$ ) and the time for which it needs a machine is called the processing time. So in our case, ideal monitoring time instants t₁, t₂, t₃,..., t_xi would be the release times and fetching times of pages correspond to processing times (p_j) for jobs. Our goal is to minimize the delay d_i between ideal monitoring time instant $(\ensuremath{\mathrm{rel}}_i)$ and actual time instant s_i of scheduling. In our case all the jobs are equally important as there is no weight assigned with each monitoring. So our problem can be formulated in scheduling notation as $R\vert M\vert\ensuremath{\mathrm{rel}}_j \ge 0\vert\sum_j\ensuremath{\mathrm{Cm}}_j$ ( $\ensuremath{\mathrm{Cm}}_j$ denotes the completion time for job j), meaning that R jobs of non-trivial release times are available for scheduling at M machines with goal of minimizing the average completion time. Minimizing the average completion time leads to minimization of average delay time, because the total completion time is

$\displaystyle \sum_{i=1}^{R}$ Cm_i	$\textstyle =$	$\displaystyle \sum_{i=1}^R (s_i + p_i)$
$\displaystyle = \quad \sum_{i=1}^R (\ensuremath{\mathrm{rel}}_i + d_i + p_i)$	$\textstyle =$	$\displaystyle \sum_{i=1}^R d_i + \sum_{i=1}^R \ensuremath{\mathrm{rel}}_i + \sum_{i=1}^R p_i.$

As $\sum_{i=1}^R \ensuremath{\mathrm{rel}}_i$ and $\sum_{i=1}^{R}$ p_i are constants, minimizing average completion time is same as minimizing delay time. Note that $\ensuremath{\mathrm{Cm}}_i$ is the same as s_i + p_i because of non-preemptive scheduling. Unfortunately even the simpler problem R| 1| rel_j $\ge$ 0| $\sum_{j}^{}$ Cm_j has been proved to be NP-Complete [11]. So we have to look for approximation algorithms. For completion time problem there is an 1.58-approximation algorithm [11] which we used in our experiments as a heuristic for optimizing average delay time (an approximation for average completion time does not necessarily translate to the same approximation for average delay time; the latter is harder to approximate).