

Metrics

The following metrics are of interest:


Mathematically, we have a page divergence function $D(P, P')$ of two copies of a page: the true copy $P$ and the cached copy $P'$. Immediately following a refresh event, $P = P'$ and $D(P, P') = 0$. If $P$ is later updated so that $P \neq P'$, then $D(P, P') > 0$. (We give the exact form of the $D(\cdot)$ function in Section 2.1.1.)

The overall divergence of the web cache at a given time is defined as a weighted average across pages:



\begin{displaymath}
D(\mathcal{P}) = \frac{1}{\vert\mathcal{P}\vert} \sum_{P \in \mathcal{P}} W_P \cdot D(P, P')
\end{displaymath} (1)

where $W_P$ denotes the importance or relevance weight of page $P$, e.g., $P$'s PageRank or embarrassment coefficient [13].
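As an illustration of Equation 1, the following minimal Python sketch computes the overall divergence from per-page values. It assumes the per-page divergence $D(P, P')$ has already been evaluated by some other means (its exact form is given in Section 2.1.1), and the function name `overall_divergence` and the pair-based input layout are hypothetical choices made for this example only.

```python
def overall_divergence(pages):
    """Evaluate Equation 1 for one snapshot of the cache.

    `pages` is an iterable of (weight, divergence) pairs, one per page:
    weight     -- W_P, the importance weight of the page (assumed given)
    divergence -- D(P, P'), the page's current divergence (assumed given)
    """
    pages = list(pages)
    if not pages:
        raise ValueError("the page set must be non-empty")
    # Sum of W_P * D(P, P'), divided by the number of pages |P|.
    return sum(weight * divergence for weight, divergence in pages) / len(pages)


# Hypothetical three-page cache: one fresh page, two diverged pages.
example = [(1.0, 0.0), (2.0, 0.5), (0.5, 1.0)]
result = overall_divergence(example)
```

Note that, following Equation 1 as written, the sum is divided by the number of pages rather than by the sum of the weights, so a freshly refreshed page contributes zero regardless of its weight.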

Averaging across a given time interval $(t_1, t_2)$, we have:



\begin{displaymath}
D(\mathcal{P}, t_1, t_2) = \int_{t=t_1}^{t_2} \frac{1}{\vert\mathcal{P}\vert} \sum_{P \in \mathcal{P}} W_P \cdot D(P(t), P'(t)) \, dt
\end{displaymath} (2)

where $P(t)$ denotes the true content of page $P$ at time $t$, and $P'(t)$ denotes the (possibly out-of-date) content cached for $P$ as of time $t$.
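In practice the integral in Equation 2 would typically be approximated from divergence values sampled at discrete times. The sketch below does this with the trapezoidal rule; the sampling scheme, the function name `interval_divergence`, and the input layout are assumptions made for illustration and are not prescribed by the text.

```python
def interval_divergence(samples):
    """Approximate Equation 2 with the trapezoidal rule.

    `samples` is a list of (t, pages) tuples sorted by time t, where
    `pages` is a non-empty list of (W_P, D(P(t), P'(t))) pairs observed
    at time t. The first and last sample times play the roles of t_1
    and t_2.
    """
    def snapshot(pages):
        # Integrand of Equation 2: (1 / |P|) * sum of W_P * D(P(t), P'(t)).
        return sum(w * d for w, d in pages) / len(pages)

    total = 0.0
    for (t_a, pages_a), (t_b, pages_b) in zip(samples, samples[1:]):
        # Area of the trapezoid between two consecutive sample times.
        total += 0.5 * (snapshot(pages_a) + snapshot(pages_b)) * (t_b - t_a)
    return total


# Hypothetical two-page cache sampled at t = 0, 1, 2 with unit weights.
samples = [
    (0.0, [(1.0, 0.0), (1.0, 0.0)]),  # both pages fresh
    (1.0, [(1.0, 1.0), (1.0, 0.0)]),  # one page has diverged
    (2.0, [(1.0, 1.0), (1.0, 1.0)]),  # both pages have diverged
]
result = interval_divergence(samples)
```

The value returned is the integral over the sampled interval; dividing it by the interval length would give a per-unit-time figure, a normalization that Equation 2 as written does not include.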



Chris Olston 2008-02-15