The following metrics are of interest:
Mathematically, we have a page divergence function $D(P, P')$ of two copies of a page: the true copy $P$ and the cached copy $P'$. Immediately following a refresh event, $P' = P$ and $D(P, P') = 0$. Later, if $P$ undergoes updates such that $P' \neq P$, $D(P, P') \geq 0$. (We give the exact form of the function $D$ in Section 2.1.1.)
The overall divergence of the web cache at a given time is defined as a weighted average across pages:
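$$D_{\mathrm{cache}} = \frac{\sum_{P} w(P) \, D(P, P')}{\sum_{P} w(P)} \qquad (1)$$
where the sums range over all pages $P$ in the cache and $w(P)$ denotes the importance or relevance weight of page $P$, e.g., $P$'s PageRank or embarrassment coefficient [13].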
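As an illustration of the weighted average, the following minimal Python sketch computes the overall divergence of a small cache. The function names, the dictionary-based page representation, and the binary divergence function are assumptions made for this example only; the exact form of the page divergence function is the one given in Section 2.1.1.

```python
from typing import Callable, Mapping


def binary_divergence(true_copy: str, cached_copy: str) -> float:
    """Placeholder divergence (an assumption): 0 if the copies match, else 1."""
    return 0.0 if true_copy == cached_copy else 1.0


def overall_divergence(
    true_pages: Mapping[str, str],
    cached_pages: Mapping[str, str],
    weights: Mapping[str, float],
    page_divergence: Callable[[str, str], float] = binary_divergence,
) -> float:
    """Weighted average of per-page divergence across all weighted pages."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(
        weights[page] * page_divergence(true_pages[page], cached_pages[page])
        for page in weights
    )
    return weighted_sum / total_weight


# Hypothetical two-page cache: page "a" has changed since its last refresh.
true_pages = {"a": "new content", "b": "unchanged"}
cached_pages = {"a": "old content", "b": "unchanged"}
weights = {"a": 3.0, "b": 1.0}
print(overall_divergence(true_pages, cached_pages, weights))  # 0.75
```

Because the stale page carries three quarters of the total weight, it dominates the result, which is the intended effect of weighting by importance.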
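Averaging across a given time interval $[t_1, t_2]$, we have:
$$\bar{D}(t_1, t_2) = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \frac{\sum_{P} w(P) \, D\big(P(t), P'(t)\big)}{\sum_{P} w(P)} \, dt$$
where $P(t)$ denotes the true content of page $P$ at time $t$, and $P'(t)$ denotes the (possibly out-of-date) content cached for $P$ as of time $t$.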
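The time average can be sketched numerically under one simplifying assumption: that the overall divergence is piecewise constant, changing only at update and refresh events. The function name and the `(time, value)` sample format below are hypothetical choices for this example, not part of the original formulation.

```python
from typing import Sequence, Tuple


def time_averaged_divergence(
    samples: Sequence[Tuple[float, float]],
    t_start: float,
    t_end: float,
) -> float:
    """Average a step function of divergence over [t_start, t_end].

    `samples` is a time-sorted list of (time, divergence) pairs; each
    divergence value is assumed to hold until the next sample's time.
    """
    if t_end <= t_start:
        raise ValueError("t_end must be greater than t_start")
    if not samples or samples[0][0] > t_start:
        raise ValueError("samples must cover the start of the interval")

    area = 0.0
    for i, (t, value) in enumerate(samples):
        # The value holds until the next sample, or until t_end for the last one.
        next_t = samples[i + 1][0] if i + 1 < len(samples) else t_end
        seg_start = max(t, t_start)
        seg_end = min(next_t, t_end)
        if seg_end > seg_start:
            area += value * (seg_end - seg_start)
    return area / (t_end - t_start)


# Hypothetical trace: an update at t=4 raises divergence to 0.5,
# and a refresh at t=8 resets it to 0.
trace = [(0.0, 0.0), (4.0, 0.5), (8.0, 0.0)]
print(time_averaged_divergence(trace, 0.0, 10.0))  # 0.2
```

In this trace the cache is divergent for 4 of the 10 time units at level 0.5, so the average is 0.2. Refreshing sooner after the update would shrink the divergent segment and lower the average.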