

Model Validation

Figure 5: Actual and modeled lifetime distributions.

To validate our generative model, we analyzed the actual fragment lifetime distribution of pages from the high-quality data set. We focused on a set of pages that share the same average change frequency, $\lambda_P = 0.25$. We assigned an estimated $K$ value to every non-static fragment, based on the number of page update events the fragment ``survives'' (i.e., remains on the page). For each page we then took the most common $K$ value among its fragments as that page's dominant $K$ value.
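The per-page step can be sketched as follows. This is a minimal illustration, not the code used for the study: the input layout (a list of estimated $K$ values per page), the function name `dominant_k`, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter

def dominant_k(fragment_ks):
    """Return the most common estimated K among a page's non-static fragments.

    fragment_ks is a hypothetical input: one estimated K value per fragment.
    """
    if not fragment_ks:
        raise ValueError("page has no non-static fragments")
    counts = Counter(fragment_ks)
    best = max(counts.values())
    # Tie-break on the smallest K. This rule is an assumption for the
    # sketch; the text does not say how ties are resolved.
    return min(k for k, c in counts.items() if c == best)

# Hypothetical pages with made-up K estimates.
pages = {
    "page_a": [1, 1, 2, 5],
    "page_b": [2, 2, 1],
    "page_c": [4, 4, 3, 1],
}
dominant = {page: dominant_k(ks) for page, ks in pages.items()}
```

Here `dominant` maps each hypothetical page to the single $K$ value that occurs most often among its fragments.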

We obtained three groups of pages: those whose dominant $K$ value is $1$ (churn behavior), those dominated by $K=2$ (short scroll behavior), and those dominated by scroll behavior with some $K>2$. The third group was not large enough to subdivide further while still leaving enough data for smooth lifetime distributions. Figure 5 plots the fragment lifetime distributions for the non-static fragments of the three groups of pages, along with the corresponding lifetime distributions obtained from our generative model. Each instantiation of the model uses a distribution of $K$ values that matches the distribution occurring in the data.
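The three-way grouping described above can be sketched as below. The group labels, function names, and sample input are hypothetical; only the thresholds ($K=1$, $K=2$, $K>2$) come from the text.

```python
from collections import defaultdict

def group_label(k):
    """Map a page's dominant K value to one of the three groups."""
    if k < 1:
        raise ValueError("K must be at least 1")
    if k == 1:
        return "churn (K=1)"
    if k == 2:
        return "short scroll (K=2)"
    # All larger values are pooled into a single group.
    return "scroll (K>2)"

def group_pages(dominant):
    """Group page identifiers by the label of their dominant K value."""
    groups = defaultdict(list)
    for page, k in dominant.items():
        groups[group_label(k)].append(page)
    return dict(groups)

# Hypothetical dominant K values for four pages.
groups = group_pages({"page_a": 1, "page_b": 2, "page_c": 4, "page_d": 7})
```

Pooling every $K>2$ into one label mirrors the choice not to subdivide the third group.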

The actual lifetime curves closely match the ones predicted by the model. One exception is that the $K>2$ curve for the actual data diverges somewhat from the model at the end (top-right corner of Figure 5). This discrepancy is an artifact of the limited time span of our data set: fragments present in the initial or final snapshot of a page were excluded from our analysis because their full lifetimes cannot be determined. Long-lived fragments are the ones most likely to overlap the first or last snapshot, so the data is slightly biased against long-lifetime fragments.
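The exclusion rule and its effect can be illustrated with a small sketch. Everything here is assumed for the example: fragments are represented as `(first, last)` snapshot indices, lifetime is counted in snapshots, and the distribution is computed as a cumulative fraction, which may not be the form used for Figure 5.

```python
def uncensored_lifetimes(fragments, num_snapshots):
    """Keep only fragments whose full lifetime is observable.

    fragments: list of (first, last) snapshot indices (hypothetical layout).
    A fragment seen in snapshot 0 or in the last snapshot is dropped,
    because it may have existed before or after the observation window.
    """
    last_index = num_snapshots - 1
    lifetimes = []
    for first, last in fragments:
        if first == 0 or last == last_index:
            continue
        # Lifetime measured in snapshots (an assumption for this sketch).
        lifetimes.append(last - first + 1)
    return lifetimes

def empirical_cdf(lifetimes):
    """Return (t, fraction of lifetimes <= t) for each distinct lifetime t."""
    n = len(lifetimes)
    return [
        (t, sum(1 for x in lifetimes if x <= t) / n)
        for t in sorted(set(lifetimes))
    ]

# Made-up fragments observed over 10 snapshots (indices 0..9).
fragments = [(0, 3), (1, 2), (2, 2), (3, 9), (4, 6)]
kept = uncensored_lifetimes(fragments, num_snapshots=10)
cdf = empirical_cdf(kept)
```

In this toy input the two dropped fragments, `(0, 3)` and `(3, 9)`, are also the two longest-lived ones, which is the direction of bias described above.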

Unlike the right-most curve of Figure 4, none of the curves in Figure 5 exhibits a flat component at the beginning. The reason is that each of our page groups contains at least some $K=1$ content, which churns rapidly.


Chris Olston 2008-02-15