25
Intrinsic Temporal Locality Properties of Web Request Streams Rodrigo Fonseca (UC Berkeley) Virgílio Almeida (UFMG) Mark Crovella (Boston University) Bruno Abrahao (UFMG) IEEE INFOCOM 2003

On the Intrinsic Locality Properties of Web Reference Streams

Embed Size (px)

Citation preview

Intrinsic Temporal Locality Properties of Web Request Streams

Rodrigo Fonseca (UC Berkeley)

Virgílio Almeida (UFMG)

Mark Crovella (Boston University)

Bruno Abrahao (UFMG)

IEEE INFOCOM 2003

2

• Network measurements in highly dynamic environments–Understand the behavior and find

intrinsic properties

–Online algorithms for driving self-organization of system agents

–Algorithms for dynamic adapting environments with high degree of uncertainty

Research Questions

3

Question

• Study temporal locality at different points of the Web

–Results: Framework + metrics:

• New insights on the causes of temporal locality across different points of the Web topology

4

Web Measurements

• Why?– Understand and improve current systems

– Design new systems

• Challenges– Complexity

– Rate of change

• Approach– Look for regularities, underlying principles

– Abstractions

5

Previous Studies

• Many advancements–Caching hierarchies

–Replacement Policies for caches

– Load distribution and balancing

• However...–Focus on individual components

–No structured view of the entire system

6

Stream centric view of the Web

• Focus on request streams

– Look for intrinsic properties

–How they are altered

–How their properties change in different points

–How their intrinsic properties can influence the designs of components and of collections of components

7

Tranformations on Request Streams

• Three types of transformations

–Aggregation

• Multiple sources

–Disaggregation

• Multiple destinations

–Filtering

• Resulting stream = subset of input stream

DAF

8

Transformations Abstraction

• Components may be abstracted by combinations of transformations

D

F

Clients

A

Server

D

F

A

Proxy Cache

9

Approach

• For a given intrinsic property of streams–Study the effects of the

transformations on the property

–Combine the effects to understand• Effects of components

• Effects of collections of components

• Properties at different points of the topology

10

Temporal Locality

• Need of intrinsic metrics for the streams– We use virtual time

• Two sources:– Popularity

...XAXBXCXDXE...

• Preserved with reordering

– Correlation...AABBCCDDEE...

• Lost with reordering

11

Measuring Popularity: Entropy

– Traditionally Zipf’s Law: pi i-

– We measure the deviation from Uniform popularity distribution

– Entropy:

• pi determined empirically from frequencies

• N is the number of distinct objects

• If uniform, tends to log2(N)

• If one object only is accessed, tends to 0

– Not the Entropy rate of the source

N

i

ii ppH1

2log

12

Entropy and Zipf’s Law

13

Entropy and Zip’f Law

14

Measuring Correlation: CV

• Look at IAT distribution per object– Number of references between 2 references

to the same object

• Behavior– Correlation tendency to shorter IATs

– No Correlation tends to geometric distribution

• We measure the deviation from the geometric distribution for IAT– Coefficient of Variation:

μ

σ

15

IAT Distribution

Original

Scrambled

16

Metrics Summary

• Entropy

–High concentration: Low Entropy

–Uniform distribution: High Entropy

• CV

–High Correlation: High CV

–No correlation: CV ~ 1

17

Entropy vs. Hit Ratio

• LRU cache simulation, size 5%

18

CV versus HR Difference

• Difference between scrambled and original streams

19

Putting it all together

• Effects of the transformations on the components of temporal locality

20

Effects of Filtering: Popularity

21

Effects of Filtering: Correlation

22

Effects of Aggregation and Disaggregation

CV

Entropy

23

Temporal Locality atDifferent Points of the topology

Servers

Proxies

Clients

24

Effects of

25

Conclusions

• Understanding of behavior

• Transformations Framework allows for structured study and understanding of Web workloads characteristics