Upload
bruno-abrahao
View
298
Download
0
Embed Size (px)
Citation preview
Intrinsic Temporal Locality Properties of Web Request Streams
Rodrigo Fonseca (UC Berkeley)
Virgílio Almeida (UFMG)
Mark Crovella (Boston University)
Bruno Abrahao (UFMG)
IEEE INFOCOM 2003
2
• Network measurements in highly dynamic environments–Understand the behavior and find
intrinsic properties
–Online algorithms for driving self-organization of system agents
–Algorithms for dynamic adapting environments with high degree of uncertainty
Research Questions
3
Question
• Study temporal locality at different points of the Web
–Results: Framework + metrics:
• New insights on the causes of temporal locality across different points of the Web topology
4
Web Measurements
• Why?– Understand and improve current systems
– Design new systems
• Challenges– Complexity
– Rate of change
• Approach– Look for regularities, underlying principles
– Abstractions
5
Previous Studies
• Many advancements–Caching hierarchies
–Replacement Policies for caches
– Load distribution and balancing
• However...–Focus on individual components
–No structured view of the entire system
6
Stream centric view of the Web
• Focus on request streams
– Look for intrinsic properties
–How they are altered
–How their properties change in different points
–How their intrinsic properties can influence the designs of components and of collections of components
7
Tranformations on Request Streams
• Three types of transformations
–Aggregation
• Multiple sources
–Disaggregation
• Multiple destinations
–Filtering
• Resulting stream = subset of input stream
DAF
8
Transformations Abstraction
• Components may be abstracted by combinations of transformations
D
F
Clients
A
Server
D
F
A
Proxy Cache
9
Approach
• For a given intrinsic property of streams–Study the effects of the
transformations on the property
–Combine the effects to understand• Effects of components
• Effects of collections of components
• Properties at different points of the topology
10
Temporal Locality
• Need of intrinsic metrics for the streams– We use virtual time
• Two sources:– Popularity
...XAXBXCXDXE...
• Preserved with reordering
– Correlation...AABBCCDDEE...
• Lost with reordering
11
Measuring Popularity: Entropy
– Traditionally Zipf’s Law: pi i-
– We measure the deviation from Uniform popularity distribution
– Entropy:
• pi determined empirically from frequencies
• N is the number of distinct objects
• If uniform, tends to log2(N)
• If one object only is accessed, tends to 0
– Not the Entropy rate of the source
N
i
ii ppH1
2log
14
Measuring Correlation: CV
• Look at IAT distribution per object– Number of references between 2 references
to the same object
• Behavior– Correlation tendency to shorter IATs
– No Correlation tends to geometric distribution
• We measure the deviation from the geometric distribution for IAT– Coefficient of Variation:
μ
σ
16
Metrics Summary
• Entropy
–High concentration: Low Entropy
–Uniform distribution: High Entropy
• CV
–High Correlation: High CV
–No correlation: CV ~ 1