Contents of /httpd/flood/trunk/DESIGN

Revision 653656, Tue May 6 00:42:11 2008 UTC, by fielding
File size: 7192 byte(s)
promote flood to released products

Flood Design Document
----------------------------------------

Flood is an experimental multi-purpose HTTP load-testing tool.
The design of flood is intended to be modular in the extreme to
support testing of a wide variety of website applications.
Flood is also intended to be scalable and accurate.

------------------------------
Functional Requirements:
------------------------------

The following is a list of functions we intend to incorporate into flood:

1) Timing Metrics
   a) TCP connect() time
   b) Time to send request (time to fill local buffers)
   c) Time until first response chunk was received
      (tests network latency at the application layer (HTTP))
   d) Time to receive a full response
   e) (optional) Local processing times, such as time to generate the request
2) Simple response error detection
3) Load testing a set of URLs:
   a) Random
   b) Round-Robin
   c) Sequenced (with cookie propagation)
   d) Chaining of the above strategies
4) Distributed load generation (using rsh/ssh)
5) Complex response validation
   a) grep/regexp matching
   b) higher-level validation?
6) Complex request generation
   a) Spider a site to generate a list of URLs
   b) Read a CERN log to generate URL paths from real users
      (could also add weights to the URLs depending on their occurrence
      in the logs)
   c) Read a tcpdump/sniff packet trace to generate URL paths
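
As an illustration of the Random and Round-Robin strategies in item 3,
here is a minimal sketch in C. It is not flood's implementation: the names
url_set_t, next_round_robin and next_random are hypothetical, and rand()
merely stands in for whatever random source a real tool would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for the URLs under test (not a flood type). */
typedef struct {
    const char **urls;  /* array of URL strings */
    size_t count;       /* number of entries in urls */
    size_t cursor;      /* next index for round-robin selection */
} url_set_t;

/* Round-robin: hand out each URL in order, wrapping at the end. */
const char *next_round_robin(url_set_t *set)
{
    const char *url;
    assert(set != NULL && set->count > 0);
    url = set->urls[set->cursor];
    set->cursor = (set->cursor + 1) % set->count;
    return url;
}

/* Random: hand out a pseudo-random URL from the set.
 * The modulo introduces a slight bias, acceptable for a sketch. */
const char *next_random(const url_set_t *set)
{
    assert(set != NULL && set->count > 0);
    return set->urls[(size_t)rand() % set->count];
}
```

Chaining (item 3d) could then be expressed by letting one strategy choose
among several url_set_t values, each with its own inner strategy.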

------------------------------
Functional Specifications:
------------------------------

With the above functional goals in mind, the following components will have
to be designed and implemented:

1) Request/Response framework
   - Every hit to the site passes through this sequence of calls.
2) Timer hooks in the framework as well as a generic timer facility
   - Hooks are placed to properly gauge request generation time, network
     I/O time, round-trip time, network latency as well as bandwidth,
     local processing time, etc.
   - Hook facilities also allow module authors to time various aspects of
     their processes, and aggregate them into the final report.
3) Simple distributed processing environment:
   - Patterned after distributed models like cvs or rsync that use
     rsh/ssh to invoke a remote process. Statistics from this remote
     process must be regular and should ideally follow a standardized
     reporting format.
4) Simple Reporting Format (? might be overengineering ?)
   - A simple format used to report statistics between modules, whether
     they are remote processes or otherwise out-of-process.
   - I envision this being some really simple XML schema that all processes
     use to report data back to the terminal, where the originating process
     collects this data, processes it, and reports in various formats (XML,
     human-readable, etc.)
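
To make the reporting format in item 4 concrete, the sketch below writes
one statistics record as a single XML element. No schema is defined in this
document, so the element name, the attribute names and the stat_record_t
type are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-transaction record; field names are illustrative. */
typedef struct {
    const char *test_name; /* assumed to contain no XML special characters */
    int ok;                /* 1 on success, 0 on failure */
    long long total_usec;  /* total transaction time in microseconds */
} stat_record_t;

/* Write the record as one XML element into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_record(const stat_record_t *rec, char *buf, size_t len)
{
    int n;
    assert(rec != NULL && buf != NULL);
    n = snprintf(buf, len,
                 "<transaction test=\"%s\" ok=\"%d\" total_usec=\"%lld\"/>",
                 rec->test_name, rec->ok, rec->total_usec);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A real format would also need to escape special characters in test_name;
that step is left out here to keep the sketch short.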

(more to come)

------------------------------
Design Specifications:
------------------------------

------------------------------
Modular Test Framework

Flood provides the control logic that will invoke a test procedure and
collect reported statistics. Tests are modular, and conform to the
simple invocation interface described below. In order to simplify the
task of a module developer, flood also provides an extensive support
library.

------------------------------
Test Invocation:

Tests are invoked with a standard interface.

[ proposal:

typedef unsigned int testid_t; /* unique test id across this invocation of
                                  flood (including fork()s and remote
                                  invocations) */

apr_status_t invoke_test(config_t *config, /* to be defined */
                         testid_t testid);

]
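
One way the controlling logic might drive tests through the proposed
interface is sketched below. Because config_t is still undefined and the
sketch avoids depending on APR, status_t and the one-field config_t are
stand-ins, and run_tests, always_pass and needs_url are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int testid_t;
typedef int status_t;                         /* stand-in for apr_status_t */
typedef struct { const char *url; } config_t; /* placeholder definition */

/* Signature every test module would expose under the proposal above. */
typedef status_t (*test_fn)(config_t *config, testid_t testid);

/* Invoke each test once with ids first_id, first_id + 1, ...
 * Returns the number of tests that reported a non-zero (failure) status. */
size_t run_tests(const test_fn *tests, size_t ntests,
                 config_t *config, testid_t first_id)
{
    size_t failures = 0;
    assert(tests != NULL || ntests == 0);
    for (size_t i = 0; i < ntests; i++) {
        if (tests[i](config, first_id + (testid_t)i) != 0)
            failures++;
    }
    return failures;
}

/* Two toy tests, standing in for real modules. */
status_t always_pass(config_t *config, testid_t testid)
{
    (void)config;
    (void)testid;
    return 0;
}

status_t needs_url(config_t *config, testid_t testid)
{
    (void)testid;
    return (config != NULL && config->url != NULL) ? 0 : 1;
}
```

Keeping ids unique across fork()s and remote invocations would require the
caller to hand each process a disjoint range of first_id values.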

------------------------------
Transaction Reporting:

All tests will collect and process operating statistics and report these
back to the controlling process via standard calls. The following statistics
are required from every module and for each HTTP transaction:

- TCP Handshake time (time spent in connect())
- Application-layer latency (time to receive first HTTP bytes)
- Idle time (total time spent waiting for data)
- Response time (total time to receive the HTTP Response)
? - Total Transaction Time

[ proposal:

apr_status_t report_stats(const char *test_name,    /* short name/description */
                          int test_number,          /* unique to this test */
                          apr_status_t test_result, /* success, failure */
                          apr_interval_time_t tcp,
                          apr_interval_time_t first_resp,
                          apr_interval_time_t idle,
                          apr_interval_time_t total_resp,
                          apr_interval_time_t total);
]

------------------------------
Module Configuration:

[ mentioned above in "config_t", needs to be elaborated here.

- what is the config syntax?
- what is the definition of config_t?
- what support functions do modules use to retrieve config values?
- what if a needed key/value is missing?
]

------------------------------
Flood Support Library:

The support library provides many of the functions necessary to write
a conforming test module.

[ define this more, or move documentation to another file...
I hate to suggest it, but maybe preparsed XML (apr supports it, no?)
]

------------------------------
Local Parallel Tests:

Given the above scenarios, one may wish to perform many tests in parallel.
Flood provides two major ways to accomplish this: Threaded and Forked.
These two methods can be combined, allowing fine-grained control of
local resources to maximize test throughput.

Threaded:

The process is instructed to create a number of user-space threads,
each of which will perform a complex event chain as described above. For
the purpose of this document, each thread that performs an actual test
is called a "worker".

Forked:

The process is instructed to make multiple duplicate copies of itself
using the fork() system call, each of which can run one test
or can run a threaded/parallelized group of tests. For the purpose
of this document, each fork()ed process is called a "child".
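
The Forked model can be sketched with plain POSIX calls, as below. This is
an illustration rather than flood's code: run_children and fake_work are
hypothetical names, and reporting a single int per child over a shared pipe
is a simplification of the statistics a real child would send.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork nchildren processes. Each child runs work(index) and writes its
 * int result to a shared pipe; the parent sums the results it reads.
 * Returns the sum, or -1 if pipe() or fork() failed. */
long run_children(int nchildren, int (*work)(int child_index))
{
    int fds[2];
    int failed = 0;
    long sum = 0;
    int value;

    assert(nchildren >= 0 && work != NULL);
    if (pipe(fds) != 0)
        return -1;

    for (int i = 0; i < nchildren && !failed; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
        } else if (pid == 0) {
            /* Child: do the work, report once, and exit without
             * returning into the parent's code path. */
            int result = work(i);
            ssize_t w;
            close(fds[0]);
            w = write(fds[1], &result, sizeof result);
            _exit(w == (ssize_t)sizeof result ? 0 : 1);
        }
    }

    /* Parent: close the write end so read() sees EOF once every child
     * has exited, then collect results and reap the children. */
    close(fds[1]);
    while (read(fds[0], &value, sizeof value) == (ssize_t)sizeof value)
        sum += value;
    close(fds[0]);
    while (wait(NULL) > 0)
        ;
    return failed ? -1 : sum;
}

/* Stand-in for a test run: child i pretends it completed i + 1 requests. */
int fake_work(int child_index)
{
    return child_index + 1;
}
```

Each write is far smaller than PIPE_BUF, so results from different children
are not interleaved. A child that also runs workers would create its threads
inside work().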

------------------------------
Distributed Tests:

Further parallelization of the tests is possible, given access to a number
of remote machines. Flood can invoke a remote instance of itself with
either the "rsh" or "ssh" remote shell commands. These instances of flood
are special, in that they do not report directly back to the user, but
instead communicate their statistical information back to the main
flood process, which aggregates the information and generates a
human-readable report.

------------------------------
Interprocess Communication:

In such a highly scalable and distributable system, one of the larger
hurdles will be interprocess communication. One can imagine scenarios
where a main flood process has spawned many remote instances, each of
which (including the original) will fork into many child processes,
each of which will run multiple parallelized testing event chains.
When this massive invocation has completed, the user will be presented
with a single unified report. To deal with this communication, a
protocol must be designed to facilitate IPC.

[ propose a protocol here, perhaps XML? ]
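
Since no protocol has been chosen, the following is only a sketch of what
the receiving side of a very simple one could look like. It assumes a
line-oriented text record, "STAT <testid> <ok> <total_usec>", instead of
XML purely for brevity. The record layout, ipc_record_t and parse_record
are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record carried by one line of the assumed protocol. */
typedef struct {
    unsigned int testid;  /* id of the test that produced the record */
    int ok;               /* 1 on success, 0 on failure */
    long long total_usec; /* total transaction time in microseconds */
} ipc_record_t;

/* Parse one line of the form "STAT <testid> <ok> <total_usec>".
 * Returns 0 and fills *out on success; returns -1 on malformed input
 * (wrong keyword, missing fields, trailing junk, or ok not 0/1). */
int parse_record(const char *line, ipc_record_t *out)
{
    ipc_record_t tmp;
    char trailing;
    int n;

    assert(line != NULL && out != NULL);
    /* The final " %c" only matches if non-blank text follows the
     * third field, which lets us reject trailing junk. */
    n = sscanf(line, "STAT %u %d %lld %c",
               &tmp.testid, &tmp.ok, &tmp.total_usec, &trailing);
    if (n != 3 || (tmp.ok != 0 && tmp.ok != 1))
        return -1;
    *out = tmp;
    return 0;
}
```

Validating every line before it reaches the aggregator matters here because
records arrive from many children and remote instances at once.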

----------------------------------------
$Id$