Using the Traffic Capture and Replay Tool GoReplay Together with Diffy

Introduction to GoReplay

https://github.com/buger/goreplay

https://goreplay.org

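GoReplay is an open-source tool that captures live HTTP traffic and replays it into a test environment, so that a system can be tested continuously with real data.
GoReplay is not a proxy. It listens to traffic on a network interface, so the production infrastructure does not need to change; you only run the GoReplay daemon on the same machine as the service.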

How GoReplay works

[Figure: how GoReplay works]

Common GoReplay usage

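1. Simple HTTP traffic replication:
gor --input-raw :80 --output-http "http://staging.com"

2. Rate-limited HTTP traffic replication (at most 10 requests per second):
gor --input-tcp :28020 --output-http "http://staging.com|10"

3. Scaled-down HTTP traffic replication (forward 10% of requests):
gor --input-raw :80 --output-tcp "replay.local:28020|10%"

4. Recording HTTP traffic to a local file:
gor --input-raw :80 --output-file requests.gor

5. Replaying recorded HTTP traffic for load testing (2x speed):
gor --input-file "requests.gor|200%" --output-http "staging.com"

6. Filtered HTTP traffic replication:
gor --input-raw :8080 --output-http staging.com --http-allow-url ^www.

7. Replicating traffic for a specific endpoint only:
# --output-stdout prints the captured traffic directly to the console
gor --input-raw :80 --http-allow-url '/api/v1' --output-stdout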
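
The "|" suffix used in items 2, 3 and 5 above carries a limit: a plain number caps requests per second, and a percentage scales the traffic (or, on --input-file, the replay speed). As a rough illustration of that syntax only, the following Python sketch splits such a value into its parts. The names parse_target and Target are made up for this example and are not part of GoReplay, whose own parser may behave differently.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    address: str          # part before the "|" separator
    limit: Optional[int]  # numeric value of the limiter, None if absent
    is_percent: bool      # True for "10%", False for a plain "10"


def parse_target(spec: str) -> Target:
    """Split 'address|limit' into its parts (illustrative, not GoReplay's parser)."""
    address, sep, raw_limit = spec.rpartition("|")
    if not sep:
        # No "|" present: the target has no limiter attached.
        return Target(spec, None, False)
    is_percent = raw_limit.endswith("%")
    return Target(address, int(raw_limit.rstrip("%")), is_percent)


if __name__ == "__main__":
    for spec in (
        "http://staging.com|10",
        "replay.local:28020|10%",
        "requests.gor|200%",
        "http://staging.com",
    ):
        print(parse_target(spec))
```

A plain number and a percentage are kept apart in the result because they mean different things: the first is an absolute cap, the second is relative to the incoming traffic.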

gor parameters

[root@~]# gor --help
Gor is a simple http traffic replication tool written in Go. Its main goal is to replay traffic from production servers to staging and dev environments.
Project page: https://github.com/buger/gor
Author: <Leonid Bugaev> leonsbox@gmail.com
Current Version: 1.0.0

-copy-buffer-size int
Set the buffer size for an individual request (default 5M) (default 5242880)
-cpuprofile string
write cpu profile to file
-debug verbose // Turns on debug mode and shows all intercepted traffic
Turn on debug output, shows all intercepted traffic. Works only when with verbose flag
-exit-after duration
exit after specified duration
-http-allow-header value // Regexp matched against an HTTP header; requests whose header does not match are dropped
A regexp to match a specific header against. Requests with non-matching headers will be dropped:
gor --input-raw :8080 --output-http staging.com --http-allow-header api-version:^v1
-http-allow-method value // Whitelist of HTTP methods to replay; all other methods are dropped
Whitelist of HTTP methods to replay. Anything else will be dropped:
gor --input-raw :8080 --output-http staging.com --http-allow-method GET --http-allow-method OPTIONS
-http-allow-url value // Regexp matched against the full URL (including domain); non-matching requests are dropped
A regexp to match requests against. Filter get matched against full url with domain. Anything else will be dropped:
gor --input-raw :8080 --output-http staging.com --http-allow-url ^www.
-http-basic-auth-filter value // Regexp matched against the decoded basic auth string; non-matching requests are dropped
A regexp to match the decoded basic auth string against. Requests with non-matching headers will be dropped:
gor --input-raw :8080 --output-http staging.com --http-basic-auth-filter "^customer[0-9].*"
-http-disallow-header value // Regexp matched against an HTTP header; requests whose header matches are dropped
A regexp to match a specific header against. Requests with matching headers will be dropped:
gor --input-raw :8080 --output-http staging.com --http-disallow-header "User-Agent: Replayed by Gor"
-http-disallow-url value // Regexp matched against the URL; matching requests are dropped
A regexp to match requests against. Filter get matched against full url with domain. Anything else will be forwarded:
gor --input-raw :8080 --output-http staging.com --http-disallow-url ^www.
-http-header-limiter value // Takes a fraction of requests, consistently accepting or rejecting each one based on the FNV32-1A hash of a given header
Takes a fraction of requests, consistently taking or rejecting a request based on the FNV32-1A hash of a specific header:
gor --input-raw :8080 --output-http staging.com --http-header-limiter user-id:25%
-http-original-host // Normally gor replaces the Host header with the host given to --output-http; this option disables that and keeps the original Host header
Normally gor replaces the Host http header with the host supplied with --output-http. This option disables that behavior, preserving the original Host header.
-http-param-limiter value
Takes a fraction of requests, consistently taking or rejecting a request based on the FNV32-1A hash of a specific GET param:
gor --input-raw :8080 --output-http staging.com --http-param-limiter user_id:25%
-http-pprof :8181
Enable profiling. Starts http server on specified port, exposing special /debug/pprof endpoint. Example: :8181
-http-rewrite-header value
Rewrite the request header based on a mapping:
gor --input-raw :8080 --output-http staging.com --http-rewrite-header Host: (.*).example.com,$1.beta.example.com
-http-rewrite-url value
Rewrite the request url based on a mapping:
gor --input-raw :8080 --output-http staging.com --http-rewrite-url /v1/user/([^\/]+)/ping:/v2/user/$1/ping
-http-set-header value
Inject additional headers to http reqest:
gor --input-raw :8080 --output-http staging.com --http-set-header 'User-Agent: Gor'
-http-set-param value
Set request url param, if param already exists it will be overwritten:
gor --input-raw :8080 --output-http staging.com --http-set-param api_key=1
-input-dummy value
Used for testing outputs. Emits 'Get /' request every 1s
-input-file value // Reads requests from a file
Read requests from file:
gor --input-file ./requests.gor --output-http staging.com
-input-file-loop
Loop input files, useful for performance testing.
-input-kafka-host string
Send request and response stats to Kafka:
gor --output-stdout --input-kafka-host '192.168.0.1:9092,192.168.0.2:9092'
-input-kafka-json-format
If turned on, it will assume that messages coming in JSON format rather than GoReplay text format.
-input-kafka-topic string
Send request and response stats to Kafka:
gor --output-stdout --input-kafka-topic 'kafka-log'
-input-raw value
Capture traffic from given port (use RAW sockets and require *sudo* access):
# Capture traffic from 8080 port
gor --input-raw :8080 --output-http staging.com
-input-raw-bpf-filter string
BPF filter to write custom expressions. Can be useful in case of non standard network interfaces like tunneling or SPAN port. Example: --input-raw-bpf-filter 'dst port 80'
-input-raw-buffer-size int
Controls size of the OS buffer (in bytes) which holds packets until they dispatched. Default value depends by system: in Linux around 2MB. If you see big package drop, increase this value.
-input-raw-engine libpcap
Intercept traffic using libpcap (default), and `raw_socket` (default "libpcap")
-input-raw-expire duration
How much it should wait for the last TCP packet, till consider that TCP message complete. (default 2s)
-input-raw-immediate-mode
Set pcap interface to immediate mode.
-input-raw-override-snaplen
Override the capture snaplen to be 64k. Required for some Virtualized environments
-input-raw-realip-header string
If not blank, injects header with given name and real IP value to the request payload. Usually this header should be named: X-Real-IP
-input-raw-timestamp-type string
Possible values: PCAP_TSTAMP_HOST, PCAP_TSTAMP_HOST_LOWPREC, PCAP_TSTAMP_HOST_HIPREC, PCAP_TSTAMP_ADAPTER, PCAP_TSTAMP_ADAPTER_UNSYNCED. This values not supported on all systems, GoReplay will tell you available values of you put wrong one.
-input-raw-track-response
If turned on Gor will track responses in addition to requests, and they will be available to middleware and file output.
-input-tcp value // Used to pass traffic between multiple gor instances
Used for internal communication between Gor instances. Example:
# Receive requests from other Gor instances on 28020 port, and redirect output to staging
gor --input-tcp :28020 --output-http staging.com
-input-tcp-certificate string
Path to PEM encoded certificate file. Used when TLS turned on.
-input-tcp-certificate-key string
Path to PEM encoded certificate key file. Used when TLS turned on.
-input-tcp-secure
Turn on TLS security. Do not forget to specify certificate and key files.
-memprofile string
write memory profile to this file
-middleware string
Used for modifying traffic using external command
-output-dummy value // Used for testing inputs; prints the data it receives
DEPRECATED: use --output-stdout instead
-output-file value // Writes incoming requests to a file
Write incoming requests to file:
gor --input-raw :80 --output-file ./requests.gor
-output-file-append
The flushed chunk is appended to existence file or not.
-output-file-flush-interval duration
Interval for forcing buffer flush to the file, default: 1s. (default 1s)
-output-file-max-size-limit value
Max size of output file, Default: 1TB (default -1)
-output-file-queue-limit int
The length of the chunk queue. Default: 256 (default 256)
-output-file-size-limit value
Size of each chunk. Default: 32mb (default 33554432)
-output-http value // Forwards incoming requests to an HTTP address
Forwards incoming requests to given http address.
# Redirect all incoming requests to staging.com address
gor --input-raw :80 --output-http http://staging.com
-output-http-compatibility-mode
Use standard Go client, instead of built-in implementation. Can be slower, but more compatible.
-output-http-debug
Enables http debug output.
-output-http-elasticsearch string // Sends request and response stats to ElasticSearch
Send request and response stats to ElasticSearch:
gor --input-raw :8080 --output-http staging.com --output-http-elasticsearch 'es_host:api_port/index_name'
-output-http-header --output-http-header
WARNING: --output-http-header DEPRECATED, use `--http-set-header` instead
-output-http-header-filter --output-http-header-filter
WARNING: --output-http-header-filter DEPRECATED, use `--http-allow-header` instead
-output-http-header-hash-filter output-http-header-hash-filter
WARNING: output-http-header-hash-filter DEPRECATED, use `--http-header-hash-limiter` instead
-output-http-method --output-http-method
WARNING: --output-http-method DEPRECATED, use `--http-allow-method` instead
-output-http-queue-len int
Number of requests that can be queued for output, if all workers are busy. default = 1000 (default 1000)
-output-http-redirects int // Sets how many redirects are followed
Enable how often redirects should be followed.
-output-http-response-buffer int
HTTP response buffer size, all data after this size will be discarded.
-output-http-rewrite-url --output-http-rewrite-url
WARNING: --output-http-rewrite-url DEPRECATED, use `--http-rewrite-url` instead
-output-http-stats // Prints HTTP output queue stats to the console periodically (every 5000 ms by default)
Report http output queue stats to console every N milliseconds. See output-http-stats-ms
-output-http-stats-ms int
Report http output queue stats to console every N milliseconds. default: 5000 (default 5000)
-output-http-timeout duration // HTTP request/response timeout, 5 seconds by default
Specify HTTP request/response timeout. By default 5s. Example: --output-http-timeout 30s (default 5s)
-output-http-track-response
If turned on, HTTP output responses will be set to all outputs like stdout, file and etc.
-output-http-url-regexp --output-http-url-regexp
WARNING: --output-http-url-regexp DEPRECATED, use `--http-allow-url` instead
-output-http-workers int // Gor scales the number of workers dynamically by default; this sets a maximum number of workers
Gor uses dynamic worker scaling. Enter a number to set a maximum number of workers. default = 0 = unlimited.
-output-http-workers-min int
Gor uses dynamic worker scaling. Enter a number to set a minimum number of workers. default = 1.
-output-kafka-host string
Read request and response stats from Kafka:
gor --input-raw :8080 --output-kafka-host '192.168.0.1:9092,192.168.0.2:9092'
-output-kafka-json-format
If turned on, it will serialize messages from GoReplay text format to JSON.
-output-kafka-topic string
Read request and response stats from Kafka:
gor --input-raw :8080 --output-kafka-topic 'kafka-log'
-output-null
Used for testing inputs. Drops all requests.
-output-stdout
Used for testing inputs. Just prints to console data coming from inputs.
-output-tcp value // Used to pass traffic between multiple gor instances
Used for internal communication between Gor instances. Example:
# Listen for requests on 80 port and forward them to other Gor instance on 28020 port
gor --input-raw :80 --output-tcp replay.local:28020
-output-tcp-secure
Use TLS secure connection. --input-file on another end should have TLS turned on as well.
-output-tcp-stats // Reports TCP output queue stats every 5 seconds
Report TCP output queue stats to console every 5 seconds.
-prettify-http
If enabled, will automatically decode requests and responses with: Content-Encodning: gzip and Transfer-Encoding: chunked. Useful for debugging, in conjuction with --output-stdout
-split-output true
By default each output gets same traffic. If set to true it splits traffic equally among all outputs.
-stats // Turns on queue stats output
Turn on queue stats output
-verbose
Turn on more verbose output

Introduction to Diffy

http://www.github.com/twitter/diffy

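Diffy is an open-source automated testing tool that automatically tests services built on Apache Thrift or HTTP. It needs only minimal configuration, and no test code has to be written afterwards.
Diffy compares the output of a candidate version against the output of a stable version and a replica of that stable version, to check whether the candidate behaves correctly. It therefore starts from the assumption that the candidate and the stable version should produce "similar" output: however the two versions differ internally, their final output should be "similar".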

How Diffy works

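During testing, Diffy acts as a proxy: it sends each incoming request to the different versions of the system and compares their outputs to reach a conclusion.
Diffy needs three instances of the system for its noise filtering and comparison:
Candidate version: the version under test, running newer code than production.
Stable version (primary): usually the version already in production, or one known to work correctly.
Stable version replica (secondary): runs the same code as the stable version and is used mainly to rule out noise.
The overall flow is:
[Figure: Diffy workflow]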

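Where:
1. The raw differences are the differences between the outputs of the candidate and the stable version; they may include the noise mentioned above.
2. The noise is obtained from the stable version and its replica: if two systems running the same code get the same input but produce different output, Diffy treats the difference as noise that developers do not need to care about.

From these two sets of differences, Diffy can identify the real differences between the candidate and the stable version, which are likely to be defects.
A single request is not always conclusive, though, when a value is random. Take a boolean that is true or false with 50% probability each: there is a 50% chance that the candidate differs from the stable version, and a 50% chance that the stable version differs from its replica (which marks the value as noise). For a single request there is thus a 25% chance (0.5 × 0.5) that the value is reported as a defect, namely when the stable version and its replica agree but the candidate differs from the stable version. For this reason Diffy also aggregates raw differences and noise over many requests, and when the two occur at similar rates it treats the previously identified defect as a false positive.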
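
The 25% figure above can be checked by enumerating all equally likely outcomes for the three instances. The sketch below does that in Python. The function names and the tolerance used in the aggregate check are hypothetical, chosen for illustration, and do not reflect Diffy's actual thresholds or implementation.

```python
from fractions import Fraction
from itertools import product


def single_request_rates():
    """Enumerate a fair random boolean on (primary, secondary, candidate)."""
    outcomes = list(product([True, False], repeat=3))
    total = len(outcomes)
    # Raw difference: candidate disagrees with the stable version (primary).
    raw = sum(1 for p, s, c in outcomes if c != p)
    # Noise: the stable version disagrees with its own replica (secondary).
    noise = sum(1 for p, s, c in outcomes if p != s)
    # Flagged on one request: raw difference observed while no noise is observed.
    flagged = sum(1 for p, s, c in outcomes if c != p and p == s)
    return Fraction(raw, total), Fraction(noise, total), Fraction(flagged, total)


def looks_like_regression(raw_rate, noise_rate, tolerance=Fraction(1, 10)):
    """Hypothetical aggregate check: report only if raw differences clearly exceed noise."""
    return raw_rate - noise_rate > tolerance


if __name__ == "__main__":
    raw, noise, flagged = single_request_rates()
    print("raw:", raw, "noise:", noise, "flagged on one request:", flagged)
    print("regression after aggregation:", looks_like_regression(raw, noise))
```

Over many requests the raw and noise rates both come out at one half, so the aggregate check does not report the random field, even though one request in four would have flagged it on its own.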

Examples

gor

https://github.com/buger/goreplay/wiki/Getting-Started

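# Capturing network traffic
1. Start a demo server
gor file-server :8000
2. Capture traffic on port 8000
gor --input-raw :8000 --output-stdout
3. Send a request from a browser or curl; both terminal windows print output
http://localhost:8000


# Replaying traffic
4. Start a second server to act as the replay target
gor file-server :8001
5. Replicate the traffic to the second server
gor --input-raw :8000 --output-http="http://localhost:8001"

Or replicate only 50% of the traffic:
gor --input-raw :8000 --output-http="http://localhost:8001|50%"
6. Send a request from a browser or curl; the servers on ports 8000 and 8001 both show output
http://localhost:8000


# Saving requests to a file and replaying them later
7. Record requests to a file
gor --input-raw :8000 --output-file=requests.gor
The output is actually written in chunks, to files named requests_0.gor, requests_1.gor, and so on: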
236K requests_0.gor
236K requests_10.gor
236K requests_11.gor
240K requests_12.gor
236K requests_13.gor
236K requests_14.gor
188K requests_15.gor
236K requests_1.gor
240K requests_2.gor
236K requests_3.gor
240K requests_4.gor
236K requests_5.gor
240K requests_6.gor
236K requests_7.gor
236K requests_8.gor
236K requests_9.gor
8. Send a request from a browser or curl
http://localhost:8000
9. Replay the traffic; all recorded requests are forwarded to port 8001
gor --input-file requests_0.gor --output-http="http://localhost:8001"
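
Note that the listing above is in lexicographic order, so requests_10.gor appears before requests_1.gor. If the chunks should be replayed in the order they were written, they can be sorted by their numeric suffix first. The Python sketch below assumes that the suffix reflects the recording order and that each chunk is replayed with its own gor invocation; both are assumptions, and the helper names are hypothetical.

```python
import re

# Matches chunk names such as "requests_12.gor" and captures the numeric suffix.
CHUNK_RE = re.compile(r"^.+_(?P<index>\d+)\.gor$")


def chunk_index(filename: str) -> int:
    """Return the numeric suffix of a chunk file name."""
    match = CHUNK_RE.match(filename)
    if match is None:
        raise ValueError(f"not a chunk file name: {filename!r}")
    return int(match.group("index"))


def replay_commands(filenames, target):
    """Build one gor argument list per chunk, ordered by numeric suffix."""
    ordered = sorted(filenames, key=chunk_index)
    return [["gor", "--input-file", name, "--output-http", target] for name in ordered]


if __name__ == "__main__":
    chunks = ["requests_0.gor", "requests_10.gor", "requests_1.gor", "requests_2.gor"]
    for argv in replay_commands(chunks, "http://localhost:8001"):
        print(" ".join(argv))
```

Each argument list can be passed to subprocess.run once the order looks right.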

Rate limiting with gor

# Limiting replay using absolute number
# staging.server will not get more than ten requests per second
gor --input-tcp :28020 --output-http "http://staging.com|10"


# Limiting listener using percentage based limiter
# replay server will not get more than 10% of requests
# useful for high-load environments
gor --input-raw :80 --output-tcp "replay.local:28020|10%"


# Consistent limiting based on Header or URL param value
# Limit based on header value
gor --input-raw :80 --output-tcp "replay.local:28020|10%" --http-header-limiter "X-API-KEY: 10%"
# Limit based on URL param value
gor --input-raw :80 --output-tcp "replay.local:28020|10%" --http-param-limiter "api_key: 10%"

# Performance testing
# Replay from file on 2x speed
gor --input-file "requests.gor|200%" --output-http "staging.com"
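
The header and param limiters keep or drop a request consistently, based on the FNV32-1A hash of the chosen value, so the same API key always ends up on the same side. A minimal Python sketch of that idea follows. The FNV-1a constants are the standard 32-bit ones, but mapping the hash to a percentage with a modulo of 100 is an assumption made here for illustration, and should_replay is a hypothetical name, not a GoReplay function.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Compute the 32-bit FNV-1a hash: XOR each byte in, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h


def should_replay(value: str, percent: int) -> bool:
    """Assumed bucketing: hash the value into 0..99 and keep it if below percent."""
    return fnv1a_32(value.encode("utf-8")) % 100 < percent


if __name__ == "__main__":
    for key in ("alice", "bob", "carol", "dave"):
        # The decision depends only on the key, so repeated requests agree.
        print(key, should_replay(key, 10))
```

Because the decision is a pure function of the value, a given user is either always replayed or never replayed, which keeps multi-request sessions intact on the replay side.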

goreplay+diffy

1. Deploy your old code to localhost:9990. This is your primary.
nohup java -Dserver.port=9990 -jar s1.jar &

2. Deploy your old code to localhost:9991. This is your secondary.
nohup java -Dserver.port=9991 -jar s1.jar &

3. Deploy your new code to localhost:9992. This is your candidate.
nohup java -Dserver.port=9992 -jar s1.jar &

4. Download the latest Diffy binary from maven central or build your own from the code using ./sbt assembly.

5. Run the Diffy jar with following command line arguments:
java -jar diffy-server.jar \
-candidate=localhost:9992 \
-master.primary=localhost:9990 \
-master.secondary=localhost:9991 \
-service.protocol=http \
-serviceName=My-Service \
-proxy.port=:8880 \
-admin.port=:8881 \
-http.port=:8888 \
-rootUrl='localhost:8888'

Note: at this step the source code has to be downloaded and rebuilt to package diffy-server.jar.

java -jar diffy-server.jar -candidate=localhost:9992 -master.primary=localhost:9990 -master.secondary=localhost:9991 -service.protocol=http -serviceName=My-Service -proxy.port=:8880 -admin.port=:8881 -http.port=:8888 -rootUrl='localhost:8888'

After steps 1-5 have run, the processes are:
root 3952 1 0 14:12 ? 00:00:17 java -Dserver.port=9990 -jar s1.jar
root 4004 1 0 14:13 ? 00:00:15 java -Dserver.port=9991 -jar s1.jar
root 4049 1 0 14:13 ? 00:00:15 java -Dserver.port=9992 -jar s1.jar
root 16863 2113 99 14:58 pts/0 00:00:05 java -jar diffy-server.jar -candidate=localhost:9992 -master.primary=localhost:9990 -master.secondary=localhost:9991 -service.protocol=http -serviceName=My-Service -proxy.port=:8880 -admin.port=:8881 -http.port=:8888 -rootUrl=localhost:8888


6. Send a few test requests to your Diffy instance on its proxy port:
curl localhost:8880/your/application/route?with=queryparams

Note: this is the proxy port. When replicating traffic with goreplay, point the output at this proxy port:
gor --input-raw :8000 --output-http="http://diffserver:{diff proxy port}"

7. Watch the differences show up in your browser at http://localhost:8888.
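
When the Diffy and gor invocations need to be scripted, assembling the argument lists in one place helps keep the ports consistent. The Python sketch below builds both command lines from the values used in the steps above. The helper names (diffy_command, gor_command) are hypothetical, and only flags that already appear in this article are used.

```python
import shlex


def diffy_command(candidate, primary, secondary, proxy_port=8880,
                  admin_port=8881, http_port=8888,
                  service_name="My-Service", jar="diffy-server.jar"):
    """Build the Diffy argument list from step 5 (hypothetical helper)."""
    return [
        "java", "-jar", jar,
        f"-candidate={candidate}",
        f"-master.primary={primary}",
        f"-master.secondary={secondary}",
        "-service.protocol=http",
        f"-serviceName={service_name}",
        f"-proxy.port=:{proxy_port}",
        f"-admin.port=:{admin_port}",
        f"-http.port=:{http_port}",
        f"-rootUrl=localhost:{http_port}",
    ]


def gor_command(listen_port, diffy_host, proxy_port=8880):
    """Build the gor argument list that forwards captured traffic to Diffy's proxy port."""
    return [
        "gor", "--input-raw", f":{listen_port}",
        "--output-http", f"http://{diffy_host}:{proxy_port}",
    ]


if __name__ == "__main__":
    print(shlex.join(diffy_command("localhost:9992", "localhost:9990", "localhost:9991")))
    print(shlex.join(gor_command(8000, "localhost")))
```

Both helpers share the same default proxy port, so changing it in one call without the other is the main thing to watch for.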

References

https://goreplay.org/
https://github.com/buger/gor/wiki
https://github.com/twitter/diffy/

https://www.cnblogs.com/playboysnow/articles/9759366.html
https://studygolang.com/articles/10205
http://tyrion.iteye.com/blog/2311987
https://blog.51cto.com/xqtesting/2068569