Yahoo Streaming Benchmark For Heron

Project Overview

GitHub: https://github.com/yahoo/streaming-benchmarks

At Yahoo we have adopted Apache Storm as our stream processing platform of choice. But that was in 2012 and the landscape has changed significantly since then. Because of this we really want to know what Storm is good at, where it needs to be improved compared to other systems, and what its limitations are compared to other tools so we can recommend the best tool for the job to our customers. To do this we started to look for stream processing benchmarks that we could use to do this evaluation, but all of them ended up lacking in several fundamental areas. Primarily they did not test anything close to a real-world use case, so we decided to write a simple one. This is the first round of these tests. The tool here is not polished and only covers three tools and one specific use case. We hope to expand this in the future in terms of the tools tested, the variety of processing tested, and the metrics gathered.

Usage

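Yahoo Streaming Benchmark for Heron is a slimmed-down version of the original Yahoo benchmark project. It temporarily removes the tests for Storm, Flink, and Spark Streaming and keeps only the pared-down Redis, ZooKeeper, Kafka, and data-generation components, which supply data to Heron.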

Commands:

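For now the project has been reduced to a data source and topology project for Heron. Before running the data source start commands, start the following services manually: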

Note: the services below are assumed to be installed and configured already.

zookeeper

yitian@heron01:~$ ./zookeeper/zookeeper-3.4.10/bin/zkServer.sh start

kafka

Once ZooKeeper is up, start Kafka (for the Kafka and ZooKeeper integration setup, see: Kafka + ZooKeeper single-node integration configuration):

yitian@heron01:~/kafka/kafka_2.11-2.0.0$ bin/kafka-server-start.sh config/server.properties

To use this project, manually create a Kafka topic named ad-events:

yitian@heron01:~/kafka/kafka_2.11-2.0.0$ bin/kafka-topics.sh --create --zookeeper 192.168.209.137:2181 --replication-factor 1 --partitions 1 --topic ad-events
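
To confirm that the topic really exists before moving on, the topic list can be checked. The sketch below is only an illustration: the helper name topic_in_list is made up for this example, and it simply reads topic names from standard input, one per line, which is the format kafka-topics.sh --list prints. The ZooKeeper address in the usage comment is the one used in the create command.

```shell
# topic_in_list is a hypothetical helper, not part of Kafka or of this project.
# It reads topic names from stdin (one per line) and succeeds only if the
# name passed as $1 appears as an exact, whole-line match.
topic_in_list() {
  grep -qxF -- "$1"
}

# Usage (from the Kafka directory, with ZooKeeper and Kafka running):
#   bin/kafka-topics.sh --list --zookeeper 192.168.209.137:2181 \
#     | topic_in_list ad-events && echo "ad-events exists"
```

Matching the whole line keeps a similarly named topic such as ad-events-old from counting as a hit.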

Once the topic is created, start a console producer and a console consumer to verify that Kafka is working. The console consumer is also used later to receive the data generated by yahoo-streaming-benchmark-for-heron:

yitian@heron01:~/kafka/kafka_2.11-2.0.0$ bin/kafka-console-producer.sh --broker-list 192.168.209.137:9092 --topic ad-events
yitian@heron01:~/kafka/kafka_2.11-2.0.0$ bin/kafka-console-consumer.sh --bootstrap-server 192.168.209.137:9092 --topic ad-events --from-beginning

redis

Start Redis with:

yitian@heron01:/usr/local/bin$ redis-server /etc/redis/redis.conf 

Redis here is configured with a log directory:

[Image: Redis log directory configuration]

Shut down Redis with:

yitian@heron01:/usr/local/bin$ redis-cli shutdown

If the shutdown fails, find the Redis process with the following command and stop it with kill:

yitian@heron01:/usr/local/bin$ ps axu|grep redis
yitian    10106  0.1  0.0  44764  3188 ?        Ssl  20:29   0:01 redis-server 127.0.0.1:6379
yitian    10112  0.1  0.3 656136 15616 ?        Sl   20:29   0:01 gedit /home/yitian/redis/redis.log
yitian    10818  0.0  0.0  14224   932 pts/18   S+   20:51   0:00 grep --color=auto redis
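
In that listing only the first line is the Redis server, and its PID is the second column. As a sketch, the PID can be extracted from the ps output instead of being read by eye. The helper name redis_server_pids is made up for this example, and it assumes the ps aux column layout shown above, where the command is the 11th field.

```shell
# redis_server_pids is a hypothetical helper for this example.
# It reads `ps aux` output from stdin and prints the PID (column 2) of every
# process whose command (column 11) is redis-server, with or without a
# leading path. The gedit and grep lines in the listing above do not match.
redis_server_pids() {
  awk '$11 == "redis-server" || $11 ~ /\/redis-server$/ { print $2 }'
}

# Usage (xargs -r is a GNU extension that skips kill when nothing matched):
#   ps aux | redis_server_pids | xargs -r kill
```

Filtering on the command column avoids killing the editor that has redis.log open, which a plain grep for "redis" would also match.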

Once the services above are running, start the data source as follows:

First run:

yitian@heron01:~/heronIdea/yahoo-benchmark-for-heron$ ./start-stream.sh START_REDIS

Then run:

yitian@heron01:~/heronIdea/yahoo-benchmark-for-heron$ ./start-stream.sh START_LOAD

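If everything went well, the output looks like this:

[Image: terminal output after a successful run]

The right-hand side shows the data received by the Kafka console consumer, which confirms that the generated ad data has been written to Kafka. The next step is to use the Kafka spout in Heron to feed this data into a topology for processing.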

Command to stop data loading:

yitian@heron01:~/heronIdea/yahoo-benchmark-for-heron$ ./start-stream.sh STOP_LOAD

Project Configuration

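Note: when you are only using this project, do not run the SETUP command. It overwrites {project directory}/conf/localConf.yaml, which already holds the finished configuration and does not need to change for now. When a change is needed, edit that configuration file directly rather than running SETUP.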

...

Problems Encountered

1. Too much was stripped from the commands

yitian@heron01:~/heronIdea/yahoo-benchmark-for-heron$ WARNING: send already refers to: #'clojure.core/send in namespace: setup.core, being replaced by: #'clj-kafka.new.producer/send
{:redis-host localhost, :kakfa-brokers 192.168.209.137:9092}
Running, emitting 1000 tuples per second.
Exception in thread "main" java.lang.RuntimeException: No Campaigns found. Please run with -n first., compiling:(/tmp/form-init5902727056091938465.clj:1:73)
	at clojure.lang.Compiler.load(Compiler.java:7239)
	at clojure.lang.Compiler.loadFile(Compiler.java:7165)
	at clojure.main$load_script.invoke(main.clj:275)
	at clojure.main$init_opt.invoke(main.clj:280)
	at clojure.main$initialize.invoke(main.clj:308)
	at clojure.main$null_opt.invoke(main.clj:343)
	at clojure.main$main.doInvoke(main.clj:421)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: java.lang.RuntimeException: No Campaigns found. Please run with -n first.
	at setup.core$gen_ads$fn__1508.invoke(core.clj:156)
	at setup.core$gen_ads.invoke(core.clj:152)
	at setup.core$run.invoke(core.clj:185)
	at setup.core$_main.doInvoke(core.clj:283)
	at clojure.lang.RestFn.invoke(RestFn.java:482)
	at clojure.lang.Var.invoke(Var.java:401)
	at user$eval5.invoke(form-init5902727056091938465.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6782)
	at clojure.lang.Compiler.eval(Compiler.java:6772)
	at clojure.lang.Compiler.load(Compiler.java:7227)
	... 11 more

Solution: run START_REDIS first; it includes the data command that runs with the -n flag.
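
The fix comes down to ordering: the load generator must not start until the START_REDIS step has succeeded. One way to make that hard to get wrong is a small wrapper, sketched here. The names STREAM_SCRIPT and start_data_source are made up for this example, and the sketch assumes that start-stream.sh returns a non-zero exit status when a step fails, which this README does not confirm.

```shell
# STREAM_SCRIPT and start_data_source are names chosen for this sketch.
# STREAM_SCRIPT defaults to the script used throughout this README.
STREAM_SCRIPT="${STREAM_SCRIPT:-./start-stream.sh}"

start_data_source() {
  # Run the START_REDIS step first; it is the one that runs the -n command.
  if ! "$STREAM_SCRIPT" START_REDIS; then
    echo "START_REDIS failed; not starting the load generator" >&2
    return 1
  fi
  # Only then start emitting events.
  "$STREAM_SCRIPT" START_LOAD
}

# Usage (from the project directory):
#   start_data_source
```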

2. Redis configuration problem

yitian@heron01:~/heronIdea/yahoo-benchmark-for-heron$ WARNING: send already refers to: #'clojure.core/send in namespace: setup.core, being replaced by: #'clj-kafka.new.producer/send
{:redis-host localhost, :kakfa-brokers 192.168.209.137:9092}
Running, emitting 1000 tuples per second.
Exception in thread "main" java.util.NoSuchElementException: Could not create a validated object, cause: MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error., compiling:(/tmp/form-init3919348844310058731.clj:1:73)
	at clojure.lang.Compiler.load(Compiler.java:7239)
	at clojure.lang.Compiler.loadFile(Compiler.java:7165)
	at clojure.main$load_script.invoke(main.clj:275)
	at clojure.main$init_opt.invoke(main.clj:280)
	at clojure.main$initialize.invoke(main.clj:308)
	at clojure.main$null_opt.invoke(main.clj:343)
	at clojure.main$main.doInvoke(main.clj:421)
	at clojure.lang.RestFn.invoke(RestFn.java:421)
	at clojure.lang.Var.invoke(Var.java:383)
	at clojure.lang.AFn.applyToHelper(AFn.java:156)
	at clojure.lang.Var.applyTo(Var.java:700)
	at clojure.main.main(main.java:37)
Caused by: java.util.NoSuchElementException: Could not create a validated object, cause: MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error.
	at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1234)
	at redis.connection_pool.ConnectionPool.get_connection(connection_pool.clj:11)
	at setup.core$gen_ads.invoke(core.clj:152)
	at setup.core$run.invoke(core.clj:185)
	at setup.core$_main.doInvoke(core.clj:283)
	at clojure.lang.RestFn.invoke(RestFn.java:482)
	at clojure.lang.Var.invoke(Var.java:401)
	at user$eval5.invoke(form-init3919348844310058731.clj:1)
	at clojure.lang.Compiler.eval(Compiler.java:6782)
	at clojure.lang.Compiler.eval(Compiler.java:6772)
	at clojure.lang.Compiler.load(Compiler.java:7227)
	... 11 more

Fix:

[Image: fix for the Redis configuration problem]

Reference: https://blog.csdn.net/qq_31766907/article/details/78715935
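
The error text itself names the option involved, stop-writes-on-bgsave-error: Redis is refusing writes because its last RDB snapshot could not be saved. A common workaround, given here as an assumption and not necessarily the exact change made for this project, is to relax that option at runtime. The names REDIS_CLI and allow_writes_after_bgsave_error are made up for this sketch. The underlying cause, often the permissions or free space of the snapshot directory, is still worth checking in the Redis log.

```shell
# REDIS_CLI and allow_writes_after_bgsave_error are names chosen for this sketch.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

allow_writes_after_bgsave_error() {
  # Print the directory where Redis writes its RDB snapshot, so that its
  # permissions and free space can be checked.
  "$REDIS_CLI" config get dir
  # Accept writes even if the last background save failed. The change lasts
  # only until Redis restarts unless the same option is set in redis.conf.
  "$REDIS_CLI" config set stop-writes-on-bgsave-error no
}

# Usage (with redis-server running):
#   allow_writes_after_bgsave_error
```

This trades safety for availability: with the check disabled, Redis keeps accepting writes that may never reach disk, which is usually acceptable for benchmark data but not for data that has to survive a restart.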