In the article "Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains", the Hyperledger community shows that Fabric achieves end-to-end throughput of more than 3500 transactions per second in certain popular deployment configurations. I'm trying to reach that result in my project, but I'm far from it. Here I report my first load-testing results and invite you to join the investigation into how to achieve high throughput with Hyperledger Fabric and Composer.
We are building a high-load service that uses Hyperledger Fabric. Our backend consists of an HF blockchain network, several Node.js microservices that communicate with the blockchain via Hyperledger Composer, and a message broker for communication between the microservices.
Hyperledger Fabric v1.1. Hyperledger Composer v0.19.0
Fabric network (deployed with Cello):
{
    fabric001: {
        cas: [],
        peers: ["[email protected]"],
        orderers: ["orderer1st.orderer"],
        zookeepers: ["zookeeper1st"],
        kafkas: ["kafka1st"]
    },
    fabric002: {
        cas: [],
        peers: ["[email protected]"],
        orderers: ["orderer2nd.orderer"],
        zookeepers: ["zookeeper2nd"],
        kafkas: ["kafka2nd"]
    },
    fabric003: {
        cas: [],
        peers: ["[email protected]"],
        orderers: ["orderer3rd.orderer"],
        zookeepers: ["zookeeper3rd"],
        kafkas: ["kafka3rd"]
    },
    fabric004: {
        cas: ["ca1st.main"],
        peers: [],
        orderers: ["orderer4th.orderer"],
        zookeepers: ["zookeeper4th"],
        kafkas: ["kafka4th"]
    }
}
fabric001 through fabric004 are AWS EC2 instances of type t2.xlarge. Initially I used m5.4xlarge, but it is expensive, and CPU usage stayed low even when Fabric started to fail.
Fabric config:
BatchTimeout: 0.2s
BatchSize:
    MaxMessageCount: 10
    AbsoluteMaxBytes: 98 MB
    PreferredMaxBytes: 512 KB
TLS disabled.
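To make the batch settings above easier to reason about, here is a rough Python sketch of how the orderer would cut blocks at a steady submit rate. It assumes small transactions, so neither byte limit is ever the trigger, and the function name and its parameters are mine, not part of Fabric. Keep in mind that queries evaluated on a peer do not pass through the orderer, so this matters only for submitted transactions.

```python
def block_cutting_estimate(tx_per_second, batch_timeout_s, max_message_count):
    """Rough estimate of block cutting at a steady transaction rate.

    Assumption: transactions are small, so AbsoluteMaxBytes and
    PreferredMaxBytes never trigger a cut.
    """
    # Transactions that arrive within one BatchTimeout window.
    arrivals_per_window = tx_per_second * batch_timeout_s
    if arrivals_per_window >= max_message_count:
        # The count limit is reached before the timer fires.
        trigger = "MaxMessageCount"
        tx_per_block = float(max_message_count)
    else:
        # The timer fires first; a block holds whatever arrived (at least one tx).
        trigger = "BatchTimeout"
        tx_per_block = max(1.0, arrivals_per_window)
    blocks_per_second = tx_per_second / tx_per_block
    return trigger, tx_per_block, blocks_per_second


if __name__ == "__main__":
    # Settings from this post, at a hypothetical 70 submitted tx/s.
    print(block_cutting_estimate(70, 0.2, 10))
```

With these settings and 70 submitted transactions per second, the count limit would cut blocks well before the 0.2 s timer, giving many small blocks; whether that hurts in practice is something only a measurement can tell.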
If required, I can perform new tests with any configuration.
First of all, I decided to test query requests against the ledger state (CouchDB). The blockchain is empty apart from system data and a few participants. Direct queries to the open CouchDB port are very fast (~150 ms). My microservice connects to Fabric by establishing a permanent connection for an existing identity. Without high load, requests take ~500 ms in our system; half of this time is spent in the message broker (AWS SQS is really slow). For load testing I'm using the YandexTank tool. The load runs smoothly, without latency increasing, up to ~70 requests per second. Then the latency stats degrade, and at some point the chaincode starts returning error messages. You can see the test results here:
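One way to read these numbers is through Little's law (average requests in flight = arrival rate × average latency). The sketch below is only a back-of-the-envelope estimate: it assumes a steady state and treats the latencies quoted above as averages, and the function name is my own.

```python
def in_flight_requests(requests_per_second, latency_seconds):
    """Little's law: average concurrency = arrival rate * average time in system."""
    return requests_per_second * latency_seconds


if __name__ == "__main__":
    rate = 70  # requests per second at which latency starts to degrade
    # Whole request path (~500 ms) versus the half not spent in the message broker.
    for label, latency in [("whole path", 0.5), ("without broker", 0.25)]:
        print(label, in_flight_requests(rate, latency))
```

Under these assumptions, roughly 35 requests are in flight across the whole path at 70 requests per second, and about half of them are inside the Composer, peer, chaincode, and CouchDB part. If some component in that path caps concurrency near that level, latency would start climbing at about this rate, but that is a hypothesis to check, not a finding.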
TEST RESULTS
There are two types of error messages that I received during iterations of load tests:
1. [Hyperledger-Composer] undefined:HLFQueryHandler :queryChaincode() query payload returned an error: Error: 2 UNKNOWN: error executing chaincode: failed to execute transaction: timeout expired while executing transaction
2. HLFQueryHandler :queryChaincode() query payload returned an error: Error: 2 UNKNOWN: error executing chaincode: transaction returned with failure: Error: The current identity, with the name 'txBuilder' and the identifier '5606acbada327a8ef33134e601f990076872b31a3dda5ec0a983e04915d16007', has not been registered
The chaincode container does not restart by itself, but from this point on it doesn't work well. Sometimes I can ping it and sometimes I can't, but either way latency is terrible. Only restarting the peer container helps. (As a reminder, requests to the ledger go through a single peer because of Composer; that's not good, but it's not the point of my investigation.) The second error is really strange, because this is the only identity I use, and it works before the chaincode starts to fail. It also works again after I restart the peer.
While the load is applied, the peer, chaincode, and CouchDB containers use the most CPU (as expected). I'm in the middle of configuring a monitoring system for my blockchain network, and soon I will be able to share more information.
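To see which of the two errors appears first and how often, the microservice logs can be tallied per test run. This is a minimal Python sketch; the substrings are taken from the two messages quoted above, while the labels, the function name, and the "other" bucket are my own choices.

```python
from collections import Counter

# Substrings copied from the two error messages quoted above.
ERROR_PATTERNS = {
    "chaincode_timeout": "timeout expired while executing transaction",
    "identity_not_registered": "has not been registered",
}


def classify_errors(log_lines):
    """Count log lines per known error type; unknown query errors go to 'other'."""
    counts = Counter()
    for line in log_lines:
        for label, needle in ERROR_PATTERNS.items():
            if needle in line:
                counts[label] += 1
                break
        else:
            # No known pattern matched; still count generic query failures.
            if "returned an error" in line:
                counts["other"] += 1
    return counts
```

Feeding it the log of each load-test iteration gives a quick per-run breakdown, which could help show whether the identity error only ever follows the timeouts.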
Any thoughts?
I've been advised to use c*-type AWS instances for deploying Fabric. I chose c5.4xlarge (16 vCPU) for my tests. I also changed the Fabric config a little:
BatchTimeout: 1s
BatchSize:
    MaxMessageCount: 20
    AbsoluteMaxBytes: 98 MB
    PreferredMaxBytes: 512 KB
I performed the same test and, to my regret, I got the same result:
TEST RESULTS
In the figure below you can see the plot of the containers' CPU usage during the test, which lasts 1 minute.
Total CPU usage peaked at about 30%, so the cause of the latency degradation lies elsewhere.
As the performance results were very poor, I decided to continue my tests with pure Fabric, without any unnecessary intermediate components: just the Fabric network and the Node.js SDK. See new report here
Transactions may be of two types: Deploy transactions create new chaincode and take a program as parameter. When a deploy transaction executes successfully, the chaincode has been installed “on” the blockchain. Invoke transactions perform an operation in the context of previously deployed chaincode.
Chaincode can turn business logic into an executable program that is agreed to and verified by all members of the blockchain network. Business logic includes the definition of assets that are traded between parties. It also consists of the terms and conditions that are required for a transaction to be executed.
Hyperledger Fabric is built on a modular architecture that separates transaction processing into three phases: distributed logic processing and agreement ("chaincode"), transaction ordering, and transaction validation and commitment.
In a Hyperledger Fabric network, a node or collection of nodes together form what's called an “ordering service”, which literally orders transactions into blocks, which peers will then validate and commit to their ledgers.
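The three phases described above can be illustrated with a toy Python model. It is a deliberately simplified sketch and not Fabric code: state is a plain dictionary, versions are simple counters rather than Fabric's block and transaction heights, endorsement policy checks are omitted, and all function names are hypothetical.

```python
def endorse(state, key, delta):
    """Execute phase: simulate against current state; record read version and write."""
    value, version = state.get(key, (0, 0))
    return {"key": key, "read_version": version, "write_value": value + delta}


def order(proposals, max_message_count):
    """Ordering phase: batch transactions into blocks in arrival order."""
    return [proposals[i:i + max_message_count]
            for i in range(0, len(proposals), max_message_count)]


def validate_and_commit(state, block):
    """Validation phase: commit a tx only if the version it read is still current."""
    results = []
    for tx in block:
        _, current_version = state.get(tx["key"], (0, 0))
        if tx["read_version"] == current_version:
            state[tx["key"]] = (tx["write_value"], current_version + 1)
            results.append("VALID")
        else:
            # Another transaction changed the key after this one was simulated.
            results.append("MVCC_READ_CONFLICT")
    return results


# key -> (value, version)
state = {"a": (100, 1), "b": (50, 1)}
# Two transactions touch key "a" before either one is committed.
proposals = [endorse(state, "a", -10), endorse(state, "a", -5), endorse(state, "b", 7)]
blocks = order(proposals, max_message_count=2)
outcomes = [validate_and_commit(state, block) for block in blocks]
print(outcomes)
print(state)
```

In this model the second update to key "a" is rejected at validation because it was simulated against a version that the first update has already replaced. For load tests that submit transactions, this means that many concurrent writes to the same keys can lower the committed throughput even when the network itself keeps up.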
I ran a similar test on a similar setup and achieved about 220 RPS using 8 peer nodes in a single org. With a second org, performance would certainly drop. I used the high-performance chaincode provided with the Fabric samples. I'm not sure how they managed to get 3500 RPS.