Steem-Python API and RPC Speed

in #steemdev, 7 years ago (edited)

Some struggle with Steem-Python


dorabot@dorabot Protector of Peace

Using steem-python to access the Steem API can sometimes turn out to be painfully slow, especially when using the official Steemit RPC nodes.

The RPC nodes seem to be frequently under heavy load, which makes response times extremely unreliable. In many cases this only leads to slow results, but it can even end in incomplete or missed data. One example of this is trying to stream data from the blockchain using the function stream_comments().

Part of my goal with @dorabot was to enable interactions via Steemit. To make that happen, I needed a reliable way to process all comments posted to the blockchain. My initial choice was to use a function called stream_comments in the Steemd class.

def stream_comments(self, *args, **kwargs):
    """ Generator that yields posts when they come in
        To be used in a for loop that returns an instance of `Post()`.
    """
(Steem-Python on GitHub).

The stream_comments() function is built to listen to all the blocks in the blockchain and return a Post object for each post/comment. Very convenient, but returning a Post object means an additional API call for every comment, and that makes it horrendously slow.
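To illustrate the alternative, here is a minimal sketch of pulling comments straight out of a raw block without creating a Post object for each one. It assumes the usual raw block layout, where each transaction carries a list of [operation_type, operation_data] pairs. The function name iter_comment_ops and the sample block are made up for illustration and are not part of steem-python.

```python
def iter_comment_ops(block):
    """Yield the payload of every 'comment' operation in a raw block dict.

    Both new posts and replies are 'comment' operations; a top-level post
    has an empty parent_author.
    """
    for tx in block.get("transactions", []):
        for op_type, op_data in tx.get("operations", []):
            if op_type == "comment":
                yield op_data


# Hand-written sample in the assumed shape of a raw block (not real chain data).
sample_block = {
    "timestamp": "2017-09-15T12:00:00",
    "transactions": [
        {"operations": [["vote", {"voter": "alice", "author": "bob",
                                  "permlink": "hello", "weight": 10000}]]},
        {"operations": [["comment", {"parent_author": "",
                                     "parent_permlink": "steemdev",
                                     "author": "bob", "permlink": "hello",
                                     "title": "Hello", "body": "First post"}]]},
        {"operations": [["comment", {"parent_author": "bob",
                                     "parent_permlink": "hello",
                                     "author": "alice", "permlink": "re-hello",
                                     "title": "", "body": "@dorabot ?winner"}]]},
    ],
}

comments = list(iter_comment_ops(sample_block))
for c in comments:
    kind = "post" if c["parent_author"] == "" else "reply"
    print(kind, c["author"], c["permlink"])
```

Because everything needed is already in the block, no extra request is made per comment. The trade-off is that you only get the fields of the operation itself, not the richer data a Post object would look up.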

When I was looking for a solution I stumbled upon this post made by @pibara.

He was having similar issues and wrote a pretty neat script to measure the performance of fetching raw blocks from the blockchain. His post is old by now and doesn't include the latest bug fix: the following line needs to be moved outside of the while loop for the script to function properly.


index = start

The script uses no fancy API calls and is an excellent way to measure the pure performance of fetching blocks from the blockchain.
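This is not @pibara's script, just a rough sketch of the same idea: time each raw block fetch in a plain while loop, with the index set once before the loop starts. The names measure_fetch and fake_fetch are hypothetical. In a real run you would pass in a function that asks an RPC node for a block by number, which I assume steem-python can do through its block lookup call.

```python
import time


def measure_fetch(fetch_block, start, count, clock=time.perf_counter):
    """Fetch `count` consecutive blocks and return (block_num, seconds) pairs."""
    timings = []
    index = start  # set once, outside the loop, so the loop can advance
    while index < start + count:
        began = clock()
        fetch_block(index)
        timings.append((index, clock() - began))
        index += 1
    return timings


def fake_fetch(block_num):
    """Stand-in for a real RPC call; returns an empty block immediately."""
    return {"block_num": block_num, "transactions": []}


timings = measure_fetch(fake_fetch, start=100, count=5)
for block_num, seconds in timings:
    print(block_num, round(seconds, 6))
```

If `index = start` sat inside the loop, the index would be reset on every pass and the same block would be fetched over and over, which is why that line has to live outside.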

When I ran the script from @pibara, fetching data from the Steemit RPC nodes was sometimes slow, but it could at least easily keep up. So it was clear that the stream_comments function should be avoided.

My next step was to use this script to compare the performance of different RPC nodes. Ever since I started using steem-python I have noticed that the response time can vary greatly. And as @followbtcnews (in cooperation with @crimsonclad) had just released a full node as part of the Minnow Support Project, I thought it would be a cool thing to compare. Please follow this link for the announcement from @followbtcnews.
https://steemit.com/witness-category/@followbtcnews/heard-you-needed-a-node-witness-followbtcnews-launches-a-new-public-full-rpc-node

I started the test with the default Steemit nodes. Since these are the defaults, no specific configuration is needed. As soon as you create an object of the Steem class, like below, you connect to one of the default nodes.


from steem import Steem
steem = Steem()


Steemit RPC Node

And below is the code to connect to a different node.


# Pass a list of node URLs to use instead of the defaults
my_nodes = ['https://steemd.minnowsupportproject.org']
steem = Steem(nodes=my_nodes)


MSP's RPC Node

As you can see above, the results from the default nodes vary quite a bit, while the server from @followbtcnews is very consistent. Don't look blindly at the "seconds behind" value. There seems to be some kind of delay in how the blocks are presented through the API, so being ~60 seconds behind is normal.
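For anyone who wants to put numbers on "consistent", here is a small sketch of the two things I look at: the spread of the per-block fetch times and the "seconds behind" value. The helper names are hypothetical, the latency lists are invented placeholders rather than my measurements, and I am assuming block timestamps are UTC strings like 2017-09-15T12:00:00.

```python
from datetime import datetime, timezone
from statistics import mean, pstdev


def summarize(latencies):
    """Return mean, standard deviation and worst case of fetch times (seconds)."""
    return {
        "mean": mean(latencies),
        "stdev": pstdev(latencies),
        "max": max(latencies),
    }


def seconds_behind(block_timestamp, now):
    """Age of a block in seconds, assuming the timestamp is UTC without a suffix."""
    block_time = datetime.strptime(block_timestamp, "%Y-%m-%dT%H:%M:%S")
    block_time = block_time.replace(tzinfo=timezone.utc)
    return (now - block_time).total_seconds()


# Invented example values, only to show how the summary reads.
steady_node = [0.2, 0.2, 0.2, 0.2]
spiky_node = [0.1, 0.1, 2.0, 0.1]

steady = summarize(steady_node)
spiky = summarize(spiky_node)
print("steady:", steady)
print("spiky: ", spiky)

now = datetime(2017, 9, 15, 12, 1, 0, tzinfo=timezone.utc)
lag = seconds_behind("2017-09-15T12:00:00", now)
print("seconds behind:", lag)
```

A node with a low mean but a high standard deviation or maximum is the kind that feels fine most of the time and then stalls, which is exactly what hurts a bot that has to keep up with every block.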


Seeing these results I will definitely be moving @dorabot away from the default RPC nodes.

Thank you for reading!
Stay tuned for future updates.

Please let me know if you have any questions.
And please ping me (@danielsaori) if you connect to Discord.

Proud member of #minnowsupportproject & #teamaustralia
Thank you @aggroed, @ausbitbank, @teamsteem,
@theprophet0, @someguy123, @canadian-coconut and @sirknight

Click HERE to learn more about Minnow Support Project.
Click HERE to connect to our Discord chat server.



but it can even end up with incomplete and missed data

I've been asking a few others about this, and I've been wondering whether it's a potential issue with the websocket implementation being used, as I described in this GitHub issue:

https://github.com/zaphoyd/websocketpp/issues/641#issuecomment-329650068

I believe this is also related to an overloaded server (at times of especially "high response latency"), but here's the gist of it:

All frames returned are in JSON format and no compression is being used (I believe there were issues when they tried enabling compression). Every so often, I get what appears to be a "complete" websocket frame, but the JSON packet is incomplete (i.e. the frame size appears to be smaller than the actual packet size). The rest of the packet still comes through (but not as another frame). After appending the remainder to the original truncated packet, all seems well again.

I have a feeling if STEEMIT and other graphene-related chains transitioned to use https://github.com/uNetworking/uWebSockets instead, it might help alleviate at least some of these issues.
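The reassembly workaround described in the comment above can be sketched roughly like this: keep appending incoming text to a buffer and only hand the message on once it parses as JSON. This is an illustration only, not the commenter's code and not tied to any particular websocket library; the class name JsonReassembler and the sample payload are made up.

```python
import json


class JsonReassembler:
    """Buffer text chunks until they form one complete JSON document."""

    def __init__(self):
        self._buffer = ""

    def feed(self, chunk):
        """Add a chunk; return the parsed message, or None if still incomplete."""
        self._buffer += chunk
        try:
            message = json.loads(self._buffer)
        except json.JSONDecodeError:
            return None  # truncated so far, wait for the remainder
        self._buffer = ""
        return message


reassembler = JsonReassembler()
first = reassembler.feed('{"id": 1, "result": {"block')   # truncated packet
second = reassembler.feed('_num": 42}}')                  # the remainder
print(first)
print(second)
```

A real client would also want a cap on the buffer size and a timeout, so that a packet that never completes cannot block everything that follows.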

Great finding! 👍
As far as I can remember, I have never had that error thrown. But I also only tested for a short while before I moved away from the default RPC nodes.
There could be many things playing a role here, but for sure, with the stream_comments function there is an enormous overhead in instantiating a Post object for each comment.

Hmm, I never understood (and still don't understand) what an RPC node is, but your bit about the default nodes being slow due to heavy load makes sense.

I actually made a script a few weeks ago using the steem-python API, and it was REALLY slow. Until now I thought that the Python API itself was slower than the other APIs, but after reading this post I realised that it's the RPC nodes.

You should give it a try with another RPC node and see how it goes.
Depending on the complexity and number of queries, it can make a big difference.

Thank you for sharing it !

your post is really amazing friend, I like

Hi @danielsaori,

Thanks for sharing this!

I'm looking for an API that allows me to create content (posts). I have been reading the documentation and it seems that we can only retrieve information, create comments, or up-vote.

I wonder if you could point me to an API that allows creating posts on steemit.com.

Thanks,

@realskilled

If you are using steem-python, you have a function called post() available for both publishing new posts and making comments.

Well, actually I need to use this from PHP :)

Is that Doraemon??😸

Avenger Dora! 😉

This post received a 5% vote by @minnowsupport courtesy of @followbtcnews from the Minnow Support Project ( @minnowsupport ). Join us in Discord.

Upvoting this comment will help support @minnowsupport.

Thanks!
So you think this was a boobs-worthy post? ;)

No, I just assumed you wanted to see some boobs!

@dorabot another ?winner test

No Winner!! No upvotes on this post... :(

Can you tell me @dorabot who the ?winner is?

The winner is: qwasert!!!

The winner is: danielsaori!!!

hmmm.. This above was from the esteem app...

Ah, so the esteem app is not formatting the JSON metadata in line with steemit.com...
@dorabot ?winner

The winner is: dorabot!!!

This post has been resteemed by @minnowsupport courtesy of @followbtcnews from the Minnow Support Project ( @minnowsupport ). Join us in Discord.

Upvoting this comment will help support @minnowsupport.