Steem Pressure #3 - Steem Node 101

in #steem-pressure, 6 years ago (edited)

Everything you’ve always wanted to know about your first Steem Node but were afraid to ask.

Our goal for today

We will set up the system, build steemd manually, configure a simple Steem seed node, run it and sync it with the current head block.

What is that "simple Steem seed node"?


Video created for Steem Pressure series.

A consensus node.

Every Steem node is at least a consensus node.
In its simplest form, a consensus node is a node connected to the Steem network for the sole purpose of receiving and sending blocks and transactions.

The more Steem nodes are running, the more decentralized and resilient the Steem network is.

A consensus node is also called a “Low Memory Node” - the name comes from the compile-time option LOW_MEMORY_NODE=ON, which is used to build steemd in such a way that data and fields not needed for consensus are not stored in the object database.

A low memory node is all that is needed (and therefore recommended) for seed nodes, witness nodes, and nodes run by exchanges.

A full node.

Running a low memory node is enough in many cases, but we still need full nodes to be able to use certain plugins and their APIs.

Importantly, “full node” means something different here than in the Bitcoin realm: what a Bitcoin full node does is pretty much what a Steem consensus node does.

Here, on Steem, the word “full” doesn’t refer to anything related to the blockchain - it refers to the fully featured set of APIs enabled within steemd.

For example, Condenser, which powers the steemit.com site, uses those APIs to display posts, comments, votes, feeds, and tags.

Many of those calls don’t need to be served by steemd, and in the future they will be served by various microservices.

Full nodes have significantly higher resource requirements, but this issue will not be covered in this episode.

Setting up the hardware

In the previous episode, Steem Pressure #2 - toys for boys and girls, I gave you some tips about the hardware you might need.

In this episode, I will use an entry-level dedicated machine:
Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz on an Ivy Bridge with 32GB DDR3 1333MHz RAM and 3x 120GB SSD

OS setup

I’m currently using Ubuntu 16.04 LTS.
You can assume that it’s a default clean install.

It’s up to you how you set up your system to suit your needs and make the best use of its hardware components. Every case is different, so there’s no ultimate solution here. In the end, if you run a Steem node as a public service, you are expected to be qualified enough for sysadmin tasks.

In this example, the performance of each of the three disks is good on “Gandalf’s scale” ;-) i.e. each can complete the trivial benchmark presented in the previous episode in less than 8 seconds.

I’ve used a 12GB swap partition on each of those drives, all with the same priority.
The / and /boot partitions are configured as software RAID1 on all three drives.
/home is configured as software RAID0 on all three drives.

It’s OK to use RAID0 in my case, because I’m not going to use it to store anything important (anything I can’t afford to lose in the event of a power or drive failure), and that makes it able to pass the test in around 4 seconds.
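
To confirm that the swap partitions really share one priority, a small check along these lines may help. It is only a sketch: it assumes the usual /proc/swaps table layout (one header line, priority in the fifth column), and swap_priorities_equal is a name made up for this example.

```shell
#!/bin/sh
# Succeeds when every swap area listed in a /proc/swaps-style table has the
# same priority. Hypothetical helper; the file argument defaults to /proc/swaps.
# Fails when no swap area is listed at all.
swap_priorities_equal() {
    # Skip the header line, collect the Priority column, count distinct values.
    distinct=$(awk 'NR > 1 { print $5 }' "${1:-/proc/swaps}" | sort -u | wc -l)
    [ "$distinct" -eq 1 ]
}

# Example:
# swap_priorities_equal || echo "swap areas have different priorities" >&2
```

With equal priorities the kernel spreads swap pages across the areas, which is the reason for setting them up that way.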

Software prerequisites

Let's prepare our system so that we can build and run steemd.

Add a steem user:
useradd -s /bin/bash -m steem
Update the list of packages:
apt update
Make sure that your packages are up to date:
apt upgrade
Make sure that you have a reliable time source:
apt install ntp

If you wish to use tmpfs for the shared memory file (which is an optional solution), prepare enough free space on the target tmpfs device, create a directory and set ownership:
mount -o remount,size=48G /run
mkdir /run/steem && chown steem:steem $_
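
Before relying on that tmpfs, it may be worth checking that it really has the room you expect. The helper below is a minimal sketch; has_free_gib is a hypothetical name, and the 48 in the example simply mirrors the remount size used above.

```shell
#!/bin/sh
# Succeeds when the filesystem holding PATH has at least GIB gibibytes free.
# has_free_gib is a hypothetical helper; df -Pk is the POSIX output format,
# where the fourth column of the second line is the available space in KiB.
has_free_gib() {
    avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $(( $2 * 1024 * 1024 )) ]
}

# Example:
# has_free_gib /run/steem 48 || echo "not enough room on /run/steem" >&2
```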

Building steemd

To use Docker or not?

Use it. It’s way faster and easier and it protects you from many possible errors that you will likely make by not following the manual very carefully.

Manual way

I'm not going to use Docker now, because I want you to get more familiar with the steemd building process.

Install packages needed to build steemd:

apt install \
    autoconf \
    automake \
    cmake \
    doxygen \
    g++ \
    git \
    libboost-chrono-dev \
    libboost-context-dev \
    libboost-coroutine-dev \
    libboost-date-time-dev \
    libboost-filesystem-dev \
    libboost-iostreams-dev \
    libboost-locale-dev \
    libboost-program-options-dev \
    libboost-serialization-dev \
    libboost-signals-dev \
    libboost-system-dev \
    libboost-test-dev \
    libboost-thread-dev \
    libncurses5-dev \
    libreadline-dev \
    libssl-dev \
    libtool \
    make \
    perl \
    pkg-config \
    python3 \
    python3-jinja2 \
    wget

This list is likely to change in future releases, but you can always take a look at the Dockerfile to get an idea of what is needed. Or just go and use Docker instead (we will cover this question in one of the upcoming episodes).
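
A quick way to see whether the command-line tools from the package list above actually ended up on your PATH is sketched below. missing_tools is a hypothetical helper, and it only covers packages that ship a command of the same name; the -dev libraries are not checked.

```shell
#!/bin/sh
# Print every command from the argument list that is not found on PATH.
# missing_tools is a hypothetical helper, not part of the Steem tooling.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example: tools from the package list that provide a same-named command.
# missing_tools autoconf automake cmake doxygen g++ git make perl pkg-config python3 wget
```

An empty result means every listed tool was found.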

From now on, you can perform all steps as the user steem:
su - steem

Clone Steem from GitHub:
git clone https://github.com/steemit/steem

Check out the stable branch and update the submodules:

cd steem
git checkout stable
git submodule update --init --recursive
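
If the build later complains about missing sources, a forgotten submodule is a likely suspect. The filter below is a small sketch that relies on git marking uninitialized submodules with a leading "-" in `git submodule status`; the function name is made up for this example.

```shell
#!/bin/sh
# Read `git submodule status` output on stdin and print the paths of
# submodules that are not initialized (git prefixes those lines with "-").
uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}

# Example, run inside the clone:
# git submodule status --recursive | uninitialized_submodules
```

No output means every submodule has been initialized.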

Create two directories: one for the build process and one for the resulting binaries:
mkdir ~/build ~/bin

Use cmake to configure steem for the build process:

cd ~/build
cmake -DCMAKE_BUILD_TYPE=Release \
      -DLOW_MEMORY_NODE=ON -DCLEAR_VOTES=ON \
      ../steem
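
To double-check which options a build directory was configured with, you can look into its CMake cache. This is a sketch that assumes the standard CMakeCache.txt entry format (NAME:TYPE=VALUE); cmake_option_on is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds when the CMake cache in BUILD_DIR records OPTION as ON.
# CMake stores cache entries as NAME:TYPE=VALUE in CMakeCache.txt.
cmake_option_on() {
    grep -Eq "^$2:[A-Z]+=ON\$" "$1/CMakeCache.txt"
}

# Example:
# cmake_option_on ~/build LOW_MEMORY_NODE && echo "configured as a low memory node"
```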

Finally, build steemd:
make -j$(nproc) steemd
And copy it to ~/bin for convenience:

cp ~/build/programs/steemd/steemd ~/bin
cd;~/bin/steemd --version

Congratulations! You have your own steemd.

Configuring a simple node.

How simple could that be?

Create a directory for data:
mkdir testdata

Create a simple configuration file:
cat > testdata/config.ini << 'EOF'
p2p-endpoint = 0.0.0.0:2001
seed-node = gtg.steem.house:2001
public-api =
enable-plugin = witness

[log.console_appender.stderr]
stream=std_error

[logger.default]
level=info
appenders=stderr
EOF

I have configured it to listen on all interfaces (0.0.0.0) on port 2001.
You need to provide at least one seed node (address and port) - in this example I’m using my own public seed node: gtg.steem.house:2001
doc/seednodes.txt is recommended as an authoritative source of reliable seed nodes. You can use this one-liner to add them to your configuration file:

while read -r s; do echo "seed-node = ${s%% *}"; done < ~/steem/doc/seednodes.txt >> testdata/config.ini

I have not enabled any public APIs. You need to explicitly put an empty list. Otherwise the default setting will be used, which contains the login, database and account_by_key APIs.
I have enabled the witness plugin. You might find this surprising, but it has nothing to do with being a witness. It will allow your node to have an idea about bandwidth restrictions, which is part of the witness plugin. It is not required, but it is recommended, so it’s a good opportunity to highlight this option.
The remaining lines are related to logging; the levels info and higher will go to stderr. In this way, we won’t be flooded by p2p debug messages or get bored by the lack of messages during resync.
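
A slightly more defensive variant of the seed node one-liner is sketched below. It assumes the seed list has one host:port per line, optionally followed by a comment, and it skips blank and comment-only lines; seed_node_lines is a hypothetical name.

```shell
#!/bin/sh
# Turn a seed list (one "host:port" per line, optional trailing comment)
# into seed-node config lines, skipping blank and comment-only lines.
seed_node_lines() {
    awk '$1 != "" && $1 !~ /^#/ { print "seed-node = " $1 }'
}

# Example:
# seed_node_lines < ~/steem/doc/seednodes.txt
```

It prints the lines instead of appending them, so you can paste them next to the other seed-node entry. I would place them above the logging sections, on the assumption that options following a [section] header are read as belonging to that section.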

If you prepared /run/steem earlier, you can point shared-file-dir to it. You can also explicitly set shared-file-size to a value other than the default.

Please note that this configuration works well for the current stable release, v0.19.2, and its minor updates; it will become obsolete with appbase.

Run steemd, run!

Once the node is configured, it’s time to start synchronization, which will take quite a lot of time, depending on your setup.

~/bin/steemd -d testdata

That’s it

My example node synced 20M blocks in slightly less than 9 hours, reaching the current head block 100 minutes later.
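
Those figures translate into a rough sync rate that you can compare with your own node. The arithmetic below is only a back-of-the-envelope sketch with integer math; both function names are made up, and since the sync took slightly less than 9 hours, the real average was a bit higher.

```shell
#!/bin/sh
# Average blocks per second for BLOCKS processed in SECONDS (integer division).
blocks_per_second() {
    echo $(( $1 / $2 ))
}

# Rough number of hours needed to process BLOCKS at RATE blocks per second,
# rounded up to a whole hour.
sync_eta_hours() {
    echo $(( ($1 + $2 * 3600 - 1) / ($2 * 3600) ))
}

blocks_per_second 20000000 32400   # 20M blocks in 9 hours, prints 617
```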

If you see something like this:

[Screenshot: Getting blocks]

It means that your node is fully synced and you are getting blocks produced by witnesses.

There are ways to speed things up, like putting the shared memory file on ramfs (pros: RAM is fast; cons: RAM is expensive and it can quickly run out) or tmpfs (pros: it will use swap when out of RAM; cons: it will use swap when out of RAM), or using a local copy of block_log to replay what we already have inside it and then sync what we are missing up to the current head block.
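
The block_log approach could look roughly like the sketch below. Treat the blockchain/ subdirectory as my assumption about the data directory layout and check what your steemd actually created; stage_block_log and the backup path are hypothetical.

```shell
#!/bin/sh
# Copy an existing block_log into a data directory so that steemd can replay
# from it. The blockchain/ subdirectory is an assumption about the layout of
# the data directory, not something guaranteed for every release.
stage_block_log() {
    mkdir -p "$2/blockchain" && cp "$1" "$2/blockchain/block_log"
}

# Example (the source path is hypothetical):
# stage_block_log /backup/block_log testdata
# ~/bin/steemd -d testdata --replay
```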

What’s next?

In the next episode, I will show you the performance differences between various setups and how quickly they can replay up to 20M blocks to give you some reference data.


If you believe I can be of value to Steem, please vote for me (gtg) as a witness on Steemit's Witnesses List or set (gtg) as a proxy that will vote for witnesses for you.
Your vote does matter!
You can contact me directly on steem.chat, as Gandalf.



Steem On

When Master Gandalf teaches a class lesson...

Lolx.... i agree

😂😂😂😂😂😂😂👍👍👍👍

@gtg ... sorry if my question is a bit off topic ... But, how does one broadcast a raw+signed tx to the STEEM blockchain? Is there any way one can do this via HTTP RPC using public api servers? Or, will I need to:

  • setup my own Steem Node in the cloud
  • open ports to receive data
  • use Steemd to broadcast?

Well, just a bit. It's about nodes after all, so don't worry :-)

Yes, all you need is an RPC endpoint (at one of the public API nodes).

What's very on topic is that, in fact, it's enough to use a consensus node for such broadcasting (as long as it provides an RPC endpoint). In our universe, those nodes that provide such endpoints are most often "full nodes".

Here's an old, but very good read from @xeroc: Steem transaction signing in a nutshell

Yes, all you need is an RPC endpoint (at one of the public API nodes).

Excellent, thank you! So, I assume I can do this with Steemit's API then? I was just worried that all the call methods were read-only.

Would you happen to have any curl examples for broadcasting a tx? If not, that's ok, I'll just go hunting for it. Or, I'll just look through the source code to figure out what parameters the api wants.

Thank you very much for the assistance fellow wizard! :)

Ah! I think I found what I was looking for.


It seems like the hard part is just going to be constructing the message and getting the signatures as Xeroc described ...

Exactly, that would be the hard part, but we have a few libraries that can help you deal with signing.
The easy part looks like:

{"id":1,"jsonrpc":"2.0","method":"call","params":["network_broadcast_api","broadcast_transaction_synchronous",[{"operations":[["vote",{"voter":"fulltimegeek","author":"gtg","permlink":"steem-pressure-3-steem-node-101","weight":10000}]],"extensions":[]}]]}

But then you need to sign it, and even before that you need to add the reference block number and block prefix along with the expiration time. To all that you add a signature, which makes the call look something like this:

{"id":1,"jsonrpc":"2.0","method":"call","params":["network_broadcast_api","broadcast_transaction_synchronous",[{"ref_block_num":123,"ref_block_prefix":1234567890,"expiration":"2018-03-08T12:34:56","operations":[["vote",{"voter":"fulltimegeek","author":"gtg","permlink":"steem-pressure-3-steem-node-101","weight":10000}]],"extensions":[],"signatures":["signaturesignaturesignaturesignaturesignaturesignaturesignature"]}]]}

That would be hard if you wanted to script the signing in bash, but existing libraries will do that for you without the need to reinvent the wheel.
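
For the sending part, a shell sketch could look like the one below. It only wraps an already prepared transaction in the same JSON-RPC envelope as above and posts it with curl; it does no signing. The function names and signed_tx.json are made up for illustration, and the endpoint is whichever public API node you use.

```shell
#!/bin/sh
# Build the JSON-RPC envelope used in the calls above.
# Arguments: API name, method name, JSON array of method parameters.
rpc_envelope() {
    printf '{"id":1,"jsonrpc":"2.0","method":"call","params":["%s","%s",%s]}\n' "$1" "$2" "$3"
}

# POST a prepared request body (second argument) to an RPC endpoint
# (first argument) and print the response.
rpc_post() {
    curl -s -H 'Content-Type: application/json' --data "$2" "$1"
}

# Example (signed_tx.json is a hypothetical file holding a complete, signed
# transaction object):
# body=$(rpc_envelope network_broadcast_api broadcast_transaction_synchronous "[$(cat signed_tx.json)]")
# rpc_post https://api.steemit.com "$body"
```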

@GTG, you are one sexy bastard! lol ... Those curl snippets are a huge aphrodisiac to me ...

Thank you again for all you do and your help. I have looked into libraries, as you mentioned, but I'm trying to do this natively in Android so I don't need any dependencies. I think I have it all figured out now.

Expect some dank apps from me sometime this year :D

Hold on one moment.

"Dennis! Dennis! Get in here!"

Dennis is the Empire's IT guy.

Gtg, can you repeat everything you just said? Please speak very slowly. Dennis is a bit of a moron.

Short version, especially for you:
"Run steemd, run!"

Unfortunately it can't be slower. Running is something that is expected to be fast.

You can, however, run '--replay' as many times as you want :-)

I really appreciate this detailed explanation. I have been wondering a bit about setting up a node for a while now, and here I have very specific instructions on how to do it--so thanks.

There is a broken URL in the text (https://steem-pressure/@gtg/steem-pressure-2-toys-for-boys-and-girls):

Trivial benchmark presented in previous episode in less than 8 seconds.

missing steemit.com

Thank you, fixed. Actually it was missing / at the beginning (my intent was to make a link relative to the site, so it will get people to the previous episode within the same site, i.e. if you are on Steemit you will see the target episode on Steemit, but if you are on busy.org you will continue reading there instead of going to Steemit).

Great work Gandalf! Although I'm not much a technical guy at my best, I can recognize the potential you are showing to us! Well done my friend, you have my vote for your effort!

Thank you :-)

Sorry if I interrupt your convenience,

You are my inspiration.

I believe you are a good person who will help schoolchildren and starving children @gtg

Last year my friend and I created a community of Charity For Children of the World Generation, which is engaged in social education for children and hunger in order to keep learning for the future,

I have posted on my blog about the Charity For Childrens Community of the World's Generation community. In that post I mentioned your name as my motivator,

If you do not mind, you can appreciate my block for suggestions and feedback so that the Charitable Generation community for Children of the World Generation can continue to help the children.

You are my inspiration.

hopefully you can judge my writing and advise me to keep walking and more children saved from ignorance. I want you to be my teacher.

Thank you...

Inspiration? How on earth were you inspired by me to send such a copy&pasted comment? And how's that related to this post?

Mega great work

Hi, is there a way to get virtual ops with less RAM/Disk usage?

api.steemit.com is very slow. The get_ops_in_block call takes 10 seconds. I guess there is a DDoS prevention mechanism. As a result, my piece of software cannot process blocks every 3 seconds. Therefore I'm looking at running my own Steem node, but the requirements are relatively high.

Not yet, but soon(-ish) I guess:
https://github.com/steemit/steem/issues/2099

Currently you can try to run a somewhat limited node suited to your specific needs (for example without the tags and follows plugins and with filtered account history).

Hello. Thanks for the Steem Pressure series so far.

I'm investigating setting up a local node to generate a local copy of the Steem blockchain and, in turn, create a database. I had initially tried using steem-python to generate a .json file of the blockchain using the public API, but that was evidently not ideal due to the time it would take to complete. Would you be able to give me an idea of the hardware and the node plugins that would be required to achieve this? I have so far not come across a tutorial which provides instructions for this, and would be grateful for any help you could provide.

Yes, having a local steemd node will definitely improve the performance of such data gathering. As for the configuration/plugins needed, it all depends on which API calls you will be using to get what you need.
In the worst case it would be a "full node", i.e. one with all the bells and whistles (there should be a post somewhere in which I provided my config for a full node; it's old, but not much has changed).
However, it's possible for you to optimize the config by choosing only those plugins you need (or maybe even ops filtering for account history).
A full node can currently run with 128GB RAM (or even 64GB) if you have a very fast storage backend (for swap), but the more RAM, the faster your node will be able to answer your requests.
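
To see where a given machine stands against those numbers, something like the following sketch can be used. It assumes the Linux /proc/meminfo format, where MemTotal is reported in kB; mem_total_gib is a made-up helper name. Keep in mind that a machine sold as "64GB" will report slightly less.

```shell
#!/bin/sh
# Print total RAM in whole GiB from a /proc/meminfo-style file.
# MemTotal is reported in kB; the file argument defaults to /proc/meminfo.
mem_total_gib() {
    awk '$1 == "MemTotal:" { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}

# Example:
# echo "This machine has about $(mem_total_gib) GiB of RAM"
```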

Thanks for taking the time to answer :) What I had in mind would require a record of all transfers from and to an account as well as rewards received as a minimum.

I'm not sure if this is something you may have tried, but what would be the likely result of trying to run a full node with less RAM (perhaps 32GB or 16GB)? Would the node fail to run, or would it just be much slower?

It would be so slow that at some point it wouldn't be able to process blocks faster than they are coming in, spending all its time on I/O operations.

hi @gtg, as mentioned in chat i was going to visit each of these posts so i can learn a little more. Thank you for putting them together. I have learned a lot ( but a lot more to go)

You did refer to a follow-up post using docker, but I have not found that? Running docker is easy enough; maybe if you are considering doing a post, it would be more about 'understanding docker' than running docker. It's just too easy to copy a piece of code and have no clue what it means or does.

I'm also wondering, now that some time has passed since this series, how much, if at all, the above specs have changed.

Finally, with HF20 on the way, will you be doing a post around the update?

Kind Regards

Paula
@steemcommunity witness

Oh, I definitely want to continue the Steem Pressure series. I've even collected some materials for updates, but writing posts takes me a lot of time.
I will try to improve the frequency.

Omg this is a whole lot to digest. Yesterday @teamcliff gave a run down of his own node and now you. It's amazing how you guys come up with this and I must say I am really impressed. Keep up the good work you are doing for this community ....you gat my support.

And how exactly do you do that? :-)
By keeping your fingers crossed?
Because clearly not by upvoting my post. And not by upvoting his post.
But you have upvoted your own comments under each of our posts.
You are also not voting for either of us as witnesses.
It's up to you of course but given all that I just have an impression that it's all about your self promotion here.

Your wish is my command...gtg and timcliff for witness it is. On it right away.

DONE

It's timcliff, not teamcliff ;)

Thanks bruv