#4 - Indexer: posts.py

in #steem, 5 years ago


What I am learning about Hivemind's design

The module posts.py handles "critical/core post operations and data" in Hivemind. When blocks are scanned, the operations that are related to posts/comments are parsed by this module and the appropriate actions are taken within the database to maintain state.

This excludes data such as body content, title, raw JSON data and votes. These are handled by cached_posts.py, which I will cover in the next post.


Links to Python scripts referred to in this post

These scripts are from the master branch on Hivemind's GitHub repository.


In blocks.py (the module responsible for processing raw blocks), methods in posts.py are triggered by a couple of specific conditions. Here's a snapshot of the code block in blocks.py that handles this:


# post ops
elif op_type == 'comment_operation':
    Posts.comment_op(op, date)
elif op_type == 'delete_comment_operation':
    Posts.delete_op(op)
elif op_type == 'vote_operation':
    if not is_initial_sync:
        CachedPost.vote(op['author'], op['permlink'],
                        None, op['voter'])

The first condition identifies a comment operation and calls the comment_op() method in our posts.py module, which then decides whether the operation is a new post, an edit, or an undelete.
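To make the routing easier to follow, here is a minimal sketch of the same three branches as a standalone function. The function name route_post_op and the returned strings are my own illustration, not part of Hivemind. It only reports which handler the snapshot above would call.

```python
def route_post_op(op_type, is_initial_sync=False):
    """Return the name of the handler the branch above would call, or None.

    Hypothetical helper for illustration only; not Hivemind code.
    """
    if op_type == 'comment_operation':
        return 'Posts.comment_op'
    if op_type == 'delete_comment_operation':
        return 'Posts.delete_op'
    if op_type == 'vote_operation':
        # In the snapshot, votes are only forwarded when not in initial sync.
        return None if is_initial_sync else 'CachedPost.vote'
    return None  # not a post-related operation


print(route_post_op('comment_operation'))
print(route_post_op('vote_operation', is_initial_sync=True))
```

Note that a vote seen during initial sync falls through without calling anything, which matches the is_initial_sync check in the snapshot.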

Posts.comment_op()

Here's a snapshot of what the code looks like:


@classmethod
def comment_op(cls, op, block_date):
    """Register new/edited/undeleted posts; insert into feed cache."""
    pid = cls.get_id(op['author'], op['permlink'])
    if not pid:
        # post does not exist, go ahead and process it.
        cls.insert(op, block_date)
    elif not cls.is_pid_deleted(pid):
        # post exists and is not deleted, thus an edit. update it.
        cls.update(op, block_date, pid)
    else:
        # post exists but was deleted. time to reinstate.
        cls.undelete(op, block_date, pid)

The comment_op() method handles three scenarios: new posts, edits, and undeletes. Each is described below.
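The three-way decision can be tried out with a toy in-memory model. This is only a sketch: the posts dict and the classify_comment_op function are names I made up to stand in for the database lookup, and the real module works against SQL tables.

```python
# Toy stand-in for the posts table: (author, permlink) -> row.
# All names here are illustrative, not Hivemind's API.
posts = {}


def classify_comment_op(op):
    """Return 'insert', 'update' or 'undelete' for a comment operation."""
    row = posts.get((op['author'], op['permlink']))
    if row is None:
        return 'insert'      # never seen before: a new post
    if not row['is_deleted']:
        return 'update'      # exists and is live: an edit
    return 'undelete'        # exists but was deleted: reinstate it


op = {'author': 'alice', 'permlink': 'hello-world'}
first = classify_comment_op(op)

posts[('alice', 'hello-world')] = {'is_deleted': False}
second = classify_comment_op(op)

posts[('alice', 'hello-world')]['is_deleted'] = True
third = classify_comment_op(op)

print(first, second, third)  # insert update undelete
```

The same operation type produces three different outcomes, depending only on what the indexer already knows about that author/permlink pair.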

New posts

If the above method ascertains that it's a new post operation, the insert() method is called. It makes an entry into the database for the post.

It also checks whether the post has a parent_author and, if so, updates the parent's child count in cached_posts. If the post is not a comment, it is also inserted into hive_feed_cache.

Now, hive_feed_cache is another cache that maintains the state of feeds (blogs and reblogs), offering efficient queries, in the same way that cached_posts makes efficient post querying possible. I will write about that in a later post.
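Here is a rough sketch of what the insert path described above does, using plain Python containers instead of database tables. The posts dict, the feed_cache list and the insert_post function are hypothetical names for illustration. The op fields (author, permlink, parent_author, parent_permlink) are the ones carried by a comment operation.

```python
posts = {}        # (author, permlink) -> core post row (illustrative)
feed_cache = []   # top-level posts only (illustrative)


def insert_post(op):
    """Register a new post, bump the parent's child count, feed top-level posts."""
    key = (op['author'], op['permlink'])
    is_comment = bool(op.get('parent_author'))  # empty parent_author = top-level post
    posts[key] = {'is_deleted': False, 'children': 0, 'is_comment': is_comment}

    if is_comment:
        parent = posts.get((op['parent_author'], op['parent_permlink']))
        if parent is not None:
            parent['children'] += 1
    else:
        feed_cache.append(key)


insert_post({'author': 'alice', 'permlink': 'my-post',
             'parent_author': '', 'parent_permlink': 'steem'})
insert_post({'author': 'bob', 'permlink': 're-my-post',
             'parent_author': 'alice', 'parent_permlink': 'my-post'})

print(posts[('alice', 'my-post')]['children'])  # 1
print(feed_cache)  # [('alice', 'my-post')]
```

Bob's reply increases the child count of Alice's post but never reaches the feed cache, since only top-level posts appear in feeds.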

Update posts

When an existing post is edited, the update() method is called. It passes the data on to cached_posts.py, which replaces the post's old data with the new so that the cache reflects the current state.

Undelete posts

When an undelete is detected (a comment operation for a post that was previously deleted), the undelete() method is called and does the following:

  • it sets the is_deleted flag to 0
  • it rebuilds the post
  • it undeletes the post in cached_posts
  • it inserts the post into the feed_cache
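The four steps above can be sketched with in-memory stand-ins. Everything named here (posts, cached_posts, feed_cache, undelete_post) is hypothetical. I am also assuming that, as with new posts, only top-level posts go back into the feed cache.

```python
# Illustrative state: one previously deleted top-level post.
posts = {('alice', 'my-post'): {'is_deleted': True, 'is_comment': False}}
cached_posts = set()
feed_cache = set()


def undelete_post(op):
    """Reinstate a deleted post, following the four steps in order."""
    key = (op['author'], op['permlink'])
    row = posts[key]
    row['is_deleted'] = False                          # 1. clear the deleted flag
    row['is_comment'] = bool(op.get('parent_author'))  # 2. rebuild the row from the op
    cached_posts.add(key)                              # 3. restore it in cached posts
    if not row['is_comment']:                          # 4. assumption: feed holds
        feed_cache.add(key)                            #    top-level posts only


undelete_post({'author': 'alice', 'permlink': 'my-post',
               'parent_author': '', 'parent_permlink': 'steem'})
print(posts[('alice', 'my-post')]['is_deleted'])  # False
```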

Posts.delete_op()

Going back to the code block from blocks.py above, the second condition triggers Posts.delete_op(), which marks a comment (whether a top-level post or a reply) as deleted. The post is also removed from feed_cache and cached_posts.
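As a small sketch of the delete path, again with made-up in-memory containers rather than Hivemind's tables, note that the core row is only flagged while the two caches actually drop the entry:

```python
# Illustrative state: one live post present in both caches.
posts = {('alice', 'my-post'): {'is_deleted': False}}
cached_posts = {('alice', 'my-post')}
feed_cache = {('alice', 'my-post')}


def delete_post(op):
    """Mark the post as deleted and drop it from both caches (hypothetical helper)."""
    key = (op['author'], op['permlink'])
    posts[key]['is_deleted'] = True   # the row stays, only the flag changes
    cached_posts.discard(key)
    feed_cache.discard(key)


delete_post({'author': 'alice', 'permlink': 'my-post'})
print(posts)  # {('alice', 'my-post'): {'is_deleted': True}}
```

Keeping the flagged row around is what makes the later undelete scenario possible: comment_op() can still find the post ID and see that it was deleted.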

CachedPost.vote()

The last condition is not really part of this module, but I will address it briefly here. It records a vote on a post or comment by calling the vote() method in cached_posts.py. As the is_initial_sync check in the snapshot shows, this call is skipped while the indexer is still in its initial sync.

What have I learned?

The posts.py module will be a good place to plug in code that handles ad_posts for the Native Ads system. One option is a new DB table that holds core ad data (post IDs, moderation status, scheduling, etc.), combined with cached_posts to retrieve the full details of an ad's data/content (from JSON content, for example).
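To give a feel for what such a table might look like, here is a very rough sketch. The table name hive_ads and all of its columns are placeholders I made up for illustration. I use an in-memory SQLite database here only to keep the example self-contained, and the actual schema would need to fit Hivemind's own database.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE hive_ads (                       -- hypothetical table name
        post_id    INTEGER PRIMARY KEY,           -- ID of the ad's post
        status     TEXT NOT NULL DEFAULT 'pending',  -- moderation status
        start_time TEXT,                          -- scheduling window (optional)
        end_time   TEXT
    )
""")

# Register an ad post; it starts out awaiting moderation.
conn.execute("INSERT INTO hive_ads (post_id) VALUES (?)", (42,))
row = conn.execute("SELECT post_id, status FROM hive_ads").fetchone()
print(row)  # (42, 'pending')
```

Only the core ad state lives in this table. The ad's title, body and JSON would still be looked up through cached_posts using the post ID.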

The feed_cache will be irrelevant for Native Ads, because the posts will not be displayed on UIs.

Posts in this series

#1 - Overview and opportunities
#2 - Indexer: blocks.py
#3 - Indexer: accounts.py
#4 - Indexer: posts.py



I am currently working on a new feature called Native Ads, that may be added to Hivemind Communities in a future update.

For an overview of the Native Ads feature and how it will work, read this doc.

If you would like to take a look at the code, check out my fork of Hivemind on GitHub.
