A game-theoretic model of Steem's curation algorithm

in #computer-science6 years ago
This post reviews the paper, A Puff of Steem: Security Analysis of Decentralized Content Curation, an arXiv paper that models content curation using parameters from the Steem block chain.

Background

image.png

As you may know, I have been doing a series of #rsslog posts that are (mostly) based upon interesting links that I find in my daily reading from my RSS feed. In this morning's post, I included a link to the article: A Puff of Steem: Security Analysis of Decentralized Content Curation (PDF), which I thought should get a bit more attention.

Here is the summary from this morning's post:

  • A Puff of Steem: Security Analysis of Decentralized Content Curation. - Researchers from the University of Edinburgh, IOHK, the Imperial College of London, Clearmatics, and Brave Software model the Steem curation system, show the conditions under which the model converges "correctly", and demonstrate that it is vulnerable to selfish behavior. The model is a simplification of Steem's actual voting rules, and does not seem to incorporate a "downvote" or "flag" mechanism.

  • And here is the paper's Abstract:

    Abstract. Decentralized content curation is the process through which uploaded posts are ranked and filtered based exclusively on users’ feedback. Platforms such as the blockchain-based Steemit6 employ this type of curation while providing monetary incentives to promote the visibility of high quality posts according to the perception of the participants. Despite the wide adoption of the platform very little is known regarding its performance and resilience characteristics. In this work, we provide a formal model for decentralized content curation that identifies salient complexity and game-theoretic measures of performance and resilience to selfish participants. Armed with our model, we provide a first analysis of Steemit identifying the conditions under which the system can be expected to correctly converge to curation while we demonstrate its susceptibility to selfish participant behaviour. We validate our theoretical results with system simulations in various scenarios.

    This paper was posted on arXiv in October of 2018, so it's possible that it's been covered here before, but I missed it if it was. I came across it yesterday, and read through it quickly. I didn't take the time to fully understand the dense sections of analysis, but I think I read it closely enough to get a general sense of the work. Hopefully, the following summation will touch on the highlights.

    Introduction

    In the paper's introduction, the authors describe the general importance of content curation. They note that because many Internet web sites are too large for any single user to access every post, content curation has emerged as a decentralized mechanism for categorizing, ranking, and filtering the content, and that some method of curation must be employed by any site that hosts a large volume of user generated content (UGC). They go on to observe that decentralized content curation on a permissionless blockchain is an especially hard problem that remains unsolved. Noting that the Steemit platform has a relatively long history and wide usage, the authors undertake to model that environment.

    Original Contribution

    The authors studied the computational foundations of decentralized content curation and created a model of a post-voting system that aims to sort a collection of posts according to decentralized user input.

    The model makes use of concepts called ideal score, and convergence or t-convergence to describe the state of the ranking when the first t posts are correctly ranked according to their objective scores if the participants had all voted honestly. The model accomplishes the following:

    • Describes the conditions under which a list of posts converges, and finds that under rules modeled after Steemit, lists larger than 70 elements may not even achieve 1-convergence (i.e. the #1 ranked post is correctly ranked).
    • Uses simulations to statistically validate the results.
    • Demonstrates that "selfish" deviation from honest behavior might lead to substantial advantages in list ranking.

    Conclusion

    After these opening sections, the paper contains additional sections with a literature review of the curation literature, a detailed description of the model that the authors used, and a description of the simulation design and methodology.

    I'll skip over the details, which you can read in the PDF link, except to note that it simplifies the voting scheme to a single post per author at the beginning of each round of voting, and does not seem to incorporate bidbots, downvotes, or beneficiary awards in the analysis (except to the extent that bidbots might be considered to be selfish deviations from honest voting).

    It is interesting to see a formal study of Steem's curation mechanism, and the paper raises the unsurprising possibility that the Steem curation algorithms are vulnerable to selfish voting, and might not do a great job of ranking the posts in order by quality.

    However, the model is more simplistic than Steem's actual curation mechanism, so although the findings are interesting to see, it's not clear how convincingly they can be transferred from the game-theoretic model to the actual Steem blockchain. I look forward to seeing future work on the topic.

    Thank you for your time and attention.
    As a general rule, I up-vote comments that demonstrate "proof of reading".




    Steve Palmer is an IT professional with three decades of professional experience in data communications and information systems. He holds a bachelor's degree in mathematics, a master's degree in computer science, and a master's degree in information systems and technology management. He has been awarded 3 US patents.

    Steve is also a co-founder of the Steem's Best Classical Music Facebook page, and the @classical-music steemit curation account.