Steemit/Steem: Reputation Voting Analysis

in #utopian-io, 7 years ago (edited)

This is an analysis of the voting patterns of the 'Reputation' number given to each account on the Steem Blockchain.

The aim of this analysis is to find out if there are any patterns in the voting 'style' of each Reputation number, with a focus on the following:

  • Do reputations like to vote for the same reputations?

  • If so, to what extent, and is this the same across all reputations assessed?

  • With 'self-voting' excluded, do the same patterns (if any) take shape?


title.png


Background

The data collected for this analysis covers one week from the time of script execution - 5,087,078 votes in 7 days, prior to further data removal as stated in the following points.

The reputation range has been limited to between 40 and 78 - 78 being the highest reputation of an account at this time.

The reputation numbers used are taken from the 'accounts' table.

No account names are listed in this analysis.

Every Bid-bot listed at https://steembottracker.com/ was removed from the original data set.


Sourcing the data

The following query was used to obtain the base data to be analysed:

image.png

This data was stored locally to perform additional queries as below:

image.png


Presentation

The first table and chart show the total vote weight as a %, given to equal reputations.

e.g. Of all the accounts at 64 reputation, 15% of the total vote weight was given to accounts with a reputation of 64.
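As a rough illustration of how this percentage can be computed, here is a minimal sketch in Python. The sample rows and the function name `same_rep_share` are hypothetical and are not taken from the queries used for this post; the sketch only assumes each vote carries a voter reputation, an author reputation and a vote weight.

```python
from collections import defaultdict

# Hypothetical sample rows: (voter_rep, author_rep, vote_weight).
votes = [
    (64, 64, 1500),
    (64, 60, 5000),
    (64, 70, 3500),
    (50, 50, 2000),
    (50, 52, 2000),
]


def same_rep_share(rows):
    """Return {voter_rep: % of that reputation's total vote weight
    that went to authors with the same reputation}."""
    total = defaultdict(int)
    same = defaultdict(int)
    for voter_rep, author_rep, weight in rows:
        total[voter_rep] += weight
        if voter_rep == author_rep:
            same[voter_rep] += weight
    # Skip reputations with no vote weight to avoid dividing by zero.
    return {rep: 100.0 * same[rep] / total[rep] for rep in total if total[rep]}


shares = same_rep_share(votes)
print(shares)  # {64: 15.0, 50: 50.0}
```

In this made-up sample, reputation 64 sends 1,500 of its 10,000 total weight to reputation 64, which gives the 15% used in the example above.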

1.png

With an average around 16% across all the reputations, 75, 76, 77 stand out from this figure.

Reputations 75 and 77 have the lowest weight % going to the same reputation, while Reputation 76 has the highest of all the reputations analysed at almost 48%.


Next, the same table format, but this time including the reputation that received the most vote weight by each reputation.

Table 1 - Votes to Self included

image.png

As we can see, the table is identical bar one row.

All reputations except 77 give their highest total weight percentage to their own reputation.
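The 'reputation that received the most vote weight' column can be derived along these lines. This is a sketch under assumptions: the rows are invented, and `top_recipient` is a hypothetical helper, not the query behind Table 1.

```python
from collections import defaultdict

# Hypothetical sample rows: (voter_rep, author_rep, vote_weight).
votes = [
    (76, 76, 4800),
    (76, 74, 3000),
    (76, 60, 2200),
    (77, 65, 600),
    (77, 77, 400),
]


def top_recipient(rows):
    """Return {voter_rep: (author_rep receiving the most weight, % of total weight)}."""
    weight_by_pair = defaultdict(lambda: defaultdict(int))
    for voter_rep, author_rep, weight in rows:
        weight_by_pair[voter_rep][author_rep] += weight

    result = {}
    for voter_rep, by_author in weight_by_pair.items():
        total = sum(by_author.values())
        if total == 0:
            continue  # nothing to rank for this reputation
        best = max(by_author, key=by_author.get)
        result[voter_rep] = (best, 100.0 * by_author[best] / total)
    return result


top = top_recipient(votes)
print(top)  # {76: (76, 48.0), 77: (65, 60.0)}
```

A reputation 'votes for itself the most' when the first element of its tuple equals the voter reputation, as for 76 in this sample but not for 77.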


This time, let's take a look at the same dataset, and exclude the votes that are to the same account: Voter <> Author.
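The Voter <> Author filter can be sketched as below. The account names, rows and the `exclude_self` parameter are hypothetical and only illustrate the idea; they are not the query used for Table 2.

```python
from collections import defaultdict

# Hypothetical sample rows: (voter, author, voter_rep, author_rep, vote_weight).
votes = [
    ("alice", "alice", 64, 64, 2000),  # self-vote
    ("alice", "bob", 64, 64, 1000),    # same reputation, different account
    ("alice", "carol", 64, 60, 7000),
]


def same_rep_share(rows, exclude_self=False):
    """% of each voter reputation's weight going to the same reputation,
    optionally dropping rows where voter and author are the same account."""
    total = defaultdict(int)
    same = defaultdict(int)
    for voter, author, voter_rep, author_rep, weight in rows:
        if exclude_self and voter == author:
            continue  # the Voter <> Author condition
        total[voter_rep] += weight
        if voter_rep == author_rep:
            same[voter_rep] += weight
    return {rep: 100.0 * same[rep] / total[rep] for rep in total if total[rep]}


print(same_rep_share(votes))                     # {64: 30.0}
print(same_rep_share(votes, exclude_self=True))  # {64: 12.5}
```

Note that removing self-votes shrinks both the numerator and the denominator, so the remaining percentage is measured against the weight given to other accounts only.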

Table 2 - Votes to Self excluded

image.png

image.png

Now we have a much lower 'average % of vote to the same reputation'.

With self-votes included this was 16.2%; with votes to self excluded, it is 6.1%.

The highest vote weight % given to any single reputation comes from Rep 76, which distributes 31% of its total vote weight to reputation 74.

And, without votes to self, only 4 reputations (43, 51, 54, 60) give the highest vote weight % to the same reputation.


Analysis Summary

Looking at Table 1, and excluding further analysis regarding sock puppet accounts (alternate accounts with same owner), it looks fairly clear that many users vote for themselves. Only reputation 77 distributes a higher vote weight % to a reputation not equal to its own.

Only 3 accounts have reputation 77 at the time of this analysis, and a certain 'Daddy Chilli' has 0.0% votes to self in the past 7 days. A round of applause from the analyst today.

The highest %'s by a distance in table 1 fall under reputations 76 and 78. There are 6 accounts with a reputation of 76, and 1 account with a reputation of 78. It is safe to say there is an above average amount of self-voting within these reputation levels.

Table 2 is worth further discussion. It shows where the reputations (or accounts) choose to send their votes when not voting for their own content.

Reputations 40-44 all vote to reputations in the same range. This could be due to small accounts helping each other, but another reason could be that these reputations are sock puppet accounts voting (with low SP) for each other. As there is no option to filter content by reputation, and the higher reputations usually sit at the top of any given tag/link, the second reason seems the more likely option.

Reputations 45 and 46 both vote for reputation 58 the most. This stands out in the dataset but the analyst has no obvious conclusions as to why. Anyone?

Reputations 50-59, as with 40-44, all vote with the most weight to reputations in the same range. It is less likely that these accounts are sock puppets, and possible that in this Reputation range, communities and friendships are starting to form.

Reputations 60-70 as above, all vote within the same range apart from Rep 68 that votes with the most weight to Reputation 74. This is another anomaly in the dataset that is difficult to explain without further analysis into account details.

Reputations 71, 72, 73, 75, 77, and 78 all vote with most weight to Reputations in the 60's. This makes sense as there are many more 'established' user accounts in this range. Apart from Rep 78, the top weight % is fairly low at around 6-7%, showing a reasonable spread is likely.

78, the highest Reputation account and the only one at this level, votes with a particularly high vote weight % to reputation 63. This stands out: with an own-reputation vote weight of 27% (taken from table 1), and knowing this is the only account at 78, it looks like this account does not spread its vote weight around too much.

Reputations 74 and 76 both vote for reputations in the 70's. The stand-out figure is the vote weight % of reputation 76, which sends 31% of its total vote weight to reputation 74.

As there are fewer accounts up at these levels, it is 'strange' to see reputations in the 70's voting with the largest % of vote weight to 70+ reputations, particularly when self-votes are removed.


Summary

Across the total Reputation dataset, voting of same reputations (and to self) is common.

When self-voting data is removed, the number of reputations giving their highest vote weight % to their own reputation falls from 37 (out of 38) to 4, and the average same-reputation % falls from 16.2% to 6.1%.

Suspicious voting patterns relating to account Reputation appear at the very bottom of the sample, and at the top.


Tools used to gather this data and compile report

The data is sourced from SteemSQL - A publicly available SQL database with all the blockchain data held within.

The SQL queries to extract the data have been produced in both SQL Server Personal Edition and LINQPad 5.

The charts used to present the data were produced using MS Excel.

This data was compiled on the 4th March 2018 at 8pm (UTC).

I am part of a Steemit Business Intelligence community. We all post under the tag #blockchainbi. If you have analysis you would like to be carried out on utopian-io/Steem data, please do contact me or any of the #blockchainbi team and we will do our best to help you.


Thanks

Asher @abh12345



Posted on Utopian.io - Rewarding Open Source Contributors


I was looking forward to this analysis. As the voter rep increases so does the author rep. I was expecting to see this and my take on it is as follows

When I joined steemit I had a rep of 25, I found other new authors also with a rep of 25. As my rep grew, so did the rep of these accounts. So many of the accounts I was following and voting for when I joined I am still following and voting for. Therefore a lot of my votes are going for accounts with a rep the same as me. We have had this discussion on discord Asher so you know my thoughts on this.

While it isn't surprising and of course birds of a feather flock together and all that, it is still disappointing to me as someone who has tried very hard to spread my votes out to good content independent of REP, with a bias toward lower REP to help newbies out. Flying the no self-voting flag here is a bit of a lonely endeavor. Likewise publishing content as a newb is a bit like beating your head into a brick wall, except the brick wall keeps spouting that this is the place your voice can be "heard". @lextenebris's point below RE lower REP accounts relying on higher REP accounts to "drag them" up is spot on, and the inclusion of money in this equation where higher REP = higher income means that where attention is not focused on cohorts in the same REP range, it is focused on higher REP accounts in an attempt to gain attention and votes. I can't help but wish someone with even a tiny bit of knowledge of game theory had been involved in the initial decision making processes when setting up the reward distribution system here.

I like your analysis! Following you!

I haven't found the exact spot where you donated to Asher's league, but I wanted to say thank you for what you are doing. Your generosity towards the project is truly great and many of us will do our best to make you feel like it was worthwhile! He has really helped us grow and connect with the metrics that he delivers, and I know as a very small fish it is encouraging that people like you are recognizing what Asher is attempting to build here!

Thank you @paulag! :)

Yeah you were right, in the main that is the case for sure!

That took forever, I need a beverage!

Makes total sense!

This makes sense to me. The accounts with the highest rep scores have the largest accounts. They can make reasonable money curating and not have to author to get anything in return. If I could make a few grand curating per week, I know I'd be less interested in writing content. But that's just me. :-) Thanks for the post! Very interesting work!

Thanks Brett!

That one was a slog!

I think the main thing is that it shows that self-voting is spread across the board, it's not just the Whales, and so there should be no complaints.

Thanks for the hard slog on this @abh12345. Not sure I agree with your comment on this though, as surely newbies are following the examples set by the whales! I could be wrong though and maybe they are just doing what comes natural 🙊. Thank you again for helping us to understand the inner workings, and I guess the conclusions we draw from the stats can be subjective.

I trust you enjoyed your beverage 🍻.

Hi, and yes thanks!

Indeed the written analysis and conclusions are down to personal opinion. And it's easy to tar Reps in this case, with the same brush.

I suppose it depends which whales you follow, they aren't all rotten, I think!

Cheers!

Ah yes for sure! I should have said "some whales". I used too big a brush! Thanks for your reply.

Yeah same here!! =)

Suspicious voting patterns relating to account Reputation appear at the very bottom of the sample, and at the top.

You left out one of the most important things to note when talking about Reputation on the steem blockchain.

It's a logarithmic curve. It's a logarithmic curve that can only increase for an individual account by being voted up by a higher Reputation account.

This process is important when talking about anything dealing with Reputation, because it puts an interesting spin on trying to figure out how accounts reached that Reputation in the first place.

Sure, Reputation typically goes along with simply being on the platform longer – but it also simultaneously is only found in an account which regularly gets visibility and exposure to accounts with higher Reputation. Because the curve is logarithmic, accounts with higher Reputation become increasingly rare, at a ridiculous rate, as you go up.
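To make the logarithmic scale concrete, here is a small Python sketch. It assumes the conversion commonly used by Steem front ends to turn the raw on-chain reputation value into the displayed score (take the base-10 logarithm, subtract 9, multiply by 9, add 25); that formula is an assumption here and is not stated anywhere in this post. The helper names are hypothetical.

```python
import math


def displayed_reputation(raw: int) -> int:
    """Convert a raw reputation value to the displayed score (assumed formula)."""
    if raw == 0:
        return 25  # accounts start at a displayed reputation of 25
    sign = -1 if raw < 0 else 1
    magnitude = max(math.log10(abs(raw)) - 9, 0)
    return int(sign * magnitude * 9 + 25)  # truncate toward zero


def raw_needed(score: float) -> float:
    """Raw reputation needed for a given displayed score, inverting the formula above."""
    return 10 ** ((score - 25) / 9 + 9)


print(displayed_reputation(2 * 10**12))  # 54
print(displayed_reputation(5 * 10**14))  # 76
print(raw_needed(70) / raw_needed(60))   # roughly 12.9
```

Under this assumption, every 9 displayed points correspond to a tenfold increase in raw reputation, so moving from 60 to 70 takes roughly 13 times the raw reputation of reaching 60 in the first place.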

Other analysis has suggested that Reputation 60 is the "magic number" at which the amount earned on the platform rapidly diverges upwards from income consistently garnered before that point. I don't think that is a coincidence; I believe it's correlated with the cohort age and survivability of the wave of users that came in about the same time.

A month over month retention analysis would be difficult purely because of the sheer amount of data to be filtered through, but we know that new user retention hovers between 6% and 8%. The retention rate for accounts which survive up to Reputation 60, much less 70+, has got to be horrendous.

So let's consider what a sock puppet has to do to climb out of the Reputation 25 starter space and move up into the suspected 40 – 44 range from your observations. They have to receive votes from higher Reputation accounts. If we see a Reputation cohort all plateauing around the same level, it's reasonable to assume that if most of the votes involved are going between each other, their visibility is largely only to one another, at least in enough volume to move the needle at that level. It seems to imply that if that plateau is stable, then the sock puppets all have roughly the same Reputation and probably mutually sustain at that level because there aren't enough at a higher Reputation to pull them up further along.

In the cases of specific Reputation ranges that tend to vote for another Reputation level, particularly a higher one, it might be worthwhile to simply extract the whole slice of accounts which have that level of Reputation and examine them individually. At anything above 60, this ought to be relatively straightforward and manageable.

Don't forget, some of this behavior can be explained by there being a significant presence of non-public vote automation going on behind the scenes. It would certainly make sense for someone with a modest amount of server availability and a little bit of liquid STEEM to make an extended synthetic community of accounts which do very little but vote for one another across the network in order to insert claims into the reward pool on a regular basis. Done consistently, this could be fairly self-sustaining.

I've also come to believe by looking at relationship maps that a number of the accounts with Reputation 70+ are not independent entities, but instead organized and operated by a singular intelligence who has spread their SP across multiple accounts in order to obscure his or her activity. For that level of Reputation and the amount of SP that we're talking about, that person would have needed to be involved with the blockchain from a fairly early time or have an immense amount of cooperation from one or multiple people who were.

The logarithmic nature of Reputation and the requirement of having someone with a higher Rep vote you up into that space makes certain demands of subsequent analysis.

Sure, Reputation typically goes along with simply being on the platform longer – but it also simultaneously is only found in an account which regularly gets visibility and exposure to accounts with higher Reputation.

Thank you for that so clear explanation @lextenebris. All the content in your comment is pure solid gold. In fact, that info which I have selected in the quote only confirms in all its splendor our suspicions of why we have both been anchored in a reputation of 57 forever. ;)

Superb comment this one of yours mate. Complementing in great measure and depth what in itself was an interesting analysis by @abh12345 who reasonably refrained from revealing too many details in his own analysis for fear that these extra hints could go detrimental to his account. Tsk Tsk

:O

What do you mean Tsk Tsk :P

Come on level 58!

Hahahaha Tsk Tsk like in Glossina Morsitans with too much code & technical computations my friend. :p

Come on level 58!

Ok @abh12345, here you have it. More computation analysis challenges for you.

Both, @lextenebris and me have practically the very same Rep Score: 57.8 ¿Could you dare to place an 'entomologist bet' about which one of us would reach 58 first? }:)

tsssk :)

If I was a betting man, I would say that @lextenebris might just pip you to the post, sorry!

Oh, I think you're betting the wrong way on that one.

I never bet in favor of myself when it comes to any kind of competition. RNGesus hates me with a burning passion, and anything which might involve any kind of random factor at all is going to break against me.

It's not that the universe is sentient, it's that it's malevolent.

Don't underestimate yourself like a gambler. For me, you look like a formidably accurate betting man. Because I would say the same. };)

I'm actually pretty comfortable about being in Rep 57. Once you break that Rep 60 ceiling, people seem to expect more of you. They expect that you will care about curation, that you're going to invest heavily into bots and other automated systems for passive income, that you're going to suddenly transform into someone who cares about the platform as a platform, sell out your soul to the crypto cultists, and generally start acting like – I don't know – Jerry Bankfield.

Man, I don't want to do any of that.

My plan is to largely ignore my Reputation and focus on just doing what pleases and amuses me. If it just so happens that people with higher Reputations like what I'm doing, my Rep will go up. If it just so happens that a lot of people with lower Reputations like what I'm doing, my rep will stay where it's at.

Given the choice, I'm far more interested in playing to a wider audience. If nothing else, it ends up with more interesting comments.

Oh yeah! I can understand your position and viewpoint very well. I suspect you prolly would like to check what I've already replied to Asher a couple of comments above.

Doing what you please within the platform also amuses me big time. I find myself reluctant too to give up that freedom of movement so easily.

Given the choice, I'm far more interested in playing to a wider audience. If nothing else, it ends up with more interesting comments.

Though, after almost two years around here experimenting and trying to also play to a wider audience, it clearly seems that I have fallen short in attracting those interesting comments you are talking about too. }:)

haha

I'm enjoying this thread :)

As far as playing to a wider audience, analysis can't be too technical - If I see too much code or computations I'm out!

Hello!

I'm in agreement with much of what you say, a great addition to the post, thank you.

Much of what you say I am aware of, but I think for the general populace here, logarithmic scales are not really what they came to see.

Mind you, perhaps an analysis that takes you to the box, but won't allow you to open it, isn't either :)

Your second paragraph from the finish is spot on, and actually has been posted about in the past by larger accounts.

The logarithmic nature of Reputation and the requirement of having someone with a higher Rep vote you up into that space makes certain demands of subsequent analysis.

All yours :)

I've said all I wish to say on the topic without including account names. The main reason being that this is my only source of income and I don't wish to lose it.

However, I would be keen on a Vested Steem Power analysis of voting - does money vote for money?

Thank you for the most excellent comment.

I try not to underestimate the intelligence of my audience. If the fact that the path upwards becomes exponentially steep the higher they go is important, I let them know. Shifting from displaying values linearly to logarithmically was a pretty revelatory move when I was doing an analysis and breakdown of the curve of diminishing SP at the top of the ownership lists on the blockchain.

It's one thing to see that there is a self similar curve. It's quite another to see that, aside from a few bumps in the road, that curve remains linear when viewed logarithmically all the way down. At that point, it makes it much easier to see the aberrations – and the aberrations are where the interest is.

However, I would be keen on a Vested Steem Power analysis of voting - does money vote for money?

I've obviously given up all concern about naming names and drawing attention, mainly because I figure there is nothing that would be more supportive of my analysis than some big names getting bent out of shape enough to start voting me down in earnest.

I like the income, but I can't be saying anything too particularly radical because neither Bernie nor Haejin have decided that I'm worth putting the boots to. That's probably the right analysis on their part, given that I'm just pointing out abstruse corners of things going on in the blockchain and their game is in an entirely different space.

But that's how I roll.

"Does money vote for money?" That should be a fairly easy analysis to put together, and one that I find myself being vaguely curious about as well.

I'll just add that to my "potential future stories" pile.

Well thought out response! Following you!

That's a really interesting analysis.. I will share my story too if you don't mind :D

Personally, I try to spread my votes as much as I can. But I am a greedy human, so 1 full upvote goes to myself every day. And one full upvote to another person I am really close. My only problem is that there are certain whales or dolphins (now at 69 lvl or more) that greatly supported me (and still do) when I first registered here, so when they post I always give them a 100 % vote, even if they don't need it. It would just seem ungrateful not to do so. But, I understand this is damaging as it means less votes for the newcomers here :/ And I have no idea what to do, as I only get to have 10 full upvotes a day :/

Hey @trumpman, I am certainly no expert. But you have an interesting question at the end, so I thought I would weigh in.

As it regards to you voting out of "respect" to the people that supported you early, I believe it depends on the type of person that you feel obligated to support. If that person would want you to vote for them even though you have a limited amount of votes to use, then I can understand the loyalty.

However, if that person would want you to support the platform instead by spreading your vote to encourage good community development, then I think you should consider better alternatives. The only way to know is to have the conversation with them (if you don't already know their answers).

I am very new and nowhere near where any of your all are at, but I feel the same debt of gratitude to the writer of this post @abh12345. He has helped me grow and kind of taken me under his wing. Of course I feel a debt of gratitude to him that will never be able to be fully repaid.

But without even asking, I know what Asher will want me to do when I get some real SP one day. He will want me to grow the community and would encourage me to spread it out. Of course not everyone is as humble and gracious as Asher, but it's something that everyone should always think through and find out.

Is it in our own interest for the platform to grow? Or is it time to milk as much out of this cow before the milk runs dry?

I know what my "big supporter" would want, but obviously I have an easy choice. I would advise everyone to have that conversation though, because that's the only way we can discover solutions to any dilemmas we may face.

It would just seem ungrateful not to do so

Touch them with 10 or 20 % a little as a reminder of gratitude but for the most part, if they are supporters of content and don't need your support, they would be happy to have you pay it forward to others.

Correct, if they don't understand supporting new lower members then there is something wrong, but you never forget those who helped you either. People change but the past is still real.

I'm not sure that means fewer votes for newcomers; I assume you still on occasion cast (what newcomers would refer to as) a crumb in their direction. I cannot even fathom a full power vote from a whale. I had a couple of comments of mine voted by an extremely large dolphin, and was floored by the value of it. So fewer votes I doubt, controlled votes I can see. You have no idea if they are a real person, as in acting for the good of others, or just a person looking for a quick money fix. I do a lot of voting for new users, and I have recently had to pull two votes, and mute two people, for plagiarism, which I consider theft. So a whale being cautious on votes does not concern me. We vote for who we think are real, and a lot of that comes down to voting for and with people that joined during the same time frame, because like @paulag says "As my rep grew, so did the rep of these accounts". I do not see your voting style as "selfish".

Thank you for the contribution. It has been approved.

Wow, that looks like a lot of work! I also share the common opinion here with users voting for similar reps. I think I would expect maybe even a stronger orientation to slightly higher reps to "get noticed" and learn the tricks. I must admit I found myself looking for significantly lower rep users and trying to notice & vote more than the same few authors only after I got some traction/rep with my own account.

You can contact us on Discord.
[utopian-moderator]

It took far too long!

For not really much more than we already know I guess.

I don't wish to name accounts, as that could prove detrimental to my account, so I guess the juicy bits are not there to see 😊

Thank you!

Reputations 45 and 46 both vote for reputation 58 the most. This stands out in the dataset but the analyst has no obvious conclusions as to why. Anyone?

Perhaps, Sockpuppet accounts voting their master. It would be around that level that someone would build the understanding and have the funds to create the network I guess. Plus, the 58 would have the power to bring the puppets up to 45/46 relatively easily but not much higher quickly.

Hi @tarazkp

Thanks for stopping by and suggesting a reason for the question - some proof you had a look through :)

It's a good shout and makes sense, you are tempting me to go and have another look now. I think I'll need to delve into the account names a bit more, something I tried to avoid here :)

If I find something, I'll let you know. Cheers!

Yeah, I try when I can to actually read :P

It is an interesting area. The 76 voting behaviour is relatively obvious of course

As there are fewer accounts up at these levels, it is 'strange' to see reputations in the 70's voting with a largest % of vote weight to 70+ reputations, particularly when self-votes are removed.

You will likely find that this is a group of accounts vote trading. I can likely name some of them off the top of my head who will essentially go one for one. Just another scam at Steemit.

Yes both of these cases I'm aware of, as many are I would assume.

I do think this is not the only case, but likely the highest profile 'selective voting arrangement'. That's about all I have to say on that one I think!

yes, best not to speak too loud.

Asher is there anyway to add in to the data of those questionable voters if they are vote proxy accounts or all voting for the same witnesses? No names needed just the data sets would be nice to see.

It's possible I think, and yes I suspect a fair overlap.

Honestly, I don't wish to delve too far into it, for my general happiness here and my contents' well-being, if you know what I mean :)

I completely understand. I have been looking at some stuff as well and it seems like an endless rabbit hole of irritation. I guess in reality we all wonder what is in pandora's box but none of us really want to open it to find out.

Thanks for what you do here, one by one I am introduced to or find one of the larger names around here. I was told about you last week or so by @ashleykalila. That led me to find this. That's a smart girl.

Hey Asher, I don't feel qualified to weigh in on your post with any conclusions. But I wanted to say that I appreciate the time and effort this took to compile.

There are several obvious things that I learned from this, but since this is the big-boy page I will just leave my response to a "thank you for your effort, it is appreciated" reply! :)

Hi!

Yes don't worry, I like the calm of these posts if I'm honest - and if approved they supply a welcome vote and SP to delegate 😊

Good for curation too as mentioned in the League post - easy crypto as they might say!

yes... I figured you don't want the spiderwebs that get woven on the other post to happen every time! lol

btw, this guy translated his placing in your league into Portuguese and talks about the upcoming contest. I thought you might like to know, it's pretty cool! ;)

https://steemit.com/pt/@pataty69/eu-sou-curador

ps... if you don't speak Portuguese, use google chrome and it will auto translate it for you.

You really went all in on this analysis. You really do put in painstaking efforts to dig data that will be useful and hard to obtain. The data just goes to show that most people are really a fan of themselves :)

Thank you @greenrun, it sure did take longer than expected!

Yes I agree, and people are entitled to use their stake as they wish, let's hope it's not too detrimental to the platform.

Thanks for visiting 😊

Reputations 45 and 46 both vote for reputation 58 the most. This stands out in the dataset but the analyst has no obvious conclusions as to why. Anyone?

On that "Table 2 - Votes to Self excluded" I can't see any 57 in the 'Author Rep' column anywhere, which is where I am certainly included right now. So, your great analysis reveals to me why it has been such an uphill huff and puff struggle, one that looks like it has lasted for years, to reach a 58 Rep (at this instant at 57.8) for someone who joined steemit on 2016-08-07, 22:47 as the 51,069th steemian member within the platform.

Hahaha, let's see if I can gain those 45/46 Rep upvotes, at least, any time soon. }:)

Oh yes, nowhere to be seen!

Well it would be nice for you if the votes started magically appearing when you reach 58, don't bank on it but do let me know if so!! :D

I will mate. Sure I will let you know. But don't allow your beer to warm up waiting to celebrate together with a toast. With this recent drought of upvotes that we observe these days around the platform, especially those from High Rep folks with the true capability to pump us up to higher Rep figures in our steemit profile, I have the hunch that in my case this will still take a couple of months or more to reach 58.

So, drink your beers now as cold as you can. LoL

Sorry to be a bother, but have you seen

https://steemit.com/steem/@lextenebris/steem-visualizing-vote-histories-on-the-blockchain

https://steemit.com/technology/@lextenebris/steem-study-yes-more-flows-out-than-in

or some of his other posts on votes, transfers and other data visualized?

I think the single best feature of your post has to be how you say the word strange in quotes.

Strange voting patterns indeed, my friend. I wonder how they got to relatively equal rep levels. ;D

Thank you for these, I had not seen, this man is worth a follow!

'strange' is my on the fence verdict - analyst trying to stay impartial to what he sees behind the tables and charts :)