Performance of: ai-scan.rb (Round 1)
Previously, I posted a script that claims to predict new post payouts based on trending posts. To recap, we're using ID3 (a decision tree learning algorithm) to correlate certain fields in the post without considering the name of the author.
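For readers who haven't seen ID3 before, here is a minimal sketch of its core step: picking the field whose split gives the highest information gain. This is not the code from ai-scan.rb. The rows, the `:tag` and `:has_image` fields, and the `'high'`/`'low'` payout labels are all made up for illustration, and the author is deliberately left out of the fields, as in the script.

```ruby
# Sketch of the information-gain step at the heart of ID3.
# The rows, field names, and labels are hypothetical, not from ai-scan.rb.

# Shannon entropy (in bits) of a list of class labels.
def entropy(labels)
  total = labels.size.to_f
  labels.group_by(&:itself).values.sum do |group|
    prob = group.size / total
    -prob * Math.log2(prob)
  end
end

# Information gain from splitting the rows on a single field.
def information_gain(rows, field, label_key)
  base = entropy(rows.map { |row| row[label_key] })
  remainder = rows.group_by { |row| row[field] }.values.sum do |subset|
    weight = subset.size.to_f / rows.size
    weight * entropy(subset.map { |row| row[label_key] })
  end
  base - remainder
end

rows = [
  { tag: 'photography', has_image: true,  payout: 'high' },
  { tag: 'photography', has_image: true,  payout: 'high' },
  { tag: 'poker',       has_image: false, payout: 'low'  },
  { tag: 'poker',       has_image: true,  payout: 'low'  }
]

# ID3 splits first on the field with the highest gain.
best_field = %i[tag has_image].max_by { |f| information_gain(rows, f, :payout) }
puts "ID3 would split on #{best_field} first"
```

In this toy data the tag separates the labels perfectly, so it wins over the image field. A full ID3 implementation would repeat the same step on each resulting subset until the labels are pure or the fields run out.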
I did this report manually. If it performs well, it might be interesting to do it automatically each day. Anyway, let's see how it did this time ...
Here's the output from Sunday, January 20th, 2017 at about 7:27 PM (UTC):
$ ruby ai-scan.rb
Predicting the following payouts will rise by:
16 SBD: https://steemit.com/aceh/@keuudeip/they-sold-this-handmade-in-their-traditional-market-a-vote-for-cultural-diversity
16 SBD: https://steemit.com/steemitphotochallenge/@knight-angel/steemitphotochallenge-25-entry-no2-plasticine-rock
16 SBD: https://steemit.com/openmic/@ken-and-jane/open-mic-night-week-15-come-in-from-the-cold-an-original-song-by-ken-and-jane
16 SBD: https://steemit.com/steemitphotochallenge/@dreamstream/more-of-the-blues-steemitphotochallenge-entry
17 SBD: https://steemit.com/film/@jackmanmania/banking-on-bitcoin-2016-film-review
17 SBD: https://steemit.com/photography/@dejoelblog/monument-general-strike-in-yogyakarta-1-march-1949
17 SBD: https://steemit.com/til/@transhuman/til-of-tommy-emmanuel
17 SBD: https://steemit.com/animal/@barvon/golden-frog
17 SBD: https://steemit.com/photography/@dman57/notocactus-otto-blossomed
17 SBD: https://steemit.com/cosplay/@gnumix/cosplay-wars-marvel-vs-dc
17 SBD: https://steemit.com/fiction/@johnjgeddes/tempest-and-tea-spells-and-magic-and-simple-attraction-conclusion
17 SBD: https://steemit.com/anarchy/@ericwilson/anarchy-is-everywhere-2017115t13242651z
17 SBD: https://steemit.com/nature/@tangmo/beautiful-flowers-strange-trees-and-dog-gangs-in-the-garden
17 SBD: https://steemit.com/art/@alexeymohov/butterfly
17 SBD: https://steemit.com/photography/@foreman/green-leaf-sprinkled-with-snow
20 SBD: https://steemit.com/poker/@tuck-fheman/poker-what-am-i-holding-aka-guess-my-hand
All of the above posts had a pending payout of less than a dollar when I ran the script. They all had at least 23 hours left for the first payout.
So, what we're looking for is the resulting payout for these posts, once they get to that point. How close does my script get to those numbers?
To count a prediction as accurate, the post has to earn at least as much as predicted. None of them reached the prediction.
If we cherry-pick the results, the posts that got non-zero payouts earned pretty decent curation rewards, because my votes landed after the 30-minute mark but before many high-stake votes. Not quite perfect, though, because near-optimal curation also means being among the first 10 votes cast.
The above test did not segment based on tag. For the next test, I narrowed to the photography tag:
Predicting the following payouts will rise by:
0 SBD: https://steemit.com/life/@gardenofeden/full-moon-community-gathering-photo-album-from-the-dhyana-yoga-center-please-join-us-at-the-next-event-on-january-28th
0 SBD: https://steemit.com/photography/@ejhaasteem/devil-horns-on-small-creatures
0 SBD: https://steemit.com/art/@marius19/decoration-origami
0 SBD: https://steemit.com/aceh/@keuudeip/they-sold-this-handmade-in-their-traditional-market-a-vote-for-cultural-diversity
0 SBD: https://steemit.com/nature/@mynameisbrian/good-morning-agouti
0 SBD: https://steemit.com/photography/@fibra59/morning-fishing-greeting
0 SBD: https://steemit.com/photography/@abdullar/flowers-at-night-photos
0 SBD: https://steemit.com/photography/@foreman/green-leaf-sprinkled-with-snow
3 SBD: https://steemit.com/steemitphotochallenge/@germanaure/steemitphotochallenge-25-entry-2-me-and-my-problems
4 SBD: https://steemit.com/steemitphotochallenge/@knight-angel/steemitphotochallenge-25-entry-no2-plasticine-rock
16 SBD: https://steemit.com/photography/@dman57/notocactus-otto-blossomed
16 SBD: https://steemit.com/photography/@xntryk1/swapmeet-finds-644
17 SBD: https://steemit.com/photography/@julia-steemit/rainbow-bright
17 SBD: https://steemit.com/steemitphotochallenge/@roma-nt/steemitphotochallenge-25-balalaika
17 SBD: https://steemit.com/photography/@apolymask/i-finally-got-to-100-subscribers-on-youtube
17 SBD: https://steemit.com/funny/@singapasee/the-grasshopper-stories-2017116t2953477z
17 SBD: https://steemit.com/food/@abudar/delights-and-benefits-of-snail-dongdong
17 SBD: https://steemit.com/photography/@dman57/icicles
19 SBD: https://steemit.com/photography/@kalemandra/purrfect-colorsplash-cat-eyes
The above results were a little less uniform than the tag-indifferent results. Their pending payouts initially ranged from less than a dollar to $15. Their timeframes were also less uniform, ranging from 23 hours left to only 16 hours left.
This makes sense because the tag-indifferent run tracks more activity, while narrowing to a tag pulls in posts from a wider timeframe.
In terms of accuracy, 8 predictions out of 19 were correct. But those 8 were the ones with a prediction of zero, where any payout at all counts as a hit. They were accurate by this criterion, but it was a somewhat useless kind of accuracy.
However, 4 of those 8 actually had a non-zero payout, so that's a success. It seems like using a tag for this script is indicated, so further testing should probably include a tag.
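To make the scoring rule concrete, here is a small sketch of how the tally could be automated. It counts a prediction as accurate when the final payout is at least the predicted amount, and reports separately how many of those hits came from a non-zero prediction. The `Result` struct and the sample values are assumptions for illustration, not the actual payouts from this report.

```ruby
# Scoring sketch. The sample values are made up; they are not the payouts
# observed in this report.
Result = Struct.new(:predicted, :actual)

# A prediction is accurate when the post earned at least the predicted amount.
def accurate?(result)
  result.actual >= result.predicted
end

# Returns [all accurate predictions, accurate predictions that were non-zero].
def score(results)
  hits = results.select { |r| accurate?(r) }
  meaningful = hits.reject { |r| r.predicted.zero? }
  [hits.size, meaningful.size]
end

sample = [
  Result.new(0, 0.0),   # trivially accurate, nothing earned
  Result.new(0, 2.5),   # trivially accurate, but the post did earn something
  Result.new(16, 3.1),  # missed
  Result.new(17, 20.0)  # accurate with a non-zero prediction
]

total_hits, meaningful_hits = score(sample)
puts "#{total_hits} of #{sample.size} accurate, #{meaningful_hits} with a non-zero prediction"
```

Reporting the two numbers side by side keeps the zero predictions from inflating the headline accuracy.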
Conclusion
- Is it just the same as luck or can ID3 actually predict the final payout? - On its face, I think the results might be worse than luck.
- Do we consider zero payout as "accurate" when the prediction is zero like the four we see in the second sample? - Actually, this is where the script seems to be better than chance, so our methodology might benefit from a few tweaks to improve this.
- Should we cherry-pick the initial prediction to only consider articles with 9 votes at 30 minutes? - I think this should be explored.
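As a rough sketch of what the last idea could look like, the filter below keeps only posts that have passed the 30-minute mark and still have at most 9 votes, so that a new vote would be among the first 10. Reading "9 votes" as "at most 9" is my assumption here, and the `:url`, `:created`, and `:votes` keys are hypothetical. The script may represent posts differently.

```ruby
# Candidate filter sketch. The :url, :created, and :votes keys are hypothetical;
# ai-scan.rb may represent posts differently.
MIN_AGE_SECONDS = 30 * 60 # only consider posts past the 30-minute mark
MAX_VOTES = 9             # so that a new vote is among the first 10

def early_candidates(posts, now)
  posts.select do |post|
    age = now - post[:created]
    age >= MIN_AGE_SECONDS && post[:votes] <= MAX_VOTES
  end
end

now = Time.utc(2017, 1, 20, 19, 27)
posts = [
  { url: 'example-a', created: now - 45 * 60, votes: 9 },  # kept
  { url: 'example-b', created: now - 10 * 60, votes: 3 },  # too young
  { url: 'example-c', created: now - 60 * 60, votes: 40 }  # too many votes
]

early_candidates(posts, now).each { |post| puts post[:url] }
```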
If these results are any indication, this analysis is more useful when we narrow the search by tag. Based on that, I think it's worth continued testing.