No. I've been considering whether to open source it, and there are two issues:
Because of the currently lower prices, the community rewards, significant though they have been, haven't come close to covering my learning and development time yet.
If this code were made public, it would allow organised spammers to circumvent adverse classification much more easily, and undermine the effort.
I think this issue is similar to some we've discussed in the past, with no obvious perfect solutions.
Sure, and my position remains the same: I can't trust a metric I can't inspect, so while I know you have the best of intentions, you become the point of trust for your algorithm instead of the algorithm standing objectively on its own.
A compromise could be to allow someone you trust to inspect it, someone with some credibility publicly, and they can state their findings without revealing the algorithm. If you had a few of those from people I knew to be competent and honest it would significantly raise my trust in it.
Just something to think about in the interest of claiming these metrics have any meaning.
I suppose the community can look at the results from the algorithm to assess whether it has any meaning, and yes, trust me if they think I deserve it. That said, I wouldn't rule out what you have mentioned.
Perhaps you could send me a list of people you know to be competent and honest? ;)
The results won't be enough to test that unless you have good knowledge of the entire ecosystem, i.e. what did it leave out? It sounds expensive to verify blindly.
@timcliff seems to fit the bill, and I believe he's taken an interest in your project.
I'll certainly consider that once I have more training data and a cleaner implementation. That's not a very long list of people though, and you said you'd want 'a few' in order to significantly raise your trust.
It's necessarily going to be a small list; I can only think of one other: @jesta. Think about it, and why not pick some others too? It's not specifically for me or anything.
Thanks.
I will think about it, and tidy up my code ;)