Arc Forum
Discussion on ranking algorithms
1 point by marrone 4381 days ago | 2 comments

This relates to the news.yc source that was included with the arc2 source. Huge props to the YC people for putting this out there. I love being able to look at and learn from their own code.

I see that news.yc uses a ranking algorithm (news.arc source starting at line 186) that uses the age in hours of the votes to calculate the score. I find that very interesting because I once tried to work out a good algorithm for ranking things and ran into the same problem news.yc has, which they handle with the rerank-random function (line 237). Basically, the problem is that a story's score won't depreciate until another vote is cast on it.

Their comments explain:

; If something rose high then stopped getting votes, its score would
; decline but it would stay near the top. Newly inserted stories would
; thus get stuck in front of it. I avoid this by regularly adjusting
; the rank of a random top story.

An alternative approach I considered was giving each item a base score based on the number of days (for example) since a fixed point in the past (e.g. Jan 1, 1970). That way each item degrades relative to newer items without ever needing another vote, since newer items get the advantage of a larger base score in the calculation.
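To make the idea concrete, here is a rough sketch in Python of what I mean. The names (base_score, rank_score, vote_weight) are made up for illustration, and adding a log of the vote count as the bonus is just my assumption about how votes might be combined with the base; none of this comes from news.arc.

```python
from datetime import datetime, timezone
import math

# Fixed reference point in the past (assumption: the Unix epoch).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def base_score(created_at):
    """Days between the fixed reference point and the item's creation time."""
    return (created_at - EPOCH).total_seconds() / 86400.0

def rank_score(votes, created_at, vote_weight=1.0):
    """Base score plus a vote bonus.

    Assumption: each tenfold increase in votes is worth vote_weight days
    of freshness. The score depends only on the item's own creation time
    and votes, so it never has to be recomputed as time passes.
    """
    return base_score(created_at) + vote_weight * math.log10(max(votes, 1))
```

With these numbers, an item posted two days later starts 2.0 points ahead of an older item with the same votes, while an older item with 1000 votes still beats a two-day-newer item with 1 vote.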

Anyways, I just wanted to give huge props for the source to news.yc and being able to learn from the masters themselves.

Sorry, is this a post for news.yc or these forums???

1 point by lojic 4380 days ago | link

Are you sure you read the source right?

It looks to me like it's the age of the story that's considered, not the age of the vote. Apparently, the stories only have their rank adjusted when a vote is made, so the solution was to have something in addition to voting drive the adjustment - in this case, randomly selecting a story.
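Here is a rough Python sketch of how I read that mechanism. The decay formula, the gravity value, and all of the names here are my own assumptions for illustration and are not copied from news.arc; the only parts taken from the source comments are that the rank is adjusted on a vote and that a random top story is re-ranked regularly.

```python
import random

class Story:
    def __init__(self, title, votes, submitted_hour):
        self.title = title
        self.votes = votes
        self.submitted_hour = submitted_hour
        self.cached_rank = 0.0  # only refreshed when something triggers it

def decayed_rank(story, now_hour, gravity=1.5):
    """Hypothetical decay: votes divided by a power of the story's age in hours."""
    age = now_hour - story.submitted_hour
    return story.votes / (age + 2) ** gravity

def vote(story, now_hour):
    """A vote is the normal trigger for recomputing the stored rank."""
    story.votes += 1
    story.cached_rank = decayed_rank(story, now_hour)

def rerank_random_top(stories, now_hour, top_n=10, rng=random):
    """Refresh the stored rank of one randomly chosen top story."""
    top = sorted(stories, key=lambda s: s.cached_rank, reverse=True)[:top_n]
    if not top:
        return None
    chosen = rng.choice(top)
    chosen.cached_rank = decayed_rank(chosen, now_hour)
    return chosen
```

A story that stops getting votes keeps its old cached_rank until rerank_random_top happens to pick it, at which point the rank drops to reflect the story's current age.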


1 point by byronsalty 4381 days ago | link

This approach seems like it would work, but it would probably be odd because scores would be ever-increasing. Eventually the scores on the site would get extremely high.