Subject: is there a statistician in the house? (long)
From: Seth Gordon
Date: Thu, 10 Mar 2005 14:25:20 -0500

I have a weird idea for how to address the perpetual question, "How can
I pay the rent by writing open-source software?"  My idea depends on
some statistical techniques that I've read about but never formally
studied.  I'm hoping that someone on this list who knows more about
statistics than I do can tell me if I'm onto something useful here, or
if I'm just waving my hands.

The community of free-software writers is frequently referred to as a 
gift economy, where people donate code to the community in order to 
enhance their own reputation.  I would like to point out two significant 
things about such economies, in general:

(1) Gift economies are most effective in communities where the donors 
and recipients are doing the same kind of work.  (Presumably this is 
because, in the absence of price signals, the shared knowledge of the 
craft assures donors that recipients will properly appreciate the 
donors' work.)  In the classic gift economy, the potlatch, everyone 
involved belonged to a hunter-gatherer tribe.  Paul Erdos's reputation 
among his fellow-mathematicians was such that many mathematicians let 
him stay in their homes; professors in other fields would presumably not 
be so interested in hosting a mathematician, however famous. 
Open-source hackers have tended to be much more diligent in producing 
code that their fellow hackers can appreciate than they have been in 
producing artifacts that benefit other groups (e.g., documentation for 
non-technical end-users).

(1') Another way of stating this is that there is no global ordering for 
people's reputations.  If I say that A is a more praiseworthy hacker 
than B, and you say the opposite, there is no yardstick for judging 
between us.  Of course, if we both are trying to hire a programmer and 
we start putting out competitive bids, we will discover whether or not 
A's labor has a higher *market* value than B's, but then we are no 
longer in the realm of a reputation-based economy.

(2) Reputation is ordinal, not cardinal.  That is, I might say that A 
is a more praiseworthy hacker than B, but I cannot quantify *how much 
better* A is.

Now, if there were some way to *transfer reputation* from one person to 
another, then the effects of (1) could be diluted.  If, for example, my 
mother told me that a certain piece of free software made her life much,
much easier and she regarded its author very highly, then even if I 
personally had no use for that software, I would be favorably disposed 
towards its author.  How can that transfer of reputation be formalized?

With that introduction, I present The Kindness Of Strangers Game.

(1) The game is played in turns.  (Assume for the time being that each 
turn lasts one month.)  Players commit to give one another gifts of 
goods and/or services, and promise that, all other factors being equal, 
they will use the ranking system described below to determine who to 
give things to.  (A nonrival good given to the entire community, such as 
an open-source program or an original piece of music, can be accepted as 
a gift by everyone who appreciates it and ignored by everyone else.)

(2) During each turn, each player announces his or her "donors" and 
"endorsees".  By naming someone as a donor, you say, "So-and-so gave me 
some gift in the previous turn, for which I am grateful."  By naming 
someone as an endorsee, you say, "If you give so-and-so a gift in the 
next turn, I will be inclined to reciprocate it."  Endorsees can be 
friends, relatives, gurus, etc.

(3) Your donors are ranked by how much you value their gifts.  This is 
your "primary rank list".

(4) (This is the statistics part.) The people whom you named as 
endorsees in the previous turn share their primary rank lists with you 
in the current turn.  You can combine these lists into a single list as
follows:

(4a) Treat each ranking of each donor as an observation; treat the set 
of rankings received by each donor as a sample.

(4b) Use the Kruskal-Wallis test to find out if there is a statistically 
significant difference between your samples.  If no donor has 
significantly better rankings than any other donor, then the only way to 
combine the lists is to declare a tie between all the donors.

(4c) If the Kruskal-Wallis test *does* find a difference, then use the 
post-hoc Newman-Keuls statistic to find out who tended to outrank whom.

(Conveniently enough, the Statistics::KruskalWallis Perl module purports 
to do both these calculations.)

(4d) The list generated by this procedure is your "secondary rank list".
Thus, someone you have never heard of, but who was generous to your
friends, may rank highly in your secondary rank list.

(4e) This process may be continued recursively--players may publish 
their secondary rank lists and use them to compute tertiary lists, and
so on.
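
Steps (4a) and (4b) can be sketched in a few lines of Python.  This is
only an illustration under stated assumptions, not the Perl module's
behavior: the function names (samples_from_rank_lists, midranks,
chi2_sf, kruskal_wallis) and the dictionary layout are invented for
this example, a donor's observation is taken to be his or her position
in each endorsee's list (1 = best), and the p-value comes from the
usual chi-square approximation with a correction for ties.

```python
import math
from collections import defaultdict

def samples_from_rank_lists(rank_lists):
    """Step (4a): one observation per (endorsee, donor) pair.

    rank_lists maps an endorsee to that endorsee's primary rank list,
    best donor first.  Returns donor -> list of positions (1 = best).
    """
    samples = defaultdict(list)
    for ranked_donors in rank_lists.values():
        for position, donor in enumerate(ranked_donors, start=1):
            samples[donor].append(position)
    return dict(samples)

def midranks(values):
    """Rank values 1..N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    y = x / 2
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

def kruskal_wallis(samples):
    """Step (4b): return (H, p), with H corrected for ties."""
    groups = list(samples.values())
    if len(groups) < 2:
        return 0.0, 1.0
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_sizes = [pooled.count(v) for v in set(pooled)]
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:  # every observation identical: nothing to test
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)

# Hypothetical turn: three endorsees rank the same three donors.
rank_lists = {
    "alice": ["X", "Y", "Z"],
    "bob": ["X", "Y", "Z"],
    "carol": ["X", "Z", "Y"],
}
h, p = kruskal_wallis(samples_from_rank_lists(rank_lists))
```

Note that each sample here has only three observations.  The
chi-square approximation is an asymptotic one, so with samples this
small the p-value should be read as a rough guide at best.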

(5) When you are deciding who to favor with gifts in the current turn, 
you let yourself be guided by the rankings in both your primary and 
secondary rank lists.
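
Steps (4c) and (4d) can be sketched the same way.  Be warned that this
sketch does NOT implement Newman-Keuls: as a simpler stand-in it uses
Dunn-style pairwise z tests on mean pooled ranks, with a Bonferroni
adjustment and no tie correction.  The function names (midranks,
pairwise_dunn, secondary_rank_list) and the input layout (donor ->
list of positions, 1 = best) are assumptions made for this example.

```python
import math
from itertools import combinations

def midranks(values):
    """Rank values 1..N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pairwise_dunn(samples):
    """Stand-in for step (4c): compare every pair of donors.

    Returns (mean_rank, adjusted_p), where adjusted_p maps each pair
    (a, b) to a Bonferroni-adjusted two-sided p-value.
    """
    names = list(samples)
    pooled = [v for name in names for v in samples[name]]
    n = len(pooled)
    ranks = midranks(pooled)
    mean_rank, start = {}, 0
    for name in names:
        size = len(samples[name])
        mean_rank[name] = sum(ranks[start:start + size]) / size
        start += size
    pairs = list(combinations(names, 2))
    adjusted_p = {}
    for a, b in pairs:
        # Standard error of the difference in mean ranks (ties ignored).
        se = math.sqrt(n * (n + 1) / 12
                       * (1 / len(samples[a]) + 1 / len(samples[b])))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        adjusted_p[(a, b)] = min(1.0, p * len(pairs))
    return mean_rank, adjusted_p

def secondary_rank_list(samples):
    """Step (4d): donors ordered by mean pooled rank, best first."""
    mean_rank, _ = pairwise_dunn(samples)
    return sorted(mean_rank, key=mean_rank.get)

# Hypothetical samples: positions each donor received from three endorsees.
samples = {"X": [1, 1, 1], "Y": [2, 2, 3], "Z": [3, 3, 2]}
mean_rank, adjusted_p = pairwise_dunn(samples)
ordering = secondary_rank_list(samples)
```

A pair whose adjusted p-value stays above the chosen threshold would be
treated as tied, in the spirit of step (4b).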


How badly am I abusing the statistical techniques I'm referring to here?

"Simple faith can lead to very complex aveirot."  --Shmarya
// seth gordon //