Subject: Estimating investments in Linux and other libre software
From: (Frank Hecker)
Date: Fri, 30 Oct 1998 13:57:25 -0500

(Resending this due to a qmail error the first time.)

Here is an interesting article containing estimates of the programmer
time put into developing the Linux kernel and the surrounding GNU, etc.,

As it happens, Mike Shaver of Netscape and I engaged in some internal
correspondence regarding this subject a while back.  Mike estimated
(based on the amount of code added to the kernel in a year, divided by a
nominal 10 lines of completed code per day per developer) that the
amount of programmer time put into Linux kernel development was
equivalent to approximately 120 full-time developers; compare this to
the estimate in the article of 200 part-time kernel developers totalling
500 man-years over 5 years, or the equivalent of 100 full-time people
per year.  Pretty good agreement, and one which gives some confidence
that these are reasonably accurate numbers.  (Although I wish that the
article had included more background information on the methodology used
to create the estimates.)
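
To make the arithmetic behind the two estimates concrete, here is a
rough sketch in Python.  Only the 10 lines per day, the 500 man-years
and the 5 years come from the discussion above.  The 250 working days
per year and the 300,000 lines added per year are my own illustrative
assumptions, chosen so that the first calculation lands on the 120
figure; they are not numbers from Mike's estimate or from the article.

```python
# Back-of-the-envelope conversion between code volume and developers.
# ASSUMED (not from the source): 250 working days per year, and a
# hypothetical 300,000 lines of code added in one year.

WORKING_DAYS_PER_YEAR = 250  # assumption
LINES_PER_DEV_DAY = 10       # nominal figure quoted in the text


def full_time_equivalents(lines_added_per_year,
                          lines_per_day=LINES_PER_DEV_DAY,
                          days_per_year=WORKING_DAYS_PER_YEAR):
    """Full-time developers needed to produce this much code in a year."""
    return lines_added_per_year / (lines_per_day * days_per_year)


def average_headcount(total_man_years, calendar_years):
    """Average full-time headcount implied by a man-year total."""
    return total_man_years / calendar_years


code_based = full_time_equivalents(300_000)  # hypothetical input
article_based = average_headcount(500, 5)    # figures from the article

print(code_based, article_based)  # prints 120.0 100.0
```

Under these assumptions the two approaches differ by about 20%, which
is the "pretty good agreement" noted above; a different guess for
working days per year would shift the first number proportionally.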

This subject is inherently interesting to me for a number of reasons.
First, it enables one to compare the programmer resources being put into
Linux and other libre software projects vs. the resources being put into
proprietary products like NT, and this in turn gives some clues as to
the long-term viability of Linux, etc., vs. NT, clues which will be
useful to those considering investing in or participating in the general
Linux market (meaning, Linux plus stuff running on top of it plus stuff
needed to make it run).  In this sense it is a useful complement to the
estimates of the total Linux user base that have been published by Red
Hat and others. 

For libre software development in general it is also important to know
things like the size of the total pool of available programming talent
vs. the amount of that talent being spent on various projects.  As libre
software projects become more sophisticated and more tied into
commercial enterprises (e.g., FSBs), I think those who initiate and
manage such projects are going to have to get more sophisticated about
recruiting and retaining developers; for one thing, they're going to be
competing against other projects trying to recruit from the same
developer pool. 

One aspect of being more sophisticated is having better metrics about
actual and potential developer resources; then you can begin to think
about implementing formal strategies for recruitment and retention and
evaluating their relative success or failure.  This is analogous
to what more sophisticated non-profit organizations do in terms of
measuring the success of direct-mail campaigns or volunteer
initiatives.  (In fact I would contend that the analogy is actually
fairly exact, given that in both cases you're leveraging unpaid
volunteer resources and have to deal with all the issues that entails:
managing the role of volunteers vs. that of in-house paid staff, paying
attention to the ideals that motivate volunteers, etc.)

This would be a great area for someone in academia to do a study and
come up with some publishable data on past and present investments in
libre software both in toto and by project.  Most if not all of the
information you'd need for such a study is publicly available; for
example, you could use old tarballs or CVS snapshots to do source line
counts and counts of the number of unique contributors for particular
projects over time.
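
As a minimal sketch of what such counting might look like, the Python
below totals source lines in a tarball and counts distinct contributor
names.  Everything here is an assumption for illustration: the choice
of ".c" and ".h" as "source" files, counting raw newlines rather than
statements, and treating names that differ only in case or surrounding
whitespace as the same person.  How you would actually extract author
names (from CVS logs, changelogs, or credits files) is left open.

```python
import io
import tarfile

# ASSUMPTION: only C sources and headers count as "source lines".
SOURCE_SUFFIXES = (".c", ".h")


def count_source_lines(tar_fileobj, suffixes=SOURCE_SUFFIXES):
    """Count newline characters in matching regular files of a tar archive.

    tar_fileobj is a seekable binary file object; compression
    (none, gzip, bzip2) is detected automatically.
    """
    total = 0
    with tarfile.open(fileobj=tar_fileobj, mode="r:*") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(suffixes):
                data = archive.extractfile(member).read()
                total += data.count(b"\n")
    return total


def unique_contributors(author_names):
    """Count distinct non-empty names, ignoring case and outer whitespace."""
    return len({name.strip().lower() for name in author_names if name.strip()})
```

Running count_source_lines over a series of release tarballs opened in
binary mode, and unique_contributors over the author names gathered for
each release, would give the two time series described above.  A real
study would want a more careful line counter (skipping comments and
blank lines) and better handling of people who appear under several
names or addresses.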

Frank Hecker          Pre-sales support, Netscape government sales