How open collaboration works:
an introduction for scholars

Larry Sanger

Does the idea of sharing your work freely online, or even allowing others to change it, strike you as obviously bizarre or even offensive?  Never heard of "open source software," or have only the vaguest idea?  Don't know what Linux is?  Never heard of Richard Stallman, Larry Lessig, Linus Torvalds, or Eric Raymond?  Completely mystified why Wikipedia works at all?  Have no clue as to what the GNU GPL and Creative Commons licenses are?

Then let me explain.  I'll introduce you to some ideas and processes that are strange and counterintuitive--but which nevertheless work surprisingly well.  Then I'll explain how you, serious scholars, might apply these ideas and processes to create grand new projects yourselves.  I think that in time these ideas and processes will revolutionize the way that some academic work is done.

A good way to begin is to explain what open source software is.  For this I draw especially on Eric Raymond's essay, "The Cathedral and the Bazaar," which has been extremely influential and is, of course, free to read online.  The source code of software is the code that is written by programmers and then compiled, in order to produce the executable file (the runnable program) that you use.  Usually (e.g., with Microsoft Word), you use the executable file without having the source code, and there's no easy way to extract the source code from the executable.  Open source software, then, is software the source code of which is open and published for other programmers to read and, by extension, to change for their own benign or nefarious purposes.

Before your eyes glaze over completely, let me explain why this matters.  In typical (proprietary) software development, a coder (or a coder's company) jealously guards access to source code.  It's his program, he's going to sell it, and so he doesn't want people touching the source.  Control over the source gives him the right to sell the program and the exclusive right to determine the program's design and features.  These are significant benefits to the individual or group that owns the program.

Open source turns this on its head.  Granted, it is pretty hard to make money (for most developers) with open source software.  You're giving it away, after all.  But when a coder opens up the source, other coders can contribute to the program's development.  Other coders can add new features, fix bugs, and generally make life a lot easier for the original developer.  The software becomes genuinely a product of, and owned by, the community of software developers who maintain it.  And in the Internet age, the world can benefit from this puzzling behavior--which helps explain why perhaps it shouldn't be so puzzling after all.

Raymond's essay, along with another of his, "Homesteading the Noosphere," describes a fascinating process and ethic that arose in the open source "hacker community" (i.e., among really good programmers who work on open source programs).  The program's originator becomes its maintainer.  Other coders send in changes to the program, and the maintainer decides which changes to use and which to reject, and then re-releases the program to the developer community.  The developer community again bangs away at the program, reports on and fixes bugs, proposes and codes new features.  And so on.  In time, the program becomes something much bigger and better than the maintainer could have done on his own.
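For readers who think more easily in code, the cycle just described can be sketched roughly as follows.  This is only an illustration under stated assumptions, not how any real project is run: the names Patch, Project, and release_cycle are invented here, and the passes_review flag is a crude stand-in for the maintainer's judgment.

```python
from dataclasses import dataclass, field


@dataclass
class Patch:
    """A change sent in by a contributor (hypothetical structure)."""
    author: str
    description: str
    passes_review: bool  # assumption: stands in for the maintainer's judgment


@dataclass
class Project:
    """A program whose originator has become its maintainer."""
    maintainer: str
    version: int = 1
    changelog: list = field(default_factory=list)

    def release_cycle(self, submissions):
        """Accept or reject each submitted patch, then re-release."""
        accepted = [p for p in submissions if p.passes_review]
        rejected = [p for p in submissions if not p.passes_review]
        if accepted:
            # Accepted changes are folded in and a new version goes out
            # to the developer community, which starts the cycle again.
            self.changelog.extend(p.description for p in accepted)
            self.version += 1
        return accepted, rejected
```

Calling release_cycle repeatedly, each time with the next batch of submissions, mimics the "and so on" of the process: every round leaves the program a little bigger than the maintainer could have made it alone.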

The most famous example of this process is probably the development of Linux, a free operating system that competes with Windows and the Mac operating system.  A computer science student named Linus Torvalds wrote his own version of the kernel of the (proprietary) Unix operating system and then made it available to other programmers, inviting them to work on it with him.  Torvalds' community of developers grew quickly and in very short order the operating system was in great shape.  This is widely regarded, among techies, as one of the great success stories of the computer age.  Let me underscore this point.  You can't appreciate the appeal of open source to some techies unless you understand that they are enormously impressed with the efficiency with which open source software is created, the quality of the results, and the fact that it's free and outside the control of vested interests.  With such a powerful combination of virtues, it's no wonder that so many people are ga-ga over open source.

The above-outlined development process worked brilliantly because Torvalds gave away his source code, and then encouraged people to collaborate on it with him, and the people as a result became a community.  Freedom, collaboration, and community are, thus, core values that made the Linux operating system possible.  There are other operational values as well.  For instance, one principle is "release early, release often."  The reason for this is that the purpose of a release in an open source project is to let other developers get to work on the latest code.  The more often you release, the more opportunities there are for progress on the code.

But in fact there is one central idea that explains the others.  It is the vision of a collectively-owned work (product) that anyone who is able can help build.  Open source advocates just love that vision.  This core idea explains the values of freedom, collaboration, and community.  For the work to be available for contribution by anyone, it must be "free" or "open," i.e., the source code is available to whoever asks for it, and can be developed further by anyone.  Collaboration is essentially the means whereby the work is created.  Finally, community is a necessary by-product, since collaborators must communicate and share a goal, thus bringing into existence the relationships that we associate with community.

Obviously, coders might well be concerned that the products of this open source software (often abbreviated OSS) development process remain free.  While most OSS coders don't care whether companies make money by incorporating their software in their products, they do care if anyone tries to take control of the process and make it proprietary.  To motivate their work, they want a guarantee not only that the software is free, but also that it will remain free.  Consequently, the project coordinator (or a group of developers, or an organization) typically retains copyright over the software but licenses the software under what is called an "open source" or "free software" license.  Many such licenses require that anyone who makes new versions of the software release them under either the same open source license or (depending on the license) a similar one.  And here is where Richard Stallman and the Free Software Foundation come in.  Stallman has been a very vocal advocate for a particular license, the GNU General Public License, or GPL, which the Free Software Foundation revises and re-releases from time to time.

These are suggestive ideas.

It does not require much imagination to realize that the core vision behind OSS--a collectively-owned work that anyone who is able can help build--can be applied to a lot more than software.  Lawrence Lessig started Creative Commons with the idea that a scheme of free licensing should allow people many different options in licensing many different types of work.  A person should be able to specify whether others are permitted to make derivative works, whether the author must be credited as such, and other options.  An artist, for example, might require that she be named as director of a video, allow free copying of it, and disallow derivative works.  A lexicographer might not require attribution for his dictionary, and allow derivative works.  An idea that began life in the software community--a license that guarantees free redistribution rights of software source code--in Lessig's hands was generalized and expanded.

At about the same time Lessig was starting Creative Commons, Jimmy Wales and I started Wikipedia.  Wikipedia invites anybody to come to the website and, even without logging in, start banging away not on software but on encyclopedia articles.  Like OSS, Wikipedia is a collectively-owned work that anyone who is able can help build.  Like OSS, Wikipedia features freedom, collaboration, and community: the project's articles are freely redistributable, written collaboratively, and a community has developed around the collaboration.

Wikipedia has been many people's first exposure to the magic of open collaboration.  But as you can see now, Wikipedia borrowed that magic from OSS.  And, as used to be the case with OSS, at first glance it seems absurd that the Wikipedia development process should work.  After all, if it's that open to everyone, why isn't it full of nonsense?  The amazing thing, however, is that it's not.  It's not perfect, but it's remarkably good considering how it is produced.  Why?  For the same reason that long-lived OSS projects, which no one is getting paid for, aren't full of bugs.  Namely, there are many collaborators, and they all feel a sense of ownership over a common product, so out of pride if nothing else they simply won't tolerate obvious mistakes.  Since "reverting" (undoing) bad edits is actually easier than making the bad edits in the first place, vandals are (however annoying) pretty easy to deal with.  Raymond's most famous catchphrase, "Given enough eyeballs, all bugs are shallow," was transformed to: "Given enough eyeballs, all errors are shallow."
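Why reverting is so cheap becomes clearer if one imagines a page as a list of saved revisions.  The following is a minimal sketch under that assumption; the Page class and its methods are invented for illustration and are not a description of the software Wikipedia actually runs.

```python
class Page:
    """A page that keeps every saved revision (hypothetical model)."""

    def __init__(self, text):
        self.revisions = [text]

    @property
    def current(self):
        # The text a reader sees is simply the latest revision.
        return self.revisions[-1]

    def edit(self, new_text):
        # Every edit, good or bad, is appended; nothing is overwritten.
        self.revisions.append(new_text)

    def revert(self):
        """Undo the latest edit by re-saving the revision before it."""
        if len(self.revisions) < 2:
            return  # nothing to revert to
        self.revisions.append(self.revisions[-2])
```

The vandal has to type something; the person who cleans up only has to call revert.  Because the bad revision stays in the history, nothing is lost and the episode remains visible to the other collaborators.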

It is often said that Wikipedia shares something else with OSS: a contributor's work is judged strictly on its own merits, not based on the contributor's credentials.  There is, therefore, no special place for experts as traditionally defined; as Wales himself has said, the Wikipedia community is "anti-credentialist."  This makes sense for software projects.  After all, either code compiles and does what it's intended to do, or it doesn't.  Code doesn't care about credentials.  But static text contains nothing really analogous to compiling code and running a program. Compiling and running is a relatively straightforward, objective way to determine the quality of a program.  There is no correspondingly straightforward, objective way to determine the quality of an encyclopedia article.  This important point of disanalogy seems lost on those who tout Wikipedia as the "open source" encyclopedia.

Now, prior to Wikipedia, the usual way that people have determined the quality of reference material has been through careful editing, fact-checking, and review by experts.  And in fact, despite the disanalogy noted above, there is an essentially editorial role in all OSS projects.  There is often a small set (sometimes a set of one) of "senior developers" who examine submitted code and decide what's in and what's out.  But Wikipedia has no "senior content developers" who play a similar role in deciding what is published, or even merely featured or certified.  When it comes to content decisions, all contributors negotiate on a more or less equal footing, regardless of how much they know, or don't know, about a subject.

So, in my opinion, and for this reason, Wikipedia really doesn't follow the OSS development model as well as it could.  And, unsurprisingly, the quality of Wikipedia articles, especially in the humanities and social sciences, is uneven and frequently amateurish.  Clearly, the job of applying the OSS model to encyclopedias is unfinished.

I think it's time that the editors of the world--meaning academics, scientists, and others whose work essentially involves editing--got involved, not necessarily in Wikipedia, but in similar, suitably altered projects.  I want to encourage you scholars, who make it your life's work to know and teach stuff, to become students of the wonders and beauties of OSS development, and think about how it can be applied to the development of content.

There are many exciting possibilities in which scholars would be "senior content developers" for open content projects (that's actually the phrase).  I invite you to imagine what you could do if you were to pool your efforts, working as editors guiding a huge global group of content developers.  What sorts of new reference works could emerge with the concerted efforts of people editing each other's work?  Think of the story of the creation of the Oxford English Dictionary.  It was by pooling and editing the submissions of thousands of perfectly uncredentialed contributors that James Murray and his assistants created one of the greatest works of the English language.  What is possible, now that collaboration can be done digitally, instantly, globally, and with all sorts of software tools to automate drudgework?  It would make Murray's mind reel.

Well, I maintain that one thing that's possible is that you, scholars, become the gentle editor-guides of new wiki encyclopedias.  (Gentle, I say, because in your editing you must not kill the goose that lays the golden eggs, namely, bottom-up collaboration.)  Time will tell whether it fits properly into the narrative as I've told it above, but arguably, the next step in the application of OSS processes is the inclusion of editors as "senior content developers" in open content projects like Wikipedia.

That, at any rate, is the approach of the Citizendium, which (at this writing) is a proposal to begin with Wikipedia's content and invite editors to collaborate with the general public to improve it (it is a "fork" of Wikipedia, which Wikipedia's open content license permits).  I think there's an excellent chance that this project, or others like it, will work.  I also hope that such projects will illustrate and teach the virtues of open collaboration to a large and vastly influential group of people who never even thought of working this way.

But some promoters of OSS and open content say these projects won't, or even can't, work.  They say that professional researchers won't work without pay--ignoring that researchers write for journals and speak at conferences without pay all the time.  They say that scholars will work only alone or in very small groups, require their names on their work, and can't learn how to collaborate in the radical OSS way--ignoring that many unsung scholars have collaborated with Wikipedians and clearly like the basic concept.  They say that college professors are too snobbish to work alongside and interact with the public--ignoring that an essential part of being a college teacher is precisely to make knowledge accessible to the public, a task to which many are passionately devoted.  They say that specialists generate mainly pretentious nonsense, and so are not suitable as editors for the public--ignoring the fine abilities of many specialists.

These well-meaning but wrongheaded promoters of OSS and open content seem to think that open collaboration is a method reserved exclusively for amateurs, students, the "general public," and so forth.

Let's prove them wrong.

Creative Commons License
This essay is free: it is licensed under a Creative Commons Attribution-NoDerivs 2.5 License. You are strongly encouraged to reproduce it on your website or blog, print it out and give copies of it to your colleagues.  As content developer for this essay, I will integrate suggestions via sanger@citizendium.org.  Note, I want to keep it at least as short as it now is.
