June 9, 2010 Distributed Encyclopedia Model
(In Korean, this post that I put up on the main 메아리 site today covers similar content.)
Today we see lots of things moving to a distributed model; so-called crowdsourcing and DVCSes are examples. As a polymath wannabe, I have naturally thought about a distributed encyclopedia model. While Wikipedia did a great job of building a publicly editable encyclopedia, it is nowhere close to a “distributed” encyclopedia, as it has a central set of rules (albeit quite flexible ones) and editors have to (mostly) conform to them.
Why is the distributed model important? The first observation is that Wikipedia is a general encyclopedia, not specialized in any one subject. That doesn’t necessarily mean Wikipedia articles are inferior to ones written by experts, but it does overlook one fact: notability is subjective. Having a concrete notability criterion (to repeat myself, it is still quite flexible) is good for Wikipedia’s original purpose. But is a non-notable subject really not notable? Many specialized encyclopedias (e.g. Wookieepedia) cover lots of subjects that would not be notable on Wikipedia but are in fact notable within their own domain! Wikipedia misses a lot of specialized knowledge that may be notable in some respect, so it fails to capture all of human knowledge.
The second observation is that we already have lots of wiki-based encyclopedias, and they are mostly isolated from each other. Have you seen any case where one encyclopedia extensively links to another, and vice versa? I’d bet not. There were interwiki and sisterwiki mechanisms meant for exactly this purpose, but they are mostly ignored nowadays and turned out to be very ineffective. I think Wikia has a similar system for Wikia wikis (built on MediaWiki’s interwiki facility), but as far as I know it is seldom used either. Moreover, it turns out that many encyclopedias don’t link to Wikipedia articles either, even though Wikipedia is very well known and quite stable as a link target.
The third observation is that a collaborative encyclopedia project is also a community. An online community is a pain to manage and becomes a disaster as it grows larger. Wikipedia editors, for instance, have one common interest, contributing to a particular subject, but apart from that they have nothing in common. A conflict between two parties may dissolve the entire community, split it in half, or shut out new users while the conflict lasts. For an encyclopedia project, however, it is hard to split the community, since one cannot efficiently fork the project; Wikipedia does provide dumps, but it is still hard to get everything right. Have you ever heard of Conservapedia, which completely failed to fork the project, thus started from scratch, and became a piece of shit of the Internet?
These observations lead me to an important conclusion: Wikipedia, or any other current model of wiki-based encyclopedia, is poorly suited to collecting the entirety of human knowledge. Such wikis may be useful for collecting some of human knowledge (as they have shown), but then what about the rest? Therefore we need stronger relationships between individual wikis, a more effective forking mechanism, and a systematic policy to encourage both. Ideally the community of one wiki should be small, as small as one person (an “individual encyclopedia”). A community may choose to write every article from scratch (much the way I prefer) or to write some relevant articles and refer to articles in other wikis for the rest. Here a reference is not the same as a link: we need resilience for referred articles (for when the other wiki ceases to exist or corrupts the article) and the ability to trace references backwards (much like backlinks in a wiki). Someone may choose to fork a wiki, but that should cause only a reasonable amount of traffic; possibly the software could copy articles spontaneously and intelligently, minimizing the peak traffic.
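To make the difference between a reference and a plain link a bit more concrete, here is a minimal in-memory sketch in Python. Everything in it is hypothetical: the `Wiki` and `Reference` classes, the `refer` and `resolve` methods, and the `network` dictionary standing in for "wikis that are currently reachable" are names made up for illustration, not part of any existing wiki software. It only covers the case where the other wiki disappears; detecting a corrupted article would need something more (say, comparing revisions), which is not attempted here.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    """A reference to an article in another wiki, with a local snapshot."""
    wiki: str       # hypothetical name of the referred wiki
    title: str      # title of the referred article
    snapshot: str   # last known text, kept locally for resilience


@dataclass
class Wiki:
    name: str
    articles: dict = field(default_factory=dict)    # title -> text
    references: dict = field(default_factory=dict)  # (wiki, title) -> Reference
    backlinks: dict = field(default_factory=dict)   # title -> names of referring wikis

    def refer(self, other, title):
        """Refer to an article in `other`: snapshot it and register a backlink there."""
        ref = Reference(other.name, title, other.articles[title])
        self.references[(other.name, title)] = ref
        other.backlinks.setdefault(title, set()).add(self.name)
        return ref

    def resolve(self, other_name, title, network):
        """Return the live text if the other wiki is reachable, else the snapshot."""
        ref = self.references[(other_name, title)]
        other = network.get(other_name)
        if other is not None and title in other.articles:
            ref.snapshot = other.articles[title]  # refresh the local copy
        return ref.snapshot


# Usage: wiki "b" refers to an article in wiki "a", then "a" goes away.
a = Wiki("a", {"Foo": "foo text"})
b = Wiki("b")
network = {"a": a, "b": b}

b.refer(a, "Foo")
live_text = b.resolve("a", "Foo", network)    # read from the live wiki
del network["a"]                              # wiki "a" ceases to exist
cached_text = b.resolve("a", "Foo", network)  # falls back to the snapshot
```

Since snapshots are taken one article at a time, when a reference is made or resolved, this also hints at how copying could be spread out over time instead of arriving as one big fork-sized transfer.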
Yeah, that is untested and not guaranteed to work. I also don’t know whether there is a solution that fulfills these requirements across many loosely interconnected wikis; for example, the traffic requirement rules out most DVCSes. But I’m still confident that my observations are valid. Above all, is it sane that such an important piece of human knowledge relies on a centralized mechanism alone, or not?!
