Friday, December 31, 2010

On Wikipedia

I have certainly found Wikipedia to be the most useful resource for just about any topic, but I was wondering if there was any useful way of quantifying this.  Here's my attempt.

The utility of a resource is related to what you get from reading it, taking into account both correct and incorrect information.  Wikipedia skeptics would point out that Wikipedia has much more incorrect information than a more "scholarly" source, even measured per unit of information in the resource, but I'd argue that this isn't the correct measure.  Instead, I'd argue that the best--or at least a good--way to measure the usefulness of a resource is to think about what happens when you attempt to find a specific piece of information in it.  Let's say that you're looking for some fact A.  If you let the total information of a resource R (weighted by usefulness*) be I, and the total information in the world (again, weighted by usefulness) be T, then the probability that A is in R is I/T.  If it's not there, the usefulness of the resource is 0; if it is, then let's say that the usefulness is X if the information is correct, and -Y if it's incorrect (with Y positive, so that incorrect information has negative usefulness).  Let the probability that a random fact, weighted by usefulness, is correct in a given resource be P.  Then, the expected value of looking up A in R is (I/T)*(P*X-(1-P)*Y).  This simplifies to I*[P(X+Y)-Y]/T.  T, however, is the same for all resources, so without loss of generality I'll drop it by measuring usefulness in a unit, the UN, that absorbs the factor of 1/T; the usefulness of a resource is then I*[P(X+Y)-Y] (measured in UN).
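As a rough sketch, the formula above can be written as a small Python function.  The function and parameter names are just labels I've picked for illustration, and Y is treated as a positive cost, matching the subtraction in the formula.

```python
def usefulness(I, P, X, Y):
    """Usefulness (in UN) of a resource, in the simplified form I*[P(X+Y)-Y].

    I: information in the resource, weighted by usefulness
    P: probability that a (usefulness-weighted) fact in it is correct
    X: value of a correct fact
    Y: cost of an incorrect fact, given as a positive number
    """
    return I * (P * (X + Y) - Y)


def usefulness_unsimplified(I, P, X, Y):
    """The same quantity before simplifying: I*(P*X - (1-P)*Y)."""
    return I * (P * X - (1 - P) * Y)


# The two forms should agree for any inputs; these values are arbitrary.
print(usefulness(2, 0.9, 1, 3), usefulness_unsimplified(2, 0.9, 1, 3))
```

A resource with no errors (P=1) simply scores I*X, whatever Y is.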

Now, let's try to compare two resources with the above equation.  I'll attempt to compare Wikipedia with the Encyclopedia Britannica.  Let's say, for the sake of argument, that Britannica has no errors (i.e. P=1), and that Wikipedia has 1 error in every 100 pieces of information (a figure that I think is way too high--articles contain thousands of pieces of information and most don't contain any errors), i.e. P=.99.  Wikipedia is about 25 times as long as Britannica (yes, this is according to Wikipedia; I'm willing to take the chance that it's wrong); this number will likely double every few years for a little while, but let's even keep it constant at 25.  Then, taking Britannica's I to be 1, the usefulness of Wikipedia is 25[.99*X-.01*Y], and the usefulness of Britannica is X.  I would normally assume that if we let X be normalized to 1, then Y would be about 3 (which is to say that if you were given 3 correct pieces of information and 1 incorrect one, you'd be breaking even); this would mean that Wikipedia comes out to 24 UN, with Britannica at 1 UN--not even close.  But let's see, for the sake of argument, what Y would have to be for them to be equal.  Again normalizing X to 1, we get that 25[.99-.01*Y]=1, or Y=95.  So the break-even point would be if it were the case that a person who received 1 incorrect piece of information and 94 equally useful correct pieces of information were getting a bad deal.  Remember, this is all using assumptions that I would guess are not fair to Wikipedia; in addition to those stated above, I would expect that while Wikipedia is currently 25 times as big as Britannica, this number is not weighted by usefulness and that if one were to weight it by usefulness (as one should do, but is hard to do quantitatively without knowing things like aggregated browsing history) it'd be much larger--possibly into the hundreds.
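To check the arithmetic, here is a short Python sketch of the comparison.  The function names are my own, the inputs are the assumed figures from the paragraph above (I=25 and P=.99 for Wikipedia, I=1 and P=1 for Britannica, X=1), and Y is again a positive cost.

```python
def usefulness(I, P, X, Y):
    """Usefulness in UN: I*(P*X - (1-P)*Y), with Y a positive cost."""
    return I * (P * X - (1 - P) * Y)


def break_even_cost(I, P, X, target):
    """Solve I*(P*X - (1-P)*Y) = target for Y (requires P < 1)."""
    return (P * X - target / I) / (1 - P)


# Assumed inputs from the text, with X normalized to 1 and Y = 3.
wikipedia = usefulness(I=25, P=0.99, X=1, Y=3)
britannica = usefulness(I=1, P=1.0, X=1, Y=3)

# The cost of an error at which Wikipedia would only tie Britannica.
y_break_even = break_even_cost(I=25, P=0.99, X=1, target=britannica)

print(wikipedia, britannica, y_break_even)
```

Up to floating-point rounding, this gives about 24 UN for Wikipedia, 1 UN for Britannica, and a break-even Y of about 95, matching the numbers above.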

But this is all something that should be intuitively obvious for someone not biased by the prudishness of tradition--if you actually want to know something, nothing compares to Wikipedia.  I recently wanted to get a sense of Colorado senator Michael Bennet; campaign websites for both him and his opponent would obviously be biased (and short on facts), and Britannica didn't even have an article on him.  Sure, it's possible that the Wikipedia article misspelled something, but if I had opted for other resources I wouldn't have learned what his stances on major issues were.  Wikipedia is the best, most useful resource there ever has been.


*: What I mean by this is just that we care more that it correctly states a US Senator's political party--an often desired fact--than that it correctly states the year that the Canon PowerShot A470 was first made.  If you want to define this mathematically, just weight each piece of information by the product of how often it is desired and how important it is that it is correctly known, and have the whole set of usefulness normalized to 1.  I'll follow this convention throughout the article.

Thursday, December 23, 2010

Why Marriage Shouldn't Be a Legal Term

I wrote a post a few days ago questioning why marriage is a legal, and not personal, term.  I'd like to expand on that with a list of reasons why we would be better without (official) marriage.

First, it would end thorny questions of what, exactly, could constitute a marriage.  Right now, a marriage consists of a man and a woman.  This is clearly biased against homosexuals; legalizing gay marriage would solve this problem.  It doesn't, however, solve the more general problem of the government deciding what a marriage is.  Even with gay marriage, it would be defined as being between two people.  Why not more?  Whatever you think of polyamorous relationships, why should the government be deciding this for you?

Furthermore, why should the government be splitting all monogamous couples up into two categories: married and not married?  Why does there need to be a stark, legal distinction whereby half of all relationships are given no legal standing, and the other half are given a ton of it?  This encourages a number of potentially unhealthy decisions.  First, it encourages early marriage: there are legal benefits to being married that can't be gained from a non-marriage relationship--even a long-term one--thus encouraging young people to lock themselves into a potentially life-long decision before they otherwise would.  Second, it causes people to divide up relationships into two categories in terms of longevity: marriage, which is permanent, and relationships, which aren't.  But what if a couple wanted something in between?  In the current society it's considered odd, and thus implicitly discouraged, to have a long-term relationship that isn't bound by marriage, meaning that many couples are forced to either permanently tie themselves to a relationship that they're not sure they want to be in for the rest of their lives, or to end a relationship.  And while it's true that a couple could get married and then later divorce, the stigma associated with divorces makes this, too, an unattractive option.

Finally, making marriage legal causes a couple to be treated, in many ways--and particularly economically--as a single entity.  This, both directly and indirectly, is responsible for a huge part of the wage gap between women and men. It's legally reinforced that couples' income is treated together, meaning there is legal backing to the notion that women need not work as long as their husbands can make a living; this, possibly above all other reasons, is why women earn so much less than men, and work so much less frequently.

In short, marriage is an arbitrary distinction that divides all romantic relationships in this country into two categories, creates incentives for couples to choose one of the two categories--whether or not it's right for them--and comes with a whole host of repercussions, including discrimination, early marriages and divorces, and gender inequality.

Tuesday, December 21, 2010

On Marriage

William Saletan wrote a post earlier today about how the military should treat straight domestic partners, polyamorous couples, or incestuous couples in light of the DADT repeal.  As usual, he raises a really good topic; but as usual, he stops short of saying anything of much use on it, opting instead for platitudes like "Is homosexuality about who you love or who you are? That debate, unresolved by the fight over DADT, will rage on."  He fails to bring up the central question of the military's treatment of romance, though: why does it care?  Why does it give partner benefits?  Questions like who should be able to visit a soldier in the hospital could be resolved by that soldier providing a list of who should see him in the case of injury--a list that wouldn't have to be specifically bound to marital status.  As for questions like tax breaks for couples and different treatment of income from individuals and couples--why do these distinctions exist?  Why does the state care who you're married to?  Why is marriage even a legal status at all, instead of a personal one?  Questions about regulating incest are different, but as for which types of unions we legally recognize--why do we recognize any?

Saturday, December 18, 2010

What Obama Should Have Said

"In a few days, an organization called Wikileaks will be releasing to the public a number of classified cables relating to America's involvement in the international community.  We welcome increased scrutiny of our government and attempts to increase transparency, and respect Wikileaks' right to free speech, even though they may be breaking numerous United States and international laws in doing so.  However, we believe that once American citizens have reviewed the documents in question they will come to the conclusion--as we have in a thorough review of the released documents--that they display a government that is efficiently executing its duties as a member of the international community, and that Mr. Assange's claims of fraud will be thoroughly debunked.  Instead, these documents contain many pieces of information which, while blameless, could significantly hurt America's diplomatic operations by revealing sensitive information regarding frank statements made in private by US diplomats and confidential reports on military preparedness.  We wish that, in the future, Mr. Assange would act less heedlessly in revealing confidential documents that endanger US national security but do not reveal significant transgressions by our government."