Tuesday, December 21, 2010

Hacking Integrity

Hans Christian Andersen's "The Emperor's New Clothes" is a simple morality tale about speaking out against hypocrisy and pretension. For those of us who see ourselves as the little child, shouting out that the emperor does indeed have no clothes, there is another side to the story we should consider.

What if the child was a shill, planted by the merchants to embarrass the emperor? Is the child still a hero? Or should he be despised as a member of the criminal conspiracy against the throne?

When exposing hypocrisy, the integrity of the person who points out the nakedness of the emperor is central to the exercise. Consider these three examples:

My first example is the anonymous crook who makes a living helping college students cheat on their term papers and PhD theses. His self-important expose of how easy it is for students to pass off the work of another person is worthless drivel. Because he shouts from the shadows that the emperor has no clothes, and because he engineered the fraud in the first place, his story has no merit.

My second example is the professor of medical education who submitted a fake paper to a conference on Integrative Medicine, described in this post on Respectful Insolence. The fake paper described a new form of reflexology involving the buttocks, and was reportedly accepted for presentation at the conference. While most of us recognize that reflexology is bunk, this story contributes nothing of value to the debate. The only thing we learn is that in a discipline where the integrity of the researcher is essential, a compromised researcher can successfully commit fraud. This does not diminish those who were victims of the fraud, and does not disprove anything else in the domain. It only speaks to the lack of integrity of the one who perpetrated the fraud.

My final example is Wikileaks, the complete disrobing of diplomats world-wide. Julian Assange is not a hero, nor is the American who delivered the leaks to him. Whether good will come out of this episode remains to be seen. What we know today is that a man with no integrity publicly exposed hundreds of people who were just doing their jobs, and then shouted from the public square, "the diplomats have no clothes".

Hypocrisy and pretension should be denounced. But those who would do the denouncing should consider their methods, lest they sacrifice their own integrity.

Monday, December 20, 2010

St. Petersburg, Part 2

This post is the second in a planned series on the St. Petersburg Paradox. Start reading from the first post.

In the first post I asked, "How much would you pay for a 1 in a billion chance to earn $1 billion?"


In this post, I explore the question, how much would you charge for a 1 in a billion chance to owe $1 billion?

Details below the line:

Sunday, December 19, 2010

The scientific method is Pro-Life

Randall Munroe is at his best again.

 "This world is amazing, and I'm going to live to experience more of it thanks to people who refused to gracefully accept the ineffability of reality. I find my courage where I can, but I take my weapons from science."

How to use ngrams, part 1

The new ngrams tool from Google Labs is awesome. But many of the examples floating around the blog world are flawed. I am not a linguist, I just like data. The Lousy Linguist blog has more stuff to say about the validity of the tool.

The purpose of this post is to help you avoid several basic mistakes in formulating queries or interpreting results. I haven't read the Google paper in Science, so I can't really speak to methods or sources.

What does an ngram measure? The number of occurrences of a word divided by the number of all words. Start with the ngram for "the" and "and", where you can observe that these common words are roughly 1/20 and 1/40 of all printed English words, respectively. Nearly any word will appear to flatline when scaled against "the". Consider the ngram for politics, paired with "the" and alone. At least right now, there appears to be no option to use a log scale.
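The ratio described above can be sketched in a few lines. The counts here are made-up assumptions (chosen so that "the" is about 1/20 and "and" about 1/40 of all words), not Google's data, and the function names are hypothetical. The sketch shows how little of a linear axis a rarer word occupies when plotted against "the", and what a log scale would show if you plotted the numbers yourself.

```python
import math

# Hypothetical counts for one year of a corpus; NOT real Google Books data.
TOTAL_WORDS = 1_000_000_000
COUNTS = {"the": 50_000_000, "and": 25_000_000, "politics": 40_000}


def relative_frequency(word, counts=COUNTS, total=TOTAL_WORDS):
    """Return occurrences of `word` divided by all words in the corpus."""
    return counts[word] / total


def share_of_axis(word, reference="the"):
    """Fraction of a linear y-axis the word reaches when the axis tops out at `reference`."""
    return relative_frequency(word) / relative_frequency(reference)


for w in COUNTS:
    f = relative_frequency(w)
    # log10 separates the words by orders of magnitude; the linear share does not.
    print(f"{w:10s} freq={f:.6f}  log10={math.log10(f):.2f}  share vs 'the'={share_of_axis(w):.4f}")
```

With these invented numbers, "politics" reaches less than a tenth of a percent of the height of "the" on a linear axis, which is why it looks flat.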

Looking for famous people? Query formation is critical. Consider this ngram for "Franklin D Roosevelt". It looks like his popularity is growing. Check again, this time including "Franklin Roosevelt". Of course, FDR was best known by his initials. For this reason, Ngrams isn't set up right now for an easy comparison of presidential popularity. OCR errors, of course, affect search results, particularly for early periods. Consider this ngram for FDR and fdr. Ngrams are, of course, case sensitive.
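One way to work around the query-formation problem, assuming you have the per-year numbers for each query string in hand, is to add the variants together yourself. This is only a sketch: the data and the `combine_variants` helper are hypothetical, and the counts are invented for illustration rather than taken from the Ngram tool.

```python
from collections import defaultdict

# Hypothetical occurrences per billion words, keyed by exact query string.
# These numbers are invented for illustration; they are NOT Ngram output.
SERIES = {
    "Franklin D Roosevelt": {1935: 120, 1945: 300},
    "Franklin Roosevelt": {1935: 200, 1945: 450},
    "FDR": {1935: 80, 1945: 600},
    "fdr": {1945: 5},  # lowercase is a separate query from "FDR"
}


def combine_variants(series, variants):
    """Sum per-year counts across query strings that name the same person."""
    totals = defaultdict(int)
    for variant in variants:
        # Lookups are case sensitive, so "FDR" and "fdr" are distinct keys.
        for year, count in series.get(variant, {}).items():
            totals[year] += count
    return dict(sorted(totals.items()))


roosevelt = combine_variants(
    SERIES, ["Franklin D Roosevelt", "Franklin Roosevelt", "FDR"]
)
print(roosevelt)  # {1935: 400, 1945: 1350}
```

Any single variant understates the total, and a variant with the wrong capitalization (for example "Fdr") matches nothing at all.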

Some conclusions can't be easily tested in ngrams. Is the rising popularity of "Friedman" due to "Milton Friedman" or something else? An exhaustive search of pairs would be difficult using the ngram browser.

Thursday, December 16, 2010

St. Petersburg, Part 1

This is the first post in a planned series on the St. Petersburg paradox. But don't follow the link just yet, unless you want to spoil the excitement. Note that I use the formulation given by Robert Martin, rather than the one in Wikipedia.

How much would you pay for a 1/2 chance to earn $2? (Ignore taxes and transaction costs)
How much would you pay for a 1/4 chance to earn $4?
How much would you pay for a 1/8 chance to earn $8?
How much would you pay for a 1/16 chance to earn $16?
....


How much would you pay for a 1/2^10 chance to earn $2^10? (2^10 is about 1,000)
How much would you pay for a 1/2^20 chance to earn $2^20? (2^20 is about 1 million)
How much would you pay for a 1/2^30 chance to earn $2^30? (2^30 is about 1.1 billion)
How much would you pay for a 1/2^40 chance to earn $2^40? (2^40 is about 1.1 trillion)


Consider each of the 40 similar bets from 2 to 2^40. Suppose you could take them all at the same time in a mutually exclusive game, such that the chance of winning nothing is 1/2^40. For example, the game could be a sequence of coin flips that ends at the first tails, paying $2^n when the nth flip is the first one to show tails, and paying $0 if all of the first 40 flips are heads.

Is the amount you would pay to play this game equal to the sum of the amounts you would pay for each of the individual games listed above? If not, why not?

Why would you not be willing to pay $40 to play this game?
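For readers who want to experiment, here is a minimal sketch of the 40-flip game described above. It computes the expected payout exactly (each of the 40 bets contributes $1, which is where the $40 figure comes from) and simulates single plays so you can compare the typical payout with that average. The function names, the seed, and the use of the median as a summary are my own assumptions for illustration.

```python
import random
from fractions import Fraction

MAX_FLIPS = 40  # the game stops after 40 flips


def expected_value(max_flips=MAX_FLIPS):
    """Exact expected payout: sum over n of P(first tails on flip n) * $2^n."""
    return sum(Fraction(1, 2**n) * 2**n for n in range(1, max_flips + 1))


def play_once(rng, max_flips=MAX_FLIPS):
    """Flip until the first tails; pay $2^n if that is flip n, else $0."""
    for n in range(1, max_flips + 1):
        if rng.random() < 0.5:  # treat this outcome as tails
            return 2**n
    return 0  # every flip came up heads


def median_payout(trials, seed=0):
    """Median payout over `trials` simulated plays, using a seeded generator."""
    rng = random.Random(seed)
    payouts = sorted(play_once(rng) for _ in range(trials))
    return payouts[len(payouts) // 2]


print(expected_value())  # 40
```

Calling `median_payout(10001)` gives a sense of what a typical play returns, which you can set against the $40 average when answering the question above.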

A variation follows below the fold...

Sunday, December 12, 2010

Confirmation Bias: Elusive Republican Scientists

The mystery of the missing Republican scientists is a nice study in confirmation bias. The anti-science Republican meme has become so pervasive that a 2009 study which appears to show that only 6% of scientists are Republican is accepted uncritically by media and bloggers alike. A more likely explanation is simply that the study design failed to survey a representative sample of the scientific community. Just because Republican scientists don't subscribe to Science magazine doesn't mean they don't exist.

Saturday, December 11, 2010

Scary Statistics Fail: No Dog Bite Epidemic

Media hype and official government studies to the contrary, there is no epidemic of hospitalizations resulting from dog bites. The data presented are consistent with a level rate of hospitalizations from 1997 to 2008. The fluctuations during 1994 to 1996 suggest that the growth over this time period may be an artifact of the source data and the statistical methods used to estimate national rates.