Tom Taverner introduced me to Benford's Law as we were eating lunch together at a statistical computing conference: If you look at the first digits of data in many naturally occurring datasets, a startling 30 percent of them are ones. "Pah!" I said. "That defies intuition! Why would one digit occur any more than another? I'd expect each digit to occur with about equal frequency--1/9 of the time. Why isn't the probability 11 percent? Eh?" A hip, bespectacled biostatistician nearby joined the fray with the help of his iPhone. He was also skeptical, but he looked at the distribution of leading digits in a fancy gene database he was analyzing, and indeed 1 occurred about 30 percent of the time, with 2, 3, 4, and so on occurring with decreasing frequency. You prove me wrong so good, statistics!

Benford's Law, generally, states that the probability of the first digit *d* in base *b* is:

$$P(d) = \log_b\left(1 + \frac{1}{d}\right), \qquad d = 1, \ldots, b - 1$$

In base 10, this turns out to give a 30 percent chance for starting with 1, 18 percent for starting with 2, and so on. It even has wide applicability outside of entertaining lunch conversations--including fraud detection and computer disk space allocation. Several clever folks in the R world have recently used Benford to assess whether data is actually naturally occurring: Drew Conway decided there was not strong evidence of numerical tampering with the WikiLeaks Afghanistan War Logs, and Diego Valle discussed problems in homicide reporting by the Mexican government. Rattle, a graphical interface for R, has a function to overlay plots of leading digits in base 10 of different subsets of the data to evaluate where funny business may be occurring in a dataset; also, as I found out from his comment below, Kevin Wright has posted some R distribution and plot functions for Benford's law on the R wiki.
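To make those percentages concrete, here is a minimal sketch in Python (not the R functions mentioned in this post) of the first-digit probabilities, log_b(1 + 1/d), alongside a leading-digit tally of some sample data. The function names `benford_pmf`, `leading_digit`, and `leading_digit_freqs` are made up for this illustration, and the sample data (the first 1000 powers of 2) is just a convenient sequence that tends to track the law closely, not a real-world dataset.

```python
import math
from collections import Counter


def benford_pmf(base=10):
    """Return {d: P(first digit = d)} for d = 1 .. base-1, using log_b(1 + 1/d)."""
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}


def leading_digit(x, base=10):
    """First significant digit of a positive integer x in the given base."""
    while x >= base:
        x //= base  # drop the last digit until only the first one is left
    return x


def leading_digit_freqs(values, base=10):
    """Observed relative frequency of each leading digit in `values`."""
    counts = Counter(leading_digit(v, base) for v in values)
    n = len(values)
    return {d: counts.get(d, 0) / n for d in range(1, base)}


# Illustrative data only (an assumption of this sketch): 2**0 .. 2**999
powers = [2 ** k for k in range(1000)]
expected = benford_pmf()
observed = leading_digit_freqs(powers)

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

Comparing the two columns by eye is roughly what the overlay plots do; a large gap for some subset of the data is only a hint that something deserves a closer look, not proof of tampering.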

Does Benford's law seem unintuitive? Well, maybe it's because kids these days are just too darn LAZY to look at a good book of logarithms like we did in the good old days! (They're also too lazy to walk to school barefoot over barbed wire uphill both ways in the snow.) In a table of logarithms, you look up the first several digits of a number to find its logarithm. You can then add a whole number to that logarithm to account for greater powers of ten, or add the logs of two numbers to get the log of their product; finally, you look up that result in a table that converts back to familiar numbers. It's a lot easier to add numbers than multiply them by hand, so this was a huge time-saver.
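For anyone who has never touched a log table, the procedure can be sketched in a few lines of Python, with `math.log10` standing in for the table lookup. The helper names `multiply_with_logs` and `split_log` are hypothetical, chosen only for this example.

```python
import math


def multiply_with_logs(a, b):
    """Multiply two positive numbers the log-table way: add logs, then take the antilog."""
    log_sum = math.log10(a) + math.log10(b)  # look up both logs and add them
    return 10 ** log_sum                     # look up the antilog of the sum


def split_log(x):
    """Split log10(x) into its whole-number part (the power of ten)
    and its fractional part (the part a table actually lists)."""
    log_x = math.log10(x)
    power_of_ten = math.floor(log_x)
    fractional_part = log_x - power_of_ten
    return power_of_ten, fractional_part


print(multiply_with_logs(123, 456))  # very close to 123 * 456, up to rounding
print(split_log(2))
print(split_log(2000))               # same fractional part as for 2, larger power of ten
```

Because 2 and 2000 share a fractional part, they share a table entry, which is why the table is organized by leading digits in the first place.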

Not only does using logarithm books build character, but, in the words of 19th-century science fiction author and astronomer Simon Newcomb, you can discover important scientific laws: "That the ten digits do not occur with equal frequency must be evident to any one making much use of logarithmic tables, and noticing how much faster the first pages wear out than the last ones." In other words, people have simply been looking up more numbers that begin with the smaller digits, so those pages get dirty faster. Now you know what you've been missing, you technology-dependent slacker!

Newcomb continues his "Note on the Frequency of Use of the Different Digits in Natural Numbers" to precisely state the law, which was later independently discovered, popularized, and demonstrated in the 1930s on all sorts of different kinds of data by a bright Schenectadian, Frank Benford. As an illustration of the similar paths that human minds can follow, Benford *also* was inspired by logarithm books, citing in his "The Law of Anomalous Numbers" how "the logarithms of the low numbers 1 and 2 are apt to be more stained and frayed by use than those of the higher numbers 8 and 9."

So next time you spill coffee on or drool all over a library book, don't feel guilty--your detritus may just inspire scientists of future generations.

**Coming up next:** My functions for Benford analysis in R, and using them to look at baby names, MA property values, dinosaur bone lengths, traffic data, and Congressional lobbyists!

*Benford's Law, R*

Kevin Wright

August 24, 2010

I've already created a set of Benford functions.

http://rwiki.sciviews.org/doku.php?id=tips:stats-distri:benford

Kevin Wright

Ethan Brown

August 24, 2010

Thanks for the link! I've linked to your functions in my post. I like your idea of creating full distribution functions with the 'd', 'p', 'r', 'q'--I'll add that full suite to my code. I'm interested in exploring Benford's law in different bases (i.e. other than base 10), so my distribution functions will be similar to yours just with a "base" parameter passed in. My plot function will be a little different, since I need to find the first digit numerically.

Best,

Ethan
