The year of the anti-model

Here’s how it used to work. You have a hypothesis, something you want to test. You go out, collect a mess of data, then start to build a model. The model is your key weapon for understanding the data. Is there a linear relationship? Fit a regression line. Does a particular variable have an impact on the results? Do a t-test and find out. The goal is to make your models clear, interpretable, and above all concise. We all know the more parameters you add to a model, the closer you can get it to match the data, whatever the data may be, so avoid the temptation to overfit at all costs. Overly complicated models tell you nothing.
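To make the overfitting point concrete, here is a minimal sketch (not part of the original argument, just an illustration). It fits polynomials of increasing degree to pure random noise and reports the training error. The function names, the 12 data points, and the seed are all arbitrary choices made for this example.

```python
import random

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (A^T A) c = A^T y."""
    n = degree + 1
    # Build the normal equations as an augmented matrix [A^T A | A^T y].
    rows = []
    for i in range(n):
        row = [sum(x ** (i + j) for x in xs) for j in range(n)]
        row.append(sum(y * x ** i for x, y in zip(xs, ys)))
        rows.append(row)
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, n + 1):
                rows[r][k] -= factor * rows[col][k]
    # Back substitution; coeffs[k] multiplies x**k.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return coeffs

def sum_squared_error(xs, ys, coeffs):
    """Sum of squared residuals of the polynomial on the given points."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * x ** k for k, c in enumerate(coeffs))
        total += (y - pred) ** 2
    return total

def training_errors(max_degree=5, n_points=12, seed=0):
    """Training error for degrees 0..max_degree on data that is pure noise."""
    rng = random.Random(seed)
    xs = [-1 + 2 * i / (n_points - 1) for i in range(n_points)]
    ys = [rng.gauss(0, 1) for _ in xs]  # no signal at all, only noise
    return [
        sum_squared_error(xs, ys, fit_polynomial(xs, ys, d))
        for d in range(max_degree + 1)
    ]

if __name__ == "__main__":
    for degree, err in enumerate(training_errors()):
        print(f"degree {degree}: training SSE = {err:.4f}")
```

The training error can only go down as the degree goes up, even though there is nothing in the data to find. That is the sense in which a model with enough parameters will match the data "whatever the data may be".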

Stick to the process above, and you can claim that your results show not just tendencies and correlations, but meaning. The models, properly tested and fit, offer understanding. Through the use of math and inductive logic, we are able to separate the world into signal and noise, “systematic” trends and “random” variation. Once complete, we know what we know (Gremlins are 87% more evil if you feed them after midnight), and we also know what we don’t know (23% of evil behavior in gremlins can’t be explained by violations of the three rules). As an added bonus, we get bounds for how well we know what we know, and how little we know about what we don’t know.

Models can be incredibly powerful tools, but perhaps their least understood property is how well they fool us into believing that fitting a line through points is the same thing as understanding an underlying process.

In 2011 I’m going beyond the model. Instead of understanding, I’ll be striving for accuracy of prediction, or to optimize some profit/loss function related to the accuracy of prediction. Instead of trying to part the world into signal and noise (the part that can be understood, and the part that must be dealt with as inevitable “error”), I’m going to design a system that treats signal and noise as all one and the same. Instead of using math and algorithms to extract meaning, I’ll be using these tools to decrease the informational entropy of a stream of data. Data will be treated like a dense, tangled and interconnected forest, an entire ecology of information that cannot be split apart, and can only be “understood” by non-deterministic, evolutionary models which grow in complexity and inscrutability as quickly as their real-world counterparts. In my most well-read (and controversial!) post of 2010, I argued that Occam’s razor was the dumbest argument smart people made. In 2011, I’ll try to demonstrate the power of leaving behind the “simple is better” mentality once and for all.

5 comments

  1. Good luck with this. The great thing about statisticians is we are almost all trained in two areas.

  2. heya! your 2011 credo put me in mind of a computational mechanics paper (cited below) on parsing stochasticity, determinism and entropic information loss from a data stream. the santa fe institute folks do lots of rad stuff around complexity.

    enjoy!

    Crutchfield, J. P. and Feldman, D. P. (2003). Regularities unseen, randomness observed: Levels of entropy convergence. Chaos, 13(1):25–54. Santa Fe Institute Working Paper.

  3. @Alexis:
    Thanks for the reference; I just downloaded the paper. I’ve become more and more convinced that reducing informational entropy, even at the cost of using “lossy” algorithms, is the very heart of statistics and, in many ways, the success of the human mind.

    As I begin to dive into my 2011 “credo” it’s starting to look crazy ambitious. I just noticed that Wired has an interesting article about AI related to the idea of aiming to predict rather than to understand:

    http://www.wired.com/magazine/2010/12/ff_ai_flashtrading/

    It’s clear to me that whatever my personal success at this, the overall direction is unmistakable: the major advances ahead lie not in creating simple, explainable models, but rather in creating “black boxes” that work, in ways we may never be able to explain or even fully understand (check out Wired’s description of how robots store and retrieve warehouse goods http://www.wired.com/magazine/2010/12/ff_ai_essay_airevolution).

    Before long our tools will surpass, in complexity, the ability of our brains to understand how they work. The power of this approach (and the unintended consequences) will only grow over time.

  4. hey matt,

    have you listened to kevin kelly or read his book “what technology wants”? he has a *lot* to say about many subjects (i don’t always agree with him, especially where i feel he punts on social issues). some of what he says that i find provoking is the idea that “intelligence” ought not to mean “behaves like a human” (with brain in or out of a vat). so he describes calculators, things that simply do arithmetic much faster and more accurately than we can (in general), as having a particular *kind* of intelligence.

    what you write about designing black boxes hits a similar note for me: what happens if we think of intelligences as ways of processing and manipulating data streams, possibly also manipulating real objects. then there are lots of ways we may design intelligences to augment our already cybernetic existence.

    are you also tracking donna haraway (“simians, cyborgs, women and the reinvention of nature” in particular)? we have already been making use of black boxes (e.g. “seeing” by means of scanning electron microscopes, etc.)

    toodles!
    alexis

  5. Hey Alexis,

    I liked Kelly’s book, though I’m way less optimistic than he is, mostly because our future is determined not just by what technology “wants”, but by what powerful institutions want.

    “what happens if we think of intelligences as ways of processing and manipulating data streams, possibly also manipulating real objects. ”

    Yes. Yes. Yes.

    The system is the intelligence.

    I hadn’t heard of Donna Haraway; I’ll check her out. There’s a decent book on animal intelligence by Jeremy Narby (whose first book, BTW, has some stunning accounts that remind me of E.T. Jaynes’s discussion of what to do with strong data from a psychic experiment in the chapter “Queer Uses for probability theory”. The kind of stuff that, if true, changes everything.)