Thursday, 24 January 2013

Beauty vs Truth : A Case Study

It started, innocently enough, with a picture...

And a comment...
"Why couldn't all of you just stand in this order : \/ \/ /\ \/ /\ \/ \/
Instead of : \/ \/ \/ /\ \/ \/ /\

The lack of symmetry disturbs me to no end [sic], I am forced to assume that the arrangement is
\/ \/ \/ /\ \/ \/ \/"

Yes, the lack of symmetry is rather vexing, now that it's been pointed out. But as good as my calves look in a dress, surely there must be some simpler way to resolve this travesty...

But despair not, there is hope yet! We can always turn to the Data Analytics field to save us. Btw, 'Data Analytics' (often used interchangeably with Data Mining or Business Intelligence) is something that is pretty much indispensable for businesses these days and might soon become a crucial skill-set for individuals as well. No need to be intimidated though, it's a relatively simple concept:
"Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making."

Let's get to it then. The first thing to do when approaching a problem is to form a hypothesis. Here, obviously, the hypothesis we'll go with is:
"All images must contain symmetry".
Let's take another look at our data set :

Now that simply won't do. Our hypothesis is too vague, and clearly mistaken. Not only are the people not gender-symmetric, they're not arranged by height, weight, skin colour, muscle tone, style of attire or even how wide their smiles are. No matter, we can simply ignore these factors as not being relevant and concentrate on a single aspect. In this case our new hypothesis reads:
 "All images of people must be composed in such a way that their arrangement contains gender symmetry". 
By the way, it hardly matters that we've now zeroed in rather arbitrarily on a relatively unimportant aspect of the picture. At least we're making good progress. We'll have this hypothesis proven yet, no sweat!
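For the code-minded, here is a minimal sketch of what "gender symmetry" means for a row of people: the row must read the same from either end. The names `V`, `CARET` and `is_symmetric` are made up for illustration; the three rows are the ones from the commenter's notation.

```python
# Hypothetical encoding of the commenter's marks, one per person, left to right.
V = "\\/"      # drawn as \/ in the comment
CARET = "/\\"  # drawn as /\ in the comment

def is_symmetric(row):
    """True when the row reads the same from either end."""
    return row == row[::-1]

as_photographed = [V, V, V, CARET, V, V, CARET]
as_requested = [V, V, CARET, V, CARET, V, V]
as_assumed = [V, V, V, CARET, V, V, V]

for name, row in [("photographed", as_photographed),
                  ("requested", as_requested),
                  ("assumed", as_assumed)]:
    print(name, " ".join(row), is_symmetric(row))
# photographed \/ \/ \/ /\ \/ \/ /\ False
# requested \/ \/ /\ \/ /\ \/ \/ True
# assumed \/ \/ \/ /\ \/ \/ \/ True
```

Only the picture as actually taken fails the check, which is exactly what is bothering our commenter.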

Next we need to set some ground rules. This is important because without any constraints we'd just run mad with power and start turning elephants into molehills and putting horses on flying pigs! So we shall have some rules and we shall christen them *hushed tones* The Methodology...

So with our sample data, for example, we could just rearrange the people, but as the original commenter points out, that would be illegal because :

"...since the picture is already taken, you need least effort transformations (in your head) to enforce symmetry... Rearranging people requires more effort than a single gender transformation..."

Well, he's right, but it's not very helpful because "least head effort" is a rather vague term. For the sake of argument, let's quantify it and say that the easier it is to Photoshop something, the easier it is to imagine. By this scale, our commenter is proven correct: my lazy Photoshopping aside, it would be harder to swap people around to enforce symmetry than it would be to apply gender transformations to the stud in the middle and the belle on the left.

And again, one gender transformation is less expensive than two.
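If we take "effort" to mean simply the number of gender transformations (an assumption made here for illustration, not something the commenter spelled out), the cheapest fix can be counted directly: every mirror pair that disagrees needs exactly one flip. A rough sketch, with the hypothetical helpers `min_flips` and `flip`:

```python
V = "\\/"      # drawn as \/ in the comment
CARET = "/\\"  # drawn as /\ in the comment

def min_flips(row):
    """Count mirror pairs that disagree; one flip per such pair is enough."""
    n = len(row)
    return sum(1 for i in range(n // 2) if row[i] != row[n - 1 - i])

def flip(row, positions):
    """Return a copy of the row with the people at these 0-based positions switched."""
    swap = {V: CARET, CARET: V}
    return [swap[p] if i in positions else p for i, p in enumerate(row)]

as_photographed = [V, V, V, CARET, V, V, CARET]
print(min_flips(as_photographed))  # 1

# The commenter's fix: one flip, at the right-hand end.
one_flip = flip(as_photographed, {6})
# The alternative above: two flips, the far left and the middle.
two_flips = flip(as_photographed, {0, 3})

print(one_flip == one_flip[::-1], two_flips == two_flips[::-1])  # True True
```

Both routes end in a symmetric row, but only the first is a least-effort one by this count.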

But again, it seems our original commenter hasn't really considered every option. After all, with Photoshop as the reference, there's a much easier solution. All we'd have to do to enforce gender-symmetry is crop the picture at the ends...

Or if you want a cleaner image without people's shoulders sticking out at the edge of the frame, we could crop another one which would leave us with :

Quick side note : In data analysis the cropping we just did is called 'Removing Outliers' or (more pejoratively) 'Cherry Picking of Data'. This is an important tool in *hushed tones* The Methodology because sometimes data points aren't really relevant to your experiment. So for example, if you're doing a study on the effects of smoking on life expectancy and one guy is a hundred and two and still puffing, that'd be a huge finding! But then if it turns out he has a secret lab that's cloning replacement lungs for him every few years, you can safely ditch him as a data point. However, not all things are so clear-cut, and there's a very fine line that separates "Removing Outliers" from "Cherry Picking".
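The cropping trick can be sketched in a few lines too. This assumes "crop the picture at the ends" means dropping one person from each end at a time, which is my reading of it; the function name `crop_until_symmetric` is made up.

```python
V = "\\/"      # drawn as \/ in the comment
CARET = "/\\"  # drawn as /\ in the comment

def crop_until_symmetric(row):
    """Drop one person from each end until what is left is symmetric.

    Returns (people dropped per side, remaining row). Always terminates,
    because a row of one or zero people is trivially symmetric.
    """
    k = 0
    while True:
        kept = row[k:len(row) - k]
        if kept == kept[::-1]:
            return k, kept
        k += 1

as_photographed = [V, V, V, CARET, V, V, CARET]

dropped, kept = crop_until_symmetric(as_photographed)
print(dropped, " ".join(kept))   # 1 \/ \/ /\ \/ \/

# Cropping one more from each end keeps the symmetry.
tighter = as_photographed[2:5]
print(" ".join(tighter), tighter == tighter[::-1])   # \/ /\ \/ True
```

Notice that the loop cannot fail: keep throwing away data points and the hypothesis is guaranteed to hold eventually, which is rather the point of the cherry-picking warning.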

And there we have it, our hypothesis is now quite firmly proven and no one's junk had to be re-wired or anything! *Wild Cheering* But before we pat ourselves on the back we must ask, what exactly have we accomplished here? We had a hypothesis and we proved it, but we haven't exactly learned anything new. Also, typically, after a hypothesis is proven, you can use it as a base for other hypotheses, but here there's really nothing useful that you could extrapolate. So the dirty secret is that really all we did was twist data to suit theories, rather than theories to suit reality.

More generally speaking, in our everyday lives as well, it would be beautiful if things just always worked as you think they should. But sadly that girl at the bar won't magically realize you've been standing there for the last hour, hopelessly slack-jawed. Likewise, you don't just trip over success while out for a walk, and things like good health appear to require conscious effort. Ultimately one must give the truth its due and hope for the best rather than turn the data into a kind of Rorschach blot on which to imprint one's expectations. And who knows, you might find that reality has a beauty all of its own... :)
