the new link test

This is the next fence that market research needs to jump in order to use online connections organically, as they actually work. Fundamental to sampling populations is the idea that you sample them randomly, and that you try not to resample the same person if you can avoid it. If you can't avoid it, you don't do it very often. Recruiters who keep little black books of contacts are tolerated unless the same people keep showing up – in other words, if they get greedy and careless. The best metaphor I know for this kind of sampling is pulling snooker balls out of a bag without looking.

The internet is all about human interconnectedness, which makes for problems if you are trying to port a random sampling methodology from conventional offline market research, whether quantitative or qualitative. The first 'successful' port was the panel survey, which could replicate offline surveys (sort of) worldwide, much quicker and initially at lower cost than face-to-face surveys. It worked because everyone turned a blind eye to the number of surveys a few people were doing, and to evidence that surveys were being filled in mechanically in return for cash or related incentives. The success and scale of online panels meant they grew fast and collapsed under their own weight, as samples were bought and sold like cattle and the same respondents started turning up to complete more than one survey. The point I am trying to make is that so-called random sampling has always been problematic on- or offline: a lot of the time respondents took far too many surveys, or provided bogus data.

And now we need to find a way to harness the natural flows of the internet: people who actually know each other. If I take a survey and enjoy it (shock horror), then I might forward the link to a friend or family member. I may add a note giving an indication of my scores and asking how theirs compare. This is meltdown for random sampling. But interestingly, companies like Toluna are tolerating it, not only because you get a lot more surveys completed but also because you get a higher quality of completion. There's no point in sharing fraudulent survey results with friends and family, is there?

The link test challenge is to get past the problem that the data is fatally compromised because people know one another. But it isn't insuperable. The danger with the purists is that they have ringfenced a mechanism which is probably the only way research can be made to work on the internet. So how do we do it?

1. Degrees of separation. If a questionnaire passes through six people, then as long as A and C don't have a separate relationship with each other – they are connected only because both of them know B – could we have confidence in questionnaires A and C? Could we sample B and D in much the same way, so we don't have to lose B's data; we just use degrees of separation. It works for jobs – you don't hear about jobs from your friends; the best source is friends of friends, a much wider group who aren't contaminated by knowing you in the first place!
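The degrees-of-separation idea above can be sketched as a simple hop count over a friendship graph. This is a minimal illustration, not a proposed implementation: the graph, the names A/B/C and the acceptance threshold of two hops are all hypothetical.

```python
# Sketch: keep the pair of responses (referrer, referee) only if the two
# people are at least two hops apart -- friends of friends, not friends.
from collections import deque

def degrees_of_separation(graph, a, b):
    """Breadth-first search: shortest hop count from a to b, or None."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == b:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no path at all

# Toy friendship graph: A knows B, B knows C; A and C meet only through B.
friends = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B"],
}

print(degrees_of_separation(friends, "A", "C"))  # 2 -- friends of friends
# Accept the pair (A, C) for analysis only when separation >= 2.
```

The same check generalises to B and D, so no link in the chain has to be thrown away wholesale.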

2. Coefficient of connectedness. All this means is that we need to check that if A passes to B, and B to C, then C doesn't know A. Though we would probably have to ensure that we didn't end up sampling a bunch of loners: we need an even distribution of connectors/sneezers, whatever you want to call them, and, shall we say, the ahem, less connected.
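The "C doesn't know A" check above amounts to ruling out triangles in the pass-along chain. A minimal sketch, assuming a hypothetical friendship lookup; the names and sets here are invented for illustration:

```python
# Sketch: before accepting a chain A -> B -> C, verify that no two
# non-adjacent members of the chain also know each other directly
# (i.e. no closed triangle A-B-C-A contaminating the sample).
def chain_is_clean(friends, chain):
    """True if no two non-adjacent members of the chain know each other."""
    for i in range(len(chain)):
        for j in range(i + 2, len(chain)):  # skip adjacent (passer) pairs
            if chain[j] in friends.get(chain[i], ()):
                return False
    return True

friends = {
    "A": {"B", "C"},   # A also knows C directly -- a triangle
    "B": {"A", "C"},
    "C": {"A", "B"},
}
print(chain_is_clean(friends, ["A", "B", "C"]))  # False: C knows A

friends["A"].discard("C")
friends["C"].discard("A")
print(chain_is_clean(friends, ["A", "B", "C"]))  # True: only B links them
```

Counting how often chains fail this test would give one crude coefficient of how connected (or loner-ish) a recruited network is.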

3. Modulation. This is where things start to get much more interesting. We keep the string A–B–C–D, but we look for how the scores modulate as the surveys move along it – in other words, we start to monitor the degree of bias, and to develop coefficients for particular subjects or cultures which we can model. So it is less the opinions themselves than their wave motion or modulation that we monitor.
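One way to picture monitoring modulation along the chain: track the per-hop drift in scores rather than the scores themselves. The chain, the 0–10 scale and the numbers below are all hypothetical, purely to show the shape of the calculation.

```python
# Sketch: keep the whole chain A-B-C-D but watch how scores drift
# ("modulate") per hop, instead of discarding linked respondents.
def per_hop_drift(scores):
    """Differences between consecutive scores along the chain,
    rounded to avoid floating-point noise."""
    return [round(b - a, 2) for a, b in zip(scores, scores[1:])]

# Hypothetical 0-10 brand scores as one survey travels down a chain.
chain_scores = [7.0, 7.5, 8.2, 8.9]
drift = per_hop_drift(chain_scores)
print(drift)   # [0.5, 0.7, 0.7]

# A chain-level coefficient: mean upward bias introduced per hop.
bias = round(sum(drift) / len(drift), 2)
print(bias)    # 0.63
```

Averaged over many chains on the same subject, a coefficient like `bias` is the sort of thing that could be modelled per topic or culture.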

4. Reverb/acoustic model. One of the obscure achievements of digital culture is the way that a sound can be reproduced and modelled for just about any acoustic environment. You just need a fast computer and a little patience. You can sample the echo of the Sistine Chapel and apply it to your recording of Chopsticks on the piano, so you can hear what it might sound like if you were playing for the Pope. It's called convolution. Now if we were to look at the shockwaves of data moving through groups, then we could take account of the differences between people. Better still, we could fill in missing bits. Instead of regarding gaps in the network as faults, we could virtualise them. In conventional sampling we want every member of the sample, in toto, to make up a small-scale copy of the entire population. But a convolved sample would have lots of gaps in it, for which we would model a virtualised response: if your best friend's mate had answered the survey, what would he have put, based on the convolved reverb echo emerging from those we did sample?
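A toy version of filling in the gaps: estimate each unsampled person's answer from their sampled neighbours, a one-step graph smoothing that is only loosely analogous to true audio convolution. The graph and answers are invented for illustration.

```python
# Sketch: treat gaps in the network like missing acoustics and "fill
# them in" from neighbours who did answer -- virtualised responses.
def impute_missing(friends, responses):
    """Estimate each unsampled person's answer as the mean of their
    sampled neighbours' answers (None if no neighbour was sampled)."""
    imputed = dict(responses)
    for person, circle in friends.items():
        if person in responses:
            continue
        known = [responses[n] for n in circle if n in responses]
        imputed[person] = sum(known) / len(known) if known else None
    return imputed

friends = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
answers = {"A": 6.0, "B": 8.0}           # C and D never took the survey
print(impute_missing(friends, answers))  # C imputed as 7.0; D stays None
```

A real reverb model would propagate the echo further than one hop and weight it by similarity, but the principle – gaps as something to virtualise rather than faults – is the same.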

This last approach would start to address the problem afflicting most quant and qual online research – research communities specifically – where only a minority are participating. We need to be able to establish the views of those who are less articulate, or who post less often, or who have less extreme views. Or who don't post because, when they look, they find they have already been spoken for but don't have the facility or motivation to click an 'I think this too' button.

5. The constellation/galactic model. Random sampling was never that random. We never went that far from the shore. We were always looking at heavy, medium and light bottle washers. We instinctively avoided those who found bottle washing boring, or who hadn't washed a bottle since 1995, or who couldn't spell the word bottle, so we mistrusted their intelligence and opinions. The internet illuminates that we are not all the same – that there are hundreds of thousands of semi-private worlds. Yes, we have language to teleport between these worlds and aggregate them, but when I consider the artefacts on the internet I am struck more by the differences. So how about census-style research where we sample the entire population at homeopathic levels, representing how dispersed our communities really are? This is perhaps the diametric opposite of links, but one of the interesting aspects of the internet is that these diverse communities are much less digitally separate than they are in the offline world. I am a click away from hitting a Twitter trend and entering a microworld where the death of Michael Jackson really is an unrecoverable tragedy. So it should be possible to conduct small-scale surveys where we maximise the differences within social networks and on their boundaries. Most of our universe is empty space. So are our social spaces. But that doesn't make them empty or silent.
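"Maximise the differences" can be sketched as a farthest-point pick: choose a tiny sample whose members are as spread out as possible across attitude space, rather than one that mimics the average. The two-dimensional attitude scores below are entirely hypothetical.

```python
# Sketch: a "homeopathic" sample -- greedily pick k respondents, each as
# far as possible (squared distance) from everyone already chosen, so
# the sample spans the constellation rather than clustering at its centre.
def farthest_point_sample(points, k):
    """Greedy max-min selection of k points, seeded with the first."""
    chosen = [points[0]]
    while len(chosen) < k:
        def min_dist(p):
            return min((p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2
                       for c in chosen)
        chosen.append(max((p for p in points if p not in chosen),
                          key=min_dist))
    return chosen

# Hypothetical scores on two 0-10 attitude questions per respondent.
worlds = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 8), (5, 5)]
print(farthest_point_sample(worlds, 3))  # [(1, 1), (9, 9), (5, 5)]
```

Three picks land in three different "worlds" – the opposite instinct to conventional sampling, which would pile up respondents near the populous corner.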

I would welcome any thoughts, critiques or suggestions on this. I am putting together a workshop called the Cloud of Knowing in the autumn, when I can assemble a group of researchers and client observers to start hammering some of these issues out. If you're interested in this topic then get in touch.


