# Is social media related to listening figures?

### By James Cridland for media.info

Posted 23 June 2015, 6.50am EDT

Glyn Roylance asked whether TuneIn followers were related to audience figures.

media.info contains full topline RAJAR data, as well as TuneIn follower figures for many stations.

So, an hour or so in a coffee shop gives this graph of radio listening figures vs TuneIn figures. You can zoom in and out to see smaller stations if you like.

Are they related? Well, I don't really think so, looking at that graph. But the Pearson's coefficient, calculated below, seems to show a strong correlation. So, I've also added a rough ready reckoner to help understand how many listeners you might have, based on your TuneIn followers.
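For anyone wanting to try this at home, here is a minimal sketch of how a Pearson coefficient and a crude ready reckoner might be computed. The figures are invented for illustration and are not media.info data, and the listeners-per-follower ratio is only one assumed way of building such a reckoner, not necessarily the one used on this page.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up example figures (assumption): TuneIn followers and weekly reach
followers = [1_200, 5_400, 20_000, 48_000, 150_000]
reach = [40_000, 210_000, 600_000, 1_900_000, 5_500_000]

r = pearson_r(followers, reach)

# Crude ready reckoner (assumption): average listeners per follower in the sample
listeners_per_follower = sum(reach) / sum(followers)

print(f"r = {r:.3f}; roughly {listeners_per_follower:.0f} listeners per follower")
```

A coefficient near 1 says the points sit close to a straight line; it says nothing about why.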

Additionally, I've done the same work for radio listening figures vs Twitter, and radio listening figures vs Facebook. The correlation is rather less pronounced. But then, a listener following a brand on a social network has precious little in common with audience figures...

## Comments

### Shankar Meembat (2 years, 10 months ago)

Interesting stuff and I disagree with "the above graph is a silly thing to do".

It may be a stretch to use this for estimating listener figures based on TuneIn followers, especially given that at the lower end, correlation decreases.

But there are still some good insights to take away - and the main one for me is the stronger correlation on the larger stations. In order to follow on TuneIn, the person should have at least visited the station's page there once.

And the fact that the correlation is higher and the multiplier is lower shows that people still access the most popular stations online - clearly there is a demand for consuming the stations' content online.

### James Cridland (2 years, 10 months ago)

Thanks, Shankar. That disclaimer is really there because we are comparing two entirely different things. To me, it's no surprise that the larger stations appear to do well on TuneIn: particularly those with international brands.

I think it is a stretch to assume that a 'follow' means that I am a regular listener. I have many stations in my TuneIn list that I have followed, only to never return.

That said, now I've published the data, others have gone and taken a look. Here's Dean Kavanagh's workings...

(image)

### David Lewis (2 years, 10 months ago)

Is there a correlation between the target demographic of the station and the number of TuneIn followers?

Is a station targeting 20-year-olds likely to have a higher percentage of followers than a station targeting 50-year-olds?

### Adam Bowie (2 years, 10 months ago)

The other thing to note is that some stations have worked with TuneIn, while others haven't. TuneIn can send push notifications for programmes via their app which might promote those stations. Talksport has certainly used such a mechanism in the past. And I even once got one from Radio 4.

I suspect that some stations might even promote TuneIn on-air as a mechanism to listen to their station.

The demographic question obviously does come into play too.

I think we'd have to look at any dynamic changes in follower numbers too. Do people actually "unfollow" stations, or just stop listening to them?

If you *really* wanted to go to work, James, you'd add extra data sets for things like Facebook Likes, Twitter followers, Comscore/UKOM figures and see how they align!

### James Cridland (2 years, 10 months ago)

I can do Facebook and Twitter numbers right now. Can't get hold of Comscore/UKOM as far as I'm aware.

But, oh heavens, okay then. Next week.

### Dean Kavanagh (2 years, 10 months ago)

I think the real diamond data would be to factor in online listening figures and use all the other data to work out which pieces of data correlate and which don't. Perhaps even more ambitiously, it would be interesting to see how things correlate with groupings - for instance, do Twitter likes correlate with particular age groups? Does an increase in Twitter likes correlate well with an increasing <25 audience?

If you find that some things correlate well and others not so well, then you can give each one a weighting and come up with a formula that takes all the data into account. I imagine, for instance, that Twitter likes would be much better at predicting listeners for Capital than for Smooth (hunch, no data).

I think the proof will be in whether it actually works - can these formulas *guess* the RAJARs with any accuracy?

### radiohead319 (2 years, 10 months ago)

If the stats say there is a strong correlation, then you can't argue with that. They are not saying it is cause and effect, just that there is a strong correlation, whatever the reason.

I'd be interested to see what the correlation looks like for other metrics such as Twitter or Facebook. With combinations of metrics there might be an even tighter correlation.

I'm surprised there are only 82 data points - I'd imagined there would be many more. Such a low number does make me start to question things a little.

Even with this result that you describe as "silly" I think we have taken a big step forward for stations that are not on RAJAR. What else do they have? Local clipboard surveys - always a concern getting a representative sample. Population covered - often used, but any cautious advertiser needs to think what proportion of that population actually listens. This analysis tends to confirm (to quite a degree) the hypothesis that TuneIn likes are proportional to RAJAR listening, AND quantifies the relationship. For sure there will be distortions according to age and propensity to listen online, but nevertheless it's a giant leap for radio-kind.

If you chose to provide an estimated listening figure for non-RAJAR stations, it probably ought to be accompanied by a +/- figure at 90% confidence.

### Dean Kavanagh (2 years, 10 months ago)

If anyone is interested in plugging data in, the equation for the non-linear regression in the above graph (the curved line of best fit) can be calculated in Excel:

`=ROUND(24110000+((13897-24110000)*EXP((-(0.0000002167))*TuneIn)),0)`

Replace `TuneIn` with the value for TuneIn followers. Here's something I never thought I'd be doing on a radio forum!

I don't profess to its accuracy(!), but it might be fun to have a play around.
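For those without Excel to hand, the same curve can be sketched in Python. This assumes the exponent in the formula above is the rate constant multiplied by the follower count; the constant names (`PLATEAU`, `INTERCEPT`, `RATE`) and the function name are made up here for readability and do not come from the original workings.

```python
from math import exp

# Constants copied from the Excel formula quoted in the comment above.
PLATEAU = 24_110_000     # value the curve levels off towards
INTERCEPT = 13_897       # estimate when a station has zero followers
RATE = 0.0000002167      # how quickly the curve bends towards the plateau

def estimate_listeners(tunein_followers):
    """Estimated listeners for a given number of TuneIn followers."""
    estimate = PLATEAU + (INTERCEPT - PLATEAU) * exp(-RATE * tunein_followers)
    # Python rounds exact halves to the nearest even number, unlike Excel's
    # ROUND; at this scale the difference is not meaningful.
    return round(estimate)

for followers in (0, 10_000, 100_000, 1_000_000):
    print(followers, estimate_listeners(followers))
```

The curve starts at the intercept and flattens out towards the plateau, so estimates for follower counts well beyond the largest station in the data should be treated with particular suspicion.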

### James Cridland (2 years, 10 months ago)

"I'm surprised there are only 82 data points - I'd imagined there would be many more. Such a low number does make me start to question things a little."

Bear with me, and I'll go and add the rest.

Dean - if you can work out how I might be able to add a line of best fit to the Google Scatter Graph, then I'd be more than happy to add it...

### Glyn Roylance (2 years, 10 months ago)

Dean, any reason to use a curved rather than linear regression?

(PS: I'm radiohead319 on my other Google account if you were wondering James!)

### Adam Bowie (2 years, 10 months ago)

If you go to the Customisation tab of the Google Chart, and scroll down to the Series section, there should be an option just below that to add a trend line. You'll want a linear one, and you can label the line in a number of ways. Google will work out and present the R^2 coefficient of determination which tells you how good the fit is.

(And yes, I can't see why this wouldn't be a linear correlation if there was one.)

### Dean Kavanagh (2 years, 10 months ago)

Hi Glyn,

The non-linear fit gives a better R^2 value (goodness of fit) than the linear fit (0.731 vs 0.722). Although I realise I'm splitting hairs with that kind of difference! However, it is fair to say that a linear fit would not give wildly different results in this case, and the formula might be easier to digest. It's simply down to choice as to what line best fits your data; the fit line of course isn't related to the correlation itself directly, but is more a means to get Y from X.

The one thing that I have learnt in all my dealings with statistics and statisticians is that there are a million ways to skin a data set.
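To make the R^2 comparison above a little more concrete, here is a minimal sketch of an ordinary least-squares line and the R^2 calculation. The data points are invented (they are not the 82 stations in the chart), and `fit_line` and `r_squared` are hypothetical helper names. The same `r_squared` function can score any model's predictions, curved or straight.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Share of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up example figures (assumption): TuneIn followers and weekly reach
followers = [1_200, 5_400, 20_000, 48_000, 150_000]
reach = [40_000, 210_000, 600_000, 1_900_000, 5_500_000]

slope, intercept = fit_line(followers, reach)
linear_predictions = [slope * x + intercept for x in followers]
print(f"linear R^2 = {r_squared(reach, linear_predictions):.3f}")
```

To compare a curved fit, pass its predictions to `r_squared` in place of `linear_predictions` and see which figure is higher.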

### James Cridland (2 years, 10 months ago)

I've added a trend line; and a point count. And a way for you to narrow down the figures without fiddling with URLs.

### radiohead319 (2 years, 10 months ago)

Excellent work James - and really interesting. We just need a few fellow data anoraks to add more TuneIn URLs using the method you mention above.

Come to think of it, RAJAR stations may use this to know whether they are under- or over-performing on TuneIn likes (or other social media if you do charts for those too).

### Adam Bowie (2 years, 10 months ago)

Dean,

Because the R^2 values are very close, would you not be better choosing the linear regression line?

Choosing a non-linear regression means that at extremes, strange things are likely to happen to the relationship between the two data sets. I'd just be very wary of finding a closer relationship between the two variables than the data is able to show.

### Glyn Roylance (2 years, 10 months ago)

James - I think the figure after the text "You have asked for this chart to show a max of " is a factor of 1000 too large.

### James Cridland (2 years, 10 months ago)

Glyn - fixed that.

Now there are 214 data points. The correlation is less good when looking at the lower portion of data.

Of note: where stations have more than one TuneIn URL, I'm currently not SUMming those together. I should be, of course.

### James Cridland (2 years, 10 months ago)

"Of note: where stations have more than one TuneIn URL, I'm currently not SUMming those together. I should be, of course."

...and now I am. The correlation was temporarily knocked off track because many stations had a TuneIn follower figure of zero (we normally only check one TuneIn figure every minute). I've pushed through all those updates, and we're now even tighter.
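The summing step described above might look something like this sketch. The station names, URL identifiers and figures are invented, and treating a zero total as "not fetched yet" (and leaving it out of the comparison) is an assumption made here for illustration, not a description of how media.info handles it.

```python
from collections import defaultdict

# Hypothetical rows: (station name, TuneIn URL identifier, follower count)
rows = [
    ("Station A", "s1001", 12_000),
    ("Station A", "s1002", 3_500),
    ("Station B", "s2001", 800),
    ("Station C", "s3001", 0),   # figure not yet fetched
]

def total_followers(rows):
    """Sum follower counts across every TuneIn URL belonging to a station."""
    totals = defaultdict(int)
    for station, _url, followers in rows:
        totals[station] += followers
    # Assumption: a zero total means missing data, so leave it out rather
    # than let it drag the correlation around.
    return {station: total for station, total in totals.items() if total > 0}

print(total_followers(rows))
```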

### Dean Kavanagh (2 years, 10 months ago)

Adam, from what I know (that's the disclaimer!):

Given the data set, there is probably very little consequence in using either linear or non-linear regression, as the result might be broadly the same. In terms of strange things happening at extremes, that should only be a problem if you force fit a non-linear regression to your data when it doesn't really work (also, most of the real-world use of the plot would be within the bounds of the current upper and lower, or at least not far off, so you can be reasonably confident it works within that range). In addition, if the data does plateau at the top end, then a linear regression is not best suited to fit that data. The problem is that because the data set is quite sparse in the high region, it's not totally clear whether linear or non-linear regression better fits the data. Choosing one or the other is unlikely to have a huge consequence for what pops out at the other end.

The R^2 is there to tell you how well the model fits your data; it doesn't give you any idea of the strength of the relationship between X and Y, which is better considered by the correlation coefficients. For instance, if you have a really high R (say, 0.90) and an R^2 of 0.25, then your data is highly correlated, but the curve-fitting model you've used is not a good match for the data.

(With regard to correlation coefficients, Spearman's is better than Pearson's here, as it doesn't assume a normal distribution.)

I will put out there that I'm not a statistician by training, but rather a scientist who has learnt stats by prodding stats software, the occasional stats lecture 9 years ago and reading webpage after webpage to work out how to make stats work for our data. So am happy to be corrected!
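For the curious, the Spearman coefficient mentioned above is simply the Pearson coefficient worked out on ranks rather than raw values. A minimal sketch follows, with tied values sharing the average of their ranks. The function names and the sample figures are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks; tied values each get the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up figures (assumption): one very large station dominates the sample
followers = [1_000, 2_000, 3_000, 4_000, 500_000]
reach = [30_000, 90_000, 95_000, 400_000, 6_000_000]

print(f"Pearson  r   = {pearson_r(followers, reach):.3f}")
print(f"Spearman rho = {spearman_rho(followers, reach):.3f}")
```

Because it only looks at ordering, Spearman's is less swayed by one or two giant stations than Pearson's is.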

### James Cridland (2 years, 10 months ago)

King of the copy'n'paste - here's the same thing but for Twitter.

### Glyn Roylance (2 years, 9 months ago)

So what do we think - how high a correlation coefficient is needed to move this from "maybe silly" to "probably not silly"? I'm no statistician, but 85% seems pretty good correlation to me.

### Dean Kavanagh (2 years, 9 months ago)

It's a tricky question. From my experience, every field has its own value. For instance, in medicine, you'd be looking for correlation coefficients in the 90s. That said, it's clearly correlated. I suppose the test might be to see if you can use those figures to calculate backwards - e.g. can you use them to predict the RAJARs?

I said this earlier on in the thread, but I think it's an important point - once someone has clicked on follow or like, they're done. They may end up ditching the station, moving their allegiance elsewhere or just falling out of the station's demo - but they won't go to TuneIn and unfollow that station. I can see this limiting the usefulness of this technique in the future as listeners shift around stations. What I would suggest if you want to pursue this further is to use per-quarter (or per-year, at worst) figures. You would need to recalculate all the above, but it would be a much better dataset than using overall followers.

### James Cridland (2 years, 9 months ago)

The difficulty is that we don't have any per-quarter figures from TuneIn, and no way of calculating them. And I agree, they're measuring different things that shouldn't be compared.

I think the correlation is coincidence, mainly. I'd not rely on them to any financial extent. But as a yardstick, they're probably useful to compare station sizes.

What's very clear with the Twitter figures, and still visible in the TuneIn figures, is that different demographics behave differently and produce different engagement figures. We don't know enough to be able to correlate them.

### Adam Bowie (2 years, 9 months ago)

There clearly is a correlation, but it's not surprising - bigger stations have more listeners using TuneIn online. If you assume that all radio listeners are equally likely to use TuneIn, and that stations promote the availability of TuneIn equally as a listening mechanism, then you'd expect a fairly linear correlation.

Of course neither of those last two things is true, but they're true enough that there's a genuine correlation.

However, as others have pointed out, it could never be used as a predictor for RAJAR for numerous reasons. I think the best you could do is be presented with a station's TuneIn figures and be able to establish a vague range of what that station's audience size is. No more than that really.

### Ivor Etienne (2 years, 9 months ago)

James, you posted a graph for Twitter followers. Did you post a graph for RAJAR vs Facebook followers?

### James Cridland (2 years, 9 months ago)

Hi, Ivor. I didn't. I could do that, though it'll be Monday before I've the time to do so.

### Ivor Etienne (2 years, 9 months ago)

Okay, thanks James, and well done in posting this discussion - very good read!

### Ivor Etienne (2 years, 9 months ago)

Hi James - did you get a chance to post a graph for RAJAR vs Facebook followers?

### James Cridland (2 years, 9 months ago)

I hadn't; but I've just added that code now for you. Here it is... but prepare yourself for disappointment. Particularly when looking at the majority of the figures, there's a very low correspondence between Facebook likes and audience figures.

I've also taken the opportunity to edit the above article to add all three.
