Saturday, June 6, 2009

Twitter says: Coldplay follows the unpopular, Oprah doesn't

Coldplay follows 2,624 tweeters on twitter - who, on average, are only followed by 10 people each. Oprah on the other hand only follows 14 people, who, on average, are followed by 709,350 people (as of this post on 9 June, 2009). I know - I spent the past 14 weeks counting! Nah... actually - there's a cool way to do this in - yes, you guessed it - a spreadsheet.

In tweaking the TweeterScore and other TwitSheets I've discovered that using xpath, you can quickly summarize data from an XML feed into a spreadsheet... that is, you don't have to do the adding and looping yourself.
In one command, you can summarize data across all the entries in any XML feed.
For example... the twitter XML data for "friends statuses" (the recent tweets of all the people you follow) can be grabbed with a URL like this:
http://twitter.com/statuses/friends/whatsername.xml - and you can do that for any tweeter, to see the statuses of (and other stuff about) all the people they follow...

The actual XML data - in brief form - includes something like this (shortened massively):
<users>
  <user>
    <id>1010101</id>
    <name>Whatsher Name</name>
    <screen_name>Whatsername</screen_name>
    ...
    <followers_count>1031</followers_count>
    ...
    <status>
      <created_at>Tue Apr 07 22:52:51 +0000 2009</created_at>
      <id>1234567890</id>
      <text>brb - going to get ice cream now</text>
      <source><a href="http://www.tweetdeck.com/">TweetDeck</a>
      </source>
      ...
    </status>
  </user>
</users>

So - for the given tweeter (that you gave as the tweeter-screen-name.xml file name) it gives information for every other tweeter that person follows. A cool way to get the raw data for every person they follow. But the best part is the ability to summarize across all the entries... and XPath lets you do that.

Example: Let's say I wanted to know not just all the names of the people that Oprah follows (no idea why i picked her) - but I wanted to know the average number of people who follow the people she follows. That would tell me whether Oprah is following her fans (unpopular) or other celebs (popular, with lots of followers themselves).
I can do that in a spreadsheet in one command (almost).

I use the ImportXML() command with an XPath command string of "sum(/users/user/followers_count)" to get the total number of followers of all the people being followed by this tweeter... so for Oprah, the spreadsheet formula looks like this:
=importXML("http://twitter.com/statuses/friends/oprah.xml","sum(/users/user/followers_count)")
Divide that by the number of people she follows, and you have the average...
I'll leave it to you to check out this sample spreadsheet which does a whole bunch of this xml manipulation to compare 10 tweeters.

6 comments:

NoodleGei said...

Hi JR (Apps Script?),
i wrote a Blog-Post (german) in April about the same topic ;-):
Docs and Twitter, GeoCoding
But, your example are very, very nice.
I have work with Importfeed()and later with GeoCoding.

Bye

NoodleGei said...

JR,
have you change the access to the sample spreadsheet?
It's not allowed to make a copy of your example spreadsheet.

BTW: Sorry, for my bad english.

JR said...

oops - sorry - but I did actually forget to make that sheet open for anyone to view (and, therefore, copy). Fixed now - thanks for the poke!

Tom said...

Man those ImportXXX functions are super useful! I bet you guys are missing the intern who wrote those.

Tom said...
This comment has been removed by the author.
Anonymous said...

will the yolinkGoogle Docs connection be restored?