Starting to explore Wikipedia: Part 1. Query woes.

I’ve started to wonder just how much we can find out about a subject from Wikipedia. Can I ask serious, big questions of the data set and get serious, big answers out? I thought I’d start by exploring an area of Wikipedia that was reasonably well maintained, I…

The birthplace of every pro wrestler (according to Wikipedia)

A few days ago I mapped out the death places of the monarchs of England. I wanted to try the same technique on a bigger dataset and, in keeping with some other stuff I’ve done with Reddit, I decided to map the birthplace of every wrestler in Wikipedia. This is a *rough* guide. It maps…

Where the English monarchs died

(according to Wikipedia) This was a quick and dirty experiment to see how easy it is to auto-generate an interactive map from Wikipedia/DBpedia. There are some caveats and things I still need to iron out. These are people that Wikipedia has described as a Monarch of England; they must have a place of death…

New LAK dataset

Davide Taibi has informed me that the LAK dataset has been updated. The update includes some paper text that I reported on, but also has lots more data. As described by Davide: This version includes papers from: – EDM conferences (2008-2014) – LAK conferences (2011-2014, 2014 only abstract since we are waiting for ACM…

Playing with Wikipedia

Sometimes I like to write bits of code that poke Wikipedia. It’s fun, and the data is a great way to start a conversation with people. I recently teamed up with the pro wrestling fans of Reddit to find out information on wrestling stables; we did this using the structured information from Wikipedia stored in…

Visual breakdown of categories on WordPress blogs using R

This is a very simple recipe, just a few lines to get an indication of tags/categories being used on a WordPress site. The idea is that I use R to read the RSS feed of the blog, pick out the tags and categories and display a pie chart of the tags being used. Since tags…

Are we too naive in education?

In my last post I had been thinking about technologies that online communities use and how the technology pushes people to communicate in a certain way. I pushed this out to my Google+ feed, perhaps because Google use plus metrics to rank an article, but I would like to think more so because I have…

Ric Flair Promotion Jumps

I posted an experiment on a wrestling subreddit a few days ago where I used DBpedia to explore the links between wrestling stables. To my surprise, wrestling fans are really interested in the things you can find out using structured data from Wikipedia. I was contacted by a graphic designer asking if I wanted to collaborate…

Scraping a HTML table into an R dataframe

For some work I plan on doing for the LACE tech focus blog I wanted to get some information off a webpage and into an R dataframe. It turns out this is a three-line solution (1 line if you throw your URL straight to the function parameters and have the XML package already…