Scraping Reddit comments

Something I had been meaning to do for a long time was write a quick script to scrape Reddit comments. A chap has beaten me to it and you can find the code here: https://github.com/ctaggart878/redditscraper.

During lunch today I had a quick play with the script (and I mean quick!). A two-line script imported the function and ran it over this week’s comments in ukpolitics. The two lines:

source("RedditScraper.R")
redditScrape('ukpolitics', 'week')

Here is the magical wordcloud it created:

[Image: word cloud screenshot, 2013-09-04]
Reddit comments in ukpolitics Aug 28th – Sep 4th

The top words for the week are Weapons, Many, Should, Cameron and Going. I guess you can tell what’s been on people’s minds. My next steps are to take a closer look at the function, see what it does, and see if I can bend it to some of the things I am working on. I wonder if I can do something with topic models to explore what each subreddit is really about. Anyway, best get back to real work.
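For anyone curious how a "top words" list like this can be produced, here is a minimal base R sketch. It is not taken from RedditScraper.R, and I have not checked how that script counts words. The `top_words` helper, the tiny stopword list and the three sample comments are all made up for illustration.

```r
# Illustrative placeholder comments; NOT real Reddit data.
comments <- c(
  "Cameron should not send weapons",
  "Many people think Cameron is going too far",
  "Weapons inspectors should report first"
)

# Hypothetical helper: return the n most frequent words in a character vector.
top_words <- function(texts, stopwords = c("the", "is", "not", "too", "a"), n = 5) {
  # Lower-case, then split on anything that is not a letter or an apostrophe.
  words <- unlist(strsplit(tolower(texts), "[^a-z']+"))
  # Drop empty strings and stopwords.
  words <- words[nchar(words) > 0 & !(words %in% stopwords)]
  # Tabulate the words and sort by descending frequency.
  counts <- sort(table(words), decreasing = TRUE)
  head(counts, n)
}

print(top_words(comments, n = 3))
```

A real run would swap the placeholder vector for the scraped comment text and use a proper stopword list, which is presumably why a word like "Many" made it into the cloud above.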

Big thanks to /user/Snotaphilious

4 Replies to “Scraping Reddit comments”

  1. David,

    Good luck with the script! Hopefully it’s useful.

    I’ll keep an eye on your blog and look forward to your thoughts on what certain subreddits are really about. Sounds like it could be fascinating.

    Best,

    Tagg

    1. Thanks for the script. I’m currently working on a few scripts that generate a user’s personal corpus based on the websites they engage with. I guess the idea is that the user can work out what sort of audiences and resources they are engaging with. The script is just perfect for me.

      Many thanks
      David
