Use Python to Store Data from Twitter's Streaming API in MongoDB

Posted by: skyl on Dec. 14, 2011

You can create a virtualenv if you like, then pip install pymongo and pip install tweetstream. You will also need to install MongoDB and, for the example below, have it listening on the default port (27017). That should be easy to do; I don't remember having any problems. Both tweetstream and pymongo are easy to use and have good docs. Here is a simple example of using them together:

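#!/usr/bin/env python

import tweetstream
import pymongo

# Connect to a local MongoDB on the default port and use the "election" database.
connection = pymongo.Connection("localhost", 27017)
db = connection.election

# Twitter credentials and the terms to track.
username = "_skyl"
password = "XXX"
words = ["ron paul", "gingrich", "romney"]

# Save every matching tweet from the filter stream into the "tweets" collection.
with tweetstream.FilterStream(username, password, track=words) as stream:
    for tweet in stream:
        db.tweets.save(tweet)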
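
One thing to watch when a loop like this runs for a long time: as far as I know, the streaming API can interleave non-tweet messages (such as delete or limit notices) with the tweets. Here is a minimal sketch, not the method used above, of a loop that skips them. The names store_tweets and FakeCollection are made up for illustration, and the sketch assumes the collection has an insert_one() method, which is what newer pymongo releases offer in place of save().

```python
def store_tweets(stream, collection, limit=None):
    """Insert tweet-like messages from an iterable into a collection.

    Assumption: `collection` exposes insert_one(), as pymongo collections
    do in pymongo 3 and later. Returns the number of documents stored.
    """
    stored = 0
    for message in stream:
        # Skip anything that does not look like a tweet,
        # for example delete or limit notices.
        if not isinstance(message, dict) or "text" not in message:
            continue
        collection.insert_one(message)
        stored += 1
        # Optionally stop after `limit` tweets (handy for a quick trial run).
        if limit is not None and stored >= limit:
            break
    return stored


class FakeCollection(object):
    """Hypothetical in-memory stand-in for a MongoDB collection."""

    def __init__(self):
        self.docs = []

    def insert_one(self, doc):
        self.docs.append(doc)


# Hypothetical sample messages standing in for a live stream.
sample_stream = [
    {"text": "romney in the news"},
    {"delete": {"status": {"id": 1}}},
    {"text": "ron paul in the news"},
]

fake = FakeCollection()
count = store_tweets(sample_stream, fake)
print(count)
```

With a real connection you would pass the open stream and db.tweets instead of the stand-ins.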

Wow, that was easy. Now we can let the script run and run. Meanwhile, we can connect to the database and query it as usual with pymongo:

>>> import pymongo
>>> connection = pymongo.Connection("localhost", 27017)
>>> db = connection.election
>>> db.tweets.find_one()
{u'_id': ObjectId('4ee84fcb9ed1fe7d38000000'),
 u'contributors': None,
 u'coordinates': None,
 u'created_at': u'Wed Dec 14 07:27:15 +0000 2011',
 u'entities': {u'hashtags': [],
  u'urls': [],
  u'user_mentions': [{u'id': 17293897,
  u'id_str': u'17293897',
  u'indices': [3, 18],
  u'name': u'Andy Borowitz',
  u'screen_name': u'BorowitzReport'}]},
 u'favorited': False,
 u'geo': None,

 ... TONS of stuff omitted ...

 u'text': u"BREAKING: Romney Hopes Christine O'Donnell
           Endorsement Will Help Woo Elusive Moron Vote",
 ...
}
>>> [i["text"] for i in db.tweets.find().limit(10).skip(120)]
[u'RT @KoryeLogan: @abcnews Ron Paul SMOKES the GOP competition
 @Yahoo Social Sentiment Economy, Taxes & Foreign Policy Vote 4
 @RonPaul htt ...',
 u'Why Is Mitt Romney Recycling a KKK Slogan In His Campaign Speeches?
 http://t.co/CmZaLMMf',
 u'Blog: A-Paul-calypse Now: Ron Paul Trails Newt Gingrich By 1 Point In New
 Iowa Poll http://t.co/d0IUCHgJ #News #TheOtherCnn',
 u'WSJ/NBC poll finds 1/2 of all voters, and 57% of indies, say they
 won\u2019t vote for #Newt #Gingrich. Is ANYONE out there surprised by
 this??',
 u"RT @heloiseellzey: If you don't want this country to go Communist you
 better go the Ron Paul way.",
 u'http://t.co/kTLHTWrN\nCheck out the new song with me and Ron Paul holla
 back we up!!!',
 u'RT @HowardKurtz: People in Iowa told me Ron Paul might win. New poll has
 Newt at 22, Paul at 21, Mitt at 16. And Paul better organized i ...',
 u"RT @kpereira: If you're American and aren't outraged by the prospect of
 the NDAA, you're part of the problem: http://t.co/5ch78aiB",
 u'RT @rollcall: MOST EMAILED TODAY: Gingrich Rose to Wealth Through
 Congress. http://t.co/Tkw283y7',
 u'RT @JDGOLDBLOG: Newt Gingrich Fades Fast in Iowa Polls, Ron Paul in
 Statistical Dead Heat: Whether or not the news media pays an.. http: ...']
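
Each stored document carries a lot of fields, so it can help to boil a tweet down to the few you care about before analysing it. The sketch below is only an illustration: summarize is a made-up helper name, the sample dict is an abridged copy of the find_one() output above, and the date format string is my assumption based on the created_at value shown there (the %z directive needs Python 3).

```python
from datetime import datetime

# Assumed format of Twitter's created_at field, e.g.
# "Wed Dec 14 07:27:15 +0000 2011".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize(tweet):
    """Return a small dict with the text, parsed timestamp and mentions."""
    entities = tweet.get("entities") or {}
    mentions = [m["screen_name"] for m in entities.get("user_mentions", [])]
    return {
        "text": tweet["text"],
        "created_at": datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT),
        "mentions": mentions,
    }


# Abridged sample modelled on the document shown above.
sample = {
    "created_at": "Wed Dec 14 07:27:15 +0000 2011",
    "entities": {
        "hashtags": [],
        "urls": [],
        "user_mentions": [{"id": 17293897, "screen_name": "BorowitzReport"}],
    },
    "text": "BREAKING: Romney Hopes Christine O'Donnell "
            "Endorsement Will Help Woo Elusive Moron Vote",
}

summary = summarize(sample)
print(summary["created_at"].isoformat(), summary["mentions"])
```

The same function could be mapped over db.tweets.find() to get a lighter list to work with.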

It's ready for some natural language processing! Ah, we can be our own analysts now.
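
As a very first step in that direction, here is a rough sketch that counts how many tweets mention each tracked term. It uses plain case-insensitive substring matching, which is a deliberate simplification and not real natural language processing. count_mentions is a hypothetical helper, and the sample texts are three of the tweets listed above with the line wrapping removed.

```python
from collections import Counter

# The same terms that were passed to the filter stream.
WORDS = ["ron paul", "gingrich", "romney"]


def count_mentions(texts, words=WORDS):
    """Count, per term, how many texts contain it (case-insensitive)."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for word in words:
            if word in lowered:
                counts[word] += 1
    return counts


texts = [
    "RT @rollcall: MOST EMAILED TODAY: Gingrich Rose to Wealth Through "
    "Congress. http://t.co/Tkw283y7",
    "RT @heloiseellzey: If you don't want this country to go Communist you "
    "better go the Ron Paul way.",
    "Blog: A-Paul-calypse Now: Ron Paul Trails Newt Gingrich By 1 Point In New "
    "Iowa Poll http://t.co/d0IUCHgJ #News #TheOtherCnn",
]

counts = count_mentions(texts)
for word in WORDS:
    print(word, counts[word])
```

Against the real data you would feed it something like the list comprehension over db.tweets.find() shown earlier.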
