You can create a virtualenv if you like, then pip install pymongo and pip install tweetstream. You will also need to install MongoDB and, for the example below, have it listening on the default port (27017). That should be easy to do; I don't remember having any problems. tweetstream and pymongo are both easy to use and have nice docs. Here is a simple example of using them together:
#!/usr/bin/env python
import tweetstream
import pymongo
connection = pymongo.Connection("localhost", 27017)
db = connection.election
username = "_skyl"
password = "XXX"
words = ["ron paul", "gingrich", "romney"]
with tweetstream.FilterStream(username, password, track=words) as stream:
    for tweet in stream:
        db.tweets.save(tweet)
Wow, that was easy. Now, we can let that run and run. We can make a connection to the database and query it as usual with pymongo.
>>> import pymongo
>>> connection = pymongo.Connection("localhost", 27017)
>>> db = connection.election
>>> db.tweets.find_one()
{u'_id': ObjectId('4ee84fcb9ed1fe7d38000000'),
 u'contributors': None,
 u'coordinates': None,
 u'created_at': u'Wed Dec 14 07:27:15 +0000 2011',
 u'entities': {u'hashtags': [],
               u'urls': [],
               u'user_mentions': [{u'id': 17293897,
                                   u'id_str': u'17293897',
                                   u'indices': [3, 18],
                                   u'name': u'Andy Borowitz',
                                   u'screen_name': u'BorowitzReport'}]},
 u'favorited': False,
 u'geo': None,
 ... TONS of stuff omitted ...
 u'text': u"BREAKING: Romney Hopes Christine O'Donnell Endorsement Will Help Woo Elusive Moron Vote",
 ...
}
>>> [i["text"] for i in db.tweets.find().limit(10).skip(120)]
[u'RT @KoryeLogan: @abcnews Ron Paul SMOKES the GOP competition
@Yahoo Social Sentiment Economy, Taxes & Foreign Policy Vote 4
@RonPaul htt ...',
u'Why Is Mitt Romney Recycling a KKK Slogan In His Campaign Speeches?
http://t.co/CmZaLMMf',
u'Blog: A-Paul-calypse Now: Ron Paul Trails Newt Gingrich By 1 Point In New
Iowa Poll http://t.co/d0IUCHgJ #News #TheOtherCnn',
u'WSJ/NBC poll finds 1/2 of all voters, and 57% of indies, say they
won\u2019t vote for #Newt #Gingrich. Is ANYONE out there surprised by
this??',
u"RT @heloiseellzey: If you don't want this country to go Communist you
better go the Ron Paul way.",
u'http://t.co/kTLHTWrN\nCheck out the new song with me and Ron Paul holla
back we up!!!',
u'RT @HowardKurtz: People in Iowa told me Ron Paul might win. New poll has
Newt at 22, Paul at 21, Mitt at 16. And Paul better organized i ...',
u"RT @kpereira: If you're American and aren't outraged by the prospect of
the NDAA, you're part of the problem: http://t.co/5ch78aiB",
u'RT @rollcall: MOST EMAILED TODAY: Gingrich Rose to Wealth Through
Congress. http://t.co/Tkw283y7',
u'RT @JDGOLDBLOG: Newt Gingrich Fades Fast in Iowa Polls, Ron Paul in
Statistical Dead Heat: Whether or not the news media pays an.. http: ...']
It's ready for some natural language processing! Ah, we can be our own analysts now.
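As a first, very rough pass before any real NLP, here is a minimal sketch that counts how many tweets mention each tracked term. The helper name count_mentions and the sample documents are made up for illustration; in practice you would pass db.tweets.find() in place of the sample list.

```python
from collections import Counter


def count_mentions(tweets, words):
    """Count how many tweets mention each tracked word (case-insensitive)."""
    counts = Counter()
    for tweet in tweets:
        # Some stream messages (e.g. delete notices) carry no text at all.
        text = (tweet.get("text") or "").lower()
        for word in words:
            if word in text:
                counts[word] += 1
    return counts


# Stand-in documents shaped like the stored tweets (hypothetical data).
sample = [
    {"text": "Ron Paul trails Newt Gingrich by 1 point"},
    {"text": "Romney hopes endorsement will help"},
    {"text": "People in Iowa told me Ron Paul might win"},
]
words = ["ron paul", "gingrich", "romney"]
counts = count_mentions(sample, words)
print(counts.most_common())
```

Plain substring matching is crude (it will miss "Newt" or "Mitt" on their own), but it is enough to sanity-check what the stream is collecting.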
I was wondering if you had any guidance on inserting tweets into MongoDB after they were already saved into a text file using tweetstream?
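For tweets that are already sitting in a text file, one possible approach is sketched below. It assumes the file holds one JSON-encoded tweet per line, which may not match how your file was written. The helper name load_tweets and the file name tweets.txt are hypothetical.

```python
import json


def load_tweets(lines, save):
    """Parse one JSON tweet per line and pass each to `save`.

    Returns the number of tweets handed to `save`.
    """
    stored = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            tweet = json.loads(line)
        except ValueError:
            continue  # skip truncated or malformed lines
        save(tweet)
        stored += 1
    return stored


# Demo with an in-memory list standing in for the collection.
demo_lines = ['{"text": "romney"}', "", "not json", '{"text": "gingrich"}']
collected = []
n = load_tweets(demo_lines, collected.append)
print(n)

# Against the database from the post (assumed file name):
# with open("tweets.txt") as f:
#     load_tweets(f, db.tweets.save)
```

Passing the save function in keeps the loader independent of the pymongo version: db.tweets.save matches the example above, while newer pymongo releases would use db.tweets.insert_one instead.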