Creating an RSS feed from a Twitter timeline
Thu 04 August 2016
For work, I create mobile applications for school districts. Often, the customers would like to have their social media accounts represented in their application. A while back, I got involved in dealing with the Facebook Graph API, and using it to turn a district's Facebook 'Page' into an RSS feed which could be loaded into the app. It took a lot of work, but the end result is pretty robust. That said, I'm going to rework that a bit before I write anything about it here.
However, recently I was tasked with doing something similar for Twitter: some customers wanted their Twitter 'Timelines' integrated into their app. Again, the best way to do this with our codebase is to implement an RSS feed. So, as I did with Facebook, I began looking into the Twitter API. The first thing I noticed is that Twitter REQUIRES you to use OAuth. As I wasn't too confident in my ability to handle this myself in Python, I chose to use python-twitter, a Python wrapper for the Twitter API. I may switch to another one in the future, or create my own, but I wanted to document my current solution first. As for the RSS output, I'm using the excellent feedgen module. Believe it or not, one of the reasons I'm not writing up my Facebook RSS solution yet is that I'm manually creating the RSS feed there, and that's sort of a nightmare.
Now, you'd think that (aside from the OAuth requirement) getting info about a user's timeline via the API would be a simple thing. I mean, how much data can there be? Tweets are 140 characters (for now), for crying out loud! Well, you'd be surprised.
This Python script connects via the Twitter API to a specific user's timeline. The username is given as the sole argument to the program. The script reads some data about each tweet and uses feedgen to create an RSS feed, with each item populated from a tweet. Currently, there's no support for the media entity (read: photos), but I plan to add that soon.
First, let's get started with our imports:
from sys import argv
import twitter
from apicreds import *
from feedgen.feed import FeedGenerator
I'm importing four things here: argv, so that I can supply an argument to the Python script; the python-twitter wrapper; the four pieces of information that make up my API credentials, from a file named 'apicreds.py'; and FeedGenerator from python-feedgen.
# Authenticate with the four credentials from apicreds.py
api = twitter.Api(consumer_key=ckey,
                  consumer_secret=csecret,
                  access_token_key=akey,
                  access_token_secret=asecret,
                  sleep_on_rate_limit=True)

# The only command-line argument is the account's screen name
script, twitter_account = argv

# Note the trailing slash: the file name is appended directly to this path
webpath = '/var/www/html/whatever/'
filename = webpath + twitter_account + '.xml'

statuses = api.GetUserTimeline(screen_name=twitter_account)
userstuff = api.GetUser(screen_name=twitter_account)

status_prefix = "https://twitter.com/" + twitter_account + "/status/"
twitter_page = "https://twitter.com/" + twitter_account
feedtitle = "Twitter Timeline RSS - " + twitter_account
Next, we define 'api' using the four pieces of information supplied in the apicreds.py file: ckey, csecret, akey, and asecret. We also tell the python-twitter wrapper to chill out and wait if it hits the rate limit, rather than failing. Twitter's rate limits reset in fifteen-minute windows, so that wait can be up to fifteen minutes.
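For anyone following along, here's a rough sketch of what apicreds.py might look like. The four variable names are the ones the script uses; every value is a placeholder I made up for illustration, and you'd substitute the keys and tokens from your own Twitter app.

```python
# apicreds.py (sketch): the four names the main script imports.
# Every value below is a placeholder, not a real credential.
ckey = "YOUR_CONSUMER_KEY"
csecret = "YOUR_CONSUMER_SECRET"
akey = "YOUR_ACCESS_TOKEN_KEY"
asecret = "YOUR_ACCESS_TOKEN_SECRET"
```

Keeping these in their own file means the main script can be shared or committed without the secrets, as long as apicreds.py itself stays out of version control.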
After that, we define a bunch of variables to make everything prettier later. You'll probably want to change 'webpath' if you're trying this at home. You can see that the script is going to output a file there that is named after the Twitter account, with an .xml extension. I also define 'statuses' and 'userstuff' by calling two methods supplied by the python-twitter wrapper: GetUserTimeline returns the account's recent tweets, and GetUser returns the account's profile. There's also some stuff for naming the RSS feed, and creating URLs to link to the Twitter timeline or the specific 'status'.
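If you'd rather not juggle loose string concatenation, the same variables could be built in one small function. This is only a sketch of an alternative, not what my script does: the function name feed_settings and the handling of a leading '@' are my own additions for illustration.

```python
import os


def feed_settings(twitter_account, webpath="/var/www/html/whatever"):
    """Collect the per-account strings the script needs in one place."""
    # Accept "@name" as well as "name"; the URLs want the bare screen name.
    account = twitter_account.lstrip("@")
    if not account:
        raise ValueError("expected a Twitter screen name")
    profile_url = "https://twitter.com/" + account
    return {
        # os.path.join supplies the separator between directory and file name.
        "filename": os.path.join(webpath, account + ".xml"),
        "twitter_page": profile_url,
        "status_prefix": profile_url + "/status/",
        "feedtitle": "Twitter Timeline RSS - " + account,
    }
```

Calling feed_settings(argv[1]) would then give you a dictionary to pull 'filename', 'twitter_page' and friends from, and it no longer matters whether 'webpath' ends in a slash.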
fg = FeedGenerator()
fg.id('http://ryanhampton.net')
fg.title(feedtitle)
fg.link(href=twitter_page, rel='alternate')
fg.logo(userstuff.profile_image_url)
fg.language('en')
fg.description('auto-generated RSS feed')

for status in statuses:
    fe = fg.add_entry()
    fe.title(status.text[:50])
    fe.description(status.text)
    twitlink = status_prefix + status.id_str
    fe.link(href=twitlink)
    fe.id(status.id_str)
    fe.published(status.created_at)

fg.rss_file(filename)
Now, we define the feed information, followed by a 'for' loop that turns each Twitter timeline 'status' into an RSS 'item'. You'll notice that the first part is the only place that we use the 'userstuff' var, because that comes from a different API method than the one needed for the actual statuses, and we just need it so we can snag the user's Twitter profile pic. I'm actually not sure if fg.id is necessary, but I've put my website URL there. Feel free to do something different or try leaving it out. I don't see it in the feeds I've generated with this, most likely because the id is an Atom field and this script writes RSS.
As for the second part, it's not beautiful, but it's an okay proof of concept. Keep in mind that RSS items have a lot more fields than a tweet. So, for instance, I've chosen to fill out title with the first 50 characters of the given tweet, whereas description will be the whole tweet. The URL for this item will be the link back to the actual tweet on twitter.com. We're getting the RSS item's 'pubDate' from the tweet's 'created_at' info. Finally, after the loop, we write this feed to a file.
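Two of those mappings can be made a little more explicit. The sketch below is not part of my script: parse_created_at and item_title are hypothetical helper names, the timestamp format is my reading of what the Twitter API returns in 'created_at', and it assumes Python 3 (for '%z') with an English locale (for the day and month names).

```python
from datetime import datetime

# Assumed shape of the Twitter API's created_at string,
# e.g. "Thu Aug 04 15:30:00 +0000 2016" (a made-up sample value).
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(created_at):
    """Turn a created_at string into a timezone-aware datetime."""
    return datetime.strptime(created_at, TWITTER_TIME_FORMAT)


def item_title(text, limit=50):
    """Build a one-line title, marking truncation with an ellipsis."""
    # Collapse newlines and repeated spaces so the title stays on one line.
    one_line = " ".join(text.split())
    if len(one_line) <= limit:
        return one_line
    return one_line[:limit].rstrip() + "..."
```

In the loop, that would look like fe.title(item_title(status.text)) and fe.published(parse_created_at(status.created_at)). Handing feedgen a timezone-aware datetime avoids relying on it to interpret the raw string.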
That's it! I am not sure if this'll be helpful to anybody, and I probably should have used Tweepy, and I probably should have also included a way to get images into the RSS feed. I really just wanted to fix the mistake of only having one post on my blog, and try out posting code on it. Please feel free to get ahold of me via the twitter link over to the right if you have any suggestions, insults, or questions for me.