A Look at Trump and Clinton’s Tweets Using Tweepy – Part 1: Popularity Metrics

This post is part of a series.
Part 2 can be found here.
Part 3 can be found here.

We’re in the midst of the 2016 Election right now. As most Americans have noticed, politicians’ tweets have been making up half of the headlines (despite being as long as headlines themselves). In this series I’ll be using the Python library Tweepy to look at the popularity and loyalty of the candidates’ Twitter followings, as well as the topics they’ve chosen to discuss over time.

Marco Bonzanini’s Tweepy guide held my hand through this entire process. His more comprehensive Tweepy tutorials can be found here.


Using Tweepy requires a few preliminary steps, such as creating a Twitter app to interact with Twitter’s API. This is well documented in other Tweepy guides, so I will breeze over it for now. Just note that creating the “auth” object as I have done below requires the “consumer key” and “consumer secret” values, which are unique to your Twitter app, along with an access token and access secret.

Setting up Tweepy:

import tweepy

key = 'put yours here'
secret = 'put yours here'
access_token = 'put yours here'
access_secret = 'put yours here'

auth = tweepy.OAuthHandler(key, secret)
auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth)
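
If you would rather not paste the four credentials into the script, one option is to read them from environment variables. This is only a sketch: the variable names and the load_credentials helper are made up for illustration and are not part of Tweepy.

```python
import os

# Hypothetical environment variable names; use whatever suits your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or raise if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

The returned values can then be passed to tweepy.OAuthHandler and set_access_token in place of the hard-coded strings above.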


Relative Popularity, Loyalty, and Engagement

Before finding out anything else, I was curious how popular Donald Trump and Hillary Clinton were among their own followers in relative and absolute terms.

Twitter’s API offers a pretty vast range of fields for each tweet. A list of all these possibilities can be found here. For the first test I looked at the favorite count and retweet count of each politician’s last 1000 tweets. I also looked at the follower count of both, which is not a tweet-specific field but a user-specific one.

userID = 'HillaryClinton'
followers = api.get_user(id=userID).followers_count
setsize = 1000
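# Start the totals as floats so the divisions below are not truncated under Python 2
fav_avg = 0.0
rt_avg = 0.0
for status in tweepy.Cursor(api.user_timeline, id = userID).items(setsize):
    fav_avg += status.favorite_count
    rt_avg += status.retweet_count
# Turn the totals into averages (assumes the cursor returned exactly setsize tweets)
fav_avg = fav_avg / setsize
rt_avg = rt_avg / setsize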
print (userID)
print ('Followers: ' + str(followers))
print ('Average Favorites: ' + str(fav_avg))
print ('Average Retweets: ' + str(rt_avg))
print ('Average Favorites (% of Followers): ' + '%f' % (fav_avg/followers * 100) + '%')
print ('Average Retweets (% of Followers): ' + '%f' % (rt_avg/followers * 100) + '%')

The above segment outputs:

Followers: 8,327,832
Average Favorites: 6,005.776
Average Retweets: 2,866.363
Average Favorites (% of Followers): 0.072117%
Average Retweets (% of Followers): 0.034419%

Just by changing userID to “realDonaldTrump”, we can see the equivalent numbers for Donald:

Followers: 10,934,978
Average Favorites: 23,026.52
Average Retweets: 8,071.84
Average Favorites (% of Followers): 0.210577%
Average Retweets (% of Followers): 0.073817%
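
Since the same calculation gets repeated for each account, it could be wrapped in a small function. The sketch below is my own helper, not something Tweepy provides. It divides by the number of tweets actually seen rather than a fixed setsize, and it uses made-up stand-in objects in place of real Tweepy statuses, so it runs without API access.

```python
from collections import namedtuple


def engagement_summary(statuses, followers):
    """Average favorites/retweets per tweet, raw and as a % of followers."""
    statuses = list(statuses)
    n = len(statuses)
    if n == 0 or followers == 0:
        raise ValueError("need at least one tweet and one follower")
    # float() keeps the division exact under Python 2 as well
    fav_avg = sum(s.favorite_count for s in statuses) / float(n)
    rt_avg = sum(s.retweet_count for s in statuses) / float(n)
    return {
        "tweets": n,
        "fav_avg": fav_avg,
        "rt_avg": rt_avg,
        "fav_pct": fav_avg / followers * 100,
        "rt_pct": rt_avg / followers * 100,
    }


# Made-up stand-ins for Tweepy Status objects, only to show the shape.
FakeStatus = namedtuple("FakeStatus", ["favorite_count", "retweet_count"])
sample = [FakeStatus(10, 4), FakeStatus(30, 6)]
summary = engagement_summary(sample, followers=1000)
print(summary)
```

With a live connection, the statuses argument would be the tweepy.Cursor(...).items(setsize) iterator from the segment above.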

The differences in absolute terms are pretty striking, but the differences as a % of total followers are more telling. Donald’s retweet rate is double Hillary’s, and his favorite rate is nearly triple hers. Not only does Donald have more followers and favorites, but individual tweets engage his audience more. Follower count and engagement don’t always scale together, though. Here are Barack Obama’s numbers for comparison:

Followers: 76,754,938
Average Favorites: 4,033.78
Average Retweets: 1,851.41
Average Favorites (% of Followers): 0.005255%
Average Retweets (% of Followers): 0.002412%

Obama has a ton of followers (76.8 million), likely because he’s the President. But his tweets have even fewer favorites and retweets on average than Hillary’s.
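
To make the comparison concrete, here is a quick sketch that recomputes the per-follower rates from the figures printed above and expresses each account relative to Hillary’s. The dictionary keys and the rates helper are just labels I picked for illustration.

```python
# Figures copied from the outputs above: (followers, avg favorites, avg retweets).
accounts = {
    "Clinton": (8327832, 6005.776, 2866.363),
    "Trump": (10934978, 23026.52, 8071.84),
    "Obama": (76754938, 4033.78, 1851.41),
}


def rates(followers, fav_avg, rt_avg):
    """Favorites and retweets per tweet as a percentage of followers."""
    return fav_avg / float(followers) * 100, rt_avg / float(followers) * 100


base_fav, base_rt = rates(*accounts["Clinton"])
relative = {}
for name, figures in accounts.items():
    fav_pct, rt_pct = rates(*figures)
    relative[name] = (fav_pct / base_fav, rt_pct / base_rt)
    print("%-8s fav %.6f%% (x%.2f)  rt %.6f%% (x%.2f)"
          % (name, fav_pct, relative[name][0], rt_pct, relative[name][1]))
```

By this measure Donald’s favorite rate comes out at roughly 2.9 times Hillary’s and his retweet rate at roughly 2.1 times, while both of Obama’s rates are under a tenth of hers.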

I wondered if there was a distinct difference in tweeting volume here (there is), so I decided to check the average number of tweets per day for both candidates using the following code segment:

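import datetime
# created_at is assumed to be a naive UTC datetime (as in Tweepy 3.x), so compare against UTC
test_date = datetime.datetime.utcnow() - datetime.timedelta(days=30)

userID = 'HillaryClinton'
tweetCount = 0
for status in tweepy.Cursor(api.user_timeline, id = userID).items():
    if status.created_at > test_date:
        tweetCount += 1
    else:
        break  # the timeline comes back newest-first, so the rest is older
print (userID)
print (tweetCount)
print (tweetCount / 30.0)  # average tweets per day over the 30-day window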

I found that over the last 30 days, Hillary tweeted an average of 16.5 times a day whereas Donald tweeted an average of 9.4. I imagine this affects the number of favorites the average tweet garners, as followers naturally spread their favorites out among whatever tweets come in that day.
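
The per-day average is just the count inside the window divided by the window length. Here is a minimal offline sketch of that arithmetic; the tweets_per_day helper, the fixed date, and the timestamps are all synthetic and only there so the example runs without the API.

```python
import datetime


def tweets_per_day(created_times, now, days=30):
    """Average daily tweet count over the `days` before `now`."""
    cutoff = now - datetime.timedelta(days=days)
    recent = sum(1 for t in created_times if cutoff < t <= now)
    return recent / float(days)


# Synthetic timestamps: one every 8 hours going back 40 days (not real data).
now = datetime.datetime(2016, 8, 1)
stamps = [now - datetime.timedelta(hours=8 * i) for i in range(1, 121)]
print(tweets_per_day(stamps, now))
```

With real data, created_times would be the created_at values collected from the timeline cursor.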


5 thoughts on “A Look at Trump and Clinton’s Tweets Using Tweepy – Part 1: Popularity Metrics”

  1. Hello Keith,
    Nice tutorials!
    We are using a modified version of your tutorial materials in our meetup workshop. Hopefully it is fine with you!
    Our github repo in the website of the comment.
    Let me know if you want me to take it off anytime.
    Thank you!


  2. I have been surfing online more than 3 hours these days, but I never found
    any attention-grabbing article like yours. It is pretty value enough for me.
    Personally, if all webmasters and bloggers made good content as you
    probably did, the net can be a lot more helpful than ever before.


  3. Hello Keith,
    Excellent tutorials!
    am getting the correct output except these:

    Average Favorites (% of Followers): 0.000000%
    Average Retweets (% of Followers): 0.000000%
    any mistake? am using python 2.7


    • Hey Ammy, apologies for my delay in responding to you.
      It’s likely a rounding error caused by integer division.
      To ensure that it works correctly, try initializing fav_avg and rt_avg as floats:
      fav_avg = 0.0
      rt_avg = 0.0

      Hope that helps!

