Scrape Twitter Data in Python with Twitterscraper Module


Hello everyone, Ken here. Today I'm showing you how to scrape Twitter data
using Python and the twitterscraper module. On my screen is the twitterscraper
documentation; it is also in the description below. As you can see, you can run
it from the command line, but I prefer to do it from the notebook. That allows
me to read the results directly into a data frame and manipulate them from
there.

To get started, we install it: pip install twitterscraper. I've obviously
already downloaded it, so all the requirements are met here. Now I'm going to
open up Spyder, which is my IDE of choice for data science. So, from
twitterscraper we import query_tweets. We're also going to import datetime,
because we want to set a date range for the tweets, and we're going to import
pandas, because we want to turn this into a data frame afterwards.

So the first thing to know is that query_tweets takes a couple of parameters.
The first one, and a relevant one to us, is the begin date. All we need to
query is something recent. I'm going to do the Notre-Dame fire, which happened
very recently. That'll be a good example of something that is relevant and has
a lot of information associated with it on Twitter. We'll also set an end date.
We're going to use tomorrow (today is the 17th for me), because that way it
takes all of today's data as well. That's relevant if you just want a time
window; for example, in one of my previous videos I did the timing around a
movie premiere.
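
The date window described above can be sketched with the standard library alone. The begin date here is my assumption (15 April 2019, the day of the Notre-Dame fire); the video does not say which begin date was typed in. The year and month for "the 17th" are likewise inferred from the date of the fire.

```python
import datetime as dt

# Assumed start of the window: the day of the Notre-Dame fire.
begin_date = dt.date(2019, 4, 15)

# "Today" in the video is the 17th; month and year are inferred.
today = dt.date(2019, 4, 17)

# Use tomorrow as the end date so that all of today's tweets are included.
end_date = today + dt.timedelta(days=1)

print(begin_date, end_date)
```

In a live script, dt.date.today() would replace the hard-coded value of today.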
We can also put in a limit. Let's say this is a really, really popular topic:
we can set a limit of a thousand or so, so we don't go crazy and end up with
millions and millions of tweets that would take forever to download. We can
also set a language, so lang equals English in this case. If we're doing the
Notre-Dame fire, there are going to be a lot of French tweets, I would expect,
so we'd want to filter those out so the results are understandable, for me
mainly. If we wanted to look at just one person's tweets, we could also query
for them: we would set the user equal to, let's just say, realDonaldTrump. If
you wanted to hear his wacky comments on the Notre-Dame fire, you'd be able to
type that in and filter specifically for that.
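
As a rough sketch, the limit, language, and user settings above might be collected like this. The language code "en" is an assumption on my part (the video only says "English"). build_query is a hypothetical helper, not part of twitterscraper; it narrows a search to one account with Twitter's "from:" search operator, which may not be how the video does it, and "some_account" is a placeholder name.

```python
def build_query(topic, user=None):
    """Combine a search topic with an optional 'from:' filter for one account.

    Hypothetical helper for illustration; not part of twitterscraper.
    """
    if user:
        # Accept the handle with or without a leading '@'.
        return f"{topic} from:{user.lstrip('@')}"
    return topic


limit = 1000  # rough cap so a popular topic does not return millions of tweets
lang = "en"   # assumed code for English, to filter out the French tweets

everyone = build_query("Notre Dame fire")
one_account = build_query("Notre Dame fire", user="@some_account")

print(everyone)
print(one_account)
```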
Now for the real magic: we just call query_tweets and put in our parameters
right here. We're asking for "Notre Dame fire", which is the key term that
we're going to query; you can also put in hashtags or anything else relevant
that you like there. We set the begin date equal to our begin date and the end
date equal to our end date, and we set the limit. Then we can load all of this
in. This should take just a couple of minutes. If you're doing anything over
ten or a hundred thousand tweets, it can take up to, you know, twenty or thirty
minutes to actually run. The way they do this, from my understanding, is that
they just have a bunch of threads hitting Twitter, so the speed depends fairly
heavily on the processing power of your computer.

So it looks like we got a thousand and twenty, which is pretty close to our
limit, so I can't complain too much. It isn't super exact, I guess just because
of the structure of the back end there. Now we want to make this relevant to
us. As you can see, the tweets here are just Tweet objects that don't really
mean anything to us yet, so let's transform this into a data frame.
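
The query call and the conversion described above can be sketched as follows. This is a sketch under assumptions, not the exact code from the video: the keyword names begindate, enddate, limit, and lang are as I understand them from the twitterscraper README, so check them against your installed version. The helper names are hypothetical, and the stand-in objects at the bottom use made-up values so the conversion can be tried without hitting Twitter.

```python
from types import SimpleNamespace


def fetch_tweets(search_term, begin_date, end_date, limit=1000, lang="en"):
    """Run the search; needs the third-party twitterscraper package."""
    from twitterscraper import query_tweets  # imported lazily on purpose
    return query_tweets(search_term, begindate=begin_date,
                        enddate=end_date, limit=limit, lang=lang)


def tweets_to_rows(tweets):
    """Turn each tweet object into a plain dict of its attributes."""
    return [dict(vars(tweet)) for tweet in tweets]


def tweets_to_frame(tweets):
    """Build a pandas DataFrame with one row per tweet."""
    import pandas as pd  # imported lazily so tweets_to_rows works without it
    return pd.DataFrame(tweets_to_rows(tweets))


# Stand-in objects with made-up values, used instead of a live query.
fake_tweets = [
    SimpleNamespace(user="news_a", text="Notre Dame fire update",
                    likes=12, replies=3),
    SimpleNamespace(user="someone", text="Watching the news from Paris",
                    likes=0, replies=1),
]
rows = tweets_to_rows(fake_tweets)
print(rows[0])
```

With real results, tweets_to_frame(fetch_tweets("Notre Dame fire", begin_date, end_date)) would give one column per tweet attribute.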
So the data frame is equal to... All right, so after doing that we have all of
our Twitter data. We have the username, and we have the number of likes,
replies, etc. You can filter by those if you want, or you can order by them. We
also have the text of all of the tweets. As you can see, there are a ton of
duplicates, so that's something I like to remove; a lot of them are news
related. We also have the URL and the actual username of the person who posted
each tweet.

You can do a lot of really interesting stuff with this. I've done sentiment
analysis in the past on the Captain Marvel movie premiere, and you can see that
above. You can also combine it all and make a word cloud related to the event
or topic that you're covering, and there's plenty of other text-based analysis
that you can do with this information. This is a great tool: it's fast, easy,
and free, and I highly recommend that you use it.

Thank you so much for watching my video. If you enjoyed it, please like. If you
want to see more content like this, please subscribe. Have a great one!
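
The duplicate removal and the word-cloud idea mentioned above can be sketched with the standard library. This is a minimal illustration on made-up rows, not the code from the video; the function names and the "text" and "user" keys are assumptions. With a pandas data frame, df.drop_duplicates(subset="text") performs the same keep-the-first filtering.

```python
import re
from collections import Counter


def drop_duplicate_texts(rows):
    """Keep only the first row for each distinct tweet text."""
    seen = set()
    unique = []
    for row in rows:
        if row["text"] not in seen:
            seen.add(row["text"])
            unique.append(row)
    return unique


def word_counts(rows, min_length=4):
    """Count lowercase words of at least min_length characters."""
    counts = Counter()
    for row in rows:
        words = re.findall(r"[a-z']+", row["text"].lower())
        counts.update(word for word in words if len(word) >= min_length)
    return counts


# Made-up rows standing in for scraped tweets.
sample = [
    {"user": "news_a", "text": "Notre Dame fire: spire collapses"},
    {"user": "news_b", "text": "Notre Dame fire: spire collapses"},
    {"user": "someone", "text": "Heartbreaking images from Paris tonight"},
]

unique_rows = drop_duplicate_texts(sample)
counts = word_counts(unique_rows)
print(len(unique_rows), counts.most_common(5))
```

The resulting counts are the kind of word frequencies a word-cloud tool would take as input.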
