========
Twhatter
========

A simple Python scraper for Twitter.

Motivation
----------

Twitter's API terms and conditions became very demanding in May 2018.

Inspired by other attempts, I have put together yet another Twitter scraper
that uses a simple HTTP client instead of the developer API, and can retrieve
any data that is accessible in an anonymous browsing session.

This is mostly an attempt to produce some clean, functional and maintainable
Python code. I have focused especially on a clean separation between data
retrieval, exploration of Twitter pages, and output, which makes it easy to
define and combine various crawling strategies and data formats.

And why that terrible name? Simple: "WHAT's going on TWITTER?" => TWHATTER!

Features
--------

At the moment, this utility only provides a command-line interface.

Anonymous client
****************

- Get any user's full timeline.
- Get any user's profile data.

Data output
***********

All scraped information can be:

* displayed on the terminal,
* stored in a JSON / YAML file,
* stored in a local database.

Installation
------------

Installation requires Python >= 3.6. ::

    $ pip install --user git+https://code.theenglishway.eu/theenglishway-corp/twhatter

You then have to ensure that ``~/.local/bin`` is in your ``$PATH``, or call
``~/.local/bin/twhatter`` instead of ``twhatter`` in the following examples.

Usage
-----

Display some user's tweets ::

    $ twhatter timeline realDonaldTrump --limit 40
    ...

Display their profile information ::

    $ twhatter profile realDonaldTrump
    User(id=25073877, fullname='Donald J. Trump', join_date=datetime.datetime(2009, 3, 18, 0, 0), tweets_nb=40183, following_nb=45, followers_nb=57144827, likes_nb=7)

Put them into a JSON/YAML file ::

    $ twhatter json timeline realDonaldTrump
    $ twhatter yaml profile realDonaldTrump

Put them into a local database (by default in ``/tmp/db.sqlite``) ::

    $ twhatter db timeline realDonaldTrump

Open a session on the local database and make queries with SQLAlchemy ::

    $ twhatter db shell
    In [1]: session.query(Tweet).all()
    Out[1]: [...]

Other scrapers that might fit your needs
****************************************

In Python:

* twint
* twitterscraper
* twitter-scraper

In JavaScript:

* scrape-twitter