For script 01 of 28 I decided to figure out the module Newspaper and build a quick n' dirty news reader for the terminal.
Run a command with a few arguments, spit out the headlines, maybe a summary with NLP.
Requirements for Newspaper
sudo apt-get install python-dev libxml2-dev libxslt-dev libjpeg-dev zlib1g-dev libpng12-dev
brew install libxml2 libxslt libtiff libjpeg webp little-cms2
Install Newspaper from Pip
sudo pip3 install newspaper3k
On my Raspberry Pi, it took an awfully long time to install the requirements. On my Mac the install was rather speedy. I'd recommend a VM or another box if you are in a hurry.
Download NLP related corpora:
curl https://raw.githubusercontent.com/codelucas/newspaper/master/download_corpora.py | python3
Hopefully you have a virtualenv set up. Install the OS requirements above first, then the Python 3 requirements:
pip install -r requirements.txt
**The app is set to cache previously seen articles, so if no new articles have been posted since the last run, you won't get any summaries.**
Currently it is set up to go to CNN's Tech page.
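The caching behaviour and the source page map onto Newspaper's `build()` call, which memoizes previously seen articles by default. Here is a minimal sketch of that idea, not the script's actual code: the URL, the `limit` parameter, and the helper names `build_source` and `summarize_source` are assumptions made for illustration.

```python
def build_source(url="https://www.cnn.com/tech", use_cache=True):
    """Build a Newspaper source for `url` (the URL is a placeholder assumption)."""
    # Imported lazily so summarize_source() can be used without the library loaded.
    import newspaper

    # memoize_articles=True (the library default) skips articles seen on earlier
    # runs; pass use_cache=False to get every article on every run.
    return newspaper.build(url, memoize_articles=use_cache)


def summarize_source(source, limit=3):
    """Return (title, summary) pairs for up to `limit` articles in `source`."""
    results = []
    for article in source.articles[:limit]:
        article.download()  # fetch the HTML
        article.parse()     # extract the title and body text
        article.nlp()       # compute keywords and summary (needs the NLP corpora)
        results.append((article.title, article.summary))
    return results
```

Calling `summarize_source(build_source(use_cache=False))` should return summaries even when nothing new has been posted, at the cost of repeating articles from earlier runs.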
Here is the expected output:
Some Articles for you:
WHAT WENT WRONG AT THE LOS ANGELES TIMES?
The turmoil has left staffers and Angelinos asking: How did the L.A. Times get here? (He will become chief content officer at Tronc, the Times' parent company). Those in the pro-Ferro camp counter that he has drastically increased Tronc's value, and therefore the value of the L.A. Times. Hartenstein was brought to the Times as publisher in 2008 by Sam Zell, the billionaire who infamously drove the Tribune Company into the largest bankruptcy in media industry history. Within days, reports emerged that Tronc was building a separate entity called the Los Angeles Times Network in what the guild feared was an attempt to bust up the union.
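Output in this shape can be produced from (title, summary) pairs with a small formatter. The sketch below is an assumption about the layout, inferred only from the sample above. `render_articles` is a hypothetical name, and the real script may format things differently.

```python
def render_articles(articles, header="Some Articles for you:"):
    """Format (title, summary) pairs as a header, then headline/summary lines."""
    lines = [header]
    for title, summary in articles:
        # Headlines are shown in upper case, as in the sample output.
        lines.append(title.upper())
        # Collapse any line breaks or repeated spaces so the summary
        # prints as a single paragraph.
        lines.append(" ".join(summary.split()))
    return "\n".join(lines)


sample = [("What went wrong?", "First sentence.\nSecond sentence.")]
print(render_articles(sample))
```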
In-depth knowledge of the Newspaper module.