We're going to use the pygooglenews package, which helps us get structured news articles from any Google News page.
Disclaimer: the NewsCatcher team created this Python package. If you want to know more about how it works, read this article:
PyGoogleNews package overview
In a nutshell, it exploits the fact that Google News data can also be accessed via RSS, including custom searches. In months of testing, Google did not block our IP even when we requested the RSS feed 100k+ times per day. I believe this is because RSS is designed to be consumed by machines. The feed is also a very lightweight page (about 30 kB, compared with 1 MB+ for the Google News UI).
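To make the RSS idea concrete, here is a minimal sketch of how such feed URLs can be built with the standard library alone. The URL layout (the `https://news.google.com/rss` root plus the `hl`, `gl` and `ceid` query parameters) is an assumption about how Google News feeds are addressed, and `build_rss_url` is a hypothetical helper, not part of pygooglenews.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed root of the Google News RSS feeds.
BASE_URL = "https://news.google.com/rss"


def build_rss_url(lang: str = "en", country: str = "US",
                  query: Optional[str] = None) -> str:
    """Build a feed URL for top news, or for a custom search if `query` is given."""
    path = BASE_URL
    params = {}
    if query is not None:
        path += "/search"
        params["q"] = query  # urlencode takes care of escaping
    params["hl"] = lang                   # interface language
    params["gl"] = country                # country edition
    params["ceid"] = f"{country}:{lang}"  # edition identifier
    return f"{path}?{urlencode(params)}"


print(build_rss_url())
print(build_rss_url(query="NFT art"))
```

Any RSS reader or feed-parsing library can then fetch the resulting URL; pygooglenews wraps this step so you do not have to assemble the URLs yourself.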
What data you can access with pygooglenews
- Top news
- News articles by topic (business, politics, etc)
- News articles by town, country, location
- News by your custom search
pip install pygooglenews --upgrade
1. Top Google News articles
from pygooglenews import GoogleNews

# default GoogleNews instance
gn = GoogleNews(lang='en', country='US')

top_news = gn.top_news()
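Going by the package README, the call returns a dictionary with a `feed` key and an `entries` list, where each entry exposes fields such as `title`, `link` and `published`. Here is a minimal sketch of reading those fields, assuming that layout. `summarize_entries` is a hypothetical helper, and the sample data below is a made-up stand-in so the snippet runs without a network call.

```python
from typing import Any, Dict, List


def summarize_entries(result: Dict[str, Any], limit: int = 5) -> List[str]:
    """Return 'published | title | link' lines for the first `limit` entries."""
    lines = []
    for entry in result.get("entries", [])[:limit]:
        # Field names are assumed; fall back gracefully if one is missing.
        published = entry.get("published", "unknown date")
        title = entry.get("title", "(no title)")
        link = entry.get("link", "")
        lines.append(f"{published} | {title} | {link}")
    return lines


# Made-up stand-in for the dictionary returned by gn.top_news().
sample = {
    "feed": {"title": "Top stories - Google News"},
    "entries": [
        {"title": "Example headline", "link": "https://example.com/a",
         "published": "Mon, 01 Jan 2024 10:00:00 GMT"},
        {"title": "Another headline", "link": "https://example.com/b"},
    ],
}

for line in summarize_entries(sample):
    print(line)
```

With the real package you would pass `top_news` instead of `sample`.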
To learn more about the supported languages and countries, check here.
2. Google News articles by topic
Accepted topics are: WORLD, NATION, BUSINESS, TECHNOLOGY, ENTERTAINMENT, SCIENCE, SPORTS, HEALTH.
from pygooglenews import GoogleNews

# default GoogleNews instance
gn = GoogleNews(lang='en', country='US')

business = gn.topic_headlines('BUSINESS')
In addition to these preset topics you may also parse custom ones, such as "COVID-19". Check more in this part of the documentation.
3. Google News articles by geolocation
from pygooglenews import GoogleNews

# GoogleNews instance for Ukrainian-language news in Ukraine
gn = GoogleNews(lang='uk', country='UA')

kyiv = gn.geo_headlines('kyiv')
# or
kyiv = gn.geo_headlines('kiev')
# or
kyiv = gn.geo_headlines('киев')
# or
kyiv = gn.geo_headlines('Київ')
All four options above return the same news feed about Kyiv, Ukraine. Google News will "autoparse" the place name. It also seems to be language agnostic, but that doesn't mean a feed for every place is available in every language.
4. Google News articles by your custom search
from pygooglenews import GoogleNews

# default GoogleNews instance
gn = GoogleNews(lang='en', country='US')

# find all latest news about NFT
s = gn.search('NFT')
Here you can pass any keywords you want.
pygooglenews handles all the URL escaping that Google News requires.
Some advanced search parameters that you might want to add (check this part of the documentation):
- restrict search to some particular date
- exclude/include keywords
- exact match
- search for keywords to be present in the title
Check the advanced examples for a better understanding.
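As a rough illustration of the exclude, exact-match and title options, here is a minimal sketch that assembles a query string from Google-style search operators (`-word`, `"exact phrase"`, `intitle:`). `build_query` is a hypothetical helper, not part of pygooglenews, and the operator syntax is an assumption you should check against the package documentation.

```python
from typing import Iterable, Optional


def build_query(keywords: str,
                exact: Optional[str] = None,
                exclude: Iterable[str] = (),
                in_title: Optional[str] = None) -> str:
    """Combine keywords with optional search operators into one query string."""
    parts = [keywords]
    if exact:
        parts.append(f'"{exact}"')                    # exact phrase match
    parts.extend(f"-{word}" for word in exclude)      # excluded keywords
    if in_title:
        parts.append(f"intitle:{in_title}")           # keyword must be in the title
    return " ".join(parts)


query = build_query("boeing", exclude=["airbus"], in_title="737")
print(query)
```

The resulting string can be passed to `gn.search(query)`. As far as the package README goes, `search` also accepts a `when` argument (for example `'1h'`) and `from_`/`to_` date strings for restricting the date range, but treat those names as something to verify in the documentation.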
If you liked this post, or you're using our package, please just share this blog post! This will help us get better SEO.