
Episode 503: Diarmuid McDonnell on Web Scraping

Software Engineering Radio - the podcast for professional software developers


Web Scraping - Data Quality Challenges

Data quality is an area social scientists have a lot of experience with. Dealing with missing observations and duplicates, that's usually not problematic. What can be quite difficult is the updating of websites. Depending on the web page you're interested in, there'll be some clues about how often the web page actually updates. There are definite times where older web pages become obsolete. It's back to the detective work of frequently checking in on your scraper, making sure that the website is working as intended and looks as you expect, and making any necessary changes to your scraper.
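
As a rough illustration of that kind of routine check, here is a minimal Python sketch using only the standard library. It is not the approach described in the episode. The function names, the element ids, and the two page snapshots are all hypothetical, and a real scraper would fetch the pages rather than hard-code them. The idea is to fingerprint a page so an update can be noticed, and to confirm that the elements the scraper depends on are still present.

```python
import hashlib
from html.parser import HTMLParser


class _IdCollector(HTMLParser):
    """Collects every id attribute found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def missing_ids(html, expected_ids):
    """Return the expected element ids that no longer appear in the page."""
    collector = _IdCollector()
    collector.feed(html)
    return set(expected_ids) - collector.ids


def page_fingerprint(html):
    """Hash the page so a later fetch can be compared to detect any update."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


# Hypothetical snapshots of the same page on two different days.
old_page = '<div id="charity-name">Example</div><span id="income">100</span>'
new_page = '<div id="org-name">Example</div><span id="income">100</span>'

# The page content changed between snapshots.
changed = page_fingerprint(old_page) != page_fingerprint(new_page)
# An id the scraper relies on has disappeared, so its selectors need updating.
broken = missing_ids(new_page, ["charity-name", "income"])
print(changed, broken)  # True {'charity-name'}
```

A changed fingerprint only says that something on the page is different. The missing-id check is what signals that the scraper itself probably needs attention.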
