Episode 503: Diarmuid McDonnell on Web Scraping

Software Engineering Radio - the podcast for professional software developers

CHAPTER

Web Scraping - Data Quality Challenges

Data quality is an area social scientists have a lot of experience with. Dealing with missing observations and duplicates is usually not problematic. What can be quite difficult is the updating of web sites. Depending on the web page you're interested in, there'll be some clues about how often the web page actually updates. There are definite times when older web pages become obsolete. So it's back to the detective work of frequently checking in on your scraper: making sure that the website is working as intended and looks as you expect, and making any necessary changes to your scraper.
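The routine check described above could be sketched roughly as follows. This is only an illustration and not something taken from the episode: the function names `missing_markers` and `page_fingerprint`, and the example element id and class, are hypothetical. It assumes you already have the page's HTML as a string, and it uses only the Python standard library.

```python
import hashlib
from html.parser import HTMLParser


class _IdClassCollector(HTMLParser):
    """Collects every id and class attribute value seen in a page."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "id":
                self.ids.add(value)
            elif name == "class":
                # A class attribute may hold several space-separated names.
                self.classes.update(value.split())


def missing_markers(html, expected_ids=(), expected_classes=()):
    """Return the structural markers the scraper relies on that are absent.

    An empty list suggests the page still looks as expected; anything else
    is a hint that the site was redesigned and the scraper needs attention.
    """
    collector = _IdClassCollector()
    collector.feed(html)
    collector.close()
    missing = [f"#{i}" for i in expected_ids if i not in collector.ids]
    missing += [f".{c}" for c in expected_classes if c not in collector.classes]
    return missing


def page_fingerprint(html):
    """Hash the raw HTML so two snapshots can be compared cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


# Hypothetical snapshots of the same page before and after a redesign.
old_page = '<div id="results"><p class="charity-name">A</p></div>'
new_page = '<div id="listing"><p class="org-name">A</p></div>'

print(missing_markers(old_page, ["results"], ["charity-name"]))
print(missing_markers(new_page, ["results"], ["charity-name"]))
print(page_fingerprint(old_page) != page_fingerprint(new_page))
```

Comparing fingerprints between runs gives a rough clue about how often a page changes, while the marker check flags the changes that are likely to break the scraper. A hash of the raw HTML changes on any edit at all, including ones that do not matter, so in practice you might hash only the part of the page you extract.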
