Follow Kevin! https://twitter.com/SahinKevin
Check out ScrapingBee: https://www.scrapingbee.com/
AUTOMATED TRANSCRIPT
Colleen Schnettler 0:00
This episode of Software Social is sponsored by Hey Check It. Does your website performance keep you up at night? The creators behind Hey Check It started it for this very reason: peace of mind about their sites and the sites they manage. Hey Check It is a website performance monitoring and suggestion tool focused on SEO, accessibility, uptime, site speed, and content. It includes AI-generated SEO data, spelling and grammar checking, custom sitemaps, and a number of other tools. If you're managing multiple websites, check out their agency plans with public-facing dashboards to meet your clients' needs. Start a free trial today at HeyCheckIt.com
Michele Hansen 0:39
Hey, welcome back to Software Social. We're doing another interview this week. I am so excited to have Kevin Sahin with me. He is co-founder of ScrapingBee. Kevin, welcome to Software Social.
Kevin Sahin 0:57
Well, thank you, Michele, I'm excited to be here.
Michele Hansen 1:01
So this kind of came about because I was on Twitter, as I often am. And I noticed someone tweeted about MicroConf Europe, which I had been really wanting to go to, but it conflicted with a friend's wedding, so we couldn't go. So I was just sort of following and watching everything unfold on Twitter, and someone tweeted about how Pierre, your co-founder, was giving a talk. And he mentioned how ScrapingBee offered free API credits to customers who were willing to jump on a 15-minute call with them, and you would ask them questions like, "What else have you tried?" My interest immediately perked up, and I really wanted to talk to you about those calls you had, what you learned from them, and what that added for the business. But before we jump into that, perhaps you could say for a moment just what ScrapingBee is.
Kevin Sahin 2:09
Sure. So basically, ScrapingBee is an API for web scraping. When you are extracting data from the web, you often run into the same two problems. First, more and more websites are using JavaScript frameworks like Vue.js, React, etc., so you have to render the page inside a web browser. And this is a pain to manage, especially at scale, because you need lots of DevOps skills, you need big servers, you need lots of things. So it's really handy to have, you know, a headless browser accessible with a simple API call. The other thing that you have to do when you scrape the web at scale is manage proxies. You probably need proxies for many different reasons. For example, let's say that you are extracting data from e-commerce websites. Well, most e-commerce websites are internationalized, meaning that if you access the website from an IP address in Europe, you will see prices in euros, and if you access the website from an IP address in the US, you will see prices in dollars. So you need some kind of proxy management system. The other thing is IP rate limits. Some websites limit the number of pages you can access per day from a single IP address; if you need to access more pages, you need more IP addresses, etc. And so we bundled all of this inside a single API, which is ScrapingBee.
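To make that concrete, here is a minimal sketch of what such an API call can look like. It assumes ScrapingBee's documented api_key, url, render_js, and country_code query parameters; the API key and target URL are placeholders, not real values.

```python
# Minimal sketch: fetch a JavaScript-heavy page through ScrapingBee,
# rendering it in a headless browser and routing through a US proxy.
# Parameter names (render_js, country_code) follow ScrapingBee's public
# docs as understood here; verify against the current documentation.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": API_KEY,
        "url": "https://example.com/product/123",  # hypothetical target page
        "render_js": "true",   # render the page in a headless browser
        "country_code": "us",  # US proxy, e.g. to get prices in dollars
    },
    timeout=60,
)
response.raise_for_status()
html = response.text  # fully rendered HTML, ready for parsing
```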
Michele Hansen 4:04
So I love how you're solving that, because we have felt that pain personally. I've kind of talked a little bit in the past about how my husband Mathias's project, well, not his first, but the one right before Geocodio that basically funded Geocodio, was this mobile app called What's Open Nearby, where you could open it up and see grocery stores, convenience stores, and coffee shops that were open near you. And how we ran that on the back end was we had a ton of scrapers running against, like, grocery store websites, you know, Starbucks, whatever, scraping the hours off of them. And just all the time there were issues, you know, the parsers breaking, or you get blocked. Actually, the sort of recent side project we did, Keren, which allowed people to get an alert when a grocery pickup slot opened up on a grocery store's website because of COVID and everything, that was also powered by scrapers, basically, on the back end. And so I have personally felt the pain, you know, the impacts when scraping goes wrong. It can get frustrating at times.
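The failure modes Michele mentions, parsers breaking and getting blocked, are easy to picture in a store-hours scraper. A minimal sketch, with a hypothetical CSS selector invented purely for illustration:

```python
# Minimal sketch of a fragile store-hours scraper. The selector
# "div.store-hours" is hypothetical; when a site changes its markup,
# select_one() returns None and the parser is effectively "broken".
import requests
from bs4 import BeautifulSoup

def fetch_hours(url: str):
    resp = requests.get(url, timeout=30)
    if resp.status_code in (403, 429):
        # Blocked or rate-limited: the other failure mode mentioned above.
        raise RuntimeError(f"Blocked by {url}: HTTP {resp.status_code}")
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    node = soup.select_one("div.store-hours")  # hypothetical selector
    return node.get_text(strip=True) if node else None  # None = markup changed
```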
Kevin Sahin 5:29
Yeah, I mean, the story behind ScrapingBee is that we personally experienced some of those frustrations, because Pierre and I, before launching ScrapingBee, started our careers in two different startups that were heavily relying on web scraping in their business. I was working at a startup in France which is kind of a mix between Mint.com in the US and Plaid.com. So for those who don't know, Plaid.com is an API that allows third parties to access your bank account, and Mint.com is a bank account aggregation, personal finance management app. And so at this startup, I was really exposed to all of these issues. And Pierre, he was working for a real estate data startup in France, and they were relying on scraping lots of real estate portals. So we both, you know, experienced lots of these issues regarding how to handle headless browsers, how to handle proxies, how to handle blocks, etc. So that was something we knew a little about.
Michele Hansen 7:16
I love how you started with a pain that you had. But also, as you've run the business, you've been actively reaching out to your customers to understand what they were trying to do, what problems they were having, and how they were solving those problems. So I wonder if you can take us back: how did those emails come about where you were reaching out to people? What kind of prompted that?
Kevin Sahin 7:47
Yeah. So when I say that we knew a little about it, it's not a euphemism, because we really knew only a little about the different web scraping use cases. From day one, when we launched the API, we realized that some users had use cases that we never imagined. So we quickly realized that we had to get them on the phone and learn more about it: understand their businesses, what kind of data they needed, at what frequency, for what use case, etc. But the problem we had at the beginning was that we had a banner on the dashboard saying that if they had any question, they could schedule a call with me. But nobody was scheduling any calls. Maybe the copy wasn't great, maybe the CTA wasn't clear, I don't know. But the fact is, nobody was getting on a call with me. We also had an email sequence with a few links to my Calendly, but it wasn't working either. Sometimes we had a trial user scheduling a call, but not a lot. And then we had this idea of offering 10x more free API calls than the trial offered. And then instantly, we started to get a lot of calls. So many that I had to, you know, remove some availability in my week, because I was just doing calls every day, all day. And it was great because we learned so much. I mean, we learned about so many different use cases that we never thought about. We had so many diverse people. So for example, university resea...