Running in Production

Nick Janetakis - Full stack developer

Hear about how folks are running their web apps in production. We'll cover tech choices, why they chose them, lessons learned and more.

Episodes

Jun 29, 2020 • 46min

A Real Estate Order and Appraisal System for a Small Business

In this episode of Running in Production, Austin Lewis goes over replacing an Excel sheet with a custom / internal Django app to manage his real estate business. It’s been up and running on the AWS free tier since April 2020. It has processed over 300 orders in the few months it’s been up and Austin is the sole developer of this project. It is one of the first apps he’s deployed.

Topics Include

6:40 – Motivation for using Django and Python and taking advantage of Django’s admin
9:22 – Breaking down how the app is structured as a monolith and a few helpful libraries
15:28 – Having the foresight to upload files to S3 while having only 1 production EC2 server
17:24 – Sticking with Django templates and sprinkles of JavaScript to avoid yak shaving
20:06 – Using Docker / Docker Compose with PostgreSQL and Traefik
25:26 – Recap of AWS services (free tier) and setting up the EC2 servers
27:07 – It’s very helpful to deploy your app early and to also use Docker
30:25 – Covering the deploy process, the value in testing and secret management
36:01 – Using Mailgun for sending email and Sentry for error reporting
41:36 – Planning for disaster by letting RDS handle backups
43:05 – Best tips? Keep learning and just get something up and running
45:10 – You can find Austin on GitHub or contact him by email

Links

📄 References
https://en.wiktionary.org/wiki/yak_shaving

⚙️ Tech Stack
django, python, aws, bootstrap, docker, jquery, lets-encrypt, mailgun, postgres, rds, route53, s3, sentry, traefik, ubuntu

🛠 Libraries Used
https://github.com/shivanshs9/pdfgen-python
https://github.com/jschneier/django-storages
https://gunicorn.org/
https://github.com/pytest-dev/pytest

Support the Show

This episode does not have a sponsor and this podcast is a labor of love. If you want to support the show, the best way to do it is to purchase one of my courses or suggest one to a friend.

Dive into Docker is a video course that takes you from not knowing what Docker is to being able to confidently use Docker and Docker Compose for your own apps. Long gone are the days of "but it works on my machine!". A bunch of follow along labs are included.

Build a SAAS App with Flask is a video course where we build a real world SAAS app that accepts payments, has a custom admin, includes high test coverage and goes over how to implement and apply 50+ common web app features. There's over 20 hours of video.

Jun 22, 2020 • 1h 24min

Passiv Is a Portfolio Management and Automation Platform

In this episode of Running in Production, Brendan Wood talks about building a portfolio management platform with Django and Python. It’s been running in production since mid 2017 and is hosted on DigitalOcean. There are 3,000+ active users and overall they are responsible for managing hundreds of millions of dollars in funds for their users.

Topics Include

3:13 – It started as a 50 line Python script that replaced an Excel sheet
10:49 – Motivation for using Django, Python, NumPy and creating a monolithic app
15:38 – Eventually decommissioning a legacy version of the back-end over time
19:00 – There’s 33,000+ lines of back-end code, including tests
22:24 – There’s a clean split between the back-end API and the TypeScript React front-end
30:52 – The entire front-end is open source on GitHub
32:13 – It’s hosted on DigitalOcean w/ Ubuntu 18.04, PostgreSQL, Redis, Celery and nginx
39:08 – There’s ~5 seconds of down time per deploy which is done outside of trading hours
46:00 – Everything runs on a single server + a managed PostgreSQL DB (with replicas)
48:20 – Ansible is being used to configure the server
55:22 – Getting code from dev to production in a few minutes with git and a deploy script
1:01:07 – Brendan’s philosophy on starting a business is to do things when you need to do them
1:02:58 – Logging, email alerts and using Stripe to handle payments
1:08:35 – Handling disasters and other unexpected events with backups and alerts
1:16:19 – Best tips? Use the tools that you know unless you have a compelling reason not to
1:19:27 – Setting up a customer support system only after they had a need for it
1:21:39 – Check out https://getpassiv.com/

Links

📄 References
https://en.wikipedia.org/wiki/Exchange-traded_fund

⚙️ Tech Stack
django, python, react, ansible, digitalocean, lets-encrypt, nginx, postgres, redis, stripe, supervisor, ubuntu, webpack

🛠 Libraries Used
https://numpy.org/
https://github.com/encode/django-rest-framework
https://github.com/celery/celery
https://gunicorn.org/
https://github.com/kakulukia/django-secrets
https://github.com/dj-stripe/dj-stripe

Jun 15, 2020 • 36min

Determine What Your Toilet Paper Supply Is Based on Your Usage

In this episode of Running in Production, Ben Sassoon goes over building a site that helps you figure out how much toilet paper you have left. It’s a static site using pure HTML. It’s been running in production since March 2020 and it’s hosted on GitHub Pages. The site has had over 10 million visitors and was featured on various cable TV news outlets and talk shows. The MVP was built as a joke for his friends in about 20 minutes.

Topics Include

6:47 – A very simple static site let him spin up an MVP in about 20 minutes
8:00 – GitHub embraced his service, even though he surpassed the GH Pages traffic limit
10:26 – Making the site mobile friendly using BrowserStack and Polypane
12:59 – There’s not even a static site generator being used, it’s pure HTML
14:14 – Ezoic helped quickly add Google AdSense to the page
19:30 – Getting 300-400 donations and $5,000 / day from ads during its peak
22:01 – The core of the site is about 6 lines of vanilla JavaScript
27:32 – The process of transferring a domain name / site to another person
30:23 – A DNS mystery caused a bit of down time at one point
32:37 – Ben’s workflow for pushing code from development to production
34:17 – Going from basically no traffic to millions of visitors in a short period of time
35:27 – You can find Ben on Twitter at @bensassoon

Links

📄 References
https://education.github.com/pack
https://www.browserstack.com/
https://polypane.app/
https://empireflippers.com/

⚙️ Tech Stack
static-site, bootstrap, cloudflare, github-pages, namecheap, weglot

🛠 Libraries Used
https://www.ezoic.com/
https://ko-fi.com/
https://icons8.com/
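The 22:01 topic says the core of the site is only a few lines of arithmetic. The show notes don't give the actual formula, so the following is only a rough sketch of what such a calculation could look like, written in Python rather than the site's JavaScript. The function name and every parameter are assumptions made for illustration.

```python
def days_of_supply(
    rolls: float,
    sheets_per_roll: int,
    sheets_per_visit: int,
    visits_per_day: float,
    people: int = 1,
) -> float:
    """Estimate how many days a toilet paper supply lasts (hypothetical formula)."""
    # Total sheets the household uses per day.
    daily_sheets = sheets_per_visit * visits_per_day * people
    if daily_sheets <= 0:
        raise ValueError("daily usage must be positive")
    # Total sheets on hand divided by daily usage.
    return rolls * sheets_per_roll / daily_sheets
```

For example, 10 rolls of 200 sheets at 10 sheets per visit and 4 visits a day for one person works out to 2000 / 40 days.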

Jun 8, 2020 • 1h 8min

Confectionery Connect Is an E-commerce Video Course Marketplace

In this episode of Running in Production, Sean Parsons goes over building an e-commerce video course marketplace to sell confectionery goods with Django and Python. It’s been running in production since December 2019 and it’s hosted on AWS. The app has roughly 100k lines of code and was solo developed part time over about 3 months before shipping an MVP.

Topics Include

3:00 – Modifying an existing e-commerce library called Saleor
7:20 – Figuring out how to pay out instructors fairly based on activity
10:04 – Picking Django, avoiding burnout and splitting the code into ~15 Django apps
20:49 – Celery is being used extensively, along with Celery Beat
25:10 – Stripe as a payment gateway was a natural fit given their subscription model
29:44 – It is a server rendered site with Django templates, except for the video player
35:26 – Turns out using Amazon’s video encoding service is expensive, so Sean uses ffmpeg
38:48 – High level overview about the rest of the tech stack
42:21 – Using Fabric to deploy to a single EC2 instance
45:00 – Going over the deploy process from development to production
50:08 – Benefits of switching to a compute optimized C5n.large EC2 instance
1:00:46 – Handling disasters and unexpected events
1:04:39 – Best tips? Pick the tool you’re the most productive with and ship something
1:07:03 – They’re on Instagram with a new account name of ZenVur

Links

📄 References
https://en.wikipedia.org/wiki/Confectionery
https://transferwise.com/us
https://www.youtube.com/watch?v=8hY6DSSVvYw (Etsy talk on deployment)

⚙️ Tech Stack
django, python, aws, cloudfront, cloudwatch, docker, elasticache, postgres, rds, redis, route53, s3, statuscake, stripe, supervisor, ubuntu

🛠 Libraries Used
https://github.com/mirumee/saleor
https://github.com/deschler/django-modeltranslation
https://github.com/celery/celery
https://flower.readthedocs.io/en/latest/
https://videojs.com/
https://gunicorn.org/
https://www.fabfile.org/
https://github.com/antonagestam/collectfast

Jun 1, 2020 • 1h 14min

Zego Lets You Easily Buy Insurance by the Hour

In this episode of Running in Production, Stuart Kelly lets us know what it’s like to build an insurance company from scratch with Django and Python. It’s been running in production since early 2017 and they’ve issued out 290+ million hours of insurance so far. It’s hosted on AWS. Stuart covers building an MVP in 8 weeks, using Stripe with SCA, creating 25+ Django apps over time, working with a GraphQL API back-end, querying 45+ million DB rows quickly, making app deploys a pleasant experience for his team, achieving 99.99% uptime and so much more.

Topics Include

2:02 – Shipping an MVP insurance company in 8 weeks with little insurance knowledge
3:53 – React Native was used to build mobile apps after a demand for it was seen
4:46 – Motivation for using Django and Python to build this site
6:15 – The Django admin is used for simple config changes and CRUD operations
6:59 – Examples of when they needed to roll their own admin UI due to added complexity
8:41 – Stripe is being used to handle the payments with SCA support
11:41 – How do you even start an insurance company?
13:32 – It’s a monolithic app broken up by Django apps which is a nice way to break things up
15:15 – Django apps are a nice stepping stone to maybe microservices due to easy refactors
16:18 – What type of Django apps do you have to power your site? There’s 25+ of them
17:36 – Not every Django app would end up being its own service in the future
18:10 – The MVP didn’t start off with this many apps, it grew organically over time
18:46 – Which microservices would you tease out later if it came down to it?
20:28 – The split up services would end up having their own dedicated databases too
22:42 – The back-end is powered by a GraphQL API
23:37 – Using an API back-end came from realizing they are building a platform not an app
24:39 – Hole in one insurance isn’t offered, but they did offer rocket launcher insurance
25:46 – Graphene is used on the Python side of things and it works nicely with Django models
26:04 – On the front-end Relay is being used, but in hindsight maybe Apollo would be better
27:40 – The front-end is about 500,000 lines of code (not including node_modules)
27:53 – The back-end is about 300,000 to 350,000 lines of code
28:44 – There’s about 40-50 top level dependencies in the requirements.txt file
29:54 – PostgreSQL is used through RDS on AWS, along with a RedShift cluster
30:34 – What is RedShift and how does it help make certain queries much faster?
32:43 – They don’t connect to RedShift through Django’s ORM but you do write SQL
33:34 – Their financial reconciliation engine has 40-50 million rows and queries are fast
34:12 – Celery, Redis, Kubernetes, AWS Lambda, oh my!
34:52 – There’s 3-5 web app servers but up to 24 background workers
36:08 – Payment handling doesn’t need to happen live as a driver is working
37:41 – A majority of things are running on t3 EC2 instances
38:24 – Steps taken to safely go from 1 background worker to running many of them
40:50 – One mistake they made early on was not having idempotent worker tasks
41:52 – Having zero down time deploys with AWS CodeDeploy, but migrations are tricky
44:41 – The infrastructure is managed with Terraform, Stuart knows enough to be dangerous
47:12 – Trusting your developers to do reviews is important, along with having tests
48:41 – There’s a few different environments, such as QA which is after a dev pushes code
49:31 – Moving from a git flow model to doing PRs that get merged to a deployed master
50:55 – Every pull request that comes in gets a sub-domain that can be directly accessed
51:33 – Feature flags are sometimes used, but not with a dedicated library or framework
52:55 – Secrets are managed using AWS’ Parameter Store
53:45 – The EC2 instances are spun up using pre-baked AMIs, except for the code itself
55:11 – They pay somewhere between $10,000 and $50,000 a month on hosting
55:46 – How they went from $3,000 to $3 a month from making a database backup change
57:21 – Cloudflare is used as their CDN, DNS host, anti-DDoS and SSL certificate service
58:06 – The imgix service is used to do on the fly image resizing and optimizations
58:31 – Cloudflare is a solid service and competitively priced
58:54 – The JavaScript payload for the front-end is about 1MB after being gzipped
1:00:29 – The Next.js library is used to do server side rendering initially
1:00:56 – Mailgun is used for sending emails and Twilio is used for sending text messages
1:01:40 – Sentry.io (hosted version) captures all of their errors with loads of integrations
1:02:11 – DataDog is used for alerting, APM metrics and logging
1:03:54 – It’s valuable to have your metrics and logging on 1 service
1:04:22 – Various alarms and alerts get sent through DataDog
1:04:42 – Health checks are done with Django Health Check, and they query the DB in it
1:06:02 – So far in 2020 they’re operating at 99.99% uptime which is quite the feat
1:06:46 – Checking your database in your health check is totally worth it
1:07:44 – There’s not many live tests that happen in production due to the nature of the app
1:08:53 – Best tips? Release as often as you can and invest in your release process
1:10:03 – That’s also been the biggest pain point as they scaled up to a larger dev team
1:11:19 – Database migrations are run on every deploy
1:12:33 – Check out https://zego.com, their open source work or email Stuart for questions

Links

📄 References
https://reactnative.dev/
https://en.wikipedia.org/wiki/Underwriting
https://relay.dev/
https://www.apollographql.com/
https://en.wikipedia.org/wiki/Column-oriented_DBMS
https://www.stitchdata.com/
https://fivetran.com/
https://blog.getdbt.com/what--exactly--is-dbt-/
https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html
https://www.imgix.com/solutions/resizing-and-cropping

⚙️ Tech Stack
django, python, react, aws, cloudflare, codedeploy, datadog, graphql, mailgun, postgres, rds, redis, sentry, sns, sqs, stripe, terraform, twilio

🛠 Libraries Used
https://docs.graphene-python.org/projects/django/en/latest/
https://github.com/celery/celery
https://github.com/joealcorn/laboratory
https://github.com/KristianOellegaard/django-health-check
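The 40:50 topic mentions idempotent worker tasks. The show notes don't describe how Zego implemented them, so this is only a minimal sketch of the general idea: a task that is safe to run twice because it records a unique key before applying its side effect. `PaymentLedger`, `process_shift_payment` and the key format are hypothetical names, and the dictionary stands in for a database table with a unique constraint.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PaymentLedger:
    """In-memory stand-in for a table with one row per idempotency key."""

    charges: Dict[str, int] = field(default_factory=dict)

    def charge_once(self, idempotency_key: str, amount_pence: int) -> bool:
        """Apply a charge unless this key was already processed.

        Returns True when the charge was applied, False for a repeat.
        """
        if idempotency_key in self.charges:
            return False  # a retry or duplicate delivery changes nothing
        self.charges[idempotency_key] = amount_pence
        return True


def process_shift_payment(ledger: PaymentLedger, shift_id: int, amount_pence: int) -> bool:
    """Worker task body that can safely run more than once per shift."""
    return ledger.charge_once(f"shift:{shift_id}", amount_pence)
```

With many background workers and at-least-once delivery, the same task can arrive twice, so keying the side effect on something stable (here the shift ID) is what keeps a retry from double charging.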

May 25, 2020 • 50min

Building a Site Around Thousands of Diary Entries from Samuel Pepys

In this episode of Running in Production, Phil Gyford goes over building a community around 9+ years of diary entries from Samuel Pepys. The site was built with Django. It gets about 150k+ page views a month and has been up and running since 2002. It’s currently hosted on Heroku. Phil talks about being in the sweet spot in terms of engagement while not being under high load, rewriting the platform with Django as a monolith, how Heroku helps him get it all up and running without needing to bother with servers and much more. The site is open source.

Topics Include

1:21 – Who is Samuel Pepys and why a weblog is a natural fit for this site
2:26 – John Carmack had daily write ups in the mid-1990s
3:35 – It gets about 150,000+ page views a month with 30,000+ users
4:39 – The site is more than just weblog entries, there’s 88k+ user comments
6:34 – It’s the sweet spot of engagement between popular but not crazy popular
7:05 – Motivation for using Django and Python after using Movable Type for 9 years
9:03 – Deadlines are a great way to ensure you abort the idea of perfect and release it
9:26 – Django was enjoyable to use, and Phil thought about using Rails and PHP too
11:17 – We live in a really nice time where we have so many good choices for web frameworks
12:23 – It’s a monolithic app with about 12,000 lines of Python across 200 files
12:53 – It’s split into a bunch of Django apps, here’s a few
13:45 – The idea of using apps to organize your code is a great idea
14:43 – This whole site is open source on GitHub, you can use it as a learning resource
16:08 – How new entries make their way onto the site (spoiler alert: it was laborious)
19:21 – The site uses server rendered Django templates with sprinkles of JavaScript
19:43 – Tiny bit of JS for things like maps (Leaflet) and charts (D3.js)
20:19 – Server rendered templates are simple and fast, it’s a great combo
21:21 – It runs on Heroku with PostgreSQL and a bit of caching with Redis
21:43 – The site runs on one $7 / month “Hobby” Dyno and it’s more than enough
23:43 – There’s full text search using Django’s built in PostgreSQL search features
26:12 – Django 3.0 powers the site as of today and Phil likes to keep it up to date
27:54 – If you postpone updating your dependencies for too long it can get painful
28:48 – What are you caching? Everything! At least for anonymous users
31:26 – The PostgreSQL database runs off the $9 / month Heroku add-on
32:54 – Have you ever thought about spinning up your own server?
35:17 – If you don’t like the idea of managing your own servers, Heroku can be decent
37:27 – Heroku handles issuing SSL certificates for you for free
38:13 – Sentry is used for error handling through the Heroku add-on
39:14 – Errors coming in are pretty rare
40:04 – Phil’s site holds its own in terms of SEO, even against Wikipedia
42:51 – Heroku handles backing up the database once a day, and Phil backs it up to S3 too
43:49 – He also uses S3 to store some of the static files, such as uploaded blog post images
45:15 – django-storages is used to handle uploading to S3
46:52 – Best tips? Start simple and grow it from there, writing any code is important
48:22 – Maybe using an app generator isn’t worth it, unless you make a lot of new apps
49:45 – You can find Phil on Twitter, he also has his own site at https://www.gyford.com/

Links

📄 References
https://www.gutenberg.org/
https://github.com/ESWAT/john-carmack-plan-archive/tree/master/by_day
https://en.wikipedia.org/wiki/Movable_Type
https://docs.djangoproject.com/en/3.0/ref/contrib/postgres/search/
https://devcenter.heroku.com/articles/django-assets

⚙️ Tech Stack
django, python, aws, bootstrap, heroku, open-source, postgres, redis, s3, sentry

🛠 Libraries Used
https://github.com/django/django-contrib-comments
https://d3js.org/
https://leafletjs.com/
https://django-storages.readthedocs.io/en/latest/
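The 28:48 topic mentions caching everything for anonymous users. The sketch below illustrates that idea in plain Python and is not the site's actual Django or Redis setup: pages are rendered once per path for logged-out visitors and rendered fresh for logged-in ones. `cache_for_anonymous` and its arguments are hypothetical names, and expiry is left out to keep the example short.

```python
from typing import Callable, Dict


def cache_for_anonymous(render: Callable[[str], str]) -> Callable[[str, bool], str]:
    """Wrap a page renderer so anonymous requests are served from a cache."""
    cache: Dict[str, str] = {}  # path -> rendered page (no expiry in this sketch)

    def view(path: str, is_authenticated: bool) -> str:
        if is_authenticated:
            # Logged-in users may see personalised content, so skip the cache.
            return render(path)
        if path not in cache:
            cache[path] = render(path)
        return cache[path]

    return view
```

The appeal of this split is that most traffic on a content site is anonymous, so most requests never reach the database.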

May 18, 2020 • 1h 21min

Mux Is an API Based Platform That Lets You Process and Stream Videos

In this episode of Running in Production, Dylan Jhaveri talks about building an API driven video platform called Mux. It uses Phoenix, Elixir and Go to handle billions of video views a month. It’s hosted on AWS and GCP with Kubernetes and has been up and running since early 2016. Dylan covers how video streaming works, processing billions of events a month, taking advantage of Elixir and Phoenix’s features, providing a zero downtime public API, continuously deploying their products, working with massive databases, metered billing and tons more.

Topics Include

1:14 – How online streaming video works with HLS and where Mux fits into the picture
7:51 – Mux lets you post a video to their API and they give you an HLS playback URL
8:24 – Mux has been up and running since January 2016 and went through YCombinator
8:37 – Mux Data is another service they offer, it’s like New Relic but for video data
12:04 – They process billions of video views per month through Mux Data
12:36 – You could use Mux as a lower level alternative to Vimeo or Wistia
13:33 – Sometimes embedding iframes can be problematic and Mux can help in this area
14:35 – About 45 people work at Mux and half are involved with engineering
15:03 – Motivation for using Phoenix and Elixir, even when they were very new tools
16:52 – Their main public API is an out of the box Phoenix app
17:52 – They have a real-time dashboard that is powered by websockets and channels
20:28 – Some of Mux’s customers have millions of concurrent video views through that
20:42 – Will you switch to using Live View? Probably not since they are so API driven
21:51 – A dozen or so Go microservices and Kafka handle processing the videos
23:25 – Go is a great fit for super CPU intensive tasks such as video encoding
24:03 – The video processing infrastructure was very well thought out early on
24:50 – The public API is RESTful and there’s ~40-50 endpoints with a few private endpoints
26:14 – Cookie based auth is done in a browser but there’s tokens for API access
26:47 – The exq library is used for processing jobs asynchronously in Elixir land
27:22 – exq runs within a supervisor of your app, not a dedicated OS level service
28:21 – Prometheus is used for metrics but it’s not hooked into Elixir Telemetry (yet)
29:26 – Kubernetes and Docker drive their production infrastructure
29:47 – Buildkite is used for their CI / CD pipeline
32:08 – Deployments are very automated, a human only needs to merge to a specific branch
32:53 – The video processing microservices are in 1 mono repo, but there’s 2 other repos
33:33 – There’s PR approvals in place but all developers can merge to the production branch
34:39 – Code reviews are really important and you need to trust your developers
35:41 – The Elixir app has a PostgreSQL billing DB and also uses ClickHouse (SQL based)
37:53 – ClickHouse lets them store billions of rows and access everything quickly
40:58 – You do write SQL queries with ClickHouse but it doesn’t work with Ecto out of the box
41:44 – The Elixir API runs on AWS with an AWS load balancer sitting in front of it all
42:20 – The video infrastructure runs on Google Cloud
42:56 – How many servers do you run in total? Hard to tell really, but it’s a lot of compute
43:44 – Despite being on AWS, they are not using Amazon’s managed Kubernetes (EKS)
44:01 – All payments go through Stripe, including the metered billing which they hand rolled
45:06 – Instead of billing based on bandwidth, Mux bills by minutes watched
46:06 – SendGrid is used for transactional emails, Sentry for errors and Opsgenie for paging
46:48 – All sorts of CI / CD related information gets sent over to a Slack channel
47:08 – Developers are broken out into 4 cross functional teams
48:31 – There’s 2 flavors of SDKs that Mux has (REST API wrappers and video players)
50:21 – They currently have 22 different video players to account for across many platforms
50:36 – Efficiently creating so many different SDKs by having a core library for each language
54:20 – It’s sort of like having a core payment library and supporting Stripe, PayPal, etc.
54:41 – The SDK team needs to be aware of many different languages and players
55:16 – Another key metric to track is the video upscale and downscale percentages
56:47 – As of today Mux is focused on supplying service quality metrics
58:08 – There’s a lot of data stored but it all gets rolled over after 90 days
58:42 – The API is deployed all the time, but there’s zero down time deploys
59:45 – There’s been one day in the past where they had to put the API in read-only mode
1:00:19 – The data is backed up, but Dylan isn’t sure how often (but it happens, he swears!)
1:00:42 – Video thumbnails can be picked out from any timestamp, even animated GIFs too
1:02:21 – For now you need to supply your own closed captions to Mux
1:03:52 – Captions are downloaded, cached locally until processed and then backed up too
1:04:38 – Smoke tests and various alarms help detect issues in production (they use Flink)
1:06:25 – Uptime is important, Mux has high profile clients where downtime is not an option
1:06:52 – Rate limiting is done at the Elixir level for API calls with the ex_rated library
1:07:25 – It’s a reasonable idea to always assume users are out to get you
1:07:52 – For video rate limiting, it’s up to the CDN and they use a few different CDNs
1:09:33 – You could build a live streaming service like Twitch with Mux’s API
1:13:19 – The Elixir API doesn’t get billions of calls a month but it’s still a lot
1:16:37 – Best tips? Video is hard and it keeps getting more and more complicated
1:18:15 – Fortunately the video player SDK’s churn isn’t too high due to the HTML5 spec
1:19:14 – You can email Dylan or contact him on Twitter, also Mux is hiring too!

Links

📄 References
https://en.wikipedia.org/wiki/HTTP_Live_Streaming
https://howvideo.works/
https://www.ycombinator.com/about/
https://bugzilla.mozilla.org/show_bug.cgi?id=356558
https://golang.org/
https://en.wikipedia.org/wiki/Column-oriented_DBMS
https://mux.com/blog/from-russia-with-love-how-clickhouse-saved-our-data/
https://en.wikipedia.org/wiki/WebVTT
https://flink.apache.org/usecases.html
https://www.fastly.com/
https://en.wikipedia.org/wiki/Real-Time_Messaging_Protocol
https://obsproject.com/

⚙️ Tech Stack
phoenix, elixir, golang, aws, buildkite, clickhouse, docker, fastly, gcp, kafka, kubernetes, opsgenie, postgres, prometheus, sendgrid, sentry, slack, stackpath, stripe

🛠 Libraries Used
https://github.com/akira/exq
https://github.com/elixir-mint/mint
https://github.com/grempe/ex_rated
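The 1:06:52 topic covers rate limiting API calls at the application level. Mux does this in Elixir with ex_rated; the sketch below is not that library's API or algorithm, just a small fixed-window counter in Python to show the general shape of per-key rate limiting. The class name, the `clock` parameter and the key format are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` calls per key in each window of `window_seconds`."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._buckets: Dict[str, Tuple[int, int]] = {}  # key -> (window index, count)

    def allow(self, key: str) -> bool:
        """Return True and count the call if the key is under its limit."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._buckets.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= self.limit:
            self._buckets[key] = (window, count)
            return False
        self._buckets[key] = (window, count + 1)
        return True
```

A fixed window is the simplest variant to reason about. It can let through a short burst across a window boundary, which is one reason sliding-window or token-bucket schemes are often preferred in production.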

May 11, 2020 • 43min

TradeRev Is a Machine Learning Vehicle Appraisal / Auctioning System

In this episode of Running in Production, Amit Jain goes over building an auctioning system that uses machine / deep learning and is powered by Flask and Python. It’s all hosted on AWS and has been up and running since mid 2011. Amit goes over a few machine learning libraries, refactoring a 100k+ line monolith into microservices without any automated tests, the importance of machine learning accuracy, using a bunch of AWS services to deploy a large site, treating your infrastructure as code and more.

Topics Include

3:58 – Amit led a team of ~10 R&D engineers responsible for Data Science / ML
4:33 – Roughly 1,000 cars a day are being traded with 8-10k auctions / bids per day
5:15 – Motivation for using Flask and Python
6:55 – Scikit-Learn and TensorFlow for machine / deep learning
7:39 – Did things start off with multiple microservices or was it a monolith early on?
9:41 – There’s about 80,000 to 120,000 lines of code across 200-300+ Python files
10:14 – The huge refactor to microservices was done without automated tests initially
11:11 – After the refactor now there’s 86% test coverage which is enough to be confident
12:24 – Flask-Restplus is the main library used to build their RESTful APIs
12:43 – Other notable libraries were gunicorn and boto3 (AWS SDK for Python)
13:05 – Locust is an open source load / performance testing tool
13:40 – With machine learning, speed is important but accuracy is even more important
15:30 – gunicorn is very compact, performant and easy to configure
16:28 – Most caches were in memory and they used Amazon DynamoDB
17:09 – The primary database is MySQL running on Amazon RDS
18:04 – SQLAlchemy is used on the Python side as an ORM
19:29 – Docker is sort of being used in development
21:02 – The platform runs on AWS with Lambda, API Gateway and AWS Fargate with ECS
22:24 – What is AWS Fargate and what does it allow you to do?
23:48 – Scaling with Fargate while using auto scaling policies and configuration 26:28 – Taking advantage of the cloud and setting up load balancing with configuration 28:04 – How do you deal with secrets when using Fargate / ECS? 30:02 – What about logging and metrics? Are you exclusively using all of AWS’ services? 31:12 – What about error reporting, such as getting notified if an error happens 31:34 – The deploy process from development to production (includes CI / CD with Jenkins) 33:26 – A Walk through of how the different AWS services come together 36:54 – Terraform is being used to manage the infrastructure as code (valuable tool) 40:04 – Database backups were performed by the DevOps team 40:41 – Best tips? Start slow and expect failures, also don’t chase perfection 42:14 – You can find Amit on Twitter at @ml_amit and on LinkedIn Links 📄 References https://en.wikipedia.org/wiki/Machine_learning https://en.wikipedia.org/wiki/Deep_learning https://en.wikipedia.org/wiki/Natural_language_processing https://en.wikipedia.org/wiki/Convolutional_neural_network (CNN) https://en.wikipedia.org/wiki/Smoke_testing_(software) https://locust.io/ https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU) ⚙️ Tech Stack flask → python → aws → cloudwatch → docker → dynamodb → ecs → fargate → jenkins → lambda → mysql → pagerduty → python → rds → stripe → terraform → 🛠 Libraries Used https://scikit-learn.org/stable/ https://www.tensorflow.org/ https://github.com/noirbizarre/flask-restplus https://gunicorn.org/ https://github.com/boto/boto3 https://github.com/sqlalchemy/sqlalchemy Support the Show This episode does not have a sponsor and this podcast is a labor of love. If you want to support the show, the best way to do it is to purchase one of my courses or suggest one to a friend. Dive into Docker is a video course that takes you from not knowing what Docker is to being able to confidently use Docker and Docker Compose for your own apps. 
Long gone are the days of "but it works on my machine!". A bunch of follow along labs are included. Build a SAAS App with Flask is a video course where we build a real world SAAS app that accepts payments, has a custom admin, includes high test coverage and goes over how to implement and apply 50+ common web app features. There's over 20+ hours of video.

May 4, 2020 • 45min

Cover Tuner Uses NLP to Help Improve Your Cover Letters

In this episode of Running in Production, Saad Malik talks about building a free cover letter analysis tool with Flask and Python. It uses NLP (Natural language processing) and has been up and running on Google App Engine since April 2020. Saad goes over various Python NLP libraries, processing 400+ cover letters in his first month after shipping an MVP, using MongoDB as a primary database, keeping his front-end simple with a bit of jQuery, what it’s like to deploy a Python app using Google App Engine and more.

Topics Include

2:54 – You can upload your cover letter and get back an analysis without an account
3:50 – Motivation for using Flask and Python
4:48 – Writing the “business logic” in a standalone script before adding a web layer
5:48 – What is NLP (Natural language processing) and what Python libraries exist for it
7:01 – Using an NLP library vs using a full text search database
8:14 – About 1,000 users a month go to the site and 50% of them upload a cover letter
9:25 – Lots of users re-upload new copies of their cover letter after making changes to it
10:06 – Server side rendered templates with Jinja plus a touch of jQuery here and there
10:53 – After submitting a cover letter, an AJAX response fills in the info after ~5 seconds
11:31 – Gunicorn is used as the app server for Flask
11:46 – Why Saad chose to use Google App Engine over using Google Compute Engine
12:43 – Motivation for using Google App Engine over Heroku and other PaaS alternatives
14:09 – It’s mostly a monolithic application but with a separate script that runs locally
14:59 – The local script helps validate cover letters
16:47 – MongoDB Atlas is used to host MongoDB along with Google Cloud Storage
17:51 – Why did you choose MongoDB over PostgreSQL or another SQL database?
18:50 – MongoDB Compass is a way for you to visually explore your data
19:29 – Docker isn’t being used in development but App Engine uses it in production
20:20 – nginx isn’t needed because App Engine handles all of that for you
20:57 – App Engine is nice but it does come at a price (it’s quite a bit more expensive)
22:13 – App Engine costs won’t necessarily scale linearly with your traffic
23:46 – A rundown of all of the Google Cloud services Saad is using and how they connect
25:14 – Are MongoDB databases really schemaless?
26:05 – PyMongo is used to connect the Python app to MongoDB
27:13 – It only took 4-5 days to turn the standalone script into an MVP Flask app
29:13 – Only the Python NLP libraries are noteworthy libs for making this app work
29:44 – There’s no user authentication needed because no user accounts are necessary
30:14 – WuFoo is used to accept form submissions using their free tier
30:35 – WTForms is also used to process the cover letter form submissions
31:17 – Google Search Console helped make the site more mobile friendly
32:09 – The site isn’t using Bootstrap, it’s just plain old hand rolled CSS and JavaScript
32:53 – Both App Engine and MongoDB Atlas provide notifications for various events
33:26 – Walking through deploying code from development to production on App Engine
34:38 – Saad has tests set up with Pytest
35:09 – What exactly is that YAML file with App Engine?
35:48 – Dealing with secret keys
37:05 – Both MongoDB Atlas and Google App Engine have tools for disaster recovery
38:17 – Alerts can be set up to measure resource consumption, including cost limits
39:22 – App Engine’s price is high, Saad would probably use Google Compute Engine instead
40:47 – Best tips? Be mindful of the SAAS tools you use and how they interact with your app
42:11 – If you crank out code ASAP to ship an MVP, don’t forget to go back and refactor
43:10 – Using the Spyder IDE to help develop certain features faster and easier
44:28 – If you want to contact Saad you can find him on LinkedIn

Links

📄 References

https://en.wikipedia.org/wiki/Cover_letter
https://en.wikipedia.org/wiki/Natural_language_processing
https://www.mongodb.com/products/compass
https://www.wufoo.com/
https://cloud.google.com/appengine/docs/standard/python/config/appref
https://www.spyder-ide.org/

⚙️ Tech Stack

flask → python → app-engine → gcp → jquery → mongodb

🛠 Libraries Used

https://www.nltk.org/
https://spacy.io/
https://jquery.com/
https://gunicorn.org/
https://pymongo.readthedocs.io/en/stable/
https://github.com/wtforms/wtforms
https://github.com/pytest-dev/pytest

Apr 27, 2020 • 1h 19min

Easily Find, Reproduce and Track Your JavaScript Errors with TrackJS

In this episode of Running in Production, Todd Gardner goes over how he built TrackJS. It’s written in .NET and pulls together a number of different technologies to get the job done. It’s all hosted on OVH using dedicated hardware and has been running in production since 2013. Todd talks about how to track JavaScript errors in production, creating a data pipeline to ingest thousands of errors a minute in ~80 milliseconds, the benefits of pjax, how dedicated hardware ended up being half the price of cloud servers and using Ansible to configure all of the servers.

Topics Include

2:06 – Working nights and weekends until it made enough to replace consulting work
3:05 – Being your own boss changes how you think about writing software
4:23 – Thousands of developers a day use TrackJS resulting in thousands of errors per minute
5:45 – Installing TrackJS is painless, just drop the JS snippet into your site’s HTML
6:38 – Debugging client side JavaScript can be difficult for a number of reasons
7:43 – Motivation for using .NET / C#
9:05 – Why it’s a good idea to avoid shiny new tech when you’re building a new product
12:14 – Why TrackJS is mostly a monolithic application instead of microservices
13:23 – But there are bits that are broken into their own service when it makes sense
13:40 – Creating a pipeline to efficiently capture and process a ton of incoming data
15:21 – Leveraging nginx to quickly create logs for requests that are processed later
16:23 – For data that is more time sensitive they wrote a .NET service that uses Redis
18:45 – If TrackJS gets slammed, it will never affect page load speeds for their customers
19:40 – nginx was configured to write out JSON formatted logs
21:46 – The processor service ingests those log files and figures out what to do next
22:38 – Then there’s the web front-end service that developers use to browse their errors
23:30 – Elasticsearch is used to store the errors to create very fine grained reports and filtering
24:15 – A quick recap of the technologies used so far
24:39 – ASP.NET is similar to Rails, it’s server rendered templates but they use React too
25:52 – pjax is used to make the app feel very fast even with server rendered templates
27:10 – pjax / Turbolinks is one of the best bangs for your buck to make your site feel fast
28:23 – Making the most of your tech stack with a small team of developers
31:12 – Elasticsearch needs a bit of tuning if you’re using it as your primary database
32:04 – Writing their own .NET class to interface with the Redis backed queue
33:31 – IIS (Microsoft’s web server) serves the app without nginx sitting in front of it
34:17 – Load balancing is done over DNS with a round-robin strategy across 3 servers
36:13 – All 3 web servers get restarted at once during updates because IIS is great like that
37:31 – Everything is hosted on dedicated hardware with OVHCloud after moving off Azure
39:58 – Poor support and opaque downtime resolutions are why they moved off the cloud
41:26 – After thinking about it, using Ansible to set up machines seemed like a good idea
42:01 – They landed on using OVH after doing a bunch of research
42:48 – $180 / month for a high end Xeon 8 CPU core server w/ 64 GB of RAM + 2 TB of SSDs
43:28 – It was more work to set up but it’s A LOT faster and costs were cut in half
44:40 – When something goes wrong, it’s obvious what went wrong and when it will be fixed
45:47 – Even while running at 10% capacity, they do capacity planning every quarter
46:44 – $180 / month is an average figure, they have smaller servers doing different things
47:19 – They run about 12x Elasticsearch servers that are pretty beefy ($240 / month)
47:49 – Overall they have about 30 servers that they have to manage
48:31 – Some servers run Ubuntu LTS, and the web servers run Windows Server 2016
49:10 – Managing Windows servers is kind of a pain in the butt
51:04 – Ansible is used to configure both the Windows and Linux servers
53:32 – It takes about 48 hours to get new hardware from OVH, but that’s not a problem
54:11 – Using TeamCity to help get code from development to production
55:50 – The test environment gets real production data synced every hour
56:32 – Their “dev” environment is really a test environment
58:20 – It gets pushed to production manually through a TeamCity job by choice
59:28 – But every time they git push code, a new test environment is set up automatically
59:43 – They use their own service to help monitor JavaScript errors and it helps
1:00:29 – They built their own back-end monitoring tools too due to lack of choices
1:00:55 – Todd has opinions on back-end monitoring in general
1:01:47 – Real exceptions get sent to their primary Slack chat channel
1:03:30 – Payments are handled using Stripe but it doesn’t use SCA
1:05:16 – Monitis is used to monitor their infrastructure load and website uptime
1:07:39 – They would still use rented hardware but maybe use .NET Core today
1:09:10 – Depending on well tested and mature tools allows you to use them years later
1:10:38 – Best tips? Don’t build something in new tech just to use new tech
1:12:36 – When it comes to billing code, try to deal with it early on (it’s tricky)
1:15:26 – It’s hard to test webhooks and other external interactions in an automated way
1:17:23 – You can find Todd on Twitter @toddhgardner and check out his new monitoring service at https://requestmetrics.com

Links

📄 References

https://docs.microsoft.com/en-us/dotnet/core/
https://dotnet.microsoft.com/apps/aspnet
https://github.com/turbolinks/turbolinks
https://en.wikipedia.org/wiki/Round-robin_DNS
https://en.wikipedia.org/wiki/Windows_Server_2016
https://www.jetbrains.com/teamcity/
https://www.monitis.com/

⚙️ Tech Stack

dotnet → c-sharp → react → jekyll → ansible → elasticsearch → iis → monitis → nginx → ovh → redis → slack → stripe → teamcity → ubuntu → windows

🛠 Libraries Used

https://github.com/defunkt/jquery-pjax
