

The InfoQ Podcast
InfoQ
Software engineers, architects, and team leads have found inspiration to drive change and innovation in their teams by listening to the weekly InfoQ Podcast. They have received essential information that helped them validate their software development roadmap. We have achieved that by interviewing some of the top CTOs, engineers, and technology directors from companies like Uber, Netflix, and more. Over 1,200,000 downloads in the last 3 years.
Episodes

Jan 18, 2019 • 27min
Lynn Langit on 25% Time and Cloud Adoption within Genomic Research Organizations
Lynn Langit is a consulting cloud architect who holds recognitions from all three major cloud vendors for her contributions to their respective communities. On today’s podcast, Wes talks with Lynn about a concept she calls 25% time and a genomic research project it led her to become involved with. 25% time is her own method of learning while collaborating with someone else for a greater good. A recent project led her to become involved with the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Australia. Through cloud adoption and some lean startup practices, they were able to drop the run time for a machine learning algorithm against a genomic dataset from 500 hours to 10 minutes.
Why listen to this podcast:
- 25% time is a way to learn, study, or collaborate with someone else for a greater good. It’s unbilled time in the service of others. Using the idea of 25% time, along with some personal events that occurred in her life, Lynn became involved with genomic researchers in Australia.
- The price of genomic sequencing has dropped. The price drop has enabled researchers to create huge repositories of genomic data; however, that data was mostly on-prem. The idea of building data pipelines was pretty new in the genome community. Additionally, the genome itself is 3 billion data points, and a variance of as few as 10-15 variants can be statistically significant.
- The challenge was to leverage cloud resources. To gain a quick win and buy-in for cloud adoption at the Commonwealth Scientific and Industrial Research Organisation (CSIRO, an independent Australian federal government agency), the first step was to capture interest in the idea. So the team stored their reference data in the cloud and enabled access via a Jupyter Notebook.
- They demonstrated a use case against the genomic data set leveraging a synthetic phenotype (or a fake disease) called hipsterdom. The solution became a basis for global discussion that got more people involved in the community.
- By leveraging cloud resources, the CSIRO was able to take a run against their dataset that took 500 hours on an on-prem Spark cluster down to 10 minutes.
- Learning a new programming language has unseen benefits. For example, Ballerina (a language written as an integration language between APIs) interested Lynn because of its live visual diagrams; however, it also benefited her work on some of the cloud pipelines because of its ability to produce YAML files.
You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development. bit.ly/24x3IVq
Subscribe: www.youtube.com/infoq
Like InfoQ on Facebook: bit.ly/2jmlyG8
Follow on Twitter: twitter.com/InfoQ
Follow on LinkedIn: www.linkedin.com/company/infoq
Check the landing page on InfoQ: https://bit.ly/2T2LZBQ

Dec 28, 2018 • 35min
Charles Humble and Wes Reisz Take a Look Back at 2018 and Speculate on What 2019 Might Have in Store
In this podcast Charles Humble and Wes Reisz talk about autonomous vehicles, GDPR, quantum computing, microservices, AR/VR and more.
* Waymo vehicles are now allowed to be on the road in California running fully autonomously; they seem to be a long way ahead in terms of the number of autonomous miles they’ve driven, but there are something like 60 other companies in California approved to test autonomous vehicles.
* It seems reasonable to assume that considerably more regulation around privacy will appear over the next few years, as governments and regulators grapple with not only social media but also who owns the data from technology like AR glasses or self-driving cars.
* We’ve seen a huge amount of interest in the ethical implications of technology this year, with Uber getting into some regulatory trouble, and Facebook being co-opted by foreign governments for nefarious purposes. As software becomes more and more pervasive in people's lives the ethical impact of what we all do becomes more and more profound.
* Researchers from IBM, the University of Waterloo, Canada, and the Technical University of Munich, Germany, have proved theoretically that quantum computers can solve certain problems faster than classical computers.
* We’re also seeing a lot of interest around human-computer interaction - AR, VR, voice, neural interfaces. We had a presentation at QCon San Francisco from CTRL-labs, who are working on neural interfaces - in this case interpreting nerve signals - and they have working prototypes. Much like touch, this could open up computing to another whole group of people.

Dec 23, 2018 • 41min
Java Language Architect Brian Goetz on Java and the JDK
On this week’s podcast, Wes Reisz talks with Brian Goetz. Brian is the Java Language Architect at Oracle. The two start with a discussion of what the six-month cadence has meant to the teams developing Java. They then move to a review of the features in Java 9 through 12. Finally, the two discuss the longer-term side projects (such as Amber, Loom, and Valhalla) and their role in the larger release process for the JDK.
* The JDK’s six-month cadence changed the way the JDK is delivered and planned. While it definitely provides more rapid delivery at expected intervals, the release-train approach turned out to also improve flexibility and efficiency.
* Oracle JDK and OpenJDK are almost identical. Most of the JDK distributions are forks from OpenJDK with different bug fixes and backports applied. So the difference between the distributions now is largely which bug fixes are picked up.
* Local variable type inference (which was released as part of Java 10) illustrated the tension around making changes to the language. Many people wanted the change, but many others felt it would enable people to write bad code. Oracle had to balance the two views when making the change (a short sketch of var and a Java 12 preview feature follows this list).
* The number of Java versions allows finer-grained decision making on what is appropriate for an application. With the adoption of containers, applications are bundled with an exact JDK version rather than having to use one installed at the system level. The different versions give developers more options.
* Incubating features are new libraries added to the JDK. They were offered starting with Java 9 as a way for people to test and offer feedback more rapidly. With Java 12, preview features will be released. Preview features are similar but are core platform and language features.
* Shenandoah and ZGC are both low latency garbage collectors. They originally came from different sources. While both garbage collectors are similar, each has different performance characteristics under different workloads. The two garbage collectors represent options available to JVM developers.
* Most non-trivial JDK features take more than six months to develop. Longer term side projects like Amber, Loom, Valhalla are where these features are developed prior to being released with a version of the JDK. The projects range from language enhancements to concurrency work.
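As a concrete illustration of the language changes discussed above (a minimal sketch for illustration only, not material from the podcast): var arrived in Java 10 as local-variable type inference, and switch expressions shipped in Java 12 as a preview feature that has to be enabled explicitly.

```java
import java.util.List;
import java.util.Map;

public class VarAndPreviewDemo {
    enum Day { MONDAY, TUESDAY, SATURDAY, SUNDAY }

    public static void main(String[] args) {
        // Local-variable type inference (Java 10, JEP 286): the compiler
        // infers Map<String, List<Integer>> from the right-hand side.
        var scores = Map.of("alice", List.of(90, 85), "bob", List.of(70));
        for (var entry : scores.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }

        // Switch expressions were a *preview* language feature in Java 12
        // (JEP 325); this part only compiles with preview features enabled.
        var day = Day.SATURDAY;
        var kind = switch (day) {
            case SATURDAY, SUNDAY -> "weekend";
            default -> "weekday";
        };
        System.out.println(day + " is a " + kind);
    }
}
```

Under JDK 12 the preview part requires the flag at both compile and run time, e.g. javac --release 12 --enable-preview VarAndPreviewDemo.java followed by java --enable-preview VarAndPreviewDemo.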

Dec 17, 2018 • 32min
Tanya Reilly on Site Reliability Engineering and the Evolution of the New York City Fire Code
This week on the InfoQ Podcast, Wes Reisz talks to Tanya Reilly (Principal Engineer at Squarespace and previously a staff SRE at Google). Tanya discusses her research into how the fire code evolved in New York and draws on some of the parallels she sees in software. Along the way, she discusses what it means to be an SRE, what effective aspects of the role might look like, and her opinions on what we as an industry should be doing to prevent disasters. This podcast features discussion on paved roads, prevention, testing, firefighting (in software), and reliability questions to ask throughout the software lifecycle.
Why listen to this podcast:
- Teams are increasingly responsible for the entire software lifecycle. When this happens, they think about the software differently because they know they’re the ones who will get paged if it fails. This idea is at the core of the “You Build It, You Run It” philosophy in DevOps.
- The role of SRE is to define how to do things in a really reliable way. The focus is to make the majority of the operations work go away and, for the things that can’t go away, to make it as easy as possible.
- At the very start of a project (when you’re writing the initial design), you should be thinking about the dependencies of the system and how those who come after you will be able to determine them. A great way to do this is to offer an API that people will want to use and then instrument it.
- We can learn a lot from the growth of fire safety regulations as a metaphor for software: fireproof interior walls, socializing best practices, software inspections, and circuit breakers are all examples.
- The work SREs do varies from place to place: in some organizations SREs make recommendations on patterns, while in others they create libraries. Occasionally, SREs act as firefighters, but only as a last resort.
- We use error budgets and SLOs to quantify how much risk we’re comfortable taking, and to inform how much less (or more) risk we’re willing to take on (a toy error-budget calculation is sketched after this list).
- We need to consider software reliability throughout the full cycle of software development. When you build a system, think about it as if there will not be someone on call for it.
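To make the error-budget arithmetic above concrete, here is a toy Java sketch (the SLO, request counts, and names are invented for illustration and are not from the episode): the budget is simply the number of failures an availability SLO tolerates over a period, and the burn is how much of that budget has been spent.

```java
public class ErrorBudget {
    /** Failures allowed over a period, given an availability SLO such as 0.999. */
    static long allowedFailures(double slo, long totalRequests) {
        return (long) Math.floor((1.0 - slo) * totalRequests);
    }

    /** Fraction of the error budget consumed so far (can exceed 1.0). */
    static double budgetBurned(double slo, long totalRequests, long failedRequests) {
        long budget = allowedFailures(slo, totalRequests);
        return budget == 0 ? Double.POSITIVE_INFINITY : (double) failedRequests / budget;
    }

    public static void main(String[] args) {
        long total = 10_000_000L;  // hypothetical monthly request count
        double slo = 0.999;        // 99.9% availability target
        long failed = 4_200;       // hypothetical failures observed so far
        System.out.printf("budget: %d failures, burned: %.0f%%%n",
                allowedFailures(slo, total), budgetBurned(slo, total, failed) * 100);
    }
}
```

With these invented numbers the budget is 10,000 failed requests and 42% of it has been burned, which is the kind of signal a team can use to decide whether to take on more or less risk.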

Nov 30, 2018 • 36min
Jason Maude on Building a Modern Cloud-Based Banking Startup in Java
On today’s podcast, Wes Reisz talks with Jason Maude of Starling Bank. Starling Bank is a relatively new startup in the United Kingdom working in the banking sector. The two discuss the architecture, technology choices, and design processes used at Starling. In addition, Maude goes into some of the realities of building in the cloud, working with regulators, and proving robustness with practices like chaos testing.
Why listen to this podcast:
- Starling Bank was created because the government lowered the barrier to entry for banking startups in reaction to previous industry bailouts.
- The system is composed of around 19 applications hosted on AWS, running Java, and backed by a PostgreSQL database.
- These applications are not monolithic but are focused around common functionality (such as a Card or Payment Service).
- Java was chosen primarily because of its maturity and long term viability/reliability in the market.
- At the heart of Starling is the rule that every action the system takes happens at least once and at most once. To help enforce these rules, everything in the system carries a correlation id (a UUID), which is used to make sure both rules are met (a generic sketch of this idempotency pattern follows this list).
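The at-least-once/at-most-once rule above is essentially idempotent command handling keyed by a correlation id. Below is a minimal, hypothetical Java sketch of that general pattern, not Starling’s actual implementation (the class and method names are invented, and a real system would persist the processed ids in a database rather than in memory).

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentCommandHandler {
    // In a real system this would be a database table, not an in-memory set.
    private final Set<UUID> processed = ConcurrentHashMap.newKeySet();

    /**
     * Safe to call repeatedly with the same correlation id: the work runs
     * at most once, and callers retry until it returns true (at least once).
     */
    public boolean handle(UUID correlationId, Runnable work) {
        if (!processed.add(correlationId)) {
            return true; // already processed; a retry becomes a no-op
        }
        try {
            work.run();
            return true;
        } catch (RuntimeException e) {
            processed.remove(correlationId); // let a later retry run the work
            return false;
        }
    }

    public static void main(String[] args) {
        var handler = new IdempotentCommandHandler();
        var id = UUID.randomUUID();
        handler.handle(id, () -> System.out.println("payment executed"));
        handler.handle(id, () -> System.out.println("this will not print"));
    }
}
```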

Nov 2, 2018 • 33min
Martin Fowler Discusses New Edition of Refactoring, Along With Thoughts on Evolutionary Architecture
Martin Fowler chats about the work he’s done over the last couple of years on the rewrite of the original Refactoring book. He discusses how his thinking has changed and how that’s affected the new edition of the book. In addition to discussing refactoring, Martin and Wes discuss his thoughts on evolutionary architecture, team structures, and how the idea of refactoring can be applied in larger architectural contexts.
Why listen to this podcast:
- Refactoring is the idea of trying to identify the sequence of small steps that allows you to make a big change. That core idea hasn’t changed.
- Several new refactorings in the book deal with the idea of transforming data structures into other data structures; Combine Functions into Transform, for example (a small Java sketch of this refactoring follows this list).
- Several of the refactorings were removed or not added to the book in favor of adding them to a web edition of the book.
- A lot of the early refactorings are like cleaning the dirt off the glass of a window. You just need them to be able to see where the hell you are and then you can start looking at the broader ones.
- Refactorings can be applied broadly to architecture evolution. Two recent posts on MartinFowler.com, How to break a Monolith into Microservices by Zhamak Dehghani and How to extract a data-rich service from a monolith by Praful Todkar, deal with this specifically.
- Evolutionary architecture is a broad principle that architecture is constantly changing. While related to Microservices, it’s not Microservices by another name. You could evolve towards or away from Microservices for example.
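The examples in the new edition of the book are written in JavaScript; purely to illustrate the shape of Combine Functions into Transform, here is a small Java sketch with invented domain names. Several small derivation functions that callers previously invoked separately are folded into a single transform that returns an enriched copy of the input record.

```java
import java.util.Map;

public class CombineFunctionsIntoTransform {
    // A reading from some hypothetical metering system.
    record Reading(String customer, int quantity, double unitPrice) {}

    // Before the refactoring, callers computed derived values piecemeal:
    //   double base = r.quantity() * r.unitPrice();
    //   double taxable = Math.max(0, base - 100.0);
    // After Combine Functions into Transform, the derivations live together
    // in one transform that returns an enriched copy of the record.
    static Map<String, Object> enrichReading(Reading r) {
        double baseCharge = r.quantity() * r.unitPrice();
        double taxableCharge = Math.max(0.0, baseCharge - 100.0);
        return Map.of(
                "customer", r.customer(),
                "quantity", r.quantity(),
                "unitPrice", r.unitPrice(),
                "baseCharge", baseCharge,
                "taxableCharge", taxableCharge);
    }

    public static void main(String[] args) {
        System.out.println(enrichReading(new Reading("acme", 42, 3.5)));
    }
}
```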
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2QbdHej
Check the landing page on InfoQ: https://bit.ly/2QbdHej

Oct 21, 2018 • 34min
Mitchell Hashimoto on Consul since 1.2 and its Role as a Modern Service Mesh
In June of this year, Consul 1.2 was released. The release expanded Consul’s capabilities around service segmentation (controlling who can connect and how services connect east-west). On this week’s podcast, Wes and Mitchell discuss Consul in detail. The two discuss Consul’s design decisions around focusing on user-space networking, layer 4 routing, Go, Windows’ performance characteristics, the roadmap for eBPF on Linux, and an interesting feature that Consul implements called Network Tomography. The show wraps with Mitchell discussing some of the research that HashiCorp is doing around machine learning and security with Consul.
Why listen to this podcast:
- Consul is first and foremost a centralized service registry that provides discovery (a sketch of registering and discovering a service via Consul’s HTTP API follows this list). While it has a key-value store, that is Consul’s least important feature.
- With the June release (1.2), Consul moved further into the service mesh space, with a focus on service segmentation (controlling how you connect and who can connect).
- HashiCorp attempts to limit language fragmentation in the company and has seen a lot of success leveraging Go across their platforms; therefore, Consul is written in Go.
- Because Consul focused on layer 4 first, it is recommended to leverage the recent integration with Envoy for achieving high degrees of observability.
- All of the network routing with Consul happens in user space at this point; however, kernel-space routing with eBPF is planned for the near term. The focus, at this point, is on safely cross-compiling to every platform and addressing the most possible use cases; the focus isn’t on the high-performance use cases (yet).
- For any two servers across the globe in different data centers, Consul can instantly give you the 99th-percentile round-trip time between them, using a feature called Network Tomography.
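As a concrete illustration of Consul’s service-registry role (a minimal sketch that assumes a local Consul agent listening on its default port 8500; the service name and port are invented), the snippet below registers a service with the agent’s HTTP API and then looks up its healthy instances.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConsulRegistrationSketch {
    private static final String AGENT = "http://127.0.0.1:8500"; // local Consul agent

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Register a hypothetical "payments" service with the local agent.
        String registration = "{\"ID\":\"payments-1\",\"Name\":\"payments\",\"Port\":8080}";
        HttpRequest register = HttpRequest.newBuilder()
                .uri(URI.create(AGENT + "/v1/agent/service/register"))
                .PUT(HttpRequest.BodyPublishers.ofString(registration))
                .build();
        System.out.println("register status: "
                + client.send(register, HttpResponse.BodyHandlers.ofString()).statusCode());

        // Discover the healthy instances of that service.
        HttpRequest discover = HttpRequest.newBuilder()
                .uri(URI.create(AGENT + "/v1/health/service/payments?passing"))
                .GET()
                .build();
        System.out.println(client.send(discover, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```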
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2S3ZiSx
Check the landing page on InfoQ: https://bit.ly/2S3ZiSx

Oct 12, 2018 • 33min
Camille Fournier on Platform Engineering, Engineering Ladders, and her Book "The Manager's Path"
On the podcast this week, Charles Humble talks to Camille Fournier about running a platform team, how her current role differs from the CTO role she had at Rent the Runway, the skills developers need to acquire as they move from engineering to management positions, trends like Holacracy, and her book "The Manager's Path".
Why listen to this podcast:
- When looking for platform engineers, Camille looks for people who understand what it takes to build and run distributed systems - network, availability, data - and who have customer empathy.
- The team needs to be focused on taking the time to build robust software for operational excellence.
- The technical skills were different at Rent the Runway - these tended to be more full-stack engineers who worked in a more iterative way.
- Much of what we do at work is really about human relationships. Relationships tend to be better when you have one-on-one conversations with people on a regular basis. A lot of the value of one-on-one meetings is that you are reinforcing the social connection you have with the other person.
- One of the most important things we do as engineering managers is stay abreast of how to make teams effective in the context of delivering software.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2RJdwYR
Check the landing page on InfoQ: https://bit.ly/2RJdwYR

Oct 8, 2018 • 35min
Emmanuel Ameisen, Head of AI at Insight, on Building a Semantic Search System for Images
On this week’s podcast, Wes Reisz talks to Emmanuel Ameisen, head of AI for Insight Data Science, about building a semantic search system for images using convolutional neural networks and word embeddings, how you can build on the work done by companies like Google, and where the gaps are that mean you need to train your own models. The podcast wraps up with a discussion around how you get something like this into production.
Why listen to this podcast:
- A common use case is the ability to search for similar things: wanting to find another pair of sunglasses like these, or a cat that looks like this picture, or even a tool like Google’s Smart Reply, can all be considered broadly the domain of semantic search.
- For image classification you generally want a convolutional neural network. You typically use a model pre-trained on a public data set like ImageNet to generate embeddings, running the pre-trained model up to the penultimate layer and storing the value of the activations.
- From here the idea is to mix image embeddings with word embeddings. The embeddings, whether for words or images, are just vectors that represent a thing. There are many approaches to getting vectors for words, but the one that started it all is word2vec (the retrieval step over such vectors is sketched after this list).
- For both image embeddings and word embeddings you can typically use pre-trained models, meaning that you only need to train the final step of bringing the two models together.
- Before deploying to production it is important that you validate the model against biases such as sexism, typically using outside people to carry out a thorough audit.
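To make the embedding idea concrete: once images and words have been reduced to fixed-length vectors, semantic search is largely a nearest-neighbour lookup under a similarity measure such as cosine similarity. The Java sketch below uses toy, hand-written vectors and a brute-force scan purely to illustrate that retrieval step; it is not the system described in the episode.

```java
import java.util.Comparator;
import java.util.Map;

public class EmbeddingSearchSketch {
    /** Cosine similarity between two embedding vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        // Toy 4-dimensional embeddings; real ones come from the penultimate
        // layer of a pre-trained CNN (images) or a word2vec-style model (words).
        Map<String, double[]> index = Map.of(
                "sunglasses", new double[]{0.9, 0.1, 0.0, 0.2},
                "cat",        new double[]{0.1, 0.8, 0.3, 0.0},
                "beach",      new double[]{0.7, 0.0, 0.1, 0.6});
        double[] query = {0.85, 0.05, 0.05, 0.3}; // embedding of the query image or phrase

        // Brute-force nearest neighbours; production systems use ANN indexes.
        index.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(query, e.getValue())))
                .limit(2)
                .forEach(e -> System.out.printf("%s  %.3f%n",
                        e.getKey(), cosine(query, e.getValue())));
    }
}
```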
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2RAEUrV
Check the landing page on InfoQ: https://bit.ly/2RAEUrV

Sep 21, 2018 • 39min
Ben Kehoe, Cloud Robotics Research Scientist, Discusses Serverless @iRobot
On this week’s podcast, Wes Reisz talks with Ben Kehoe of iRobot. Ben is a Cloud Robotics Research Scientist who works on using the Internet to allow robots to do more and better things. AWS, and in particular Lambda, is a core part of iRobot’s cloud-enabled robots. The two discuss iRobot’s cloud architecture. Some of the key lessons in the podcast include thoughts on logging, deploying, unit/integration testing, service discovery, minimizing the cost of service-to-service calls, and Conway’s Law.
Why listen to this podcast:
- The AWS platform, including services such as Kinesis, Lambda, and the IoT Gateway, provided the key components that allowed iRobot to build out everything they needed for Internet-connected robots in 2015.
- Cloud-enabled Roombas talk to the cloud via the IoT Gateway (which uses MQTT) and are able to perform large file uploads using mutually authenticated certificates signed by an iRobot Certificate Authority. The entire system is event-driven, with Lambda being used to perform actions based on the events that occur.
- When you’re using serverless, you are using managed infrastructure rather than building your own. That means you have to accept the limitations of the infrastructure where they exist. For example, until recently Lambda didn’t have an SQS integration, so you had to find inventive ways to make things work as you wanted.
- Serverless is all about the total cost of ownership. It’s not just about development time, but about all the areas needed to support operating the environment.
- iRobot takes an approach of unit testing functions locally but does integration testing on a deployed set of functions. A library called Placebo helps engineers record events sent to the cloud and then replay them for local unit tests.
- For logging/tracing, iRobot packages up the information that a function uses into a structured record that is sent to CloudWatch. They then pipe that into SumoLogic to be able to trace executions (a generic structured-logging handler is sketched after this list). Most of the difficulties that happen tend to happen closer to the edge.
- iRobot uses Red/Black deployments to have a completely separate stack when deploying. In addition, they flatten (or inline) their function calls on deployment. Both of these techniques are used as cost optimization techniques to prevent lambdas calling lambdas when not needed.
- Looking towards the future of serverless, there is still work to be done to offer the same feature set that more traditional applications can use with service meshes.
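The logging point above can be illustrated with a generic AWS Lambda handler in Java (a hypothetical sketch using the aws-lambda-java-core interfaces, not iRobot’s code; the record’s field names are invented). Each invocation emits one JSON log line, which Lambda forwards to CloudWatch Logs and which a downstream log aggregator can parse and trace.

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.util.Map;

/**
 * Minimal Lambda handler that writes one structured (JSON) log record per
 * invocation; Lambda forwards it to CloudWatch Logs automatically.
 * Requires the aws-lambda-java-core dependency.
 */
public class StructuredLoggingHandler implements RequestHandler<Map<String, Object>, String> {

    @Override
    public String handleRequest(Map<String, Object> event, Context context) {
        long start = System.currentTimeMillis();
        String result = process(event);
        // Hypothetical structured record; the field names are illustrative only.
        String record = String.format(
                "{\"requestId\":\"%s\",\"function\":\"%s\",\"eventKeys\":%d,\"durationMs\":%d}",
                context.getAwsRequestId(),
                context.getFunctionName(),
                event == null ? 0 : event.size(),
                System.currentTimeMillis() - start);
        context.getLogger().log(record);
        return result;
    }

    private String process(Map<String, Object> event) {
        return "ok"; // placeholder for the function's real work
    }
}
```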
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2NYsXNN
Check the landing page on InfoQ: https://bit.ly/2NYsXNN