Jitendra Malik is a professor at Berkeley and one of the seminal figures in computer vision, both the kind before the deep learning revolution and the kind after. He has been cited over 180,000 times and has mentored many world-class researchers in computer science.
Support this podcast by supporting our sponsors:
– BetterHelp: http://betterhelp.com/lex
– ExpressVPN: https://www.expressvpn.com/lexpod
For more information about this podcast, go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube, where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.
Here’s the outline of the episode. On some podcast players, you should be able to click a timestamp to jump to that point.
OUTLINE:
00:00 – Introduction
03:17 – Computer vision is hard
10:05 – Tesla Autopilot
21:20 – Human brain vs computers
23:14 – The general problem of computer vision
29:09 – Images vs video in computer vision
37:47 – Benchmarks in computer vision
40:06 – Active learning
45:34 – From pixels to semantics
52:47 – Semantic segmentation
57:05 – The three R’s of computer vision
1:02:52 – End-to-end learning in computer vision
1:04:24 – 6 lessons we can learn from children
1:08:36 – Vision and language
1:12:30 – Turing test
1:16:17 – Open problems in computer vision
1:24:49 – AGI
1:35:47 – Pick the right problem