The year 2024 posed significant challenges for AI wearables, highlighted by the rise and subsequent decline of companies like Rabbit and Humane. Even Friend.com, the first to launch in the AI pendant category, delayed its wearable ship date, demonstrating the hurdles startups face in this arena. In contrast, Bee AI, founded by Maria and Ethan, emerged as a leader with its device, Bee, which offers a comprehensive personal AI experience. Only a handful of companies in the AI wearables market have adapted and thrived amid these rapid changes.
Bee is positioned as a personal AI system, with capabilities such as context understanding, transcription, and memory assistance. It uses beamforming microphones and has a seven-day battery life, so it can run continuously between charges. One of its distinctive functions is summarizing daily experiences, letting users reflect on their day without the friction of manual journaling. This integration into daily life is what makes Bee a potentially useful tool for convenience and productivity.
Maria and Ethan have emphasized the importance of user feedback in refining the Bee device, adapting its form factor from earlier iterations like pendants to a more user-friendly wristwatch design. Initial market reactions indicated a preference for a wearable that blends into daily life rather than drawing attention, which influenced the decision to create a bracelet-style device. Community input has also shaped design choices. This iterative process shows how incorporating user insights can lead to better product outcomes.
Bee relies on memory management techniques to store and retrieve user-specific information, which makes the experience personal. It can generate and adapt information based on ongoing interactions, tracking shifts in user preferences over time. Privacy and data management remain central concerns, and a human-in-the-loop approach gives users visibility into what the system has inferred. Integrations with external data sources, like email and calendars, further enrich its contextual awareness.
As AI systems become more embedded in daily life, ethical concerns about privacy and consent grow more significant in the development of personal AIs like Bee. Users are encouraged to consider the implications of their data being shared or used by other agents in various contexts, which raises questions about ownership and control. Debates about real-time monitoring and the potential for miscommunication also highlight the need for careful oversight. Even so, personal AI technology appears poised for growth.
Bundle tickets for AIE Summit NYC have now sold out. You can now sign up for the livestream — where we will be making a big announcement soon. NYC-based readers and Summit attendees should check out the meetups happening around the Summit.
2024 was a very challenging year for AI Hardware. After the buzz of CES last January, 2024 was marked by the meteoric rise and even harder fall of AI Wearables companies like Rabbit and Humane, with an assist from a pre-wallpaper-app MKBHD.
Even Friend.com, the first to launch in the AI pendant category, and which spurred Rewind AI to rebrand to Limitless and follow in their footsteps, ended up delaying their wearable ship date and launching an experimental website chatbot version.
We have been cautiously excited about this category, keeping tabs on most of the top entrants, including Omi and Compass.
However, to date the biggest winner still standing from the AI Wearable wars is Bee AI, founded by today's guests Maria and Ethan.
Bee is an always-on hardware device with beamforming microphones, 7-day battery life, and a mute button. It can be worn as a wristwatch or a clip-on pin, and it is backed by an incredible transcription, diarization, and very-long-context memory processing pipeline that helps you remember your day and your todos, and can even perform actions by operating a virtual cloud phone.
This is one of the most advanced, production ready, personal AI agents we've ever seen, so we were excited to be their first podcast appearance. We met Bee when we ran the world's first Personal AI meetup in April last year.
As a user of Bee (and not an investor! just a friend!) it’s genuinely been a joy to use, and we were glad to take advantage of the opportunity to ask hard questions about the privacy and legal/ethical side of things as much as the AI and Hardware engineering side of Bee. We hope you enjoy the episode and tune in next Friday for Bee’s first conference talk: Building Perfect Memory.
Full YouTube Video Version
Watch this for the live demo!
Show Notes
* Ethan Sutin, Maria de Lourdes Zollo
* Buy Bee with Listener Discount Code!
Timestamps
* 00:00:00 Introductions and overview of Bee Computer
* 00:01:58 Personal context and use cases for Bee
* 00:03:02 Origin story of Bee and the founders' background
* 00:06:56 Evolution from app to hardware device
* 00:09:54 Short-term value proposition for users
* 00:12:17 Demo of Bee's functionality
* 00:17:54 Hardware form factor considerations
* 00:22:22 Privacy concerns and legal considerations
* 00:30:57 User adoption and reactions to wearing Bee
* 00:35:56 CES experience and hardware manufacturing challenges
* 00:41:40 Software pipeline and inference costs
* 00:53:38 Technical challenges in real-time processing
* 00:57:46 Memory and personal context modeling
* 01:02:45 Social aspects and agent-to-agent interactions
* 01:04:34 Location sharing and personal data exchange
* 01:05:11 Personality analysis capabilities
* 01:06:29 Hiring and future of always-on AI
Transcript
Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
swyx [00:00:12]: Hey, and today we are very honored to have in the studio Maria and Ethan from Bee.
Maria [00:00:16]: Hi, thank you for having us.
swyx [00:00:20]: And you are, I think, the first hardware founders we've had on the podcast. I've been looking to have a wearable hardware founder on for a while. I think we're going to have two or three of them this year. And you're the ones that I wear every day. So thank you for making Bee. Thank you for all the feedback and the usage. Yeah, you know, I've been a big fan. You are the speaker gift for the AI Engineer World's Fair. And let's start from the beginning. What is Bee Computer?
Ethan [00:00:52]: Bee Computer is a personal AI system. So you can think of it as AI living alongside you in first person. So it can kind of capture your in-real-life experience, and with that understanding it can help you in significant ways. You know, the obvious one is memory, but that's really just the base kind of use case: recall and reflection. I know, Swyx, that you like the idea of journaling, but you don't do it, and you can still have some kind of reflective summary of what you experienced in real life. But it's also about just having like the whole context of a human being, you know, giving the machine the ability to understand, like, what's going on in your life: your attitudes, your desires, specifics about your preferences. So not only can it help you with recall, but anything that you need it to do, it already knows. Like, if you think about somebody who you've worked with or lived with for a long time, they just know, without having to ask you, what you would want. It's clear that that is the future of personal AI. The AI is just so much more valuable with personal context.
Maria [00:01:58]: I will say that one of the things that we are really passionate about is really understanding this personal context, because it will make the AI more useful. Think about like a best friend that knows you so well. That's one of the things that we are seeing from the users. They're using it from a companion standpoint or for professional use cases. There are many ways to use Bee, but companionship and professional are the ones that we are seeing more now.
swyx [00:02:22]: Yeah. It feels so dry to talk about use cases. Yeah. Yeah.
Maria [00:02:26]: It's like really like investor question. Like, what kind of use case?
Ethan [00:02:28]: We're just like, we've been so broken and trained. But I mean, on the base case, it's just like, don't you want your AI to know everything you've said and like everywhere you've been, like, wouldn't you want that?
Maria [00:02:40]: Yeah. And don't stay there and repeat every time, like, oh, this is what I like. You already know that. And you do things for me based on that. That's I think is really cool.
swyx [00:02:50]: Great. Do you want to jump into a demo? Do you have any other questions?
Alessio [00:02:54]: I want to maybe just cover the origin story. Just how did you two meet? Was this the first idea you started working on? Was there something else before?
Maria [00:03:02]: I can start. So Ethan and I, we have known each other for six years now. He had a company called Squad. And before that it was called Olabot and was a personal AI. Yeah, I should. So maybe you should start this one. But yeah, that's how I know Ethan. Like he was pivoting from personal AI to Squad. And that was a co-watching with friends product. I had experience working with TikTok and video content. So I helped with the pivot and we launched Squad and it was really successful. And at the end, the founders decided to sell that to Twitter, now X. So both of us, we joined X. We launched Twitter Spaces. We launched many other products. And yeah, since then, we basically continued to work together to the start of Bee.
Ethan [00:03:46]: The interesting thing is like this isn't the first attempt at personal AI. In 2016, when I started my first company, it started out as a personal AI company. This is before Transformers, no BERT even, like just RNNs. You couldn't really do any convincing dialogue at all. I met Esther, who was my previous co-founder. We were both really interested in the idea of like having a machine kind of model or understand a dynamic human. We wanted to make personal AI. This was, because we had obviously much more limited tools, more geared towards like younger people. So I don't know if you remember in 2016, there was like a brief chatbot boom. It was way premature, but it was when Zuckerberg went up on F8 and yeah, M and like. Yeah. The messenger platform, people like, oh, bots are going to replace apps. It was like for about six months. And then everybody realized, man, these things are terrible and like they're not replacing apps. But it was at that time that we got excited and we were like, we tried to make this like, oh, teach the AI about you. So it was just an app that you kind of chatted with and it would ask you questions and then like give you some feedback.
Maria [00:04:53]: But Hugging Face first version was launched at the same time. Yeah, we started it.
Ethan [00:04:56]: We started out in the same office as Hugging Face because Betaworks was our investor. So they had a thing called Bot Camp. Betaworks is like a really cool VC because they invest in out-there things. They're like way ahead of everybody else. And like back then they had something called Bot Camp. They took six companies and it was us and Hugging Face. And then I think the other four, I'm pretty sure, are dead. But Hugging Face was the one that really got, you know, I mean, 30% success rate is pretty good. Yeah. But yeah, when we, it was like it was just the two founders. Yeah, they were kind of like an AI company in the beginning. It was a chat app for teenagers. A lot of people don't know that Hugging Face was like, hey, friend, how was school? Let's trade selfies. But then, you know, they built the Transformers library, I believe, to help them make their chat app better. And then they open sourced it and it was like it blew up. And like they're like, oh, maybe this is the opportunity. And now they're Hugging Face. But anyway, like we were obsessed with it at that time. But then it was clear that there's some people who really love chatting and like answering questions. But it's like a lot of work, like just to kind of manually.
Maria [00:06:00]: Yeah.
Ethan [00:06:01]: Teach like all these things about you to an AI.
Maria [00:06:04]: Yeah, there were some people that were super passionate, for example, teenagers. They really like, for example, to speak about themselves a lot. So they will reply to a lot of questions and speak about them. But most of the people, they don't really want to spend time.
Ethan [00:06:18]: And, you know, it's hard to like really bring the value with it. We had like sentence similarity and stuff and could try and do, but it was like it was premature with the technology at the time. And so we pivoted. We went to YC and the long story, but like we pivoted to consumer video and that kind of went really viral and got a lot of usage quickly. And then we ended up selling it to Twitter, worked there and left before Elon, not related to Elon, but left Twitter.
swyx [00:06:46]: And then I should mention this is the famous time when, well, when Elon had just come in, this was like Esther was the famous product manager who slept there.
Ethan [00:06:56]: My co-founder, my former co-founder, she sleeping bag. She was the sleep where you were. Yeah, yeah, she stayed. We had left by that point.
swyx [00:07:03]: She very stayed, she's famous for staying.
Ethan [00:07:06]: Yeah, but later, later left or got, I think, laid off, laid off. Yeah, I think the whole product team got laid off. She was a product manager, director. But yeah, like we left before that. And then we're like, oh, my God, things are different now. You know, I think this is we really started working on again right before ChatGPT came out. But we had an app version and we kind of were trying different things around it. And then, you know, ultimately, it was clear that, like, there were some limitations we can go on, like a good question to ask any wearable company is like, why isn't this an app? Yes. Yeah. Because like.
Maria [00:07:40]: Because we tried the app at the beginning.
Ethan [00:07:43]: Yeah. Like the idea that it could be more of a... and Bee comes from ambient. So like if it was more kind of just around you all the time and less about you having to go open the app and do the effort to, like, enter in data, that led us down the path of hardware. Yeah. Because the sensors on this are microphones. So it's capturing and understanding audio. We started actually our first hardware with a vision component, too. And we can talk about why we're not doing that right now. But if you wanted to, like, have a continuous understanding of audio with your phone, it would monopolize your microphone. It would get interrupted by calls and you'd have to remember to turn it on. And like that little bit of friction is actually like a substantial barrier to, like, get your phone. It's like the experience of it just being with you all the time and like living alongside you. And so I think that that's like the key reason it's not an app. And in fact, we do have Apple Watch support. So anybody who has an Apple Watch can use it right away without buying any hardware. Because we worked really hard to make a version for the watch that can run in the background and not super drain your battery. But even with the watch, there's still friction because you have to remember to turn it on and it still gets interrupted if somebody calls you. We send a notification, but you still have to go back and turn it on because it's just the way watchOS works.
Maria [00:09:04]: One of the things that we are seeing from our Apple Watch users, like I love the Apple Watch integration. One of the things that we are seeing is that people, they start using it from Apple Watch and after a couple of days they buy the Bee because they just like to wear it.
Ethan [00:09:17]: Yeah, we're seeing.
Maria [00:09:18]: That's something that like they're learning and it's really cool. Yeah.
Ethan [00:09:21]: I mean, I think like fundamentally we like to think that like a personal AI is like the mission. And it's more about like the understanding. Connecting the dots, making use of the data to provide some value. And the hardware is like the ears of the AI. It's not like integrating like the incoming sensor data. And that's really what we focus on. And like the hardware is, you know, if we can do it well and have a great experience on the Apple Watch like that, that's just great. I mean, but there's just some platform restrictions that like existing hardware makes it hard to provide that experience. Yeah.
Alessio [00:09:54]: What do people do in like two or three days that then convinces them to buy it? They buy the product. This feels like a product where like after you use it for a while, you have enough data to start to get a lot of insights. But it sounds like maybe there's also like a short term.
Maria [00:10:07]: From the Apple Watch users, I believe that because every time that you receive a call after, they need to go back to Bee and open it again. Or for example, every day they need to charge Apple Watch and reminds them to open the app every day. They feel like, okay, maybe this is too much work. I just want to wear the Bee and just keep it open and that's it. And I don't need to think about it.
Ethan [00:10:27]: I think they see the kind of potential of it just from the watch. Because even if you wear it a day, like we send a summary notification at the end of the day about like just key things that happened to you in your day. And like I didn't even think like I'm not like a journaling type person or like because like, oh, I just live the day. Why do I need to like think about it? But like it's actually pretty sometimes I'm surprised how interesting it is to me just to kind of be like, oh, yeah, that and how it kind of fits together. And I think that's like just something people get immediately with the watch. But they're like, oh, I'd like an easier watch. I'd like a better way to do this.
swyx [00:10:58]: It's surprising because I only know about the hardware. But I use the watch as like a backup for when I don't have the hardware. I feel like because now you're beamforming and all that, this is significantly better. Yeah, that's the other thing.
Ethan [00:11:11]: We have way more control over like the Apple Watch. You're limited in like you can't set the gain. You can't change the sample rate. There's just very limited framework support for doing anything with audio. Whereas if you control it. Then you can kind of optimize it for your use case. The Apple Watch isn't meant to be kind of recording this. And we can talk when we get to the part about audio, why it's so hard. This is like audio on the hardest level because you don't know it has to work in all environments or you try and make it work as best as it can. Like this environment is very great. We're in a studio. But, you know, afterwards at dinner in a restaurant, it's totally different audio environment. And there's a lot of challenges with that. And having really good source audio helps. But then there's a lot more. But with the machine learning that still is, you know, has to be done to try and account because like you can tune something for one environment or another. But it'll make one good and one bad. And like making something that's flexible enough is really challenging.
Alessio [00:12:10]: Do we want to do a demo just to set the stage? And then we kind of talk about.
Maria [00:12:14]: Yeah, I think we can go like a walkthrough and the prod.
Alessio [00:12:17]: Yeah, sure.
swyx [00:12:17]: So I think we said I should. So for listeners, we'll be switching to video. That was superimposed on. And to this video, if you want to see it, go to our YouTube, like and subscribe as always. Yeah.
Maria [00:12:31]: And buy the Bee. Yes.
swyx [00:12:33]: And buy the Bee. While you wait. While you wait. Exactly. It doesn't take long.
Maria [00:12:39]: Maybe you should have a discount code just for the listeners. Sure.
swyx [00:12:43]: If you want to offer it, I'll take it. All right. Yeah. Well, discount code Swyx. Oh s**t. Okay. Yeah. There you go.
Ethan [00:12:49]: An important thing to mention also is that the hardware is meant to work with the phone. And like, I think, you know, if you look at Rabbit or Humane, they're trying to create like a new hardware platform. We think that the phone's just so dominant and it will be until we have the next generation, which is not going to be for five, you know, maybe some Orion type glasses that are cheap enough and like light enough. Like that's going to take a long time, so we work with the phone rather than trying to just like replace it. So in the app, we have a summary of your days, but at the top, it's kind of what's going on now. And that's updating your phone. It's updating continuously. So right now it's saying, I'm discussing, you know, the development of, you know, personal AI, and that's just kind of the ongoing conversation. And then we give you a readable form. That's like little kind of segments of what's the important parts of the conversations. We do speaker identification, which is really important because you don't want your personal AI thinking you said something and attributing it to you when it was just somebody else in the conversation. So you can also teach it other people's voices. So like if some, you know, somebody close to you, so it can start to understand your relationships a little better. And then we do conversation endpointing, which is kind of like a task that didn't even exist before, like, cause nobody needed to do this. But like if you had somebody's whole day, how do you like break it into logical pieces? And so we use like not just voice activity, but other signals to try and split up because conversations are a little fuzzy. They can like lead into one, can start to the next. So also like the semantic content of it. When a conversation ends, we run it through larger models to try and get a better, you know, sense of the actual, what was said and then summarize it, provide key points. 
What was the general atmosphere and tone of the conversation and potential action items that might've come of that. But then at the end of the day, we give you like a summary of all your day and where you were and just kind of like a step-by-step walkthrough of what happened and what were the key points. That's kind of just like the base capture layer. So like if you just want to get a kind of glimpse or recall or reflect that's there. But really the key is like all of this is now like being inferenced on to generate personal context about you. So we generate key items known to be true about you and that you can, you know, there's a human in the loop aspect is like you can, you have visibility. Right. Into that. And you can, you know, I have a lot of facts about technology because that's basically what I talk about all the time. Right. But I do have some hobbies that show up and then like, how do you put use to this context? So I kind of like measure my day now and just like, what is my token output of the day? You know, like, like as a human, how much information do I produce? And it's kind of measured in tokens and it turns out it's like around 200,000 or so a day. But so in the recall case, we have, um, a chat interface, but the key here is on the recall of it. Like, you know, how do you, you know, I probably have 50 million tokens of personal context and like how to make sense of that, make it useful. So I can ask simple, like, uh, recall questions, like details about the trip I was on to Taiwan, where recently we were with our manufacturer and, um, in real time, like it will, you know, it has various capabilities such as searching through your, your memories, but then also being able to search the web or look at my calendar, we have integrations with Gmail and calendars. So like connecting the dots between the in real life and the digital life. 
And, you know, I just asked it about my Taiwan trip and it kind of gives me the, the breakdown of the details, what happened, the issues we had around, you know, certain manufacturing problems and it, and it goes back and references the conversation so I can, I can go back to the source. Yeah.
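Editor's sketch: Ethan describes conversation endpointing as breaking a whole day of speech into logical pieces using voice activity plus other signals, including semantic content. The Python below is a minimal illustration of that idea and is not Bee's pipeline. Everything in it is an assumption made for the example: the `Utterance` type, the `hard_gap` and `soft_gap` thresholds, the sample data, and the word-overlap `jaccard` score, which is a crude stand-in for whatever semantic signal a production system would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    start: float  # seconds since the start of the day
    end: float
    speaker: str
    text: str


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]; a toy proxy for semantic similarity."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def split_conversations(
    utterances: List[Utterance],
    hard_gap: float = 300.0,   # assumed: silence this long always ends a conversation
    soft_gap: float = 60.0,    # assumed: shorter silence ends it only on a topic shift
    min_overlap: float = 0.1,  # assumed: below this overlap counts as a topic shift
) -> List[List[Utterance]]:
    """Group time-ordered utterances into conversations."""
    conversations: List[List[Utterance]] = []
    current: List[Utterance] = []
    for utt in utterances:
        if current:
            prev = current[-1]
            gap = utt.start - prev.end
            topic_shift = jaccard(prev.text, utt.text) < min_overlap
            if gap > hard_gap or (gap > soft_gap and topic_shift):
                conversations.append(current)
                current = []
        current.append(utt)
    if current:
        conversations.append(current)
    return conversations


# Made-up sample day, for illustration only.
day = [
    Utterance(0, 5, "me", "the factory shipped the beta devices"),
    Utterance(7, 12, "maria", "which devices did the factory ship"),
    Utterance(100, 104, "me", "lunch at an italian place"),
    Utterance(1000, 1003, "me", "the factory called again"),
]
for conversation in split_conversations(day):
    print([u.text for u in conversation])
```

The two thresholds reflect the point that conversations are "a little fuzzy": a long silence is treated as a boundary on its own, while a shorter pause only counts when the topic also appears to change.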
Maria [00:16:46]: Not just the conversation as well, the integrations. So we have as well Gmail and Google calendar. So if there is something there that was useful to have more context, we can see that.
Ethan [00:16:56]: So like, and it can, I never use the word agentic cause it's, it's cringe, but like it can search through, you know, if I, if I'm brainstorming about something that spans across, like search through my conversation, search the email, look at the calendar and then depending on what's needed. Then synthesize, you know, something with all that context.
Maria [00:17:18]: I love that you did the Spotify wrapped. That was pretty cool. Yeah.
Ethan [00:17:22]: Like one thing I did was just like make a Spotify wrap for my 2024, like of my life. You can do that. Yeah, you can.
Maria [00:17:28]: Wait. Yeah. I like those crazy.
Ethan [00:17:31]: Make a Spotify wrapped for my life in 2024. Yeah. So it's like surprisingly good. Um, it like kind of like game metrics. So it was like you visited three countries, you shipped, you know, X many beta devices.
Maria [00:17:46]: And that's kind of more personal insights and reflection points. Yeah.
swyx [00:17:51]: That's fascinating. So that's the demo.
Ethan [00:17:54]: Well, we have, we can show something that's in beta. I don't know if we want to do it. I don't know.
Maria [00:17:58]: We want to show something. Do it.
Ethan [00:18:00]: And then we can kind of fit. Yeah.
Maria [00:18:01]: Yeah.
Ethan [00:18:02]: So like the, the, the, the vision is also like, not just about like AI being with you in like just passively understanding you through living your experience, but also then like it proactively suggesting things to you. Yeah. Like at the appropriate time. So like not just pool, but, but kind of, it can step in and suggest things to you. So, you know, one integration we have that, uh, is in beta is with WhatsApp. Maria is asking for a recommendation for an Italian restaurant. Would you like me to look up some highly rated Italian restaurants nearby and send her a suggestion?
Maria [00:18:34]: So what I did, I just sent to Ethan a message through WhatsApp in his own personal phone. Yeah.
Ethan [00:18:41]: So basically, Bee is like watching all my incoming notifications. And it checks two criteria, like, is it important enough for me to raise a suggestion to the user? And then is there something I could potentially help with? So this is where the actions come into place. So because Maria is my co-founder and because it was like a restaurant recommendation, something that it could probably help with, it proposed that to me. And then I can respond through either the chat or, we have another kind of push-to-talk, walkie-talkie-style button. It's actually a multi-purpose button to like toggle it on or off, but also if you push and hold, you can talk. So I can say, yes, uh, find one and send it to her on WhatsApp. This is, uh, an Android cloud phone. So it's, uh, going to be able to, you know, that has access to all my accounts. So we're going to abstract this away and the execution environment is not really important, but like we can go into technically why Android is actually a pretty good one right now. But, you know, it's searching for Italian restaurants, you know, and we don't have to watch this. I could be, you know, have my AirPods in and in my pocket, you know, it's going to go to WhatsApp, going to find Maria's thread, send her the response and then, and then let us know. Oh my God.
Alessio [00:19:56]: But what's the, I mean, an Italian restaurant. Yeah. What did it choose? What did it choose? It's easy to say. Real Italian is hard to play. Exactly.
Ethan [00:20:04]: It's easy to say. So I doubt it. I don't know.
swyx [00:20:06]: For the record, since you have the Italians, uh, best Italian restaurant in SF.
Maria [00:20:09]: Oh my God. I still don't have one. What? No.
Ethan [00:20:14]: I don't know. Successfully found and shared.
Alessio [00:20:16]: Let's see. Let's see what the AI says. Bottega. Bottega? I think it's Bottega.
Maria [00:20:21]: Have you been to Bottega? How is it?
Alessio [00:20:24]: It's fine.
Maria [00:20:25]: I've been to one called like Norcina, I think it was good.
Alessio [00:20:29]: Bottega is on Valencia Street. It's fine. The pizza is not good.
Maria [00:20:32]: It's not good.
Alessio [00:20:33]: Some of the pastas are good.
Maria [00:20:34]: You know, the people I'm sorry to interrupt. Sorry. But there is like this Delfina. Yeah. That here everybody's like, oh, Pizzeria Delfina is amazing. I'm overrated. This is not. I don't know. That's great. That's great.
swyx [00:20:46]: The North Beach Cafe. That place you took us with Michele last time. Vega. Oh.
Alessio [00:20:52]: The guy at Vega, Giuseppe, he's Italian. Which one is that? It's in Bernal Heights. Ugh. He's nice. He's not nice. I don't know that one. What's the name of the place? Vega. Vega. Vega. Cool. We got the name. Vega. But it's not Vega.
Maria [00:21:02]: It's Italian. What
swyx [00:21:10]: Vega. Vega.
swyx [00:21:16]: Vega. Vega. Vega. Vega. Vega. Vega. Vega. Vega. Vega.
Ethan [00:21:29]: Vega. Vega. Vega. Vega. Vega.
Ethan [00:21:40]: We're going to see a lot of innovation around hardware and stuff, but I think the real core is being able to do something useful with the personal context. You always had the ability to capture everything, right? We've always had recorders, camcorders, body cameras, stuff like that. But what's different now is we can actually make sense and find the important parts in all of that context.
swyx [00:22:04]: Yeah. So, and then one last thing, I'm just doing this for you, is you also have an API, which I think I'm the first developer against. Because I had to build my own. We need to hire a developer advocate. Or just hire AI engineers. The point is that you should be able to program your own assistant. And I tried OMI, the former friend, the knockoff friend, and then real friend doesn't have an API. And then Limitless also doesn't have an API. So I think it's very important to own your data. To be able to reprocess your audio, maybe. Although, by default, you do not store audio. And then also just to do any corrections. There's no way that my needs can be fully met by you. So I think the API is very important.
Ethan [00:22:47]: Yeah. And I mean, I've always been a consumer of APIs in all my products.
swyx [00:22:53]: We are API enjoyers in this house.
Ethan [00:22:55]: Yeah. It's very frustrating when you have to go build a scraper. But yeah, it's for sure. Yeah.
swyx [00:23:03]: So this whole combination of you have my location, my calendar, my inbox. It really is, for me, the sort of personal API.
Alessio [00:23:10]: And is the API just to write into it or to have it take action on external systems?
Ethan [00:23:16]: Yeah, we're expanding it. It's right now read-only. In the future, very soon, when the actions are more generally available, it'll be fully supported in the API.
Alessio [00:23:27]: Nice. I'll buy one after the episode.
Ethan [00:23:30]: The API thing, to me, is the most interesting. Yeah. We do have real-time APIs, so you can even connect a socket and connect it to whatever you want it to take actions with. Yeah. It's too smart for me.
Alessio [00:23:43]: Yeah. I think when I look at these apps, and I mean, there's so many of these products, we launch, it's great that I can go on this app and do things. But most of my work and personal life is managed somewhere else. Yeah. So being able to plug into it. Integrate that. It's nice. I have a bunch of more, maybe, human questions. Sure. I think maybe people might have. One, is it good to have instant replay for any argument that you have? I can imagine arguing with my wife about something. And, you know, there's these commercials now where it's basically like two people arguing, and they're like, they can throw a flag, like in football, and have an instant replay of the conversation. I feel like this is similar, where it's almost like people cannot really argue anymore or, like, lie to each other. Because in a world in which everybody adopts this, I don't know if you thought about it. And also, like, how the lies. You know, all of us tell lies, right? How do you distinguish between when I'm, there's going to be sometimes things that contradict each other, because I might say something publicly, and I might think something, really, that I tell someone else. How do you handle that when you think about building a product like this?
Maria [00:24:48]: I would say that I like the fact that B is an objective point of view. So I don't care too much about the lies, but I care more about the fact that can help me to understand what happened. Mm-hmm. And the emotions in a really objective way, like, really, like, critical and objective way. And if you think about humans, they have so many emotions. And sometimes something that happened to me, like, I don't know, I would feel, like, really upset about it or really angry or really emotional. But the AI doesn't have those emotions. It can read the conversation, understand what happened, and be objective. And I think the level of support is the one that I really like more. Instead of, like, oh, did this guy tell me a lie? I feel like that's not exactly, like, what I feel. I find it curious for me in terms of opportunity.
Alessio [00:25:35]: Is the B going to interject in real time? Say I'm arguing with somebody. The B is like, hey, look, no, you're wrong. What? That person actually said.
Ethan [00:25:43]: The proactivity is something we're very interested in. Maybe not for, like, specifically for, like, selling arguments, but more for, like, and I think that a lot of the challenge here is, you know, you need really good reasoning to kind of pull that off. Because you don't want it just constantly interjecting, because that would be super annoying. And you don't want it to miss things that it should be interjecting. So, like, it would be kind of a hard task even for a human to be, like, just come in at the right times when it's appropriate. Like, it would take the, you know, with the personal context, it's going to be a lot better. Because, like, if somebody knows about you, but even still, it requires really good reasoning to, like, not be too much or too little and just right.
Maria [00:26:20]: And the second part about, well, like, some things, you know, you say something to somebody else, but after I change my mind, I send something. Like, it's every time I have, like, different type of conversation. And I'm like, oh, I want to know more about you. And I'm like, oh, I want to know more about you. I think that's something that I found really fascinating. One of the things that we are learning is that, indeed, humans, they evolve over time. So, for us, one of the challenges is actually understand, like, is this a real fact? Right. And so far, what we do is we give, you know, to the, we have the human in the loop that can say, like, yes, this is true, this is not. Or they can edit their own fact. For sure, in the future, we want to have all of that automatized inside of the product.
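The human-in-the-loop step Maria describes (a user marks an extracted fact as true, marks it as not true, or edits it) can be pictured with a small data structure. This is a minimal sketch, not B's actual implementation: the names `Fact`, `FactStore`, `propose`, `confirm`, `reject`, and `edit` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    PENDING = "pending"      # extracted by the model, not yet reviewed
    CONFIRMED = "confirmed"  # the user said "yes, this is true"
    REJECTED = "rejected"    # the user said "this is not true"


@dataclass
class Fact:
    fact_id: int
    text: str
    status: Status = Status.PENDING


@dataclass
class FactStore:
    """Hypothetical store where every fact waits for a human decision."""
    facts: Dict[int, Fact] = field(default_factory=dict)
    _next_id: int = 1

    def propose(self, text: str) -> Fact:
        # A newly extracted fact starts as PENDING.
        fact = Fact(self._next_id, text)
        self.facts[fact.fact_id] = fact
        self._next_id += 1
        return fact

    def confirm(self, fact_id: int) -> None:
        self.facts[fact_id].status = Status.CONFIRMED

    def reject(self, fact_id: int) -> None:
        self.facts[fact_id].status = Status.REJECTED

    def edit(self, fact_id: int, new_text: str) -> None:
        # An edit by the user is treated as a confirmation of the new text.
        fact = self.facts[fact_id]
        fact.text = new_text
        fact.status = Status.CONFIRMED

    def trusted(self) -> List[str]:
        # Only confirmed facts are returned for use as personal context.
        return [f.text for f in self.facts.values() if f.status is Status.CONFIRMED]
```

Treating an edit as an implicit confirmation is a design assumption of this sketch; a real system might keep the edit history so that changes of mind over time stay visible.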
Ethan [00:26:57]: But, I mean, I think your question kind of hits on, and I know that we'll talk about privacy, but also just, like, if you have some memory and you want to confirm it with somebody else, that's one thing. But it's for sure going to be true that in the future, like, not even that far into the future, it's just going to be kind of normalized. And we're kind of in a transitional period now. And I think, like, one of the key things for us is to kind of navigate that and make sure we're, like, thinking of all the consequences. And how to, you know, make the right choices in the way that everything's designed. And so, like, it's more beneficial than it could be harmful. But it's just too valuable for your AI to understand you. And so if it's, like, the Meta Ray-Bans or Google Astra, I think people are just going to be more used to it. So people's behaviors and expectations will change. Whether that's, like, you know, something that is going to happen now or in five years, it's probably in that range. And so, like, I think we... We kind of adapt to new technologies all the time. Like, when the Ring cameras came out, that was kind of quite controversial. It's like... But now it's kind of... People just understand that a lot of people have cameras on their doors. And so I think that...
Maria [00:28:09]: Yeah, we're in a transitional period for sure.
swyx [00:28:12]: I will press on the privacy thing because that is the number one thing that everyone talks about. Obviously, I think in Silicon Valley, people are a little bit more tech-forward, experimental, whatever. But you want to go mainstream. You want to sell to consumers. And we have to worry about this stuff. Baseline question. The hardest version of this is law. There are one-party consent states where this is perfectly legal. Then there are two-party consent states where they're not. What have you come around to this on?
Ethan [00:28:38]: Yeah, so the EU is a totally different regulatory environment. But in the U.S., it's basically on a state-by-state level. Like, in Nevada, it's single-party. In California, it's two-party. But it's kind of untested. You know, it's different laws, whether it's a phone call, whether it's in person. In a state like California, it's two-party. But, like, anytime you're in public, no consent comes into play, because there's no expectation of privacy when you're in public. But we process the audio and nothing is persisted. And then it's summarized with the speaker identification focusing on the user. Now, it's kind of untested on a legal level, and I'm not a lawyer, but does that constitute the same as, like, a recording? So, you know, it's kind of a gray area and untested in law right now. I think that the bigger question is, you know, because, like, if you had your Ray-Ban on and were recording, then you have a video of something that happened. And that's different than kind of having, like, an AI give you a summary that's focused on you that's not really capturing anybody's voice. You know, I think the bigger question is, regardless of the legal status, like, what is the ethical kind of situation with that? Because even in Nevada, or many other U.S. states where you can record everything and you don't have to have consent, is it still, like, the right thing to do? The way we think about it is that, you know, we take a lot of precautions to kind of not capture personal information of people around. Both through the speaker identification, through the pipeline, and then the prompts, and the way we store the information to be kind of really focused on the user. Now, we know that's not going to, like, satisfy a lot of people. But I think if you do try it and wear it, again, it's very hard for me to see anything, like, if somebody was wearing a B around me, that I would ever object that it captured about me as, like, a third party to it.
And like I said, like, we're in this transitional period where the expectation will just be more normalized. That it's, like, an AI. It's not capturing, you know, a full audio recording of what you said. And it's—everything is fully geared towards helping the person kind of understand their state and providing valuable information to them. Not about, like, logging details about people they encounter.
Alessio [00:30:57]: You know, I've had the same question also with the Zoom meeting transcribers thing. I think there's kind of, like, the personal impact that there's a Fireflies.ai recorder. Yeah. I just know that it's being recorded. It's not like... I don't know if I'm going to say anything different. But, like, intrinsically, you kind of feel it, because it's not pervasive. And I'm curious, especially, like, in your investor meetings. Do people feel differently? Like, have you had people ask you to, like, turn it off? Like, in a business meeting, to not record? I'm curious if you've run into any of these behaviors.
Maria [00:31:29]: You know what's funny? On my end, I wear it all the time. I take my coffee at Blue Bottle with it. Or I work with it. Like, obviously, I work on it. So, I wear it all the time. And so far, I don't think anybody asked me to turn it off. I'm not sure if it's because they were really friendly with me and they know that I'm working on it. But nobody really cared.
swyx [00:31:48]: It's because you live in SF.
Maria [00:31:49]: Actually, I've been in Italy as well. Uh-huh. And in Italy, it's a super privacy concern. Like, Europe is a super privacy concern. And again, there, nothing. Like, it's, I don't know. Yeah. That, for me, was interesting.
Ethan [00:32:01]: I think—yeah, nobody's ever asked me to turn it off, even after giving them full demos and disclosing. I think that some people have said, well, my—you know, in a personal relationship, my partner initially was, like, kind of uncomfortable about it. We heard that from a few users. And that was, like, more in just, like— It's not like a personal relationship situation. And the other big one is people are like, I do like it, but I cannot wear this at work. I guess. Yeah. Yeah. Because, like, I think I will get in trouble based on policies or, like, you know, if you're wearing it inside a research lab or something where you're working on things that are kind of sensitive that, like—you know, so we're adding certain features like geofencing, just, like, at this location. It's just never active.
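The geofencing feature Ethan mentions ("at this location, it's just never active") can be illustrated with a distance check against a list of no-capture zones. This is a minimal sketch under stated assumptions: the `Zone` type, the `capture_allowed` function, and the circular-zone model are hypothetical, and nothing here reflects how B actually implements the feature. The distance uses the standard haversine formula.

```python
import math
from typing import Iterable, NamedTuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


class Zone(NamedTuple):
    """Hypothetical circular no-capture zone (for example, a workplace)."""
    lat: float
    lon: float
    radius_m: float


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def capture_allowed(lat: float, lon: float, no_capture_zones: Iterable[Zone]) -> bool:
    """Return False when the position falls inside any no-capture zone."""
    return all(
        haversine_m(lat, lon, z.lat, z.lon) > z.radius_m
        for z in no_capture_zones
    )
```

For example, with a 200 m zone centered on an office, `capture_allowed` returns `False` at the office itself and `True` a few kilometers away. A production version would also have to handle location noise near the zone boundary, which this sketch ignores.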
swyx [00:32:50]: I mean, I've often actually explained to it the other way, where maybe you only want it at work, so you never take it from work. And it's just a work device, just like your Zoom meeting recorder is a work device.
Ethan [00:33:09]: Yeah, professionals have been a big early adopter segment. And you say in San Francisco, but we have out there our daily shipment of over 100. If you go look at the addresses, Texas, I think, is our biggest state, and Florida, just the biggest states. A lot of professionals who talk for, and we didn't go out to build it for that use case, but I think there is a lot of demand for white-collar people who talk for a living. And I think we're just starting to talk with them. I think they just want to be able to improve their performance around, understand what they were doing.
Alessio [00:33:47]: How do you think about Gong.io? Some of these, for example, sales training thing, where you put on a sales call and then it coaches you. They're more verticalized versus having more horizontal platform.
Ethan [00:33:58]: I am not super familiar with those things, because like I said, it was kind of a surprise to us. But I think that those are interesting. I've seen there's a bunch of them now, right? Yeah. It kind of makes sense. I'm terrible at sales, so I could probably use one. But it's not my job, fundamentally. But yeah, I think maybe it's, you know, we heard also people with restaurants, if they're able to understand, if they're doing well.
Maria [00:34:26]: Yeah, but in general, I think a lot of people, they like to have the double check of, did I do this well? Or can you suggest me how I can do better? We had a user that was saying to us that he used for interviews. Yeah, he used job interviews. So he used B and after asked to the B, oh, actually, how do you think my interview went? What I should do better? And I like that. And like, oh, that's actually like a personal coach in a way.
Alessio [00:34:50]: Yeah. But I guess the question is like, do you want to build all of those use cases? Or do you see B as more like a platform where somebody is going to build like, you know, the sales coach that connects to B so that you're kind of the data feed into it?
Ethan [00:35:02]: I don't think this is like a data feed, more like an understanding kind of engine and like definitely. In the future, having third parties to the API and building out for all the different use cases is something that we want to do. But the like initial case we're trying to do is like build that layer for all that to work. And, you know, we're not trying to build all those verticals because no startup could do that well. But I think that it's really been quite fascinating to see, like, you know, I've done consumer for a long time. Consumer is very hard to predict, like, what's going to be. It's going to be like the thing that's the killer feature. And so, I mean, we really believe that it's the future, but we don't know like what exactly like process it will take to really gain mass adoption.
swyx [00:35:50]: The killer consumer feature is whatever Nikita Bier does. Yeah. Social app for teens.
Ethan [00:35:56]: Yeah, well, I like Nikita, but, you know, he's good at building bootstrap companies and getting them very viral. And then selling them and then they shut down.
swyx [00:36:05]: Okay, so you just came back from CES.
Maria [00:36:07]: Yeah, crazy. Yeah, tell us. It was my first time in Vegas and first time CES, both of them were overwhelming.
swyx [00:36:15]: First of all, did you feel like you had to do it because you're in consumer hardware?
Maria [00:36:19]: Then we decided to be there and to have a lot of partners and media meetings, but we didn't have our own booth. So we decided to just keep that. But we decided to be there and have a presence there, even just us and speak with people. It's very hard to stand out. Yeah, I think, you know, it depends what type of booth you have. I think if you can prepare like a really cool booth.
Ethan [00:36:41]: Have you been to CES?
Maria [00:36:42]: I think it can be pretty cool.
Ethan [00:36:43]: It's massive. It's huge. It's like 80,000, 90,000 people across the Venetian and the convention center. And it's, to me, I always wanted to go just like...
Maria [00:36:53]: Yeah, you were the one who was like...
swyx [00:36:55]: I thought it was your idea.
Ethan [00:36:57]: I always wanted to go just as a, like, just as a fan of...
Maria [00:37:01]: Yeah, you wanted to go anyways.
Ethan [00:37:02]: Because like, growing up, I think CES like kind of peaked for a while and it was like, oh, I want to go. That's where all the cool, like... gadgets, everything. Yeah, now it's like smart fridges and like, you know, vacuums that pick up socks. Exactly.
Maria [00:37:13]: There are a lot of cool vacuums. Oh, they love it.
swyx [00:37:15]: They love the Roombas that pick up socks.
Maria [00:37:16]: And pet tech. Yeah, yeah. And dog stuff.
swyx [00:37:20]: Yeah, there's a lot of like robot stuff. New TVs, new cars that never ship. Yeah. Yeah. I'm thinking like last year, this time last year was when Rabbit and Humane launched at CES and Rabbit kind of won CES. And now this year, no wearables except for you guys.
Ethan [00:37:32]: It's funny because it's obviously it's AI everything. Yeah. Like every single product. Yeah.
Maria [00:37:37]: Toothbrush with AI, vacuums with AI. Yeah. Yeah.
Ethan [00:37:41]: We like hair blow, literally a hairdryer with AI. We saw.
Maria [00:37:45]: Yeah, that was cool.
Ethan [00:37:46]: But I think that like, yeah, we didn't, another kind of difference like around our, like we didn't want to do like a big overhypey promised kind of Rabbit launch. Because I mean, they did, hats off to them, like on the presentation and everything, obviously. But like, you know, we want to let the product kind of speak for itself and like get it out there. And I think we were really happy. We got some very good interest from media and some of the partners there. So like it was, I think it was definitely worth going. I would say like if you're in hardware, it's just kind of how you make use of it. Like I think to do it like a big Rabbit style or to have a huge show on there, like you need to plan that six months in advance. And it's very expensive. But like if you, you know, go there, there's everybody's there. All the media is there. There's a lot of some pre-show events that it's just great to talk to people. And the industry also, all the manufacturers, suppliers are there. So we learned about some really cool stuff that we might like. We met with somebody. They have like thermal energy capture. And it's like, oh, could you maybe not need to charge it? Because they have like a thermal that can capture your body heat. And what? Yeah, they're here. They're actually here. And in Palo Alto, they have like a Fitbit thing that you don't have to charge.
swyx [00:39:01]: Like on paper, that's the power you can get from that. What's the power draw for this thing?
Ethan [00:39:05]: It's more than you could get from the body heat, it turns out. But it's quite small. I don't want to disclose technically. But I think that solar is still, they also have one where it's like this thing could be like the face of it. It's just a solar cell. And like that is more realistic. Or kinetic. Kinetic, apparently, I'm not an expert in this, but they seem to think it wouldn't be enough. Kinetic is quite small, I guess, on the capture.
swyx [00:39:33]: Well, I mean, watch. Watchmakers have been powering with kinetic for a long time. Yeah. We don't have to talk about that. I just want to get a sense of CES. Would you do it again? I definitely would not. Okay. You're just a fan of CES. Business point of view doesn't make sense. I happen to be in the conference business, right? So I'm kind of just curious. Yeah.
Maria [00:39:49]: So I would say as we did, so without the booth and really like straightforward conversations that were already planned. Three days. That's okay. I think it was okay. Okay. But if you need to invest for a booth that is not. Okay. A good one. Which is how much? I think.
Ethan [00:40:06]: 10 by 10 is 5,000. But on top of that, you need to. And then they go like 10 by 10 is like super small. Yeah. And like some companies have, I think would probably be more in like the six figure range to get. And I mean, I think that, yeah, it's very noisy. We heard this, that it's very, very noisy. Like obviously if you're, everything is being launched there and like everything from cars to cell phones are being launched. Yeah. So it's hard to stand out. But like, I think going in with a plan of who you want to talk to, I feel like.
Maria [00:40:36]: That was worth it.
Ethan [00:40:37]: Worth it. We had a lot of really positive media coverage from it and we got the word out and like, so I think we accomplished what we wanted to do.
swyx [00:40:46]: I mean, there's some world in which my conference is kind of the CES of whatever AI becomes. Yeah. I think that.
Maria [00:40:52]: Don't do it in Vegas. Don't do it in Vegas. Yeah. Don't do it in Vegas. That's the only thing. I didn't really like Vegas. That's great. Amazing. Those are my favorite ones.
Alessio [00:41:02]: You can not fit 90,000 people in SF. That's really duh.
Ethan [00:41:05]: You need to do like multiple locations so you can do Moscone and then have one in.
swyx [00:41:09]: I mean, that's what Salesforce conferences. Well, GDC is how many? That might be 50,000, right? Okay. Form factor, right? Like my way to introduce this idea was that I was at the launch in Solaris. What was the old name of it? Newton. Newton. Of Tab when Avi first launched it. He was like, I thought through everything. Every form factor, pendant is the thing. And then we got the pendants for this original. The first one was just pendants and I took it off and I forgot to put it back on. So you went through pendants, pin, bracelet now, and maybe there's sort of earphones in the future, but what was your iterations?
Maria [00:41:49]: So we had, I believe now three or four iterations. And one of the things that we learned is indeed that people don't like the pendant. In particular, woman, you don't want to have like anything here on the chest because it's maybe you have like other necklace or any other stuff.
Ethan [00:42:03]: You just ship a premium one that's gold. Yeah. We're talking some fashion reached out to us.
Maria [00:42:11]: Some big fashion. There is something there.
swyx [00:42:13]: This is where it helps to have an Italian on the team.
Maria [00:42:15]: There is like some big Italian luxury. I can't say anything. So yeah, the bracelet actually came from the community because they were like, oh, I don't want to wear anything as a necklace or as a pendant. And also the one that we had, I don't know if you remember, it was like a circle, like this, and it was really bulky. People didn't like it. And also, I mean, I actually don't dislike it. We were running fast when we did that. Our thing was, we wanted to ship as soon as possible. So we were not overthinking the form factor or the material. We just wanted to be out. But after, the community organically, basically all of them, were like, well, why don't you just do the bracelet? It's way better. I will just wear it. And that's it. So that's how we ended up with the bracelet, but it's still modular. So I still want to play around the fact that it's modular and you can, you know, take it off and wear it as a clip, or in the future, maybe we will bring back the pendant. But I like the fact that there is some personalization, and right now we have two colors, yellow and black. Soon we will have other ones. So yeah, we can play a lot around that.
Ethan [00:43:25]: I think the form factor, like, the goal is for it to be not super invasive. Right. And something that's easy. So I think in the future, smaller, thinner, not like Apple-type obsession with thinness, but it does matter, the size and weight. And we would love to have more context because that will help, but to make it work, I think it really needs to have good power consumption, good battery life. And, you know, like with the Humane swapping the batteries, I have one, I mean, and there's pretty incredible, some of the engineering they did, but it wasn't kind of geared towards solving the problem. It's too heavy. The swappable batteries are too much to manage, the heat, the thermals are too much, the light interface thing. Yeah. Like that. That's cool. It's cool. It's cool. But if you have your hand out here, you want to use your phone, like it's not really solving a problem. Because you know how to use your phone. It's got a brilliant display. You have to kind of learn how to gesture with this low-resolution laser. Yeah. But the laser is cool, the fact they got it working in that thing, even if it did overheat. But too heavy, too cumbersome, too complicated with the multiple batteries. So something that's power efficient, kind of thin, both in the physical sense and also in the edge compute kind of way so that it can be as unobtrusive as possible. Yeah.
Maria [00:44:47]: Users really like, like, I like when they say yes, I like to wear it and forget about it because I don't need to charge it every single day. On the other version, I believe we had like 35 hours or something, which was okay. But people, they just prefer the seven days battery life and-
swyx [00:45:03]: Oh, this is seven days? Yeah. Oh, I've been charging every three days.
Maria [00:45:07]: Oh, no, you can like keep it like, yeah, it's like almost seven days.
swyx [00:45:11]: The other thing that occurs to me, maybe there's an Apple watch strap so that I don't have to double watch. Yeah.
Maria [00:45:17]: That's the other one that, yeah, I thought about it. I saw as well the ones that, like, you can put on the back of the phone. Like, you know- Plaud. There is a lot.
swyx [00:45:27]: So yeah, there's a competitor called Plaud. Yeah. It's not really a competitor. They only transcribe, right? Yeah, they only transcribe. But they're very good at it. Yeah.
Ethan [00:45:33]: No, they're great. Their hardware is really good too.
swyx [00:45:36]: And they just launched the pin too. Yeah.
Ethan [00:45:38]: I think that the MagSafe kind of form factor has a lot of advantages, but some disadvantages. You can definitely put a very huge battery on that, you know? And so like the battery life's not, the power consumption's not so much of a concern, but you know, downside the phone's like in your pocket. And so I think that, you know, form factors will continue to evolve, but, and you know, more sensors, less obtrusive and-
Maria [00:46:02]: Yeah. We have a new version.
Ethan [00:46:04]: Easier to use.
Maria [00:46:05]: Okay.
swyx [00:46:05]: Looking forward to that. Yeah. I mean, we'll, whenever we launch this, we'll try to show whatever, but I'm sure you're going to keep iterating. Last thing on hardware, and then we'll go on to the software side, because I think that's where you guys are also really, really strong. Vision. You wanted to talk about why no vision? Yeah.
Ethan [00:46:20]: I think it comes down to like when you're, when you're a startup, especially in hardware, you're just, you work within the constraints, right? And so like vision is super useful and super interesting. And it's what we actually started with. There's two issues with vision that make it like not the place we decided to start. One is power consumption. So you know, you kind of have to trade off your power budget, like capturing even at a low frame rate and transmitting, the radio is actually the thing that takes up the majority of the power. So. Yeah. So you would really have to have quite a, like unacceptably, like large and heavy battery to do it continuously all day. We have, I think, novel kind of alternative ways that might allow us to do that. And we have some prototypes. The other issue is form factor. So like even with like a wide field of view, if you're wearing something on your chest, it's going, you know, obviously the wrist is not really that much of an option. And if you're wearing it on your chest, you're going to probably be not capturing like the field of view of what's interesting to you. So that leaves you kind of with your head and face. And then anything that goes on, on the face has to look cool. Like I don't know if you remember the Spectacles, it was kind of like the first, yeah, but they kind of, they didn't, they were not very successful. And I think one of the reasons is they were, they're so weird looking. Yeah. The camera was so big on the side. And if you look at the Meta Ray-Bans, where they're way more successful, they, they look almost indistinguishable from Ray-Bans. And they invested a lot into that and they, they have a partnership with Qualcomm to develop custom silicon. They have a stake in Luxottica now. 
So like they're coming from all the angles, like to make glasses. I think like, you know, I don't know if you know Brilliant Labs, they're a cool company, they make Frame, which is kind of like cool hackable glasses and, and, and like, they're really good, like on hardware, they're really good. But even if you look at Frame, which I would say is like the most advanced kind of startup. Yeah. Yeah. Yeah. There was one that launched at CES, but it's not shipping yet. Like one that you can buy now, it's still not something you'd wear every day and the battery life is super short. So I think just the challenge of doing vision right, like off the bat, like would require quite a bit more resources. And so like audio is such a good entry point and it's also the privacy around audio. If you, if you had images, that's like another huge challenge to overcome. So I think that ideally the personal AI would have, you know, all the senses and you know, we'll, we'll get there. Yeah. Okay.
swyx [00:48:57]: One last hardware thing. I have to ask this because then we'll move to the software. Were either of you electrical engineering?
Ethan [00:49:04]: No, I'm CS. And so I have a, I've taken some EE courses, but I, I had done prior to working on, on the hardware here, like I had done a little bit of like embedded systems, like very little firmware, but we have luckily on the team, somebody with deep experience. Yeah.
swyx [00:49:21]: I'm just like, you know, like you have to become hardware people. Yeah.
Ethan [00:49:25]: Yeah. I mean, I learned to worry about supply chain, power, I think, like, radio.
Maria [00:49:30]: There's so many things to learn.
Ethan [00:49:32]: I would tell this about hardware, like, and I know it's been said before, but building a prototype and like learning how the electronics work and learning about firmware and developing, this is like, I think fun for a lot of engineers and it's, it's all totally like achievable, especially now, like with, with the tools we have, like stuff you might've been intimidated about. Like, how do I like write this firmware now? With Sonnet, like you can, you can get going and actually see results quickly. But I think going from prototype to actually making something manufactured is an enormous jump. And it's not all about technology: the supply chain, the procurement, the regulations, the cost, the tooling. The thing about software that I'm used to is it's funny that you can make changes all along the way and ship it. But like when you have to buy tooling for an enclosure that's expensive.
swyx [00:50:24]: Do you buy your own tooling? You have to.
Ethan [00:50:25]: Don't you just subcontract out to someone in China? Oh, no. Do we make the tooling? No, no. You have to have CNC and like a bunch of machines.
Maria [00:50:31]: Like nobody makes their own tooling, but like you have to design this design and you submit
Ethan [00:50:36]: it and then they go four to six weeks later. Yeah. And then if there's a problem with it, well, then you're not, you're not making any, any of your enclosures. And so you have to really plan ahead. And like.
swyx [00:50:48]: I just want to leave tips for other hardware founders. Like what resources or websites are most helpful in your sort of manufacturing journey?
Ethan [00:50:55]: You know, I think it's different depending on like it's hardware so specialized in different ways.
Maria [00:51:00]: I will say that, for example, I should choose a manufacturer company. I speak with other founders and like we can give you like some, you know, some tips of who is good and who is not, or like who's specialized in something versus somebody else. Yeah.
Ethan [00:51:15]: Like some people are good in plastics. Some people are good.
Maria [00:51:18]: I think for us, it really helped at the beginning to speak with others and understand, okay, who is around. I worked in Shenzhen. I lived almost two years in China. I have an idea about the different hardware manufacturers and all of that. Soon I will go back to Shenzhen to check things out. So I think it's also good to go there in person and check.
Ethan [00:51:40]: Yeah, so we did some stuff domestically, if you have that ability. The reason I say ability is that it's very expensive. But to build out some proofs of concept and do field testing before you take it to a manufacturer, despite what people say, there's really good domestic manufacturing for small quantities at extremely high prices. So we got our first PCB and the assembly done in LA. There's a lot of good manufacturing there, because of the defense industry, that can do quick turns. So it's like, we need this board. We need to find out if it's working. We have this deadline we want to start, but you need to go through this. And if you want to have it done and fabricated in a week, they can do it for a price. But I think everybody's kind of trending, even for prototyping, toward moving that offshore, because in China you can do prototyping and get it within almost the same timeline. But the thing with manufacturing is that it really helps to go there and kind of establish the relationship. Yeah.
Alessio [00:52:38]: My first company was a hardware company, and we did our PCBs in China, and it took a long time. Now things are better. But this was, yeah, I don't know, 10 years ago, something like that. Yeah.
Ethan [00:52:47]: I think that, and I've heard this too, though we didn't run into this problem, if you don't have the relationship, they don't see you, they don't know you, you might get subcontracted out, or they're not paying attention. But if you have the relationship and you're a priority, yeah, it's really good. We ended up doing the fabrication and assembly in Taiwan for various reasons.
Maria [00:53:11]: And I think it really helped the fact that you went there at some point. Yeah.
Ethan [00:53:15]: We're really happy with the process. But I mean, the whole process of just choosing the right people, and also sourcing the bill of materials and all of that stuff... I guess if you have time, it's not that bad, but if you're trying to really push the speed, it's incredibly stressful. Okay. We've got to move to the software. Yeah.
Alessio [00:53:38]: Yeah. So the hardware, maybe it's hard for people to understand, but what software people can understand is that running transcription and summarization, all of these things, in real time, every day, 24 hours a day, is not easy. So you mentioned 200,000 tokens for a day. Yeah. How do you make it basically free to run all of this for the consumer?
Ethan [00:53:59]: Well, I think that the pipeline and the inference, like people think about all of these tokens, but as you know, the price of tokens is like dramatically dropping. You guys probably have some charts somewhere that you've posted. We do. And like, if you see that trend in like 250,000 input tokens, it's not really that much, right? Like the output.
swyx [00:54:21]: You do several layers. You do live. Yeah.
Ethan [00:54:23]: Yeah. So the speech-to-text is the most challenging part, actually, because it requires real-time processing and then later processing with a larger model. And one thing that is fairly obvious is that you don't need to transcribe things that don't have any voice in them, right? So good voice activity detection is key, because the majority of most people's day is not spent with voice activity. That is the first step to cutting down the amount of compute you have to do. And voice activity detection is a fairly cheap thing to do. Very, very cheap. For the models that summarize, you don't need a Sonnet-level kind of model. You do need a Sonnet-level model to execute things like the agent. And we will be having a subscription for features like that, although now with R1, we'll see, we haven't evaluated it. DeepSeek? Yeah. I mean, not that one in particular, but there are already models out there that can kind of perform at that level. I was like, it's going to stay in six months, but like, yeah. So self-hosted models help in the things where you can. So you are self-hosting models. Yes. You are fine-tuning your own ASR. Yes. I will say that I see in the future that everything's trending down. Although I think there might be an intermediary step where things become expensive, which we're really interested in, because the pipeline is very tedious and takes a lot of tuning, right? Which is brutal, because it's just a lot of trial and error. Whereas, wouldn't it be nice if an end-to-end model could just do all of this and learn it? If we could do transcription with an LLM, there are so many advantages to that, but it's going to be a larger model and hence more compute. You know, we're optimistic.
Maybe we could distill something down. We're kind of focused, more than on reducing the cost of the existing pipeline, on trying to get to the next generation. Because it's very clear that all ASR, all speech-to-text, is going to be obsolete pretty soon. So investing in that is probably kind of a dead end.
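The gating idea Ethan describes, skipping audio that has no voice in it before it reaches the transcription model, can be illustrated with a minimal sketch. This is not the pipeline discussed in the episode: it assumes a simple energy threshold as a stand-in for a real voice activity detector, and the function names, frame size, and threshold are all hypothetical.

```python
import math

def frame_rms(frame):
    """Root-mean-square energy of one frame of float samples in [-1, 1]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def gate_audio(samples, frame_size=160, threshold=0.02):
    """Split audio into frames and keep only frames above an energy threshold.

    Assumption: energy is used as a crude proxy for "voice present".
    Returns (kept_frames, fraction_of_frames_skipped).
    """
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    kept = [f for f in frames if frame_rms(f) >= threshold]
    skipped = 1 - len(kept) / len(frames) if frames else 0.0
    return kept, skipped

# Ten silent frames followed by one frame of a 440 Hz tone sampled at 16 kHz.
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
kept, skipped = gate_audio(silence + tone)
print(f"frames sent to transcription: {len(kept)}, skipped: {skipped:.0%}")
```

Only the frames that pass the gate would be handed to the more expensive speech-to-text step, which is where the compute saving comes from.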
swyx [00:56:39]: It's interesting. I think when I initially invested in Tab, and this shows you how wrong I was, I was like, oh, this is a sort of razors-and-blades model, where you sell cheap hardware and you make it up with a subscription, like a monthly subscription. And now I just checked: Friend is a one-time sale, $99. Limitless, one-time sale, $99. These guys, one-time sale, $49, and inference is free. What? Wow. It's crazy.
Ethan [00:57:09]: I think when you probably invested, like how much was a million input tokens at that time and what is it now?
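To make the arithmetic behind this point concrete, here is a small sketch of daily inference cost at the 250,000 input tokens per day mentioned earlier. The two per-million-token prices are purely hypothetical placeholders chosen to show the effect of a falling price; they are not quotes for any real model.

```python
def daily_input_cost(tokens_per_day, usd_per_million_tokens):
    """Cost in USD of one day's input tokens at a given price per million tokens."""
    return tokens_per_day / 1_000_000 * usd_per_million_tokens

TOKENS_PER_DAY = 250_000  # figure mentioned in the conversation

# Hypothetical prices, for illustration only.
HYPOTHETICAL_PRICES = {
    "earlier, pricier model": 10.00,
    "later, cheaper model": 0.10,
}

for label, price in HYPOTHETICAL_PRICES.items():
    daily = daily_input_cost(TOKENS_PER_DAY, price)
    print(f"{label}: ${daily:.3f} per day, ${daily * 30:.2f} per 30 days")
```

Under these assumed prices, the same workload drops from a few dollars a day to a few cents a day, which is the kind of shift that makes a one-time hardware sale with bundled inference plausible.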
swyx [00:57:15]: It's a fascinating business and like, you know, there's a lot to dig into there, but just getting that perspective out there is, I think it's not something that people think about a lot.
Alessio [00:57:24]: And you obviously have thought a lot about it. What about memory? I think this is something we go back and forth on: memory as in you're just memorizing facts, versus understanding implicit preferences and adjusting the facts that you think are important. Have you ever done something about a person? Any learnings from that? I know there are a lot of open source frameworks now that do it. Did you build all of your own infrastructure internally?
Ethan [00:57:46]: Yeah, we did. I mean, I evaluated and used a lot of them in other projects. I think there are a few different tasks or things that revolve around memory. One is retrieval, obviously: even if you have a large corpus, how do you find things? And so I think existing RAG pipelines will probably be the most helpful there. As for the frameworks, I have not found one, like, there's no general way to do RAG that works. It's really highly dependent on the data. So if you're going to be customizing something that much, you get more bang for the buck from designing it all yourself. You know, a lot of those frameworks are great for getting going quickly. But I think memory is really interesting when you're trying to do it for a person, because memory decays, right? Like, I'm going to London, then I come back, I'm not going to London anymore. What we've learned is that doing the traditional embedding and RAG is suboptimal. We kind of built our own, using small models to do really massively parallel retrieval, which I think is going to be maybe more common in the future. And then there's how to represent a person. We still require a human in the loop. And I mean, this is an ongoing project, and we're learning every day. Like, how do you correct the model when it gets something wrong about you? Right now, we have things that are super confirmed, that are ground truth about you because the human accepted them. But ideally, that step wouldn't be necessary. And then we have things that are fuzzier. And the more stuff that we know is true, the more accurate we are when we're trying to decide whether the fuzzy stuff is true. Because, like, if you have the context, it's probably not true.
So I think one of the most core challenges is how to handle both retrieval and modeling, especially when you're dealing with noisy source data. Because in an ideal world, even if you had perfect transcription and you're going off that, that's still not enough information, right? And even if you had visual, it's still not enough. Like, there's still going to be...
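One way to picture the split between confirmed facts and fuzzier, decaying ones is the sketch below. The episode does not describe how this is implemented, so everything here is an assumption for illustration: the `Fact` structure, the exponential half-life, and the choice that user-confirmed facts never decay (a real system would likely need expiry for confirmed facts too, as the London example suggests).

```python
from dataclasses import dataclass

@dataclass
class Fact:
    text: str
    confidence: float             # initial belief in [0, 1]
    observed_day: int             # day index when the fact was inferred
    confirmed: bool = False       # True once the user has accepted it
    half_life_days: float = 14.0  # hypothetical decay rate

def current_confidence(fact, today):
    """Confirmed facts are treated as ground truth; others decay exponentially."""
    if fact.confirmed:
        return 1.0
    age = max(0, today - fact.observed_day)
    return fact.confidence * 0.5 ** (age / fact.half_life_days)

def active_facts(facts, today, min_confidence=0.3):
    """Facts still believed strongly enough to be used as context."""
    return [f for f in facts if current_confidence(f, today) >= min_confidence]

facts = [
    Fact("is going to London", confidence=0.8, observed_day=0),
    Fact("likes French food", confidence=0.9, observed_day=0, confirmed=True),
]
for day in (0, 14, 28):
    print(day, [f.text for f in active_facts(facts, today=day)])
```

In this toy setup the unconfirmed travel plan drops out of the active context after a few weeks, while the confirmed preference stays until someone corrects it.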
Alessio [00:59:55]: Yeah, one way I think about it is I usually like to order the same thing from the same restaurant if I like it. But I'm not saying that out loud. And it's kind of like, there are these types of behaviors. Like, when you ask about a favorite restaurant, I would just want it to give me restaurants that I've already been to that I like. Or if I'm like, hey, just order something from this place, it should just reorder the same thing, because it knows that I like to redo the same thing. But I feel like today, most agent memory things that I see people publish, it's like, you know, just write down the data thing.
Ethan [01:00:39]: Yeah, I mean, I think that's why the reasoning, like, in our case, giving it time to consider all of the sources it has. So, look at the email, see the receipts, and then look at the conversations to see what I've mentioned. And then being able to take enough time to search through all the contexts and connect the dots is, I think, really important. And, I don't know, some of the agent memory stuff is, like, key-value with RAG on top. And the results there are just not complete enough when you have a growing corpus and you're managing decay and hallucinations that might be in the source material. So, this is where people usually bring in knowledge graphs. Yes. And do you do it? We don't extensively use knowledge graphs. It's something, you know, we also didn't talk about the kind of potential future social aspects.
Maria [01:01:33]: Yeah, I wanted to speak about it.
Ethan [01:01:35]: But the problem with knowledge graphs that we found is, like, and I don't know if you can tell me what your experience has been, but they're great for representing the data, but then, like, using it at inference time is kind of challenging.
swyx [01:01:49]: For speed or what other issues?
Ethan [01:01:51]: Just, like, the LLM understanding the graph. Yeah. The input. Yeah, it's not in the training data, for sure. I think that the graph is the right kind of way to store the data, but then you need to have the right retrieval, and then format it in a way that doesn't just overwhelm or confuse what you're trying to do. Should we ask about social? Yeah, I thought you were going to go into it. Yeah. We did some experimentation, not directly related to graph retrieval or graph knowledge bases. Yeah. The idea is that you have your personal context, but then other people can query it, and it can divulge some things that you would have full control over. Then if Maria and I are trying to negotiate where we're going to dinner, there can be an exchange. We did exactly this experiment. Yeah. There can be an exchange between the agents and, like, oh.
Maria [01:02:45]: So, like, my agent can speak with Ethan's agent. Both of them know our location, what we like, where we went in the past. Yeah. And even, you know, if we have our calendars integrated, they know when we're free. So they can interact with each other and have a conversation and decide on a place for us to go. Wow. And we did that. And it was, for me, really cool, because they suggested a nice French restaurant to us, which we went to in the end.
swyx [01:03:11]: That you've never been to?
Maria [01:03:12]: That we've never been to. Okay. But they said that both of us like French food, and both of us were in Pacific Heights. And, yeah, this was really trivial. Yeah.
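The dinner experiment can be sketched as two agents that each disclose only what their owner has approved and then intersect the results. This is a toy illustration, not the experiment's actual code: the `PersonalAgent` class, its fields, and the `shareable` allow-list are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field

@dataclass
class PersonalAgent:
    owner: str
    cuisines: set
    free_evenings: set
    # Names of attributes the owner has approved for sharing (hypothetical mechanism).
    shareable: set = field(default_factory=set)

    def disclose(self, attribute):
        """Return an attribute only if the owner has allowed it to be divulged."""
        if attribute not in self.shareable:
            return None
        return getattr(self, attribute)

def negotiate_dinner(a, b):
    """Intersect what both agents are allowed to share; None means no agreement
    is possible without further human sign-off or overlapping preferences."""
    proposal = {}
    for attribute in ("cuisines", "free_evenings"):
        mine, theirs = a.disclose(attribute), b.disclose(attribute)
        if mine is None or theirs is None:
            return None
        common = mine & theirs
        if not common:
            return None
        proposal[attribute] = sorted(common)
    return proposal

maria = PersonalAgent("Maria", {"French", "Italian"}, {"Fri", "Sat"},
                      shareable={"cuisines", "free_evenings"})
ethan = PersonalAgent("Ethan", {"French", "Thai"}, {"Sat"},
                      shareable={"cuisines", "free_evenings"})
print(negotiate_dinner(maria, ethan))
```

If either owner removes an attribute from `shareable`, the negotiation returns `None` instead of guessing, which mirrors the idea that the human keeps full control over what is divulged.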
Ethan [01:03:23]: It's a trivial, like, toy use. But I guess, like, in terms of you've been using it for a while, like, if I wanted to buy you a gift.
Maria [01:03:30]: Oh, my God. You bought me a bunch of candles now that I think about it.
Ethan [01:03:35]: This is another use case. I was like, yeah. When we were testing the agent, like, a bunch of candles from Amazon showed up at her door.
Maria [01:03:43]: Yeah, because I love candles, but I didn't expect 20. Yeah.
Ethan [01:03:47]: It was a lot of experimenting. But, like, how to manage that where it's like, what's okay for your B to divulge to him? Who? Yeah. Like, shouldn't you get an authorization request every time? Yeah, yeah, yeah.
swyx [01:03:58]: For personal context. Yeah, yeah, yeah.
Ethan [01:04:00]: So, like, you know, you would have to, human would have to sign off on it. But I think then, like, then I wouldn't have to guess. I could just.
swyx [01:04:10]: Yeah, yeah. You know, there's this culture that is very alien to everyone outside of SF and outside the Gen Z bubble in SF, which is location sharing. Yeah. I can tell exactly where my close friends are right now in the city. Yeah. And it's opt-in. And, like, it's. Dude. Dude. You know, it's normal, and it freaks out everyone who's not here. Yeah. Yeah. And so maybe we can share preferences, like, who we like. Absolutely.
Maria [01:04:34]: I really believe in it, for sure. We will.
Ethan [01:04:36]: Or even, like, small updates about your day. My parents would love that because I don't do that. Yeah. Now there's no friction. It can just be more or less automatic. Yeah. Dating? I was trained always to avoid dating. Really? As a startup founder. Yeah, you can hate that. Yeah. Everyone hates it?
Maria [01:04:55]: We thought about it. Like, sometimes people ask us, because it's like, oh, you know so much about me, can you measure compatibility with somebody else or something like that? Yeah. Probably there is a future. Maybe somebody should build that. I think on our end, we were like, no, this is. We don't want to.
Ethan [01:05:11]: I will build on your API. My sister is actually a personality psychology professor and she studies personality. And we were at Thanksgiving because my parents wear one. And I was like, ask it. Like, give me my big five. Yeah. Which is like the personality type. And it's like. Does it know my big five? Just ask it to consider everything and give your big five. And my sister said it was pretty. I didn't agree with it because it said I was disagreeable. I agree with that. But she seemed to think it was agreeable. And so.
swyx [01:05:41]: You disagree that you're disagreeable? Yeah. Yeah. What other proof do we need then?
Ethan [01:05:47]: Yeah. I think I'm very agreeable.
Ethan [01:05:51]: But I think that we do. I did get some users are like, oh, if like we're a couple. Yeah.
Maria [01:05:56]: We had like couples. Actually. They bought the product together. Yeah. Like both. Like couple. They bought the hardware. So there is something there. Another test is like the Myers-Briggs. I know that you don't like that one. No. No.
swyx [01:06:08]: Ocean is cooler than Myers-Briggs. Yeah. Everyone stop using my MBTI. Use my. Use Ocean. Yeah.
Maria [01:06:12]: Yeah. For me, like it was on point. Like every time. Like it. Awesome.
Alessio [01:06:16]: Anything else that we didn't cover? Any cool underrated things?
Maria [01:06:21]: Go to b.computer. $49.99, and you buy the device. That's the call to action.
swyx [01:06:28]: And you're hiring?
Maria [01:06:29]: We are hiring. For sure.
Ethan [01:06:32]: AI engineers.
Maria [01:06:33]: AI engineers. Nice. What is an AI engineer?
Ethan [01:06:35]: Yeah. But did you study? Somebody who's scrappy and willing to.
Maria [01:06:42]: Work with us. Yeah.
Ethan [01:06:43]: I think. I think you coined the term, right? So you can tell us.
Maria [01:06:48]: Somebody that can adapt. That has resistance. Yeah. Yeah.
swyx [01:06:51]: People have different perspectives and what is useful for you is different from what is useful for me. Yeah. So anyway, it's so useful.
Ethan [01:06:57]: I mean, I think that always-on AI is really going to explode, and there's going to be a lot from both startups and incumbents, and there are going to be all kinds of new things that we're going to learn about how it's going to change all of our lives. I think that's the thing I'm most certain about. So. And being AI.
swyx [01:07:15]: Well, thanks very much. Thank you guys. This is a pleasure. Thank you. Yeah. We'll see you launch whenever. Thank you. I'm sure that launch is happening. Yeah. Thanks. Thank you.