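After the Chat Box: Voice, Glasses, and the Next Generation of Interaction
Topic Synthesis
Topic page (living document) · Last updated 2026-06-12 · Based on 9 interviews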
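Changelog
- 2026-06-12: First synthesis, based on 9 interviews. Every speaker across the nine interviews takes for granted that the chat box and the phone screen are only transitional forms. The real disagreement is over the dimension of competition: the voice camp bets on a companion that demands zero attention, the glasses camp bets on a high-bandwidth visual replacement, and Eugenia Kuyda alone cautions that screens and phones may never leave the stage at all, and that the bottleneck is social acceptance rather than optics. (This topic was surfaced by mining the old archive with /excavate.)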
Mainstream Consensus
Consensus one: nobody defends the chat box. This is the most consistent premise across all 9 interviews. Whether they build voice products, glasses, or personal software, the speakers treat today's chat interface as an intermediate state that is bound to be surpassed.
"And certainly the chatbot is not the end of the road here for the way that people are going to want to engage with and use AI." Evan Spiegel · Snap CEO Evan Spiegel is Betting on Smart Glasses
"We got obsessed with this metaphor that the current chatbots are really the Microsoft DOS era for AI interfaces and that there will be some sort of a Windows, macOS moment." Eugenia Kuyda (Replika / Wabi) · Seeing The Future from AI Companions to Personal Software
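"If you were starting from scratch today, you probably would not build this app-centric world, where I as a consumer trying to solve a problem first have to decide which provider to use to solve it." (This transcript survives only in Chinese; the English is a back-translation.)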
— Andrew Bosworth (Meta CTO) · What Comes After Mobile? Meta's Andrew Bosworth on AI and Consumer Tech
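Consensus two: AI does not reskin the interface; it changes the division of labor between people and computers, turning the person from "operator" into "supervisor and task-setter." Both the glasses camp and the voice camp build their arguments on this foundation.
"Instead of needing to sit and operate your computer all day long, AI is going to operate your computer for you and you're going to observe AI, monitor AI, make sure it's on the right track."
"What I want is to play music. … I just want to say it to the AI. … I don't want to be responsible for orchestrating which app I open to do something. We have had to do that because that is how things have been done throughout the history of digital computing." (Chinese-only transcript; back-translated.)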
— Andrew Bosworth · What Comes After Mobile?
Consensus three: screen fatigue is the shared emotional fuel of this generation of interface startups. Spiegel's "seven hours hunched over" and Raiza Martin's "crazy" fight for eyeballs are the same observation turned into two different products.
"All of this felt very limited by this tiny screen where we spend, you know, seven hours a day hunched over this tiny glowing rectangle." Evan Spiegel · A Conversation with Snapchat CEO Evan Spiegel
"It's kind of crazy that when you build anything, you have to compete for the user's eyeballs. … And audio is really special in the sense that you can participate in it when you're doing something else." Raiza Martin (Huxe) · Building the Star Trek computer with Huxe CEO Raiza Martin
Consensus four: a new form factor must deliver 10x value right away, or nobody changes their habits. Even the glasses camp concedes this hard bar.
"We put a camera on a pair of glasses, but the value just wasn't there. It's not 10 times better than just taking a picture with your phone." Evan Spiegel · How Snap Plans to Win the AR Race
Where They Disagree
The voice camp: the Star Trek computer and "passive content"
Most of the voice camp's standard-bearers actually appear through someone else's mouth. Sam Altman's position is relayed by Alex Heath (journalist and co-host of the Huxe interview):
"He was talking about how he thinks that AI will be primarily used through voice. And he's like so into voice as the main input for AI." Alex Heath (relaying Sam Altman, OpenAI Dev Day Q&A) · Building the Star Trek computer
"And Jony seemed to agree that the OS matters less and less, especially with voice. And I do think their device will be voice." Alex Heath (on the Jony Ive × OpenAI device) · Building the Star Trek computer
Raiza Martin is one of the few in the voice camp who argues in the first person. Her starting point is the Star Trek of her childhood:
"You stand in front of the thing and you say, computer, and then you issue the command, right? And it just happens. … And so it's totally magical. I think it's the interface of the future." Raiza Martin · Building the Star Trek computer
But she puts a clear-eyed limit on the assumption that everyone wants a voice assistant. Most people do not treat the world as a conversational interface the way tech workers do:
"If you already have an internal monologue, I think it's easy for you to imagine that you could have an AI voice assistant. You watch the movie Her and you're like, for sure, I get it. But then I think for most people, the movie Her is just a movie." Raiza Martin · Building the Star Trek computer
So her product strategy is not to get users to speak up, but the reverse:
"And I think that the key to this is actually passive content that you make interactive." Raiza Martin · Building the Star Trek computer
Host Ellis Hamburger spells out the competitive dimension on the voice camp's behalf. Do not compete with vision on bandwidth; compete on the time you can occupy:
"A lot of people in tech talk about like, what is the throughput of the different modalities? … Dude, like it's not about that. It's just about the time spent." Ellis Hamburger · Building the Star Trek computer
Worth keeping on file: in the interview with the loudest title, ElevenLabs CEO: Why Voice is the Next AI Interface, Mati Staniszewski never actually argues that proposition in the transcript. What he offers is evidence that voice is maturing as a technology stack:
"That suddenly shifts from using a one foundational model into combining the speech-to-text, the LLM, the text-to-speech to orchestrate them together." Mati Staniszewski (ElevenLabs) · ElevenLabs CEO: Why Voice is the Next AI Interface
The glasses camp: visual bandwidth, shared computing, and the "post-phone era"
Spiegel holds the same position across five interviews (11 years on glasses, $3B invested), but he gives different reasons in each venue, and the differences are worth keeping separate:
To Bloomberg (mainstream business media), he gives the general case that AR is the front end for AI and will replace screens, while explicitly declining to go after the phone's position:
"I think fundamentally AR is the interface for AI. … The reason why we think AR is such a powerful interface is because it brings AI into the world with you." Evan Spiegel · Snap CEO Evan Spiegel is Betting on Smart Glasses
"Initially, I don't think it will replace phones, but I think over a very long period of time, it will replace most of the screens in your life. … Glasses can provide a display area that's maybe the equivalent of 400 times a smartphone, but at a fraction of the weight." Evan Spiegel · Snap CEO Evan Spiegel is Betting on Smart Glasses
To CFR (the policy crowd), he attacks the voice camp's physical premise head on. It is the most direct rebuttal of "voice is the next interface" anywhere in the corpus:
"The interface is primarily an audio interface, which ultimately is just very low bandwidth for humans, right? We're very visual. Our visual cortex is incredibly powerful. We consume information visually. So I think devices that only have audio input and output or even a small screen are just going to be limited in their utility overall." Evan Spiegel (asked about recording-pendant AI hardware) · A Conversation with Snapchat CEO Evan Spiegel
"We're going to spend less time hunched over operating a computer and more time supervising AI agents that are working on our behalf." Evan Spiegel · A Conversation with Snapchat CEO Evan Spiegel
To the VC podcast Grit, the reasoning shifts to productivity: glasses are an "infinite portable workstation." For the first time he also raises "sharing" to an ontological level:
"You'll have a giant 100 plus inch screen workstation with you. … Right now, one of the things your phone really struggles with is multitasking. It's a single app framework." Evan Spiegel · How Snap Plans to Win the AR Race
"For the first time in human history, computing will be shared instead of single player. Like today, every time we use a computer, we're using it alone." Evan Spiegel · How Snap Plans to Win the AR Race
It is also in this conversation that he sentences the entire VR path to death:
"A lot of people spend a lot of time focused on VR for the last decade, which is, in our view, a road to nowhere because people just don't want to wear this chunky headset all day long to access their computer." Evan Spiegel · How Snap Plans to Win the AR Race
To the creator podcast Colin & Samir, the narrative shifts again: health, posture, myopia, kids playing outdoors. He also gives his most explicit timeline:
"There was a huge opportunity to make computers something that actually keep you grounded in the real world. I think people feel really frustrated that the phone gets in the way of their day-to-day lives." Evan Spiegel · Snapchat CEO on the future of Human connection
"I want to be sitting at home, look out the window, see our four kids running around outside and look at my wife and be like, they won't get off the damn computer." Evan Spiegel · Snapchat CEO on the future of Human connection
"I think by the end of this decade." (Asked when AR glasses will reach mass adoption.) Evan Spiegel · Snapchat CEO on the future of Human connection
To Diary of a CEO (a founder audience), the emphasis shifts once more, to competition and ecosystem: features are easy to copy, platforms are not. He also openly mocks Meta:
"Now they say they're working on glasses, which we've been working on for over a decade. I think I've earned that title." (Referring to his joke of listing himself as "VP of Product at Meta" on LinkedIn.)
"I think they're investing $20 billion a year right now just into the AR Glasses stuff and some of their VR stuff. AR Glasses stuff is largely copying what we've been doing." Evan Spiegel · Snapchat CEO: Exact Formula
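On the Meta side that Spiegel calls out, Bosworth's glasses narrative points in the same direction as Spiegel's but is built differently. He is more radical, talking directly about a "post-phone era" and a wholesale inversion of the app model, and he is more candid in ranking the risks:
"When you use Orion, when you use full AR glasses, you can imagine a post-phone era. You think, wow, if it is compelling enough, light enough, and has enough battery life to wear all day, it could meet all of my needs." (Chinese-only transcript; back-translated.)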
— Andrew Bosworth · What Comes After Mobile?
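"That said, the hardware problems are hard and real, the cost problems are hard and real, and if you want to kill the king, you had better not miss. Today the phone is incredibly central to our lives." (Chinese-only transcript; back-translated.)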
— Andrew Bosworth · What Comes After Mobile?
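"We have real invention risk. … Bigger than invention risk, I think, is adoption risk. Is it considered socially acceptable? Are people willing to learn a new paradigm? … Ecosystem risk is bigger still. … I actually think ecosystem risk is what I used to consider the biggest risk. But now AI is my potential silver bullet. If AI becomes the primary interface, then it comes for free." (Chinese-only transcript; back-translated.)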
— Andrew Bosworth · What Comes After Mobile?
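Note the subtle difference between the two. Spiegel casts AR as the front end for AI and bets on a vertically integrated stack of optics, rendering, and developer ecosystem. Bosworth treats AI as the "silver bullet" that solves the ecosystem problem for glasses, and he is willing to imagine apps, brands, and app stores being replaced wholesale by a stream of intents. Ray-Ban Meta is itself a piece of supporting evidence, since it did not start out as an AI product: "Six months before production, Llama 3 came out. The team said, no, we have to do this. So now they are AI glasses" (Chinese-only transcript; back-translated). Spiegel never describes his own lens ecosystem as something AI could dissolve.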
A third camp: Kuyda on "voice is a mind trap, and the screen will not die"
Eugenia Kuyda rejects the default premise of both camps at once. She takes aim at voice first:
"I think there's a huge mind trap that exists among builders in the space where they somehow think that voice is the main interface." Eugenia Kuyda · Seeing The Future from AI Companions to Personal Software
"But if you really think about voice interfaces, they're just so imperfect. You can't use that device if you're laying in bed with someone who's sleeping. You can't use it in a crowded space. You can't use it at the office." Eugenia Kuyda · Seeing The Future from AI Companions to Personal Software
"Every single Alexa right now, like 75% of them are being shipped with a screen." Eugenia Kuyda · Seeing The Future from AI Companions to Personal Software
She even uses the same argument as Spiegel (audio is low bandwidth) but reaches a completely different conclusion: not glasses, but a better phone.
"I would not ever make a screenless device. In fact, I would make it very much a screen-first device, but I do believe that the AI device is not about a voice-driven thing. It's more about building this AI-first operating system, having all the models run locally as well." Eugenia Kuyda · Seeing The Future from AI Companions to Personal Software
"How is it possible that we have this God-like technology, yet we pass around these text prompts, which is almost like Microsoft DOS commands, but worse?" Eugenia Kuyda · Seeing The Future from AI Companions to Personal Software
Her view of the app model also sits between Bosworth's and Spiegel's: apps will not disappear, but they will shift from fixed software built by professional developers to UGC that anyone can modify (mini-apps). This in turn sets up a fourth tension, with OpenAI's roadmap: Nick Turley (relayed by Alex Heath) wants to grow the chat box itself into an operating system:
"What you're going to see over the next six months is an evolution of ChatGPT from an app that was really useful into something that feels a little bit more like an operating system where you can access goods and services." Nick Turley (OpenAI, relayed by Alex Heath) · Building the Star Trek computer
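Where Is the Bottleneck: Technology, Optics, or Society?
- Spiegel (technology camp): the bottleneck is the optical engine and power consumption: "the optical engine, so the little projector that projects light on the waveguide… Making that a lot more efficient will be a big driver" (Colin & Samir).
- Bosworth (society camp): invention risk is more than half retired; the bigger risks are social acceptance and regulation. "I have an always-on machine here that can give me superhuman perception. … Can I ask that question? What are your rights? It is your face." "Great technologies can be held back for a long time. Nuclear power was held back." (Chinese-only transcript, back-translated; What Comes After Mobile?)
- Raiza (listening acceptance): the bottleneck for voice is not whether listeners can tell real from synthetic but whether it is "tolerable": "is it tolerable enough for this particular media that you're consuming?" Her kids call AI voices clankers: "They're like, I'm not listening to a clanker." (Building the Star Trek computer)
- Kuyda (cognition camp): the biggest bottleneck is the builders' own "mind trap," taking the wrong lesson from Her.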
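What Nobody Fully Addressed
1. The proposition "voice is the next AI interface" itself lacks a first-person argument. The ElevenLabs interview shouts it in the title, but the transcript is almost entirely about organizational design and go-to-market (small teams, no titles, incentive structures), so the interface claim lives only in the show's packaging. Sam Altman's voice-first position also enters the corpus only through a journalist's retelling. The direct engagement (Spiegel's "low bandwidth," Kuyda's "mind trap") all comes from the opposing side, so for now only one side of this debate has shown up.
2. The interfaces that promise to reduce attention load are all paired with attention businesses. Raiza calls the fight for eyeballs "crazy" while walking through, in detail, personalized audio ads built on calendar, email, and purchase history (the Blue Bottle example). Bosworth imagines an intent-based interface while asking himself "how much do I trust that they are not being bought off on the back end" (back-translated from Chinese). Spiegel's Snap is itself an advertising company. Nobody goes on to say how freeing attention and monetizing attention can coexist.
3. Bystander privacy is raised and then set down. Bosworth leaves "glasses that recognize someone you have not seen in ten years" as an open question. Spiegel's answer is that future generations will have a technical path to on-device anonymization, which is a rather thin answer for a consumer product due to ship in 2026. None of the nine interviews develops a consent mechanism for the people being filmed, as opposed to the people wearing the glasses.
4. The economics of the transition period are a blank. Bosworth admits ecosystem risk is the largest, then brushes past it in one line: AI will become the primary interface, so the ecosystem comes for free. Spiegel's 25M units × $1000 × 30% gross margin is only a hypothetical calculation (Bloomberg). Kuyda's AI-first phone does not even have a prototype yet. Who builds the ecosystem before the device has users, and why would consumers buy the first generation? Everyone sidesteps the reminder from the Diary of a CEO host that Spiegel may be right but 15 years early.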
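For scale, the hypothetical in item 4 multiplies out as below. Reading the three factors as unit volume, average selling price, and gross margin is an assumption; the corpus gives only the three numbers and does not say whether the volume is annual or cumulative.

```latex
\begin{align*}
\text{revenue} &= 25 \times 10^{6}\ \text{units} \times \$1{,}000 = \$25\ \text{billion} \\
\text{gross profit} &= 0.30 \times \$25\ \text{billion} = \$7.5\ \text{billion}
\end{align*}
```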
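My Take
What follows is judgment, not a summary of the corpus, and I hold it with moderate confidence. First, voice and glasses are probably not winner-take-all rivals but winners on two different dimensions: voice wins the low-attention companion scenarios (commuting, housework, in the car), while glasses bet on the high-bandwidth replacement scenarios (workstations, shared experiences). Ellis's "time spent" and Spiegel's "visual cortex" are each right on their own dimension. Second, I think social acceptance (clankers, bystander privacy, memories of the Glasshole) is more likely than optics to set the timeline, and Bosworth's ranking of risks is closer to the real constraints than Spiegel's roadmap. Third, Kuyda's uncomfortable data point, that nearly a billion people already use the chat box and use only its simplest features, is evidence for the "DOS era" but can just as easily be read as evidence of inertia: the chat box's lifespan as a transitional form may be far longer than either camp expects.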
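What Else I Want to Know
- The real price and first-year sales of the 2026 consumer Spectacles, checked against the hypothetical 25M units × $1000 × 30% gross margin that Spiegel gave Bloomberg and against his "end of this decade" adoption promise.
- The actual form of the OpenAI × Jony Ive device: whether Heath's two leads (voice-first, plus a Walkman-style cultural appeal) pan out, and how the device relates to Altman's "voice as main input" remark.
- Whether ElevenLabs has made a full first-person case for "voice is the next interface" elsewhere (agents platform data, turn-taking latency, enterprise voice penetration), which would supply the missing side of this debate.
- Retention and daily-active data for Ray-Ban Meta: Bosworth says adoption risk "looks a lot better than it used to" (back-translated from Chinese), and testing that takes usage data, not shipment numbers. I would also like the original source for the "75% of Alexa devices ship with a screen" figure that Kuyda cites repeatedly.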
Sources
- Snapchat CEO on the future of Human connection · 2026-04-14
- Snap CEO Evan Spiegel is Betting on Smart Glasses · 2026-04-14
- A Conversation with Snapchat CEO Evan Spiegel · 2026-04-14
- Snapchat CEO: Exact Formula Used To Build A $130 Billion Company! · 2026-04-14
- How Snap Plans to Win the AR Race | Evan Spiegel on Spectacles · 2026-04-14
- Seeing The Future from AI Companions to Personal Software · 2025-11-06
- ElevenLabs CEO: Why Voice is the Next AI Interface · 2025-11-06
- Building the Star Trek computer with Huxe CEO Raiza Martin · 2025-10-15
- What Comes After Mobile? Meta's Andrew Bosworth on AI and Consumer Tech · 2025-04-28