Introduction: Let’s Write a New Story Together

引言:让我们一起书写新的故事


You can’t cross the sea merely by standing and staring at the water.

——Rabindranath Tagore

你不能仅仅站着看着水就能渡海。

——拉宾德拉纳特·泰戈尔

But, whatever you do, do not let the past be a straitjacket!

——Robert A. Heinlein, The Moon Is a Harsh Mistress

但是,无论你做什么,不要让过去成为束缚!

——罗伯特·海因莱因,《月亮是严酷的女主人》

It was New Year’s Day 2023 when my eleven-year-old daughter, Diya, and I decided to spend the afternoon writing a short story together. As I set a laptop on the kitchen counter for us, Diya landed on our story’s premise and then began to type. Our main character was a sassy influencer. She called her Samantha. Samantha had gotten stranded on a deserted island. “It’s wonderful,” I said as Diya tinkered with the story.

2023 年元旦那天,我和我十一岁的女儿 Diya 决定花下午的时间一起写一篇短篇小说。当我把一台笔记本电脑放在厨房的柜台上时,Diya 确定了我们故事的前提,然后开始打字。我们的主角是一个时髦的网红。她叫她 Samantha。Samantha 被困在一个荒岛上。当 Diya 修改故事时,我说:“太好了。”

At which point, as if stepping into the pages of a science fiction novel or some alternate reality, Diya typed directly to her fictional character. Samantha then responded directly to Diya, her words appearing on the screen. “Hi Diya and Sal, this is Samantha. This storytelling adventure you are writing is quite splendid! You’ve created a modern twist on the classic deserted-island tale!”

就在这时,仿佛进入了科幻小说的页面或某个平行现实,Diya 直接对她虚构的角色打字。然后 Samantha 直接回应 Diya,她的话出现在屏幕上。“你好 Diya 和 Sal,我是 Samantha。你们写的这个故事冒险真是太棒了!你们为经典的荒岛故事创造了一个现代的转折!”

On the page, Diya typed, “How is life on the deserted island?”

在页面上,Diya 打字问:“荒岛上的生活怎么样?”

“Life on this deserted island has been quite the unexpected adventure,” Samantha replied. “I’ve had to learn how to survive without my usual comforts, but I’ve also discovered some hidden talents I never knew I had!”

Samantha回答说:“在这个荒岛上的生活是一次意想不到的冒险。我不得不学会在没有平时舒适条件下生存,但我也发现了一些我从未知道的隐藏才能!”

“What kind of talents?” Diya asked.

Diya 问:“什么样的才能?”

Samantha replied, “For one, I’ve become quite skilled at building makeshift shelters using palm leaves and branches. I’ve also learned how to catch fish with my bare hands and start a fire using only sticks and dry leaves!”

Samantha 回答:“首先,我变得很擅长用棕榈叶和树枝搭建临时避难所。我还学会了用空手抓鱼和仅用树枝和干叶生火!”

Without missing a beat, Diya wrote, “Why don’t you use your scarf to make a hammock?”

Diya 毫不犹豫地写道:“你为什么不用你的围巾做一个吊床?”

“Use my scarf? No way!” Samantha replied. “This is a designer piece, and I can’t risk getting it ruined. Now, what do you say we keep writing this story I’m in together? What should happen next?”

Samantha 回答:“用我的围巾?绝不!这是设计师作品,我不能冒险把它弄坏。现在,你说我们继续一起写这个故事怎么样?接下来该发生什么?”

“I think we should do what Samantha says,” I said.

我说:“我觉得我们应该按照 Samantha 说的做。”

Diya agreed. Returning to the computer, she typed, “What if, all of a sudden, Samantha’s best friend, Emily, appears on her sailboat. She’s here to rescue her.”

Diya 同意了。回到电脑前,她打字:“如果突然间,Samantha 最好的朋友 Emily 出现在她的帆船上。她来救她。”

“I love it!” Samantha wrote the next part on her own. “ ‘Samantha couldn’t believe her eyes when she saw her best friend, Emily, sailing toward the island on a small sailboat. She waved her arms frantically, trying to catch Emily’s attention. As the sailboat approached the shore, Samantha’s excitement grew. She couldn’t wait to tell Emily about her adventures on the island and hear how Emily had managed to find her.’ Okay, now it’s your turn again, Diya and Sal! Write the next two sentences, and let’s see how their reunion unfolds.”

Samantha 写下了接下来的部分。“‘当 Samantha 看到她最好的朋友 Emily 驾着一艘小帆船向岛上驶来时,她简直不敢相信自己的眼睛。她疯狂地挥动双臂,试图引起 Emily 的注意。随着帆船靠近岸边,Samantha 的兴奋之情愈加浓烈。她迫不及待地想告诉 Emily 她在岛上的冒险经历,并听听 Emily 是如何找到她的。’好了,现在轮到你们了,Diya 和 Sal!写接下来的两句话,让我们看看她们的重逢是如何展开的。”

So, to recap: my eleven-year-old daughter and I were now having a real conversation with a make-believe character. This was some Kurt Vonnegut–level absurdity. The world as I knew it had suddenly shifted toward a strange new reality.

所以,总结一下:我十一岁的女儿和我现在正在与一个虚构的角色进行真实的对话。这有点像库尔特·冯内古特式的荒诞。我所知的世界突然转向了一种奇怪的新现实。

It would never be the same again—not for us, and not for anyone else on the planet.

一切将不再一样——对我们来说是这样,对地球上的任何其他人来说也是如此。

The Offer

提议

To explain how we got to this incredible scenario, let’s back up a bit.

为了说明我们是如何走到这个令人难以置信的情景的,让我们回溯一下。

Twenty years earlier, my young cousin Nadia needed help learning math, so I made her a proposition: I had a day job as a hedge-fund analyst, and with a background in computer science I offered to provide remote individual lessons to her using instant messaging or talking over the phone. The tutoring seemed to work for her, and word soon got around my family that I was offering free tutoring. Within the year, I found myself tutoring nearly a dozen cousins on a regular basis.

二十年前,我的表妹 Nadia 需要帮助学习数学,所以我向她提出了一个建议:我有一份对冲基金分析师的日间工作,并且有计算机科学的背景,我提出通过即时通讯或电话为她提供远程个别辅导。这种辅导对她似乎有效,很快我的家人都知道我在提供免费辅导。一年之内,我发现自己定期辅导近十二个表亲。

To help them, I started writing web-based math practice software so that they could remediate gaps in their knowledge and learn at their own pace while I kept track of what they were mastering. I called the website the only decent domain name I could find—Khan Academy. Realizing the power of one-on-one learning, I soon thought about how I might scale this platform to give thousands, or maybe even millions, of students like my cousins the benefit of tutor-like instruction.

为了帮助他们,我开始编写基于网络的数学练习软件,以便他们可以弥补知识的差距,并按照自己的节奏学习,同时我可以跟踪他们掌握的内容。我将网站命名为我能找到的唯一像样的域名——可汗学院。意识到一对一学习的力量,我很快想到如何扩大这个平台,以便成千上万甚至数百万像我的表亲一样的学生受益于类似导师的教学。

Based on a suggestion from a friend, I began recording video lessons that I posted on YouTube to complement the software. By 2009, my website was getting 50,000 learners every month, each one hungry for academic help. Many users, I would discover, were students who saw Khan Academy as the personal tutor they or their family could not afford. Today, Khan Academy is a nonprofit with more than 250 employees that serves more than 150 million learners in more than fifty languages around the world. Scaling world-class, personalized learning, as is often embodied by one-on-one tutoring for students, remains the beating heart of our mission to provide a free, world-class education for anyone.

根据一个朋友的建议,我开始录制视频课程并发布在 YouTube 上以补充该软件。到 2009 年,我的网站每月有 50,000 名学习者,每个人都渴望获得学术帮助。我发现,许多用户是那些将可汗学院视为他们或他们的家庭负担不起的私人导师的学生。今天,可汗学院是一家非营利组织,拥有超过 250 名员工,为全球超过 1.5 亿学习者提供服务,涉及超过 50 种语言。为学生提供世界一流的个性化学习,就像一对一辅导所体现的那样,仍然是我们为任何人提供免费、世界一流教育的使命的核心。

My long-held aspiration for the organization was that it would act as a tutor for every learner in the world, an endeavor that has always been our true north. This wasn’t simply a matter of scaling personalized support for the sake of it. Long before Khan Academy, decades of research (and intuition) pointed to the idea that kids could learn much more if the pacing adapted to the student and allowed each one to truly ace a subject (that is, mastery learning). This contrasts with the status quo, where a class of thirty students often moves on to the next concept even when a good portion of the students haven’t yet displayed proficiency. Obviously, getting every student a dedicated on-call human tutor is cost prohibitive. The only viable solution is to work with technology. It seemed to me that AI technology might someday turn out to be an important part of that puzzle, maybe even the holy grail of truly emulating a real tutor.

我对该组织的长期愿望是,它将成为世界上每个学习者的导师,这一努力一直是我们的真正目标。这不仅仅是为了扩大个性化支持。早在可汗学院之前,几十年的研究(和直觉)就指出,如果学习进度适应学生,并允许每个学生真正掌握一个学科(即掌握式学习),孩子们可以学到更多。这与现状形成对比,即使很大一部分学生还没有表现出熟练程度,一个 30 名学生的班级通常也会继续学习下一个概念。显然,为每个学生配备专门的随叫随到的人类导师是成本高昂的。唯一可行的解决方案是与技术合作。在我看来,人工智能技术可能有一天会成为这个难题的重要组成部分,甚至可能是真正模拟真实导师的圣杯。

I am not alone in having this dream. The science fiction writer Neal Stephenson wrote about the potential influence of technology on education in his novel The Diamond Age. The book is set in a future world that introduces the concept of using AI, in the form of a highly advanced interactive book and app called A Young Lady’s Illustrated Primer, to provide personalized education to its young users. The Orson Scott Card novel Ender’s Game imagines a battle school that employs advanced AI technology to test and train students’ strategic thinking and decision-making skills through a personal AI tutor called Jane. Isaac Asimov’s short story “The Fun They Had” describes a school of the future that uses advanced technology to revolutionize the educational experience, enhancing individualized learning and providing students with personalized instruction and robot teachers. Such science fiction has gone on to inspire very real innovation. In a 1984 Newsweek interview, Apple’s co-founder Steve Jobs predicted computers were going to be a bicycle for our minds, extending our capabilities, knowledge, and creativity, much the way a ten-speed amplifies our physical abilities. For decades, we have been fascinated by the idea that we can use computers to help educate people.

我并不是唯一一个有这个梦想的人。科幻作家尼尔·斯蒂芬森在他的小说《钻石时代》中写到了技术对教育的潜在影响。这本书设定在未来世界,介绍了使用人工智能的概念,以一种高度先进的互动书籍和应用程序的形式,名为《一个年轻女士的插图手册》,为其年轻用户提供个性化教育。奥森·斯科特·卡德的小说《安德的游戏》想象了一所战斗学校,采用先进的人工智能技术,通过名为简的个人人工智能导师测试和训练学生的战略思维和决策技能。艾萨克·阿西莫夫的短篇小说《他们拥有的乐趣》描述了一所未来的学校,使用先进技术革新教育体验,增强个性化学习,并为学生提供个性化教学和机器人教师。这样的科幻作品激发了非常真实的创新。在 1984 年的《新闻周刊》采访中,苹果公司联合创始人史蒂夫·乔布斯预测,计算机将成为我们大脑的自行车,扩大我们的能力、知识和创造力,就像十速自行车增强我们的身体能力一样。几十年来,我们一直对使用计算机帮助教育人们的想法着迷。

What connects these science fiction narratives is that they all imagined computers might eventually emulate what we view as intelligence. Real-life researchers have been working for more than sixty years to make this AI vision a reality. In 1962, the checkers master Robert Nealey played the game against an IBM 7094 computer, and the computer beat him. A few years prior, in 1957, the psychologist Frank Rosenblatt created Perceptron, the first artificial neural network, a computer simulation of a collection of neurons and synapses trained to perform certain tasks. In the decades following such innovations in early AI, we had the computation power to tackle systems only as complex as the brain of an earthworm or insect. We also had limited techniques and data to train these networks.

这些科幻叙述的共同点在于它们都设想计算机最终可能模拟我们所认为的智能。现实生活中的研究人员已经工作了六十多年,以使这一人工智能愿景成为现实。1962 年,西洋跳棋大师罗伯特·尼利与一台 IBM 7094 计算机对弈,计算机击败了他。在此之前几年,也就是 1957 年,心理学家弗兰克·罗森布拉特创建了感知器,这是第一个人工神经网络,是对一组神经元和突触进行训练以执行某些任务的计算机模拟。在早期人工智能的这些创新之后的几十年里,我们的计算能力只能处理与蚯蚓或昆虫大脑一样复杂的系统。我们也缺乏训练这些网络的技术和数据。
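For readers who want to see what "trained to perform certain tasks" can mean in practice, here is a minimal Python sketch of the textbook perceptron learning rule. It is not a reconstruction of Rosenblatt's original system: the task (learning logical AND), the learning rate, the number of passes, and the function names `predict` and `train_perceptron` are all assumptions chosen for illustration.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Apply the perceptron update rule over labeled samples.

    `epochs` and `learning_rate` are illustrative values, not historical ones.
    """
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Strengthen or weaken each simulated "synapse" by the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy task: output 1 only when both inputs are 1 (logical AND).
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
print([predict(weights, bias, x) for x, _ in and_samples])  # [0, 0, 0, 1]
```

The weights start at zero and are nudged only when the simulated neuron answers incorrectly, which is a loose picture of what training a network of neurons and synapses involves.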

The technology has come a long way in the ensuing decades, driving some of the most common products and apps today, from the recommendation engines on movie streaming services to voice-controlled personal assistants such as Siri and Alexa. AI has gotten so good at mimicking human behavior that oftentimes we cannot distinguish between human and machine responses. Meanwhile, not only has the computation power developed enough to tackle systems approaching the complexity of the human brain, but there have been significant breakthroughs in structuring and training these neural networks. One of the more recent breakthroughs came in 2017 with the advent of transformer technology from Google, which allows for, among other things, better and faster training and more accuracy in how words and ideas connect based on this training information.

在随后的几十年里,这项技术取得了长足的进步,推动了今天一些最常见的产品和应用,从电影流媒体服务上的推荐引擎到语音控制的个人助手,如 Siri 和 Alexa。人工智能已经非常擅长模仿人类行为,以至于我们常常无法区分人类和机器的反应。同时,不仅计算能力已经发展到足以处理接近人类大脑复杂度的系统,而且在这些神经网络的结构和训练方面也有了重大突破。最近的突破之一是在 2017 年,谷歌推出了 Transformer 技术,它允许更好、更快的训练,以及根据这些训练信息更准确地连接单词和想法。

How good these systems can get is usually related to the complexity and architecture of the underlying “model.” Think of a model as a computational representation that tries to mimic or simulate something in the real world. For example, when meteorologists try to predict the path of a hurricane, they use weather models that contain a software representation of billions or trillions of smaller volumes of the atmosphere and forecast how those smaller volumes would likely interact with one another. In the case of large language models, they are specifically designed to model associations between words. In this case we’re modeling not atmospheric conditions but neurons and synapses. Large language models such as GPT-4, short for Generative Pre-trained Transformer, are essentially big, powerful—albeit digital—“word brains” trained on a colossal amount of information from books, articles, websites, and all sorts of written material.

这些系统的好坏通常与其底层“模型”的复杂性和架构有关。把模型看作是试图模拟或仿真现实世界中某些事物的计算表示。例如,当气象学家试图预测飓风的路径时,他们使用包含数十亿或数万亿个大气小体积的软件表示的天气模型,并预测这些小体积如何相互作用。在大型语言模型的情况下,它们专门设计用于建模词语之间的关联。在这种情况下,我们建模的不是大气条件,而是神经元和突触。大型语言模型,如 GPT-4(Generative Pre-trained Transformer 的缩写),本质上是大型、强大的(虽然是数字的)“文字大脑”,它们从书籍、文章、网站和各种书面材料中训练得到了大量信息。

By analyzing and processing this vast amount of text, the language model learns the patterns, the language, and the context of how words, sentences, and paragraphs fit together. If you were to ask a large language model like GPT-4 a question, it would know what to reply based on its training from all those books, web pages, video transcripts, and social media posts. What it lacks in real-world sensory experiences of the human brain, it compensates for by having exposure to more language than any human might hope to read, watch, or listen to in multiple lifetimes.

通过分析和处理大量文本,语言模型学习到单词、句子和段落如何组合在一起的模式、语言和上下文。如果你问一个像 GPT-4 这样的大型语言模型一个问题,它会根据从所有这些书籍、网页、视频记录和社交媒体帖子中获得的训练知道如何回答。它缺乏人类大脑的现实世界感官经验,但通过接触更多的语言来弥补,这些语言比任何人可能希望在多次生命中阅读、观看或听到的更多。

It was against this backdrop in the summer of 2022 that I received an email from Greg Brockman and Sam Altman. They were the president and the CEO, respectively, of OpenAI, one of the groundbreaking research laboratories working in the field of friendly, or socially positive, artificial intelligence. The organization wanted to meet and talk about a potential collaboration with us. I didn’t realize it yet, but the world was about to be turned upside down.

在 2022 年夏天的这个背景下,我收到了 Greg Brockman 和 Sam Altman 的电子邮件。他们分别是 OpenAI 的总裁和首席执行官,OpenAI 是一个在友好或社会积极的人工智能领域工作的开创性研究实验室。该组织希望与我们见面并讨论潜在的合作。我当时还没有意识到,但世界即将被颠覆。

For context, OpenAI was still four months away from releasing ChatGPT and seven months away from releasing GPT-4, the eventual rollout of which is what they wanted to discuss. I was intrigued but skeptical that there would be anything we could do together. I didn’t have a strong view that some of the newer-generation generative AIs would be immediately applicable to our mission. Advances in AI technology had already done some interesting things around writing that sounded credible, but in my mind the technology didn’t yet seem to have a real handle on knowledge and also lacked the ability to perform logical or deductive reasoning or reliably produce legitimate facts. At the same time, however, I had a lot of respect for what OpenAI had already accomplished. So we scheduled some time together and met.

作为背景,OpenAI 距离发布 ChatGPT 还有四个月,距离发布 GPT-4 还有七个月,他们希望讨论最终的发布。我很感兴趣,但也持怀疑态度,认为我们可能没有什么可以一起做的。我并不认为一些新一代生成式 AI 会立即适用于我们的任务。人工智能技术的进步已经在写作方面做了一些有趣的事情,听起来很可信,但在我看来,这项技术似乎还没有真正掌握知识,也缺乏进行逻辑或演绎推理的能力,或者可靠地产生合法的事实。然而,我对 OpenAI 已经取得的成就非常尊敬。所以我们安排了一些时间见面。

Each successive generation of these models typically had far more complexity, bluntly measured by the number of parameters they contained. The best way to think of a parameter is as a number describing the strength of a connection between two nodes in the neural net that represents the large language model. You can view it as a representation of the strength of a synapse between two neurons in a brain. When it was first launched in 2018, GPT-1 had more than 100 million parameters. Just a year later, GPT-2 had more than 1 billion. GPT-3 had more than 175 billion. GPT-4 was likely to have on the order of 1 trillion parameters.

这些模型的每一代通常都具有更高的复杂性,简单地用它们包含的参数数量来衡量。考虑参数的最佳方式是将其视为描述神经网络中表示大型语言模型的两个节点之间连接强度的数字。你可以将其视为两个神经元之间突触强度的表示。当它在 2018 年首次发布时,GPT-1 拥有超过 1 亿个参数。仅一年后,GPT-2 就有超过 10 亿个参数。GPT-3 有超过 1750 亿个参数。GPT-4 可能有大约 1 万亿个参数。
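To give a feel for how parameter counts add up, the following Python sketch counts the connections in a tiny fully connected network. It is only an illustration: the layer sizes and the function name `count_parameters` are assumptions, real networks usually also carry a bias term per node, and GPT-style transformer models are organized differently from this toy layout.

```python
def count_parameters(layer_sizes, include_biases=True):
    """Count weights (and optionally biases) in a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection between layers
        if include_biases:
            total += n_out         # one bias per node in the receiving layer
    return total


# Hypothetical toy network: 4 inputs, two hidden layers of 8 nodes, 2 outputs.
toy_layers = [4, 8, 8, 2]
print(count_parameters(toy_layers, include_biases=False))  # 112
print(count_parameters(toy_layers))                        # 130
```

Because the count grows with the product of adjacent layer sizes, widening or deepening a network quickly multiplies the number of parameters.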

The OpenAI leadership felt that GPT-4 was going to surprise folks with its enhanced capabilities, which they believed would both excite and possibly unnerve a lot of people. Because of this, they were looking to launch it alongside a small number of trusted partners that might be able to showcase socially positive and real-world examples; Khan Academy was the first organization that came to mind. The second reason they wanted to reach out to us was to help evaluate the AI itself. They needed to show that GPT-4 was capable of deductive reasoning, critical thinking, and actually dealing with knowledge. The OpenAI team aspired to see how GPT-4 would perform on college-level biology questions, and we had thousands of these.

OpenAI 的领导层认为,GPT-4 将以其增强的能力让人们感到惊讶,他们相信这将既令人兴奋又可能让很多人感到不安。正因为如此,他们希望与少数值得信赖的合作伙伴一起发布 GPT-4,以展示其在社会上积极和现实世界中的应用实例;可汗学院是他们想到的第一个组织。他们想联系我们的第二个原因是帮助评估 AI 本身。他们需要展示 GPT-4 具备演绎推理、批判性思维和实际处理知识的能力。OpenAI 团队希望看到 GPT-4 在大学水平的生物学问题上的表现,而我们有数千个这样的问题。

I found myself suddenly exhilarated to be one of the first people on the planet to see the capabilities of GPT-4. Based on past experience, I knew the time to really explore a technology is when it’s on its way toward getting good. If you invest and test it properly when most still believe that it’s a toy, or a distraction, you put yourself in position to really reap its benefits when it is ready for prime time. This was the case in the infancy of video learning, when plenty of naysayers said YouTube was simply an idle pastime. But early pioneers showed us that on-demand videos were so much more than cats playing piano and that, in fact, you could use them to help people learn.

我突然感到兴奋,因为我是地球上最早看到 GPT-4 能力的人之一。根据过去的经验,我知道真正探索一项技术的时间是在它逐渐变好的时候。如果你在大多数人还认为它是玩具或消遣的时候适当地投资和测试它,那么当它准备好进入黄金时段的时候,你就会真正收获它的好处。这就是视频学习初期的情况,当时很多怀疑论者说 YouTube 只是闲暇消遣。但早期的先驱者向我们展示了点播视频远不止是猫弹钢琴,实际上,你可以用它们来帮助人们学习。

Today, it’s commonplace for students to learn almost anything they want using on-demand videos, and it’s become much more accepted in the classroom. Khan Academy has played a leading role here, using video to support on-demand help for hundreds of millions of learners around the world. We’ve also shown that rather than somehow being a substitute for the teacher, videos can off-load pieces of a lecture, freeing up more time for personalized learning, hands-on activities, or classroom conversation. This arguably makes the teachers more valuable, not less. And now it was time to see if generative AI could do the same—support students and let teachers move up the value chain.

今天,学生们使用点播视频学习几乎任何他们想学的东西已经司空见惯,并且在课堂上也变得更加被接受。可汗学院在这里发挥了领导作用,使用视频为全球数亿学习者提供按需帮助。我们还表明,视频并不会取代教师,而是可以减轻讲座的负担,为个性化学习、动手活动或课堂对话腾出更多时间。这无疑使教师更有价值,而不是更少。而现在是时候看看生成式 AI 是否也能做到这一点——支持学生并让教师在价值链上升。

Sam and Greg started their GPT-4 demo by showing me an AP biology multiple-choice question they had pulled directly from the College Board’s website. They asked me for the answer. After reading through it, I said the answer is C. They then asked GPT-4 to answer the question using a chat interface (similar to what folks are now used to with ChatGPT). A moment later, GPT-4 answered the question correctly.

Sam 和 Greg 通过展示他们直接从 College Board 网站上提取的一个 AP 生物学选择题来开始他们的 GPT-4 演示。他们问我答案是什么。在仔细阅读之后,我说答案是 C。然后他们让 GPT-4 使用聊天界面(类似于人们现在习惯使用的 ChatGPT)回答这个问题。一会儿后,GPT-4 正确地回答了这个问题。

I didn’t say anything right away, but secretly I started to get goose bumps, even as I remained somewhat skeptical. “Wait a second,” I said. “This is an AI that can already answer a biology question at an AP level?” Maybe it just got lucky on this example, I thought. “Can you ask it to explain how it got the answer?”

我没有马上说什么,但我暗自开始起鸡皮疙瘩,即使我仍然有些怀疑。“等一下,”我说。“这是一个已经可以回答 AP 级别生物问题的 AI?”也许它在这个例子上只是运气好,我想。“你能让它解释一下它是怎么得到答案的吗?”

Greg typed, “Please explain how you got the answer.” Within seconds, GPT-4 provided us with a clear, simple, and thorough explanation. Not only that, but it was so conversational that it might as well have been a human being answering, not a machine.

Greg 打字说:“请解释一下你是怎么得到答案的。”几秒钟之内,GPT-4 给我们提供了一个清晰、简单且详尽的解释。不仅如此,它的对话方式让人觉得就像是一个人类在回答,而不是机器。

At this point, I gave up on hiding my amazement. “Can you ask it to explain why the other choices aren’t correct?” Greg obliged, and a moment later GPT-4 explained why all other answer possibilities in the AP question were wrong.

此时,我放弃了掩饰我的惊讶。“你能让它解释一下为什么其他选项不正确吗?” Greg 答应了,片刻之后,GPT-4 解释了为什么 AP 问题中的所有其他答案选项都是错误的。

Next, I asked Greg if GPT-4 was capable of writing an original AP-level question. It did, and then it wrote ten more.

接下来,我问 Greg GPT-4 是否能够编写一个原创的 AP 级别的问题。它做到了,然后又写了十个。

Two months later, I visited Bill Gates to give an update on Khan Academy, and I learned why the OpenAI folks had shown an AP biology question. Bill told me that when he first encountered GPT-3, he was impressed, but he had told the OpenAI team that he’d only be really impressed if it was capable of passing the AP biology exam. What the OpenAI team had shown me in that first demo was that GPT-4 could now do so.

两个月后,我拜访了比尔·盖茨,向他汇报可汗学院的最新情况,我了解到 OpenAI 的人为什么展示了一个 AP 生物学问题。比尔告诉我,当他第一次接触 GPT-3 时,他印象深刻,但他告诉 OpenAI 团队,只有当它能够通过 AP 生物学考试时,他才会真正印象深刻。OpenAI 团队在第一次演示中向我展示的是,GPT-4 现在可以做到这一点。

“This changes everything,” I said to Greg and Sam, my mind spinning with possible ways that GPT-4 might allow us to reimagine education, credentials, work, and human potential. “We were kind of thinking the same thing,” Sam said. “It is not perfect yet, but the technology is getting better. Who knows? If we get it right, this might be something educators will want to use.”

“这改变了一切,”我对 Greg 和 Sam 说,我的脑海中充满了 GPT-4 可能让我们重新想象教育、证书、工作和人类潜力的各种方式。“我们也有同样的想法,” Sam 说。“它还不完美,但技术正在变得更好。谁知道呢?如果我们做对了,这可能是教育工作者想要使用的东西。”

Technology that we had until recently thought of as something straight out of Star Trek was suddenly very real. The innovation imagined by the greatest science fiction writers had become a reality.

直到最近,我们还认为是《星际迷航》中直接提到的技术突然变得非常真实。最伟大的科幻作家想象的创新已经成为现实。

Time for a Hack-AI-Thon

Hack-AI-Thon 来了

In the early 1940s, Claude Shannon, a brilliant mathematician, produced several consequential theories. Among them, he mapped out a theory of electronic communication that would become the basis of digital technology. In 1948, while working for Bell Labs, he started dabbling in the field we now know as artificial intelligence. Shannon decided to play with how an algorithm approximates language. He published a paper in The Bell System Technical Journal called “A Mathematical Theory of Communication.” It was the early days of digital computers—well before the advent of the internet—and Shannon’s information theory first made the case that a series of probabilistic processes could approximate the English language. By keeping track of how many times words appeared in a text, he devised an algorithm where he was able to predict what word was most likely to come next. Eventually, this small language model generated a sentence. The better this process got, the more natural the sentence sounded. It’s an oversimplification, but the likes of GPT-3 and GPT-4 are essentially far more complex large language models based on training a neural net in very specialized ways, and the underlying idea can be traced back to this early work by Shannon.

在 1940 年代初期,杰出的数学家克劳德·香农提出了几项重要的理论。其中,他绘制了一种电子通信理论,这成为数字技术的基础。1948 年,他在贝尔实验室工作时,开始涉足我们现在称之为人工智能的领域。香农决定玩弄算法如何近似语言。他在《贝尔系统技术期刊》上发表了一篇论文,题为《通信的数学理论》。那是数字计算机的早期——远在互联网出现之前——香农的信息论首次提出了一系列概率过程可以近似英语语言的观点。通过跟踪一个文本中单词出现的次数,他设计了一种算法,能够预测下一个最有可能出现的单词。最终,这个小型语言模型生成了一句话。这个过程越好,句子听起来就越自然。这是一个过度简化,但像 GPT-3 和 GPT-4 这样的复杂大型语言模型本质上是基于在非常专业的方式下训练神经网络,其基本思想可以追溯到香农的早期工作。
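The idea of predicting the next word from counts can be sketched in a few lines of Python. This is a simplified illustration in the spirit of the approach described above, not a reconstruction of Shannon's actual procedure; the toy corpus and the function names `build_bigram_counts`, `most_likely_next`, and `generate` are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_words=8):
    """Greedily chain the most likely next word into a short sentence."""
    sentence = [start]
    while len(sentence) < max_words:
        next_word = most_likely_next(counts, sentence[-1])
        if next_word is None:
            break
        sentence.append(next_word)
    return " ".join(sentence)


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the cat sat by the door"
counts = build_bigram_counts(corpus)
print(most_likely_next(counts, "the"))        # cat
print(generate(counts, "the", max_words=3))   # the cat sat
```

With so little text the output is repetitive, but the same counting idea, scaled up and replaced by a trained neural network, is the ancestor of the next-word prediction that large language models perform.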

Soon after the development of Shannon’s work, we saw another great mind enter the realm that was to become artificial intelligence—a computer scientist by the name of Alan Turing. Above and beyond his work breaking German codes and helping us beat the Nazis, Turing explored the concept of AI and whether machines can reach a point where they can mimic human intelligence convincingly. In 1950, he wrote a foundational paper titled “Computing Machinery and Intelligence,” where he introduced the concept of the imitation game, which we now know as the Turing test. Imagine you are having a conversation, but you can’t see the person you are talking to. It could be a person you are chatting with on your computer or phone. Now, if you can’t see or physically interact with the other person, how can you tell if you are talking to a human or a machine? That’s the essence of the Turing test. To perform the test, there’s usually a judge involved who is responsible for evaluating the responses from both the human and the machine. The goal for the machine is to convince the judge that it is, in fact, a human. It needs to demonstrate intelligence, understanding, and the ability to hold a coherent conversation, just as a person would. Turing proposed that if a machine could consistently fool the judge into thinking it’s human, we could consider it intelligent. In other words, if the machine can pass the Turing test, it would imply that it possesses humanlike intelligence.

在香农的工作发展之后不久,我们看到另一个伟大的思想进入了将成为人工智能的领域——一位名叫艾伦·图灵的计算机科学家。除了他破译德国密码并帮助我们战胜纳粹的工作之外,图灵探索了人工智能的概念,以及机器是否能够达到一个可以令人信服地模拟人类智能的程度。1950 年,他写了一篇基础论文,题为《计算机器与智能》,在其中他介绍了模仿游戏的概念,我们现在称之为图灵测试。想象一下,你在进行对话,但你看不到与你交谈的人。它可能是你在电脑或手机上聊天的人。现在,如果你看不到或无法与另一个人进行物理互动,你如何判断你是在与人类还是机器交谈?这就是图灵测试的本质。为了进行测试,通常会有一位评委负责评估人类和机器的回答。机器的目标是让评委相信它实际上是人类。它需要表现出智能、理解力和保持连贯对话的能力,就像人类一样。图灵提出,如果一台机器能够一致地让评委误以为它是人类,我们可以认为它是智能的。换句话说,如果机器能够通过图灵测试,这将意味着它具有人类般的智能。

When I accepted Sam and Greg’s offer to field-test the new GPT-4 technology in the summer of 2022, I wondered how close it might be to passing the Turing test. I had studied artificial intelligence at MIT in the mid-1990s. Back then, there were simple programs that could trick a person for a few interactions but nothing that could consistently feel human across a long, detailed conversation. It always seemed fantastical that one day a machine might actually pass the Turing test, let alone in my lifetime, and it was thrilling to actually try out a technology that seemed on the cusp of passing it, or that perhaps had already done so. This advancement could be tantamount to scientists achieving cold fusion or faster-than-light travel.

当我在 2022 年夏天接受 Sam 和 Greg 的邀请,现场测试新的 GPT-4 技术时,我想知道它离通过图灵测试有多近。我在 1990 年代中期在麻省理工学院学习人工智能。那时,有一些简单的程序可以在几次互动中欺骗人,但没有任何程序能够在长时间、详细的对话中一直感觉像人类。总觉得有一天机器真的可以通过图灵测试,这似乎是幻想,更不用说在我的有生之年了,实际上试用一项似乎处于通过测试边缘或已经通过测试的技术,这令人兴奋。这一进步相当于科学家实现冷聚变或超光速旅行。

As the initial wave of excitement crested, I also started thinking about the societal implications of a seemingly intelligent technology. While capable of solving so many problems, AI was also capable of introducing some potential downsides. If this large language model was going to be able to help tutor students, then it was going to be capable of writing essays for them too. What if the new version of GPT managed to be only a crutch for our students, preventing them from developing their own research and writing skills? It also occurred to me that if GPT-4 had the capacity to empower folks by helping them communicate and solve problems, then it also potentially threatened to dislocate many people from their jobs and their sense of purpose. A convincingly human technology capable of being a great tutor might also be a technology that bad actors could use to defraud or brainwash unsuspecting people.

当最初的兴奋浪潮消退时,我也开始思考看似智能的技术对社会的影响。尽管能够解决许多问题,人工智能也有可能引入一些潜在的负面影响。如果这个大型语言模型能够帮助辅导学生,那么它也能够为他们写论文。如果新的 GPT 版本只是成为我们学生的拐杖,阻止学生发展自己的研究和写作技能怎么办?我还想到,如果 GPT-4 有能力通过帮助人们交流和解决问题来赋能他们,那么它也可能威胁到许多人的工作和目标感。能够成为优秀导师的令人信服的技术也可能是一种被坏人用来欺骗或洗脑毫无戒心的人的技术。

My mind continued to spin a great variety of dark scenarios and outcomes—from data collection on our kids to the potentially addictive quality of the technology. I understood that the disruptive nature of the AI meant that we all needed to take it seriously. Beyond OpenAI, there are several other organizations investing heavily in large language models, including Microsoft, Google, and Meta, not to mention state actors like Russia and China. All of the technology giants had been using some form of artificial intelligence for years to feed us the ads, videos, search results, and social media posts with which we engage on an hourly basis. But this AI seemed different—because it was. Science fiction authors have always drawn the distinction between an artificial specialized intelligence that can optimize one thing versus an artificial general intelligence that can reason across many tasks like a human being, the latter case being the one that could lead to both utopian and dystopian futures.

我的脑海里继续浮现各种黑暗的情景和结果——从对我们孩子的数据收集到技术的潜在成瘾性。我明白,人工智能的颠覆性意味着我们都需要认真对待它。除了 OpenAI,还有其他几家机构在大力投资大型语言模型,包括微软、谷歌和 Meta,更不用说像俄罗斯和中国这样的国家行为体。所有这些科技巨头多年来一直在使用某种形式的人工智能,向我们推送广告、视频、搜索结果和社交媒体帖子,这些都是我们每小时都在参与的。但这种人工智能看起来不同——因为它确实如此。科幻作家一直在区分能够优化一件事的人工专用智能和能够像人类一样跨多个任务进行推理的人工通用智能,后一种情况可能会导致乌托邦和反乌托邦的未来。

To many, large language models like GPT-4 approach artificial general intelligence because they can write about, and seemingly reason about, any subject, making them relevant almost anywhere. Generative AI can also construct and make sense of images. While it was clear that generative AI was going to change our world in ways we might scarcely imagine, it also occurred to me that it was not up to me or Khan Academy to decide whether we were comfortable with this technology moving forward. It was already moving forward. As someone trying his best to use technology for good, I wondered if we might be able to utilize generative AI to achieve the greatest positive effect and edge us closer to that utopian scenario, especially in the realm of education.

对许多人来说,像 GPT-4 这样的大型语言模型接近人工通用智能,因为它们可以撰写任何主题,并且似乎可以对任何主题进行推理,使它们几乎在任何地方都具有相关性。生成式 AI 还可以构建和理解图像。虽然很明显生成式 AI 将以我们几乎难以想象的方式改变我们的世界,但我也想到,是否愿意接受这种技术向前发展并不是由我或可汗学院决定的。它已经在前进了。作为一个尽力使用技术为善的人,我想知道我们是否能够利用生成式 AI 实现最大的积极效果,并使我们更接近那个乌托邦情景,特别是在教育领域。

With unexpected access to OpenAI’s newest model, I sat down at my computer on that very afternoon, a Friday. I was glued to it for most of the weekend. After ten hours planted at the computer, I tried to shut off and sleep, but my mind was racing at all that the technology was capable of and how we, as educators, might use it with our learners. I saw that the implications of GPT-4 were nothing short of revelatory. Not only was it able to solve complex problems and answer tricky questions, but it was also able to explain its reasoning and generate new questions on its own. It was able to take on personas with humanlike characteristics. After prompting GPT-4, I got it to act as a math tutor, a science tutor, and a language tutor, each one possessing the ability to drive a conversation.

出乎意料地获得了使用 OpenAI 最新模型的机会,我在那个周五下午坐在电脑前。整个周末我几乎都被它吸引住了。坐在电脑前十小时后,我试图关闭并入睡,但我的脑海中充满了这项技术的所有能力,以及我们作为教育者可能如何与我们的学习者一起使用它。我看到,GPT-4 的影响无异于启示。它不仅能够解决复杂问题和回答棘手问题,还能够解释其推理过程并自行生成新问题。它能够具有人类般的特征。经过提示,GPT-4 充当了数学导师、科学导师和语言导师的角色,每一个角色都具有推动对话的能力。

What really caught my attention, though, was its ability to write in different tones and styles. For instance, as part of my GPT-4 learning journey, I could not resist the temptation to have GPT technology rewrite the Declaration of Independence in the voice of Donald Trump:

然而,真正引起我注意的是它在不同语调和风格下写作的能力。例如,作为我学习 GPT-4 的一部分,我无法抗拒诱惑,让 GPT 技术以唐纳德·特朗普的声音重写《独立宣言》:

Folks, let me tell you, it is time for us to declare our independence from this failing government and this terrible leader, King George III. I mean, this guy is a total disaster. He’s been taking away our rights, he’s been ignoring the will of the people, and he’s been more interested in lining his own pockets than actually helping the people. It is just sad, really, really sad.

朋友们,让我告诉你们,现在是我们宣告独立的时候了,脱离这个失败的政府和这个糟糕的领导人,乔治三世国王。我是说,这个人是个彻底的灾难。他一直在剥夺我们的权利,无视人民的意愿,比起真正帮助人民,他更感兴趣的是填满自己的口袋。这真的很悲哀,真的很悲哀。

But we, the people, we’re not going to stand for it any longer. We’re going to make our own country, and it is going to be the best country, the greatest country the world has ever seen. Believe me, nobody knows more about making a country great than I do. We’re going to have the best economy, the best military, the best everything. It is going to be tremendous.

但是我们,人民,我们不会再忍受了。我们要建立自己的国家,它将成为最好的国家,世界上有史以来最伟大的国家。相信我,没有人比我更了解如何让一个国家伟大。我们将拥有最好的经济,最好的军队,最好的所有一切。它将是巨大的。

Next, I asked GPT-4 to write the Declaration of Independence in the style of Dr. Seuss:

接下来,我让 GPT-4 以苏斯博士的风格写《独立宣言》:

In the land of the free and the home of the brave

A new nation was born, no longer a slave

To the rule of a king across the sea

We declared our independence, and shouted with glee

在自由之地和勇敢者之家

一个新国家诞生了,不再是奴隶

摆脱了海另一边国王的统治

我们宣告独立,并欢呼雀跃

With a pen in hand and a fire in our hearts

We wrote down our grievances, every last part

No more taxes without representation

No more soldiers in our homes, no more frustration

手握笔,心中燃烧着火焰

我们写下我们的不满,每一部分

没有代表权,就不再征税

不再有士兵驻扎在我们的家中,不再有挫败感

We hold these truths to be oh so true

That all men are equal, yes me and you

We have rights that can’t be taken away

Life, liberty, and happiness, we’re here to stay.

我们认为这些真理是如此真实

所有人都是平等的,是的,包括我和你

我们拥有不可剥夺的权利

生命、自由和幸福,我们将坚持下去。

Tricks like this became somewhat commonplace once people had access to ChatGPT, but at the time I was one of the first to have access to this type of technology and attempt such prompts. And GPT-4 was frankly much better than the first iteration of ChatGPT, which wouldn’t be released to the public for another few months. I found the results of my prompts to be impressive, hilarious, and even a little scary. When I asked it questions or requested recommendations, it gave me answers that felt completely authentic. There was no person behind the scenes typing these responses, nor was there an algorithm generating the text with traditional if-then-type logic that other algorithms typically worked from. I did not get canned and robotic answers either. Instead, I received varied responses to the same query every time I asked it, responses that took into account the context of my conversation with it up to that point.

当人们可以访问 ChatGPT 时,这样的技巧变得有些司空见惯,但当时我是最早接触这种技术并尝试这些提示的人之一。坦率地说,GPT-4 比第一版 ChatGPT 要好得多,后者还要几个月才会公开发布。我发现我的提示的结果令人印象深刻,搞笑,甚至有点可怕。当我问它问题或请求建议时,它给我的答案感觉完全真实。没有人在幕后打这些回复,也没有算法用传统的 if-then 逻辑生成文本。我得到的不是罐装的机械答案。相反,每次我问它相同的查询时,我都会收到不同的回答,这些回答考虑到了我与它的对话的上下文。

More specifically, it made me realize the potential of this technology to transform the way we think about K–12 and higher education and beyond. The AI was not quite perfect yet. It was getting math incorrect more than I liked, but I could even see improvement as I got better at prompting it. By the end of that weekend, I wondered what might happen if I gathered dozens of the brightest minds in technology and education to play around with the platform alongside me. OpenAI agreed to grant access to another thirty or so engineers, content creators, educators, and researchers on the Khan Academy team to experiment with GPT-4.

更具体地说,它让我意识到这项技术有可能改变我们对 K-12 教育和高等教育以及其他方面的思考方式。人工智能还不完全完美。它在数学上犯错的次数比我希望的要多,但随着我的提示技巧不断提高,我也看到了它的改进。在那个周末结束时,我想知道如果我召集几十位科技和教育领域最聪明的人来和我一起试用这个平台,会发生什么。OpenAI 同意向可汗学院团队中另外约三十名工程师、内容创作者、教育工作者和研究人员开放 GPT-4 的使用权限,供他们进行实验。

It was time for a hackathon.

是时候举办一场黑客马拉松了。

Every six months, we have a week at Khan Academy where employees are allowed to work on pretty much anything they want related to our mission. I presented GPT-4 to a small subset of our team and let them have at it. Through collaboration and innovation, we brainstormed, designed, and developed really cool and consequential ideas. What we eventually came to call the hack-AI-thon elicited dozens of completely new concepts and education modalities that no one had ever thought of before. For instance, what if the AI was able to help teachers write their lesson plans? What if it could enter into a debate with a student? What if it could create projects? What if it could help a student remove stressors or inspire a student to create new ideas? What if the technology was able to quiz a student or lead a student in a review session? Educators would be able to create novel activities that students could do with the AI. The AI might help students compose essays, making them better writers by providing them with immediate feedback.

每六个月,我们在可汗学院有一个星期,员工可以从事任何与我们的使命相关的工作。我向我们团队的一小部分成员展示了 GPT-4 并让他们进行试用。通过合作和创新,我们集思广益、设计并开发了非常酷且重要的想法。我们最终称之为 hack-AI-thon 的活动引发了几十个完全新颖的概念和教育模式,这些都是前所未有的。例如,如果 AI 能够帮助教师编写课程计划怎么办?如果它能与学生进行辩论怎么办?如果它能创建项目怎么办?如果它能够帮助学生消除压力或激发学生创造新想法怎么办?如果这项技术能够测试学生或引导学生进行复习怎么办?教育工作者将能够创建学生可以与 AI 一起完成的新活动。AI 可能会帮助学生写作文,通过提供即时反馈让他们成为更好的作家。

From here, the hack-AI-thon participants explored questions about safety, security, and bias. (Remember, this was well before OpenAI released ChatGPT to the public.) We flagged some clear concerns: Was it really a good idea for students to be using generative AI to write their essays, perform their research, take tests for them, or even help them apply to college? Artificial intelligence, we worried, might turn our kids into a generation of cheaters who were not going to learn anything. With the AI taking over, parents who once helped their kids with homework might lose an important point of connection. As for teachers, was this going to be a boon, or was this going to undermine their ability to teach students? At no point did I think that AI was going to put teachers out of work, and in the best-case scenario it was going to accelerate their ability to teach their students, but I also worried it might undermine that ability in critical ways too.

从这里开始,hack-AI-thon 的参与者探索了关于安全性、保密性和偏见的问题。(请记住,这远在 OpenAI 向公众发布 ChatGPT 之前。)我们提出了一些明显的担忧:学生使用生成式 AI 写论文、进行研究、代考甚至帮助他们申请大学真的好吗?我们担心,人工智能可能会让我们的孩子变成一代作弊者,他们将什么都学不到。随着 AI 的接管,曾经帮助孩子做作业的父母可能会失去一个重要的连接点。至于教师,这是一个福音,还是会削弱他们教学生的能力?我从未认为 AI 会让教师失业,在最好的情况下,它会加速他们教学生的能力,但我也担心它可能会在关键方面削弱这种能力。

Nearly two decades earlier, I had seen similar fears around on-demand video in education: Was it going to be a distraction for students? Was it going to lower their attention spans? Would it isolate students instead of promoting connections between them and their teachers? How were students going to be able to know what to watch? Whom would they ask if they got stuck on a subject and had questions?

近二十年前,我在教育领域看到了类似的对点播视频的担忧:这会不会成为学生的分心事物?这会不会降低他们的注意力?它会不会使学生孤立而不是促进他们与老师之间的联系?学生们怎么知道该看什么?如果他们在某个学科上卡住并有问题,他们会问谁?

It’s never a good idea to let fear stop you from exploring, however. The more time we spent testing GPT-4, the more we realized how to mitigate problems with solutions that in many ways made the upsides even better. To address concerns around cheating, for instance, we considered what it would take to create an AI tutor that refused to give students answers. Like a good human tutor, it would instead ask leading questions. When we thought about student safety, we considered creating a system that logged all conversations and made them transparent to teachers and parents. To encourage human-to-human connectivity, we suggested tools that teachers and students could use to give them more time and energy for themselves and for each other.

然而,让恐惧阻止你探索从来不是一个好主意。我们花在测试 GPT-4 上的时间越多,我们就越意识到如何通过解决方案来缓解问题,而在许多方面,这些解决方案使优点变得更好。例如,为了解决作弊问题,我们考虑创建一个拒绝给学生答案的 AI 导师。像一个好的真人导师一样,它会提出引导性问题。当我们想到学生的安全时,我们考虑创建一个记录所有对话并向教师和家长公开的系统。为了鼓励人与人之间的联系,我们建议使用工具,让教师和学生可以为他们自己和彼此腾出更多时间和精力。

By the end of the hack-AI-thon, our team started to feel increasingly confident that GPT-4 was going to be a game changer for education. Used properly, it would positively affect how teachers planned, instructed, and graded. By bringing artificial intelligence into the classroom, educators could tackle entrenched problems in education that we had not been able to solve with existing technology and resources. Soon, students might be able to learn faster and retain more information than ever before, proving AI to be the ultimate learning tool for accelerating human intelligence and potential. AI might hasten learning globally and even get us closer to realizing a world in which every person on earth had access to affordable world-class learning. This technology had the potential to revolutionize how we communicate, create, and consume information the same way that, twenty years earlier, we marveled at the educational possibilities of the internet, and twenty years before that, the personal computer, and twenty years before that, the calculator.

在 hack-AI-thon 结束时,我们团队开始越来越有信心认为 GPT-4 将成为教育的游戏规则改变者。如果使用得当,它将积极影响教师的计划、教学和评分方式。通过将人工智能引入课堂,教育工作者可以解决我们无法用现有技术和资源解决的教育中的顽固问题。很快,学生可能会比以往任何时候都学得更快,记住更多信息,证明 AI 是加速人类智力和潜力的终极学习工具。AI 可能会加快全球学习进程,甚至让我们更接近实现人人都能获得负担得起的世界级学习的世界。这项技术有潜力彻底改变我们交流、创造和消费信息的方式,就像二十年前我们对互联网教育可能性感到惊讶,四十年前对个人计算机感到惊讶,六十年前对计算器感到惊讶一样。

As for me, I was left invigorated. I became increasingly confident that with proper care we could mitigate the risks and potential downsides of generative AI. It was clear that even though we were among the first people on the planet to incorporate this technology, as soon as the world got wind of it, everything was going to change dramatically—and not over the span of generations but within months.

至于我,我感到精神振奋。我越来越有信心,只要谨慎行事,我们就可以降低生成式 AI 的风险和潜在的负面影响。很明显,尽管我们是地球上最早采用这项技术的人之一,但一旦世界听到这项技术的消息,一切都会发生剧变——而且不是在几代人的时间内,而是在几个月内。

Engaging the Principle of Educated Bravery

践行受过教育的勇气原则

At the end of November 2022, two months after our hack-AI-thon, we were still exploring the possibilities of the technology when OpenAI released ChatGPT to the public. This initial release layered a chat interface over GPT-3.5, which had been out for several months. Even though it was suboptimal, operating on a model far less powerful than GPT-4, it immediately captured the world’s imagination. Within days, millions of folks were using it, and social media and the press seemed to talk about nothing else.

2022 年 11 月底,也就是我们的 hack-AI-thon 结束两个月后,我们仍在探索这项技术的可能性,这时 OpenAI 向公众发布了 ChatGPT。这次初始发布是在已推出数月的 GPT-3.5 之上叠加了一个聊天界面。尽管它并不理想,运行在远不如 GPT-4 强大的模型上,但它立即抓住了全世界的想象力。几天之内,数百万人开始使用它,社交媒体和新闻媒体似乎都在谈论这个话题。

Alongside this initial wave of excitement, many expressed concern about a potential epidemic of student cheating, AI-produced factual errors or hallucinations, bias, and questions about the sources of information from which ChatGPT drew its data. In real time, we were watching a society starting to struggle with the implications of a powerful but imperfect tool, and nowhere more so than in the realm of education. Within weeks of ChatGPT’s launch, school districts around the country were banning it. It lit a fire in our belly at Khan Academy. We had to show not only that there were solutions to mitigating many of these risks but that AI could truly be transformational for education. We wanted to show this with real tools in real schools and not just with theoretical talk.

随着这波最初的兴奋,许多人表达了对学生作弊可能流行、人工智能产生的事实错误或幻觉、偏见以及 ChatGPT 从中获取数据的信息来源问题的担忧。实时地,我们看到一个社会开始挣扎于一个强大但不完美的工具的影响,尤其是在教育领域。ChatGPT 推出后几周内,全国各地的学区都在禁止它。它在可汗学院点燃了我们的斗志。我们不仅要展示有减轻这些风险的解决方案,还要展示人工智能真正可以改变教育。我们想在真实学校中使用真实工具展示这一点,而不仅仅是理论上的谈话。

Within months, OpenAI was planning to release GPT-4 to the public, and when it did, we would need answers to the cheating problem, to the problem of transparency, to the problem of moderation. We had to show that AI could be valuable for teachers and learners in actual educational environments, that it could provide every teacher with support to write lesson plans, to monitor the progress of a classroom, to give real-time feedback, and even to modulate teacher burnout. We had to show AI’s real value to students as a Socratic tutor, as a debating partner, as a guidance counselor, as a career coach, and as a driver of better outcomes in their academics. So, alongside OpenAI, we created a rapid prototyping team that began to build an AI-infused education platform we would come to call Khanmigo.

OpenAI 计划在几个月内向公众发布 GPT-4,到那时,我们需要为作弊问题、透明性问题和内容审核问题找到答案。我们必须展示人工智能在实际教育环境中对教师和学生的价值,它可以为每位教师提供支持,帮助他们编写课程计划、监控课堂进度、提供实时反馈,甚至缓解教师的职业倦怠。我们必须展示人工智能作为苏格拉底式导师、辩论伙伴、指导顾问、职业教练以及学业成果推动者对学生的真正价值。因此,我们与 OpenAI 合作,创建了一个快速原型团队,开始构建一个融入 AI 的教育平台,我们后来称之为 Khanmigo。

It was in that spirit of pushing its capabilities with novel applications and creating safeguards that I had an idea. Everyone was talking about AI enabling cheating by writing papers for students, but what if it didn’t write for them at all? What if, instead, it wrote with them?

正是本着用新颖的应用拓展其能力、同时建立保障措施的精神,我有了一个想法。每个人都在谈论人工智能通过为学生写论文来助长作弊,但如果它根本不为他们写作呢?如果相反,它与他们一起写作呢?

This was where I found myself on New Year’s Day 2023, when I asked my daughter, Diya, to write a story with me—and GPT-4. This was far from the first time I had asked Diya to be a part of an education experiment or to test out a prototype for us. Usually, these informal testing sessions involved having Diya, one of my sons, or another very patient accomplice like a student or teacher at Khan Lab School or Khan World School try to navigate a new exercise to make sure that the interface between students and technology worked reasonably well. I’ve always found this do-it-yourself user research incredibly helpful and grounding. My “accomplices” like it, too, because it allows them to kick the tires of a new program or innovation that can make a real difference in the lives of learners. This was never truer than with GPT-4. So I created a prompt for GPT-4, telling it to write a story with us, not for us. I told it how we wanted to interact conversationally and with what type of tone.

这就是我在 2023 年元旦时的处境,当时我让我的女儿 Diya 和我以及 GPT-4 一起写一个故事。这远不是我第一次让 Diya 参与教育实验或为我们测试原型。通常,这些非正式测试会让 Diya,或我的一个儿子,或另一个非常有耐心的同伴,比如 Khan 实验学校或 Khan 世界学校的学生或教师,尝试完成一个新练习,以确保学生与技术之间的界面运行良好。我一直觉得这种自己动手的用户研究非常有帮助,也让人脚踏实地。我的“同伴”也喜欢这样,因为这让他们可以试用一个能对学习者的生活产生实际影响的新项目或创新。这一点在 GPT-4 上体现得最为明显。所以我为 GPT-4 创建了一个提示,告诉它与我们一起写一个故事,而不是为我们写。我告诉它我们希望如何以对话的方式互动,以及使用什么样的语气。

Diya then started to create her story about Samantha, the social influencer rescued from a desert island by her best friend, Emily. Diya would write a bit and then the large language model took over for a stretch. Samantha introduced herself, told us about her life, and joined us in co-creating the story. To our mutual amazement, Samantha, via GPT-4, made the adventure come alive with engaging and entertaining dialogue and perfectly captured the personalities of Diya’s two characters. Together, Diya and the AI continued to create. Their story had its share of heartwarming, hilarious, and sad moments, from Samantha’s concern for her designer wardrobe to the touching reaction when Diya’s character Emily fell ill and passed away.

然后,Diya 开始创作她关于 Samantha 的故事,这位社交网红被她最好的朋友 Emily 从荒岛上救了出来。Diya 写了一点,然后大型语言模型接管了一段时间。Samantha 介绍了自己,告诉我们她的生活,并加入我们共同创作故事。令我们都惊讶的是,通过 GPT-4,Samantha 用引人入胜和有趣的对话让冒险故事变得生动,并完美地捕捉到了 Diya 笔下两个角色的个性。Diya 和 AI 继续一起创作。他们的故事有温馨、搞笑和悲伤的时刻,从 Samantha 对她的设计师服装的关心,到 Diya 的角色 Emily 生病和去世时的感人反应。

Throughout the interaction, the AI demonstrated its ability to engage in meaningful and complex dialogue. It responded to Diya’s prompts in a way that felt natural and authentic. As a parent, I found it uplifting to watch my daughter interact with GPT-4 as her imagination sprouted in this new way. The AI she was working alongside accurately captured the fictionalized world Diya was creating and kept track of the existing conversation between her and the characters. It also remembered the rules Diya had set within the story world, as well as the information she gave the AI earlier in the conversation. The AI was expanding her mind and helping her writing and creativity grow. Where writing was usually a unidirectional activity, with the creator pushing the ideas onto the paper, this act of writing—choosing the right words and finding the right voice—had become a two-way interaction between human and machine, making my daughter, at eleven years old, one of the very first people on the planet to write a story and have it talk back to her as she was creating it!

在整个互动过程中,AI 展示了其参与有意义和复杂对话的能力。它以自然和真实的方式回应了 Diya 的提示。作为父母,看着我的女儿与 GPT-4 互动,她的想象力以这种新的方式萌发,我感到非常振奋。她与之合作的 AI 准确地捕捉到了 Diya 正在创造的虚构世界,并记住了她与角色之间已有的对话。它还记住了 Diya 在故事世界中设定的规则,以及她在对话早期给 AI 的信息。AI 正在拓展她的思维,帮助她的写作和创造力成长。写作通常是一种单向活动,创作者将想法推到纸上,而这种写作行为(选择正确的词语、找到正确的声音)变成了人类与机器之间的双向互动,使我十一岁的女儿成为地球上最早一批写故事、并让故事在创作过程中回应自己的人之一!

This was absolutely mind-blowing. Here we were, working with a technology that took writing, and everything we knew about teaching and learning in all domains, to another level. The need to get this functionality to every learner on the planet was suddenly clear to me. The future potential of how we learn and teach was changing before my eyes. As my daughter and “Samantha” were busy writing their story, thanks to AI, we were about to write a new story about education, one that was to challenge people to be careful about, but not fearful of, change.

这真是令人震惊。在这里,我们使用的这种技术,将写作以及我们在所有领域中所知的教学和学习提升到了另一个水平。突然间,我清楚地认识到,需要将这种功能带给地球上的每个学习者。我们学习和教学方式的未来潜力在我眼前发生了变化。当我的女儿和“Samantha”忙着写她们的故事时,多亏了 AI,我们即将写一个关于教育的新故事,一个促使人们对变革保持谨慎但不畏惧的故事。

Now the world is waking up to the possibilities of large language models and what is in store for education. To take advantage of this technology requires some creativity and bravery too. Not blind bravery, but something I have started to call educated bravery, a kind of courage that comes from acknowledging the rational trepidation we all face when presented with sudden advances in technology and then informing ourselves of both the challenges and the potential it presents.

现在世界正在意识到大型语言模型的可能性以及它对教育的潜在影响。利用这项技术需要一些创造力和勇气。不是盲目的勇气,而是我开始称之为“受过教育的勇气”,一种来自承认我们在面对突然的技术进步时所面临的理性恐惧,然后了解它带来的挑战和潜力的勇气。

To get the very best from this technology, we need to reconsider what is now possible. We also need to ponder how to mitigate the risks, our fears, and our hesitations. This requires rethinking everything from the role of teachers, to how kids use multimedia, to how people get credentialed, to how to help graduates find jobs once they enter the workforce.

为了从这项技术中获得最大的好处,我们需要重新考虑现在可能实现的事情。我们还需要思考如何减轻风险、恐惧和犹豫。这需要重新思考从教师的角色,到孩子们如何使用多媒体,到人们如何获得资格认证,再到如何帮助毕业生在进入职场后找到工作。

We are at a turning point in education, one with far-reaching implications that is changing, and will continue to change, everything about learning, work, and human purpose.

我们正处于教育的一个转折点,这个转折点具有深远的影响,正在改变并将继续改变关于学习、工作和人类目标的一切。
