Researchers populated a tiny virtual town with AI (and it was very wholesome)

Interactive Simulacra of Human Behavior
Image Credits: Google / Stanford University

What would happen if you filled a virtual town with AIs and set them loose? As it turns out, they brush their teeth and are very nice to one another! But this unexciting outcome is good news for the researchers who did it, since they wanted to produce “believable simulacra of human behavior” and got just that.

The paper describing the experiment, by Stanford and Google researchers, has not been peer reviewed or accepted for publication anywhere, but it makes for interesting reading nonetheless. The idea was to see if they could apply the latest advances in machine learning models to produce “generative agents” that take in their circumstances and output a realistic action in response.

And that’s very much what they got. But before you get taken in by the cute imagery and descriptions of reflection, conversation and interaction, let’s make sure you understand that what’s happening here is more like an improv troupe role-playing on a MUD than any kind of proto-Skynet. (Only millennials will understand the preceding sentence.)

These little characters aren’t quite what they appear to be. The graphics are just a visual representation of what is essentially a bunch of conversations between multiple instances of ChatGPT. The agents don’t walk up, down, left and right or approach a cabinet to interact with it. All this is happening through a complex and hidden text layer that synthesizes and organizes the information pertaining to each agent.

Twenty-five agents, 25 instances of ChatGPT, each prompted with similarly formatted information that causes it to play the role of a person in a fictional town. Here’s how one such person, John Lin, is set up:

John Lin is a pharmacy shopkeeper at the Willow Market and Pharmacy who loves to help people. He is always looking for ways to make the process of getting medication easier for his customers; John Lin is living with his wife, Mei Lin, who is a college professor, and son, Eddy Lin, who is a student studying music theory; John Lin loves his family very much; John Lin has known the old couple next-door, Sam Moore and Jennifer Moore, for a few years; John Lin thinks Sam Moore is a kind and nice man…

With that information, the agents are then asked to come up with their next actions given the time and circumstances. For instance, the researchers might tell the John agent that it is 8 AM and he just woke up. What does he do? Well, he brushes his teeth, kisses his wife (hopefully in that order), gets dressed, then goes to the kitchen.

Meanwhile, another, totally independent ChatGPT instance representing John’s son Eddy has also been prompted with its own information. It too gets up, brushes its teeth, then goes to the kitchen.

And now the overarching structure of the experiment steps in: The agents representing John and Eddy aren’t both “in” the same virtual space or anything. Instead, when John has finished getting dressed and says he will move to the kitchen, the experimental framework informs him that his son Eddy is there, because in its own instance, Eddy has decided to move into the kitchen at an overlapping time in the experiment-level “day,” based on an estimate of how long various actions take.
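To make that bookkeeping concrete, here is a minimal sketch of how a framework could decide that two independently planned agents end up in the same room. It is not the researchers' code: the `PlannedAction` type, the function names, the clock times and the durations are all invented for illustration, and the only idea taken from the description above is "same location, overlapping time window."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedAction:
    """One action an agent has announced (hypothetical structure)."""
    agent: str
    location: str
    start_min: int      # minutes since midnight in the simulated day
    duration_min: int   # estimated length of the action

    @property
    def end_min(self) -> int:
        return self.start_min + self.duration_min


def co_present(a: PlannedAction, b: PlannedAction) -> bool:
    """True when both actions share a location and their time windows overlap."""
    same_place = a.location == b.location
    overlap = a.start_min < b.end_min and b.start_min < a.end_min
    return same_place and overlap


def observations_for(agent: str, actions: list[PlannedAction]) -> list[str]:
    """Text the framework could feed back to `agent` about who else is present."""
    mine = [x for x in actions if x.agent == agent]
    others = [x for x in actions if x.agent != agent]
    notes = []
    for m in mine:
        for o in others:
            if co_present(m, o):
                notes.append(f"{o.agent} is in the {m.location}.")
    return notes


# Made-up schedule: the times and durations are assumptions, not from the paper.
plan = [
    PlannedAction("John", "kitchen", start_min=8 * 60 + 15, duration_min=20),
    PlannedAction("Eddy", "kitchen", start_min=8 * 60 + 10, duration_min=15),
    PlannedAction("Mei", "bedroom", start_min=8 * 60 + 10, duration_min=30),
]
print(observations_for("John", plan))
```

Neither agent ever "sees" the other here. Each one only receives a line of text saying who else is in the room, which is the point the experiment's design turns on.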

Since they’re both in the room at the same time, the experimental setup informs them that the other person is there — along with information like that there is a table no one is sitting at, the stove is on, etc. What do they do next?

John: Good morning Eddy. Did you sleep well?
Eddy: Good morning dad. Yeah, I slept great.
John: That’s good. What are you working on today?
Eddy: I’m working on a new music composition for my class. It’s due this week, so I’m trying to get it finished. But I’m having so much fun with it!
John: That sounds great!

What a nice, normal interaction between a parent and child. Then Mei comes in and they have a pleasant interaction as well. And it’s all accomplished by asking totally separate chatbots the same kind of question: if there were a person like this, in a fictional town, with a family like this, in a room at this time where this was happening, what would “they,” a real human, do?

So ChatGPT guesses at what a person would do in that situation, and gives its answer as if it were playing a text adventure. If you then told it, “it is pitch dark, you are likely to be eaten by a grue,” it would probably say it lights a torch. But instead, the experiment has the characters continue with their day minute by minute, buying groceries, walking in the park and going to work.

The users can also write in events and circumstances, like a dripping faucet or a desire to plan a party, and the agents respond appropriately, since any text, for them, is reality.

All of this is performed by laboriously prompting all these instances of ChatGPT with all the minutiae of the agent’s immediate circumstances. Here’s a prompt for John when he runs into Eddy later:

It is February 13, 2023, 4:56 pm.
John Lin’s status: John is back home early from work.
Observation: John saw Eddy taking a short walk around his workplace.
Summary of relevant context from John’s memory:
Eddy Lin is John Lin’s son. Eddy Lin has been working on a music composition for his class. Eddy Lin likes to walk around the garden when he is thinking about or listening to music.
John is asking Eddy about his music composition project. What would he say to Eddy?

[Answer:] Hey Eddy, how’s the music composition project for your class coming along?
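A prompt like the one above is just a handful of fields stitched together, and a rough sketch of that assembly step might look like the following. The `build_prompt` function and its parameter names are hypothetical; only the field order and wording are modeled on the quoted example, and the call to the language model itself is deliberately left out.

```python
def build_prompt(now: str, name: str, status: str, observation: str,
                 memory_summary: str, question: str) -> str:
    """Assemble the text sent to the model for one agent at one moment."""
    first_name = name.split()[0]
    lines = [
        f"It is {now}.",
        f"{name}'s status: {status}",
        f"Observation: {observation}",
        f"Summary of relevant context from {first_name}'s memory:",
        memory_summary,
        question,
    ]
    return "\n".join(lines)


prompt = build_prompt(
    now="February 13, 2023, 4:56 pm",
    name="John Lin",
    status="John is back home early from work.",
    observation="John saw Eddy taking a short walk around his workplace.",
    memory_summary=(
        "Eddy Lin has been working on a music composition for his class."
    ),
    question=(
        "John is asking Eddy about his music composition project. "
        "What would he say to Eddy?"
    ),
)
print(prompt)
```

Every time anything changes for an agent, a string like this has to be rebuilt and sent again, which gives a sense of why the approach is so laborious.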

Because the process is so long-winded, the instances would quickly begin to forget important things, so the experimental framework sits on top of the simulation and reminds them of those things or synthesizes them into more portable pieces.

For instance, suppose the agent is told about a situation in the park, where someone is sitting on a bench and having a conversation with another agent, but there is also grass and context and one empty seat at the bench… none of which is important. What is important? From all those observations, which may make up pages of text for the agent, you might get the “reflection” that “Eddie and Fran are friends because I saw them together at the park.” That gets entered in the agent’s long-term “memory” (a bunch of stuff stored outside the ChatGPT conversation) and the rest can be forgotten.
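As a loose illustration of that observe, reflect and forget cycle, the sketch below keeps a small store outside any model conversation. Everything in it is an assumption made for the example: the `AgentMemory` class, the word-overlap lookup in `recall`, and especially `fake_summarize`, which stands in for what would really be another request to the language model. The article does not describe how the actual system picks which memories to retrieve.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentMemory:
    """Hypothetical long-term store kept outside the model conversation."""
    reflections: list[str] = field(default_factory=list)
    pending: list[str] = field(default_factory=list)  # raw, uncondensed notes

    def observe(self, text: str) -> None:
        self.pending.append(text)

    def reflect(self, summarize: Callable[[list[str]], str]) -> str:
        """Condense pending observations into one reflection, then drop them."""
        reflection = summarize(self.pending)
        self.reflections.append(reflection)
        self.pending.clear()
        return reflection

    def recall(self, query: str) -> list[str]:
        """Return reflections sharing at least one word with the query."""
        wanted = set(query.lower().split())
        return [r for r in self.reflections if wanted & set(r.lower().split())]


def fake_summarize(observations: list[str]) -> str:
    # Stand-in for a language model call; it ignores its input on purpose.
    return "Eddie and Fran are friends because I saw them together at the park."


memory = AgentMemory()
memory.observe("Eddie is sitting on a bench talking with Fran.")
memory.observe("There is grass around the bench.")
memory.observe("One seat on the bench is empty.")
memory.reflect(fake_summarize)

print(memory.pending)          # the raw observations are gone
print(memory.recall("Fran"))   # the condensed reflection survives
```

The trade is pages of observations for a single sentence, which is what lets the next prompt stay short enough to be useful.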

So, what does all this rigmarole add up to? Something less than true generative agents as proposed by the paper, to be sure, but also an extremely compelling early attempt to create them. Dwarf Fortress does the same thing, of course, but by hand-coding every possibility. That doesn’t scale well!

It was not obvious that a large language model like ChatGPT would respond well to this kind of treatment. After all, it wasn’t designed to imitate arbitrary fictional characters long term or speculate on the most mind-numbing details of a person’s day. But handled correctly, and with a fair amount of massaging, not only can one agent do so, but the agents don’t break when you use them as pieces in a sort of virtual diorama.

This has potentially huge implications for simulations of human interactions, wherever those may be relevant — of course in games and virtual environments they’re important, but this approach is still monstrously impractical for that. What matters though is not that it is something everyone can use or play with (though it will be soon, I have no doubt), but that the system works at all. We have seen that in AI: If it can do something poorly, the fact that it can do it at all generally means it’s only a matter of time before it does it well.

You can read the full paper, “Generative Agents: Interactive Simulacra of Human Behavior,” here.
