MarioGPT hints at a glorious AI-generated future where we will all play Nintendo forever

There can never be too much Mario in the world. Sure, it’s probably been a while since you played one of the original NES games, but that’s probably because they’re so familiar. What if I told you researchers had created a way to generate infinite Mario levels so you can play a brand new one every day until the sun burns out?

Update: The level generator now has a working web app where you can play your prompts. Go give it a shot!

A team at IT University of Copenhagen just released a (pre-pub) paper and GitHub page showing a new method for encoding and generating Super Mario Bros levels, which they call MarioGPT. (Somewhere in Redmond, a lawyer sips his coffee and begins typing.)

MarioGPT is based on GPT-2, not one of these newfangled conversational AIs. Large language models like it are good for more than taking in sentences like these and putting out more like them: they are general-purpose pattern recognition and replication machines.

“We honestly just picked the smaller one to see if it worked!” said Shyam Sudhakaran, lead author on the paper, in an email to TechCrunch. “I think with small datasets in general, GPT2 is better suited than GPT3, while also being much more lightweight and easier to train. However, in the future, with bigger datasets and more complicated prompts, we may need to use a more sophisticated model like GPT3.”

Even a very large LLM won’t understand Mario levels natively, so the researchers first had to render a set of them as text, producing a sort of Dwarf Fortress version of Mario that, honestly, I would play:

Each tile is rendered as a different character. Image Credits: IT University of Copenhagen

Want to make a buck? Mario in the terminal. Just saying.

Once the level is represented as a series of ordinary characters, it can be ingested by the model much the way any other series of characters can, be they written language or code. And once it understands the patterns that correlate with features, it can reproduce them.
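To make that concrete, here is a minimal sketch of turning a tile grid into one long string and back again. The tile legend (`-` for air, `X` for ground, `?` for a question block, `E` for an enemy), the column-by-column reading order, and the function names are assumptions for illustration, not the researchers' actual code.

```python
# Hypothetical tile legend, for illustration only:
# "-" air, "X" ground, "?" question block, "E" enemy.
LEVEL_ROWS = [
    "--------",
    "---?----",
    "------E-",
    "XXXX-XXX",
]


def flatten_columns(rows):
    """Read the grid column by column, top to bottom, into one string."""
    height = len(rows)
    width = len(rows[0])
    return "".join(rows[r][c] for c in range(width) for r in range(height))


def unflatten_columns(text, height):
    """Rebuild the list of row strings from a column-ordered string."""
    width = len(text) // height
    return [
        "".join(text[c * height + r] for c in range(width))
        for r in range(height)
    ]


flat = flatten_columns(LEVEL_ROWS)
restored = unflatten_columns(flat, len(LEVEL_ROWS))
```

Reading by columns rather than rows means the string advances left to right through the level, so a model that continues the string is, in effect, extending the level to the right.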

Its output includes a “path” represented as lowercase x’s, essentially showing that the level is technically playable. The researchers found that of 250 generated levels, nine out of 10 could be completed by a game-playing software agent built on the A* search algorithm.
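For a rough idea of what such a playability check involves, here is a small A*-style search over a text level. It is not the agent used in the paper: the movement model (hop up to `max_dx` columns sideways and `max_up` rows up, fall any distance, ignore obstacles in between), the tile symbols, and every function name are assumptions made for the sketch.

```python
import heapq

SOLID = {"X", "S", "?"}  # hypothetical symbols for tiles Mario can stand on


def standable(rows, r, c):
    """True for an empty tile with a solid tile directly beneath it."""
    return rows[r][c] == "-" and r + 1 < len(rows) and rows[r + 1][c] in SOLID


def find_path(rows, max_dx=3, max_up=2):
    """A* from the leftmost column to the rightmost; list of (row, col) or None."""
    height, width = len(rows), len(rows[0])
    spots = [
        (r, c)
        for r in range(height)
        for c in range(width)
        if standable(rows, r, c)
    ]

    def h(spot):
        # Each move covers at most max_dx columns, so this never overestimates.
        return -(-(width - 1 - spot[1]) // max_dx)

    frontier = [(h(s), 0, s, [s]) for s in spots if s[1] == 0]
    heapq.heapify(frontier)
    seen = set()
    while frontier:
        _, cost, spot, path = heapq.heappop(frontier)
        if spot[1] == width - 1:
            return path
        if spot in seen:
            continue
        seen.add(spot)
        r, c = spot
        for nr, nc in spots:
            reachable = 0 < abs(nc - c) <= max_dx and r - nr <= max_up
            if reachable and (nr, nc) not in seen:
                step = cost + 1
                heapq.heappush(
                    frontier, (step + h((nr, nc)), step, (nr, nc), path + [(nr, nc)])
                )
    return None


def draw_path(rows, path):
    """Mark every tile on the path with a lowercase x."""
    grid = [list(row) for row in rows]
    for r, c in path:
        grid[r][c] = "x"
    return ["".join(row) for row in grid]


LEVEL = [
    "----------",
    "----------",
    "XXX--XXXXX",
]
path = find_path(LEVEL)
marked = draw_path(LEVEL, path) if path else LEVEL
```

In this toy level the two-tile gap is clearable with the default three-column jump, but calling `find_path(LEVEL, max_dx=2)` returns `None`, which is the kind of result that would flag a level as unplayable.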

Of course that wouldn’t be much of a success if the levels were just flat with occasional pipes to clear. But the researchers included a few functions to measure how simple the path is, and to compare each generated level to those in the dataset. High novelty and “interesting” path trajectories mean doable levels that neither resemble existing ones nor let the player just walk straight through.
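The paper's exact measures aren't spelled out here, so the following is only a stand-in to show the general idea: novelty as the tile-by-tile difference from the closest level in a reference set, and path "interest" as the number of times the path changes height. Both definitions and all names are assumptions, not the researchers' metrics.

```python
def tile_distance(a, b):
    """Fraction of tile positions at which two equal-sized levels differ."""
    total = sum(len(row) for row in a)
    differing = sum(
        x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)
    )
    return differing / total


def novelty(level, dataset):
    """Distance to the most similar dataset level; 0.0 means an exact copy."""
    return min(tile_distance(level, other) for other in dataset)


def path_bumpiness(path):
    """Count steps where the path changes row; 0 means a flat walk."""
    return sum(1 for (r1, _), (r2, _) in zip(path, path[1:]) if r1 != r2)


# Tiny hypothetical levels, just to exercise the functions.
known = [["--", "XX"], ["-?", "XX"]]
candidate = ["E-", "XX"]
score = novelty(candidate, known)
```

A generator could then keep only levels that pass a playability check and also clear chosen thresholds on both numbers, though where to set those thresholds is a judgment call.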

The labeled input also made it so that the model can understand natural language prompts, like asking it to make a level with “lots of pipes and lots of enemies,” or “many blocks, high elevation, no enemies.”
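One plausible way to get labels like these is to count features in each training level and bucket the counts into words. The sketch below does that; the tile symbols, the thresholds, and the three-word vocabulary are all hypothetical choices, not the ones used for MarioGPT.

```python
# Hypothetical legend: "<" marks the top-left of a pipe, "E" an enemy,
# "?" and "S" are blocks. One "<" is counted as one pipe.
def bucket(count, noun, many=5):
    """Turn a raw count into a rough phrase such as 'some pipes'."""
    if count == 0:
        return f"no {noun}"
    if count < many:
        return f"some {noun}"
    return f"many {noun}"


def make_prompt(rows):
    """Build a comma-separated description of a level from tile counts."""
    text = "".join(rows)
    pipes = text.count("<")
    enemies = text.count("E")
    blocks = text.count("?") + text.count("S")
    return ", ".join(
        [bucket(pipes, "pipes"), bucket(enemies, "enemies"), bucket(blocks, "blocks")]
    )


label = make_prompt(["--E-", "XXXX"])
```

Pairing each level with a description like this during training is what would let a model later work in the other direction, from a description to a level.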

Examples of levels created by text prompts. Image Credits: IT University of Copenhagen

One limitation is that, due to the way their source data in the Video Game Level Corpus is encoded, there’s only one symbol for “enemy,” instead of one each for goombas, koopas, etc. But this can be changed if needed — the concept that needed proving was more that good levels could be generated at all. (Sadly, water levels are also not currently possible due to not being represented in the dataset.)

“In future work, we’re gonna explore some richer datasets!” said Sudhakaran.

Coincidentally, Julian Togelius at NYU GameLab and his group just wrote a paper showing a similar process for “sokoban” or block-pushing puzzle games. The principles are similar, though the two approaches differ in their details.

That these approaches worked for two different genres suggests it could work for others of similar complexity — not quite generating infinite Chrono Trigger, but an AI-powered 2D Sonic isn’t out of the question.

It should be said that this isn’t the first Mario generator we’ve seen, but others tend to rely not on a generative AI but on assembling levels from pre-created tilesets and sequences. So you may get a new sequence, but it won’t be original on a tile-by-tile basis, just screen-by-screen.

As the first version of MarioGPT, this is purely experimental and hopefully will avoid the Sauron-like gaze of Nintendo, which is known for hammering fan projects involving its properties. But of course while infinite Mario does sound fun, the charm of the original games is in their hand-crafted difficulty and themes — something that isn’t quite so easy to recreate.