At GDC last March I spoke with Evan Brown, a digital artist working for Carnegie Mellon University. Earlier that day, Evan had participated in a GDC panel where he talked about the Augur Project at the Carnegie Mellon ETC. According to the Augur website:
In cooperation with Lockheed Martin, Project Augur’s goal is to explore the frontiers of artificial intelligence as a predictive mechanism. Through the use of crowd-sourcing through Amazon Mechanical Turk, and with a trio of gamelike prototypes that collect data, Augur aims to build artificial intelligence algorithms that can predict based not only on an individual’s past play, but other, less obvious factors as well.
If you are looking to use games as tools for research, interested in the possibilities of research, or just want to hear more about how to rapidly develop successful games in a university setting – read on!
Travis: Tell me about the project. How did you get involved with GDC?
Evan: Okay, alright, Jesse Schell was our adviser for the project. He was helping guide us in how to target what Lockheed Martin wanted from us and how to use our expertise to build it. He's also a great big figurehead. Ben (Sawyer) approached him about it.
Travis: So he's your direct adviser?
Evan: He was the adviser for the project. The way the ETC is structured is that the projects for the most part are independent study. So, students and staff work together to produce either work they propose themselves, or work that clients come forward and ask for. Each project has an adviser who acts as a guiding hand and tries to make sure that everything is running smoothly between client and team. Because this really is, in many cases, the first experience some people have working on actual creative games using this technology.
Travis: Wow, OK. So during the panel you were talking about the iteration and how quickly you were constructing and testing iterations of the game. I would like to hear more about the process that you guys went through and how you're handling this rapid prototyping.
Evan: Oh sure, what we ended up doing... When we first approached the problem we had to break it down into pieces; it was entirely too big a project for us to tackle all in one chunk. When it was originally pitched, they were talking about a real-time strategy kit, right? "Make AI better for real-time strategy games." Which is entirely too difficult in a two-month time frame. There are too many factors in those types of games. So to reduce our risk, we started to break down interesting decisions that people have to make, say navigation or reward choices, that kind of thing. We broke them down into individual components of human behavior, and then we built prototypes around them. So for navigation we decided to go with kind of a 3D maze game with extremely limited knowledge--[players] can only see their immediate surroundings, so they can't plan out their strategy. What we were interested in for that was: how much can a player remember as they trek through this maze? How frustrated do they get? When will they quit the maze if they are not succeeding? What strategies do they employ to get through? You know, there are simple strategies for navigating a maze. We were surprised that not many people used them.
Travis: How do you guys track that data? What is the actual format?
Evan: As players played that particular scenario, we would track their position data--everything--along with time, and we mapped out the path that they would use. It actually looked like a helix: the longer the game went, the more the points would spiral. We could track any kind of doubling back in the maze. So to help us visualize how players were moving through this maze, how they were problem solving, we used Unity, and the database recorded all the information. We would comb through it, look at it up close, and query the database.
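(The kind of logging Evan describes could be sketched as follows. This is an illustrative sketch only; the actual Augur pipeline ran in Unity against a database, and all names here are hypothetical.)

```python
# Sketch: logging timestamped player positions and detecting "doubling back".
# Function and variable names are illustrative, not from the Augur codebase.

def log_position(trace, t, x, y):
    """Append a timestamped (x, y) grid position to a player's trace."""
    trace.append((t, x, y))

def count_backtracks(trace):
    """Count revisits to previously seen cells -- a rough doubling-back measure."""
    seen, backtracks = set(), 0
    for _, x, y in trace:
        if (x, y) in seen:
            backtracks += 1
        seen.add((x, y))
    return backtracks

trace = []
for t, (x, y) in enumerate([(0, 0), (0, 1), (1, 1), (0, 1), (0, 0)]):
    log_position(trace, t, x, y)

print(count_backtracks(trace))  # -> 2 (player retraced two cells)
```

Plotting each `(t, x, y)` triple with time on the vertical axis is what would produce the helix-like spiral Evan mentions.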
Travis: What kind of analysis tools were you using?
Evan: I'm not actually the engineer in charge of the database. So unfortunately I don’t know.
Travis: Fair enough. I'm doing something similar myself, so I was curious. I built a PHP website that runs a game, tracks player behavior, and saves it in a database, and my dissertation is looking at that database information. I'm just in the analysis phase right now, so I'm thinking about how I should hook into the MySQL database.
Evan: We ended up digging even deeper into it. One of our strategies was to dig more deeply into the data since it was smaller. It was a large dataset, but it was smaller by comparison. It wasn't hundreds of thousands of people...
Travis: Like Everquest.
Evan: Yes, that was entirely out of the question.
Travis: You said it was 407 in your talk... no, that was the number of dollars it cost you. Was it two thousand, three thousand, how many people?
Evan: We ended up with, at the end, over 3700, across the playtests and all the approaches.
Travis: How many per game you think, 900?
Evan: Yeah, it came out to about 900 to 1000 per game. And then we also had some test runs that we ran for the database.
As we delved into this data there were emergent behavioral patterns in our players, and it really didn't matter if they were male or female, you know, young or old. The samples were showing noticeable trends. One of the most interesting ones in our first iteration was that almost 80% of our players turned at the first turn spot. It really didn't matter what strategy they were using in the game, which was fairly interesting, because you could tell that some people employed certain strategies, like the left wall strategy or the right wall strategy. But they almost always turned at the first turn. So was it this kind of psychological quirk that people turn... you know, they are walking down the hallway, and when they see an opportunity to turn they say "yeah, I would like to do this." We wanted to build that into the AI, because AI right now is very much machine learning. It's looking at the numbers, it's looking at hard facts. Can we add a component in there that looks at the quirks of human psychology?
On one of the last iterations of the game, we took an interest in color theory. We designed three mazes that were identical in everything except the color palette used for the maze. We ran the test on Mechanical Turk, and a random maze was distributed to each user. You would load up the game and it would be red, blue, or green; you would run the maze, and we would record the data. What ended up happening was that users navigated the red maze fastest, and not by a small amount: the average was 139 seconds, versus 169 seconds in blue and 156 in green. So there was a noticeable difference when the red hue was used in the maze. We would require more proof, right? But there is a pattern that emerges, and you think, wow, this is very interesting. And there's any number of things that could have contributed to it: red sticks in memory, red is also an agitator so maybe they don't want to explore the maze that much, that kind of thing. So we were taking a bunch of different approaches. Like, how can we rapidly prototype the mini-game, so that if we have questions we can answer them--what happens if you change the color of the maze? And there's very quick turnaround time when you're working with an engine like Unity, swapping in assets, that kind of thing. And then running the actual tests on Mechanical Turk was really streamlined.
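(A between-condition comparison like the red/blue/green test boils down to grouping completion times by condition and averaging. A minimal sketch, seeded with the averages Evan quotes rather than the raw Augur dataset:)

```python
from collections import defaultdict

# Sketch: per-condition mean completion time, as in the color-palette maze test.
# The sample records below just echo the averages quoted in the interview;
# they are not the actual Augur data.

def mean_by_condition(records):
    """records: list of (condition, seconds); returns {condition: mean seconds}."""
    sums = defaultdict(lambda: [0.0, 0])
    for cond, secs in records:
        sums[cond][0] += secs
        sums[cond][1] += 1
    return {cond: total / n for cond, (total, n) in sums.items()}

records = [("red", 139), ("blue", 169), ("green", 156)]
print(mean_by_condition(records))  # -> {'red': 139.0, 'blue': 169.0, 'green': 156.0}
```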
Travis: Really? So, Unity hooks right into Mechanical Turk?
Evan: No, that is one of the drawbacks. As a policy, Mechanical Turk does not allow you to force people to download executable files. Unity pipes straight into our web distribution, much like Flash--all you need is a plugin. But the workaround we used was to have users navigate to a link. They would navigate away from Mechanical Turk and play our game, and we would give them a passcode if they were able to complete it. When they came back, they would enter the passcode into Mechanical Turk, and we could process the payments that way. Same for the tipping: we could look at the data for each of the users and then tip them if they were able to hit certain points.
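(The passcode workaround could be sketched like this. The HMAC scheme, secret, and worker ID below are assumptions for illustration; Evan does not say how Augur actually generated or checked its codes.)

```python
import hashlib
import hmac

# Sketch of a completion-passcode workaround: the external game issues a short
# code on completion, and the requester validates it before approving payment.
# SECRET and the HMAC derivation are illustrative assumptions.

SECRET = b"server-side-secret"

def completion_code(worker_id: str) -> str:
    """Derive the short per-worker code the game hands out on completion."""
    return hmac.new(SECRET, worker_id.encode(), hashlib.sha256).hexdigest()[:8]

def is_valid(worker_id: str, code: str) -> bool:
    """Check a code pasted back into the HIT before approving payment."""
    return hmac.compare_digest(completion_code(worker_id), code)

code = completion_code("WORKER123")   # hypothetical worker ID
print(is_valid("WORKER123", code))    # -> True
print(is_valid("WORKER123", "nope"))  # -> False
```

Deriving the code from a server-side secret means a worker cannot guess a valid passcode without actually finishing the game.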
Travis: Ah, that's a good idea.
Evan: Yeah, I had to rush through that at the end of the panel. That was one of the most interesting things about the Augur Project: suddenly stumbling upon this idea of tipping and building it into the game mechanic as a reason for players to play. Not only is it an opportunity to make a little bit of money for playing the game, it also rewards players who are playing better, right? And it does entice them to try harder and take more risks to get more reward out of it.
Travis: Mechanical Turk's weird like that, because when you look at it, a person is getting paid pennies to do these things. And you kind of wonder about the validity of it sometimes, you know? I mean, how much are the incentives really driving the behavior? But a few people have done research on it--I know someone at Indiana University actually wrote a paper looking at Mechanical Turk users and the incentives inside of it, and found that it's actually as good as a laboratory study in which you pay participants seven bucks.
Evan: Yes, I don't really see any difference. If anything, it's a convenience thing, right? If people are being inconvenienced to come to a physical location and fill out questionnaires for seven dollars, then seven bucks online is comparable, right? Logging into a website and doing this little task versus driving to the university to participate in a study. The only concern we had was self-reported information. We didn't run into a whole lot of scenarios where the data they were giving us, like age, was outrageous. Any outrageous data points, we just cut off--like if someone said they were 10 years old, 3 years old, whatever.
Travis: Yeah, you just throw those out.
Evan: Exactly, we can pick and choose as we go.
Travis: What were the other two games that you guys built?
Evan: So the second prototype that we developed was a -- what would you call it -- kind of an oversimplified investment game. The idea was that the player navigates through an environment, and their goal is to score points by standing on locations throughout it. Depending on what location they are standing on, they gain more or fewer points over the course of the game; if they're standing on the wrong platform at any given time, they end up losing points. So it's kind of like buying and selling in the stock market, boiled down to just the idea of an environment with fluctuating point values.
Travis: Right. I mean there are foraging experiments where there are different points that are outputting an amount of value and then you are tasking a group of people to go find those, and they fluctuate up and down.
Evan: Oh, okay. I didn't know about that; I would love to read more about it. So what we did was implement that, also in Unity, and players would move their avatars around. They had five minutes to score as many points as possible. This was the game where we used the tipping system. If they scored under a certain threshold, they wouldn't be awarded any money whatsoever. That threshold was very low, and the only way you could fail to make at least the base amount of money would be if you didn't move around in the environment at all. This was just to discourage people from loading it up, letting the five minutes run out, and then claiming their cash from Mechanical Turk. Then we set the point thresholds, which were measurably more difficult, based on our playtests and the math behind the game. I think we had set it to 50 points, then 100, and then 150, and as the player hit these points we would give them an additional tip on top. That one was really interesting because we had a lot of repeat players. I mean, a ton of them just kept coming back, trying it over and over again. So we got to watch this evolution of how they were learning about the game. Not all of them used the same strategy. Some of them were trying to memorize certain patterns, because we didn't randomize it--for the tipping system to work, we had to make it possible for players to reliably hit certain point values. But we were watching them learn how these different areas work, and what the overarching pattern is per period--is it high risk, high yield, or low risk, low yield. Demographically, we saw the emergence of what we call experienced players, and we had a divergence in male and female strategy. After four plays, 60% of females showed favor to the low risk, low reward area of the game. They stayed primarily in that area to earn their points. Males actually went with the more volatile spots; it was flipped.
58% of males showed favor towards the volatile green area: higher risk, higher reward, where they had to jump in and out. Trying to make it work otherwise would cost too much, and they would lose money. Both groups actually showed very little interest in the pure high risk, high yield area. So it kind of just flipped. That would be another thing we would want to explore further and see if we could change.
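(The tiered tipping scheme Evan describes--a base payment unless the player never moved, plus a bonus at each of the 50/100/150-point thresholds--might look like this. The dollar amounts are illustrative assumptions, not the real Augur payouts.)

```python
# Sketch of a threshold-based tipping scheme: base pay if the player actually
# moved, plus a bonus for each score threshold reached (50, 100, 150 points).
# BASE_PAY and the bonus amounts are assumptions for illustration.

BASE_PAY = 0.10
TIERS = [(50, 0.05), (100, 0.10), (150, 0.15)]  # (score threshold, bonus tip)

def payout(score: int, moved: bool) -> float:
    """Total payment for one play: base pay if the player moved, plus tier bonuses."""
    if not moved:
        return 0.0  # discourages idling out the five minutes for free cash
    total = BASE_PAY
    for threshold, bonus in TIERS:
        if score >= threshold:
            total += bonus
    return round(total, 2)

print(payout(120, moved=True))   # -> 0.25 (base + 50-point and 100-point bonuses)
print(payout(10, moved=False))   # -> 0.0
```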
Travis: Do you guys have a psychologist working for your team?
Evan: We actually had a psychologist come up from South America--her name escapes me--who was a visiting scholar. She came in to chat with us about what we should think about. That was where we were trying to integrate a little more psychology into it. We really thought about these kinds of things. And, you know, computers don't think about these kinds of things, which makes it very hard to create an AI opponent who is trying to predict your behavior and react against you. There's a lot of great literature on anticipation in AI, and one of the key functions of anticipation is that we as humans create internal models. If I'm playing against you in chess, and you make a move, I'm creating an internal model of what I think your goal is and what moves you'll make. Computers don't do that; they're completely reactionary. Especially in any real-time strategy game, they run on a set protocol until a condition changes. What we were interested in finding out was whether computers could start to identify the way we think, our goals, and what influences might work on us.
The last prototype was basically Battleship. This was the last of the prototypes so it didn't get as much love from us as the other two, but we got to measure some really interesting data on the deployment strategy. And it's been done before.
Travis: Yeah, that would make sense.
Evan: So what we did was visualize all the player deployments according to the likelihood of ships occupying spaces. There are distinct patterns in which spaces on the game board we are most likely to check and most likely to put our ships on. Then we also looked at individual deployment strategies, especially with unique pieces like the carrier. And we started tracking search paths: how do people tend to search--on the diagonal, are they criss-crossers, that kind of thing. And then we wanted to develop an AI opponent that would play against the player using some of these rules, trying to figure out the psychology.
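(The likelihood-of-occupancy visualization Evan describes amounts to a heatmap: aggregate many players' ship placements into per-cell counts. A minimal sketch; the placement format and board handling are assumptions, not the Augur implementation.)

```python
# Sketch: aggregating Battleship ship placements into a per-cell occupancy
# heatmap on a 10x10 board. The placement tuple format is an assumption.

BOARD = 10

def add_placement(heat, row, col, length, horizontal):
    """Increment the count for every cell one ship occupies."""
    for i in range(length):
        r, c = (row, col + i) if horizontal else (row + i, col)
        heat[r][c] += 1

heat = [[0] * BOARD for _ in range(BOARD)]
placements = [
    (0, 0, 5, True),   # a carrier along the top edge
    (0, 0, 3, False),  # a smaller ship down the left edge
]
for row, col, length, horizontal in placements:
    add_placement(heat, row, col, length, horizontal)

print(heat[0][0])  # -> 2: both ships cover the corner cell
```

Dividing each cell's count by the number of games would give the per-cell occupancy likelihood that an AI opponent could target first.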
Travis: That is interesting. One of the cofounders of our blog – Jared Lorince – he is not here, but he studies search behavior. He worked for Yahoo last summer, and he's interested in trying to figure out predictive models of how people search for information. And he was interested in social search, herding behavior, information cascades, and whether or not computers can make important decisions about what people are doing. Sort of the same thing you are thinking about, which is can we predict behavior to learn the goals and intentions of what people are searching for and then make better search queries based on that. He also does work with cognitive heuristics, looking at how people use simple rules of thumb to make decisions, a pattern searching kind of thing. I think he'd be extremely fascinated with what you do.
Evan: Well, it sounds like what he is pursuing is a much more complex version of what we were. You know, we just use a little world.
Travis: Well, those are his broad research goals; his experiments are much more like what you're doing. So, putting people into games where they're seeking out patches. You can call them "information," but really the whole game is that you're a bunny or a turtle in NetLogo or something, navigating around trying to find where the most dots are and click the most dots, competing against other turtles. That tells us a little more about how people search in a space, and then he tries to apply it to information search. He also does things with large datasets, looking at things like Flickr and actually looking at color theory. You can look at what kinds of people take different kinds of photos and what colors they're most attracted to, and use evolutionary theory to try to see if, say, females are more likely to be attracted to certain colors and take pictures of those, and whether that can be identified inside of Flickr by looking at the data.
Evan: That's a pretty interesting question.
Travis: Are you going to continue working with the insights from Augur?
Evan: The projects are very isolated. Once we deliver something, it's up to the client if they want to continue doing it. As far as I know, the Augur project ended at the end of August. Well, what month are we in right now, February, right?
Travis: Close, March.
Evan: March. Yeah. As far as I know, it ended when the project was finally delivered, and our final presentation for the client happened. That said, the information is always archived, and there are instances where projects get revisited, especially, if other institutions or other clients take a genuine interest in them.
If anybody ever wanted to enter into a partnership with the ETC with a specific goal of exploring more of what our team has built, then we would have no problem drawing on the things that were learned. We have archives that postmortem all the information that we have ever gathered, so it would involve them going through all that first, and deciding on the next step.
Travis: Let's say I know a psychologist who wrote a grant, and we're interested in getting a game developed. Do you guys have experience with doing this kind of thing? Is that something you all are open to, in the event that money could be channeled your way?
Evan: I'm certain that the ETC would be. I don't know if you're talking about us as an institution or us as graduate students.
Travis: Well, just as an institution, I think, generally. I'm just curious what sort of opportunities for collaboration there are out there.
Evan: Oh yeah, there's always opportunities. The ETC has been extremely open. They love exploring anything new. You would want to get into contact with the Director, Don Marinelli. I mentioned his name in the presentation.
Travis: I saw his picture. It doesn't ring a bell, but good to know.
Evan: He's a wonderful person. He's like the coolest guy you'll ever meet. He would be the one to talk to. They are always open for collaboration. Sometimes they do non-profits, that kind of thing, if it's not possible to have funding. I would speak with him for sure if you ever found someone who was interested.
Travis: Well Evan, this has been truly enlightening. I'm impressed with what you are doing. The ETC really seems to be a leader in this area. Thanks for your time.
Evan: No problem, thanks for asking.
Interview with Evan Brown of the Carnegie Mellon ETC. by Travis Ross, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.