Dragons and Data: Final Update


After my research and development process, I have created a program that can play a simple text-based adventure game! In order to successfully play the game, the program will make random choices when faced with a decision in order to build a dictionary of words associated with winning and words associated with losing. Then, once it has played the game enough times randomly (‘enough’ here is defined by a decision tree algorithm that I will detail later in this update), it will start to make educated guesses as to which choice will lead to a victory, using a scoring system that I discussed in my previous update. Before I get into a discussion of the nitty-gritty of my program, I’m going to provide a brief summary of my research on machine learning and natural language processing.



Machine learning, or computer learning, is an evolution of artificial intelligence closely linked to statistical analysis. Essentially, the computer trains itself on data sets to eke out patterns in the data and provide labels for new data points using these patterns. There are a multitude of different machine learning algorithmic approaches, such as using a k-nearest neighbor algorithm. This sort of algorithm groups new data points with their k (k being some integer) nearest data points on a plot, and it is generally seen as ‘lazy learning’ since it tends to generalize data too much, leading to false positives. Another type of machine learning is learning with a neural net, meant to simulate our own network of neurons. This approach, although brimming with potential in the greater machine learning field, relies on knowledge beyond the scope of my summer project, so I chose not to utilize it. I ended up using an adapted form of another type of learning: decision tree learning. This is where conclusions are made about a piece of data based on observable attributes, called ‘features.’ 

Figure 1

In my project, I utilize decision tree learning when the program is deciding between making a random guess and making an educated one. When the program reaches a branch, it looks at the previously recorded data, stored in a dictionary, and chooses one path or the other, until it reaches a leaf- a.k.a the conclusion. Although this is a simplified version of this machine learning technique, it lends itself well to the function of my program.



While the machine learning techniques I learned throughout my research are important, the most vital knowledge I gained in this process was that of Python’s Natural Language Toolkit (NLTK), a library of functions that help with natural language processing. By tokenizing and tagging words in my scripts, the program is able to filter out unimportant words (like conjunctions, prepositions, etc.) and perform analyses on the word sets, such as gathering frequency distribution data. This is important because it allows me to add features to the word sets like the average frequency of the words, or the difference in frequencies of words in both the WIN and LOSS lists. With these features, the decision tree can make more accurate judgements on when to take random guesses and when to make educated ones. My scoring system was also refined using information from the frequency distributions: words occurring in both WIN and LOSS lists with a frequency difference of less than 5 occurrences aren’t counted for points. So, neutral words and words that could be good or bad depending on the script don’t impact the score of the phrase. The constantly-updating dictionary now stores much more than just the words encountered in either WIN or LOSS paths; as Figure 2 demonstrates, the dictionary contains for each category: the words, the unique top 50 most common words and their frequencies, and the average frequency for each word. These extra features prevent over-generalization and are few enough that overfitting isn’t really a problem.

 Figure 2

Since one of the requirements for making an educated guess is the average frequencies of WIN and LOSS words must be greater than or equal to 10, it takes quite a few random runs before the computer starts making its guesses.

Figure 3

In this particular example, it took somewhere between 39 and 45 runs before it started winning every time (making an educated decision). However, this example shows the program running with a completely empty dictionary, meaning it fills up its dictionary with instances of this particular script. If the dictionary is filled as it is in Figure 2 (which was captured after it had run multiple different scripts already), the program would start by making educated decisions. In order to prevent the program from skipping the random guessing process (which is what allows it to fill its dictionary with data about the script), I implemented a failsafe of sorts. If there are words in the script that the program has never encountered before, it will make a random guess, regardless of the average word frequency or any other features. By adding a new word to the common word list, the average word frequency goes down, meaning the program must make random guesses until the frequency is back above 10. I decided to not have the program update its word list with winning words after it has made an educated decision, because I figured inundating the WIN list with words that the program already knows are WIN words will just inflate the numbers and throw the average frequency off balance.



Seeing as this is my first attempt at anything even close to this level of coding complexity, the program has some limitations. First, it is confined to working only on my scripts, which are very simple in nature and have some specific requirements. 

Figure 4

When writing a script, it was important for me to keep in mind that the computer cannot actually use intuition to make a choice. Therefore, I had to keep consistent with what kind of words I used to describe the choices the computer could make. The scripts had to be simple enough so that someone would be able to make a ‘good’ or ‘bad’ decision just by analyzing context clues in the phrase surrounding the choice. So, a script with a choice like “The CAVE, or the COAST” is not sufficient because there is no indication as to which option might be good or bad. I noticed some interesting things when testing different kinds of scripts. When running through a path that contains a choice between two options that the program might view as ‘bad,’ such as choosing between a path blocked with a spider or a dragon, and then choosing between fight or flee as combat options (with fight being a LOSS and flee being a WIN), the computer would add all of the words encountered on a path. So, ‘dragon’ and ‘spider’ would be added to both the WIN and LOSS categories since a win path might be ‘dragon’ -> ‘flee,’ and the loss path is ‘dragon’ -> ‘fight.’ Because of this, my WIN dictionary was filled with the words ‘dragon’ and ‘spider,’ meaning they might eventually be omitted from the score in a script where one of them is indicative of failure. Comparing the frequencies of words appearing in both lists helps remediate this issue slightly, since ‘dragon’ is more frequent in the LOSS category. In a script where there are both WIN and LOSS options after an initial neutral choice, the program will make that initial choice based on its own biased understanding of the choices. For example, in Figure 4, the first choice involves two equally bad options, at least in the eyes of a human player. The program ends up choosing the spider path, since the word ‘right’ appears more frequently in the WIN category since it has played many scripts with the victory being down the right path. Since the program can win by choosing the spider path or the dragon path, this causes no issue with the final result. Instead, this kind of behavior demonstrates the programmer bias that I mentioned in my previous updates. Because I happened to write scripts with the winning path on the right side of the cave, the computer began to associate ‘right’ with WINNING.



This research project was a fun and educational foray into the fields of machine learning and natural language processing. I learned quite a bit about both topics, I got to see firsthand how a computer deals with storing and compounding data, and I developed a whole suite of new coding techniques. Although my project barely scratched the surface of what it takes for a computer to learn, I was able to accomplish some form of what I initially set out to do and discover a whole new aspect of machine learning that I would love to one day put my attention towards remediating: programmer bias. It’s been a tough time, but it’s safe to say that this project was well worth the effort. I look forward to advancing my skills even further in the coming years, and I’m excited to put some of this new knowledge to practical use one day!

Provided is a link to my Github page so you can view my code and a few of my testing scripts yourself. Feel free to try and write some scripts of your own; I’d love to see if my program can play them!

Link to update #2, update #1, abstract


  1. This was a really cool project, Alex! I have done some basic python coding before, but never knew you could do something as complex as this. My question is would the program create the same algorithm each time you started with an empty dictionary? Or is there a degree of randomness? For example, is it possible that the program would still be able to develop a strategy to beat your game, but create a different approach to do so each time?

    Really great project!

  2. mjminogue says:

    Hi Alex!

    This project is fascinating and super impressive, especially for a freshman in the summer! The fact that you were able to at least begin to foray into machine learning this early in your development of technical skills really shows how much potential there is in this field. I wonder, what possible uses do you see for this technology in the future? And how long did it take for you to design this game?