Update #3 and Final Summary: Trends and Relationships Between Broadway Theater and Movie Musical Revenue

2018 Freshman Monroe Project Summary:

I decided to complete my freshman summer research project on the trends and relationships between Broadway theater and movie musical industries. I had two main goals for this research project. My educational goal was to apply what I learned in my freshman statistics & data science classes and build my capabilities in data analysis. My personal objective was to learn more about Broadway and movie musicals, especially in advance of my chance to see Hamilton at the Kennedy Center in July.

My project was conducted in three main parts: data collection, data manipulation/formatting, and exploratory data analysis. In the data collection phase, I collected data on both Broadway theater and movie musical revenue and attendance. The data came from two online databases: Broadway League and Box Office Mojo. I compiled all of the data from those sources between the years 2000 and 2017 into separate Excel spreadsheets. Although straightforward, this was by far the most tedious and time-consuming part of my research.

Next was the data manipulation/formatting phase. Here I ran quality control steps to clean the data and remove any unnecessary columns. The most important step in this phase was normalizing the Broadway and movie musical data sets to enable comparison; in particular, the time scales of the two had to be changed so that they matched.
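For anyone curious what a time-scale change like this could look like in code rather than in a spreadsheet, here is a minimal Python sketch. It is not what I actually did: it assumes, purely for illustration, that one data set is reported weekly and needs to be rolled up into calendar-year totals. The function name `yearly_totals` and the numbers are made up.

```python
from collections import defaultdict
from datetime import date


def yearly_totals(records):
    """Sum (date, gross) pairs into one total per calendar year.

    Hypothetical helper: assumes each record is a (datetime.date, float) pair.
    """
    totals = defaultdict(float)
    for day, gross in records:
        totals[day.year] += gross
    return dict(totals)


# Placeholder weekly figures, NOT real box office data.
weekly = [
    (date(2016, 12, 25), 100.0),
    (date(2017, 1, 1), 50.0),
    (date(2017, 1, 8), 70.0),
]

totals = yearly_totals(weekly)
print(totals)
```

Once both data sets share the same unit of time (here, one row per year), they can be lined up side by side for comparison.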

The last phase was exploratory data analysis. This is when I calculated summary statistics and created charts in order to identify key findings. My primary comparison was of the mean gross revenue of both industries. I also calculated linear correlation coefficients and regression (best fit) lines to gauge my confidence in the trends and extrapolate into the future. Lastly, I visually examined the data to find any interesting interactions or patterns between the two industries.
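The correlation coefficient and best fit line mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the standard formulas (Pearson's r and a least-squares line), not my actual workflow; the function names `pearson_r` and `best_fit_line` are my own, and the revenue values are placeholders rather than the project's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Least-squares regression line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder values for illustration only, NOT the project's data.
years = [2000, 2001, 2002, 2003, 2004]
revenue = [10.0, 12.0, 11.0, 14.0, 15.0]

r = pearson_r(years, revenue)
slope, intercept = best_fit_line(years, revenue)

# Extrapolate one year past the end of the data using the fitted line.
predicted_2005 = slope * 2005 + intercept
print(round(r, 2), round(slope, 2), round(predicted_2005, 1))
```

An r close to 1 means the points sit close to the fitted line, so extrapolating from it is more defensible; an r close to 0 means the line says little about any single year.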

In the end, I was able to draw three conclusions from my analysis.

  1. Movie musicals achieve higher revenues due to the far larger number of theaters in which they play.
  2. Both industries show increasing average revenues. Broadway shows a constant gradual increase in revenue with a correlation coefficient of 0.9. Movie musicals’ revenue, on the other hand, fluctuates, with a correlation coefficient of 0.2; its trend depends more on the success of each year's new releases.
  3. I did not see a relationship where one industry caused (predicted) changes in the other.

In conclusion, this research process was extremely time-consuming, and I learned that planning is key. This definitely made me want to learn more computer programming tools, so that I can conduct data manipulation and analysis more efficiently. Although I could not draw any relationships or hypotheses of causation between Broadway and movie musicals, I was able to learn about interesting trends in these industries. It made for an even more meaningful experience watching Hamilton and hoping for a movie adaptation in a few years!


  1. Hi Lauren!

    I really enjoyed reading about your work, as I found the intersection of arts and statistics fascinating. I would never have considered how these two particular fields played off of each other, and I think it's a great idea. Also, I appreciate your dedication to going through data from such a long time period! I was wondering, did you have any ideas for a reason that both average revenues are increasing overall, and do you think it could be the same reason? (Are people just more into singing?) I think that overall, for me, your research seems to be a great reminder that data is all around us if we look for it, whether or not there are trends!