Gendered Nouns – Final Update


Since my last update, I have closed the survey, finished my noun database, and sorted and analyzed my data. I closed the survey after one full month because incoming responses had slowed to a trickle.


It originally seemed like I would have quite a few respondents who had studied more than one gendered-noun language. However, I had to throw out many incomplete data sets that were impossible to categorize accurately, and I was left with only a handful. I had also initially planned a category just for current speakers of gendered-noun languages, but I ran into the same problem. I ended up using the following criteria for the categories, with the time cutoffs based on one of the longer ranges I found in studies of language retention:

  • FSG – French, Spanish, and/or German
    • native language or studied within the last 3 years
  • MNG – Monolingual/No Gender (English, Mandarin, Vietnamese, Malayalam, Japanese, Korean)
  • Other  
    • FSG+ – studied French, Spanish, and/or German within the last 4-10 years
    • influence (native language or within the last 10 years) from a non-FSG language
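The criteria above could be sketched as a small function. This is only an illustration, not the sorting procedure I actually used: the function name, the input shape (a native language plus a mapping from each studied language to years since it was last studied), the order in which the rules are checked, and the choice to treat anything studied more than 10 years ago as no influence are all assumptions.

```python
FSG_LANGUAGES = {"French", "Spanish", "German"}
# Languages listed under MNG in the criteria above.
NO_GENDER_LANGUAGES = {"English", "Mandarin", "Vietnamese",
                       "Malayalam", "Japanese", "Korean"}
KNOWN = FSG_LANGUAGES | NO_GENDER_LANGUAGES


def categorize(native, studied):
    """Assign a respondent to a category (hypothetical helper).

    native  -- the respondent's native language
    studied -- dict mapping language -> years since it was last studied
    """
    # Assumption: anything studied more than 10 years ago is ignored.
    recent = {lang: yrs for lang, yrs in studied.items() if yrs <= 10}

    # Assumption: any language outside both lists counts as
    # "influence from a non-FSG language" and takes priority.
    if native not in KNOWN or any(lang not in KNOWN for lang in recent):
        return "Other: non-FSG influence"

    # FSG: native speaker, or studied within the last 3 years.
    if native in FSG_LANGUAGES or any(
        lang in FSG_LANGUAGES and yrs <= 3 for lang, yrs in recent.items()
    ):
        return "FSG"

    # FSG+: studied French, Spanish, or German 4-10 years ago.
    if any(lang in FSG_LANGUAGES for lang in recent):
        return "Other: FSG+"

    return "MNG"


print(categorize("English", {"Spanish": 2}))   # FSG
print(categorize("English", {"French": 7}))    # Other: FSG+
```

Checking the non-FSG rule first is a design choice made for the sketch, so that a respondent with mixed influences never lands in the FSG group.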


The first document I have attached is the noun database I’ve been working on. Its main purpose in this project ended up being merely to identify the “problem languages” that assign different genders than French, Spanish, and German do, but I also found the comparisons interesting in their own right. (The two incomplete columns are for languages that were tricky to find good online dictionaries for; they were left unfinished because they belonged to data sets that were thrown out.)


To start the analysis, I compiled all the data into three groups: All, FSG, and MNG. I was hoping to be able to analyze the data from the two “Other” groups, but since there were only a few respondents in each, I wasn’t able to do anything meaningful with them. I made pie charts showing the masc/fem split for each noun in each category, which you can see in the attached “Just Graphs” PDF. When I make the poster, I will choose a handful to visually represent the different types of results (strongly masc, strongly fem, even split, etc.).
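For anyone wanting to reproduce the per-noun splits behind the pie charts, here is a minimal sketch of the tallying step. The function name, the tuple layout of the responses, and the sample noun and answers are all made up for illustration; they are not taken from my survey data.

```python
from collections import defaultdict


def feminine_share(responses):
    """Return {(group, noun): fraction of "fem" answers}.

    responses -- iterable of (group, noun, answer) tuples, where answer
                 is "masc" or "fem". Every response is also counted
                 under the combined "All" group.
    """
    counts = defaultdict(lambda: [0, 0])  # [fem count, total count]
    for group, noun, answer in responses:
        # The set avoids double counting if a row is already labeled "All".
        for key in {"All", group}:
            counts[(key, noun)][0] += answer == "fem"
            counts[(key, noun)][1] += 1
    return {key: fem / total for key, (fem, total) in counts.items()}


# Hypothetical responses for one noun.
sample = [
    ("FSG", "moon", "fem"),
    ("FSG", "moon", "masc"),
    ("MNG", "moon", "fem"),
    ("Other", "moon", "fem"),
]
shares = feminine_share(sample)
print(shares[("All", "moon")])  # 0.75
print(shares[("FSG", "moon")])  # 0.5
```

Each value is the feminine slice of the corresponding pie chart; the masculine slice is one minus that value.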

Once I was back on campus, my advisor helped me use the stats program SPSS to analyze my data. We created binary logistic generalized linear models and started by looking for trends based on the group (FSG or MNG) or the actual grammatical gender. We found no statistically significant difference between the two groups and no statistically significant relationship between the actual grammatical genders and the responses, meaning we have no evidence of an effect of learned languages on the noun/gender associations.

Next, we looked at noun characteristics (“feminine associated,” “round,” and “colorful”) that may have influenced the respondents. When we ran the model using just the FemAssoc and Round groupings, we found that FemAssoc things tend to be classified as feminine (with a p-value of 0.006), and Round things are even more likely to be classified as feminine (with a p-value of <0.001). Things that are FemAssoc AND Round are even more likely to be classified as feminine.
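To give a feel for where a p-value like these comes from, here is a simplified sketch for a single yes/no characteristic. With one binary predictor, the logistic regression coefficient equals the log odds ratio of the 2x2 table, and its Wald chi-square can be computed by hand. This is not the SPSS model we ran (which included several groupings at once), and the counts below are invented purely for illustration.

```python
import math


def wald_test_2x2(fem_with, masc_with, fem_without, masc_without):
    """Wald test for one binary predictor in a binary logistic model.

    The arguments are response counts: nouns WITH the characteristic
    classified feminine / masculine, then nouns WITHOUT it.
    Returns (coefficient, Wald chi-square, p-value).
    """
    # The coefficient is the log odds ratio of the 2x2 table.
    coef = math.log((fem_with * masc_without) / (masc_with * fem_without))
    # Standard error of a log odds ratio.
    se = math.sqrt(1 / fem_with + 1 / masc_with
                   + 1 / fem_without + 1 / masc_without)
    wald = (coef / se) ** 2
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(wald / 2))
    return coef, wald, p_value


# Hypothetical counts: 60 fem / 40 masc with the trait, 40 / 60 without.
coef, wald, p = wald_test_2x2(60, 40, 40, 60)
print(f"coefficient={coef:.3f}, Wald={wald:.2f}, p={p:.4f}")
```

A positive coefficient means nouns with the characteristic were classified as feminine more often, and a larger Wald value corresponds to a smaller p-value.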

Once we added in the Colorful grouping, the results shifted a bit. Colorful nouns were found to be very likely to be classified as feminine (with a p-value of <0.001). Since most of the FemAssoc nouns were colorful, once the model included the Colorful grouping, merely being FemAssoc was no longer a statistically significant factor. Round nouns were still statistically significantly classified as feminine. With this most complete model, we compared the strength of each factor using its Wald chi-square value, which was much higher for Colorful nouns (~103) than for Round (~13) or FemAssoc (~11).


A quick non-numerical summary of my findings: I found no evidence of influence from study of foreign languages on noun-gender associations and no evidence of any inherent genders of nouns that would be recognized by speakers of languages without gendered nouns. I found that round nouns, and to a much greater extent colorful nouns, are more likely to be classified as feminine, regardless of their actual grammatical gender.


While putting together my poster, I plan to assemble my blog posts, emails with my advisor, and personal notes into a technical report that can be referenced if this project is taken further in the future.


Noun Database

Just Graphs