SAT Strategy by Gender: Men Guess, Women Leave it Blank


To guess or not to guess?  Most students wrestle with this question at least once during their multiple choice test-taking years. A new paper by Harvard economics grad student Katherine Baldiga examines whether men and women approach the issue differently. From the abstract:

“In this paper, we present the results of an experiment that explores whether women skip more questions than men. The experimental test consists of practice questions from the World History and U.S. History SAT II subject tests; we vary the size of the penalty imposed for a wrong answer and the salience of the evaluative nature of the task. We find that when no penalty is assessed for a wrong answer, all test-takers answer every question. But, when there is a small penalty for wrong answers and the task is explicitly framed as an SAT, women answer significantly fewer questions than men. We see no differences in knowledge of the material or confidence in these test-takers, and differences in risk preferences fail to explain all of the observed gap. Because the gender gap exists only when the task is framed as an SAT, we argue that differences in competitive attitudes may drive the gender differences we observe. Finally, we show that, conditional on their knowledge of the material, test-takers who skip questions do significantly worse on our experimental test, putting women and more risk averse test-takers at a disadvantage.”

Baldiga’s results might help explain why women often do better in college than their SAT scores would have predicted and raise an important question: Are multiple-choice test scores the best way to fairly “measure aptitude and forecast future achievement”?  Readers, what do you think?  Are SAT tests gender-biased?  Of course, whether or not such gender differences are innate or cultural is a whole other research question.


(HT: Market Design)



In my opinion, multiple-choice tests are always less accurate (than, say, oral exams, essays, or just plain open questions) and definitely not the best way to measure aptitude or future performance (especially at an age where cognitive development is far from finished). However, they are the only way to (1) check the answers of a large volume of both people and questions, (2) (practically) pose questions to a large group of people, (3) compare answers and results, and (4) (you could see this as part of (3)) make sure that both the questions (oral exam) and answers (open questions) are the same between different participants.


To summarize: Yes, multiple-choice test scores are the best way to fairly “measure aptitude and forecast future achievement” efficiently. Even if other systems might produce more accurate measures, most would cost more and lack the appearance of objectivity.


On last year's AP test, the College Board removed wrong-answer penalties because "it made no difference in terms of final score one way or another".

What if the gender differences are what account for that lack of improvement? If that's the case, then yes, the test is biased.

Corban Saezer

My father always told me that, even if I'm smart, if I don't work hard and take risks then I'll never get anywhere.

I took his lesson to heart.

Therefore, when I see smart people not taking risks, I treat them with the same contempt as I do for my past self. So it is for the men; so it is for the women.


If it is important in business and a career to be a good decision maker with incomplete information, then this kind of test is a good way to measure aptitude.

This might help explain differences in wages between genders. As a CEO, bad answers or decisions have penalties, but an answer or decision must be made.


Interesting thought!

A classic complaint about standardized tests is that they FAIL to measure the most important qualities, such as the ability to make a decision with imperfect information. People with skills at solving games and puzzles – that is, artificially constructed challenges that typically have an unambiguously right answer – are not the same people who have the skills at finding optimal strategies where there is no unambiguously right answer. The real world requires the latter aptitude much more often than the former. Yet, because standardized tests are supposed to be “objective,” they ask questions that are supposed to have one unambiguously right answer. By their structure, these tests necessarily focus on some types of aptitudes at the expense of other, arguably more important, aptitudes.

But now bruteostrich observes that these tests reward individuals who can make accurate guesses, even when the individual has inadequate information. Perhaps this quality partially offsets the deficit I noted above. I’m going to have to mull this one over….


Mike B

The strategy of guessing vs not guessing is well defined on the SAT and there is no reason that every person taking it should not adopt that strategy. The SAT and tests like it get a lot of flak because they aren't the best proxy for intelligence or knowledge, but on the other hand a high-stakes test is a good measure of performing under pressure and, in this case, risk taking and strategic thinking.

The measure of a successful person not only comes from their knowledge, but also their personality. The reason that the best and brightest are not always the best compensated stems from the fact that you need other skills in addition to just knowledge or intelligence. People need to perform under pressure, be comfortable with risk and know how to think strategically. These are all things the SAT seems to test and we should encourage it. If women tend to do poorly on the SAT we shouldn't bemoan the fact that the test is biased, but instead find ways to get women to do better. If women won't take calculated risks with guessing vs leaving something blank, then who is to say they won't similarly fail to take proper risks in life? For example, studies show that part of the wage gap comes from women failing to ask for raises or promotions. If they learned to be more aggressive when studying for the SAT perhaps this problem could be mitigated.



While I think you raise some valid points in abstract, it's not too big a step from your opinions to blaming women themselves for their smaller wages. It's important for us to create a culture where workers (male or female) are equally rewarded for good performance regardless of whether they aggressively seek those rewards. Similarly, if a little thought will result in a differently presented SAT that more accurately measures aptitude, why not take that into account?

Mike B

Don't mistake the example for a defense of wage inequality; it just shows how shrugging something off as "bias" could have real-world consequences later in life. As much as I would love to have a more meritocratic society, unless everyone becomes autistic there is going to be a strong social component in almost everything humans do. I think that scholastic aptitude can be viewed to include such social components as performance under pressure and risk taking. A car can have a huge engine, but it's not going to go anywhere if it can't put its power down to the road. You can have a lot of knowledge aptitude, but if you can't ever apply it you might as well not exist.


I don't think this should make a big difference in test results. I am not familiar with the SATs (I am not an American), but in general, penalty marks for wrong answers in multiple choice tests are designed in such a way that random guessing will have zero net effect on the final score. For example, if there are 5 options per question and each correct answer carries 1 mark, then a wrong answer would have a penalty of -0.25.

I am a male, and I consider myself quite risk-averse. I also consider myself generally good in multiple-choice tests. My strategy when I don't know the answer to a question is to guess an answer only if I am reasonably sure that at least one of the options is wrong: this way, the expected value of my gain from the random (among the remaining options) guess is more than zero. Of course, some tests do not disclose what the penalty scheme for wrong answers is: but unless the test is also designed to screen risk-averse or risk-seeking traits, the penalty marking would always be designed this way.
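The arithmetic behind this strategy can be checked with a short sketch. It assumes the scheme described in the comment above (5 options, 1 mark for a correct answer, a 0.25 penalty for a wrong one), and it assumes that any option the test-taker rules out really is wrong. The function name `expected_guess_value` and its parameters are made up for illustration.

```python
from fractions import Fraction


def expected_guess_value(options=5, eliminated=0,
                         reward=Fraction(1), penalty=Fraction(1, 4)):
    """Expected marks from guessing uniformly among the options not ruled out.

    Assumes every eliminated option is truly wrong, so the correct
    answer is always among the remaining ones.
    """
    remaining = options - eliminated
    if remaining < 1:
        raise ValueError("at least one option must remain")
    p_correct = Fraction(1, remaining)
    return p_correct * reward - (1 - p_correct) * penalty


# Expected gain per question for 0 to 3 eliminated options.
for k in range(4):
    print(k, expected_guess_value(eliminated=k))
```

Under these assumptions a blind guess is worth exactly zero on average, and ruling out even one option makes the expected gain positive. The sketch only covers expected value; it says nothing about how much variance a risk-averse test-taker is willing to accept.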

Anyway, I don't think the apparent difference in random-guessing patterns of males and females will make much of a difference in multiple-choice tests. Males would not gain more marks by more random guessing; and females would not lose any marks by not doing random guessing.


Eric M. Jones.

Hey, not making a decision is a decision too! So what's the problem?

I wonder about the bias of the subjects since they all seemed to be headed for college. What about other groups like Arabs or Japanese?

I like Charlie Brown's T-F test strategy: The first question is T to start off optimistically, the second is F to break up the pattern. The third is F to break up the pattern, the fourth is true...etc.


Suppose there wasn't a gender effect. But, suppose humans have two personality types: type R, which likes to take risks, and type S, which likes things secure. Suppose type Rs tend to guess, and type Ss leave questions blank. So type Rs do better on the SAT than type Ss.

Is the SAT unfairly discriminatory against type Ss? What should we do about that?

Mike B

If Type R people tend to do much better in life and, by extension, in school, it would be rational for a school to select toward Type R applicants and, moreover, more rational for society to encourage people to adopt Type R behaviors.


But type R’s do better because our society is structured to reward risky behavior. Then you have to ask the question is it better for society to have type R’s in charge? Or should we do more to try and reward type S’s and type S behavior? I would argue that the financial sector, at least, would function better if more type S folks were making the decisions.

alex in chicago

Here is a simple contradictory postulate.

#1. Being able to read instructions on the test is another skill.
#2. Instructions contain the information that will tell you the penalty for wrong answers.
#3. Being able to understand that guessing will improve your performance is a simple calculation that even a fourth grader could do.

Therefore, not guessing is evidence of inferior knowledge or otherwise an inability to connect one's simple knowledge to a larger scheme.

Full Disclosure: 35 ACT, Five '5's on AP exams (5 taken), 170 LSAT.


Sure- I can recall being coached on the 'educated guess': if you can eliminate one or more of the answer choices as wrong, it pays to guess, etc. This is most interesting- are those who are failing to recognize that a guessing strategy can improve your score victims of bias, or is the test recognizing and rewarding those who do- or those who use guessing without the knowledge of its advantage, for that matter? Is this a skill we're trying to identify as aptitude, or is it an unfortunate skew from this type of testing?

caleb b

While I understand the necessity of standardized tests for college admission, a major problem with the process is that the preparation for the exam is not standardized.

Some kids, from middle and upper class households, receive tremendous test preparation which can cost hundreds (or even thousands) of dollars. Children from lower class households, generally, cannot afford such a luxury. So Richie Rich goes into the test having prepared for weeks for every type of question that could possibly be asked, while Leon Latchkey takes the test cold. Obviously, Leon isn't going to score as high as Richie.

Richie now receives a full-ride merit scholarship based on his score, while Leon does not. Sure, Leon probably qualifies for a Pell Grant, but a Pell Grant only covers basic tuition. Therefore, Leon, who was already at a disadvantage, continues to be at a disadvantage once in college.

Granted, the issue is more complicated than just that, but current standardized tests ignore this aspect.

Full Disclosure: HS GPA 3.2, ACT 25 No Test Prep, Pell Grant – no other scholarship, College GPA 4.0


Ian M

Men are more likely to guess = gender bias in testing? How so?

The penalty for guessing is often 1/4 of a mark for a 5 choice question. One should ALWAYS guess if they are confident that at least one of the answers is certainly wrong.

Jeff F

Actually, Ian, you should always guess if you plan on taking the test more than once. If you take it twice, guessing will give your scores a higher standard deviation, and most schools only look at the highest score you get (at least that is what the schools I applied to stated). Yes, you may get a lower score by guessing, but probabilistically the highest score you get will be higher (if you take it more than once).
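A small exact calculation illustrates the best-of-two argument. It is only a sketch under assumptions not stated in the thread: 10 questions the test-taker has no idea about, 5 options each, a 0.25 penalty per wrong answer, two independent sittings, and only the higher score counting. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def score_distribution(questions=10, options=5, penalty=Fraction(1, 4)):
    """Exact distribution of total marks from blind-guessing every question."""
    p = Fraction(1, options)
    dist = {}
    for correct in range(questions + 1):
        # Binomial probability of getting exactly `correct` guesses right.
        prob = comb(questions, correct) * p**correct * (1 - p)**(questions - correct)
        score = correct - penalty * (questions - correct)
        dist[score] = prob
    return dist


def expected_best_of_two(dist):
    """Expected value of the higher of two independent draws from dist."""
    return sum(max(a, b) * pa * pb
               for a, pa in dist.items()
               for b, pb in dist.items())


dist = score_distribution()
expected_single = sum(s * p for s, p in dist.items())
expected_best = expected_best_of_two(dist)
print(float(expected_single), float(expected_best))
```

In this setup a single sitting of blind guessing averages zero marks, the same as skipping, but the expected higher of two sittings is above zero, whereas skipping yields zero both times. The result depends entirely on the assumption that only the best score is considered.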


Joshua Northey

Men are less risk averse than women? You don't say...

Coming up at 6 o'clock: Water is Wet, a riveting exposé!