The Art of SATergy

My son took the SSAT exam this past Saturday. And while I was sitting in the Choate athletic facility waiting for him to finish, I remembered that Avinash Dixit and Barry Nalebuff’s new book, The Art of Strategy, has a great example concerning standardized testing. Game theory is so powerful it can help you figure out the correct answer without even knowing what the question is.

Consider the following question for the GMAT (the test given to MBA applicants). Unfortunately, issues of copyright clearance have prevented us from reproducing the question, but that shouldn’t stop us.

Which of the following is the correct answer?

a) 4π sq. inches

b) 8π sq. inches

c) 16 sq. inches

d) 16π sq. inches

e) 32π sq. inches

O.K., we recognize that you’re at a bit of a disadvantage not having the question. Still, we think that by putting on your game-theory hat you can still figure it out.

Before reading their analysis, take a shot at trying to reason your way to the correct answer.

Here’s what they said:

The odd answer in the series is c. Since it is so different from the other answers, it is probably not right. The fact that the units are in square inches suggests an answer that has a perfect square in it, such as 4π or 16π.

This is a fine start and demonstrates good test-taking skills, but we haven’t really started to use game theory. Think of the game being played by the person writing the question. What is that person’s objective?

He or she wants people who understand the problem to get the answer right and those who don’t to get it wrong. Thus wrong answers have to be chosen carefully so as to be appealing to folks who don’t quite know the answer. For example, in response to the question: “How many feet are in a mile?” an answer of “Giraffe,” or even 16π, is unlikely to attract any takers.

Turning this around, imagine that 16 square inches really is the right answer. What kind of question might have 16 square inches as the answer but would lead someone to think 32π is right? Not many. People don’t often go around adding π to answers for the fun of it. “Did you see my new car — it gets 10π miles to the gallon.” We think not. Hence we can truly rule out 16 as being the correct solution.

Let’s now turn to the two perfect squares, 4π and 16π. Assume for a moment that 16π square inches is the correct solution. The problem might have been: “What is the area of a circle with a radius of 4?” The correct formula for the area of a circle is πr². However, the person who didn’t quite remember the formula might have mixed it up with the formula for the circumference of a circle, 2πr. (Yes, we know that the circumference is in inches, not square inches, but the person making this mistake would be unlikely to recognize this issue.)

Note that if r = 4, then 2πr is 8π, and that would lead the person to the wrong answer of b. The person could also mix and match and use the formula 2πr², and hence believe that 32π or e was the right answer. The person could leave off the π and come up with 16 or c, or the person could forget to square the radius and simply use πr as the area, leading to 4π or a. In summary, if 16π is the correct answer, then we can tell a plausible story about how each of the other answers might be chosen. They are all good wrong answers for the test maker.
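The mapping from mistakes to answer choices can be checked mechanically. The following is a minimal Python sketch, not anything from the book: it assumes the question really was about a circle of radius 4, and the function names (`candidate_answers`, `match_choices`) and the formula labels are made up for illustration. Each result is stored as a (coefficient, has_pi) pair, so (16, True) stands for 16π.

```python
# Each value is a (coefficient, has_pi) pair, so (16, True) means 16π.
def candidate_answers(r):
    """Return what each formula from the passage gives for radius r."""
    return {
        "area, pi*r^2 (correct)": (r * r, True),
        "circumference, 2*pi*r": (2 * r, True),
        "mix and match, 2*pi*r^2": (2 * r * r, True),
        "pi left off, r^2": (r * r, False),
        "radius not squared, pi*r": (r, True),
    }

# The five answer choices as listed in the post.
choices = {
    "a": (4, True),
    "b": (8, True),
    "c": (16, False),
    "d": (16, True),
    "e": (32, True),
}

def match_choices(r):
    """Map each answer letter to a formula that produces it for radius r."""
    by_value = {value: name for name, value in candidate_answers(r).items()}
    return {letter: by_value.get(value) for letter, value in choices.items()}

# Assumed radius of 4: every choice should be explained by some formula.
for letter, formula in match_choices(4).items():
    print(letter, "<-", formula)
```

With a radius of 4, every letter is matched to one of the formulas, which is the "plausible story for each wrong answer" the authors describe.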

What if 4π is the correct solution (so that r = 2)? Think now about the most common mistake: mixing up circumference with area. If the student used the wrong formula, 2πr, he or she would still get 4π, albeit with incorrect units. There is nothing worse, from a test maker’s perspective, than allowing the person to get the right answer for the wrong reason. Hence 4π would be a terrible right answer, as it would allow too many people who didn’t know what they were doing to get full credit.
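The test maker's worry here, a wrong formula landing on the right answer, can also be sketched in a few lines of Python. This is an illustrative check under the same assumption that the question asks for the area of a circle of radius r; the helper name `accidental_matches` is hypothetical, and results are again stored as (coefficient, has_pi) pairs.

```python
def accidental_matches(r):
    """List the mistaken formulas that reproduce the correct area for radius r."""
    correct = (r * r, True)  # pi*r^2, stored as (coefficient, has_pi)
    mistakes = {
        "circumference, 2*pi*r": (2 * r, True),
        "mix and match, 2*pi*r^2": (2 * r * r, True),
        "pi left off, r^2": (r * r, False),
        "radius not squared, pi*r": (r, True),
    }
    return [name for name, value in mistakes.items() if value == correct]

print(accidental_matches(2))  # the circumference formula also gives 4π
print(accidental_matches(4))  # no mistaken formula gives 16π
```

A radius of 2 lets the circumference mix-up earn full credit, while a radius of 4 does not, which is why 16π is the safer right answer from the test maker's side.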

At this point, we are done. We are confident that the right answer is 16π. And we are right. By thinking about the objective of the person writing the test, we can suss out the right answer, often without even seeing the question.

Now, we don’t recommend that you go about taking the GMAT and other tests without bothering to even look at the questions. We appreciate that if you are smart enough to go through this logic, you most likely know the formula for the area of a circle. But you never know. There will be cases where you don’t know the meaning of one of the answers or the material for the question wasn’t covered in your course. In those cases, thinking about the testing game may lead you to the right answer.

If you want a fun way to learn a ton of useful game theory, this is the book for you. How good is it? Steve Levitt has a blurb on the book saying it’s so good, he read it twice.


sarahmas

mccxxiii, thank you for saying what I was about to post in my own comment. No one was suggesting test takers skip the questions and deduce the answers using Game theory! It was just an interesting exercise that also happens to point out a major shortcoming of standardized tests. I fully believe that the only thing a high score on a standardized test shows is that you are good at taking standardized tests.

Pinky

"Think of the game being played by the person writing the question. What is that person's objective?

He or she wants people who understand the problem to get the answer right and those who don't to get it wrong."

But what if the objective is the one stated above AND to prevent test takers from deducing the answer using game theory?

It's a lot more complicated then, and it's not too much of a stretch to say that test-makers are sophisticated enough to know what techniques are used to guess answers.

In other words, what if the test-maker deliberately chose those answers choices when the correct answer really was 16?

X

This problem would be significantly complicated if we imagined that the test-maker is at least as competent as the test-taker. A savvy test-maker should anticipate the authors' strategy and construct appropriate solutions. (One possible response would be to randomly sprinkle in solution sets that reverse-engineer to the wrong answer.)

Peter Brady

I got it right, though only after narrowing it down to A or D using the same logic as the analysis and then simply guessing D. These are useful strategies, but they would not work on many of the IT certification tests I have taken in the past, where multiple choices may be correct with no indication in the question as to how many. Single-answer multiple-choice tests are much easier to B.S. through.

X

Just saw Pinky's comment above - same point...

Eric M. Jones

I always had a problem with tests...

The opposite of BLUE is ______
The opposite of BUY is _______
The opposite of EVAPORATION is_______
The opposite of BLACK is ______
The opposite of REFRIGERATOR is _______

Colors don't have opposites. I wish BUY and SELL were opposites, the stock market would be much simpler. The opposite of EVAPORATION is NO EVAPORATION???? Nouns never have opposites. I'm very Yin-Yang. There are obverses, reverses, inverses, negatives, mirror images, converses and complementaries....

I always had problems with true and false questions.

TRUE or FALSE: The Earth is getting warmer......ahhhhhh ......This day, week, month, century, millennium, million years, a trillion?....there seemed no easy dividing line.

When I took my FAA Pilot Certificate test, the examiner gave me a written question to show my comprehension of flight planning: "If you have your Cessna 150 in Santa Monica, and gold bars are free in Barstow, write a flight plan to fly to Barstow and bring back the maximum weight of gold to Santa Monica in one flight."

I thought about this for a few moments and began writing my flight plan: The trip to Barstow would be very standard, although I elected to take only the bare minimum fuel. Arriving in Barstow, I would remove all the radios, battery, passenger seats, propeller hub, streamlining, landing lights and their wiring, beacons, various details of the interior, and the doors, and basically strip the airplane down to the lowest possible weight except for what was legally required for the aircraft's certification. Then I would fuel up to the absolute minimum required to get to the next refuelling stop (usually only a few air miles and maybe a gallon or less).

Then I would hire the lightest-weight Barstow pilot I could get (the examiner didn't say I had to fly...I'd drive home), and would properly load 876 pounds of gold into this tiny airplane. The airplane would only climb to minimum legal altitudes, the fuel stops would be many, utilizing private air-strips and off-field, but you just couldn't do better. Phenomenal, brilliant, amazing--I'd be rich!

The FAA Flight Examiner read my elaborate flight plan, looked up at me and scowled, "That's NOT what I meant....!"

LATIN TEST: "I wrote my name at the top of the page. I wrote down the number of the question 'I'. After much reflection I put a bracket round it thus '(I)'. But thereafter I could not think of anything connected with it that was either relevant or true. Incidentally there arrived from nowhere in particular a blot of ink and several smudges. I gazed for two whole hours at this sad spectacle: and then merciful ushers collected my piece of foolscap with all the others and carried it up to the Headmaster's table."
--Winston Churchill (1874-1965)


Conor - Ireland

Aghhhhh... this post really gets under my skin... In Ireland we rarely have these standardized tests, and generally they only have to be taken by MBA and Masters in Finance students.

However, I am currently going through the pain that is applying for a PhD in economics in the US, and I had to take the GRE exam, which was my first experience of one of these tests...

All I can say is that with very little prep I scored 780 on the math section, and I guessed at least 5 of the 28 questions because of time restrictions. I used the principle of 'most likely answer' discussed above... So it does work!!!

jim

When I was taking the beginning actuarial exams a long time ago, this would not have worked. They actually figure out where people make common mathematical mistakes and put those in as possible answers. So you actually have to work the problems out. Since it is a timed test, part of the strategy is to answer the ones that take the least amount of time first.

Jim Cooper

It's amazing--and fascinating--how much of our learning can take place during the test-taking process. A related but more nebulous question is: "Can you 'peak' too early in studying?" Is it best to go into an exam 95% confident, leaving room for figuring out the last bit of subject matter "under fire" and without falling prey to sloppiness through over-confidence or fatigue? This is often a rationalization for late cramming, but I've heard it seriously advocated. (I suspect it may be better for social sciences than hard sciences, with no slight intended to either.)

As a sometime test-maker myself, I know that time pressure can also help cut down on "gaming" exams. I was always impatient with professors who utilized it, but I think they were on to something.

Rachael, Anna and Mike all made the same great point about finding the answer with the most commonality to the others. Without ever taking high school physics, I got a 74th percentile score on a SAT/ACT physics subject exam using this kind of approach. (I had only taken a textbook home in 8th grade years before. FWIW, several years later, when I was an engineering undergrad, I flunked physics the first time--a savvy test-maker, maybe?--and then was in the top 3 of my class for the 2-semester freshman physics sequence at Arizona State.)

Btw, thanks much for the Up Your Score book rec! That will be a Christmas present for my daughter (currently at Lawrenceville).


Troy Camplin

No, Adam, it's not BS filler. Doing this allows one to know which formula to use. If we have the question, "What is the area of a circle with r=4" then the questioner expects us to come up with the formula for the area of a circle. This approach can give us an idea of both the right answer and the missing part of the question that we were expected to fill in ourselves.

Mo

The notion that game theory can help us infer things in situations makes sense...we do it all the time. But, the reality is that this is a ridiculous example. They should have found something better to make their point.

Craig DeForest

These types of techniques are useful not just for multiple-choice questions but also for fill-in-the-blank questions -- especially if multiple questions exist on the same topic. Generally, people don't think about what information they're giving away (for elsewhere) when posing each question.

I took my undergraduate physics qualifying exam (at Reed College) before taking statistical mechanics, and there was enough information in the stat-mech / thermal-physics section to glean the answers to most of that section, so I aced that section completely cold. It was 20 years ago now, so I don't remember all the details any more -- but if I recall right the fundamental ideas were that entropy is the logarithm of the state function, and that temperature is the derivative of the system's energy with respect to its entropy. Knowing those two facts was enough to answer all the questions in the section...


htb

Kevin, I went through the same sort of logic, and it took me about ten seconds. If you're used to this kind of thinking, it's really not a long and drawn out process.

Cris

I am with Anna @ #4...

Sam

I got the right answer but I just took a class where this line of reasoning didn't work at all. I somehow still managed an A.

Ruth

I always found it helpful to skim through the questions before I began answering them. It put my brain in gear and sometimes one question revealed answers to another.

I understand that some of these standardized tests are beginning to be given on computers, and that the computer setups do not allow test-takers to read questions ahead or go back and review or correct questions.

I don't mind tests with sections, but I don't like having tests where I can't go through and solve the questions that are easy for me and then return to the more difficult ones. Even if the test puts the "easy" questions first, there may be some harder questions that are easier for me to solve because of particular facts or abilities I have.

jacob

i discounted the smallest and largest answers, leaving me with 8 pi, 16 pi and 16 pi - Too bad my eyesight isn't what it used to be, or that I forgot how to read after staring at a computer all day. I wonder if i would have answered correctly if i was able to read the question.

p.s. i scored a 780 on the SAT math about 10 years ago.

Owen

I got 0.349 square feet.

Ian McKay

A good test taker might just guess if they don't know the answer rather than taking too much time to reason over the 'best' guess (assuming it is not so simple as only one having both 16 and pi). From the level of the question, I would guess there are only about 30 seconds for each question. The over-analysis may lead to not having time to answer 3 other questions.

On another note, I once had a philosophy exam that was so poorly written that it had the same question twice. They were fairly far apart in the exam but I think many if not most people would recognise a repeated question. Here's the kicker, there was only one common answer between them!

Bird course indeed.

TerryW

Multiple-choice test answers seem to be often designed in a way that you can eliminate it down to two of the answers. Clearly, 16 and 16pi, because of the small difference between them that indicates a "trap". Then, the similarity with the others - the pi - indicates 16pi is right.

It's similar to when the answers include two of the same answer, but change only one word, or some ordering- typically something like "John is taller than Jill" v. "Jill is taller than John". Oftentimes, one of those answers is right.