Another Case of Teacher Cheating, or Is It Just Altruism?

From the results of the high-school “maturity exam” in Poland (courtesy of reader Artur Janc), comes this histogram showing the distribution of scores for the required Polish language test, which is the only subject that all students are required to take, and pass.

Not quite a normal distribution. The dip and spike that occur at around 21 points just happen to coincide with the cut-off score for passing the exam. Poland employs a fairly elaborate system to avoid bias and grade inflation: removing students’ names from the exams, distributing them to thousands of teachers and graders across the country, employing a well-defined key to determine grades. But by the looks of these results, there’s clearly some sort of bias going on.

Compare that to the results of the “advanced” Polish language exam, which is taken in addition to the basic level exam by about 10% of students. It has no influence on whether students pass or fail the exit examination, so there’s no incentive to grade inflate, as evidenced by the clean distribution.
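
The dip-and-spike pattern is easy to reproduce with a toy simulation. The sketch below is not based on the actual exam data: the mean, spread, 70-point maximum, three-point "near miss" window and 70% bump probability are all assumed values chosen for illustration, and only the cutoff of 21 comes from the histogram above.

```python
import random
from collections import Counter

def simulate_scores(n_students, cutoff=21, max_score=70, mean=35.0, sd=11.0,
                    bump_window=3, bump_prob=0.7, seed=0):
    """Return a Counter of final scores after hypothetical near-miss bumping.

    All distribution and bumping parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_students):
        # "True" score: a rounded normal draw clipped to the valid range.
        score = min(max_score, max(0, round(rng.gauss(mean, sd))))
        # A lenient grader lifts a near miss up to exactly the cutoff.
        if cutoff - bump_window <= score < cutoff and rng.random() < bump_prob:
            score = cutoff
        counts[score] += 1
    return counts

counts = simulate_scores(100_000)
# Text histogram around the cutoff: a dip just below 21, a spike at 21.
for s in range(16, 26):
    print(f"{s:2d} {'#' * (counts[s] // 50)}")
```

Setting `bump_prob=0` in this sketch removes both the dip and the spike, which is roughly what the advanced-level histogram looks like.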


Artur writes:

I’m quite sure there is nothing to be gained for the graders/districts if they pass a student with a borderline score (at the basic level), rather than failing him/her. So my take on this is that graders just didn’t want to fail some kids and seriously hurt their college prospects and/or make them re-take the exam when the score was close to the cutoff.

So, is that pure altruism on the part of the teachers? Or do they actually have some bit of national incentive to see students go on to college? One could probably ask the same questions of school officials in Atlanta.


Alex C

Perhaps the minimal passing criterion is just very well defined.


It is possible that, since teachers like to see their own students do well for obvious reasons, this feeling often transfers to students in general. That would make it a sub-conscious incentive rather than pure altruism.


The grading system isn't explained in detail above, but what I gathered was that the teachers were NOT marking their own students' papers... or there is a very low chance of that happening, since the papers were distributed to so many teachers and graders. So what would the "sub-conscious incentive" be to inflate the grades for a random student?


Hello, I'm glad that something from Poland has come to Freakonomics, too :)
When it comes to the incentives behind such an interesting distribution, some background should be added:

This is the Polish language exam, which consists of reading comprehension on a newspaper text and a discussion of a specific literary subject (in this case: compare different poems and their views on dreams, or compare the life philosophies of the heroes of two novels - all the texts were attached to the exam). The problem is that in the second task there is plenty of room to find new meanings in the poems or the novel. A student can write a beautiful and very deep analysis, but if it does not match "the key" - a centrally prepared answer base, according to which the student must notice certain things, write about certain others, etc. - there will be no points, and as a result one may not even pass the exam.
This is more a problem of measuring people's humanistic skills, which this exam tries to quantify. And since the teacher checking the exams (surely not knowing the students taking it - there is an extensive procedure to maximize anonymity) is also a Polish language teacher - surely a humanist as well - he or she can sympathize with "repressed" humanities-oriented students.



Why the high bars at either end of the advanced test score graph, though? I can see a long tail on the high side, if all scores above 40 are lumped into the 40 bar, but at zero?


It seems that some people are really dedicated and planned to take this exam well in advance (40th bin), some are completely unprepared and just go for it counting on luck (0th bin) and the rest just don't know what else to do with their life (taking into account that the distribution is almost centred on pass/fail border) ;)

Aaron Goldman

I wonder why the tails on the second graph are so large. Could it be that if the score is really low it's just easier to call it zero, and if it's very good some teachers will give a top grade more often than a point or two below top? Interesting that the top graph shows only the faintest hint of this behavior.

Joshua Northey

This just displays your ignorance of how tests work (I mean ignorance in the nicest possible way (seriously)).

The clump at 40 is due more to the way tests place a cap on what can be demonstrated than to people lazily "rounding up" (though there is a tiny bit of that too).

For example if you gave the American population a basic calculus test you might get a pretty normal distribution among HS graduates who studied liberal arts, a huge clump of zeros among HS dropouts, while other people who need to know it well for their profession will frequently get the maximum score because their ability is off the scale of the test. This is why you should always make tests really hard and why grade inflation is so pernicious. You lose your ability to discriminate between the people.

For example, on a World History AP test, a sophisticated 3 or 4 page essay and a good score on the multiple choice will get you a 5. But the scoring is 0-5, so someone who has a perfect multiple choice score and writes an amazing 20 page essay still gets a 5. This causes there to be a clump at 5, because people who would theoretically get 6s or 7s are limited to a 5.

The clump of zeros is due to a lot of students who know they will not score well not trying at all. People get in over their heads. Depending on the test, zeros may only include non-responses, or those may be counted separately.
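
The capping argument can be sketched with a small simulation. This is only an illustration under assumed numbers: ability is drawn from a normal distribution with a made-up mean and spread, a hypothetical 5% of students do not attempt the test, and everything is clipped to a 0-40 scale like the one in the advanced-exam graph.

```python
import random
from collections import Counter

def capped_scores(n, mean=22.0, sd=9.0, max_score=40, give_up_prob=0.05, seed=1):
    """Count scores when ability is clipped to [0, max_score].

    mean, sd and give_up_prob are illustrative assumptions, not exam data.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        if rng.random() < give_up_prob:
            counts[0] += 1  # student who does not attempt the test
            continue
        ability = rng.gauss(mean, sd)
        # Anyone whose ability exceeds the scale still gets max_score.
        counts[min(max_score, max(0, round(ability)))] += 1
    return counts

counts = capped_scores(50_000)
for s in (0, 1, 39, 40):
    print(f"score {s:2d}: {counts[s]}")
```

In this sketch the top bin collects everyone whose ability is "off the scale", so it ends up several times taller than its neighbour at 39 even though nobody rounds anything up.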



I guess I must not understand tests, despite having taken rather more than my fair share in my life. Depending on just how easy or difficult the test was, I would have expected either a double hump (lots of people who know it all, plus declining numbers who know almost all or make a few careless errors, then increasing again to a normal distribution of "ordinary" people), or the normal distribution shifted right.

Jon Peltier

The same phenomenon has been described for speeding tickets. For example, see figure 1 in "Speed Discounting and Racial Disparities: Evidence from Speeding Tickets in Boston".


These are the results from 2010, not 2011. The Polish Ministry of Education report contains graphs for various subjects. Philosophy looks quite similar to Polish language, but mathematics and chemistry appear not to have been tampered with. Teachers who grade the papers are required to stick to lists of suggested correct answers. These are much more vague for soft subjects, so it is easy to inflate the results.


The "normalness" of the curve is not the issue, smoothness is.


Would the sample size (N=343,000) explain the smoothness?
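
A rough way to check this is to treat each histogram bar as a binomial count and look at its relative sampling noise. In the sketch below, N = 343,000 is the figure quoted above, while the 3% share of students in a single bar is an assumed, illustrative value.

```python
import math

def relative_bin_noise(n_students, bin_share):
    """Std. dev. of a bar's count divided by its expected count (binomial model)."""
    expected = n_students * bin_share
    std = math.sqrt(n_students * bin_share * (1.0 - bin_share))
    return std / expected

# Compare a small class-sized sample with the full cohort.
for n in (343, 34_300, 343_000):
    print(n, f"{relative_bin_noise(n, 0.03):.1%}")
```

Under these assumptions a typical bar wobbles by about 1% at the full sample size, compared with roughly 30% for a sample a thousand times smaller, so a large N alone would make the curve look smooth wherever no one is nudging scores.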


It's easy... parents call and complain when their kid gets a 64 on our Regents exams in NYS, so up until this year (and I'm sure it didn't stop this year, unofficially), there was "regrading" to "search" for another point to boost those near-misses over the edge. Many people might find a problem with that, but if you take into consideration the errors built into these tests, there's no problem giving the student the benefit of the doubt. Since it only makes a real difference at the cusp of passing and failing, that's the only place you see a bump. No one cares about a 66 compared to a 65, and only the nerdiest nerds care about a 100 vs a 99, but when a kid gets a 64 instead of a 65, there are huge repercussions.

For the greater good, is it better to give the kid a point, or make him repeat the class? Statistics show kids who fail don't catch up eventually; they drop out.

Joshua Northey

Keeping in those underachievers diminishes the value of the degree certification of all the other students. There is no free lunch.

Joseph P

There are a few things you need to know when analysing this issue. The key (according to which exams are checked) is not really "well-defined". The key cannot be very precise because the test is supposed to check a student's creativity, not only pure knowledge. So it is up to the grader whether a student gets points for fine language, special merits, and sometimes even for how well the topic is developed. If the teacher who is checking the essay doesn't understand some parts of it, he won't give points for those parts.
But very often teachers want students to pass the "matura exam" (that's what we call it in Poland). Why? First of all, this exam means really nothing nowadays; it's rather easy (especially the basic one) and it's usually not a problem to get 30% on it. If someone doesn't pass, he/she can take it again after 3 months. That is only a formality and more work for those who check the tests. They are paid, of course, but apparently not enough - which is very common in Poland. What's more, students who have problems passing the basic matura exam (Polish, English or Math - these three are obligatory) usually don't go to college; they just want to pass it and get a job. So I think that teachers, who also have students taking this exam each year and kids who also go to school, just want someone else's students and kids to pass this exam if at all possible. That's why they try to see something in very bad essays which probably isn't there, explaining: "He (the student) knows that. He just didn't write it. I know that he knows that...". I'd say that it's altruism then, because they gain really nothing. There is just a widespread belief among teachers that each of them does his/her best to let the students pass.
By the way, if someone gets fewer points than expected, he can petition the regional commission to see the exam and check it himself/herself, and the test is then verified to determine whether or not it was graded properly.
That can be another reason why teachers want to give as many points as possible.