Key Words: Statistics Education, Mathematical Statistics ...



8 May 2012

Exploiting the Sudoku Syndrome

Mike Bedwell, MA, MSc

Consultant in the GMAT and Statistical Education

Ukrainian Education Center, Kyiv

Michael_Bedwell@

Abstract

This paper examines how teachers of statistics can exploit the universal popularity of the Sudoku mind-game by the use of small, simple and discrete observations. Two examples of problems that can be explicated by enumeration in two-dimensional matrices are discussed: first the calculation of standard deviations and standard error, and secondly the Kruskal-Wallis hypothesis test for three groups of not necessarily the same size.

The author argues that these examples support his contention that the fundamentals of statistical thinking can be learnt without recourse either to advanced mathematics or to electronic computation.

Introduction

Interest in the Sudoku mind-game has grown in a way few would have predicted when it first appeared in a Japanese newspaper some quarter of a century ago. Now, throughout the industrialized world, we can find people absorbed in Sudoku, even when in crowded public transport.

Most of these would probably deny any interest in statistics, yet as the instructions point out, the game requires ‘reason and logic’ rather than maths. This point is emphasized in the Wordoku variant of the game that uses letters in place of digits. So how can we teachers exploit what has been described as the ‘Sudoku syndrome’ to inspire the same enthusiasm for statistics?

First, we must acknowledge that Sudoku, like the jigsaw puzzle, is a ‘closed problem’ to which there is only one right answer; indeed, if we want to cheat, there are a number of websites that will compute that answer for us. This is in stark contrast to the inductive thinking in statistics where, as in life generally, the greater our experience, the more we are inclined to conclude that ‘all the answers are wrong – but some are worse than others’.

Yet the convoluted path to understanding statistics is paved with mathematical sub-problems that have unique solutions. Conventionally these are presented as algebraic formulae, but I want to suggest an approach that seems to appeal to many students: that of enumeration. The suggestion has been in part inspired by learning, through Hombas (2012), what Herodotus wrote over two millennia ago: namely, that as a distraction from their privations officers in the Lydian army invented ‘dice and ... such games with the exception of tables, ... which they do not claim as theirs.’ So perhaps even before Herodotus’s time the 9×9 Sudoku square matrix was already familiar?

Here I present two examples of matrices that have at least visual similarities to Sudoku: I use the first to teach the calculation of standard deviations and standard error, and the second to illustrate the Kruskal-Wallis non-parametric test for three groups of not necessarily the same size. In the classroom, I present the matrices only partially complete, leaving blank a number of cells determined by the level of my students and the time available. Readers can readily carry out this blanking process on the electronic version of this paper. I also present equations as far as possible in verbal form so as to avoid formal notations that, however necessary they may be to the practitioner, appear discouragingly convoluted to the learner.

1. Standard deviations and Standard error.

(i) The variance $s^2$

Students first consider a sample comprising the simple set of integers {1, 2, 6}, and develop their differences in Table 1. Here the sample space comprises the six cells excluding the diagonal, since we are sampling without replacement and so do not have the same two values in any one pair. Moreover the variance, from what Joarder (2009) reminds us is its fundamental definition, is:

$$s^2 = \frac{\text{sum of squares of positive differences}}{\text{size of sample space}},$$

so in the numerator only the three values greater than zero should be evaluated, and in the denominator the diagonal cells should also be ignored, so

$$s^2 = \frac{1^2 + 4^2 + 5^2}{6} = \frac{42}{6} = 7.$$
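The enumeration in Table 1 can also be checked mechanically. What follows is a minimal Python sketch and not part of the classroom material described here; the function name `variance_by_enumeration` is an illustrative assumption. It sums the squares of the positive differences, divides by the n(n-1) off-diagonal cells, and compares the result with the conventional formula in Python's standard library.

```python
from fractions import Fraction
from statistics import variance


def variance_by_enumeration(sample):
    """Sum of squared positive differences over the n(n-1) off-diagonal cells."""
    n = len(sample)
    # Positive entries of the difference matrix (one triangle of Table 1).
    positive = [xi - xj for xi in sample for xj in sample if xi > xj]
    return Fraction(sum(d * d for d in positive), n * (n - 1))


sample = [1, 2, 6]
s2 = variance_by_enumeration(sample)
print(s2)                 # 7
print(variance(sample))   # conventional (n-1) formula, for comparison
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.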

More generally, the number of cells in the sample space is $n(n-1)$, where $n$ is the sample size. Thus the teacher can avoid the difficulty of explaining the term $(n-1)$ in the denominator of the conventional textbook computational formula for $s^2$, which ironically, as Muttart (2009) points out, has been rendered obsolete by the computer itself. Invoking the computational formula requires introducing the term degrees of freedom, a tricky concept which teachers may want to avoid at this stage.
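For readers who do want the algebra, the agreement between the enumeration and the conventional formula can be sketched as follows. The symbol $\bar{x}$ for the sample mean is introduced here for illustration and is not used elsewhere in this paper.

$$
\begin{aligned}
\sum_{i<j} (x_i - x_j)^2
  &= \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \bigl[(x_i - \bar{x}) - (x_j - \bar{x})\bigr]^2 \\
  &= \tfrac{1}{2} \Bigl[\, 2n \sum_{i=1}^{n} (x_i - \bar{x})^2 - 2 \sum_{i=1}^{n} (x_i - \bar{x}) \sum_{j=1}^{n} (x_j - \bar{x}) \Bigr] \\
  &= n \sum_{i=1}^{n} (x_i - \bar{x})^2 ,
\end{aligned}
$$

because $\sum_i (x_i - \bar{x}) = 0$. Dividing both sides by $n(n-1)$ gives $\sum_i (x_i - \bar{x})^2 / (n-1)$. For the set {1, 2, 6} the left-hand side is 42 and the right-hand side is 3 × 14, so both routes give 7.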

(ii) The Standard error

As a prelude we develop the matrix in Table 2, which shows the sums of our nine pairs of observations. As before, we ignore the values on the diagonal, so the remaining values {3, 7, 8} constitute the exhaustive set of equi-probable values of the sums.

Since the mean of a pair is half the sum, and by definition the $(\text{standard error})^2$ is the variance among these means, we now develop the differences in the sums in Table 3.

We next compare the sums in a way analogous to that used for the individual observations in Table 1. Now, however, again by definition we are concerned with the means of an indefinite number of samples. So by implication we are conducting a ‘thought experiment’ in which we repeatedly resample with replacement, so all nine cells of the matrix of Table 3 – including those in the diagonal – must be respected in determining both the numerator and denominator. Thus the

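$$(\text{Standard Error})^2 \equiv (SE)^2 = \frac{\text{the variance in the mean of the 9 samples}}{\text{number of cells in sample space}},$$

and since the mean is half the sum for our sample,

$$(SE)^2 = \frac{\text{sum of } \left(\tfrac{1}{2} \times \text{difference of sums}\right)^2}{9}.$$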

Then, observing the symmetry of the matrix about the diagonal in Table 3, we calculate

$$(SE)^2 = \frac{3(0)^2 + 2(1^2 + 4^2 + 5^2)}{36} = \frac{84}{36} = \frac{7}{3}.$$

We note that this conforms to the result, associated with the central limit theorem, that the standard error is (standard deviation)/√(sample size): with $s^2 = 7$ and a sample size of 3, $s^2/3 = 7/3$.
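The steps leading from Table 2 to Table 3 and on to the squared standard error can be retraced in a few lines. The Python sketch below follows the recipe as described in this section and is offered only as an illustration; the variable names are assumptions, and collecting the sums in a set assumes, as is the case here, that no two pairs share the same sum.

```python
from fractions import Fraction
from itertools import product

sample = [1, 2, 6]

# Table 2: sums of the pairs, ignoring the diagonal (a value paired with itself).
sums = sorted({xi + xj for xi, xj in product(sample, repeat=2) if xi != xj})

# Table 3: absolute differences between the sums, all nine cells, diagonal included.
# (Rows run from the smallest sum upwards; Table 3 prints the largest sum first.)
table3 = [[abs(a - b) for a in sums] for b in sums]

# (SE)^2 as computed in the text: squared half-differences averaged over nine cells.
cells = [d for row in table3 for d in row]
se2 = sum(Fraction(d, 2) ** 2 for d in cells) / len(cells)

print(sums)  # [3, 7, 8]
print(se2)   # 7/3
```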

We further note the importance of pondering the differences between sampling with and without replacement. Although we conventionally assume the latter, sampling with replacement is a crucial element in bootstrapping. So for all that this burgeoning technique has been described as ‘computer-intensive’ (Efron, 2010), it may be not only possible but desirable to introduce it through simple paper-and-pencil enumeration.

2. The Kruskal-Wallis Test

In an earlier paper (Bedwell, 2010 Dresden) I used the marathon race to illustrate this test, which is the nearest we have to a non-parametric version of one-way ANOVA. Here I extend the discussion to include races where the group or team sizes are unequal.

More specifically, we consider cases where there are $N$ runners divided into three teams X, Y and Z, with respectively $n_x$, $n_y$ and $n_z$ $(= N - n_x - n_y)$ runners. We denote by $S_X$, $S_Y$ and $S_Z$ the sums of the individual placings in each team, and minimize the matrix size by choosing $n_x \le n_y \le n_z$ before plotting $S_Z$ as the output variable in a table of $S_Y$ v $S_X$. Table 4 shows as an example the case where $N = 6$, $n_x = 1$, $n_y = 2$ and $n_z = 3$. We denote with ‘U’ those cells that are unfeasible because the placing of one of the Team Y runners coincides with the placing of the Team X runner.
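Although the point of the exercise is that students fill in the cells by hand, a teacher may want a quick way to check a completed Table 4. The following Python sketch is one possible way to generate it, under the assumption that each row is a pair of Team Y placings and each column a Team X placing; the variable names are illustrative.

```python
from itertools import combinations

N = 6
placings = range(1, N + 1)
total = sum(placings)  # 1 + 2 + ... + 6 = 21

# One row per possible pair of Team Y placings, one column per Team X placing.
rows = []
for y_pair in combinations(placings, 2):
    s_y = sum(y_pair)
    row = []
    for s_x in placings:
        # 'U' marks an unfeasible cell: Team X's runner cannot share a placing
        # with a Team Y runner. Otherwise Team Z takes whatever score is left.
        row.append("U" if s_x in y_pair else total - s_x - s_y)
    rows.append((s_y, y_pair, row))

# Largest S_Y first, as in Table 4 (rows with equal S_Y may appear in a different order).
rows.sort(key=lambda r: r[0], reverse=True)
for s_y, y_pair, row in rows:
    print(*row, "|", s_y, "|", y_pair)

feasible = sum(cell != "U" for _, _, row in rows for cell in row)
print(feasible)  # 60
```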

Once this table has been developed, we can illustrate some basics of hypothesis testing. First, and most fundamentally, our null hypothesis is that the teams are all of the same standard, so that the differences among the results (the evidence) must be ascribed to chance. Any numerical probabilities we go on to calculate (the p-values) therefore refer to those of the evidence given the null hypothesis, much as we might wish that they were those of the hypothesis given the evidence.

Second, we need to consider by what criteria the various stakeholders in the race might judge the results. The captain of the Z team, for example, will clearly be delighted if their three runners finish ahead of those in the other teams, i.e. if $S_Z = 6$. Only three of the 60 feasible cells in Table 4 contain that value, so we might conclude that the p-value is 3/60 = 5%. But do these three cells represent equally convincing triumphs? No, not if the criterion is the maximum difference between the mean of the individual placings of the winning team and that of their nearest rival(s). In this instance this maximum occurs for the cell corresponding to $S_X = 5$, $S_Y = 10$ and $S_Z = 6$, where the difference is (5/1 - 6/3) = (10/2 - 6/3) = 3. The difference is smaller, not just for the other two cells where $S_Z = 6$, but for all 59 of the other feasible cells, so Z’s captain could justifiably claim a p-value of 1/60, or less than 2%.
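Both p-values quoted above can be confirmed by enumerating the 60 feasible cells. The Python sketch below is an illustration under one stated assumption: the captain's criterion is taken to be Team Z's lead in mean placing over its nearest rival, that is, the smaller of the X and Y means minus the Z mean. The variable names are likewise illustrative.

```python
from fractions import Fraction
from itertools import combinations

N = 6
placings = set(range(1, N + 1))

cells = []  # one entry per feasible cell of Table 4
for x in sorted(placings):
    for y_pair in combinations(sorted(placings - {x}), 2):
        z_trio = placings - {x} - set(y_pair)
        mean_x = Fraction(x, 1)
        mean_y = Fraction(sum(y_pair), 2)
        mean_z = Fraction(sum(z_trio), 3)
        # Assumed criterion: Team Z's lead over its nearest rival, in mean placing.
        margin = min(mean_x, mean_y) - mean_z
        cells.append((x, sum(y_pair), sum(z_trio), margin))

z_sweeps = [c for c in cells if c[2] == 6]      # Z takes places 1, 2 and 3
best = max(c[3] for c in cells)
winners = [c for c in cells if c[3] == best]

print(len(cells))                             # 60
print(Fraction(len(z_sweeps), len(cells)))    # 1/20
print(winners[0][:3], best)                   # (5, 10, 6) 3
print(Fraction(len(winners), len(cells)))     # 1/60
```

Changing the definition of `margin` is a simple way to let students explore how the p-value depends on the criterion chosen.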

Enthusiasts for other sports might suggest different criteria; team racing in sailing presents a particularly challenging instance. Teachers who want to stimulate ‘statistical thinking’ should hesitate before regurgitating the received textbook wisdom ‘that the right answer is the chi-sq distribution’, and should have an answer prepared for the bright student who asks ‘but are the textbooks right?’ For in the table as a whole there are no fewer than 6 cells where chi-sq has the same maximum value (of 4.29), and these do not include the cell {$S_X = 5$, $S_Y = 10$, $S_Z = 6$} just discussed.
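The count of six cells can be checked in the same way. The sketch below assumes that the chi-sq value referred to is the usual Kruskal-Wallis statistic without tie correction, H = 12/(N(N+1)) × Σ(R²/n) - 3(N+1), where R is a team's sum of placings and n its size; the function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

N = 6
sizes = {"X": 1, "Y": 2, "Z": 3}
placings = set(range(1, N + 1))


def kruskal_wallis_h(rank_sums):
    """H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), with no tie correction."""
    weighted = sum(Fraction(rank_sums[t] ** 2, sizes[t]) for t in sizes)
    return Fraction(12, N * (N + 1)) * weighted - 3 * (N + 1)


# Key each feasible cell by (Team X placing, Team Y placings).
h_values = {}
for x in sorted(placings):
    for y_pair in combinations(sorted(placings - {x}), 2):
        s_y = sum(y_pair)
        s_z = sum(placings) - x - s_y
        h_values[(x, y_pair)] = kruskal_wallis_h({"X": x, "Y": s_y, "Z": s_z})

h_max = max(h_values.values())
top = [cell for cell, h in h_values.items() if h == h_max]

print(round(float(h_max), 2))                  # 4.29
print(len(top))                                # 6
print(round(float(h_values[(5, (4, 6))]), 2))  # 3.86, the cell favoured by Z's captain
```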

In this year of the Olympics, it should not be difficult to find reasons why the teams might be of different sizes; some teachers may remember the 1988 Winter Olympics when ‘Eddie the Eagle’ was the lone UK entrant in the ski jump. He performed appallingly but, to the chagrin of his expert competitors, stole a lot of the publicity! More important for students is to stimulate discussion as to where the Kruskal-Wallis test (the formal name for the chi-sq test when applied to rankings rather than to a continuous variable) can be used in contexts other than sports. Examples are not difficult to find in the worlds of medicine and business, especially in longitudinal studies: a hospital may wish to compare three different interventions for cancer, but find that one or more subjects die from apparently unrelated causes before the survey is complete; a marketing manager comparing the effectiveness of advertising among three different journals may find that one of them inadvertently omits one of her announcements. Further, the test may be used post hoc when the results seem to correlate with some previously unforeseen variable; when examining the marathon result, for instance, a sports coach might be led to consider the influence of the age or ethnicity of the individual runners, irrespective of the team in which they had been running.

Conclusion and Discussion

Although the mental abilities exercised in Sudoku differ from those constituting statistical thinking, the process of enumeration is common to both. Completing tables where some of the values have been deliberately omitted can be used to stimulate both competition and healthy cooperation, even in the large, often unmotivated classes which we teachers are increasingly required to inspire. Further, as Petocz & Sowey (2012) have extensively argued, ‘statistics is not part of mathematics’, and we have seen that enumeration requires no more than simple arithmetic. Neither does it call for IT skills, the incorporation of which into statistics teaching I have long contended distracts rather than helps (Bedwell, 2009, 2010).

Table 1. Differences $x_i - x_j$ between the Set of Observations {1, 2, 6}

| $x_j$ = 6 | -5 | -4 | 0 |
| $x_j$ = 2 | -1 | 0 | 4 |
| $x_j$ = 1 | 0 | 1 | 5 |
| | $x_i$ = 1 | $x_i$ = 2 | $x_i$ = 6 |

Table 2. Sums of Observations in Table 1, = 2 × (means)

| $x_j$ = 6 | 7 | 8 | 12 |
| $x_j$ = 2 | 3 | 4 | 8 |
| $x_j$ = 1 | 2 | 3 | 7 |
| | $x_i$ = 1 | $x_i$ = 2 | $x_i$ = 6 |

Table 3. Differences in sums in Table 2

| 8 | 5 | 1 | 0 |
| 7 | 4 | 0 | 1 |
| 3 | 0 | 4 | 5 |
| Sum | 3 | 7 | 8 |

Table 4. The Marathon: 1 Runner in Team X, 2 in Team Y, and 3 in Team Z

| Team Z’s scores $S_Z$ (U = unfeasible) | | | | | | Team score $S_Y$ | Individual placings (Team Y) |
| 9 | 8 | 7 | 6 | U | U | 11 | 5,6 |
| 10 | 9 | 8 | U | 6 | U | 10 | 4,6 |
| 11 | 10 | 9 | U | U | 6 | 9 | 4,5 |
| 11 | 10 | U | 8 | 7 | U | 9 | 3,6 |
| 12 | 11 | U | 9 | U | 7 | 8 | 3,5 |
| 12 | U | 10 | 9 | 8 | U | 8 | 2,6 |
| 13 | 12 | U | U | 9 | 8 | 7 | 3,4 |
| 13 | U | 11 | 10 | U | 8 | 7 | 2,5 |
| U | 12 | 11 | 10 | 9 | U | 7 | 1,6 |
| 14 | U | 12 | U | 10 | 9 | 6 | 2,4 |
| U | 13 | 12 | 11 | U | 9 | 6 | 1,5 |
| 15 | U | U | 12 | 11 | 10 | 5 | 2,3 |
| U | 14 | 13 | U | 11 | 10 | 5 | 1,4 |
| U | 15 | U | 13 | 12 | 11 | 4 | 1,3 |
| U | U | 15 | 14 | 13 | 12 | 3 | 1,2 |
| 1 | 2 | 3 | 4 | 5 | 6 | Team score $S_X$ | |

References

Bedwell, M. (2009). Rescuing Statistics from Mathematicians. Conference on Modeling in Mathematics Education, September, Dresden.

Bedwell, M. (2010). Statistics for the Mathematically Challenged. Paper C183, ICOTS-8 Conference: Data and Context in Statistics Education, July, Ljubljana.

Efron, B. (2010). Large-Scale Inference. Cambridge University Press, ISBN 978-0-521-19249-1, p. 138.

Hombas, V. C. (2012). Historical Vignette. Teaching Statistics, 34(1): 43.

Joarder, A. H. (2009). Variance of a Few Observations. Teaching Statistics, 31(2): 55-58.

Muttart, D. M. (2009). The Obsolescence of Computational Formulae. Teaching Statistics, 31(1): 12-14.

Petocz, P. & Sowey, E. (2012). Statistical Diversions. Teaching Statistics, 34(1): 44-47.
