Third Education Group Review / Articles: Volume 1 Number 1





School-Related Influences on

Grade 8 Mathematics Performance in Massachusetts


Sandra Stotsky

Northeastern University


Rafael Bradley and Eugene Warren

Thomas, Warren + Associates


ABSTRACT

Less than one third of American eighth graders score in the two highest performance levels on the grade 8 mathematics test given by the National Assessment of Educational Progress. Only a little over one third of Massachusetts eighth graders score at the two highest performance levels on the state’s own grade 8 mathematics test. In 2002, the Massachusetts Department of Education funded research to explore why there had been no significant growth in the percent of grade 8 students performing at the two highest levels on the state’s grade 8 mathematics tests. An analysis of quantitative data obtained in 2003 from administrators and teachers in a representative sample of 60 schools throughout the state identified school-based factors that were significantly associated with the 20 schools that both increased the percent of grade 8 students performing at the two highest performance levels on the state’s grade 8 mathematics test by more than the state average increase and simultaneously decreased the percent of grade 8 students performing at the lowest performance level by more than the state average decrease. A significantly higher percent of teachers in these 20 schools reported spending a great deal of time reviewing and using test results, having a voice in the choice of their instructional materials, using accelerated and leveled algebra I classes to address the needs of above-grade students, and using calculators less frequently in non-algebra classes. At a time when teachers in all states are being held accountable for increasing the achievement of all their students, these findings warrant exploration on a nationwide scale.

 

I.      INTRODUCTION

The Massachusetts Education Reform Act of 1993 (Chapter 71 of the Acts of 1993, Statutes of the Commonwealth of Massachusetts) changed almost every aspect of elementary and secondary education in the state in order to improve student learning in all subjects. With the support of industry leaders, teacher unions, and the public at large, the Massachusetts legislature mandated the development of a comprehensive and far-reaching system of standards and accountability measures that would affect all students, teachers, and school districts. For students, this system took the form of pre-kindergarten to grade 12 standards (called curriculum frameworks) and accountability measures (annual state tests that are part of the Massachusetts Comprehensive Assessment System, or MCAS). For teachers, this system took the form of five-year cycles for license renewal and the requirement of individual professional development plans approved by the teacher’s principal or supervisor. For school districts, this system took the form of school and district standards with accountability measures applied through an established schedule of inspections, and ratings based on the inspections and student test results.

Over the past ten years, Massachusetts has dedicated significant resources to improving the academic performance of all its students, and of its lowest achieving students in particular: those whose performance on the MCAS tests is at the Warning level. One major effort to address this goal by the Massachusetts Department of Education (Department) was the Middle School Mathematics Initiative (MSMI), a two-year intervention and research project begun in 2000 to help mathematics teachers in under-performing middle schools, as identified by MCAS scores, to improve student achievement in mathematics. We provide a short description of the 2000-2002 study because this study and its results served as the point of departure for the study reported here.

 

A. Methodology and Results of the MSMI

For the MSMI, the Department employed six highly experienced mathematics teachers as mathematics specialists, or coaches, and a highly recommended pedagogical strategy for strengthening teachers’ effectiveness in their classrooms. The Department was especially interested in assessing the value of coaching in improving student learning in mathematics because it is an expensive strategy for school systems to use, with no body of scientifically based research evidence yet available to attest to its efficacy (Russo, 2004). The basic task of the six specialists was to train over 50 teachers in grades 6, 7, and 8 in eight school districts in lesson planning and implementation over the course of more than one year (24 teachers in the first year of the study continued into the second year of the study). The emphasis was on lesson planning and implementation because the principles guiding them are generic and can be applied to any mathematics curriculum (Panasuk & Sullivan, 1998).

All students in the intervention and comparison classes (volunteered by their principals and teachers, with over 1000 students in each group each year) were given pre-post tests consisting of items similar to released MCAS grade 6 mathematics items addressing basic arithmetic operations. The Department sought to determine learning gains during the academic year and to pinpoint students’ achievement level in arithmetical skills and understanding more precisely than can be learned from MCAS tests, which have been given only at the end of two-year grade spans. Because most low-performing schools today receive targeted assistance of varying kinds (whether for the whole school and the regular classroom teacher or for the low-performing, Limited English Proficient, or English as a Second Language student through a Title I or bilingual education teacher), the intervention and comparison groups as a whole in this initiative could be considered matched mixed models; the only clear difference between them was the Department’s own carefully defined model of coaching.

As part of the first year of the project, 15 teachers voluntarily took a middle school mathematics course at the University of Massachusetts-Lowell. As part of the second year, 36 more teachers took a Department-sponsored middle school mathematics course taught in four locations by three mathematics professors using both a common syllabus and a pre-post test that they had developed. Mathematical knowledge only was taught in these courses to help the Department explore the relationship between teacher knowledge in mathematics and gains in student learning.

This project found that students in the MSMI classrooms had change scores that were significantly higher than those of similar students in classrooms with no intervention, even though a much higher percentage of students identified as Limited English Proficient (LEP) were in the MSMI classrooms. Additionally, teachers’ lesson planning ability was related to change scores; that is, students of teachers with higher lesson planning scores made significantly more improvement than students of teachers with lower lesson planning scores. The study also found that students of teachers with more teaching experience achieved higher gains than students of teachers with less teaching experience (University of Massachusetts Donahue Institute, 2002).

Although the differences in outcomes between the two groups were statistically significant, the practical significance of these differences was questionable. The grades 6, 7, and 8 students in the intervention classes could achieve a maximum of 20 points on a test of basic arithmetical operations that included word problems all pitched to a grade 6 level. On average, the students got about 9 points at the beginning of the year and about 12 points at the end of the year. This is a modest gain, even if the differences between the two groups were statistically significant, thus providing only modest support for the efficacy of mathematics coaches, as defined in this project, in improving mathematical learning in low-achieving students. Although the participating teachers in this project all found their work with the mathematics specialists beneficial to their teaching, these benefits did not translate directly into meaningful increases in mathematics achievement for the low-performing students in their classrooms.

Nor did the benefits of the coursework in mathematics translate into increased student achievement. Students whose teachers took the mathematics course in the second year of the study, and whose teachers both showed gains on the teacher pre-post test and found the coursework beneficial, showed no greater gains overall than other students.

During the course of the study and in discussions of its results with specialists and teachers at a Title I conference in 2002, several factors affecting the learning of all low-performing students, whether or not in the intervention group, were identified by the field as needing further exploration. One factor was student reading level; the students in both the intervention and comparison classes in the MSMI study were below average in reading as well as in mathematics.

A second factor was the use of grade level textbooks in a standards-based environment. In standards-oriented schools, it is understandable why administrators purchase grade level textbooks for the middle school; the grade 8 MCAS mathematics test is based on grade 8 standards and if they are to prepare students for the grade 8 MCAS they feel obligated to address the standards on which the grade 8 test is based. However, unlike the widespread availability of developmentally appropriate below grade-level reading materials (often called high interest/low vocabulary), there seem to be few if any below grade-level mathematics materials available to teach skills that students have not yet acquired but which are needed for problem solving in the grade-level textbooks.

A third factor was student grouping. In the relatively large body of research on the effects on achievement of grouping students with varying skill levels in different ways, the evidence suggests that students learn more mathematics when they are in more homogeneous groups with a curriculum and materials geared to their needs (Loveless, 1999; Loveless, 2000; Slavin, 1987; Slavin, 1990). In classes with a wide range of student achievement, it is not clear how well classroom teachers address the specific weaknesses of low-performing students, especially if they are using grade-level materials.

The fourth factor mentioned was student absenteeism, a factor that directly affects student learning. Student absentee rates for 2001-2002 were not available at the time the final report for the MSMI was completed by its external evaluators (University of Massachusetts Donahue Institute, 2002), but they were available for the first year of the project. In grade 8 for the first year of the project, in the MSMI schools, 598 out of 2,654 students (23%) were absent 11 to 20 days for the year, while 20% (an additional 525 students) were absent more than 20 days. Absentee rates in the comparison schools were slightly higher. While attendance rates may be lower for the lowest-performing students than for the school as a whole, it was not possible to obtain attendance data for individual students or specific groups of students. We could only assume that the rates were similar across both groups of schools.

 

B. Immediate Background for the Present Study

The Department learned from the MSMI that there was much more to explore than it had initially thought in order to determine how to spend public appropriations wisely for increasing middle school mathematics achievement. In addition, by 2002 the Department had become as concerned about higher achieving students in the state as about lower performing students. As Table 1 shows, Massachusetts students in grade 8 mathematics classes already at the Needs Improvement or Proficient level (the second and third highest performance levels on the state’s tests) were not moving quickly as a group to the Proficient or Advanced level, or even as quickly as grade 8 mathematics students as a group were moving from the lowest level to the Needs Improvement level.

In 1998, 31% of grade 8 students scored at the two highest performance levels, and in 2002, 34% did, an increase of only 3 percentage points. On the other hand, the percent scoring at the Warning level decreased from 42% in 1998 to 33% in 2002, a decrease of 9 percentage points. The concern here was equity. Why weren’t grade 8 students moving into the two highest performance levels at least at the same rate as students moving out of the lowest level? Were schools in Massachusetts expending less educational effort on the top 60% to 70% of their students in grade 8 than on the bottom 30% to 40% because of the sanctions attending continuous low school performance, thus turning state tests into de facto minimum competency tests? The Department decided to find out what school-based factors might differentiate schools that had increased the percent scoring at the two highest levels as much as they had decreased the percent scoring at the lowest level from schools that had decreased the percent at the lowest level more than they had increased the percent scoring at the highest levels (if in fact they had increased the percent at the two highest levels at all).


Table 1. Grade 8 MCAS Results in Mathematics from 1998 to 2004: Percentage of Students at Each Performance Level

 

 

Year    Warning    Needs Improvement    Proficient    Advanced
1998    42         26                   23            8
1999    40         31                   22            6
2000    39         27                   24            10
2001    31         34                   23            11
2002    33         33                   23            11
2003    33         30                   25            12
2004    29         32                   26            13

Source: Massachusetts Department of Education
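
The percentage-point changes cited in the discussion of Table 1 can be checked directly from the table. The following is a minimal Python sketch of that arithmetic; the variable and function names are illustrative and do not come from the study.

```python
# Percent of grade 8 students at each MCAS performance level, copied from Table 1.
# Tuple order: (Warning, Needs Improvement, Proficient, Advanced).
TABLE_1 = {
    1998: (42, 26, 23, 8),
    2002: (33, 33, 23, 11),
}


def top_two(year):
    """Percent of students at Proficient or Advanced in the given year."""
    _, _, proficient, advanced = TABLE_1[year]
    return proficient + advanced


def warning(year):
    """Percent of students at the Warning level in the given year."""
    return TABLE_1[year][0]


# Changes are expressed in percentage points, not as relative percent change.
top_change = top_two(2002) - top_two(1998)
warning_change = warning(2002) - warning(1998)

print(f"Top two levels: {top_two(1998)}% -> {top_two(2002)}% ({top_change:+d} points)")
print(f"Warning level:  {warning(1998)}% -> {warning(2002)}% ({warning_change:+d} points)")
```

The share of students at the two highest levels rose by 3 points over the period while the share at the Warning level fell by 9 points, which is the asymmetry that motivated the study.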

 

The research question was: What school-based factors might be related to the lack of significant growth in the percent of students in grade 8 performing at the two highest levels since the inception of state tests in 1998? To explore this question, the Department chose a research design that might be more informative and much less expensive than the one used in the MSMI (see Carnine & Gersten, 2000, for a discussion of the debates about the types of research that might best inform policy and practice). Using funds from its National Science Foundation-supported State Systemic Initiative, the Department retained Thomas, Warren + Associates to gather quantitative data from a stratified random sampling of schools across the state, focusing just on grade 8—a pivotal grade in mathematics education in K-12—and to explore, among other probable influences on student achievement, the second and third factors described above (grade level and choice of textbooks, and grouping arrangements) because specialists and teachers had stressed their relevance in discussions with Department staff. The Department sought a stratified random sampling of schools across the state in order to avoid the complex problems inherent in matching large numbers of schools to produce valid comparison groups, such as the problems encountered by Riordan & Noyce (2001) in a study comparing mathematics achievement in grades 4 and 8 in selected Massachusetts schools. The Department also sought a stratified random sampling of schools across the state in order to allow identification of the schools selected for the study: this would enable other researchers to confirm or further explore its results (see www.csun.edu/~vcmth00m/noyce.htm for an exchange of communications on this topic).

II.    THE PRESENT STUDY

The present study was designed to be exploratory in nature. Its purpose was to identify school-based factors that were significantly associated with schools that, between school year 1998-99 and school year 2001-02 (henceforth referred to as the study period), had both increased the percent of grade 8 students performing at the two highest levels on the state’s grade 8 mathematics test by more than the state average increase and decreased the percent of grade 8 students testing at the lowest level by more than the state average decrease. The contractors were to examine and compare curricula; instructional and grouping practices; extra support (e.g., tutoring, parental assistance); teacher qualifications; textbook use; and instructional organization (e.g., block scheduling, team-teaching) across the state’s schools. As Stigler and Hiebert suggest in The Teaching Gap (1999), it may not be the teachers’ instructional choices that are retarding student achievement in this country but a “system” that tells them what they should or should not do in their classrooms.

The Thomas, Warren + Associates research design was developed in three parts. First, a methodological approach for analysis was identified. Second, a sampling strategy was prepared (Lohr, 1999). Finally, two survey instruments were written and administered to collect the school-specific information required for the study. These instruments consisted of questions to be asked of a representative sample of grade 8 administrators and mathematics teachers and were based on the suggestions of Department staff (reflecting their communications with the field) and the content of existing questionnaires (Massachusetts Education Reform Review Commission, 2000a; 2000b; 2001). A detailed account of the methodology used and copies of the survey instruments are available in the final report that Thomas, Warren + Associates submitted to the Department in June 2003.

A.     Research Methodology

The sampling strategy for choosing schools for inclusion in the study required partitioning the universe of Massachusetts schools. First, schools were considered only if they administered the state’s grade 8 mathematics test every year of the study period and administered it to a minimum of 50 students. All public (and public charter) schools are required to administer the state tests, with no exceptions. Altogether 308 schools in Massachusetts met these criteria as of 2001-02 (Massachusetts Department of Education, 1999a; 1999b; 1999c; 2001a; 2001b; 2001c; 2002). Next, in order to capture the effects of a school being part of a large or a small district, districts were classified according to their size. Districts with fewer than four schools giving the mathematics test in 2001-02 were classified as small districts. All other school districts were classified as large.

Inclusion in the sample was based on performance on the state’s mathematics test. Schools were first partitioned into two groups based on whether their observed change was above or below the state average increase in the percent testing at the two highest levels. The state average change was calculated as the mean of the changes in all 308 schools in the sampling frame. A second partition was based on a greater or less than average decrease in the percent of a school’s students at the lowest level on the state’s mathematics test, creating four groups in all. The group of interest in the study represented schools that had both increased the percent of students testing at the two highest levels by more than the state average and simultaneously decreased the percent of students at the lowest level by more than the state average over the study period. These schools will be referred to as Improving Proficient, Advanced, and Warning (IPAW) schools. The study was based on the assumption that they were doing something better. All the schools in the other three groups were analyzed as a single group, hereafter referred to as non-IPAW schools. Table 2 provides a count of the schools in each of these groups and a description of the overall sample development.
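
The two-way partition described above can be illustrated with a short Python sketch. The school names and change values below are hypothetical and are not data from the study. How a school exactly at the state average was treated is not stated in the source, so the strict inequalities used here are an assumption.

```python
from statistics import mean


def classify_schools(changes):
    """Label each school as "IPAW" or "non-IPAW".

    changes maps a school name to a pair of percentage-point changes over
    the study period: (change in percent at the two highest levels,
    change in percent at the lowest level). A decrease is negative.
    """
    # State averages are the unweighted means across all schools in the frame.
    avg_top = mean(top for top, _ in changes.values())
    avg_low = mean(low for _, low in changes.values())

    labels = {}
    for school, (top, low) in changes.items():
        rose_more = top > avg_top   # larger increase at the top than average
        fell_more = low < avg_low   # larger decrease at the bottom than average
        labels[school] = "IPAW" if (rose_more and fell_more) else "non-IPAW"
    return labels


# Hypothetical schools, for illustration only.
example = {
    "School A": (12.0, -17.0),
    "School B": (1.0, -2.0),
    "School C": (8.0, -1.0),
    "School D": (-1.0, -10.0),
}
labels = classify_schools(example)
for school, label in labels.items():
    print(school, label)
```

In this illustration only School A exceeds both averages (5.0 points at the top, -7.5 points at the bottom), so the other three quadrants collapse into the single non-IPAW comparison group, as in the study design.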

Table 2. Development of the Sample Used in the Study

 

                                        IPAW Schools              Non-IPAW Schools
                                        In Large     In Small     In Large     In Small     Total
                                        Districts    Districts    Districts    Districts
Schools
  In sampling frame                     21           71           69           147          308
  In sample                             13           13           25           24           75
  Eligible and agreed to participate    10           11           23           21           65
Administrators and Teachers
  Eligible and agreed to participate    35           36           75           67           213

 

Algorithmic random sampling was performed to select schools in small districts. Schools in large districts were selected by Thomas, Warren + Associates in a different way, through a committee rating process rather than algorithmic sampling. It was agreed that algorithmic random sampling from a small population of large districts (90 schools) could potentially lead to a very biased sample and that there were no significant implications from using two different methods of sampling. The goal of the committee was to develop a sample that was representative of the population in terms of MCAS results but also exhibited the diversity of socioeconomic status found in the population (Massachusetts Department of Employment and Training, 2002; Boston Plan for Excellence, 2002a; 2002b). The committee was composed of three senior staff from Thomas, Warren + Associates, two education specialists, and one statistician, all of whom were familiar with the Massachusetts school system. School selection in the large districts was made independently by the members of the committee. The kappa statistic for rater agreement among the members was 0.60 (p=0.00). Additionally, schools in large districts were over-sampled because of a concern that within-district variability in test results and socioeconomic status needed to be adequately represented in the final sample. A preliminary list of 37 schools in small districts and 38 schools in large districts was selected.
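
For readers unfamiliar with the kappa statistic reported above, the following Python sketch shows how agreement among several raters can be computed. The source does not say which kappa variant was used; Fleiss' kappa, a common choice for more than two raters, is assumed here. The ratings are hypothetical (three raters, each marking a school as selected or not selected) and are not the committee's actual ratings.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for agreement among a fixed number of raters.

    counts[i][j] is the number of raters who placed subject i in category j.
    Every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all ratings that fell in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject, then averaged over subjects.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    # Agreement expected by chance from the category shares.
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings: each row is a school, columns are
# (raters voting "select", raters voting "do not select").
ratings = [[2, 1], [1, 2], [3, 0], [0, 3]]
kappa = fleiss_kappa(ratings)
print(round(kappa, 3))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance; on this scale the reported 0.60 is usually read as moderate to substantial agreement.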

Following notification of selection for participation in the study by Thomas, Warren + Associates, the 75 selected school principals were each contacted by telephone in order to obtain their agreement to participate in the study. Part of this agreement was that the principal, the school’s mathematics coordinator or department chair (if there was one), and at least one teacher (or as many as two teachers) who had taught at the school and administered the state’s grade 8 mathematics test in the 2001-02 school year would participate in the study. Teachers were selected for participation by their principals from the (usual) pool of two or three eligible grade 8 math teachers in their school. Sixty-five schools met all criteria (32 in small districts and 33 in large districts) and agreed to participate in the project.

Table 3 identifies similarities and differences between the IPAW and non-IPAW schools for the 60 schools from which complete survey data were collected and which constituted the final sample. Although the two groups of schools were similar in many important areas, two areas of difference warrant comment. The percentage of LEP students in the non-IPAW schools was almost twice that of LEP students in the IPAW schools (23% to 12%). Although in theory this could be an important difference between the two groups of schools, administrators and teachers in both the non-IPAW and IPAW schools rarely commented on second language problems, in focus groups or on an instrument designed to gather qualitative data (not reported here). It is also the case that the non-IPAW schools began with higher scores than the IPAW schools and thus might find it more difficult to raise achievement, especially since they enrolled more LEP students. Even if this did make a difference in their capacity to raise scores, what the IPAW schools did to increase the percentage of students in the two highest performance levels is still of interest, especially since the overall percentage of students in the state in these two levels is puzzlingly low in a state with an overall high level of parent education. It should be noted that the final sample size for Massachusetts is close to the final sample size of 77 schools participating in the study conducted by Hiebert and others (2003) to develop a general picture of mathematics instruction in the United States.

Table 3. Similarities and Differences in Schools

                                                                                IPAW Schools    Non-IPAW Schools
                                                                                (N=20)          (N=40)
Demographics in 2001-2002
  Number of schools from large districts                                        10              20
  Number of schools serving only grades 6-8                                     11              16
  Number of magnet or special focus schools                                     3               7
  Average school enrollment                                                     728             866
  Percent of students receiving free or reduced lunches                         36%             36%
  LEP students as a percent of enrollment                                       12%             23%
MCAS Performance
  Average percent of students at Proficient or Advanced (1998-99)               18%             34%
  Average percent increase in Proficient or Advanced (from 1998-99 to 2001-02)  12.9%           0.5%
  Average percent of students at Warning (1998-1999)                            49%             38%
  Average percent decrease in Warning (from 1998-99 to 2001-02)                 -17.2%          -2.1%
Teachers in 2002
  Percent of teachers licensed to teach mathematics                             65%             73%
  Percent of teachers with over five years of experience                        42%             30%
  Average number of sections taught by mathematics teachers                     3.7             3.6
  Average number of students per section                                        21              21
Classroom Practices in 2002
  Percent of sections where homework was assigned                               32%             49%
  Percent of grade 8 students enrolled in algebra I                             21%             39%

Note: In general, schools that used homogeneous groupings had various levels of mathematics courses such as algebra, pre-algebra, and general mathematics.

 

Primary and supplemental data collection instruments were developed for administrators (principals and math chairs) and teachers. The primary data collection instrument contained four types of questions: multi