Group items tagged statistics - Education Links

Todd Farley: Lies, Damn Lies, and Statistics, or What's Really Up With Automated Essay ... - 0 views

www.huffingtonpost.com/...lies-and-statis_b_1574711.html

education reform testing scoring commentary

shared by Jeff Bernstein on 09 Jun 12 - No Cached

As any astute reader but no automated essay scoring program might have gleaned by now, I actually do have my doubts about the automated essay scoring study. I have my doubts because I worked in the test-scoring business for the better part of fifteen years (1994-2008), and most of that job entailed making statistics dance: I saw the industry fix distribution statistics when they might have shown different results than a state wanted; I saw it fudge reliability numbers when those showed human readers weren't scoring in enough of a standardized way; and I saw it fake qualifying scores to ensure enough temporary employees were kept on projects to complete them on time even when those temporary employees were actually not qualified for the job. Given my experience in the duplicitous world of standardized test-scoring, I couldn't help but have my doubts about the statistics provided in support of the automated essay scoring study -- and, unfortunately, that study lost me with its title alone. "Contrasting State-of-the-Art Automated Scoring of Essays: Analysis," it is named, with p. 5 reemphasizing exactly what the study is supposed to be focused on: "Phase I examines the machine scoring capabilities for extended-response essays." A quick perusal of Table 3, however, on page 33, suggests that the "essays" scored in the study are barely essays at all: "Essays" tested in five of the eight sets of student responses averaged only about a hundred and fifty words.

Variability in Pretest-Posttest Correlation Coefficients by Student Achievement Level - 0 views

ies.ed.gov/20114033

education testing research

shared by Jeff Bernstein on 07 Sep 11 - No Cached

State assessments are increasingly used as outcome measures for education evaluations, and pretest scores are generally used as control variables in these evaluations. The correlation between the pretest and outcome (posttest) measures is a factor in determining, among other things, the statistical power of a study. This report examines the variability in pretest-posttest correlation coefficients for state assessment data on samples of low-performing, average-performing, and proficient students to determine how sample characteristics (e.g., achievement level) affect pretest-posttest correlation coefficients. As an application, this report illustrates how statistical power is affected by variations in pretest-posttest correlation coefficients across groups with different sample characteristics. Achievement data from four states and two large districts are examined. The results confirm that pretest-posttest correlation coefficients are smaller for samples of low performers than for samples representing the full range of performers, thus resulting in lower statistical power for impact studies than would be the case if the study sample included a more representative group of students.
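To make the link between pretest-posttest correlation and power concrete, here is a minimal sketch. It is not taken from the report. It assumes a simple two-arm study with equal group sizes, a pretest used as a covariate, and a normal approximation in which the covariate removes a share r^2 of the outcome variance. The function name, effect size, sample size, and correlation values are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_arm, pre_post_r, alpha=0.05):
    """Approximate power of a two-sided test for a two-arm study.

    Assumption (not from the report): adjusting for the pretest shrinks
    the residual outcome variance by a factor of (1 - r**2), and the test
    statistic is treated as normal. The negligible chance of rejecting in
    the wrong direction is ignored.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the standardized mean difference after adjustment.
    se = sqrt(2 * (1 - pre_post_r ** 2) / n_per_arm)
    return NormalDist().cdf(effect_size / se - z_crit)


if __name__ == "__main__":
    # Hypothetical values: a 0.2 SD effect with 200 students per arm.
    for r in (0.5, 0.7, 0.8):
        print(f"r = {r:.1f}: power ~ {approx_power(0.2, 200, r):.2f}")
```

Under these assumptions, a sample with a lower pretest-posttest correlation has less power at the same sample size and effect size, which is the pattern the report describes for low-performing students.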

What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting... - 0 views

os1.amc.nl/...Babyak_-_overfitting.pdf

statistics

shared by Jeff Bernstein on 30 Mar 12 - No Cached

Statistical models, such as linear or logistic regression or survival analysis, are frequently used as a means to answer scientific questions in psychosomatic research. Many who use these techniques, however, apparently fail to appreciate fully the problem of overfitting, i.e., capitalizing on the idiosyncrasies of the sample at hand. Overfitted models will fail to replicate in future samples, thus creating considerable uncertainty about the scientific merit of the finding. The present article is a nontechnical discussion of the concept of overfitting and is intended to be accessible to readers with varying levels of statistical expertise. The notion of overfitting is presented in terms of asking too much from the available data. Given a certain number of observations in a data set, there is an upper limit to the complexity of the model that can be derived with any acceptable degree of uncertainty. Complexity arises as a function of the number of degrees of freedom expended (the number of predictors including complex terms such as interactions and nonlinear terms) against the same data set during any stage of the data analysis. Theoretical and empirical evidence, with a special focus on the results of computer simulation studies, is presented to demonstrate the practical consequences of overfitting with respect to scientific inference. Three common practices (automated variable selection, pretesting of candidate predictors, and dichotomization of continuous variables) are shown to pose a considerable risk for spurious findings in models. The dilemma between overfitting and exploring candidate confounders is also discussed. Alternative means of guarding against overfitting are discussed, including variable aggregation and the fixing of coefficients a priori. Techniques that account and correct for complexity, including shrinkage and penalization, also are introduced.
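The risk of pretesting candidate predictors can be illustrated with a small simulation. This is a sketch under stated assumptions and is not code from the article: every predictor is pure noise, the sample sizes are hypothetical, and the names `pearson_r` and `screen_noise_predictors` are made up for illustration. The predictor that looks best in the original sample is then checked against a fresh sample.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def screen_noise_predictors(n_obs=30, n_candidates=50, seed=0):
    """Pick the noise predictor most correlated with a noise outcome.

    Returns (apparent |r| in the original sample, |r| in a fresh sample).
    All variables are independent standard normal draws, so any apparent
    relationship comes from the selection step alone.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    best_r = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson_r(candidate, outcome)
        if abs(r) > abs(best_r):
            best_r = r
    # Replication: the selected predictor is still unrelated to the
    # outcome, so a new sample is simply a pair of independent draws.
    new_x = [rng.gauss(0, 1) for _ in range(n_obs)]
    new_y = [rng.gauss(0, 1) for _ in range(n_obs)]
    return abs(best_r), abs(pearson_r(new_x, new_y))


if __name__ == "__main__":
    apparent, fresh = screen_noise_predictors()
    print(f"apparent |r| = {apparent:.2f}, fresh-sample |r| = {fresh:.2f}")
```

Because the winner is chosen for its fit to one sample, its apparent correlation tends to be inflated relative to what a new sample shows, which is the failure to replicate that the abstract warns about.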

Fact or Opinion - Aaron Pallas on Judge's ruling on the release of NYC Teacher Data Rep... - 0 views

nepc.colorado.edu/...fact-or-opinion

education reform teacher evaluations nyc legal commentary

shared by Jeff Bernstein on 09 Nov 11 - No Cached

What counts as a "fact"? New York State Supreme Court Justice Cynthia Kern's ruling on the release of the New York City Teacher Data Reports reflects a view very much at odds with the social science research community. In ruling that the Department of Education's intent to release these reports, which purport to label elementary and middle school teachers as more or less effective based on their students' performance on state tests of English Language Arts and mathematics, was neither arbitrary nor capricious, Kern held that there is no requirement that data be reliable for them to be disclosed. Rather, the standard she invoked was that the data simply need to be "factual," quoting a Court of Appeals case that "factual data … simply means objective information, in contrast to opinions, ideas or advice." But it is entirely a matter of opinion as to whether the particular statistical analyses involved in the production of the Teacher Data Reports warrant the inference that teachers are more or less effective. All statistical models involve assumptions that lie outside of the data themselves. Whether these assumptions are appropriate is a matter of opinion.

Study's results are flawed and inconsequential - JSOnline - 0 views

www.jsonline.com/...uential-ko4c0t5-141252023.html

education reform choice vouchers graduation research analysis

shared by Jeff Bernstein on 04 Mar 12 - No Cached

Yet the summary report from the evaluators has no mention of the 75% attrition rate. What readers were told was, "Enrolling in (read as "being exposed to") a private high school through MPCP increases the likelihood of a student graduating from high school, enrolling in a four-year college and persisting in college by 4-7 percentage points." That sounds positive, and voucher advocates have trumpeted this statement. But a more defensible statement is that there are no findings of benefits that are statistically distinguishable from zero. Here's why: After controlling both for students' prior measured achievement and for differences in the level of parents' formal education, to ensure that comparable students were being compared, none of the benefits showcased by the evaluators are statistically significant using conventional significance criteria.

Hechinger Report | Should value-added teacher ratings be adjusted for poverty? - 0 views

hechingerreport.org/...s-be-adjusted-for-poverty_6899

education reform value-added teacher evaluations poverty commentary

shared by Jeff Bernstein on 23 Nov 11 - No Cached

In Washington, D.C., one of the first places in the country to use value-added teacher ratings to fire teachers, teacher-union president Nathan Saunders likes to point to the following statistic as proof that the ratings are flawed: Ward 8, one of the poorest areas of the city, has only 5 percent of the teachers defined as effective under the new evaluation system known as IMPACT, but more than a quarter of the ineffective ones. Ward 3, encompassing some of the city's more affluent neighborhoods, has nearly a quarter of the best teachers, but only 8 percent of the worst. The discrepancy highlights an ongoing debate about the value-added test scores that an increasing number of states, soon to include Florida, are using to evaluate teachers. Are the best, most experienced D.C. teachers concentrated in the wealthiest schools, while the worst are concentrated in the poorest schools? Or does the statistical model ignore the possibility that it's more difficult to teach a room full of impoverished children?

Do effective teachers teach three times as much as ineffective teachers? | Gary Rubinst... - 0 views

garyrubinstein.teachforus.org/...s-much-as-ineffective-teachers

education reform teachers quality achievement commentary

shared by Jeff Bernstein on 09 Jun 12 - No Cached

An often quoted 'statistic' by various 'reformers' is that an effective teacher is three times as good as an ineffective one. Sometimes it is said that the ineffective teacher gets a half year of progress while the effective teacher gets one and a half years of progress. I don't doubt that there are a small percentage of teachers who have little classroom control, mostly new teachers, who only manage to get half a year of progress. I also can imagine a rare 'super-teacher' who somehow gets one and a half years of progress. (I think I'm a pretty good teacher, but I doubt I get a year and a half worth of progress.) I don't think there is a very accurate way to measure this nebulous 'progress' aside from test scores, but I could still imagine that there is a 'true' number, even if we will never be able to accurately calculate it. As this statistic has been quoted by Melinda Gates recently on PBS and by Michelle Rhee in various places, including the StudentsFirst website, I thought, in response to a recent post on Diane Ravitch's blog, I would investigate the source of this claim.

Shanker Blog » Teachers And Education Reform, On A Need To Know Basis - 0 views

shankerblog.org/?p=10024

education reform teachers testing class size salaries data commentary

shared by Jeff Bernstein on 02 Jul 14 - No Cached

"A couple of weeks ago, the website Vox.com published an article entitled, "11 facts about U.S. teachers and schools that put the education reform debate in context." The article, in the wake of the Vergara decision, is supposed to provide readers with the "basic facts" about the current education reform environment, with a particular emphasis on teachers. Most of the 11 facts are based on descriptive statistics. Vox advertises itself as a source of accessible, essential, summary information - what you "need to know" - for people interested in a topic but not necessarily well-versed in it. Right off the bat, let me say that this is an extraordinarily difficult task, and in constructing lists such as this one, there's no way to please everyone (I've read a couple of Vox's education articles and they were okay). That said, someone sent me this particular list, and it's pretty good overall, especially since it does not reflect overt advocacy for given policy positions, as so many of these types of lists do. But I was compelled to comment on it. I want to say that I did this to make some lofty point about the strengths and weaknesses of data and statistics packaged for consumption by the general public. It would, however, be more accurate to say that I started doing it and just couldn't stop. In any case, here's a little supplemental discussion of each of the 11 items"

The Condition of Education - 0 views

nces.ed.gov/coe

education policy

shared by Jeff Bernstein on 02 Jun 11 - Cached

"The Condition of Education (COE) is a congressionally mandated annual report that summarizes important developments and trends in education using the latest available statistics. The report presents statistical indicators containing text, figures, and tables describing important developments in the status and trends of education from early childhood learning through graduate-level education. The contents of The Condition of Education are organized within the 5 sections shown on the left of this page. In addition to the indicators in these sections, there are Topics in Focus that examine specific issues. The Condition of Education 2011 contains 50 indicators, but additional indicators from earlier volumes are also available on this web site."

A Legal Argument Against The Use of VAMs in Teacher Evaluation - 0 views

www.tcrecord.org/Content.asp

education reform value-added teacher evaluations commentary

shared by Jeff Bernstein on 07 Jan 15 - No Cached

"Value Added Models (VAMs) are irresistible. Purportedly they can ascertain a teacher's effectiveness by predicting the impact of a teacher on a student's test scores. Because test scores are the sine qua non of our education system, VAMs are alluring. They link a teacher directly to the most emphasized output in education today. What more can we want from an evaluative tool, especially in our pursuit of improving schools in the name of social justice? Taking this a step further, many see VAMs as the panacea for improving teacher quality. The theory seems straightforward. VAMs provide statistical predictions regarding a teacher's impact that can be compared to actual results. If a teacher cannot improve a student's test score in relatively positive ways, then they are ineffective. If they are ineffective, they can (and should) be dismissed (See, for instance, Hanushek, 2010). Consequently, state legislatures have rushed to codify VAMs into their statutes and regulations governing teacher evaluation (See, for example, Florida General Laws, 2014). That has been a mistake. This paper argues for a complete reversal in policy course. To wit, state regulations that connect a teacher's continued employment to VAMs should be overhauled to eliminate the connection between evaluation and student test scores. The reasoning is largely legal, rather than educational. In sum, the legal costs of any use of VAMs in a performance-based termination far outweigh any value they may add. These risks are directly a function of the well-documented statistical flaws associated with VAMs (See, for example, Rothstein, 2010). The "value added" of VAMs in supporting a termination is limited, if it exists at all."

What Counts as a Big Effect? (I) | GothamSchools - 0 views

gothamschools.org/...what-counts-as-a-big-effect-i

education reform charters achievement statistics research commentary

shared by Jeff Bernstein on 28 Nov 11 - No Cached

I woke up yesterday morning to read Norm Scott's post on Education Notes Online about a new study of the effects of charter schools on achievement in New York City. The study, by economists Caroline Hoxby and Sonali Murarka, finds a charter school effect of .09 standard deviations per year of treatment in math and .04 standard deviations per year in reading. I haven't read the study closely yet, but I was struck by Norm's headline: "Study Shows NO Improvement in NYC Charters Over Public Schools." The effects that Hoxby and Murarka report are statistically significant, which means that we can reject the claim that they are zero. But are they big? That's a surprisingly complicated question. I'm going to argue that the answer hinges on "compared to what?"

New PD Math Study Finds No Statistically Significant Impact on Teacher Knowledge or Stu... - 0 views

www.air.org/...index.cfm

education teachers

shared by Jeff Bernstein on 26 May 11 - No Cached

"A federally-funded two-year study of professional development programs for seventh grade mathematics teachers found there was no statistically significant cumulative impact on teacher knowledge or on student achievement. The study, led by the American Institutes for Research (AIR), in partnership with MDRC, was released on May 25, 2011 by the U.S. Department of Education's Institute of Education Sciences (IES)."

Chetty, et al. on the American Statistical Association's Recent Position Statement on V... - 0 views

www.tcrecord.org/PrintContent.asp

education reform value-added teacher evaluations research commentary

shared by Jeff Bernstein on 06 Sep 14 - No Cached

"Over the last decade, teacher evaluation based on value-added models (VAMs) has become central to the public debate over education policy. In this commentary, we critique and deconstruct the arguments proposed by the authors of a highly publicized study that linked teacher value-added models to students' long-run outcomes, Chetty et al. (2014, forthcoming), in their response to the American Statistical Association statement on VAMs. We draw on recent academic literature to support our counter-arguments along main points of contention: causality of VAM estimates, transparency of VAMs, effect of non-random sorting of students on VAM estimates and sensitivity of VAMs to model specification."


Linda Darling-Hammond and Edward Haertel: 'Value-added' teacher evaluations not reliabl...

www.latimes.com/...ations-20121105,0,650639.story

education reform value-added teacher evaluations research news commentary

shared by Jeff Bernstein on 05 Nov 12

"It's becoming a familiar story: Great teachers get low scores from 'value-added' teacher evaluation models. Newspapers across the country have published accounts of extraordinary teachers whose evaluations, based on their students' state test scores, seem completely out of sync with the reality of their practice. Los Angeles teachers have figured prominently in these reports. Researchers are not surprised by these stories, because dozens of studies have documented the serious flaws in these ratings, which are increasingly used to evaluate teachers' effectiveness. The ratings are based on value-added models such as the L.A. school district's Academic Growth over Time system, which uses complex statistical metrics to try to sort out the effects of student characteristics (such as socioeconomic status) from the effects of teachers on test scores. A study we conducted at Stanford University showed what these teachers are experiencing."


Evaluating Teachers and Schools Using Student Growth Models

www.pareonline.net/v17n17.pdf

education reform value-added teacher evaluations research

shared by Jeff Bernstein on 07 Jan 13

Interest in Student Growth Modeling (SGM) and Value Added Modeling (VAM) arises from educators concerned with measuring the effectiveness of teaching and other school activities through changes in student performance as a companion and perhaps even an alternative to status. Several formal statistical models have been proposed for year-to-year growth and these fall into at least three clusters: simple change (e.g., differences on a vertical scale), residualized change (e.g., simple linear or quantile regression techniques), and value tables (varying salience of different achievement level outcomes across two years). Several of these methods have been implemented by states and districts. This paper reviews relevant literature and reports results of a data-based comparison of six basic SGM models that may permit aggregating across teachers or schools to provide evaluative information. Our investigation raises some issues that may compromise current efforts to implement VAM in teacher and school evaluations and makes suggestions for both practice and research based on the results.


Teacher Job Satisfaction...or Lack There of - Finding Common Ground - Education Week

blogs.edweek.org/...isfactionor_lack_there_of.html

education reform teachers satisfaction commentary

shared by Jeff Bernstein on 11 Apr 12

Job satisfaction is something we all care about. It also happens to be something we care more about when we have less and less of it. It's a hard balance to maintain because we have satisfaction when we are with our students but we lose that same satisfaction when we read negative press or hear politicians use bad education statistics in sound bites. We certainly cannot control what they say about us but we can control how we react.


A Rotting Apple: Education Redlining in New York City | The Schott Foundation for Publi...

schottfoundation.org/...education-redlining

education reform nyc class race poverty opportunity achievement research news

shared by Jeff Bernstein on 21 Apr 12

In New York City public schools, a student's educational outcomes and opportunity to learn are statistically more determined by where he or she lives than by his or her abilities, according to A Rotting Apple: Education Redlining in New York City, released by the Schott Foundation for Public Education. Primarily because of New York City policies and practices that result in an inequitable distribution of educational resources and intensify the impact of poverty, children who are poor, Black and Hispanic have far less of an opportunity to learn the skills needed to succeed on state and federal assessments. They are also much less likely to have an opportunity to be identified for Gifted and Talented programs, to attend selective high schools or to obtain diplomas qualifying them for college or a good job. High-performing schools, on the other hand, tend to be located in economically advantaged areas.


The Toxic Trifecta, Bad Measurement & Evolving Teacher Evaluation Policies « ...

schoolfinance101.wordpress.com/...ng-teacher-evaluation-policies

education reform value-added teacher evaluations legal legislation commentary

shared by Jeff Bernstein on 20 Apr 12

This post contains my preliminary thoughts in development for a forthcoming article dealing with the intersection between statistical and measurement issues in teacher evaluation and teachers' constitutional rights where those measures are used for making high-stakes decisions.


A Sociological Eye on Education | The worst eighth-grade math teacher in New York City

eyeoned.org/...h-teacher-in-new-york-city_326