Group items tagged: statistics

Jeff Bernstein

Todd Farley: Lies, Damn Lies, and Statistics, or What's Really Up With Automated Essay ...

    As any astute reader but no automated essay scoring program might have gleaned by now, I actually do have my doubts about the automated essay scoring study. I have my doubts because I worked in the test-scoring business for the better part of fifteen years (1994-2008), and most of that job entailed making statistics dance: I saw the industry fix distribution statistics when they might have shown different results than a state wanted; I saw it fudge reliability numbers when those showed human readers weren't scoring in enough of a standardized way; and I saw it fake qualifying scores to ensure enough temporary employees were kept on projects to complete them on time even when those temporary employees were actually not qualified for the job. Given my experience in the duplicitous world of standardized test-scoring, I couldn't help but have my doubts about the statistics provided in support of the automated essay scoring study -- and, unfortunately, that study lost me with its title alone. "Contrasting State-of-the-Art Automated Scoring of Essays: Analysis," it is named, with p. 5 reemphasizing exactly what the study is supposed to be focused on: "Phase I examines the machine scoring capabilities for extended-response essays." A quick perusal of Table 3, however, on page 33, suggests that the "essays" scored in the study are barely essays at all: "Essays" tested in five of the eight sets of student responses averaged only about a hundred and fifty words.

Variability in Pretest-Posttest Correlation Coefficients by Student Achievement Level

    State assessments are increasingly used as outcome measures for education evaluations, and pretest scores are generally used as control variables in these evaluations. The correlation between the pretest and outcome (posttest) measures is a factor in determining, among other things, the statistical power of a study. This report examines the variability in pretest-posttest correlation coefficients for state assessment data on samples of low-performing, average-performing, and proficient students to determine how sample characteristics (e.g., achievement level) affect pretest-posttest correlation coefficients. As an application, this report illustrates how statistical power is affected by variations in pretest-posttest correlation coefficients across groups with different sample characteristics. Achievement data from four states and two large districts are examined. The results confirm that pretest-posttest correlation coefficients are smaller for samples of low performers than for samples representing the full range of performers, thus resulting in lower statistical power for impact studies than would be the case if the study sample included a more representative group of students.
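The excerpt's central point, that a weaker pretest-posttest correlation means lower statistical power, can be sketched numerically. This is a hedged illustration, not the report's method: it assumes a two-group design with the pretest as a covariate, a normal approximation to power, and made-up effect and sample sizes.

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + erf(x / sqrt(2)))

def power_two_group(d, n_per_group, r, alpha_z=1.96):
    # Adjusting for a pretest correlated r with the outcome shrinks the
    # residual outcome variance by (1 - r^2), inflating the effective
    # standardized effect size.
    d_eff = d / sqrt(1 - r ** 2)
    # Normal approximation to the power of a two-sided z-test.
    return norm_cdf(d_eff * sqrt(n_per_group / 2) - alpha_z)

# Same true effect and sample size, different pretest-posttest correlations:
for r in (0.5, 0.7, 0.9):
    print(r, round(power_two_group(d=0.2, n_per_group=100, r=r), 3))
```

The same hypothetical impact study is markedly better powered when the pretest tracks the posttest closely, which is the report's concern about samples of low performers.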

What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting...

    Statistical models, such as linear or logistic regression or survival analysis, are frequently used as a means to answer scientific questions in psychosomatic research. Many who use these techniques, however, apparently fail to appreciate fully the problem of overfitting, i.e., capitalizing on the idiosyncrasies of the sample at hand. Overfitted models will fail to replicate in future samples, thus creating considerable uncertainty about the scientific merit of the finding. The present article is a nontechnical discussion of the concept of overfitting and is intended to be accessible to readers with varying levels of statistical expertise. The notion of overfitting is presented in terms of asking too much from the available data. Given a certain number of observations in a data set, there is an upper limit to the complexity of the model that can be derived with any acceptable degree of uncertainty. Complexity arises as a function of the number of degrees of freedom expended (the number of predictors, including complex terms such as interactions and nonlinear terms) against the same data set during any stage of the data analysis. Theoretical and empirical evidence, with a special focus on the results of computer simulation studies, is presented to demonstrate the practical consequences of overfitting with respect to scientific inference. Three common practices (automated variable selection, pretesting of candidate predictors, and dichotomization of continuous variables) are shown to pose a considerable risk for spurious findings in models. The dilemma between overfitting and exploring candidate confounders is also discussed. Alternative means of guarding against overfitting are discussed, including variable aggregation and the fixing of coefficients a priori. Techniques that account for and correct complexity, including shrinkage and penalization, are also introduced.
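One of the practices the article flags, pretesting of candidate predictors, can be demonstrated in a few lines. This is a minimal sketch with made-up data, not the article's simulations: every "predictor" is pure noise, yet screening many of them against a small sample yields one that looks correlated in-sample and fails to replicate.

```python
import random

def pearson(x, y):
    # Pearson correlation coefficient, computed from scratch.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

random.seed(42)
n, k = 30, 50  # few observations, many candidate predictors
y_train = [random.gauss(0, 1) for _ in range(n)]
y_fresh = [random.gauss(0, 1) for _ in range(n)]  # fresh draw from the same (null) process
candidates = [[random.gauss(0, 1) for _ in range(n)] for _ in range(k)]

# "Pretesting": screen all candidates and keep the best in-sample performer.
best = max(candidates, key=lambda c: abs(pearson(c, y_train)))
r_in = pearson(best, y_train)   # looks impressive in the sample at hand...
r_out = pearson(best, y_fresh)  # ...but the relationship is idiosyncratic noise
print(round(r_in, 2), round(r_out, 2))
```

Nothing here has any true signal, so the selected predictor's in-sample correlation is entirely capitalization on chance, which is exactly the article's definition of overfitting.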

Fact or Opinion - Aaron Pallas on Judge's ruling on the release of NYC Teacher Data Rep...

    What counts as a "fact"? New York State Supreme Court Justice Cynthia Kern's ruling on the release of the New York City Teacher Data Reports reflects a view very much at odds with the social science research community. In ruling that the Department of Education's intent to release these reports, which purport to label elementary and middle school teachers as more or less effective based on their students' performance on state tests of English Language Arts and mathematics, was neither arbitrary nor capricious, Kern held that there is no requirement that data be reliable for them to be disclosed. Rather, the standard she invoked was that the data simply need to be "factual," quoting a Court of Appeals case that "factual data … simply means objective information, in contrast to opinions, ideas or advice." But it is entirely a matter of opinion as to whether the particular statistical analyses involved in the production of the Teacher Data Reports warrant the inference that teachers are more or less effective. All statistical models involve assumptions that lie outside of the data themselves. Whether these assumptions are appropriate is a matter of opinion.

Study's results are flawed and inconsequential - JSOnline

    Yet the summary report from the evaluators has no mention of the 75% attrition rate. What readers were told was, "Enrolling in (read as "being exposed to") a private high school through MPCP increases the likelihood of a student graduating from high school, enrolling in a four-year college and persisting in college by 4-7 percentage points." That sounds positive, and voucher advocates have trumpeted this statement. But a more defensible statement is that there are no findings of benefits that are statistically distinguishable from zero. Here's why: After controlling both for students' prior measured achievement and for differences in the level of parents' formal education, to ensure that comparable students were being compared, none of the benefits showcased by the evaluators are statistically significant using conventional significance criteria.
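The "statistically distinguishable from zero" point can be made concrete with a toy z-test. The numbers here are illustrative only, not taken from the MPCP evaluation: a positive point estimate whose 95% confidence interval still spans zero.

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF.
    return 0.5 * (1 + erf(x / sqrt(2)))

# Illustrative numbers only: a 5-percentage-point estimated benefit
# with a 3.2-point standard error after adding controls.
estimate, se = 0.05, 0.032
z = estimate / se
p_two_sided = 2 * (1 - norm_cdf(z))
ci = (estimate - 1.96 * se, estimate + 1.96 * se)
print(round(z, 2), round(p_two_sided, 3), [round(b, 3) for b in ci])
```

The interval contains zero and the p-value exceeds .05, so a headline number can "sound positive" while being indistinguishable from no effect, which is the op-ed's complaint.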

Hechinger Report | Should value-added teacher ratings be adjusted for poverty?

    In Washington, D.C., one of the first places in the country to use value-added teacher ratings to fire teachers, teacher-union president Nathan Saunders likes to point to the following statistic as proof that the ratings are flawed: Ward 8, one of the poorest areas of the city, has only 5 percent of the teachers defined as effective under the new evaluation system known as IMPACT, but more than a quarter of the ineffective ones. Ward 3, encompassing some of the city's more affluent neighborhoods, has nearly a quarter of the best teachers, but only 8 percent of the worst. The discrepancy highlights an ongoing debate about the value-added test scores that an increasing number of states, soon to include Florida, are using to evaluate teachers. Are the best, most experienced D.C. teachers concentrated in the wealthiest schools, while the worst are concentrated in the poorest schools? Or does the statistical model ignore the possibility that it's more difficult to teach a room full of impoverished children?

Do effective teachers teach three times as much as ineffective teachers? | Gary Rubinst...

    An often-quoted 'statistic' by various 'reformers' is that an effective teacher is three times as good as an ineffective one. Sometimes it is said that the ineffective teacher gets a half year of progress while the effective teacher gets one and a half years of progress. I don't doubt that there is a small percentage of teachers who have little classroom control, mostly new teachers, who only manage to get half a year of progress. I can also imagine a rare 'super-teacher' who somehow gets one and a half years of progress. (I think I'm a pretty good teacher, but I doubt I get a year and a half worth of progress.) I don't think there is a very accurate way to measure this nebulous 'progress' aside from test scores, but I could still imagine that there is a 'true' number, even if we will never be able to accurately calculate it. Since this statistic has been quoted recently by Melinda Gates on PBS and by Michelle Rhee in various places, including the StudentsFirst website, I thought, in response to a recent post on Diane Ravitch's blog, that I would investigate the source of this claim.

Shanker Blog » Teachers And Education Reform, On A Need To Know Basis

    "A couple of weeks ago, the website Vox.com published an article entitled, "11 facts about U.S. teachers and schools that put the education reform debate in context." The article, in the wake of the Vergara decision, is supposed to provide readers with the "basic facts" about the current education reform environment, with a particular emphasis on teachers. Most of the 11 facts are based on descriptive statistics. Vox advertises itself as a source of accessible, essential, summary information - what you "need to know" - for people interested in a topic but not necessarily well-versed in it. Right off the bat, let me say that this is an extraordinarily difficult task, and in constructing lists such as this one, there's no way to please everyone (I've read a couple of Vox's education articles and they were okay). That said, someone sent me this particular list, and it's pretty good overall, especially since it does not reflect overt advocacy for given policy positions, as so many of these types of lists do. But I was compelled to comment on it. I want to say that I did this to make some lofty point about the strengths and weaknesses of data and statistics packaged for consumption by the general public. It would, however, be more accurate to say that I started doing it and just couldn't stop. In any case, here's a little supplemental discussion of each of the 11 items"

The Condition of Education

    "The Condition of Education (COE) is a congressionally mandated annual report that summarizes important developments and trends in education using the latest available statistics. The report presents statistical indicators containing text, figures, and tables describing important developments in the status and trends of education from early childhood learning through graduate-level education. The contents of The Condition of Education are organized within the 5 sections shown on the left of this page. In addition to the indicators in these sections, there are Topics in Focus that examine specific issues. The Condition of Education 2011 contains 50 indicators, but additional indicators from earlier volumes are also available on this web site. "

A Legal Argument Against The Use of VAMs in Teacher Evaluation

    "Value Added Models (VAMs) are irresistible. Purportedly they can ascertain a teacher's effectiveness by predicting the impact of a teacher on a student's test scores. Because test scores are the sine qua non of our education system, VAMs are alluring. They link a teacher directly to the most emphasized output in education today. What more can we want from an evaluative tool, especially in our pursuit of improving schools in the name of social justice? Taking this a step further, many see VAMs as the panacea for improving teacher quality. The theory seems straightforward. VAMs provide statistical predictions regarding a teacher's impact that can be compared to actual results. If a teacher cannot improve a student's test score in relatively positive ways, then they are ineffective. If they are ineffective, they can (and should) be dismissed (See, for instance, Hanushek, 2010). Consequently, state legislatures have rushed to codify VAMs into their statutes and regulations governing teacher evaluation. (See, for example, Florida General Laws, 2014). That has been a mistake. This paper argues for a complete reversal in policy course. To wit, state regulations that connect a teacher's continued employment to VAMs should be overhauled to eliminate the connection between evaluation and student test scores. The reasoning is largely legal, rather than educational. In sum, the legal costs of any use of VAMs in a performance-based termination far outweigh any value they may add. These risks are directly a function of the well-documented statistical flaws associated with VAMs (See, for example, Rothstein, 2010). The "value added" of VAMs in supporting a termination is limited, if it exists at all."

What Counts as a Big Effect? (I) | GothamSchools

    I woke up yesterday morning to read Norm Scott's post on Education Notes Online about a new study of the effects of charter schools on achievement in New York City. The study, by economists Caroline Hoxby and Sonali Murarka, finds a charter school effect of .09 standard deviations per year of treatment in math and .04 standard deviations per year in reading. I haven't read the study closely yet, but I was struck by Norm's headline: "Study Shows NO Improvement in NYC Charters Over Public Schools." The effects that Hoxby and Murarka report are statistically significant, which means that we can reject the claim that they are zero. But are they big? That's a surprisingly complicated question. I'm going to argue that the answer hinges on "compared to what?"
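One common way to give effect sizes in standard deviations some intuition, assuming roughly normal scores, is to ask where the reported effect would move an average student. A minimal sketch using the .09 and .04 SD figures quoted above:

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF.
    return 0.5 * (1 + erf(x / sqrt(2)))

# Move a student from the 50th percentile by the reported per-year effect
# and see where they land (assumes approximately normal score distributions).
for effect_sd, subject in ((0.09, "math"), (0.04, "reading")):
    pct = 100 * norm_cdf(effect_sd)
    print(f"{subject}: 50th -> {pct:.1f}th percentile after one year")
```

A few percentile points per year is neither obviously "big" nor obviously "NO improvement," which is why the post argues the answer hinges on "compared to what?"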

New PD Math Study Finds No Statistically Significant Impact on Teacher Knowledge or Stu...

    "A federally-funded two-year study of professional development programs for seventh grade mathematics teachers found there was no statistically significant cumulative impact on teacher knowledge or on student achievement. The study, led by the American Institutes for Research (AIR), in partnership with MDRC, was released on May 25, 2011 by the U.S. Department of Education's Institute of Education Sciences (IES)."

Chetty, et al. on the American Statistical Association's Recent Position Statement on V...

    "Over the last decade, teacher evaluation based on value-added models (VAMs) has become central to the public debate over education policy. In this commentary, we critique and deconstruct the arguments proposed by the authors of a highly publicized study that linked teacher value-added models to students' long-run outcomes, Chetty et al. (2014, forthcoming), in their response to the American Statistical Association statement on VAMs. We draw on recent academic literature to support our counter-arguments along main points of contention: causality of VAM estimates, transparency of VAMs, effect of non-random sorting of students on VAM estimates and sensitivity of VAMs to model specification. "

Linda Darling-Hammond and Edward Haertel: 'Value-added' teacher evaluations not reliabl...

    "It's becoming a familiar story: Great teachers get low scores from "value-added" teacher evaluation models. Newspapers across the country have published accounts of extraordinary teachers whose evaluations, based on their students' state test scores, seem completely out of sync with the reality of their practice. Los Angeles teachers have figured prominently in these reports. Researchers are not surprised by these stories, because dozens of studies have documented the serious flaws in these ratings, which are increasingly used to evaluate teachers' effectiveness. The ratings are based on value-added models such as the L.A. school district's Academic Growth over Time system, which uses complex statistical metrics to try to sort out the effects of student characteristics (such as socioeconomic status) from the effects of teachers on test scores. A study we conducted at Stanford University showed what these teachers are experiencing."

Evaluating Teachers and Schools Using Student Growth Models

    Interest in Student Growth Modeling (SGM) and Value Added Modeling (VAM) arises from educators concerned with measuring the effectiveness of teaching and other school activities through changes in student performance, as a companion to, and perhaps even an alternative to, status measures. Several formal statistical models have been proposed for year-to-year growth, and these fall into at least three clusters: simple change (e.g., differences on a vertical scale), residualized change (e.g., simple linear or quantile regression techniques), and value tables (varying salience of different achievement-level outcomes across two years). Several of these methods have been implemented by states and districts. This paper reviews relevant literature and reports results of a data-based comparison of six basic SGM models that may permit aggregating across teachers or schools to provide evaluative information. Our investigation raises some issues that may compromise current efforts to implement VAM in teacher and school evaluations and makes suggestions for both practice and research based on the results.
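Two of the model families named in the abstract, simple change and residualized change, can be contrasted on toy numbers. This is a hedged sketch with invented scores, not any state's model: residualized change credits a student only for doing better than the least-squares prediction from the pretest.

```python
# Invented pre/post scale scores for six students.
pre  = [410, 450, 500, 550, 600, 640]
post = [430, 455, 520, 560, 590, 660]

# Fit post ~ pre by ordinary least squares (single predictor).
n = len(pre)
mx, my = sum(pre) / n, sum(post) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(pre, post))
         / sum((x - mx) ** 2 for x in pre))
intercept = my - slope * mx

# Simple change: raw difference on the (assumed vertical) scale.
simple_gain = [y - x for x, y in zip(pre, post)]
# Residualized change: actual post minus the post predicted from pre.
residual_gain = [y - (intercept + slope * x) for x, y in zip(pre, post)]

for x, g, r in zip(pre, simple_gain, residual_gain):
    print(x, g, round(r, 1))
```

The two measures can rank the same students differently: a high raw gain from a high pretest may still be a negative residualized gain, which is one reason the choice of growth model matters for teacher and school aggregates.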

Teacher Job Satisfaction...or Lack There of - Finding Common Ground - Education Week

    Job satisfaction is something we all care about. It also happens to be something we care more about when we have less and less of it. It's a hard balance to maintain because we have satisfaction when we are with our students but we lose that same satisfaction when we read negative press or hear politicians use bad education statistics in sound bites. We certainly cannot control what they say about us but we can control how we react.

A Rotting Apple: Education Redlining in New York City | The Schott Foundation for Publi...

    In New York City public schools, a student's educational outcomes and opportunity to learn are statistically more determined by where he or she lives than by his or her abilities, according to A Rotting Apple: Education Redlining in New York City, released by the Schott Foundation for Public Education. Primarily because of New York City policies and practices that result in an inequitable distribution of educational resources and intensify the impact of poverty, children who are poor, Black and Hispanic have far less of an opportunity to learn the skills needed to succeed on state and federal assessments. They are also much less likely to have an opportunity to be identified for Gifted and Talented programs, to attend selective high schools or to obtain diplomas qualifying them for college or a good job. High-performing schools, on the other hand, tend to be located in economically advantaged areas.

The Toxic Trifecta, Bad Measurement & Evolving Teacher Evaluation Policies « ...

    This post contains my preliminary thoughts in development for a forthcoming article dealing with the intersection between statistical and measurement issues in teacher evaluation and teachers' constitutional rights where those measures are used for making high stakes decisions.

A Sociological Eye on Education | The worst eighth-grade math teacher in New York City

    Using a statistical technique called value-added modeling, the Teacher Data Reports compare how students are predicted to perform on the state ELA and math tests, based on their prior year's performance, with their actual performance. Teachers whose students do better than predicted are said to have "added value"; those whose students do worse than predicted are "subtracting value." By definition, about half of all teachers will add value, and the other half will not. Carolyn Abbott was, in one respect, a victim of her own success. After a year in her classroom, her seventh-grade students scored at the 98th percentile of New York City students on the 2009 state test. As eighth-graders, they were predicted to score at the 97th percentile on the 2010 state test. However, their actual performance was at the 89th percentile of students across the city. That shortfall, the difference between the 97th percentile and the 89th percentile, placed Abbott near the very bottom of the 1,300 eighth-grade mathematics teachers in New York City. How could this happen? Anderson is an unusual school, as the students are often several years ahead of their nominal grade level. The material covered on the state eighth-grade math exam is taught in the fifth or sixth grade at Anderson. "I don't teach the curriculum they're being tested on," Abbott explained. "It feels like I'm being graded on somebody else's work."
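The size of Abbott's shortfall is easier to appreciate in standard-deviation units: near the top of the distribution, a small percentile gap is a large z-score gap. A rough sketch, assuming scores are approximately normal (an assumption for illustration, not a detail of the Teacher Data Reports):

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF.
    return 0.5 * (1 + erf(x / sqrt(2)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    # Inverse normal CDF by bisection; plenty accurate for illustration.
    for _ in range(100):
        mid = (lo + hi) / 2
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Predicted 97th percentile vs. actual 89th, expressed in SD units.
predicted_z = norm_ppf(0.97)  # roughly 1.88
actual_z = norm_ppf(0.89)     # roughly 1.23
print(round(actual_z - predicted_z, 2))  # shortfall in SDs
```

An eight-percentile miss near the ceiling corresponds to well over half a standard deviation, which helps explain how one cohort of very high scorers could drop a teacher to the bottom of the rankings.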

Ed Notes Online: Must See Video: Gary Rubinstein at GEM Teacher Evaluation Forum

    In a brilliant presentation, Stuyvesant HS teacher Gary Rubinstein uses statistics to punch holes in the high-stakes standardized testing program. He also finds evidence in the stats that charter schools cream better students. Then he addresses the reason why Bill Gates and Michelle Rhee opposed the release of the data scores: they knew people like Gary would be able to show how irrelevant they really were. "It's as if, in trying to measure temperature, you count the number of people wearing hats." Then he addresses the issue of why a union agreed to any of this, even the 20%, given that under the current system almost everyone potentially can be rated ineffective. He offered the union his help to salvage the other 20% but has not heard back yet. The union is supposed to be this evil outfit concerned only about the adults, but they really aren't doing a good job at that.