
CTLT and Friends: Group items tagged "reliability"


Gary Brown

Matthew Lombard - 0 views

  • 5. Which measure(s) of intercoder reliability should researchers use? There are literally dozens of different measures, or indices, of intercoder reliability. Popping (1988) identified 39 different "agreement indices" for coding nominal categories, which excludes several techniques for interval and ratio level data. But only a handful of techniques are widely used. In communication the most widely used indices are: percent agreement, Holsti's method, Scott's pi (π), Cohen's kappa (κ), and Krippendorff's alpha (α). Just some of the indices proposed, and in some cases widely used, in other fields are Perreault and Leigh's (1989) Ir measure; Tinsley and Weiss's (1975) T index; Bennett, Alpert, and Goldstein's (1954) S index; Lin's (1989) concordance coefficient; Hughes and Garrett's (1990) approach based on Generalizability Theory; and Rust and Cooil's (1994) approach based on "Proportional Reduction in Loss" (PRL). It would be nice if there were one universally accepted index of intercoder reliability. But despite all the effort that scholars, methodologists and statisticians have devoted to developing and testing indices, there is no consensus on a single, "best" one. While there are several recommendations for Cohen's kappa (e.g., Dewey (1983) argued that despite its drawbacks, kappa should still be "the measure of choice") and this index appears to be commonly used in research that involves the coding of behavior (Bakeman, 2000), others (notably Krippendorff, 1978, 1987) have argued that its characteristics make it inappropriate as a measure of intercoder agreement.
  •  
    for our formalizing of assessment work
  •  
    inter-rater reliability
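The indices named in the highlight above differ mainly in how they estimate the agreement two coders would reach by chance. Below is a minimal, illustrative Python sketch of three of them (percent agreement, Scott's pi, and Cohen's kappa) for nominal codings by two coders; the codings are invented, and in practice an established implementation (e.g., of Krippendorff's alpha) would normally be preferred.

```python
from collections import Counter

def percent_agreement(c1, c2):
    """Proportion of units the two coders code identically."""
    return sum(a == b for a, b in zip(c1, c2)) / len(c1)

def scotts_pi(c1, c2):
    """Scott's pi: expected agreement from the pooled (joint) category distribution."""
    ao = percent_agreement(c1, c2)
    n = len(c1)
    pooled = Counter(c1) + Counter(c2)          # category counts over both coders
    ae = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (ao - ae) / (1 - ae)

def cohens_kappa(c1, c2):
    """Cohen's kappa: expected agreement from each coder's own marginal distribution."""
    ao = percent_agreement(c1, c2)
    n = len(c1)
    m1, m2 = Counter(c1), Counter(c2)
    ae = sum((m1[k] / n) * (m2[k] / n) for k in set(m1) | set(m2))
    return (ao - ae) / (1 - ae)

# Hypothetical nominal codings from two coders over ten units
coder1 = ["pos", "pos", "neg", "neu", "pos", "neg", "neg", "pos", "neu", "pos"]
coder2 = ["pos", "neg", "neg", "neu", "pos", "neg", "pos", "pos", "pos", "pos"]

print(round(percent_agreement(coder1, coder2), 3))
print(round(scotts_pi(coder1, coder2), 3))
print(round(cohens_kappa(coder1, coder2), 3))
```

Scott's pi and Cohen's kappa share the observed-agreement numerator; they differ only in whether expected agreement is computed from the pooled category distribution (Scott) or from each coder's own marginals (Cohen).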
Corinna Lo

A comparison of consensus, consistency, and measurement approaches to estimating interr... - 2 views

  •  
    "The three general categories for computing interrater reliability introduced and described in this paper are: 1) consensus estimates, 2) consistency estimates, and 3) measurement estimates. The assumptions, interpretation, advantages, and disadvantages of estimates from each of these three categories are discussed, along with several popular methods of computing interrater reliability coefficients that fall under the umbrella of consensus, consistency, and measurement estimates. Researchers and practitioners should be aware that different approaches to estimating interrater reliability carry with them different implications for how ratings across multiple judges should be summarized, which may impact the validity of subsequent study results."
Gary Brown

Validity and Reliability in Higher Education Assessment - 2 views

  • Validity and Reliability in Higher Education Assessment
  • However, validity and reliability are not inherent features of assessments or assessment systems and must be monitored continuously throughout the design and implementation of an assessment system. Research studies of a theoretical or empirical nature addressing methodology for ensuring and testing validity and reliability in the higher education assessment process, results of validity and reliability studies, and novel approaches to the concepts of validity and reliability in higher education assessment are all welcome. To be most helpful in this academic exchange, empirical studies should be clear and explicit about their methodology so that others can replicate or advance their research.
  •  
    We should take this opportunity seriously and write up our work. Let me know if you want to join me.
Gary Brown

Types of Reliability - 2 views

  • You learned in the Theory of Reliability that it's not possible to calculate reliability exactly. Instead, we have to estimate reliability, and this is always an imperfect endeavor.
  •  
    A recommended resource
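One concrete example of Trochim's point that reliability is estimated rather than calculated directly: Cronbach's alpha, an internal-consistency estimate, is sketched below with invented item scores. It is only one of several estimators (test-retest, parallel-forms, and inter-rater estimates are others), and each rests on different assumptions.

```python
import statistics

def cronbach_alpha(items):
    """Internal-consistency estimate of reliability.
    `items` is a list of per-item score lists, all over the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]          # each respondent's total score
    item_var = sum(statistics.pvariance(i) for i in items)
    return (k / (k - 1)) * (1 - item_var / statistics.pvariance(totals))

# Hypothetical 4-item scale answered by six respondents (rows = items)
items = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 3, 4, 3],
    [3, 3, 4, 2, 5, 2],
    [5, 3, 5, 2, 4, 4],
]
print(round(cronbach_alpha(items), 3))
```

Different estimators can give noticeably different values for the same instrument, which is the point of the resource: each is an estimate made under assumptions, not the reliability itself.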
Corinna Lo

Scoring rubric development: validity and reliability. Moskal, Barbara M. & Jon A. Leydens - 1 views

  •  
    "One purpose of this article is to provide clear definitions of the terms "validity" and "reliability" and illustrate these definitions through examples. A second purpose is to clarify how these issues may be addressed in the development of scoring rubrics."
Corinna Lo

IJ-SoTL - A Method for Collaboratively Developing and Validating a Rubric - 1 views

  •  
    "Assessing student learning outcomes relative to a valid and reliable standard that is academically-sound and employer-relevant presents a challenge to the scholarship of teaching and learning. In this paper, readers are guided through a method for collaboratively developing and validating a rubric that integrates baseline data collected from academics and professionals. The method addresses two additional goals: (1) to formulate and test a rubric as a teaching and learning protocol for a multi-section course taught by various instructors; and (2) to assure that students' learning outcomes are consistently assessed against the rubric regardless of teacher or section. Steps in the process include formulating the rubric, collecting data, and sequentially analyzing the techniques used to validate the rubric and to insure precision in grading papers in multiple sections of a course."
shalani mujer

Reliable and Fast Online Computer Tech Support - 1 views

I love watching movies and I usually get them online. There was this one time that my computer automatically shut down while downloading a movie. Good thing I was able to sign up with an online ...

online computer tech support

started by shalani mujer on 10 Nov 11 no follow-up yet
Gary Brown

Colleges' Data-Collection Burdens Are Higher Than Official Estimates, GAO Finds - The T... - 0 views

  • The GAO recommended that Education officials reevaluate their official estimates of the time it takes for colleges to complete IPEDS surveys, communicate to a wider range of colleges the opportunities for training, and coordinate with education software providers to improve the quality and reliability of IPEDS reporting features.
  •  
    The "burden" of accountability mirrors in data what we encounter in spirit. It appears to take less time than university's report and, more to the parallel, a little training might be useful.
Gary Brown

The Quality Question - Special Reports - The Chronicle of Higher Education - 1 views

shared by Gary Brown on 30 Aug 10
  • Few reliable, comparable measures of student learning across colleges exist. Standardized assessments like the Collegiate Learning Assessment are not widely used—and many experts say those tests need refinement in any case.
    • Gary Brown
       
      I am hoping the assumptions underlying this sentence do not frame the discussion. The extent to which it has in the past parallels the lack of progress. Standardized comparisons evince nothing but the wrong questions.
  • "We are the most moribund field that I know of," Mr. Zemsky said in an interview. "We're even more moribund than county government."
  • Robert Zemsky
Nils Peterson

Half an Hour: Open Source Assessment - 0 views

  • When posed the question in Winnipeg regarding what I thought the ideal open online course would look like, my eventual response was that it would not look like a course at all, just the assessment.
    • Nils Peterson
       
      I remembered this Downes post on the way back from HASTAC. It is some of the roots of our Spectrum I think.
  • The reasoning was this: were students given the opportunity to attempt the assessment, without the requirement that they sit through lectures or otherwise proprietary forms of learning, then they would create their own learning resources.
  • In Holland I encountered a person from an organization that does nothing but test students. This is the sort of thing I long ago predicted (in my 1998 Future of Online Learning) so I wasn't that surprised. But when I pressed the discussion the gulf between different models of assessment became apparent. Designers of learning resources, for example, have only the vaguest of indication of what will be on the test. They have a general idea of the subject area and recommendations for reading resources. Why not list the exact questions, I asked? Because they would just memorize the answers, I was told. I was unsure how this varied from the current system, except for the amount of stuff that must be memorized.
    • Nils Peterson
       
      assumes a test as the form of assessment, rather than something more open ended.
  • As I think about it, I realize that what we have in assessment is now an exact analogy to what we have in software or learning content. We have proprietary tests or examinations, the content of which is held to be secret by the publishers. You cannot share the contents of these tests (at least, not openly). Only specially licensed institutions can offer the tests. The tests cost money.
    • Nils Peterson
       
      See our Where are you on the spectrum, Assessment is locked vs open
  • Without a public examination of the questions, how can we be sure they are reliable? We are forced to rely on 'peer reviews' or similar closed and expert-based evaluation mechanisms.
  • there is the question of who is doing the assessing. Again, the people (or machines) that grade the assessments work in secret. It is expert-based, which creates a resource bottleneck. The criteria they use are not always apparent (and there is no shortage of literature pointing to the randomness of the grading). There is an analogy here with peer-review processes (as compared to recommender system processes)
  • What constitutes achievement in a field? What constitutes, for example, 'being a physicist'?
  • This is a reductive theory of assessment. It is the theory that the assessment of a big thing can be reduced to the assessment of a set of (necessary and sufficient) little things. It is a standards-based theory of assessment. It suggests that we can measure accomplishment by testing for accomplishment of a predefined set of learning objectives. Left to its own devices, though, an open system of assessment is more likely to become non-reductive and non-standards based. Even if we consider the mastery of a subject or field of study to consist of the accomplishment of smaller components, there will be no widespread agreement on what those components are, much less how to measure them or how to test for them. Consequently, instead of very specific forms of evaluation, intended to measure particular competences, a wide variety of assessment methods will be devised. Assessment in such an environment might not even be subject-related. We won't think of, say, a person who has mastered 'physics'. Rather, we might say that they 'know how to use a scanning electron microscope' or 'developed a foundational idea'.
  • We are certainly familiar with the use of recognition, rather than measurement, as a means of evaluating achievement. Ludwig Wittgenstein is 'recognized' as a great philosopher, for example. He didn't pass a series of tests to prove this. Mahatma Gandhi is 'recognized' as a great leader.
  • The concept of the portfolio is drawn from the artistic community and will typically be applied in cases where the accomplishments are creative and content-based. In other disciplines, where the accomplishments resemble more the development of skills rather than of creations, accomplishments will resemble more the completion of tasks, like 'quests' or 'levels' in online games, say. Eventually, over time, a person will accumulate a 'profile' (much as described in 'Resource Profiles').
  • In other cases, the evaluation of achievement will resemble more a reputation system. Through some combination of inputs, from a more or less defined community, a person may achieve a composite score called a 'reputation'. This will vary from community to community.
  •  
    Fine piece, transformative. "were students given the opportunity to attempt the assessment, without the requirement that they sit through lectures or otherwise proprietary forms of learning, then they would create their own learning resources."
Gary Brown

Conference Highlights Contradictory Attitudes Toward Global Rankings - International - ... - 2 views

  • He emphasized, however, that "rankings are only useful if the indicators they use don't just measure things that are easy to measure, but the things that need to be measured."
  • "In Malaysia we do not call it a ranking exercise," she said firmly, saying that the effort was instead a benchmarking exercise that attempts to rate institutions against an objective standard.
  • "If Ranking Is the Disease, Is Benchmarking the Cure?" Jamil Salmi, tertiary education coordinator at the World Bank, said that rankings are "just the tip of the iceberg" of a growing accountability agenda, with students, governments, and employers all seeking more comprehensive information about institutions
  • "Rankings are the most visible and easy to understand" of the various measures, but they are far from the most reliable,
  • Jamie P. Merisotis
  • He described himself as a longtime skeptic of rankings, but noted that "these kinds of forums are useful, because you have to have conversations involving the producers of rankings, consumers, analysts, and critics."
Gary Brown

Views: Asking Too Much (and Too Little) of Accreditors - Inside Higher Ed - 1 views

  • Senators want to know why accreditors haven’t protected the public interest.
  • Congress shouldn’t blame accreditors: it should blame itself. The existing accreditation system has neither ensured quality nor ferreted out fraud. Why? Because Congress didn’t want it to. If Congress truly wants to protect the public interest, it needs to create a system that ensures real accountability.
  • But turning accreditors into gatekeepers changed the picture. In effect, accreditors now held a gun to the heads of colleges and universities since federal financial aid wouldn’t flow unless the institution received “accredited” status.
  • Congress listened to higher education lobbyists and designated accreditors -- teams made up largely of administrators and faculty -- to be “reliable authorities” on educational quality. Intending to protect institutional autonomy, Congress appropriated the existing voluntary system by which institutions differentiated themselves.
  • A gatekeeping system using peer review is like a penal system that uses inmates to evaluate eligibility for parole. The conflicts of interest are everywhere -- and, surprise, virtually everyone is eligible!
  • accreditation is "premised upon collegiality and assistance; rather than requirements that institutions meet certain standards (with public announcements when they don't)."
  • Meanwhile, there is ample evidence that many accredited colleges are adding little educational value. The 2006 National Assessment of Adult Literacy revealed that nearly a third of college graduates were unable to compare two newspaper editorials or compute the cost of office items, prompting the Spellings Commission and others to raise concerns about accreditors’ attention to productivity and quality.
  • But Congress wouldn’t let them. Rather than welcoming accreditors’ efforts to enhance their public oversight role, Congress told accreditors to back off and let nonprofit colleges and universities set their own standards for educational quality.
  • Accreditation is nothing more than an outdated industrial-era monopoly whose regulations prevent colleges from cultivating the skills, flexibility, and innovation that they need to ensure quality and accountability.
  • there is a much cheaper and better way: a self-certifying regimen of financial accountability, coupled with transparency about graduation rates and student success. (See some alternatives here and here.)
  • Such a system would prioritize student and parent assessment over the judgment of institutional peers or the educational bureaucracy. And it would protect students, parents, and taxpayers from fraud or mismanagement by permitting immediate complaints and investigations, with a notarized certification from the institution to serve as Exhibit A
  • The only way to protect the public interest is to end the current system of peer review patronage, and demand that colleges and universities put their reputation -- and their performance -- on the line.
  • Anne D. Neal is president of the American Council of Trustees and Alumni. The views stated herein do not represent the views of the National Advisory Committee on Institutional Quality and Integrity, of which she is a member.
  •  
    The ascending view of accreditation.
Gary Brown

Researchers Criticize Reliability of National Survey of Student Engagement - Students -... - 3 views

  • "If each of the five benchmarks does not measure a distinct dimension of engagement and includes substantial error among its items, it is difficult to inform intervention strategies to improve undergraduates' educational experiences,"
  • Only one benchmark, enriching educational experiences, had a significant effect on the seniors' cumulative GPA.
  • Other critics have asserted that the survey's mountains of data remain largely ignored.
  •  
    If the results are largely ignored, the psychometric integrity matters little.  There is no indication it is ignored because it lacks psychometric integrity.
Jayme Jacobson

Fostering Learning in the Networked World: The Cyberlearning Opportunity and Challenge ... - 0 views

  • Participatory culture: 21st Century Media Education “We have also identified a set of core social skills and cultural competencies that young people should acquire if they are to be full, active, creative, and ethical participants in this emerging participatory culture:
  • Play — the capacity to experiment with your surroundings as a form of problem-solving. Performance — the ability to adopt alternative identities for the purpose of improvisation and discovery. Simulation — the ability to interpret and construct dynamic models of real world processes. Appropriation — the ability to meaningfully sample and remix media content. Multitasking — the ability to scan one's environment and shift focus as needed to salient details. Distributed Cognition — the ability to interact meaningfully with tools that expand mental capacities. Collective Intelligence — the ability to pool knowledge and compare notes with others toward a common goal. Judgment — the ability to evaluate the reliability and credibility of different information sources. Transmedia Navigation — the ability to follow the flow of stories and information across multiple modalities. Networking — the ability to search for, synthesize, and disseminate information. Negotiation — the ability to travel across diverse communities, discerning and respecting multiple perspectives, and grasping and following alternative norms."
  • We need far more knowledge on the development of learning interests and learning pathways over time and space - and their influences.
  • Complex relations of “informal” and “formal” learning
  • The power of the social: How do learners leverage social networks and affiliative ties? What positionings and accountabilities do they enable that matter for learning? The power of the setting: How do learners exploit the properties of settings to support learning, and how do they navigate the boundaries? The power of imagination: What possible courses of action do learners consider, as they project possible selves, possible achievements, and reflect on the learning they need to get there?
  • We have spent too much time in the dark about these issues that matter for learning experiences and pathways.
  •  
    This is a great list of core competencies. Should use (cite) in forming the participatory learning strategies.
  •  
    Hey Jayme, Nice list. Another skill you talked about earlier was translation. Where does that fit? Is it a subskill of Negotiation?
Nils Peterson

AAC&U News | April 2010 | Feature - 1 views

  • Comparing Rubric Assessments to Standardized Tests
  • First, the university, a public institution of about 40,000 students in Ohio, needed to comply with the Voluntary System of Accountability (VSA), which requires that state institutions provide data about graduation rates, tuition, student characteristics, and student learning outcomes, among other measures, in the consistent format developed by its two sponsoring organizations, the Association of Public and Land-grant Universities (APLU), and the Association of State Colleges and Universities (AASCU).
  • And finally, UC was accepted in 2008 as a member of the fifth cohort of the Inter/National Coalition for Electronic Portfolio Research, a collaborative body with the goal of advancing knowledge about the effect of electronic portfolio use on student learning outcomes.  
  • outcomes required of all UC students—including critical thinking, knowledge integration, social responsibility, and effective communication
  • “The wonderful thing about this approach is that full-time faculty across the university  are gathering data about how their  students are doing, and since they’ll be teaching their courses in the future, they’re really invested in rubric assessment—they really care,” Escoe says. In one case, the capstone survey data revealed that students weren’t doing as well as expected in writing, and faculty from that program adjusted their pedagogy to include more writing assignments and writing assessments throughout the program, not just at the capstone level. As the university prepares to switch from a quarter system to semester system in two years, faculty members are using the capstone survey data to assist their course redesigns, Escoe says.
  • the university planned a “dual pilot” study examining the applicability of electronic portfolio assessment of writing and critical thinking alongside the Collegiate Learning Assessment,
  • The rubrics the UC team used were slightly modified versions of those developed by AAC&U’s Valid Assessment of Learning in Undergraduate Education (VALUE) project. 
  • In the critical thinking rubric assessment, for example, faculty evaluated student proposals for experiential honors projects that they could potentially complete in upcoming years.  The faculty assessors were trained and their rubric assessments “normed” to ensure that interrater reliability was suitably high.
  • “It’s not some nitpicky, onerous administrative add-on. It’s what we do as we teach our courses, and it really helps close that assessment loop.”
  • There were many factors that may have contributed to the lack of correlation, she says, including the fact that the CLA is timed, while the rubric assignments are not; and that the rubric scores were diagnostic and included specific feedback, while the CLA awarded points “in a black box”:
  • faculty members may have had exceptionally high expectations of their honors students and assessed the e-portfolios with those high expectations in mind—leading to results that would not correlate to a computer-scored test. 
  • “The CLA provides scores at the institutional level. It doesn’t give me a picture of how I can affect those specific students’ learning. So that’s where rubric assessment comes in—you can use it to look at data that’s compiled over time.”
  • Their portfolios are now more like real learning portfolios, not just a few artifacts, and we want to look at them as they go into their third and fourth years to see what they can tell us about students’ whole program of study.”  Hall and Robles are also looking into the possibility of forming relationships with other schools from NCEPR to exchange student e-portfolios and do a larger study on the value of rubric assessment of student learning.
  • “We’re really trying to stress that assessment is pedagogy,”
  • “We found no statistically significant correlation between the CLA scores and the portfolio scores,”
  • In the end, Escoe says, the two assessments are both useful, but for different things. The CLA can provide broad institutional data that satisfies VSA requirements, while rubric-based assessment provides better information to facilitate continuous program improvement.
    • Nils Peterson
       
      CLA did not provide information for continuous program improvement -- we've heard this argument before
  •  
    The lack of correlation might be rephrased--there appears to be no correlation between what is useful for faculty who teach and what is useful for the VSA. A corollary question: Of what use is the VSA?
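For readers who want to see what the correlation check behind a finding like Escoe's looks like, here is a minimal, hypothetical sketch; the rubric and test scores are invented, and it uses scipy's standard pearsonr, which returns a correlation coefficient and a two-sided p-value.

```python
# Hypothetical sketch of the check behind "no statistically significant correlation
# between the CLA scores and the portfolio scores." All numbers are invented.
from scipy import stats

rubric_means = [2.5, 3.0, 3.5, 2.0, 4.0, 3.0, 2.5, 3.5, 3.0, 2.0]            # hypothetical e-portfolio rubric scores
test_scores = [1050, 980, 1010, 1120, 990, 1060, 1000, 970, 1100, 1030]       # hypothetical CLA-style scores

r, p = stats.pearsonr(rubric_means, test_scores)
print(f"r = {r:.2f}, p = {p:.3f}")  # a large p-value means any apparent relationship could plausibly be chance
```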
S Spaeth

Minds on Fire: Open Education, the Long Tail, and Learning 2.0 (EDUCAUSE Review) | EDUC... - 1 views

  • More than one-third of the world’s population is under 20. There are over 30 million people today qualified to enter a university who have no place to go. During the next decade, this 30 million will grow to 100 million. To meet this staggering demand, a major university needs to be created each week.
    • Nils Peterson
       
      quote from Sir John Daniel, 1996. The decade he speaks of has passed
  • Open source communities have developed a well-established path by which newcomers can “learn the ropes” and become trusted members of the community through a process of legitimate peripheral participation.
    • Nils Peterson
       
      He describes an apprentice model, but we might also think about peripheral participation in terms of giving feedback using an educative rubric.
  • Lectures from model teachers are recorded on video and are then physically distributed via DVD to schools that typically lack well-trained instructors (as well as Internet connections). While the lectures are being played on a monitor (which is often powered by a battery, since many participating schools also lack reliable electricity), a “mediator,” who could be a local teacher or simply a bright student, periodically pauses the video and encourages engagement among the students by asking questions or initiating discussions about the material they are watching.
  • The Faulkes Telescope Project and the Decameron Web are just two of scores of research and scholarly portals that provide access to both educational resources and a community of experts in a given domain. The web offers innumerable opportunities for students to find and join niche communities where they can benefit from the opportunities for distributed cognitive apprenticeship. Finding and joining a community that ignites a student’s passion can set the stage for the student to acquire both deep knowledge about a subject (“learning about”) and the ability to participate in the practice of a field through productive inquiry and peer-based learning (“learning to be”). These communities are harbingers of the emergence of a new form of technology-enhanced learning—Learning 2.0—which goes beyond providing free access to traditional course materials and educational tools and creates a participatory architecture for supporting communities of learners.
    • Nils Peterson
       
      Kramer's Plant Biotech group could be one of these. It needs tasks that permit legitimate peripheral participation. One of those could be peer assessment. Another could be social bookmarking. I now see it needs not just an _open_ platform, but an _extensible_ one. Here is where the hub and spoke model may play in.
    • S Spaeth
       
      I infer that you are referring to this research group. http://www.officeofresearch.wsu.edu/missions/health/kramer.html I am curious to learn why you selected this lab as an example.
  • open participatory learning ecosystems
Nils Peterson

Engineers without borders - WSU Students solving real problem - 0 views

  • We are trying to make a very cheap, reliable source of energy that won’t need a lot of maintenance
    • Nils Peterson
       
      General problem statement, wider than wind turbine, which would then get contextualized by various factors to be the specific project
  •  
    This is an example of the problem solving and WSU intellectual capital that we have been talking about
  •  
    here is this year's version of the Kayafunga water project
Gary Brown

Schmidt - 3 views

  • There are a number of assessment methods by which learning can be evaluated (exam, practicum, etc.) for the purpose of recognition and accreditation, and there are a number of different purposes for the accreditation itself (i.e., job, social recognition, membership in a group, etc). As our world moves from an industrial to a knowledge society, new skills are needed. Social web technologies offer opportunities for learning, which build these skills and allow new ways to assess them.
  • This paper makes the case for a peer-based method of assessment and recognition as a feasible option for accreditation purposes. The peer-based method would leverage online communities and tools, for example digital portfolios, digital trails, and aggregations of individual opinions and ratings into a reliable assessment of quality. Recognition by peers can have a similar function as formal accreditation, and pathways to turn peer recognition into formal credits are outlined. The authors conclude by presenting an open education assessment and accreditation scenario, which draws upon the attributes of open source software communities: trust, relevance, scalability, and transparency.
  •  
    Kinship here, and familiar friends.
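The paper's claim that "aggregations of individual opinions and ratings" can yield a reliable assessment of quality could be prototyped very simply. The sketch below is one naive, hypothetical approach: average peer ratings per artifact and flag scores that rest on too few raters or too much disagreement. The thresholds and function name are assumptions for illustration, not anything specified in the paper.

```python
from statistics import mean, pstdev

def aggregate_peer_ratings(ratings, min_raters=3, max_spread=1.0):
    """Naively aggregate peer ratings (e.g., 1-5) for one artifact.
    Returns the mean score plus a flag for whether enough raters agreed
    closely enough for the score to be treated as a reliable signal."""
    spread = pstdev(ratings) if len(ratings) > 1 else None
    reliable = len(ratings) >= min_raters and spread is not None and spread <= max_spread
    return {"score": round(mean(ratings), 2), "raters": len(ratings),
            "spread": spread, "reliable": reliable}

# Hypothetical portfolios rated by different numbers of peers
print(aggregate_peer_ratings([4, 5, 4, 4]))  # raters converge: usable signal
print(aggregate_peer_ratings([1, 5, 3]))     # high spread: needs more review
print(aggregate_peer_ratings([5]))           # too few raters: not yet reliable
```

A fuller system might also weight raters by their own track record, along the lines of the trust and reputation mechanisms the authors describe.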