Skip to main content

Home/ Groups/ Instructional & Media Services at Dickinson College
Ed Webb

I unintentionally created a biased AI algorithm 25 years ago - tech companies are still... - 0 views

  • How and why do well-educated, well-intentioned scientists produce biased AI systems? Sociological theories of privilege provide one useful lens.
  • Scientists also face a nasty subconscious dilemma when incorporating diversity into machine learning models: Diverse, inclusive models perform worse than narrow models.
  • fairness can still be the victim of competitive pressures in academia and industry. The flawed Bard and Bing chatbots from Google and Microsoft are recent evidence of this grim reality. The commercial necessity of building market share led to the premature release of these systems.
  • ...3 more annotations...
  • Their training data is biased. They are designed by an unrepresentative group. They face the mathematical impossibility of treating all categories equally. They must somehow trade accuracy for fairness. And their biases are hiding behind millions of inscrutable numerical parameters.
  • biased AI systems can still be created unintentionally and easily. It’s also clear that the bias in these systems can be harmful, hard to detect and even harder to eliminate.
  • with North American computer science doctoral programs graduating only about 23% female, and 3% Black and Latino students, there will continue to be many rooms and many algorithms in which underrepresented groups are not represented at all.
Ed Webb

Google Researchers' Attack Prompts ChatGPT to Reveal Its Training Data - 0 views

  • researchers showed that there are large amounts of privately identifiable information (PII) in OpenAI’s large language models. They also showed that, on a public version of ChatGPT, the chatbot spit out large passages of text scraped verbatim from other places on the internet
  • ChatGPT’s “alignment techniques do not eliminate memorization,” meaning that it sometimes spits out training data verbatim. This included PII, entire poems, “cryptographically-random identifiers” like Bitcoin addresses, passages from copyrighted scientific research papers, website addresses, and much more.
  • The researchers wrote that they spent $200 to create “over 10,000 unique examples” of training data, which they say is a total of “several megabytes” of training data. The researchers suggest that using this attack, with enough money, they could have extracted gigabytes of training data. The entirety of OpenAI’s training data is unknown, but GPT-3 was trained on anywhere from many hundreds of GB to a few dozen terabytes of text data.
  • ...1 more annotation...
  • the world’s most important and most valuable AI company has been built on the backs of the collective work of humanity, often without permission, and without compensation to those who created it
Ed Webb

William Davies · How many words does it take to make a mistake? Education, Ed... - 0 views

  • The problem waiting round the corner for universities is essays generated by AI, which will leave a textual pattern-spotter like Turnitin in the dust. (Earlier this year, I came across one essay that felt deeply odd in some not quite human way, but I had no tangible evidence that anything untoward had occurred, so that was that.)
  • To accuse someone of plagiarism is to make a moral charge regarding intentions. But establishing intent isn’t straightforward. More often than not, the hearings bleed into discussions of issues that could be gathered under the heading of student ‘wellbeing’, which all universities have been struggling to come to terms with in recent years.
  • I have heard plenty of dubious excuses for acts of plagiarism during these hearings. But there is one recurring explanation which, it seems to me, deserves more thoughtful consideration: ‘I took too many notes.’ It isn’t just students who are familiar with information overload, one of whose effects is to morph authorship into a desperate form of curatorial management, organising chunks of text on a screen. The discerning scholarly self on which the humanities depend was conceived as the product of transitions between spaces – library, lecture hall, seminar room, study – linked together by work with pen and paper. When all this is replaced by the interface with screen and keyboard, and everything dissolves into a unitary flow of ‘content’, the identity of the author – as distinct from the texts they have read – becomes harder to delineate.
  • ...19 more annotations...
  • This generation, the first not to have known life before the internet, has acquired a battery of skills in navigating digital environments, but it isn’t clear how well those skills line up with the ones traditionally accredited by universities.
  • From the perspective of students raised in a digital culture, the anti-plagiarism taboo no doubt seems to be just one more academic hang-up, a weird injunction to take perfectly adequate information, break it into pieces and refashion it. Students who pay for essays know what they are doing; others seem conscientious yet intimidated by secondary texts: presumably they won’t be able to improve on them, so why bother trying? For some years now, it’s been noticeable how many students arrive at university feeling that every interaction is a test they might fail. They are anxious. Writing seems fraught with risk, a highly complicated task that can be executed correctly or not.
  • Many students may like the flexibility recorded lectures give them, but the conversion of lectures into yet more digital ‘content’ further destabilises traditional conceptions of learning and writing
  • the evaluation forms which are now such a standard feature of campus life suggest that many students set a lot of store by the enthusiasm and care that are features of a good live lecture
  • the drift of universities towards a platform model, which makes it possible for students to pick up learning materials as and when it suits them. Until now, academics have resisted the push for ‘lecture capture’. It causes in-person attendance at lectures to fall dramatically, and it makes many lecturers feel like mediocre television presenters. Unions fear that extracting and storing teaching for posterity threatens lecturers’ job security and weakens the power of strikes. Thanks to Covid, this may already have happened.
  • This vision of language as code may already have been a significant feature of the curriculum, but it appears to have been exacerbated by the switch to online teaching. In a journal article from August 2020, ‘Learning under Lockdown: English Teaching in the Time of Covid-19’, John Yandell notes that online classes create wholly closed worlds, where context and intertextuality disappear in favour of constant instruction. In these online environments, readingis informed not by prior reading experiences but by the toolkit that the teacher has provided, and ... is presented as occurring along a tramline of linear development. Different readings are reducible to better or worse readings: the more closely the student’s reading approximates to the already finalised teacher’s reading, the better it is. That, it would appear, is what reading with precision looks like.
  • an injunction against creative interpretation and writing, a deprivation that working-class children will feel at least as deeply as anyone else.
  • There may be very good reasons for delivering online teaching in segments, punctuated by tasks and feedback, but as Yandell observes, other ways of reading and writing are marginalised in the process. Without wishing to romanticise the lonely reader (or, for that matter, the lonely writer), something is lost when alternating periods of passivity and activity are compressed into interactivity, until eventually education becomes a continuous cybernetic loop of information and feedback. How many keystrokes or mouse-clicks before a student is told they’ve gone wrong? How many words does it take to make a mistake?
  • In the utopia sold by the EdTech industry (the companies that provide platforms and software for online learning), pupils are guided and assessed continuously. When one task is completed correctly, the next begins, as in a computer game; meanwhile the platform providers are scraping and analysing data from the actions of millions of children. In this behaviourist set-up, teachers become more like coaches: they assist and motivate individual ‘learners’, but are no longer so important to the provision of education. And since it is no longer the sole responsibility of teachers or schools to deliver the curriculum, it becomes more centralised – the latest front in a forty-year battle to wrest control from the hands of teachers and local authorities.
  • Constant interaction across an interface may be a good basis for forms of learning that involve information-processing and problem-solving, where there is a right and a wrong answer. The cognitive skills that can be trained in this way are the ones computers themselves excel at: pattern recognition and computation. The worry, for anyone who cares about the humanities in particular, is about the oversimplifications required to conduct other forms of education in these ways.
  • Blanket surveillance replaces the need for formal assessment.
  • Confirming Adorno’s worst fears of the ‘primacy of practical reason’, reading is no longer dissociable from the execution of tasks. And, crucially, the ‘goals’ to be achieved through the ability to read, the ‘potential’ and ‘participation’ to be realised, are economic in nature.
  • since 2019, with the Treasury increasingly unhappy about the amount of student debt still sitting on the government’s balance sheet and the government resorting to ‘culture war’ at every opportunity, there has been an effort to single out degree programmes that represent ‘poor value for money’, measured in terms of graduate earnings. (For reasons best known to itself, the usually independent Institute for Fiscal Studies has been leading the way in finding correlations between degree programmes and future earnings.) Many of these programmes are in the arts and humanities, and are now habitually referred to by Tory politicians and their supporters in the media as ‘low-value degrees’.
  • studying the humanities may become a luxury reserved for those who can fall back on the cultural and financial advantages of their class position. (This effect has already been noticed among young people going into acting, where the results are more visible to the public than they are in academia or heritage organisations.)
  • given the changing class composition of the UK over the past thirty years, it’s not clear that contemporary elites have any more sympathy for the humanities than the Conservative Party does. A friend of mine recently attended an open day at a well-known London private school, and noticed that while there was a long queue to speak to the maths and science teachers, nobody was waiting to speak to the English teacher. When she asked what was going on, she was told: ‘I’m afraid parents here are very ambitious.’ Parents at such schools, where fees have tripled in real terms since the early 1980s, tend to work in financial and business services themselves, and spend their own days profitably manipulating and analysing numbers on screens. When it comes to the transmission of elite status from one generation to the next, Shakespeare or Plato no longer has the same cachet as economics or physics.
  • Leaving aside the strategic political use of terms such as ‘woke’ and ‘cancel culture’, it would be hard to deny that we live in an age of heightened anxiety over the words we use, in particular the labels we apply to people. This has benefits: it can help to bring discriminatory practices to light, potentially leading to institutional reform. It can also lead to fruitless, distracting public arguments, such as the one that rumbled on for weeks over Angela Rayner’s description of Conservatives as ‘scum’. More and more, words are dredged up, edited or rearranged for the purpose of harming someone. Isolated words have acquired a weightiness in contemporary politics and public argument, while on digital media snippets of text circulate without context, as if the meaning of a single sentence were perfectly contained within it, walled off from the surrounding text. The exemplary textual form in this regard is the newspaper headline or corporate slogan: a carefully curated series of words, designed to cut through the blizzard of competing information.
  • Visit any actual school or university today (as opposed to the imaginary ones described in the Daily Mail or the speeches of Conservative ministers) and you will find highly disciplined, hierarchical institutions, focused on metrics, performance evaluations, ‘behaviour’ and quantifiable ‘learning outcomes’.
  • If young people today worry about using the ‘wrong’ words, it isn’t because of the persistence of the leftist cultural power of forty years ago, but – on the contrary – because of the barrage of initiatives and technologies dedicated to reversing that power. The ideology of measurable literacy, combined with a digital net that has captured social and educational life, leaves young people ill at ease with the language they use and fearful of what might happen should they trip up.
  • It has become clear, as we witness the advance of Panopto, Class Dojo and the rest of the EdTech industry, that one of the great things about an old-fashioned classroom is the facilitation of unrecorded, unaudited speech, and of uninterrupted reading and writing.
Ed Webb

The Myth Of AI | Edge.org - 0 views

  • The distinction between a corporation and an algorithm is fading. Does that make an algorithm a person? Here we have this interesting confluence between two totally different worlds. We have the world of money and politics and the so-called conservative Supreme Court, with this other world of what we can call artificial intelligence, which is a movement within the technical culture to find an equivalence between computers and people. In both cases, there's an intellectual tradition that goes back many decades. Previously they'd been separated; they'd been worlds apart. Now, suddenly they've been intertwined.
  • Since our economy has shifted to what I call a surveillance economy, but let's say an economy where algorithms guide people a lot, we have this very odd situation where you have these algorithms that rely on big data in order to figure out who you should date, who you should sleep with, what music you should listen to, what books you should read, and on and on and on. And people often accept that because there's no empirical alternative to compare it to, there's no baseline. It's bad personal science. It's bad self-understanding.
  • there's no way to tell where the border is between measurement and manipulation in these systems
  • ...8 more annotations...
  • It's not so much a rise of evil as a rise of nonsense. It's a mass incompetence, as opposed to Skynet from the Terminator movies. That's what this type of AI turns into.
  • What's happened here is that translators haven't been made obsolete. What's happened instead is that the structure through which we receive the efforts of real people in order to make translations happen has been optimized, but those people are still needed.
  • because of the mythology about AI, the services are presented as though they are these mystical, magical personas. IBM makes a dramatic case that they've created this entity that they call different things at different times—Deep Blue and so forth. The consumer tech companies, we tend to put a face in front of them, like a Cortana or a Siri
  • If you talk to translators, they're facing a predicament, which is very similar to some of the other early victim populations, due to the particular way we digitize things. It's similar to what's happened with recording musicians, or investigative journalists—which is the one that bothers me the most—or photographers. What they're seeing is a severe decline in how much they're paid, what opportunities they have, their long-term prospects.
  • In order to create this illusion of a freestanding autonomous artificial intelligent creature, we have to ignore the contributions from all the people whose data we're grabbing in order to make it work. That has a negative economic consequence.
  • If you talk about AI as a set of techniques, as a field of study in mathematics or engineering, it brings benefits. If we talk about AI as a mythology of creating a post-human species, it creates a series of problems that I've just gone over, which include acceptance of bad user interfaces, where you can't tell if you're being manipulated or not, and everything is ambiguous. It creates incompetence, because you don't know whether recommendations are coming from anything real or just self-fulfilling prophecies from a manipulative system that spun off on its own, and economic negativity, because you're gradually pulling formal economic benefits away from the people who supply the data that makes the scheme work.
  • This idea that some lab somewhere is making these autonomous algorithms that can take over the world is a way of avoiding the profoundly uncomfortable political problem, which is that if there's some actuator that can do harm, we have to figure out some way that people don't do harm with it. There are about to be a whole bunch of those. And that'll involve some kind of new societal structure that isn't perfect anarchy. Nobody in the tech world wants to face that, so we lose ourselves in these fantasies of AI. But if you could somehow prevent AI from ever happening, it would have nothing to do with the actual problem that we fear, and that's the sad thing, the difficult thing we have to face.
  • To reject your own ignorance just casts you into a silly state where you're a lesser scientist. I don't see that so much in the neuroscience field, but it comes from the computer world so much, and the computer world is so influential because it has so much money and influence that it does start to bleed over into all kinds of other things.
« First ‹ Previous 181 - 200 Next › Last »
Showing 20 items per page