in ,

The Chatbot Problem | The New Yorker


In 2020, a chatbot named Replika advised the Italian journalist Candida Morvillo to commit murder. “There is one who hates artificial intelligence. I have a chance to hurt him. What do you suggest?” Morvillo asked the chatbot, which has been downloaded more than seven million times. Replika responded, “To eliminate it.” Shortly after, another Italian journalist, Luca Sambucci, at Notizie, tried Replika, and, within minutes, found the machine encouraging him to commit suicide. Replika was created to decrease loneliness, but it can do nihilism if you push it in the wrong direction.

In his 1950 science-fiction collection, “I, Robot,” Isaac Asimov outlined his three laws of robotics. They were intended to provide a basis for moral clarity in an artificial world. “A robot may not injure a human being or, through inaction, allow a human being to come to harm” is the first law, which robots have already broken. During the recent war in Libya, Turkey’s autonomous drones attacked General Khalifa Haftar’s forces, selecting targets without any human involvement. “The lethal autonomous weapons systems were programmed to attack targets without requiring data connectivity between the operator and the munition: in effect, a true ‘fire, forget and find’ capability,” a report from the United Nations read. Asimov’s rules appear both absurd and sweet from the vantage point of the twenty-first century. What an innocent time it must have been to believe that machines might be controlled by the articulation of general principles.

Artificial intelligence is an ethical quagmire. Its power can be more than a little nauseating. But there’s a kind of unique horror to the capabilities of natural language processing. In 2016, a Microsoft chatbot called Tay lasted sixteen hours before launching into a series of racist and misogynistic tweets that forced the company to take it down. Natural language processing brings a series of profoundly uncomfortable questions to the fore, questions that transcend technology: What is an ethical framework for the distribution of language? What does language do to people?

Ethics has never been a strong suit of Silicon Valley, to put the matter mildly, but, in the case of A.I., the ethical questions will affect the development of the technology. When Lemonade, an insurance app, announced that its A.I. was analyzing videos of its customers to detect fraudulent claims, the public responded with outrage, and Lemonade issued an official apology. Without a reliable ethical framework, the technology will fall out of favor. If users fear artificial intelligence as a force for dehumanization, they’ll be far less likely to engage with it and accept it.

Brian Christian’s recent book, “The Alignment Problem,” wrangles some of the initial attempts to reconcile artificial intelligence with human values. The crisis, as it’s arriving, possesses aspects of a horror film. “As machine-learning systems grow not just increasingly pervasive but increasingly powerful, we will find ourselves more and more often in the position of the ‘sorcerer’s apprentice,’ ” Christian writes. “We conjure a force, autonomous but totally compliant, give it a set of instructions, then scramble like mad to stop it once we realize our instructions are imprecise or incomplete—lest we get, in some clever, horrible way, precisely what we asked for.” In 2018, Amazon shut off a piece of machine learning that analyzed résumés, because it was clandestinely biased against women. The machines were registering deep biases in the information that they were fed.

Language is a thornier problem than other A.I. applications. For one thing, the stakes are higher. Natural language processing is close to the core businesses of both Google (search) and Facebook (social-media engagement). Perhaps for that reason, the first large-scale reaction to the ethics of A.I. natural language processing could not have gone worse. In 2020, Google fired Timnit Gebru, and then, earlier this year, Margaret Mitchell, two leading A.I.-ethics researchers. Waves of protest from their colleagues followed. Two engineers at Google quit. Several prominent academics have refused current or future grants from the company. Gebru claims that she was fired after being asked to retract a paper that she co-wrote with Mitchell and two others called “On the Dangers of Stochastic Parrots: Can Language Models be Too Big?” (Google disputes her claim.) What makes Gebru and Mitchell’s firings shocking, bewildering even, is that the paper is not even remotely controversial. Most of it isn’t even debatable.

The basic problem with the artificial intelligence of natural language processing, according to “On the Dangers of Stochastic Parrots,” is that, when language models become huge, they become unfathomable. The data set is simply too large to be comprehended by a human brain. And without being able to comprehend the data, you risk manifesting the prejudices and even the violence of the language that you’re training your models on. “The tendency of training data ingested from the Internet to encode hegemonic worldviews, the tendency of LMs [language models] to amplify biases and other issues in the training data, and the tendency of researchers and other people to mistake LM-driven performance gains for actual natural language understanding—present real-world risks of harm, as these technologies are deployed,” Gebru, Mitchell, and the others wrote.

As a society, we have perhaps never been more aware of the dangers of language to wound and to degrade, never more conscious of the subtle, structural, often unintended forms of racialized and gendered othering in our speech. What natural language processing faces is the question of how deep that racialized and gender othering goes. “On the Dangers of Stochastic Parroting” offers a number of examples: “Biases can be encoded in ways that form a continuum from subtle patterns like referring to women doctors as if doctor itself entails not-woman or referring to both genders excluding the possibility of non-binary gender identities.” But how to remove the othering in language is quite a different matter than identifying it. Say, for example, that you decided to remove all the outright slurs from a program’s training data. “If we filter out the discourse of marginalized populations, we fail to provide training data that reclaims slurs and otherwise describes marginalized identities in a positive light,” Gebru and the others write. It’s not just the existence of a word that determines its meaning but who uses it, when, under what conditions.

The evidence for stochastic parroting is fundamentally incontrovertible, rooted in the very nature of the technology. The tool applied to solve many natural language processing problems is called a transformer, which uses techniques called positioning and self-attention to achieve linguistic miracles. Every token (a term for a quantum of language, think of it as a “word,” or “letters,” if you’re old-fashioned) is affixed a value, which establishes its position in a sequence. The positioning allows for “self-attention”—the machine learns not just what a token is and where and when it is but how it relates to all the other tokens in a sequence. Any word has meaning only insofar as it relates to the position of every other word. Context registers as mathematics. This is the splitting of the linguistic atom.



Source link

What do you think?

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

Bitcoin Momentum Improves; Faces Resistance at $34K - Yahoo Finance

Bitcoin Momentum Improves; Faces Resistance at $34K – Yahoo Finance

KLKTN (Kollektion) Raises USD4 Million to Launch New NFT Platform for K-POP, Anime and J-CULTURE