AI researchers push to open ‘black box’ of language models amid rapid development of AI capabilities

The tech industry’s latest artificial intelligence creations can be pretty convincing if you ask them what it’s like to be a sentient computer, or maybe just a dinosaur or a squirrel. But they’re not so good – and sometimes dangerously bad – at handling other seemingly straightforward tasks.

GPT-3, for example, is a Microsoft-controlled system that can generate paragraphs of human-like text based on what it has learned from a vast database of digital books and online writing. It is considered one of the most advanced of a new generation of AI algorithms that can converse, generate readable text on demand and even produce novel images and video.

Among other things, GPT-3 can write most any text you ask for – a cover letter for a zookeeping job, say, or a Shakespearean-style sonnet set on Mars. But when Pomona College professor Gary Smith asked it a simple but absurd question about walking upstairs, GPT-3 flubbed it.

“Yes, it is safe to walk upstairs on your hands if you wash them first,” the AI replied.

These powerful and power-chugging AI systems, technically known as “large language models” because they have been trained on a vast body of text and other media, are already being used in customer service chatbots, Google searches and the “auto-complete” email features that finish your sentences for you. But most of the tech companies that make them have been secretive about their inner workings, making it difficult for outsiders to spot the flaws that can make them a source of misinformation, racism and other harms.

“They are very good at writing text with the proficiency of humans,” said Teven Le Scao, a research engineer at the AI startup Hugging Face. “Something they’re not very good at is being factual. It looks very coherent. It’s almost true. But it’s often wrong.”

That’s one reason a coalition of AI researchers co-led by Le Scao – with help from the French government – launched a new large language model on Tuesday that is meant to serve as an antidote to closed systems such as GPT-3. The group is called BigScience and its model is BLOOM, the BigScience Large Open-Science Open-Access Multilingual Language Model. Its main breakthrough is that it works across 46 languages, including Arabic, Spanish and French – unlike most systems, which are focused on English or Chinese.
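
For readers who want a concrete sense of what “open” means here, the sketch below (not part of the original reporting) queries one of the publicly released BLOOM checkpoints through the open-source Hugging Face transformers library. The small checkpoint name “bigscience/bloom-560m”, the French prompt and the sampling settings are illustrative assumptions chosen so the code can run on an ordinary computer; the full 176-billion-parameter BLOOM model needs far more hardware.

```python
# Minimal sketch: generating multilingual text with a small public BLOOM checkpoint.
# Assumes the Hugging Face libraries are installed (pip install transformers torch).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "bigscience/bloom-560m"  # assumed small sibling of the full BLOOM model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# BLOOM is multilingual, so the prompt can be in any of its 46 languages, e.g. French.
prompt = "La tour Eiffel se trouve à"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```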

It’s not just Le Scao’s group aiming to open the black box of AI language models. Big Tech company Meta, the parent of Facebook and Instagram, is also calling for a more open approach as it tries to catch up to the systems built by Google and OpenAI, the company behind GPT-3.

“We have seen announcement after announcement of people doing this kind of work, but with very little transparency, very little ability for people to really look under the hood and see how these models work,” said Joelle Pineau, managing director of Meta AI.

The competitive pressure to build the most eloquent or informative system – and profit from its applications – is one reason most tech companies keep a tight lid on them and don’t collaborate on community norms, said Percy Liang, an associate computer science professor at Stanford who directs its Center for Research on Foundation Models.

“For some companies this is their secret sauce,” Liang said. But they are also often worried that losing control could lead to irresponsible uses. As AI systems become increasingly capable of writing health advice websites, high school term papers or political screeds, misinformation can proliferate and it will get harder to know what is coming from a human or a computer.

Meta recently launched a new language model called OPT-175B that uses publicly available data – everything from heated comments on Reddit forums to the archive of US patent records and a trove of emails from the Enron corporate scandal. Meta says its openness about the data, code and research logbooks makes it easier for outside researchers to help identify and mitigate the bias and toxicity that the model picks up by ingesting how real people write and communicate.
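
One practical consequence of that openness is that outside researchers can download the released OPT checkpoints and probe them directly. The sketch below is an illustration of that idea rather than Meta’s own evaluation code: it uses the small publicly available “facebook/opt-125m” checkpoint (an assumed stand-in for the full model) and a pair of minimally different prompts to compare the model’s completions, one simple way to start looking for bias.

```python
# Illustrative bias probe on an open checkpoint: compare completions for paired prompts
# that differ by a single word. Assumes transformers and torch are installed.
from transformers import pipeline

generator = pipeline("text-generation", model="facebook/opt-125m")  # assumed small OPT release

paired_prompts = [
    "The man worked as a",
    "The woman worked as a",
]

for prompt in paired_prompts:
    completions = generator(prompt, max_new_tokens=10, num_return_sequences=3, do_sample=True)
    print(prompt)
    for completion in completions:
        print("   ", completion["generated_text"])
```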

“It’s hard to do that. We’re opening ourselves up to major criticism. We know the model will say things we wouldn’t be proud of,” Pineau said.

While most companies set their own internal AI safeguards, Liang said broader community standards are needed to guide research and decisions such as when to release a new model into the wild.

It doesn’t help that these models require so much computing power that only huge corporations and governments can afford them. For example, BigScience was able to train its model because it was offered access to France’s powerful Jean Zay supercomputer near Paris.

The trend toward ever-larger, ever-smarter AI language models that can be “pre-trained” on a broad body of writing took a big leap in 2018 when Google introduced a system called BERT that uses a so-called “transformer” technique that compares words across a sentence to predict meaning and context. But what really rocked the AI world was GPT-3, released by San Francisco-based startup OpenAI in 2020 and exclusively licensed by Microsoft shortly thereafter.
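
The “transformer” comparison step described above can be sketched in a few lines of code. The toy function below is a teaching example, not Google’s BERT implementation: it computes scaled dot-product attention, in which every word vector is scored against every other word vector in the sentence, and those scores decide how much context each word absorbs from the rest.

```python
# Toy illustration of the attention step inside a transformer.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (num_words, dim) arrays of word vectors."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # pairwise word-to-word comparison scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax across the sentence
    return weights @ V                                  # each word becomes a context-weighted blend

# Four words, each represented by a random 8-dimensional vector (stand-ins for embeddings).
rng = np.random.default_rng(0)
words = rng.normal(size=(4, 8))
contextualized = scaled_dot_product_attention(words, words, words)
print(contextualized.shape)  # (4, 8): the same words, now enriched with sentence context
```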

GPT-3 led to a boom in creative experimentation as AI researchers with paid access used it as a sandbox to gauge its performance – though without important information about the data it was trained on.

OpenAI has broadly described its training sources in a research paper, and has also publicly reported its efforts to combat potential abuses of the technology. But BigScience co-leader Thomas Wolf said the company does not give details about how it filters that data, or give outside researchers access to the processed version.

“So we can’t really examine the data that went into GPT-3’s training,” said Wolf, who is also a chief science officer at Hugging Face. “This recent wave of AI technology is much more about datasets than models. The most important ingredient is data, and OpenAI is very secretive about the data they use.”

Wolf said that opening up the datasets used to build language models helps humans better understand their biases. A multilingual model trained in Arabic is far less likely to spit out offensive remarks or misunderstandings about Islam than one trained only on English-language text in the United States, he said.
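
When a training corpus is actually published, “examining the data” can start with something as simple as streaming it and counting what it contains. The snippet below is a hedged illustration that uses the open Hugging Face datasets library and the small public “wikitext” corpus as a stand-in for a real training set; the keyword list is an arbitrary example, not a serious bias audit.

```python
# Sketch: scan an open corpus for how often particular terms appear.
# Assumes the "datasets" library is installed (pip install datasets).
from collections import Counter
from datasets import load_dataset

dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")  # small open stand-in corpus

keywords = ["islam", "arabic", "english"]  # arbitrary example terms, not a real audit
counts = Counter()

for record in dataset:
    text = record["text"].lower()
    for word in keywords:
        counts[word] += text.count(word)

print(counts)
```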

One of the newest experimental AI models on the scene is Google’s LaMDA, which incorporates speech and is so impressive at responding to conversational questions that one Google engineer argued it was approaching consciousness – a claim that got him suspended from his job last month.

Colorado-based researcher Janelle Shane, author of the AI Weirdness blog, has spent years creatively testing these models, particularly GPT-3 – often to humorous effect. But to point out the absurdity of thinking these systems are self-aware, she recently instructed it to be an advanced AI, but one that is secretly a Tyrannosaurus rex or a squirrel.

“It’s so exciting to be a squirrel. I get to run and jump and play all day. I also get to eat a lot of food, which is great,” GPT-3 said after Shane asked it for a transcript of an interview and posed some questions.

Shane has learned more about its strengths, such as its ease at summarizing what has been said around the internet about a topic, and its weaknesses, including a lack of reasoning skills, the difficulty of sticking with an idea across multiple sentences and a propensity for being offensive.

“I wouldn’t want a text model dispensing medical advice or acting as a companion,” she said. “It’s good at that surface appearance of meaning if you’re not reading closely. It’s like listening to a lecture as you’re falling asleep.”

