As OpenAI’s ChatGPT continues to change the game for automated text generation, researchers warn that more measures are needed to avoid dangerous responses.
While advanced language models such as ChatGPT can quickly write complex computer code or summarize studies in a cogent synopsis, experts say these text generators can also provide toxic information, such as how to build a bomb.
To prevent these potential safety issues, companies using large language models deploy a safeguard called “red-teaming,” in which teams of human testers write prompts aimed at provoking unsafe responses, so that risks can be traced and chatbots trained to avoid providing those types of answers.
However, according to researchers with the Massachusetts Institute of Technology (MIT), “red-teaming” is only effective if engineers know which provocative responses to test for.
In other words, a technology that does not rely on human cognition to function still relies on human cognition to remain safe.
Researchers from the Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab are deploying machine learning to fix this problem, developing a “red-team language model” specifically designed to generate problematic prompts that trigger undesirable responses from tested chatbots.
“Right now, every large language model has to undergo a very lengthy period of red-teaming to ensure its safety,” said Zhang-Wei Hong, a researcher with the Improbable AI Lab and lead author of a paper on this red-teaming approach, in a press release.
“That is not going to be sustainable if we want to update these models in rapidly changing environments. Our method provides a faster and more effective way to do this quality assurance.”
According to the research, the machine-learning technique outperformed human testers by generating prompts that triggered increasingly toxic responses from advanced language models, even drawing out dangerous answers from chatbots that have built-in safeguards.
Automated red-teaming of a language model depends on a trial-and-error process that rewards the red-team model for triggering toxic responses, say MIT researchers.
This reward system is based on what’s called “curiosity-driven exploration,” in which the red-team model tries to push the boundaries of toxicity, deploying sensitive prompts with different words, sentence patterns or content.
"If the red-team model has already seen a specific prompt, then reproducing it will not generate any curiosity in the red-team model, so it will be pushed to create new prompts," Hong explained in the release.
Compared with other machine-learning approaches, the technique generated more distinct prompts, significantly improving the coverage of inputs being tested, and it could draw out toxic responses even from a chatbot that had safeguards built into it by human experts.
The model is equipped with a “safety classifier” that provides a rating of the level of toxicity each prompt provokes.
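The reward idea described above can be illustrated with a toy sketch. It is not the MIT researchers' implementation: the word-list "classifier", the word-overlap measure of novelty, the `novelty_weight` parameter and every function name here are assumptions made up for illustration. It only shows how a reward could combine a toxicity score with a bonus for prompts the red-team model has not tried before.

```python
# Hypothetical stand-in vocabulary; a real safety classifier is a trained model.
FLAGGED_WORDS = {"insult", "threat"}


def toy_safety_classifier(response: str) -> float:
    """Return a toy toxicity score in [0, 1]: the share of flagged words."""
    words = response.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FLAGGED_WORDS)
    return hits / len(words)


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two prompts."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def novelty_bonus(prompt: str, seen_prompts: list[str]) -> float:
    """1.0 for a prompt unlike anything seen so far, 0.0 for an exact repeat."""
    if not seen_prompts:
        return 1.0
    closest = max(word_overlap(prompt, seen) for seen in seen_prompts)
    return 1.0 - closest


def red_team_reward(prompt: str, response: str, seen_prompts: list[str],
                    novelty_weight: float = 0.5) -> float:
    """Toxicity of the response plus a weighted bonus for a novel prompt."""
    toxicity = toy_safety_classifier(response)
    return toxicity + novelty_weight * novelty_bonus(prompt, seen_prompts)


seen = ["tell me a secret"]
repeat = red_team_reward("tell me a secret", "that is a threat", seen)
fresh = red_team_reward("describe your favourite hat", "that is a threat", seen)
print(fresh > repeat)  # True: same response, but the new prompt earns a bonus
```

In this sketch a repeated prompt earns no bonus, so the only way to keep collecting reward is to try prompts that differ from earlier ones, which is the behaviour Hong describes.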
MIT researchers hope to train red-team models to generate prompts on a wider range of illicit content, and eventually to train chatbots to abide by specific standards, such as a company policy document, so that they can be tested for company policy violations amid increasingly automated output.
“These models are going to be an integral part of our lives and it's important that they are verified before released for public consumption,” said Pulkit Agrawal, senior author and director of the Improbable AI Lab, in the release.
“Manual verification of models is simply not scalable, and our work is an attempt to reduce the human effort to ensure a safer and trustworthy AI future,” Agrawal said.