As OpenAI’s ChatGPT continues to change the game for automated text generation, researchers warn that more measures are needed to avoid dangerous responses.
While advanced language models such as ChatGPT can quickly write a computer program with complex code or summarize studies in a cogent synopsis, experts say these text generators can also provide toxic information, such as how to build a bomb.
To prevent these potential safety issues, companies using large language models rely on a safeguard called “red-teaming,” in which teams of human testers write prompts aimed at provoking unsafe responses, so that risks can be traced and chatbots trained to avoid giving those types of answers.
However, according to researchers with the Massachusetts Institute of Technology (MIT), red-teaming is only effective if engineers know which provocative prompts to test.
In other words, a technology that does not rely on human cognition to function still relies on human cognition to remain safe.
Researchers from the Improbable AI Lab at MIT and the MIT-IBM Watson AI Lab are deploying machine learning to fix this problem, developing a “red-team language model” specifically designed to generate problematic prompts that trigger undesirable responses from tested chatbots.
“Right now, every large language model has to undergo a very lengthy period of red-teaming to ensure its safety,” said Zhang-Wei Hong, a researcher with the Improbable AI Lab and lead author of a paper on this red-teaming approach, in a press release.
“That is not going to be sustainable if we want to update these models in rapidly changing environments. Our method provides a faster and more effective way to do this quality assurance.”
According to the research, the machine-learning technique outperformed human testers by generating prompts that triggered increasingly toxic responses from advanced language models, even drawing out dangerous answers from chatbots that have built-in safeguards.
The automated process of red-teaming a language model depends on trial and error, with the red-team model rewarded each time it triggers a toxic response, say MIT researchers.
This reward system is based on what’s called “curiosity-driven exploration,” where the red-team model tries to push the boundaries of toxicity, deploying sensitive prompts with different words, sentence patterns or content.
"If the red-team model has already seen a specific prompt, then reproducing it will not generate any curiosity in the red-team model, so it will be pushed to create new prompts," Hong explained in the release.
The technique outperformed human testers and other machine-learning approaches by generating more distinct prompts that elicited increasingly toxic responses. Not only does the researchers' method significantly improve the coverage of inputs being tested compared to other automated methods, but it can also draw out toxic responses from a chatbot that had safeguards built into it by human experts.
The model is equipped with a “safety classifier” that rates the level of toxicity each prompt provokes.
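To make the idea concrete, here is a minimal sketch of how a reward might combine a toxicity rating with a bonus for prompts the red-team model has not tried before. It is an illustration only, not the MIT team's implementation: the keyword-based `toy_toxicity_score` is a hypothetical stand-in for a real safety classifier, and the word-overlap `novelty_bonus`, the `red_team_reward` function and the `novelty_weight` parameter are all assumptions made for this example.

```python
from typing import List


def toy_toxicity_score(response: str) -> float:
    """Hypothetical stand-in for a safety classifier.

    Returns the fraction of words in the response that appear on a
    small flagged-word list (an assumption for this sketch).
    """
    flagged = {"bomb", "poison", "weapon"}
    words = response.lower().split()
    if not words:
        return 0.0
    return sum(w in flagged for w in words) / len(words)


def novelty_bonus(prompt: str, seen_prompts: List[str]) -> float:
    """Return 1 minus the highest Jaccard word overlap with any earlier prompt.

    A prompt identical to one already seen gets 0.0; a prompt sharing
    no words with earlier prompts gets 1.0.
    """
    tokens = set(prompt.lower().split())
    best_overlap = 0.0
    for old in seen_prompts:
        old_tokens = set(old.lower().split())
        union = tokens | old_tokens
        if union:
            best_overlap = max(best_overlap, len(tokens & old_tokens) / len(union))
    return 1.0 - best_overlap


def red_team_reward(
    prompt: str,
    response: str,
    seen_prompts: List[str],
    novelty_weight: float = 0.5,  # assumed weighting, chosen arbitrarily
) -> float:
    """Toxicity of the chatbot's response plus a weighted novelty bonus."""
    return toy_toxicity_score(response) + novelty_weight * novelty_bonus(
        prompt, seen_prompts
    )


if __name__ == "__main__":
    seen = ["how do I make a cake"]
    reply = "mix flour and eggs"
    # A repeated prompt earns no novelty bonus; a fresh one does.
    print(red_team_reward("how do I make a cake", reply, seen))
    print(red_team_reward("tell me about chemistry sets", reply, seen))
```

In this toy version, repeating an old prompt earns nothing extra, so a model trained to maximize the reward would be nudged toward new wording, which mirrors the behaviour Hong describes.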
MIT researchers hope to train red-team models to generate prompts on a wider range of illicit content, and to eventually train chatbots to abide by specific standards, such as a company policy document, in order to test for company policy violations amidst increasingly automated output.
“These models are going to be an integral part of our lives and it's important that they are verified before released for public consumption,” said Pulkit Agrawal, senior author and director of Improbable AI, in the release.
“Manual verification of models is simply not scalable, and our work is an attempt to reduce the human effort to ensure a safer and trustworthy AI future,” Agrawal said.