OpenAI is strengthening its internal safety measures to address the potential dangers of AI. A new safety advisory group will oversee the technical teams and make recommendations to the leadership, with the board now having veto power. This update follows recent leadership changes and growing discussions about AI risks.
OpenAI’s updated “Preparedness Framework” aims to identify and manage the catastrophic risks of AI models, defined as risks causing massive economic damage or severe harm to many people, including existential threats. The framework involves different teams for in-production models (safety systems team) and frontier models in development (preparedness team). There’s also a superalignment team focusing on theoretical guidelines for superintelligent models.
Models are evaluated on four risk categories: cybersecurity, persuasion (such as disinformation), model autonomy, and CBRN (chemical, biological, radiological, and nuclear) threats. Mitigations are applied to high-risk models, such as refusing to describe how to make dangerous substances. If a model still poses a high risk after mitigations, it won’t be deployed; if it poses a critical risk, it won’t be developed further.
OpenAI’s approach includes a cross-functional Safety Advisory Group to review and make recommendations, aiming to uncover potential unknown risks. Decisions to deploy models will be made by the leadership, but the board can reverse these decisions.
The new structure aims to prevent high-risk products or processes from being approved without the board’s knowledge, following recent internal upheavals. However, questions remain about the board’s willingness to challenge decisions and OpenAI’s transparency in handling critical risks.
Trusted by leading startups and Fortune 500 companies.
Building an AI product is hard. Engineers who understand AI are expensive and hard to find. And there’s no way of telling who’s legit and who’s not.
That’s why companies around the world trust AE Studio. We help you craft and implement the optimal AI solution for your business with our team of world-class AI experts from Harvard, Princeton, and Stanford.
Our development, design, and data science teams work closely with founders and executives to create custom software and AI solutions that get the job done for a fraction of the cost.
p.s. Wondering how OpenAI DevDay impacts your business? Let’s talk!
Former Pakistani Prime Minister Imran Khan, currently imprisoned and facing charges of leaking classified documents, used artificial intelligence to deliver a speech at a virtual rally organized by his party, Pakistan Tehreek-e-Insaf (PTI). The rally, aimed at overcoming state censorship and reaching supporters, was broadcast on social media platforms including Facebook, X, and YouTube. Khan’s speech was created using an AI voice clone developed by ElevenLabs, based on a script he provided through his lawyers.
This innovative approach allowed Khan to address his supporters despite being in jail. The speech, which blended AI-generated audio with historical footage of Khan, was part of a five-hour livestream that included talks by PTI supporters. The broadcast, viewed by over 4.5 million people, faced internet disruptions attributed to censorship efforts against Khan.
The speech received mixed reactions, with some appreciating the use of technology, while others found it less convincing than a live speech. Khan’s PTI, known for its social media savvy, continues to engage its predominantly young audience through these platforms.
This event raises questions about the potential uses and implications of AI in political communication, especially in contexts of state suppression and censorship. Khan remains a key figure in PTI, despite being replaced as its leader while in prison. Pakistan is set to hold general elections on February 8th, with Khan’s participation and influence still significant.
Data poisoning is a tactic used to disrupt the performance of text-to-image generators like Midjourney or DALL-E. These generators are trained on large datasets containing millions of images. While some, like those from Adobe or Getty, use images they own or license, others scrape online images indiscriminately, leading to copyright infringement issues.
“Nightshade” is a tool created to combat unauthorised image scraping. It subtly alters an image’s pixels in a way that is imperceptible to humans but confuses AI models. When these altered images are used in training data, the AI’s ability to accurately classify and generate images is compromised, leading to unpredictable results, such as generating an egg instead of a balloon, or introducing odd features like six-legged dogs.
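The underlying idea can be sketched in a few lines: nudge each pixel by a tiny bounded amount so the change is invisible to a human viewer but alters the values a model trains on. The toy below uses random noise for simplicity; a real tool like Nightshade instead optimises the perturbation against a model’s feature extractor, so treat this purely as an illustration of “imperceptible pixel edits”, not of Nightshade’s actual algorithm.

```python
import numpy as np

def poison_image(pixels: np.ndarray, epsilon: float = 2.0, seed: int = 0) -> np.ndarray:
    """Toy 'poisoning': add a small bounded perturbation to uint8 RGB pixels.

    epsilon is the maximum per-channel change. At ~2/255 the edit is
    imperceptible to a human, yet every training example now differs
    from the original image the model was meant to learn from.
    (Illustrative only: Nightshade computes targeted, not random, noise.)
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-epsilon, epsilon, size=pixels.shape)
    poisoned = np.clip(pixels.astype(np.float64) + noise, 0, 255)
    return poisoned.astype(np.uint8)

# A flat grey image: after poisoning, no channel moves by more than epsilon.
image = np.full((64, 64, 3), 128, dtype=np.uint8)
poisoned = poison_image(image)
```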
The concept of data poisoning is broader than just image generation. It relates to adversarial approaches in AI, like using makeup and costumes to avoid facial recognition systems. These tactics are part of a larger conversation about technological governance and the moral rights of artists and users. While technology vendors might view data poisoning as a problem to be solved, it can also be seen as a creative response to privacy and copyright concerns. Solutions proposed include better scrutiny of data sources, ensemble modeling to detect outliers, and using test datasets to audit model accuracy.
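One of the proposed defences, detecting outliers in the training data, can also be sketched simply: poisoned samples often sit far from the clean distribution in feature space, so flagging points unusually distant from the dataset centroid catches crude attacks. This is a minimal stand-in for the ensemble and auditing approaches mentioned above, assuming you already have a feature vector per image.

```python
import numpy as np

def flag_outliers(features: np.ndarray, z_threshold: float = 3.0) -> np.ndarray:
    """Flag feature vectors suspiciously far from the dataset centroid.

    features: (n_samples, n_features) array of per-image embeddings.
    Returns a boolean mask marking samples whose distance from the
    centroid is more than z_threshold standard deviations above the mean.
    """
    centroid = features.mean(axis=0)
    dists = np.linalg.norm(features - centroid, axis=1)
    z_scores = (dists - dists.mean()) / (dists.std() + 1e-12)
    return z_scores > z_threshold

# 100 clean samples near the origin plus one far-away (suspect) sample.
rng = np.random.default_rng(42)
clean = rng.normal(0.0, 1.0, size=(100, 8))
suspect = np.full((1, 8), 50.0)
mask = flag_outliers(np.vstack([clean, suspect]))
```

Real defences are more involved (cross-checking against multiple models, auditing accuracy on held-out test sets), but the shape of the check is the same: measure how far each training point sits from what the rest of the data predicts.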
How to use Model Switcher in ChatGPT (and find Hidden Models!)
Hope you enjoyed today’s newsletter.
⚡️ Join over 200,000 people using the Superpower ChatGPT extension on Chrome and Firefox.