The Week in AI: Claude 3 vs. GPT-4, Google AutoBNN, OpenAI-Microsoft Supercomputer, More -- Pure AI

By Pure AI Editors
04/01/2024

This edition of our weekly roundup of AI product and services news includes the Claude 3 Opus defeat of GPT-4 in the Chatbot Arena, Google AI's release of AutoBNN, OpenAI and Microsoft's supercomputer project, KnowBe4's AIDA release, Nametag's ID verification platform, Accenture and Adobe's latest collaboration, and more.

For the first time, Anthropic's Claude 3 Opus large language model (LLM) surpassed OpenAI's GPT-4 on Chatbot Arena, a popular crowdsourced leaderboard used by AI researchers to gauge the relative capabilities of AI language models. Sometimes called a "Turing test on steroids," the leaderboard was launched in May of last year by Large Model Systems Organization (LMSYS Org), an open research organization founded by students and faculty from UC Berkeley in collaboration with UCSD and CMU to provide "a benchmark platform for LLMs that features anonymous, randomized battles in a crowdsourced manner." Variations of GPT-4 have held the top spot in the Arena since it was established, so its defeat is a significant milestone in the relatively short history of LLMs.
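
For readers curious how head-to-head votes can become a ranking, here is a minimal sketch of an Elo-style rating update, a common technique for pairwise comparisons. It illustrates the general idea only and is not a description of LMSYS Org's exact method; the function names, the starting ratings, and the K-factor of 32 are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one battle.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie.
    k (assumed value) controls how far a single result moves the ratings.
    """
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    # B's change is the mirror image of A's, so total rating is conserved.
    new_b = rating_b - k * (score_a - expected_a)
    return new_a, new_b


# Two hypothetical models start level; model A wins one anonymous battle.
model_a, model_b = update_ratings(1000.0, 1000.0, score_a=1.0)
print(model_a, model_b)
```

Repeated over many crowdsourced votes, updates like this push stronger models up the board, which is how a newcomer can overtake a long-standing leader.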

OpenAI may be collaborating with Microsoft on a $100 billion data center project built around an AI supercomputer dubbed "Stargate." The project, first reported in The Information and then picked up by Reuters, is scheduled to debut in 2028. Microsoft would likely finance the project, which is expected to be 100 times more costly than some of the biggest existing data centers, The Information reported, citing people involved in private conversations about the proposal. The proposed U.S.-based supercomputer would be the biggest in a series the companies are looking to build over the next six years, the report added.

Google AI researchers released AutoBNN, a new open-source package aimed at the challenge of effectively modeling time series data for forecasting purposes. Written in JAX, a Python library for accelerator-oriented array computation and program transformation, AutoBNN automates the discovery of interpretable time series forecasting models, provides high-quality uncertainty estimates, and scales for use on large datasets. AutoBNN combines the interpretability of traditional probabilistic approaches with the scalability and flexibility of neural networks, the researchers said in a blog post. "AutoBNN is based on a line of research that over the past decade has yielded improved predictive accuracy by modeling time series using GPs with learned kernel structures," they wrote.
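
To give a sense of what "learned kernel structures" refers to, the sketch below shows the general idea of building a time series model out of simple, interpretable kernels (here a linear trend and a repeating seasonal pattern) that are combined by addition or multiplication. This is plain Python written for illustration; it does not use or reproduce the AutoBNN API, and every function name and parameter value in it is hypothetical.

```python
import math


def linear_kernel(x1, x2, variance=1.0):
    """Covariance that grows with time: captures a linear trend."""
    return variance * x1 * x2


def periodic_kernel(x1, x2, period=12.0, lengthscale=1.0):
    """Covariance that repeats every `period` steps: captures seasonality."""
    s = math.sin(math.pi * abs(x1 - x2) / period)
    return math.exp(-2.0 * s * s / lengthscale ** 2)


def add(k1, k2):
    """Sum of two kernels: the series is modeled as one effect plus another."""
    return lambda x1, x2: k1(x1, x2) + k2(x1, x2)


def multiply(k1, k2):
    """Product of two kernels: one effect modulates the other."""
    return lambda x1, x2: k1(x1, x2) * k2(x1, x2)


# A candidate structure: linear trend plus a 12-step seasonal cycle.
trend_plus_seasonal = add(linear_kernel, periodic_kernel)

# Another candidate: a seasonal cycle whose amplitude grows over time.
growing_seasonal = multiply(linear_kernel, periodic_kernel)

print(trend_plus_seasonal(3.0, 3.0))
```

Searching over combinations like these is what makes the resulting model readable: each term corresponds to a recognizable pattern such as a trend or a seasonal cycle.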

Nametag Inc., an identity verification platform startup, announced the launch of a new self-service account recovery solution designed to prevent AI-generated deepfake attacks. Dubbed "Nametag Autopilot," the new service aims to prevent social engineering and impersonation attacks. Instead of security questions, one-time passcodes, and authenticator apps, Nametag Autopilot employs self-service verification to shut down threats such as digital injection attacks, presentation attacks, social engineering, push fatigue, credential stuffing, and the use of AI-generated deepfakes. The solution will save organizations millions by deflecting time-consuming password and multifactor authentication resets to self-service, the company says.

Robotic Assistance Devices (RAD), a subsidiary of Artificial Intelligence Technology Solutions, Inc., announced the availability of "SelectBlur" to qualifying law enforcement agencies at no cost. The Windows desktop app was designed to allow users to selectively blur faces in video footage to ensure the privacy of individuals who are not the focus of the surveillance. The offer comes on the heels of widespread news stories of the Murrieta Police Department using Lego images to obscure faces in booking photos and videos, the company says.

Wolken Software, a provider of customer service solutions for B2B companies, announced the general availability of Wolken Gen AI, a solution for streamlining the management of customer queries and improving customer experiences. Wolken Gen AI, which was designed to be integrated into nearly any service management platform, automates many processes that are traditionally conducted manually, such as creating customized responses to user queries by identifying relevant data across multiple sources and generating a unique answer for each one. Wolken Gen AI is the first offering within the company’s new Wolken AI suite.

KnowBe4, a security awareness training and simulated phishing platform, unveiled "Artificial Intelligence Defense Agents" (AIDA), designed to enable "long-term culture change and human risk reduction." Demonstrated at KnowBe4's KB4-CON, AIDA enables organizations to automate the dynamic selection of security awareness training and testing to give users a more individualized learning experience based on their specific needs. This streamlines the process of generating individual responses for agents and increases the speed of responses from customers or internal stakeholders, the company said. “At KnowBe4, we have been using AI for nearly six years and we are researching and enabling it in everything we do to ultimately help our customers to better protect their organizations,” said KnowBe4 CEO Stu Sjouwerman in a statement.

Accenture and Adobe are set to co-develop industry-specific solutions using Adobe Firefly, Adobe’s family of generative AI models, to help organizations create personalized content at scale and accelerate the transformation of their content supply chains. Accenture will integrate Adobe Firefly Custom Models into marketing services offered by Accenture Song, the companies said in a statement, to provide clients with "the industry-specific insights required to train bespoke models on their proprietary data and brand guidelines." Firefly is also accessible via APIs through Firefly Services, as well as through Adobe Creative Cloud and Experience Cloud applications. "By generating content that aligns with their brand style and design language, marketers can build templatized campaigns that can be continually refined based on performance data and impact," the companies said. "This iterative approach streamlines the content creation process and reduces the need for manual adjustments."