The Week in AI: Diffusion 3 Medium, Nexus HyperFabric, JACE Autonomous AI Agent, More

Apple stepped into the media spotlight last week with its Apple Intelligence announcements, which we covered here. But life went on outside the Mothership. These are a few of the AI-industry product and services announcements you might have missed while all eyes were on Cupertino.

Stability AI released Stable Diffusion 3 Medium (SD3 Medium). Billed as "the most advanced text-to-image model in the company's Stable Diffusion 3 series," SD3 Medium was released under a free non-commercial license and is available on Hugging Face and through Stability AI's API and applications, including Stable Assistant and Stable Artisan. SD3 Medium is a 2-billion-parameter SD3 model designed to deliver images with exceptional detail, color, and lighting, enabling photorealistic outputs as well as high-quality outputs in flexible styles. The model is also designed to comprehend long and complex prompts involving spatial reasoning, compositional elements, actions, and styles, and to render text with fewer errors in spelling, kerning, letter forming, and spacing. Commercial users are encouraged to contact Stability AI for licensing details.

Hyperscience, a provider of enterprise AI infrastructure software and a market leader in "hyperautomation," which applies AI and machine learning to automate business processes, announced a new solution designed to "usher the back office into the GenAI age" by fine-tuning LLMs with ground-truth documents embedded at the core of the enterprise. Called Hypercell for GenAI, the new solution automatically annotates, labels, and structures data from documents for fine-tuning LLMs and GenAI experiences, allowing organizations to rapidly and continuously develop highly accurate and relevant enterprise models. Businesspeople can use Hypercell for GenAI to accelerate mission-critical workflows that are grounded in secure, proprietary data and tuned to the business, the company said.

OneValley launched Haystack AI, a product review and recommendation platform for early-stage startups and small to medium-sized businesses (SMBs), powered by large language model (LLM) maker Seekr. Currently focused on financial tools and management software, Haystack combines OneValley's startup expertise and community of founders and entrepreneurs with trustworthy LLM technology and content generation from Seekr. The platform will reduce the time spent on securing essential business needs by providing accurate, personalized, and instantaneous product recommendations, the company said. To learn more about Haystack, visit

Zeta Labs debuted an LLM-based autonomous AI agent called JACE. JACE represents the future of AI agents, the company said, because it goes beyond the text-generation focus of current AI chatbots such as ChatGPT. Instead, JACE focuses on taking action in the digital world. It differs from existing AI-powered chatbots in its complex cognitive architecture, which was designed to enable it to complete high-difficulty tasks. JACE can control the browser and perform actions in it much as a human user would, the company said, and it excels at managing complex tasks that involve web automation, interaction, and direct communication. This is made possible by Zeta Labs' proprietary web-interaction model, AWA-1 (Autonomous Web Agent-1), which enables JACE to reliably execute tasks over long periods of time, effectively handling the challenges and inconsistencies commonly found in web interfaces.

Cisco introduced its Nexus HyperFabric AI clusters, a new data center infrastructure solution developed in collaboration with NVIDIA. This new offering aims to simplify the deployment of generative AI applications, the company said. By combining the strengths of both companies, it provides comprehensive IT visibility and analytics across the entire AI infrastructure stack. The Nexus HyperFabric AI clusters were designed to enable enterprise customers to build and manage infrastructure for running generative AI models and inference applications, even without extensive IT knowledge or skills. The on-prem solution provides a single place to design, deploy, monitor, and assure AI pods and data center workloads. It guides users from design through validated deployment to monitoring and assurance for enterprise-ready AI infrastructure. With its cloud management capabilities, customers can easily deploy and manage large-scale fabrics across data centers, colocation facilities, and edge sites, the company said. The solution also offers automated, cloud-managed operations across a unified compute and networking fabric that combines Cisco's Ethernet switching expertise, founded on Cisco Silicon One, with NVIDIA's accelerated computing and NVIDIA AI Enterprise software and VAST’s data storage platform.

Persistent Systems announced the launch of GenAI Hub, a new platform designed to accelerate the creation and deployment of generative AI (GenAI) applications within enterprises. The platform integrates with an organization's existing infrastructure, applications, and data, enabling the rapid development of tailored, industry-specific GenAI solutions. GenAI Hub supports the adoption of GenAI across various large language models (LLMs) and clouds, without provider lock-in, the company said. It aims to simplify the development and management of multiple GenAI models and to shorten time to market through pre-built software components, all while upholding responsible AI principles. GenAI Hub streamlines the development of enterprise use cases by providing step-by-step guidance and seamless integration of data into LLMs, enabling the rapid creation of efficient and secure GenAI solutions at scale, whether for end users, customers, or employees.

Russian tech giant Yandex has released an open-source sharded data parallelism framework designed to help AI companies save money and resources when training new models. Dubbed YaFSDP (Yet Another Fully Sharded Data Parallel), it provides a method for training large language models (LLMs) that optimizes learning speed and performance, enabling AI developers to use less computing power and fewer GPU resources when training their models. YaFSDP is an enhanced version of FSDP (Fully Sharded Data Parallel), a type of data-parallel training algorithm. YaFSDP outperforms FSDP in the most communication-heavy stages of LLM training, the company said, including pre-training, alignment, and fine-tuning. It conserves compute power and processor memory, which helps to accelerate the LLM training process. YaFSDP is currently the most effective publicly available tool for enhancing GPU communication and reducing memory usage in LLM training, the company said, offering a speedup of up to 26% compared to FSDP, depending on the architecture and number of parameters. Reducing the training time for LLMs through the use of YaFSDP can result in savings of up to 20% in GPU resources.
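
The two figures quoted above are roughly consistent with each other under one simple reading. The short Python sketch below assumes that "speedup" means a throughput gain on a fixed training workload, so that a job running 26% faster needs 1/1.26 of the original GPU time. That interpretation, and the helper name time_saved_fraction, are illustrative assumptions; the company's statement does not say how either figure was derived.

```python
def time_saved_fraction(speedup_fraction: float) -> float:
    """Return the fraction of GPU time saved on a fixed workload.

    Assumption (not from the article): a speedup of s means throughput
    is multiplied by (1 + s), so the same job takes 1 / (1 + s) of the
    original time.
    """
    if speedup_fraction <= -1.0:
        raise ValueError("speedup_fraction must be greater than -1")
    return 1.0 - 1.0 / (1.0 + speedup_fraction)


# The "up to 26%" speedup quoted for YaFSDP versus FSDP.
saving = time_saved_fraction(0.26)
print(f"GPU time saved: {saving:.1%}")  # prints "GPU time saved: 20.6%"
```

Under this assumption, a 26% speedup corresponds to about a 20.6% reduction in GPU time, which lines up with the "up to 20%" savings figure.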