The Week in AI: Meta LLM Compiler, Nvidia Nemotron-4 340B, DeepKeep's GenAI Risk Assessment Tool, More

This week's roundup of AI products and services includes Nvidia's new family of open models for synthetic data generation, Meta's new Large Language Model Compiler, DataStax's upgrades to its Langflow and RAGStack dev tools, and more!

Nvidia announced the release of Nemotron-4 340B, a new family of open models designed to generate synthetic data for training large language models (LLMs) across industries including healthcare, finance, manufacturing, and retail. The suite aims to provide the high-quality training data essential to the performance and accuracy of custom LLMs, data that can otherwise be costly and difficult to obtain. The Nemotron-4 340B family includes base, instruct, and reward models optimized for use with Nvidia's NeMo open-source framework for end-to-end model training, and the models are also designed for efficient inference with the open-source NVIDIA TensorRT-LLM library. They can be downloaded from the NVIDIA NGC catalog and Hugging Face, and will soon be available as part of NVIDIA NIM microservices. The new models help developers create synthetic training data, particularly in scenarios where large, diverse labeled datasets are scarce. Developers can fine-tune the models using the NVIDIA NeMo framework and optimize them for inference with TensorRT-LLM, leveraging tensor parallelism for efficient large-scale operations.
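The generate-then-filter workflow behind synthetic data pipelines like this can be sketched in a few lines. In this minimal sketch the `generate_responses` and `score_response` functions are stand-ins: in a real deployment they would call the Nemotron-4 340B Instruct and Reward models served via NeMo or TensorRT-LLM endpoints, which is far beyond what a toy script can do.

```python
# Sketch of an instruct-plus-reward synthetic data loop: an instruct model
# drafts candidate training examples and a reward model filters them.
# Both model calls are stubbed so the sketch runs without GPUs or weights.

def generate_responses(prompt: str, n: int = 4) -> list[str]:
    # Stand-in: a real pipeline would sample n completions from the
    # instruct model for this prompt.
    return [f"{prompt} -> candidate answer {i}" for i in range(n)]

def score_response(prompt: str, response: str) -> float:
    # Stand-in: a reward model would score attributes such as helpfulness
    # and correctness; here we fake a score from the response length.
    return 1.0 / (1 + abs(len(response) - len(prompt) - 25))

def build_synthetic_dataset(prompts: list[str], threshold: float = 0.0) -> list[dict]:
    dataset = []
    for prompt in prompts:
        candidates = generate_responses(prompt)
        # Keep only the highest-scoring candidate per prompt.
        best = max(candidates, key=lambda r: score_response(prompt, r))
        if score_response(prompt, best) >= threshold:
            dataset.append({"prompt": prompt, "response": best})
    return dataset

data = build_synthetic_dataset(["Summarize the claims process", "Explain HIPAA basics"])
print(len(data))  # one filtered example per prompt
```

Raising `threshold` trades dataset size for quality, which is the central knob in reward-filtered synthetic data generation.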

Meta AI unveiled the Large Language Model Compiler (LLM Compiler), a suite of pre-trained models aimed at advancing code optimization tasks. The release addresses the resource-intensive demands of training LLMs by providing robust, openly available models specifically designed for compiler optimization, the company said. Building on the Code Llama foundation, the LLM Compiler was designed to enhance understanding of compiler intermediate representations (IRs), assembly language, and optimization techniques. It was trained on a dataset of 546 billion tokens of LLVM-IR and assembly code, and fine-tuned to interpret compiler behavior, emulate optimization passes, and disassemble x86_64 and ARM assembly back into LLVM-IR, the company said. It is available in two sizes, 7 billion and 13 billion parameters, and was released under a bespoke commercial license to facilitate widespread reuse. Meta's LLM Compiler aims to provide a scalable, cost-effective foundation for further research and development in compiler optimization, benefiting both academic researchers and industry practitioners.
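As a rough sketch of how such a model might be queried, the snippet below frames the optimize-this-IR task and stubs out the model call. The prompt shape here is an assumption for illustration, not Meta's documented format, and a real setup would load the published weights via Hugging Face transformers rather than the `run_model` stand-in; consult the model card for the actual prompt template.

```python
# Sketch: asking an LLM-Compiler-style model to optimize a snippet of
# LLVM-IR. The model call is stubbed so the sketch runs anywhere.

LLVM_IR = """\
define i32 @square(i32 %x) {
entry:
  %mul = mul i32 %x, %x
  ret i32 %mul
}
"""

def build_prompt(ir: str, passes: str = "-O2") -> str:
    # Frame the task as described in the announcement: given IR and an
    # optimization level, ask for the optimized IR. (Illustrative format.)
    return f"Optimize the following LLVM-IR with {passes}:\n{ir}"

def run_model(prompt: str) -> str:
    # Stand-in for model.generate(); echoes the prompt so the sketch stays
    # runnable without GPUs or model weights.
    return prompt

output = run_model(build_prompt(LLVM_IR))
print(output.splitlines()[0])
```

The disassembly task described in the article would be framed the same way, with x86_64 or ARM assembly as the input and LLVM-IR requested as the output.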

DataStax unveiled upgraded versions of its AI development tools Langflow and RAGStack, designed to streamline the creation of AI apps. The updated Langflow, an open-source tool for developing AI applications, now features AI memory management enhancements, facilitating the processing of text prompts in applications like chatbots. The new version also supports the creation of multimodal AI applications, handling both text and image inputs. To boost productivity, Langflow 1.0 introduces new project examples and a preview feature for evaluating AI output accuracy. A free cloud version is also available, enabling developers to test the tool without local installation. RAGStack, a suite of tools for building AI apps with retrieval-augmented generation (RAG) capabilities, has been updated with a new Knowledge Graph RAG feature, which lets AI models store information as graphs, making it easier to detect patterns. The update also adds support for the specialized models ColBERT and Text2SQL, enhancing search tasks and enabling natural language queries on SQL databases.
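The idea behind graph-based RAG is simple to illustrate: store facts as edges between entities, then pull the neighborhood of the entities mentioned in a query as context for the LLM. This toy sketch is not DataStax's API, just the underlying pattern, with entity and relation names invented for the example.

```python
# Toy illustration of knowledge-graph retrieval for RAG (not DataStax's
# API): facts are subject -> (relation, object) edges, and retrieval
# walks the neighborhood of an entity mentioned in the query.

from collections import defaultdict

graph = defaultdict(list)

def add_fact(subj: str, rel: str, obj: str) -> None:
    # Store the edge in both directions so retrieval can start from
    # either endpoint.
    graph[subj].append((rel, obj))
    graph[obj].append((f"inverse:{rel}", subj))

def retrieve_context(entity: str, hops: int = 1) -> list[tuple]:
    # Collect facts within `hops` edges of the entity to feed the LLM.
    seen, frontier, facts = {entity}, [entity], []
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for rel, other in graph[node]:
                facts.append((node, rel, other))
                if other not in seen:
                    seen.add(other)
                    next_frontier.append(other)
        frontier = next_frontier
    return facts

add_fact("Langflow", "developed_by", "DataStax")
add_fact("RAGStack", "developed_by", "DataStax")
print(retrieve_context("DataStax"))
```

Compared with flat vector search, walking edges like this surfaces multi-hop relationships (everything DataStax develops, in this example) that a single embedding lookup can miss.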

DeepKeep, provider of an AI-native trust, risk, and security management platform, announced the launch of its GenAI Risk Assessment module. This new product aims to enhance the security of GenAI language and computer vision models by focusing on penetration testing and identifying potential vulnerabilities and threats to model security, trustworthiness, and privacy. The module assesses and mitigates AI model and application vulnerabilities to ensure compliance, fairness, and ethical standards. By considering risks associated with model deployment and identifying weak spots, DeepKeep offers a comprehensive ecosystem approach. The module works alongside DeepKeep's AI Firewall to provide live protection against attacks on AI applications. Its detection capabilities span a wide range of security and safety categories, utilizing DeepKeep's proprietary technology and research.

Cognizant launched a suite of healthcare solutions utilizing Google Cloud's generative AI technology, including the Vertex AI platform and Gemini models. Part of an expanded partnership announced last August aimed at enhancing healthcare administrative processes and patient experiences, these AI solutions were designed to streamline administrative workflows, the company said, accelerating operations and improving the quality of care delivered. The new solutions target four high-cost workflows: marketing operations, call center operations, provider management, and contracting. A list of key examples includes an Appeals Resolution Assistant, which automates data retrieval and analysis for healthcare appeals management; a Contract Management Solution, which optimizes the contract lifecycle by automating review and generation processes; a Marketing Content Assistant, which automates content creation for healthcare payer marketing teams; and a Health Plan Shopper, which integrates individual preferences and clinical profiles to help members make informed healthcare plan choices.

New Relic announced the integration of its observability platform with NVIDIA NIM inference microservices. The integration will simplify the development, deployment, and monitoring of generative AI applications, the company said, reducing complexity and costs while enhancing data security for AI apps built with NIM. The integrated platform allows New Relic's AI monitoring system to offer comprehensive insights across the AI stack, centralizing data from more than 60 AI integrations. It enables a broad view of the AI infrastructure, including key metrics on throughput, latency, and costs, ensuring efficient and reliable operations, the company said. Key features of the integration include full AI stack visibility, deep trace insights for every response, model inventory management, model performance comparison, and enhanced GPU insights. The platform also ensures enhanced data security by allowing the exclusion of sensitive data from monitoring. The integrated platform supports a wide range of AI models, including those from Databricks, Google, Meta, Microsoft, Mistral, and Snowflake. This support helps organizations deploy AI applications confidently, improve time-to-market, and achieve faster ROI, the company said.