Anthropic and NVIDIA Bring AI Agents Deeper into Scientific Workflows

Anthropic is moving further into scientific research with Claude Science, a new AI workbench for scientists that integrates research tools, produces auditable artifacts, and connects to specialized life sciences models and workflows from NVIDIA.

The beta release, announced June 30, gives Claude Pro, Max, Team, and Enterprise users access to an application designed to help researchers work across literature review, data analysis, figure generation, manuscript refinement, and computational workflows. Anthropic said Claude Science is available on macOS and Linux, and can run locally, on remote machines over SSH, or through a high-performance computing login node.

The launch is part of a broader push by AI companies to turn general-purpose assistants into domain-specific workbenches for professional users. In life sciences, that means moving beyond chat-based summarization toward agents that can query databases, write and run code, inspect outputs, preserve research history, and connect to scientific tools already used by labs.

Anthropic said Claude Science brings fragmented scientific tools into a single research environment. The app can work with tools such as PubMed, Jupyter, R, cluster terminals, and domain-specific scientific databases, while preserving an auditable history of how outputs were produced.

Users interact with a generalist coordinating agent that has access to more than 60 curated skills and connectors configured for genomics, single-cell analysis, proteomics, structural biology, cheminformatics, and other research areas. The system can also use specialist agents created by users, and includes a reviewer agent that checks citations and calculations, flagging and correcting errors, according to Anthropic.

A central claim in the launch is reproducibility. When Claude Science generates a figure, it includes the code and environment used to create it, a plain-language description of the process, and the message history leading up to the output. The company said that history is intended to make results easier to validate and reproduce later.
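
As a rough illustration of what such a bundle could look like, the Python sketch below groups the four pieces named above (code, environment, description, and message history) into one record and derives a fingerprint from them. It is a hypothetical example, not Anthropic's format: the `FigureProvenance` class, its field names, and the `capture_environment` helper are all assumptions made for illustration.

```python
import hashlib
import json
import platform
import sys
from dataclasses import asdict, dataclass, field


@dataclass
class FigureProvenance:
    """Hypothetical record of what accompanies a generated figure."""
    code: str            # source that produced the figure
    environment: dict    # interpreter and platform details
    description: str     # plain-language summary of the process
    messages: list = field(default_factory=list)  # conversation leading to the output

    def fingerprint(self) -> str:
        # Serialize with sorted keys so identical records always hash identically.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def capture_environment() -> dict:
    # Assumed minimal environment snapshot; a real tool would likely record more.
    return {"python": sys.version.split()[0], "platform": platform.platform()}


record = FigureProvenance(
    code="ax.plot(x, y)",
    environment=capture_environment(),
    description="Line plot of y against x.",
    messages=["user: plot y vs x", "assistant: generated figure"],
)
print(record.fingerprint())
```

Under these assumptions, a reviewer who re-runs the stored code in the recorded environment can compare fingerprints to confirm that nothing in the record changed between the original run and the audit.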

That focus is likely to matter for researchers and enterprise AI buyers. Scientific AI tools may be judged less by whether they can produce plausible answers and more by whether their work can be traced, challenged, repeated, and incorporated into established review processes.

NVIDIA’s role comes through its BioNeMo Agent Toolkit, which the chipmaker announced June 23. Anthropic said Claude Science uses BioNeMo Agent Toolkit skills to connect to life sciences models and libraries in BioNeMo, including Evo 2, Boltz-2, and OpenFold3.

NVIDIA describes BioNeMo Agent Toolkit as a set of domain-specific tools and skills for agentic life sciences workflows. The toolkit includes NVIDIA life sciences libraries, tools, and open models, and is designed to help agents gather evidence, reason across findings, run computational experiments, and recommend next steps.

“Frontier models are the brains. BioNeMo is the scientific toolbox,” Jensen Huang, NVIDIA’s founder and CEO, said in the company’s announcement. “Together, they give AI agents the skills of a PhD research assistant and the speed of a supercomputer.”

The toolkit is available through NVIDIA developer resources and GitHub. The GitHub repository describes BioNeMo Agent Toolkit as packaging tools for protein folding, molecular docking, generative chemistry, genomics analysis, protein design, and biomarker discovery into ready-to-call agent skills.

NVIDIA said a broad set of companies and research groups are using or integrating the toolkit. The company named Anthropic, OpenAI, Edison Scientific, Lila Sciences, and Owkin among frontier labs and scientific agent builders integrating with BioNeMo. It also cited scientific data and workflow platforms including Benchling, Certara, Databricks, Snowflake, and Seqera.

The list shows how quickly the market for scientific AI agents is becoming an ecosystem story. AI model developers, infrastructure providers, data platforms, lab automation vendors, drug discovery software companies, and research institutions are all trying to define the role of agents in scientific work.

The practical question is not whether AI will replace scientists. The more immediate issue is whether agentic systems can become reliable parts of research operations. That requires more than model capability. It requires provenance, permissions, reproducible code, access controls, review agents, human oversight, and integration with trusted tools.

Anthropic emphasized that Claude Science can run on a lab’s existing infrastructure, including laptops, Linux machines, or HPC login nodes. It added that large or sensitive datasets can remain on the systems where they already reside, with only the context needed for each analysis step sent to Claude.

The company also said the system asks before reaching new resources, and lets users review or revoke decisions before it writes and submits jobs to computing resources. Those controls may help address concerns about scientific agents acting too autonomously in environments involving proprietary data, regulated workflows, or expensive compute resources.

The release also illustrates how AI agents are becoming more specialized. A general chatbot can summarize papers or help write code. A scientific agent workbench is expected to understand research tools, call domain models, preserve experimental context, generate figures, inspect citations, and operate inside a lab’s computational environment.

NVIDIA framed the same shift from the infrastructure side. It said BioNeMo Agent Toolkit gives agents the context and know-how to execute scientific computing, including preparing inputs, launching reproducible workflows, analyzing outputs, and returning insights inside platforms scientists and data teams already use.

The opportunity is clear: agentic systems could reduce the time scientists spend moving between databases, scripts, compute clusters, and analysis tools. The risk is also clear: if the systems make mistakes, misread data, produce untraceable outputs, or overstate conclusions, they could add another layer of complexity to already difficult research workflows.

That makes auditability a key part of the story. Anthropic’s emphasis on traceable figures, code, environments, and reviewer agents suggests that scientific AI products are being shaped by the credibility requirements of research, not only by the productivity claims common in enterprise AI.

Claude Science is still in beta, and Anthropic said it will refine the platform as it collects user feedback. The company is also supporting up to 50 Claude Science AI for Science projects with up to $30,000 in credits, with Modal providing up to $2,000 in compute for select projects.

The broader trend is that AI agents are moving into domains where correctness, traceability, and workflow integration matter as much as speed. In life sciences, that raises the stakes. A useful scientific agent must not only generate an answer, but also show how it got there, what data it used, what code it ran, and where a human expert should remain in control.

For now, Claude Science and BioNeMo represent an early test of whether agentic AI can move from general productivity into specialized scientific work. The promise is faster analysis. The standard of proof will be reproducible results.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].
