NetApp and NVIDIA: Revolutionary AI Products at a Revolutionary Time

🚀 Introduction: A New Chapter in AI and Data

I spoke with George Kurian, CEO of NetApp, and together we announced a milestone that represents a fundamental shift in how enterprises will store, process, and reason over their data. In the video I released on the NVIDIA channel, I described how our decades of work in reinventing computing have arrived at a moment when data, not just models or chips, becomes the central element for the next wave of AI. I am writing this as a report of that announcement and as an explanation of why the new NetApp AFX family and the NetApp AI Data Engine, integrated with the NVIDIA AI Data Platform, are far more than incremental improvements: they are a reinvention of data processing for agentic, multimodal AI.

This announcement builds on a long history between our two companies. NetApp and NVIDIA have collaborated since 2019 on solutions that pair NVIDIA DGX systems and software stacks with NetApp storage architectures. We have already delivered value to hundreds of customers worldwide, and today we elevate that partnership to a strategic platform that addresses the complex demands of modern AI, where most of the world's data is unstructured, multimodal, and enormous in scale.

🔍 Why This Matters: Data Is the Food of AI

One simple truth drives this work: data is the food of AI. Agents learn from data, and they operate on data. The richer and more accessible that data becomes, the smarter and more capable agents will be. We have entered an era where agentic AI (systems that take actions autonomously and reason about data) relies on a fundamentally different approach to data storage, indexing, discovery, and governance.

I often say that we've reinvented computing: from chips to systems to software and models. Today, we must similarly reinvent data processing. The classic, centralized, protocol-driven storage and retrieval models are not equipped to deliver the latency, scale, and semantic understanding required by modern AI. This is the motivation behind the work we announced with NetApp: to integrate highly scaled, composable, disaggregated storage architecture with near-data compute, and to combine it with AI-first data software that embeds, catalogs, enriches, and serves data to models efficiently and securely.

💡 Announcement Summary: NetApp AFX + NetApp AI Data Engine with NVIDIA AI Data Platform

At the core of the announcement are two closely integrated innovations:

  • NetApp AFX family: a high-performance, composable, disaggregated storage architecture designed for AI-scale workloads. It brings near-data compute to storage so that data transformation, embedding, and active metadata generation can happen without unnecessary data movement.
  • NetApp AI Data Engine: a suite of software capabilities that leverages the NVIDIA AI Data Platform reference design to provide a full AI data lifecycle: discovery, cataloging, semantic representation, guardrails, lineage, transformation, and serving to LLMs and other models.

Together, these technologies let organizations scale datasets into exabyte-class pools with namespaces spanning hundreds of petabytes, while preserving high performance for data scientists and predictable, manageable operations for IT teams.

🧭 Background: The Journey from DGX-1 to AI Data Platforms

NetApp and NVIDIA's relationship goes back to the days of the DGX-1, which was one of the first purpose-built systems for AI. That history matters because every breakthrough we've achieved in accelerated computing has depended on co-design across layers: chips, systems, software, and applications. The same principle applies to data. The AI revolution demands that storage and compute be reimagined to serve semantic workloads at scale.

Historically, enterprises treated knowledge as structured data: rows and columns residing in relational databases where SQL and classic indices delivered answers. That model is insufficient for today's AI-centric workloads. Enterprise knowledge now predominantly lives in unstructured forms: documents, PDFs, images, video, audio, health records, chemical structures, and more. To unlock that knowledge, we need semantic representations (embeddings) and a different paradigm of indexing and querying. I said it plainly in the announcement: indexing is no longer about hash tables, B-trees, or inverted indices; indexing now means vectorized representations and nearest neighbor search delivered by neural indexing layers.

⚙️ Technical Deep Dive: What "Semantic, AI Semantic Data Processing" Really Means

Let me unpack what I mean by semantic data processing. There are several interrelated components (a brief code sketch follows the list):

  • Embeddings and semantic representation: We convert raw data across modalities (text, audio, image, video, structured records, molecular graphs, etc.) into vectors that capture meaning. These embeddings turn heterogeneous data into a unified representation space that models can search and reason over.
  • Vectorized indexing: Traditional indices are built around exact keys and sorted orders. For semantic search, we need indices optimized for nearest neighbor queries. These indices are often implemented with techniques such as Approximate Nearest Neighbor (ANN) search, product quantization, HNSW graphs, or learned indexes. The key distinction is that these indices operate on high-dimensional vectors instead of scalar keys.
  • AI query: Querying semantic indices is inherently different. You don't query by exact match; you query by similarity, intent, and context. That necessitates query planners and retrieval systems that understand embeddings, multimodal fusion, and prompt-aware retrieval strategies.
  • Multimodal fusion: The engine must index and search across text, audio, images, video, and domain-specific modalities (chemical fingerprints, genomics, etc.) in a single semantic space or via coordinated multi-space approaches.
  • Active metadata and transform-in-place: Instead of copying massive datasets for each pipeline, we bring compute to the data to produce embeddings, enrich metadata, and maintain versioned lineage, all while minimizing duplication and preserving governance.
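
To make the first two ideas concrete, here is a minimal sketch of semantic indexing and nearest-neighbor search. The embed() function is a deterministic stand-in I invented for illustration; a real deployment would call a trained embedding model and use an ANN index such as HNSW instead of the brute-force scan shown here.

```python
import hashlib

import numpy as np

DIM = 64

def embed(text: str) -> np.ndarray:
    """Toy stand-in for an embedding model: hash the text into a unit vector.

    Deterministic but NOT semantically meaningful; real systems would call
    a trained text/image/audio encoder here.
    """
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(DIM)
    return v / np.linalg.norm(v)

# Build a tiny vector index over a toy corpus.
corpus = {
    "assay-001": "assay results for candidate molecule A",
    "memo-007": "quarterly revenue summary for the finance team",
    "scan-042": "imaging study of candidate molecule A",
}
index = {doc_id: embed(text) for doc_id, text in corpus.items()}

def search(query: str, k: int = 2) -> list[tuple[str, float]]:
    """Brute-force nearest-neighbor search by cosine similarity."""
    q = embed(query)
    scores = {doc_id: float(q @ v) for doc_id, v in index.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

print(search("experiments on molecule A"))
```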

The NetApp AFX hardware provides the disaggregated, composable platform that hosts the storage namespaces and the near-data compute needed to run these operations at scale. The NetApp AI Data Engine supplies the software layer that makes that raw capability consumable: findability, discoverability, lineage, and guardrails for enterprise adoption.

🔗 How It Works: Composable Disaggregated Architecture and Near-Data Compute

When I describe the AFX as "composable" and "disaggregated," I mean a practical, operational architecture where storage and compute are decoupled and can be assembled into logical fabrics on demand. This design yields several crucial benefits:

  • Elasticity: You can scale storage capacity and compute independently. Need more GPUs for training? Add compute. Need more capacity for archives or exabytes of video? Expand capacity without touching GPU clusters.
  • Performance at scale: Composable fabrics avoid the bottleneck that arises when compute must pull copies of data across network boundaries. With near-data compute, you perform expensive operations (embedding generation, feature extraction, video frame processing) right where the data sits.
  • Zero-copy transformation: Zero-copy is the promise of avoiding unnecessary data movement. When transformations happen in place, cost and latency drop dramatically. For AI, where datasets are large and multimodal, zero-copy transforms are the difference between feasible and infeasible pipelines.
  • Active metadata: Metadata becomes a live, queryable artifact that evolves as models process data. Active metadata stores embeddings, the model versions used for embedding, quality metrics, lineage, and governance flags so that downstream systems can make informed, auditable decisions about data use (a sketch of such a record follows this list).

For the first time, NVIDIA GPUs will be used extensively inside file systems and storage nodes, enabling GPU-accelerated processing to run as close to the data as possible. This is not just about raw throughput; it's about changing how organizations think about pipelines and lifecycle management for AI datasets.

🧠 Vector Databases, Indexing, and AI Queries

I want to emphasize the shift from relational thinking to vector thinking. A vector database is not merely a container; it is an execution environment for similarity search. Here are the principles that matter:

  • Embedding consistency: Embeddings are model-dependent. If you embed your data with one model and later change the embedding model, similarity relationships can change. You must track which embeddings were produced by which models and maintain consistency across training and retrieval pipelines.
  • Index evolution: As datasets grow, you need indices that are maintainable and updateable without rebuilding entire structures. This requires architectures that support incremental updates, partitioning, and hierarchical indexing strategies.
  • Query enrichment: AI queries can be augmented by metadata filters, temporal constraints, provenance checks, and multimodal fusion. A query might ask for semantically similar content filtered by region of origin, creation date, or the model version used for embedding.
  • Hybrid retrieval: Often the best results come from hybrid strategies that combine sparse retrieval (keyword-based) with dense retrieval (vector-based). The platform should seamlessly integrate both approaches; a small sketch follows this list.
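
As a hedged illustration of hybrid retrieval and query enrichment, the sketch below blends a keyword-overlap score with a cosine score and applies a metadata filter. The 50/50 weighting and the toy scoring functions are assumptions; production systems would use BM25-style sparse scoring and a tuned fusion strategy.

```python
import numpy as np

rng = np.random.default_rng(7)

def unit(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

# Toy documents: dense vector + raw text + metadata for filtering.
docs = [
    {"id": "d1", "vec": unit(rng.standard_normal(16)),
     "text": "contract negotiation log for supplier X", "region": "EU"},
    {"id": "d2", "vec": unit(rng.standard_normal(16)),
     "text": "supplier X pricing contract draft", "region": "US"},
]

def keyword_score(query: str, text: str) -> float:
    """Sparse score: fraction of query terms present (BM25 stand-in)."""
    q_terms = set(query.lower().split())
    return len(q_terms & set(text.lower().split())) / max(len(q_terms), 1)

def hybrid_search(query: str, q_vec: np.ndarray, alpha: float = 0.5,
                  region: str | None = None) -> list[tuple[float, str]]:
    results = []
    for doc in docs:
        if region is not None and doc["region"] != region:  # metadata filter
            continue
        dense = float(q_vec @ doc["vec"])           # cosine on unit vectors
        sparse = keyword_score(query, doc["text"])
        results.append((alpha * dense + (1 - alpha) * sparse, doc["id"]))
    return sorted(results, reverse=True)

print(hybrid_search("supplier X contract", unit(rng.standard_normal(16)), region="EU"))
```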

NetApp AI Data Engine is designed to convert enterprise data into a semantic, unified representation while maintaining the operational controls IT teams require. The engine catalogs data, creates embeddings, and manages the indices; because it's built to run wherever NVIDIA GPUs are available, it supports edge, on-premises, and cloud deployments.

🔐 Governance, Security, and Lineage: Enterprise-Grade Controls

Enterprises will not hand over their data to ungoverned systems. They require strict compliance, auditability, and lineage tracking. That is why the integration we announced addresses these needs explicitly:

  • Guardrails and policies: The AI Data Engine includes mechanisms to define and enforce policies about what data can be used for model training, what data can be served to external agents, and how access is logged and monitored (a small guardrail sketch follows this list).
  • Lineage and version control: Every embedding, transformation, and index change is recorded. You can trace which raw object contributed to a training dataset and which embedding model was used. That becomes critical for reproducibility, debugging, and regulatory inspections.
  • Provenance for models: Knowing the pedigree of training data helps ensure model quality. You can flag low-quality sources, exclude certain records, or preferentially weight data from vetted repositories.
  • Secure data processing: By enabling near-data compute and minimizing data movement, we reduce exposure and increase the feasibility of processing sensitive data in tightly controlled environments.
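
To show how such guardrails might be enforced in code, here is a hedged sketch. The policy names, flags, and audit format are hypothetical, invented for illustration rather than drawn from the AI Data Engine's actual interface.

```python
# Hypothetical guardrail check run before data is used for training or
# served to an agent. Policy names and flags are illustrative only.
POLICIES = {
    "training":      {"deny_flags": {"pii", "export-restricted"}},
    "agent-serving": {"deny_flags": {"pii"}},
}

def allowed(purpose: str, governance_flags: set[str], audit_log: list[dict]) -> bool:
    """Return True if no deny flag matches; always record the decision."""
    denied = POLICIES[purpose]["deny_flags"] & governance_flags
    audit_log.append({
        "purpose": purpose,
        "flags": sorted(governance_flags),
        "decision": "deny" if denied else "allow",
    })
    return not denied

log: list[dict] = []
print(allowed("training", {"pii"}, log))          # False: PII blocked from training
print(allowed("agent-serving", {"vetted"}, log))  # True
print(log)
```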

In the announcement I reiterated the importance of these capabilities: our combined technologies extract value from data while keeping it compliant with enterprise risk management and security policies. Customers will see tools to track which objects have been embedded, which embeddings are out of date, and which require reprocessing due to model upgrades.

๐Ÿฅ Real-World Use Cases Already Delivering Value

We are not speaking in abstractions; our customers are already seeing meaningful results. Two examples I highlighted are particularly illustrative:

Johnson & Johnson โ€” Accelerating Drug Discovery

Johnson & Johnson aggregates vast amounts of research, experimental data, and clinical data across hybrid environments. They need to combine on-prem resources with cloud compute to accelerate drug discovery workflows. By deploying a fabric that spans on-premises NetApp AFX pools and cloud-based NVIDIA compute, they can index large multimodal datasets (molecular structures, assay results, imaging data) and enable teams to retrieve semantically similar experiments and candidate molecules quickly. That accelerates hypothesis generation and reduces the time from idea to experiment.

Yale School of Medicine โ€” Computational Health

At Yale School of Medicine, researchers require the ability to combine disparate modalities: clinical records, imaging, genomic data, and research datasets. Privacy and governance are paramount. Our integrated solution allows Yale to unify these datasets semantically while maintaining strict lineage, cohort controls, and audit logs, enabling computational health research that adheres to institutional policies and accelerates translational research.

These examples are early signals; the full potential of agentic AI operating on enterprise knowledge bases will be unlocked across every industry: manufacturing, finance, healthcare, defense, and media. Wherever high-value decisions depend on connecting multimodal evidence, the combination of NetApp AFX and the NVIDIA AI Data Platform will become a foundational capability.

🧩 Deployment Flexibility: Run Anywhere, Scale Everywhere

One of the promises we made is that the AI Data Engine is software-defined and extremely portable. It can run wherever NVIDIA GPUs are available: on-premises, in private or hybrid clouds, and across public cloud providers. The NetApp AFX family provides appliances for on-prem deployments and can be composed into OEM configurations. From an enterprise perspective, this flexibility reduces vendor lock-in and ensures that you can place workloads where they make the most sense operationally and financially.

Hardware support ranges from compact RTX Pro servers to our largest Blackwell-based systems. The same software-defined stack runs on these platforms, providing consistent semantics for discovery, embeddings, and indexing regardless of the deployment footprint.

📈 Benefits for Data Scientists and IT Teams

This integrated platform addresses distinct but complementary needs for two groups in every organization:

Data Scientists

  • Faster iteration: Near-data compute and zero-copy transforms reduce the time required to generate training datasets and to iterate on retrieval strategies.
  • Better data quality: Active metadata and lineage make it easier to select high-quality training data and to detect drift in embeddings or data sources.
  • Unified semantic search: Data scientists can query across modalities and across hybrid topologies with AI-aware retrieval primitives, which improves recall and relevance for research and experimentation.

IT and Infrastructure Teams

  • Operational familiarity: NetApp's platform provides familiar storage management abstractions even as it introduces AI-native capabilities, enabling IT to deploy and manage at enterprise scale without reinventing processes.
  • Scalability and efficiency: Composable infrastructures allow resource allocation to match workloads and avoid costly overprovisioning.
  • Security and compliance: Integrated governance and auditability reduce enterprise risk and support compliance with regulatory mandates.

When these two groups work together on a platform that supports both AI-native workflows and enterprise controls, organizations move faster and with greater confidence.

📚 The New Unified Data Model: From Files to Knowledge

The unified data model I described during the announcement is not a single data format. Instead, it is a conceptual shift: a way of treating all data as potential inputs to semantic reasoning. Whether the data is a PDF, an image, an audio transcript, a chemical descriptor, or a database record, the first step is to create a semantic representation that allows it to be discovered, compared, and retrieved.

This unified representation doesn't erase the differences between modalities. Rather, it provides a common substrate for retrieval. The AI Data Engine organizes data into namespaces and channels that reflect both the physical storage topology and semantic domains. For example, a legal namespace might contain contract PDFs, annotated transcripts, and negotiation logs; a research namespace might contain lab notebooks, imaging, and assay outputs. Each namespace supports its own policies, lifecycle rules, and embedding strategies, all while maintaining a global discovery layer that enables cross-domain queries when allowed.
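
To make the namespace idea tangible, here is a hedged sketch of what per-namespace configuration could look like. The keys and values are my own illustration, not an actual NetApp AFX or AI Data Engine schema.

```python
# Illustrative per-namespace configuration: physical placement plus
# semantic policies for each domain. Not an actual product schema.
namespaces = {
    "legal": {
        "storage_pool": "afx-pool-1",
        "modalities": ["pdf", "transcript", "negotiation-log"],
        "embedding_model": "text-encoder-v2",
        "retention_days": 3650,
        "cross_domain_queries": False,  # keep legal discovery domain-local
    },
    "research": {
        "storage_pool": "afx-pool-2",
        "modalities": ["notebook", "image", "assay"],
        "embedding_model": "multimodal-encoder-v1",
        "retention_days": 1825,
        "cross_domain_queries": True,   # discoverable from the global layer
    },
}

# The global discovery layer would consult these flags before routing
# a cross-domain query into a namespace.
searchable = [name for name, cfg in namespaces.items() if cfg["cross_domain_queries"]]
print(searchable)  # ['research']
```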

🔄 Embedding Lifecycle and Versioning

Embedding lifecycle management is a critical operational concern. Embeddings evolve as models improve. Re-embedding an entire exabyte-scale dataset is impractical unless you have intelligently designed pipelines. Our approach includes:

  • Selective re-embedding: Use metadata and quality metrics to decide which items must be re-embedded after a model upgrade and which can be deferred (sketched in code below).
  • Incremental indexing: Update indices incrementally to avoid full rebuilds.
  • Model tagging: Track which model produced which embedding so that retrieval can be consistent and reproducible.
  • Hybrid strategies: Use a combination of dense and sparse features to provide robust retrieval even during transitions.
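
Here is a minimal sketch of selective re-embedding driven by model tagging, assuming a catalog of active-metadata records like the one shown earlier. The quality threshold and field names are illustrative assumptions.

```python
# Decide which catalog items need re-embedding after a model upgrade.
# Items embedded by the current model with adequate quality are deferred.
CURRENT_MODEL = "text-encoder-v3"
QUALITY_FLOOR = 0.8  # illustrative threshold

def needs_reembedding(item: dict) -> bool:
    return (item["embedding_model"] != CURRENT_MODEL
            or item["quality_score"] < QUALITY_FLOOR)

catalog = [
    {"id": "a", "embedding_model": "text-encoder-v2", "quality_score": 0.95},
    {"id": "b", "embedding_model": "text-encoder-v3", "quality_score": 0.99},
    {"id": "c", "embedding_model": "text-encoder-v3", "quality_score": 0.60},
]

queue = [item["id"] for item in catalog if needs_reembedding(item)]
print(queue)  # ['a', 'c']; 'b' is current and high quality, so it is deferred
```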

These practices reduce operational cost and ensure that model retraining and inference pipelines are robust and auditable.

🛠️ Integration with the NVIDIA AI Data Platform

The NVIDIA AI Data Platform is a collection of reference architectures, optimized libraries, orchestration patterns, and model-serving tools designed to accelerate AI development and deployment. When integrated with NetApp's storage innovations, the platform provides a comprehensive, production-ready stack for enterprise AI:

  • Data ingestion and preprocessing: GPU-accelerated pipelines extract frames, transcode video, parse documents, and create embeddings.
  • Cataloging and discovery: Active metadata stored alongside files surfaces context and relevance for automated retrieval.
  • Model training and fine-tuning: Scalable NVIDIA compute clusters train models on large datasets with optimized network and storage topologies.
  • Serving and inference: Low-latency retrieval and model-serving stacks deliver responses to conversational agents or downstream applications.

The combined platform is designed to be modular and interoperable. Customers can adopt pieces of the stack incrementally or deploy the full integrated solution depending on their maturity and priorities.

🧭 From Search to Conversational Agents: New Ways of Asking Questions

Search has always been about finding files. Now, agents let you ask questions and receive synthesized answers drawn from a corpus of enterprise knowledge. I explained in the announcement that "in the future, you just ask the AI." The NetApp AI Data Engine enables that by providing:

  • Contextual retrieval: Return not just documents but passages, records, or multimodal evidence ranked by relevance and provenance.
  • Conversational interfaces: Enable natural language queries that trigger complex retrieval plans and multi-step reasoning over multiple data sources.
  • Explainability: Provide the sources and lineage for an agent's answer so that users can evaluate confidence and compliance implications (see the sketch after this list).
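
Here is a hedged sketch of what an explainable, provenance-carrying answer could look like as a data structure; the shape is my own illustration, not the engine's actual response format.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    source_id: str  # object the passage came from
    passage: str    # the retrieved span
    score: float    # retrieval relevance
    lineage: str    # e.g., which raw object and embedding model produced it

@dataclass
class AgentAnswer:
    question: str
    answer: str
    evidence: list[Evidence]  # every claim is traceable to its sources

answer = AgentAnswer(
    question="Which experiments used candidate molecule A?",
    answer="Two experiments reference molecule A: assay-001 and scan-042.",
    evidence=[
        Evidence("assay-001", "assay results for candidate molecule A",
                 0.91, "raw/scan-0042.tif via text-encoder-v2"),
        Evidence("scan-042", "imaging study of candidate molecule A",
                 0.87, "raw/img-0042.dcm via multimodal-encoder-v1"),
    ],
)
for ev in answer.evidence:
    print(ev.source_id, ev.score, ev.lineage)
```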

As agents become more capable, they will perform tasks that once required skilled human operators: synthesizing reports, doing literature reviews, proposing experimental designs, and surfacing compliance risks. The quality of those outcomes depends on the data fabric beneath the agent.

📦 The Commercial and Strategic Implications

This partnership is a strategic inflection point for both companies. For NVIDIA, embedding GPUs into the storage tier opens a new class of opportunities. For NetApp, offering AI-native storage with near-data compute positions the company to be central to enterprise AI deployments.

From a market perspective, we expect several effects:

  • Acceleration of AI adoption: Lowering the operational burden of dataset preparation and governance reduces time-to-value for organizations experimenting with AI.
  • Consolidation of data fabrics: Organizations will favor architectures that allow seamless movement between cloud and on-prem while preserving semantic indexes and policies.
  • New workflows: The ability to run GPU-accelerated transforms in storage will enable previously infeasible pipelines, particularly for video and other large multimodal datasets.
  • Vendor ecosystems: This co-innovation sets a template for other suppliers: the future is software-defined, composable, and GPU-aware at the storage layer.

🔧 How Customers Can Get Started

If your organization is contemplating the next step in AI readiness, here is a practical roadmap based on what we are recommending to customers:

  1. Inventory and classify data: Understand where your high-value data lives and what modalities matter most. PDFs, video, and domain-specific records often deliver disproportionate value when made semantically searchable.
  2. Define governance and policies: Establish clear rules about data usage, access, and lineage. These rules should be embedded into the discovery and indexing processes.
  3. Pilot with bounded workloads: Start with a narrow use case where the ROI is clear, such as legal discovery, R&D retrieval, or clinical research. Deploy NetApp AFX and the AI Data Engine in a contained environment to validate throughput and latency.
  4. Iterate embedding strategies: Evaluate embedding models for each modality and measure retrieval quality. Use model tagging and selective re-embedding strategies to control cost.
  5. Scale incrementally: Expand namespaces and indices gradually, ensure operational playbooks are in place, and add near-data compute where it provides the greatest benefit.
  6. Measure business outcomes: Track time saved, experiments accelerated, or revenue enabled by better retrieval and agentic capabilities.

This pragmatic approach helps organizations avoid the pitfalls of grand redesigns and instead align technical investments with measurable outcomes.

📣 Closing Remarks: A Major Reinvention of Storage

I have long believed that AI represents the biggest industrial revolution in computing since the invention of the microprocessor. Today, we stand at another turning point: the reinvention of storage. With NetApp AFX and the NetApp AI Data Engine, we are changing how data is indexed, processed, and served, making it possible for agents to "talk" to enterprise data with semantic understanding across hybrid clouds.

My thanks to George and the entire NetApp team for a years-long partnership that has culminated in this announcement. This is a huge day for both companies, for our customers, and for the industry at large. We have the technology to transform search into reasoning at scale, and to do so while preserving enterprise controls that matter to CIOs, compliance officers, and researchers.

We have only begun to scratch the surface of what agentic AI will enable when it has access to well-governed, semantically indexed enterprise knowledge. I am optimistic and excited about the kinds of innovations that will spring from this capability.

โ“ Frequently Asked Questions

What are the NetApp AFX family and the NetApp AI Data Engine?

The NetApp AFX family is a composable, disaggregated storage architecture designed to deliver extreme performance and near-data compute. The NetApp AI Data Engine is a suite of software capabilities that integrates with the NVIDIA AI Data Platform to provide discovery, cataloging, semantic representation (embeddings), governance, lineage, and serving for AI workloads. Together, they enable semantic indexing and efficient retrieval across hybrid cloud environments.

Why do embeddings matter and how are they managed?

Embeddings convert heterogeneous data into vector representations that capture semantic meaning. They are crucial for similarity search and AI-assisted retrieval. Because embeddings are model-dependent, the platform tracks which model generated each embedding, records embedding versions, and supports selective re-embedding and incremental index updates to manage lifecycle and cost.

What does near-data compute mean and why is it important?

Near-data compute means running processing tasks (such as embedding generation, metadata enrichment, and transformation) close to where the data is stored, often within the storage fabric itself. This reduces data movement, lowers latency, and cuts storage duplication. For large multimodal datasets like video, near-data compute is essential for cost-effective and performant pipelines.

How does this partnership change security and compliance for AI workloads?

The integrated platform includes guardrails, policy enforcement, lineage tracking, and provenance metadata to ensure that data used in AI training and inference is auditable and compliant. By minimizing data movement and enabling controlled in-place processing, it also reduces the attack surface and helps organizations maintain strict controls over sensitive information.

Can the AI Data Engine run in public cloud environments?

Yes. The AI Data Engine is software-defined and designed to run wherever NVIDIA GPUs are available: on-premises, in private clouds, and across public cloud providers. The same stack supports RTX Pro servers, Blackwell-class GPUs, and cloud-hosted GPU instances, enabling consistent semantics across hybrid deployments.

What types of data and modalities are supported?

The platform supports multimodal data including text, documents (like PDFs), images, audio, video, structured records, and domain-specific representations such as chemical structures or genomic sequences. The AI Data Engine creates embeddings suitable for each modality and supports unified or coordinated search strategies across them.

How does this approach differ from traditional storage and search?

Traditional storage and search are protocol-driven and optimized for exact-match queries and file retrieval. The new approach emphasizes semantic representation, vector indexing, and AI query semantics. Instead of relying solely on file paths and keyword search, agents can perform similarity searches over embeddings and retrieve multimodal evidence ranked by relevance and provenance.

What are some early customer success stories?

Early examples include Johnson & Johnson using the platform to accelerate drug discovery across hybrid data sets and Yale School of Medicine combining research and clinical modalities for computational health research. Both customers benefit from semantic discovery, near-data compute, and enterprise-grade governance to enable faster, auditable research workflows.

How should organizations begin deploying these technologies?

Start by inventorying and classifying high-value data, then define governance and policies. Pilot with a bounded, high-impact workload to validate throughput and retrieval quality. Iterate on embedding models and indexing strategies, then scale incrementally while measuring business outcomes such as time saved or discovery acceleration.

Will NVIDIA GPUs be embedded into storage systems?

Yes. For the first time, NVIDIA GPUs will be used within file systems and storage nodes to enable GPU-accelerated processing close to the data. This provides the performance required to process large multimodal datasets in place and enables zero-copy transformations and active metadata generation.

What is the expected impact on total cost of ownership (TCO)?

By reducing data movement, minimizing copies, and enabling selective reprocessing, the architecture lowers storage and operational costs. Near-data compute reduces network and compute waste, while active metadata and selective re-embedding reduce unnecessary processing. These efficiencies contribute to a lower TCO for AI-ready data fabrics.

Can legacy applications use the AI Data Engine?

Yes. The AI Data Engine is designed to coexist with legacy applications. NetApp's platform provides familiar file and object interfaces, while the AI Data Engine adds a semantic layer on top. This allows organizations to modernize AI workflows without having to rewrite all existing applications.

How do you ensure embedding and index quality over time?

The platform supports monitoring and quality metrics for embeddings and indices, including drift detection. You can set triggers for re-embedding, run validation checks, and maintain audit trails that specify why and when data was reprocessed. These measures help maintain retrieval quality as models and data evolve.

Is this solution vendor-locked or open?

The solution is built to be flexible. The AI Data Engine runs on any NVIDIA GPU and on a variety of OEM platforms, and NetApp provides both appliances and software-defined options. Our design goal is to avoid unnecessary lock-in by supporting hybrid and multi-cloud deployments while offering deep integrations for optimized performance.

📅 Final Thoughts and Next Steps

I am energized by what we announced because it represents a practical, enterprise-ready answer to a pressing problem: how do we make massive, multimodal data useful for agentic AI while preserving governance and operational simplicity? NetApp's AFX family and the NetApp AI Data Engine, together with the NVIDIA AI Data Platform, are engineered to do exactly that.

For organizations exploring this capability, begin small and think big: pilot discrete, high-value use cases first, then scale the platform and policies as you prove value. The architecture we've unveiled is designed for that path. It lets you deploy a consistent semantic layer across your hybrid environment and iterate on models and retrieval strategies without losing control over cost, compliance, or operational risk.

Finally, co-innovation is central to our approach. This announcement is not a finish line; it is the start of a new era of collaboration between compute, storage, and software to enable the next generation of intelligent applications. I am excited about the journey ahead and look forward to seeing how our customers leverage these capabilities to transform industries and accelerate discovery.

