What is a Tabular Foundation Model?

A Tabular Foundation Model is a pre-trained AI system designed specifically for structured, row-and-column data — the kind found in ERP systems, CRMs, and spreadsheets. Unlike LLMs, which learn from unstructured text, Tabular Foundation Models learn statistical patterns from millions of synthetic tabular datasets and then generalize to new business problems without retraining.

Why can't standard LLMs handle structured business data accurately?

LLMs are pre-trained on text and struggle to make precise numerical predictions on tables with mixed data types, missing values, and business-specific semantics. They hallucinate numbers, perform poorly on regression tasks, and lack the feature-importance reasoning that tabular models provide natively. Tabular Foundation Models sidestep these limitations because their architecture is purpose-built for rows and columns.

What does the SAP Prior Labs deal mean for companies running SAP ERP?

In practical terms, SAP ERP users will gain AI-powered predictions baked directly into familiar business workflows — detecting payment delays, forecasting inventory risk, flagging supplier issues — without needing a separate data science team or model-training pipeline. SAP plans to integrate TabPFN into its Business AI layer over the next two to four years.

Is TabPFN open source?

Yes. TabPFN remains open source under Prior Labs' existing strategy. The acquisition by SAP does not change the open-source availability of the core model, though enterprise tiers — including fine-tuning, real-time inference, and causal reasoning — will continue to be offered as commercial products.

SAP Acquires Prior Labs and Dremio: The €1B Bet on Structured Data AI

Two acquisitions. One week. One very clear message from enterprise software’s largest player: the missing layer in enterprise AI is not a smarter model — it is a data infrastructure that actually understands your business numbers.

On May 4–5, 2026, SAP announced back-to-back deals to acquire Prior Labs (pioneers of Tabular Foundation Models) and Dremio (an open lakehouse platform for unifying SAP and non-SAP data). Together, the moves represent a €1 billion-plus commitment to close the gap between frontier AI capabilities and the structured, row-and-column data that runs the global economy. For any organization that manages forecasts, inventory, receivables, or risk in a spreadsheet or ERP system, this week’s news changes what AI can do for them.

What Prior Labs Actually Built

Prior Labs is a research-driven company founded by Frank Hutter, Noah Hollmann, and Sauraj Gambhir. The founders' work on Tabular Prior-data Fitted Networks (TabPFN) was published in Nature in January 2025, and the package has since crossed three million downloads on PyPI.

The core insight behind TabPFN is elegant: instead of training a model on your specific business dataset (which requires labeled data, feature engineering, and often weeks of work), train a single foundation model on hundreds of millions of synthetic tabular datasets that cover the full space of possible statistical patterns. When you then hand that model a new table — say, your accounts-receivable ledger — it can make accurate predictions in seconds, with no fine-tuning, because it has already learned the underlying grammar of tabular data.
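To make the idea of a synthetic pre-training corpus more concrete, here is a deliberately tiny sketch. The TabPFN authors describe sampling synthetic tables from randomly generated causal structures; the version below is a toy simplification of that idea, not their generator. The function name `sample_synthetic_table`, the linear dependencies, and the median-threshold labeling are all assumptions made for illustration.

```python
import random


def sample_synthetic_table(n_rows, n_features, seed):
    """Draw one toy classification table from a random causal graph (illustrative only)."""
    rng = random.Random(seed)
    # Random DAG: feature j may depend linearly on any earlier feature i < j.
    weights = [
        [rng.gauss(0, 1) if rng.random() < 0.5 else 0.0 for _ in range(j)]
        for j in range(n_features)
    ]
    # Random weights that turn a row into a hidden score for the label.
    label_weights = [rng.gauss(0, 1) for _ in range(n_features)]

    rows, scores = [], []
    for _ in range(n_rows):
        row = []
        for j in range(n_features):
            parent_signal = sum(w * row[i] for i, w in enumerate(weights[j]))
            row.append(parent_signal + rng.gauss(0, 1))  # parents plus noise
        rows.append(row)
        scores.append(sum(w * x for w, x in zip(label_weights, row)))

    # Threshold at the median score so both classes are present.
    cutoff = sorted(scores)[n_rows // 2]
    labels = [1 if s > cutoff else 0 for s in scores]
    return rows, labels


# A pre-training corpus is many such tables, each drawn from a different random graph.
corpus = [sample_synthetic_table(n_rows=20, n_features=4, seed=s) for s in range(3)]
```

Each seed yields a table with a different hidden structure. A foundation model trained across a very large number of such tables sees many kinds of feature relationships before it ever meets a real ledger.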

TabPFN 2.5, the current public release, scales to 50,000 data points and 2,000 features per table. TabPFN Enterprise, the commercial tier that ships with the SAP deal, adds fine-tuning, context-length reasoning, real-time inference, and causal reasoning — the capabilities you need to move from a data-science experiment to a production business workflow.

Why LLMs Struggle With Your Spreadsheets

It is tempting to assume that a model capable of passing the bar exam or writing production code can also crunch your revenue forecast. In practice, the gap is wide.

Large language models are pre-trained on text. When they encounter a table, they serialize it into tokens — collapsing the relational geometry that gives a spreadsheet its meaning. Mixed data types (dates, categoricals, currency), missing values, and business-specific semantics (the difference between “net 30” and “net 60” in your AR column) turn into noise. The result: LLMs hallucinate financial figures, lose track of feature importance, and cannot natively produce the confidence intervals a CFO needs to act.
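A small sketch shows what serialization does to a table before a language model ever sees it. The rows, column names, and the `serialize_for_llm` helper are hypothetical, and the pipe-delimited layout is just one common way to flatten a table into a prompt.

```python
# Two hypothetical accounts-receivable rows with mixed types and a missing value.
rows = [
    {"invoice_id": 1001, "terms": "net 30", "amount_eur": 1250.50, "paid_on": None},
    {"invoice_id": 1002, "terms": "net 60", "amount_eur": 980.00, "paid_on": "2026-03-14"},
]


def serialize_for_llm(table):
    """Flatten a list of row dicts into one pipe-delimited text block."""
    header = " | ".join(table[0].keys())
    lines = [" | ".join(str(value) for value in row.values()) for row in table]
    return "\n".join([header] + lines)


prompt_text = serialize_for_llm(rows)
print(prompt_text)
```

After this step everything is a single string. The missing payment date is now the literal word "None", the amounts are runs of digit characters, and nothing in the text marks which values are numbers, dates, or categories.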

SAP’s own CTO, Philipp Herzig, put it bluntly at the announcement: “Enterprise AI doesn’t stall because the models aren’t good enough. It stalls because the data isn’t ready for AI agents.” Tabular Foundation Models are purpose-built for exactly this readiness gap — they operate on the data as it already exists in your ERP, without requiring transformation into free text or extensive labeling pipelines.

Structured Data: LLM vs Tabular Foundation Model

Criterion               Large Language Model    Tabular Foundation Model
Tabular accuracy        Moderate                State-of-the-art
Setup time              Days–weeks              Seconds
Missing values          Struggles               Native handling
Confidence intervals    Unreliable              Built-in
Fine-tuning needed      Often                   Zero-shot capable

Dremio: The Data Pipeline That Makes the Prediction Possible

Buying a powerful tabular AI model solves only half the problem. The other half is getting clean, unified data to it in real time. That is where Dremio enters the picture.

Dremio is an Apache Iceberg-native open lakehouse designed to federate data from SAP and non-SAP systems — Salesforce, Snowflake, external ERPs, partner files — into a single queryable layer without physically moving or converting the data. SAP’s CTO framed the acquisition plainly: with Dremio, SAP Business Data Cloud becomes an enterprise lakehouse where agents can query across all your data in real time, not just what lives inside SAP’s own tables.

The practical implication is significant. Enterprise AI agents today often fail not because the underlying model is weak but because they cannot reliably answer questions like “what is our current on-hand inventory across all regional warehouses, net of open purchase orders, updated to the last hour?” That query touches at least three systems. Without a lakehouse that federates those sources under a single schema, the agent has to improvise — and improvisation on financial data produces hallucinations, not decisions.
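The inventory question above can be sketched as a single query over separate sources. This is an analogy only: it uses Python's built-in `sqlite3` with attached in-memory databases as a stand-in for separate systems, not Dremio or any SAP API. The database names `wms` and `procurement`, the table layouts, and the sample figures are all hypothetical, and "net of open purchase orders" is computed as a plain subtraction to follow the wording of the example.

```python
import sqlite3

# One connection, two attached databases standing in for two separate systems.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS wms")          # warehouse system
conn.execute("ATTACH DATABASE ':memory:' AS procurement")  # purchasing system

conn.execute("CREATE TABLE wms.stock (sku TEXT, warehouse TEXT, on_hand INTEGER)")
conn.execute("CREATE TABLE procurement.open_po (sku TEXT, qty INTEGER)")

conn.executemany(
    "INSERT INTO wms.stock VALUES (?, ?, ?)",
    [("A", "north", 100), ("A", "south", 50), ("B", "north", 20)],
)
conn.executemany(
    "INSERT INTO procurement.open_po VALUES (?, ?)",
    [("A", 30), ("A", 10)],
)

# One query spans both sources: total on-hand per SKU, net of open PO quantity.
query = """
SELECT s.sku,
       s.on_hand,
       COALESCE(p.open_qty, 0) AS open_po,
       s.on_hand - COALESCE(p.open_qty, 0) AS net
FROM (SELECT sku, SUM(on_hand) AS on_hand FROM wms.stock GROUP BY sku) AS s
LEFT JOIN (SELECT sku, SUM(qty) AS open_qty FROM procurement.open_po GROUP BY sku) AS p
  ON p.sku = s.sku
ORDER BY s.sku
"""
result = conn.execute(query).fetchall()
print(result)
```

The point of the sketch is the shape of the query: the agent asks one question against one logical layer, and the join across systems is resolved by the query engine rather than improvised by the agent.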

Together, TabPFN and Dremio form a two-layer stack: Dremio ensures the data is unified, clean, and queryable; TabPFN turns that data into precise, actionable predictions without a model-training detour.

What This Means for Businesses Running SAP

The immediate implications land on the approximately 27,000 companies that run SAP as their system of record. In the medium term — SAP projects integration over the next two to four years — these organizations will see TabPFN-powered predictions surfaced natively inside familiar Business AI workflows:

Accounts receivable: probability scores on invoice payment timing, flagging likely late payers before the due date
Supply chain: supplier risk scoring across a live, federated view of procurement data
Sales forecasting: regression predictions on pipeline conversion with built-in confidence bands — no data science team required
Customer churn: classification tasks on customer health scores pulling from CRM, support, and usage tables simultaneously

The zero-shot capability of TabPFN matters here: most mid-market SAP customers do not have dedicated ML engineers. A model that works on new tables immediately, without retraining, brings AI-grade prediction within reach for companies that have historically depended on Excel formulas and quarterly reviews.
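As a rough illustration of the accounts-receivable use case, the sketch below flags open invoices whose predicted probability of late payment crosses a threshold. It assumes a model with scikit-learn-style `fit` and `predict_proba` methods. The open-source `tabpfn` package documents a `TabPFNClassifier` with that interface, so it could in principle be passed in, but that swap is not exercised here. Instead, a toy nearest-neighbour class stands in so the example runs with the standard library alone. The feature columns, data, class name, and `flag_likely_late` helper are all hypothetical, and nothing here reflects how SAP will surface these predictions.

```python
import math


class NearestNeighbourStandIn:
    """Toy stand-in with a scikit-learn-style fit/predict_proba interface."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # No weights are trained: the labeled rows are simply kept as context.
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict_proba(self, X):
        out = []
        for row in X:
            nearest = sorted(
                zip(self.X_, self.y_), key=lambda pair: math.dist(pair[0], row)
            )[: self.k]
            p_late = sum(label for _, label in nearest) / self.k
            out.append([1.0 - p_late, p_late])  # [P(on time), P(late)]
        return out


def flag_likely_late(model, X_hist, y_hist, X_open, threshold=0.5):
    """Return indices of open invoices whose P(late) meets the threshold."""
    model.fit(X_hist, y_hist)
    probabilities = model.predict_proba(X_open)
    return [i for i, p in enumerate(probabilities) if p[1] >= threshold]


# Hypothetical features: [average days past terms, invoice amount in thousands].
X_hist = [[2, 10], [1, 12], [0, 8], [15, 40], [20, 35], [18, 50]]
y_hist = [0, 0, 0, 1, 1, 1]  # 1 = invoice was paid late
X_open = [[1, 9], [17, 42], [3, 11]]

flagged = flag_likely_late(NearestNeighbourStandIn(k=3), X_hist, y_hist, X_open)
print(flagged)
```

The workflow code never trains or tunes anything itself. It hands over labeled history plus the open rows and reads back probabilities, which is the property that makes a pre-trained tabular model practical for teams without ML engineers.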

The Bigger Signal: Enterprise AI’s Missing Infrastructure Layer

SAP’s twin acquisitions signal something larger than one company’s product roadmap. They articulate a thesis gaining momentum across the industry: the frontier model is not the bottleneck; the infrastructure to surface business data to that model is.

This framing reshapes how companies should think about AI investments. Buying API access to GPT-5.5 or Claude Opus 4.7 is a start, but if the underlying data is siloed, inconsistently labeled, or trapped in legacy schemas, the model’s output will be unreliable at best. The competitive advantage shifts toward whoever masters the data-readiness layer.

For small and mid-sized businesses that have not yet built a formal AI strategy, this is the moment to audit the state of your data rather than the list of AI tools you subscribe to. The companies that extract the most from AI agents in 2027 and 2028 will be the ones investing in clean, federated, queryable data today.

At AgentsGT, the agent-building platform we follow closely, the teams seeing the fastest results are almost universally those that started with data cleanup — normalizing CRM records, unifying inventory tables, setting up real-time feeds from operational systems — before wiring in an AI layer. The SAP/Prior Labs story is, at its core, a €1 billion validation of that approach.


Sources

SAP to Acquire Prior Labs to Establish a Globally Leading Frontier AI Lab in Europe — SAP Newsroom, May 4, 2026
SAP to Acquire Prior Labs to Establish a Globally Leading Frontier AI Lab in Europe — HPCWire / AIwire, May 4, 2026
SAP Acquires Dremio and Prior Labs, But Can It Solve Enterprise AI’s Core Problem? — BigDATAwire, May 5, 2026
