Clinical·10 min read·Mar 25, 2026

Where Do Your X-Rays Go? The Hidden Data Pipeline in Dental AI

AI-powered X-ray analysis is one of the most compelling features in modern dental software. A periapical radiograph is captured, sent through a machine learning model, and returned with annotations highlighting potential caries, periapical lesions, calculus, and bone loss. The clinical value is real.

What most practices do not know is where that X-ray goes after they click “analyze.” The answer, for the majority of dental AI products on the market, is a third-party server that your practice does not control, cannot audit, and may never have been disclosed to your patients.

Is My Dental Software Sharing Patient X-Rays with Third Parties?

Most dental AI tools send patient X-rays to third-party cloud servers for processing. These images may be stored indefinitely and used to train commercial AI models sold to other practices. Check your vendor's data processing agreement for language about model training, data retention, and secondary use of clinical images.

The data flow is straightforward. Your practice captures a radiograph. Your practice management system (PMS) or imaging software sends that image to an external API endpoint operated by a third-party AI company. That company runs the image through their detection model, returns the annotated result, and retains a copy of the original image.

The retention is where the economics get interesting. Every X-ray your practice sends through the system becomes training data. That training data improves the AI model, which is the core product the third-party company sells to every other dental practice, including your direct competitors.

How Does Dental AI Process X-Ray Images?

Dental AI processes X-ray images by sending the radiograph to a machine learning model that detects pathologies such as caries, bone loss, and periapical lesions. The model returns annotated results with bounding boxes and confidence scores. In third-party systems, the image leaves your network. In-house systems like NexV process the image within your own cloud environment.

To understand how AI processes this data for clinical decisions, read our guide on AI treatment planning in dentistry.

How Third-Party AI Diagnostics Actually Work

To understand the privacy implications, you need to understand the technical pipeline. Most third-party dental AI services operate on a simple architecture:

  • Image ingestion. Your PMS sends the radiograph to the AI vendor's API via HTTPS. The image is typically sent as a DICOM or JPEG file with associated metadata including patient identifiers, tooth numbers, and acquisition parameters.
  • Inference. The vendor's machine learning model processes the image and generates detection results, such as bounding boxes around suspected pathologies, confidence scores, and classification labels.
  • Response. The annotated results are returned to your PMS for display. The round trip typically takes 2 to 5 seconds.
  • Storage and retention. The vendor retains the original image and the inference results on their servers. Most data processing agreements grant the vendor a perpetual, irrevocable license to use de-identified images for model training.
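The four steps above can be sketched in a few lines of Python. This is an illustrative model, not any vendor's actual code; the class name, result format, and retention mechanism are hypothetical. The point it makes is structural: retention happens on the vendor's side, after the clinical result has already been returned to the practice.

```python
from dataclasses import dataclass, field

@dataclass
class VendorAI:
    """Hypothetical third-party dental AI service (purely illustrative)."""
    training_corpus: list = field(default_factory=list)  # step 4 lives here

    def analyze(self, image_bytes: bytes, metadata: dict) -> dict:
        # Step 1: ingestion -- image plus identifiers arrive on the vendor's server.
        record = {"image": image_bytes, "meta": metadata}
        # Step 2: inference -- stubbed detection result with a box and a score.
        result = {"findings": [
            {"label": "caries", "box": [120, 88, 40, 32], "confidence": 0.91},
        ]}
        # Step 4: retention -- the original image joins the vendor's corpus,
        # typically under a perpetual license for de-identified model training.
        self.training_corpus.append(record)
        # Step 3: response -- only the annotations travel back to the practice.
        return result

vendor = VendorAI()
report = vendor.analyze(b"fake-dicom-bytes", {"tooth": "19", "patient_id": "redacted"})
print(len(vendor.training_corpus))  # prints 1 -- the image stayed behind
```

Notice that the practice only ever sees `report`; the growth of `training_corpus` is invisible from the clinic's side of the API.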

The critical step is the fourth one. Your patient's X-ray, even after de-identification, now lives permanently in a commercial dataset. The AI company uses that dataset to train better models, which they license to other practices at $0.50 to $2.00 per scan.

What “Training Data” Means in Practice

Machine learning models are only as good as the data they are trained on. Dental AI companies need vast quantities of annotated X-rays to build accurate detection models. Acquiring and labeling that data is the single most expensive part of building a dental AI product.

When your practice uses a third-party AI service, you are subsidizing the vendor's data acquisition cost. Every X-ray you send is a free, pre-captured, clinically relevant training sample that the vendor would otherwise need to source from academic institutions, purchase from data brokers, or generate synthetically.

The scale is significant. A busy general practice captures 30 to 50 radiographs per day. Over a year, that is 8,000 to 13,000 images from a single location. A 20-location DSO contributes 160,000 to 260,000 images annually to the vendor's training pipeline.
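The arithmetic behind those figures is simple. Assuming roughly 260 clinical days per year (five days a week, an assumption for illustration):

```python
radiographs_per_day = (30, 50)   # busy general practice, low and high estimates
clinical_days_per_year = 260     # assumption: ~5 days/week year-round

per_practice = tuple(n * clinical_days_per_year for n in radiographs_per_day)
per_dso = tuple(n * 20 for n in per_practice)  # 20-location DSO

print(per_practice)  # (7800, 13000)   -- roughly 8,000 to 13,000 images
print(per_dso)       # (156000, 260000) -- roughly 160,000 to 260,000 images
```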

The vendor benefits twice. First, they charge the practice a per-scan fee for the analysis. Second, they use the scan to improve a product they sell to everyone else. The practice pays for the privilege of contributing to a commercial asset it does not own.

Is Dental AI HIPAA Compliant?

Dental AI can be HIPAA compliant, but compliance depends entirely on the architecture. Third-party AI services require a signed Business Associate Agreement, encryption at rest and in transit, and guarantees that patient data is not used for model training. In-house AI that runs within the practice's own cloud environment, like NexV's SageMaker-based engine, eliminates third-party data exposure entirely.

For a comprehensive guide to HIPAA requirements for dental AI, including BAA checklists, encryption standards, and audit trail requirements, see our post on HIPAA compliance in the age of AI.

The Patient Consent Gap

HIPAA requires patient authorization for certain uses and disclosures of protected health information. Treatment, payment, and healthcare operations are generally exempt from the authorization requirement. The question is whether sending a patient's X-ray to a third party for AI analysis, where it will be retained and used for model training, falls within the treatment exemption.

The legal analysis is nuanced, but the practical reality is straightforward. Most patients do not know their X-rays are being sent to external servers. They are not informed that their clinical images may be used to train commercial products. And most dental practices have not updated their Notice of Privacy Practices to disclose this specific data flow.

Even when the data is de-identified under HIPAA Safe Harbor, the ethical question remains. Patients consent to X-rays for their dental care. They do not consent to their clinical images becoming part of a commercial training dataset. The distinction matters, and it is one that regulatory bodies are increasingly scrutinizing.

Practices that use third-party AI diagnostics should, at minimum, review their Business Associate Agreements to understand exactly what rights they are granting the AI vendor over patient data. Many BAAs contain broad language that permits “de-identified use for research and product improvement” without specifying what that means in practice.

What Is In-House AI for Dental Practices?

In-house dental AI means the machine learning model runs within the practice's own cloud infrastructure rather than on a third-party vendor's servers. The patient's X-ray never leaves the practice's AWS environment. No external company receives, stores, or trains on the image. The cost per scan drops to $0.001 because there are no third-party licensing fees.

Does Dental Software Share Patient Data?

Many dental AI tools share patient X-rays with third-party servers for analysis and retain copies for model training. The practice pays a per-scan fee while the vendor uses the images to build a commercial product sold to competitors. Practices should review their vendor's data processing agreement for language about training data usage, retention, and secondary use.

There is a fundamentally different architecture for dental AI. Instead of sending X-rays to a third party, the AI model runs inside your own cloud infrastructure. The image never crosses a network boundary you do not control. No third party touches the data. No external company retains your patients' clinical images.

This is how NexV's AI diagnostic engine works. The detection model runs on AWS SageMaker within the practice's own AWS environment. When a clinician captures a radiograph, the image is sent to a SageMaker endpoint that the practice's organization controls. The model processes the image, returns detection results, and the original image stays in the practice's own S3 bucket.
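A minimal sketch of that call pattern is shown below using boto3's real SageMaker Runtime `invoke_endpoint` API. The endpoint name and content type here are placeholders, and this is not NexV's production code; it simply illustrates that the invocation targets an endpoint inside the practice's own AWS account, so the image never crosses an account boundary.

```python
def build_invoke_args(endpoint_name: str, image_bytes: bytes) -> dict:
    """Assemble request parameters for a SageMaker Runtime invoke_endpoint call."""
    return {
        "EndpointName": endpoint_name,       # endpoint in the practice's own account
        "ContentType": "application/x-image",
        "Body": image_bytes,
    }

def analyze_in_house(image_bytes: bytes, endpoint_name: str = "dental-detect-prod"):
    """Send a radiograph to an in-account SageMaker endpoint (hypothetical names)."""
    # Imported here so the sketch reads without AWS SDK installed.
    import boto3
    runtime = boto3.client("sagemaker-runtime")
    # The request stays within the practice's AWS environment; the original
    # image remains in the practice's own S3 bucket.
    response = runtime.invoke_endpoint(**build_invoke_args(endpoint_name, image_bytes))
    return response["Body"].read()
```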

No third party receives the image. No external dataset grows from your patients' clinical data. The AI vendor, in this case NexV, provides the trained model but never sees the production data it processes.

How NexV's AI Model Was Trained

If the model does not train on your patients' data, where does the training data come from? NexV's detection model was trained on publicly available dental imaging datasets, academic research repositories, and synthetically augmented data.

The model detects 31 pathology classes including caries, periapical lesions, calculus, bone loss, impacted teeth, root fragments, and crown and bridge anomalies. Each class was trained on thousands of annotated examples from published datasets that were collected with informed consent for research purposes.

The cost of running each inference is $0.001. That is one-tenth of a cent per scan, compared to $0.50 to $2.00 per scan charged by third-party AI services. The cost difference exists because NexV is not subsidizing a data acquisition pipeline with every scan. The model is already trained. Running it costs fractions of a cent in compute time.
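To put those per-scan figures in annual terms, here is the comparison for a single-location practice at 10,000 scans per year (the volume estimate used earlier; the rates are the ones quoted above):

```python
scans_per_year = 10_000            # single-location practice, from the volume estimate
in_house_cost_per_scan = 0.001     # serverless inference compute, per the figure above
third_party_range = (0.50, 2.00)   # typical third-party per-scan fees

in_house_annual = scans_per_year * in_house_cost_per_scan
third_party_annual = tuple(rate * scans_per_year for rate in third_party_range)

print(in_house_annual)     # 10.0 -- about $10/year in compute
print(third_party_annual)  # (5000.0, 20000.0) -- $5,000 to $20,000/year
```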

For practices that want to understand the technical architecture in more detail, our security documentation covers data residency, encryption at rest and in transit, access controls, and audit logging.

What This Means for DSOs and Enterprise Groups

The data privacy implications compound at scale. A single-location practice sending 10,000 X-rays per year to a third-party AI service is contributing a meaningful amount of training data. A 50-location DSO sending 500,000 X-rays per year is contributing a dataset that has significant commercial value.

DSO executives should ask their AI vendors three direct questions. First, does the vendor retain copies of submitted X-rays after returning analysis results? Second, does the vendor use submitted images to train or improve their AI models? Third, does the vendor's model serve other customers, including direct competitors of the DSO?

If the answer to all three is yes, the DSO is effectively funding the development of a product that benefits competitors equally. The platform architecture should include data sovereignty as a core requirement, not an afterthought.

How to Evaluate Dental AI Privacy Claims

Vendors will increasingly market their AI products as “privacy-first” or “HIPAA-compliant.” Those labels are necessary but not sufficient. HIPAA compliance is a floor, not a ceiling, and it says nothing about whether your data is being used to train commercial models.

When evaluating dental AI products, request specific documentation on these points:

  • Data residency. Where does the image physically travel during analysis? Is it processed within your own cloud environment or on the vendor's infrastructure?
  • Retention policy. How long does the vendor retain submitted images? “As long as necessary” is not an acceptable answer. Look for a specific retention period and an automatic deletion policy.
  • Training data usage. Does the vendor use submitted images to train, fine-tune, or validate their AI models? If yes, is there an opt-out mechanism, and what is the default setting?
  • Model architecture. Is the AI model a shared multi-tenant model, or does each practice get a dedicated inference endpoint? Shared models inherently aggregate data from multiple sources.
  • Audit access. Can you audit the vendor's data handling practices? Can you verify that images are actually deleted when the retention period expires?

Any vendor that cannot provide clear, specific answers to these questions is either not thinking about data privacy seriously or is hoping you will not ask.

The Economics of Privacy

There is a reason most dental AI companies use the third-party processing model: it is cheaper to build. A single centralized inference endpoint serving thousands of practices is less complex to operate than deploying dedicated model endpoints per customer.

The trade-off is that the centralized model requires all customer data to flow through the vendor's infrastructure. The per-customer model costs slightly more in compute but eliminates the data sovereignty problem entirely.

NexV chose the per-customer architecture because we believe patient data should never leave the practice's control. The incremental cost of running a dedicated SageMaker endpoint is negligible when the endpoint uses serverless inference that scales to zero when idle. You pay for compute only when a scan is actually being processed. The same privacy-first architecture extends to NexV's ambient AI scribe, which processes clinical conversations without retaining audio or sending data to external services.

For a deeper look at how this architecture keeps costs low while maintaining data sovereignty, see our post on the technology dividend that makes NexV's pricing possible. And for practices concerned about the broader privacy landscape, our imaging platform overview details how X-rays are stored, accessed, and protected throughout their lifecycle.

Ready to see NexV in action?

Book a Demo