Research on Autopilot: Automating SDR lead research for a German DiGA company

Built by Junaid Cheema

Designing and building the fully automated pipeline that takes each record in a raw purchased database from URL to a prioritized, briefed call task, removing research, lead prioritization, and task creation from the SDR’s day entirely.

→ Introduction

The client is a German digital therapeutics (DiGA) company. Growth depends on one thing: getting doctors to prescribe the app. The SDR (sales development representative) team works through German orthopedic and musculoskeletal practices and converts them into prescribers.

The buyer here is the practice and its doctors. But the SDR’s daily reality isn’t the practice, it’s the CRM. And the bottleneck was never finding practices. There are tens of thousands of them. The bottleneck was that every lead arrived blank, and the only way to know which practice was worth a call, and what to say on it, was to research it by hand.

✕ Challenge

The CRM had been loaded years earlier from a purchased database. The records were unenriched and unreliable, so before an SDR could pick up the phone they had to become a research analyst.

For every lead, a rep opened the HubSpot company, clicked out to the practice website, and manually worked out:

Is this even an orthopedic practice, or out of scope entirely?
How modern and credible does the practice look?
How many doctors are there, and what does each one focus on (spine, conservative care, knees)?
Do they already use online booking?
What does the practice actually offer, what are the opening hours, and what is the real email address or fax number?

All of that lived in the rep’s head. None of it went back into the CRM. It took five to twenty minutes per lead, the quality varied from rep to rep, and it set a hard ceiling on how many practices the team could ever reach.

The real problem wasn’t the calling. It was that the team spent the front half of every day doing research instead of selling.

✓ Solution

We built a fully automated pipeline that takes a raw HubSpot record from URL to a queued call task without an SDR touching it. The moment a record is flagged, the pipeline reads the practice website the way a rep would, extracts everything needed to qualify the lead, writes it back onto the HubSpot company and its contacts, then prioritizes the lead and creates the call task.

The enrichment engine runs on three components. Prioritization and task creation sit on top of it.

1. Site ingestion and classification. Pulls the homepage from the company URL, strips it to clean text, scores how modern and credible the site is on a 1 to 10 scale, and decides whether the practice is actually orthopedic. This is the first filter: is the lead even in scope.

2. AI extraction across the practice. Gemini 2.5 Pro reads the homepage, the services pages, the team and individual doctor pages, and the Impressum (the legal notice page), each against a locked rubric. It pulls a practice summary, opening hours, the online-booking provider, email and fax, every doctor with their clinical focus, the lead MFA (Medizinische Fachangestellte, the practice’s medical assistant), the number of locations, and the languages spoken.

3. CRM write-back and contact graph. Every field writes back to the HubSpot company. Each doctor is matched against existing contacts with an AI dedup step, then created or updated and associated to the company, so the rep opens a record already populated with the right people and context instead of a blank shell.
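
To make the first and third components concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the production code: `strip_to_text`, `normalize_name`, and `match_contact` are hypothetical names, the contact ids are made up, and the fuzzy string match is a deterministic stand-in for the AI dedup step described above, which it does not reproduce.

```python
import re
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips script, style, and noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse internal whitespace
        if self._skip_depth == 0 and text:
            self.parts.append(text)


def strip_to_text(html: str) -> str:
    """Component 1 (sketch): reduce a fetched homepage to clean text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical list of titles to ignore when comparing doctor names.
TITLES = re.compile(r"\b(?:prof|dr|med|dipl)\b\.?", re.IGNORECASE)


def normalize_name(name: str) -> str:
    """Lowercase, drop academic titles and punctuation, keep inner hyphens."""
    cleaned = TITLES.sub(" ", name.lower())
    cleaned = re.sub(r"[^\w\s-]", " ", cleaned)
    tokens = [t.strip("-") for t in cleaned.split()]
    return " ".join(t for t in tokens if t)


def match_contact(doctor_name: str, existing: dict, threshold: float = 0.9):
    """Component 3 (stand-in): return the id of the best-matching existing
    contact, or None when no match clears the threshold and a new contact
    should be created. The 0.9 threshold is an assumption."""
    target = normalize_name(doctor_name)
    best_id, best_score = None, 0.0
    for contact_id, contact_name in existing.items():
        score = SequenceMatcher(None, target, normalize_name(contact_name)).ratio()
        if score > best_score:
            best_id, best_score = contact_id, score
    return best_id if best_score >= threshold else None
```

In this sketch, a doctor scraped as "Dr. med. Anna Müller" would resolve to an existing contact stored as "Dr. Anna Müller" and be updated, while a name with no close match returns None and would be created and associated to the company.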

★ The extraction model: what an SDR decides before they dial

The thinking behind the model matters more than the field list, so here is the logic first.

The design question was never “what can we scrape off the page.” It was “what does an SDR work out in their head before they call, and can we put each of those judgments onto the record automatically.” Every judgment falls into one of three buckets, and every extracted field maps to one of them.

Is this lead in scope at all? Ortho yes or no, and site credibility. A musculoskeletal DiGA only sells to musculoskeletal practices.
Is it worth prioritizing? Number of doctors, spine and conservative-care focus (direct clinical fit), multiple locations (a bigger account), a modern site and existing online booking (a digitally mature practice that is likelier to adopt a digital prescription tool).
What do I say when they answer? Named doctors and their focus, the practice summary, the lead MFA to ask for, the languages, and the opening hours so the call actually lands.
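
The first two buckets can be read as a gate followed by a score. The sketch below shows one way that could look in Python. The signals come from the list above, but every field name, weight, and threshold here is a hypothetical placeholder chosen for illustration, not the values used in the actual pipeline.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Doctor:
    name: str
    focus: List[str] = field(default_factory=list)


@dataclass
class PracticeRecord:
    is_orthopedic: bool
    site_score: int                      # 1 to 10 modernity and credibility score
    doctors: List[Doctor] = field(default_factory=list)
    locations: int = 1
    booking_provider: Optional[str] = None


# Focus areas treated as direct clinical fit (bucket two).
FIT_FOCUS = {"spine", "conservative care"}


def in_scope(p: PracticeRecord, min_site_score: int = 3) -> bool:
    """Bucket one: orthopedic and credible enough. The cutoff is assumed."""
    return p.is_orthopedic and p.site_score >= min_site_score


def priority_score(p: PracticeRecord) -> int:
    """Bucket two: add up prioritization signals. All weights are assumed."""
    if not in_scope(p):
        return 0
    score = min(len(p.doctors), 5)                       # practice size, capped
    fit = sum(
        1 for d in p.doctors
        if FIT_FOCUS & {f.lower() for f in d.focus}
    )
    score += 2 * min(fit, 3)                             # clinical fit
    if p.locations > 1:
        score += 2                                       # bigger account
    if p.site_score >= 7:
        score += 1                                       # modern site
    if p.booking_provider:
        score += 1                                       # already books online
    return score
```

The third bucket (named doctors, summary, lead MFA, languages, opening hours) does not feed the score in this sketch; it is call context and would go straight into the task brief.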

The full field set, and what each one is really for: