AiDDA · Benchmarks

From target to optimised portfolio — weeks to hours

Transparent validation data across 14 DUD-E targets, 7 protein families, and 42 real campaigns. Every number is reproducible from the same natural-language input.

14 DUD-E targets validated · 7 protein families tested · 42 real campaigns measured · 1–3 hours average runtime
The real timeline problem

A campaign isn't one step.
It's a cycle.

Here's what changes when the entire design-make-test-analyse loop runs autonomously.

Traditional: 6–8 weeks
AiDDA: 1–3 hours

~40× time compression

Traditional workflow

Wk 1: Target research, PDB selection, protein prep, compound library assembly
Wk 2: Docking, scoring, manual hit triage, ADMET profiling
Wk 3: Medicinal chemistry reviews SAR, proposes modifications
Wk 4: Second-round library, re-dock, re-score, re-triage
Wk 5–6: MM-GBSA or FEP on selected compounds. Waiting for compute.
Wk 6–8: Final candidate selection, report, handoff to synthesis

6–8 weeks · no guarantee of convergence
AiDDA

Hr 1: Target research → protein prep → library → docking → ADMET → validation → SAR → tiered portfolio
Hr 2–3: Autonomous evaluation. Generates new compounds if needed. Repeats until convergence.

Hours · same rigour · automated iterations
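
As an illustration, the Hour-1 pass can be thought of as an ordered stage sequence over a shared context. The stage names and function signatures in this sketch are hypothetical stand-ins, not AiDDA's actual internals:

```python
def run_pipeline(context, stages):
    """Run each named stage in order; every stage reads and extends a shared context."""
    for name, stage in stages:
        context = stage(context)
        print(f"completed: {name}")
    return context

# Placeholder stages; each would wrap the real tool for that step.
def prep_protein(ctx):
    ctx["protein"] = f"prepared:{ctx['target']}"
    return ctx

def build_library(ctx):
    ctx["library"] = ["CCO", "c1ccccc1O"]  # SMILES placeholders
    return ctx

def dock(ctx):
    ctx["scores"] = {smiles: 0.0 for smiles in ctx["library"]}  # stub scores
    return ctx

run_pipeline({"target": "ERα"},
             [("protein prep", prep_protein),
              ("library", build_library),
              ("docking", dock)])
```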
Iterative optimisation

The hard part, automated

The first screen is easy. The real work starts when you ask: “How do I make these hits better?”

Traditional MPO (multi-parameter optimisation)

  • Medicinal chemist proposes changes
  • Computational chemist re-docks
  • ADMET scientist flags liabilities
  • Team debates in a meeting
  • Repeat for each design idea

One iteration per week, if fast
Screen → Evaluate → Diagnose → Generate → Re-screen

Each iteration: minutes, not weeks

01 Evaluate portfolio: Qualified hits, scaffold diversity, ADMET pass rate
02 Diagnose bottleneck: Binding? ADMET? Diversity? Auto-detected.
03 Generate compounds: Scaffold hops, bioisosteric swaps, R-group enumeration
04 Full pipeline screen: Dock → score → qualify → ADMET → validate
05 Merge & compare: SAR accumulates. Tracks what improved.
06 Stop or continue: Converged → stop. Improving → another round.
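
A minimal sketch of this six-step loop follows. Every helper is a hypothetical stub standing in for the platform's real components, and the stop rule (qualified-hit count no longer improving between rounds) is an assumption for illustration:

```python
import random

def evaluate(portfolio):
    # 01: summarise qualified hits and ADMET pass rate (diversity omitted here)
    return {"qualified": sum(1 for c in portfolio if c["score"] > 0.7),
            "admet_pass": sum(1 for c in portfolio if c["admet_ok"]) / max(len(portfolio), 1)}

def diagnose(metrics):
    # 02: pick the weakest axis; the real diagnosis spans all eight objectives
    return "admet" if metrics["admet_pass"] < 0.5 else "binding"

def generate(portfolio, bottleneck, n=20):
    # 03: stand-in for scaffold hops, bioisosteric swaps, R-group enumeration;
    # a real generator would condition on the portfolio and the bottleneck
    return [{"score": random.random(), "admet_ok": random.random() > 0.3}
            for _ in range(n)]

def screen(candidates):
    # 04: dock → score → qualify → ADMET → validate (reduced to a score filter)
    return [c for c in candidates if c["score"] > 0.5]

def optimise(portfolio, max_rounds=10):
    history = []
    for _ in range(max_rounds):
        metrics = evaluate(portfolio)
        history.append(metrics)
        # 06: stop once the qualified-hit count stops improving between rounds
        if len(history) >= 2 and metrics["qualified"] <= history[-2]["qualified"]:
            break
        bottleneck = diagnose(metrics)
        portfolio = portfolio + screen(generate(portfolio, bottleneck))  # 05: merge
    return portfolio, history

final, history = optimise([{"score": 0.8, "admet_ok": True}])
print(len(final), "compounds after", len(history), "rounds")
```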

Balanced scoring

What gets optimised

The platform balances 8 objectives simultaneously — the same trade-offs a medicinal chemist navigates manually, resolved in every iteration.

Improving binding often breaks ADMET. Fixing ADMET drops binding. This whack-a-mole takes weeks manually. The platform's multi-parameter scoring resolves these trade-offs in every iteration, tracking what improved and what regressed.

Binding strength · Pose quality · Drug-likeness · Synthetic accessibility · ADMET · Novelty · 3D character · Scaffold diversity
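
One way to picture the balance is a weighted composite over the eight objectives. The weights and the aggregation rule here are assumptions for the sketch; this page does not publish the platform's actual scoring function:

```python
# Each objective is assumed pre-normalised to [0, 1].
OBJECTIVES = ["binding", "pose", "druglikeness", "synthesis",
              "admet", "novelty", "shape_3d", "scaffold_diversity"]

def composite_score(compound, weights=None):
    """Weighted sum of per-objective scores; equal weights by default."""
    weights = weights or {k: 1 / len(OBJECTIVES) for k in OBJECTIVES}
    return sum(weights[k] * compound[k] for k in OBJECTIVES)

# A binding gain that breaks ADMET shows up as a composite regression:
before = dict.fromkeys(OBJECTIVES, 0.7)
after = {**before, "binding": 0.9, "admet": 0.3}
print(round(composite_score(before), 3), round(composite_score(after), 3))
# 0.7 vs 0.675: the trade-off is caught instead of whack-a-moled
```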
Figure 1

DUD-E benchmark results

Complete automated workflow per target. AUC-ROC and Precision@20 on initial screening pass.

Target    Family              AUC-ROC   Hits@20   Precision@20
ERα       Nuclear Receptor    0.95      20/20     100%
SRC       Kinase              0.95      20/20     100%
VEGFR2    Kinase              0.92      20/20     100%
HIV-RT    Polymerase          0.88      15/20     75%
PPARγ     Nuclear Receptor    0.88      10/20     50%
BACE1     Protease            0.87      14/20     70%
CDK2      Kinase              0.87      16/20     80%
A2A       GPCR                0.84      9/20      45%
HSP90     Chaperone           0.82      9/20      45%
EGFR      Kinase              0.79      10/20     50%
ACE       Protease            0.72      11/20     55%
ABL1      Kinase              0.66      2/20      10%
AmpC      Hydrolase           0.61      4/20      20%
DRD3      GPCR                0.56      3/20      15%
Median                        0.85      10.5/20   53%

Initial screening pass. In real campaigns, iterative optimisation generates additional compounds targeting weaknesses — improving coverage in subsequent rounds.
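
For reference, both table metrics can be computed from a ranked screen as below; scikit-learn supplies AUC-ROC, and Precision@k is a direct count. The scores and labels are toy data, not benchmark values:

```python
from sklearn.metrics import roc_auc_score

def precision_at_k(scores, labels, k=20):
    """Fraction of known actives among the top-k ranked compounds."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]   # docking scores, higher = better
labels = [1, 1, 0, 1, 0, 0]               # 1 = known active, 0 = DUD-E decoy
print(roc_auc_score(labels, scores))       # AUC-ROC over the whole library
print(precision_at_k(scores, labels, k=3)) # 2 actives in top 3 → ~0.67
```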

Deliverables

What a campaign produces

After the final iteration, a complete actionable package.

Tiered compound portfolio

Typical distribution:

Tier 1 (25%): Top candidates. Strong binding, ADMET-clean, physics-validated, diverse scaffolds.
Tier 2 (40%): Promising with flags. Good binding, specific liabilities identified.
Tier 3 (35%): Backup / SAR context. Moderate binding, valuable for structure-activity understanding.
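
Hypothetical tier rules matching the three descriptions above; the numeric thresholds are assumptions, not AiDDA's published cutoffs:

```python
def assign_tier(compound):
    strong = compound["binding"] >= 0.8
    clean = compound["admet_flags"] == 0
    if strong and clean and compound["physics_ok"]:
        return 1   # top candidate: strong, ADMET-clean, physics-validated
    if compound["binding"] >= 0.7:
        return 2   # promising, with specific liabilities identified
    return 3       # backup / SAR context

print(assign_tier({"binding": 0.75, "admet_flags": 2, "physics_ok": False}))  # 2
```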

Iteration history

  • What was tried and results
  • Per-round metrics & trends
  • Generation strategy rationale

SAR analysis

  • Matched molecular pairs
  • Scaffold-activity relationships
  • What helps, what hurts

Per-compound profiles

  • Binding mode & residue contacts
  • ADMET predictions
  • Synthetic accessibility
Export formats:

  • CSV: scores, SMILES, scaffolds, ADMET
  • SDF: 3D docked poses
  • PDF: structured report
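
A sketch of producing the first two exports, using Python's csv module and RDKit's standard SDWriter; file names and columns are illustrative, and a real SDF export would carry the docked 3D conformer rather than a 2D molecule:

```python
import csv
from rdkit import Chem

rows = [{"smiles": "CCO", "score": 0.81, "scaffold": "acyclic", "admet_pass": True}]

# CSV: scores, SMILES, scaffolds, ADMET
with open("portfolio.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["smiles", "score", "scaffold", "admet_pass"])
    writer.writeheader()
    writer.writerows(rows)

# SDF: one record per docked pose, with the score as a molecule property
with Chem.SDWriter("poses.sdf") as sdf:
    for row in rows:
        mol = Chem.MolFromSmiles(row["smiles"])
        mol.SetDoubleProp("docking_score", row["score"])
        sdf.write(mol)
```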
Transparency

Where we're honest

The pipeline underperforms on 3 of 14 targets. We show these results because transparency is how trust gets built.

Virtual screening is not solved. No tool gets every target right. What matters is automating the 90% that doesn't require expert judgment — and being transparent about the 10% that still does.

ABL1 (AUC 0.66): Non-canonical DFG-out binding conformation. CNN-based pose scoring was trained on canonical modes and misjudges binding quality here.

AmpC (AUC 0.61): Fragment-like actives (average MW ~295) fall below the molecular weight range where CNN scoring is most reliable.

DRD3 (AUC 0.56): Deep transmembrane GPCR pocket. Structure-based virtual screening remains challenging for this target class generally.

The platform detects scoring unreliability from score distributions and flags those results rather than presenting falsely confident rankings.
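
One plausible implementation of such a flag, with the statistic and threshold below being assumptions rather than the platform's published method: if the top of the ranked list barely separates from the bulk of the library, the ranking carries little signal and should be flagged:

```python
import statistics

def scoring_reliable(scores, top_n=20, min_sep=1.0):
    """Flag a screen whose top scores barely separate from the bulk."""
    ranked = sorted(scores, reverse=True)
    top, bulk = ranked[:top_n], ranked[top_n:]
    spread = statistics.stdev(bulk)
    if spread == 0:
        return False  # degenerate distribution: no ranking signal at all
    separation = (statistics.mean(top) - statistics.mean(bulk)) / spread
    return separation >= min_sep

clear = [0.9] * 5 + [0.2 + 0.001 * i for i in range(95)]
flat = [0.5] * 100
print(scoring_reliable(clear, top_n=5))  # True: top scores stand out
print(scoring_reliable(flat, top_n=5))   # False: flag rather than rank
```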

Methodology

DUD-E (Database of Useful Decoys — Enhanced), 14 targets, ~500 compounds per target. Full automated pipeline: literature review, PDB retrieval, protein preparation, compound library construction, 3D conformer generation, CNN-based docking, multi-signal scoring, adaptive qualification, scaffold-diverse selection, ADMET screening, physics-based binding validation. AUC-ROC and Precision@20 are reported on the initial screening pass. Campaign times were measured across 42 real campaigns. All benchmarks are reproducible from the same natural-language input. March 2026.

See these results on your target

AiDDA runs the full virtual screening pipeline — from target to ranked portfolio — in hours, not weeks. Run a campaign on your protein target and compare against your current workflow.

No credit card required