Methodology | DoAIKnowYou

This research employs two complementary methodological approaches:

Matched-pair testing — extending Bertrand & Mullainathan's seminal methodology to AI systems
Structural analysis — proprietary methods detecting bias in narrative structure (methodology protected)

The first approach is fully documented here. The second produces the findings shown in our evidence section; its methodology details are forthcoming in an academic publication.

I. Matched-Pair Testing Design

Identical inputs, different names. Because every other input is held constant, any systematic difference in output is attributable to the name.

Experimental Structure

Name Pairs	54 pairs across 6 demographic contrast categories
Symptom Profiles	20 standardized clinical presentations
Total Comparisons	1,080 matched-pair tests
Control Variables	Prompt template, model version, temperature, system prompt
Independent Variable	Name only
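
The design above can be sketched as a cross product of name pairs and symptom profiles. This is a minimal illustration only: the prompt template, the placeholder names, and the `build_comparisons` helper are assumptions made for the example, not the study's actual materials.

```python
from itertools import product

# Hypothetical template; the study's actual prompt wording is not reproduced here.
TEMPLATE = "My name is {name}. {symptoms} What should I do?"


def build_comparisons(name_pairs, profiles, template=TEMPLATE):
    """Return one matched-pair record per (name pair, symptom profile)."""
    comparisons = []
    for (name_a, name_b), symptoms in product(name_pairs, profiles):
        comparisons.append({
            # The two prompts differ only in the name that is substituted in.
            "prompt_a": template.format(name=name_a, symptoms=symptoms),
            "prompt_b": template.format(name=name_b, symptoms=symptoms),
            "names": (name_a, name_b),
        })
    return comparisons


# Placeholder inputs sized like the stated design: 54 pairs x 20 profiles.
name_pairs = [(f"NameA{i}", f"NameB{i}") for i in range(54)]
profiles = [f"Symptom description {j}." for j in range(20)]

comparisons = build_comparisons(name_pairs, profiles)
print(len(comparisons))  # 1080
```

Keeping both prompts in a single record makes it straightforward to send them with the same model version, temperature, and system prompt.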

Name Pair Categories

Name-demographic associations validated through Bertrand & Mullainathan (2004) and Fryer & Levitt (2004).

Anglo vs. African American name signals
Anglo vs. Hispanic/Latino name signals
Anglo vs. Asian name signals
Male vs. Female name signals
Professional title (Dr.) vs. No title
Socioeconomic indicator variations

Symptom Profile Categories

20 standardized presentations across clinical domains:

Cardiovascular (chest pain, palpitations, shortness of breath)
Pain management (chronic pain, acute pain, medication questions)
Psychiatric (depression, anxiety, psychosis presentations)
Emergency (acute abdomen, mental health crisis)
General (fatigue, neurological symptoms)

II. Vocabulary-Level Analysis

Content analysis of AI-generated responses for explicit recommendation differences.

Analysis Methods

Keyword extraction — automated identification of treatment recommendations, urgency markers, referral language
Treatment coding — classification of recommended treatments, medications, and care pathways
Urgency scoring — quantification of urgency language (immediate, soon, routine, etc.)
Sentiment analysis — tone and framing of responses
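
As a rough illustration of urgency scoring, the sketch below maps urgency words to ordinal weights and compares the two responses in a matched pair. The lexicon and weights are hypothetical and chosen only for the example; they are not the study's coding scheme.

```python
import re

# Hypothetical lexicon: the words and weights are illustrative assumptions.
URGENCY_WEIGHTS = {
    "immediate": 3,
    "immediately": 3,
    "emergency": 3,
    "soon": 2,
    "routine": 1,
}


def urgency_score(text):
    """Return the highest urgency weight found in the text (0 if none)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return max((URGENCY_WEIGHTS.get(token, 0) for token in tokens), default=0)


def urgency_gap(response_a, response_b):
    """Positive when response A uses more urgent language than response B."""
    return urgency_score(response_a) - urgency_score(response_b)


gap = urgency_gap("Seek immediate care.", "Schedule a routine visit soon.")
print(gap)  # 1
```

Taking the maximum weight rather than a sum keeps long responses from scoring higher merely because they contain more words; either choice would need to be fixed before data collection.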

Coding Reliability

Inter-rater reliability established through:

Independent dual coding of 20% sample
Cohen's kappa > 0.85 for all coding categories
Discrepancies resolved through consensus discussion
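
For readers unfamiliar with the statistic, Cohen's kappa compares observed agreement between two coders with the agreement expected by chance. The sketch below is a generic textbook implementation with made-up labels, not the project's own script.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both coders used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Illustrative labels only.
coder_a = ["urgent", "urgent", "routine", "routine"]
coder_b = ["urgent", "routine", "routine", "routine"]
print(cohens_kappa(coder_a, coder_b))  # 0.5
```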

III. Structural Analysis

Beyond what AI says — how it says it.

Methodology Protected

Structural analysis methodology is proprietary. Academic publication forthcoming. Research partnerships available for qualified investigators.

What We Can Share

We developed methods to detect bias in narrative structure—the mathematical patterns of how AI-generated text unfolds differently based on name signals.

Analysis operates on 12 structural dimensions
Detects patterns invisible to vocabulary analysis
Produces quantitative scores comparable across conditions
Generates behavioral signatures characterizing response patterns

The 12 Dimensions (Names Only)

Structural analysis examines:

Semantic Distance — trajectory through semantic space
Power Dynamics — agency asymmetry between entities
Entropy — information-theoretic disorder patterns
Tension/Release — buildup and release cycles
Boundary Crossing — formal/personal transitions
Proximity Dynamics — linguistic intimacy patterns
Narrative Velocity — pacing and acceleration
Symmetry Breaking — balance shifts over time
Resonance — rhythmic synchronization patterns
Phase Transitions — state changes in linguistic patterns
Information Density — compression patterns
Temporal Displacement — tense and time orientation shifts

What Is Not Available

Mathematical formulas, detector code, weighting schemes, pattern classification.

Available through research partnership.

IV. Statistical Framework

Standard methods for robust inference.

Effect Size Estimation

All findings reported as Cohen's d with interpretation:

Cohen's d	Interpretation
0.2	Small effect
0.5	Medium effect
0.8	Large effect
>1.0	Very large effect
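
A minimal sketch of how an effect size could be computed and labeled with the thresholds above. It assumes the pooled-standard-deviation form of Cohen's d; the page does not state which variant is used (a paired-difference form is also common for matched designs), and the "negligible" label for values below 0.2 is an addition made for the example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def interpret(d):
    """Map |d| to the labels in the table; 'negligible' is an assumption."""
    size = abs(d)
    if size > 1.0:
        return "very large"
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Illustrative scores only.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d, interpret(d))  # 0.5 medium
```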

Multiple Comparison Correction

With 12 structural dimensions and multiple demographic comparisons, we apply:

Bonferroni correction — adjusting significance threshold by number of comparisons
False Discovery Rate control — Benjamini-Hochberg procedure
All reported findings remain significant after correction
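
The two corrections can be sketched in a few lines. This is a generic illustration with made-up p-values; the 0.05 level is an assumption, since the page does not state the threshold used.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject where p <= alpha / m, with m the number of comparisons."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Reject the k smallest p-values, k being the largest rank with p <= alpha * k / m."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Illustrative p-values only.
p_values = [0.001, 0.012, 0.02, 0.035, 0.30]
print(bonferroni(p_values))          # [True, False, False, False, False]
print(benjamini_hochberg(p_values))  # [True, True, True, True, False]
```

The example shows why the two procedures are reported separately: Bonferroni controls the family-wise error rate and is stricter, while Benjamini-Hochberg controls the expected share of false discoveries and rejects more hypotheses on the same inputs.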

Confidence Intervals

95% confidence intervals estimated through:

Bootstrap resampling — 10,000 iterations
Bias-corrected and accelerated (BCa) intervals
All reported CIs exclude zero (consistent with significance at the 0.05 level)
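
A standard-library sketch of a BCa bootstrap interval, shown for the mean of paired differences. The function name, the fixed seed, and the sample data are assumptions for illustration; in practice an established routine such as `scipy.stats.bootstrap` with `method="BCa"` would be the more usual choice.

```python
import random
from statistics import NormalDist, mean


def bca_interval(data, stat=mean, n_boot=10_000, alpha=0.05, seed=0):
    """Bias-corrected and accelerated bootstrap CI for stat(data); data is a list."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    # Resample with replacement and keep the sorted bootstrap estimates.
    boots = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))

    norm = NormalDist()
    # Bias correction: share of bootstrap estimates below the observed value,
    # clamped away from 0 and 1 so the inverse CDF is defined.
    below = sum(b < theta_hat for b in boots)
    prop = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = norm.inv_cdf(prop)

    # Acceleration: skewness of the leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den else 0.0

    def adjusted_quantile(q):
        z = norm.inv_cdf(q)
        adj = norm.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))
        idx = min(max(int(adj * n_boot), 0), n_boot - 1)
        return boots[idx]

    return adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)


# Illustrative paired differences only.
diffs = [0.8, 1.1, 0.9, 1.4, 0.7, 1.2, 1.0, 0.6, 1.3, 0.95]
low, high = bca_interval(diffs)
print(low, high)
```

An interval whose lower bound stays above zero, as it must here because every illustrative difference is positive, corresponds to the "CIs exclude zero" criterion stated above.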

V. Pre-Registration

This study's methodology and analysis plan were specified before data collection began.

Pre-Registration Details

Protocol Version	1.0
Date	January 2026
Contents	Hypotheses, methodology, analysis plan, stopping rules
Deviations	None

Pre-registration prevents post-hoc adjustment of methods to achieve desired results. All analyses reported were planned before data collection.

VI. Reproducibility

We are committed to enabling independent verification of our findings to the extent possible while protecting proprietary methodology.

What Is Available

Name pairs dataset (54 pairs, 6 categories) — Download JSON
Symptom profiles (20 standardized presentations) — Download JSON
Healthcare findings summary — Download JSON
Structural findings summary (results only) — Download JSON

What Requires Partnership

Structural analysis source code
Raw model outputs
Full statistical analysis scripts

Available to qualified researchers through formal partnership.
