Methodology

Every signal NextComp shows is built from named inputs, a specific formula, and an explicit confidence rule. This page is the canonical reference: it is built into the static site and updated on the same schedule as the data.

Last updated 2026-06-22 14:16 UTC

Data sources

CarGurus retail listings

Coverage: 21,448 active listings across 120 tracked trims

Cadence: Snapshot every 2 hours; nightly diff to detect price changes, relists, and delistings.

Schema: listing_latest, listing_snapshots, listing_enrichment, listing_dealer_spec

Bring a Trailer auctions

Coverage: 1,122 completed auctions ingested, with status sold, RNM (reserve not met), or withdrawn

Cadence: Daily ingest at 03:30 PT. Bid logs scraped opportunistically (Playwright, capped at 30/run).

Schema: auction_results (source='bat'), auction_bids (source='bat')

Cars & Bids auctions

Coverage: Closed auctions matching our 11 model rules (Porsche GT3, Ferrari SF90/296, Lamborghini Huracán/Aventador/Revuelto, McLaren 750S/765LT, Nissan GT-R)

Cadence: Daily ingest at 03:45 PT via past-auctions XHR interception.

Schema: auction_results (source='cnb')

Cross-source linkage graph

Coverage: 29 auction → retail relists tracked

Cadence: Rebuilt nightly after both auction sources finish ingest.

Schema: relist_linkages (PK: auction_source, auction_id, retail_listing_id)

Signal formulas

Authenticity score

Inputs: status, repeat_flag, no_reserve, comments, vin, seller_username, winner_username + bid log + relist_linkages

Formula: Start at 100. Subtract for each signal that fires: RNM (-30), withdrawn (-25), retail relist within 90d at +5% (-25 to -35), bat_repeat_flag (-8), multi-appearance VIN (-10), self-dealt (-40), dealer winner (-20), bidder-seller cluster (-15), top-bidder concentration ≥50% (-15), unique-bidders <3 (-20), thin engagement (-10). Floor at 0, cap at 100.

Confidence: High when bid log has ≥3 bids; medium with status known but no bids; low when missing critical fields. The confidence label and individual factor contributions are stored as JSON and shown as a tooltip on every /bat row.
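The subtraction rule above can be sketched in a few lines of Python. This is a minimal illustration, not the production scorer: the signal names are hypothetical labels, and because this page does not say how the relist deduction scales between 25 and 35 points, the sketch takes that value from the caller.

```python
# Penalty points per signal, copied from the formula above.
# The keys are hypothetical labels, not the real column names.
PENALTIES = {
    "rnm": 30,
    "withdrawn": 25,
    "bat_repeat_flag": 8,
    "multi_appearance_vin": 10,
    "self_dealt": 40,
    "dealer_winner": 20,
    "bidder_seller_cluster": 15,
    "top_bidder_concentration": 15,  # top bidder placed >= 50% of bids
    "few_unique_bidders": 20,        # fewer than 3 unique bidders
    "thin_engagement": 10,
}


def authenticity_score(fired, relist_penalty=0):
    """Return (score, factors) for the set of signal names that fired.

    relist_penalty is the 25-35 point deduction for a retail relist
    within 90 days; the caller chooses the value (assumption).
    """
    factors = {name: -PENALTIES[name] for name in fired}
    if relist_penalty:
        factors["retail_relist_90d"] = -relist_penalty
    raw = 100 + sum(factors.values())
    # Floor at 0, cap at 100.
    return max(0, min(100, raw)), factors


score, factors = authenticity_score({"rnm", "dealer_winner"}, relist_penalty=25)
# 100 - 30 - 20 - 25 = 25
```

Returning the per-factor contributions alongside the score mirrors the JSON breakdown that the tooltip displays.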

Fair value anchor

Inputs: BaT 6-month median sold price (when ≥3 sold) → confidence-adjusted ask (median ask × (1 − expected_haircut)) → hedonic median.

Formula: Per-listing hedonic residual is computed from a per-model OLS on log-price ~ mileage + age + key spec flags. The residual z-score drives the buy_signal pill (Strong Buy → Over Market).

Confidence: Depends on `distinct_days` of snapshot history and `n_active`. Surfaced in NC-0205's verdict block as low/medium/high.
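One way to read the Inputs line is as a fallback chain: use the first anchor that has enough data. The sketch below assumes that reading, which this page does not state outright. The function and parameter names are hypothetical, and the caller is expected to pass only BaT sales from the last six months.

```python
from statistics import median


def fair_value_anchor(bat_sold_prices, asks, expected_haircut, hedonic_median):
    """Return (anchor, source) from the first input with enough data.

    Assumed order: BaT median sold price, then the haircut-adjusted
    median ask, then the hedonic median.
    """
    if len(bat_sold_prices) >= 3:
        return median(bat_sold_prices), "bat_6mo_median"
    if asks:
        # Confidence-adjusted ask: median ask x (1 - expected_haircut).
        return median(asks) * (1 - expected_haircut), "adjusted_ask"
    return hedonic_median, "hedonic_median"


anchor, source = fair_value_anchor(
    [200_000, 210_000, 220_000], [250_000], 0.05, 230_000
)
# Three sold results are available, so the BaT median (210000) is used.
```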

Relist linkage

Inputs: auction_results.vin = listing_latest.vin

Formula: For every VIN that appears on both sides, compute gap_days = retail_first_seen − auction_sold_date. Keep rows where 0 ≤ gap_days ≤ 365. Markup = (retail_price − sold_price) / sold_price.

Confidence: High (VIN exact match). Spec+seller+time fallback intentionally deferred to keep the false-positive rate near zero.
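For a single VIN match, the gap and markup arithmetic looks like this. It is a sketch with hypothetical argument names; the real pipeline presumably does this as a join between auction_results and listing_latest.

```python
from datetime import date


def relist_linkage(auction_sold_date, sold_price, retail_first_seen, retail_price):
    """Return gap_days and markup for one VIN match, or None if out of window."""
    gap_days = (retail_first_seen - auction_sold_date).days
    # Keep only relists that appear 0 to 365 days after the auction.
    if not 0 <= gap_days <= 365:
        return None
    markup = (retail_price - sold_price) / sold_price
    return {"gap_days": gap_days, "markup": markup}


link = relist_linkage(date(2026, 1, 10), 200_000, date(2026, 2, 9), 230_000)
# gap_days = 30, markup = 0.15 (a 15% markup over the auction close)
```

A retail listing first seen before the auction date yields a negative gap and is dropped, so pre-auction dealer listings are not counted as relists.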

Dealer auction-flip score

Inputs: relist_linkages + listing_latest + listing_enrichment

Formula: Per dealer: n_active_sourced (linkages still active), median_premium_pct (median markup over auction close), median_gap_days, stale_share (active_sourced with DOM ≥60d). No black-box index; each metric stands alone.

Confidence: Median is descriptive when n_total_sourced ≥ 3; lower N is shown as the raw count without a percentile.
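The per-dealer metrics can be sketched as below. The dictionary keys are hypothetical, and two choices are assumptions rather than documented behavior: medians are taken over all of a dealer's linkages (not only active ones), and they are left as None when n_total_sourced is below 3 so that only the raw count is reported.

```python
from statistics import median


def dealer_flip_metrics(linkages):
    """Summarise one dealer's auction-sourced listings.

    Each linkage is a dict with hypothetical keys:
    markup (fraction), gap_days, active (bool), dom (days on market).
    """
    active = [row for row in linkages if row["active"]]
    n_total = len(linkages)
    metrics = {
        "n_total_sourced": n_total,
        "n_active_sourced": len(active),
        "median_premium_pct": None,
        "median_gap_days": None,
        "stale_share": None,
    }
    if n_total >= 3:
        metrics["median_premium_pct"] = 100 * median(r["markup"] for r in linkages)
        metrics["median_gap_days"] = median(r["gap_days"] for r in linkages)
    if active:
        # Share of still-active sourced listings with 60+ days on market.
        stale = sum(r["dom"] >= 60 for r in active)
        metrics["stale_share"] = stale / len(active)
    return metrics
```

Each metric is returned on its own, matching the choice above to avoid a single composite index.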

Should-I-act-now verdict

Inputs: market_state.state, cuts_share, sell_through_pct, expected_haircut_pct, inv_change_pct, BaT 6mo median, distinct_days

Formula: Hold off when n_active<2. Otherwise: Wait if Softening or (cuts ≥40% AND sell-through <10%); Negotiate if Stable with cuts ≥20% or Heating with asks >+5% over BaT median; Buy now if Heating and asks within ±5% of BaT median; Patient buyer otherwise.

Confidence: Verdict and confidence are computed independently. Thin-data models still get a recommendation but the confidence chip is visible (low/medium/high) so the reader knows to weight it less.
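The rule order above translates directly into a chain of checks. This sketch assumes cuts_share and sell_through_pct are percentages on a 0 to 100 scale, and it introduces a hypothetical derived input, ask_vs_bat_pct, for the median ask relative to the BaT 6-month median.

```python
def verdict(state, n_active, cuts_share, sell_through_pct, ask_vs_bat_pct=None):
    """Return the verdict label for one model.

    state: "Softening", "Stable", or "Heating" (market_state.state)
    ask_vs_bat_pct: median ask vs BaT 6-month median in percent,
                    or None when no BaT median exists (assumption)
    """
    if n_active < 2:
        return "Hold off"
    if state == "Softening" or (cuts_share >= 40 and sell_through_pct < 10):
        return "Wait"
    if state == "Stable" and cuts_share >= 20:
        return "Negotiate"
    if state == "Heating" and ask_vs_bat_pct is not None:
        if ask_vs_bat_pct > 5:
            return "Negotiate"
        if abs(ask_vs_bat_pct) <= 5:
            return "Buy now"
    return "Patient buyer"


print(verdict("Heating", 12, 10, 30, ask_vs_bat_pct=-3))  # Buy now
```

Confidence is deliberately absent from this function, since the verdict and its confidence chip are computed independently.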

Confidence levels

high

Criteria: distinct_days ≥ 21 AND n_active ≥ 8 (verdict); auction_bids has ≥3 bids (authenticity)

Semantics: Treat as load-bearing. The threshold and historic depth are sufficient to act on.

medium

Criteria: distinct_days 7–20 OR n_active 3–7

Semantics: Directionally correct. Use as a tiebreaker, not the only input.

low

Criteria: distinct_days < 7 OR n_active < 3

Semantics: Listen for direction, not magnitude. Re-check when more data lands.

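The verdict-side criteria can be expressed as a small function. The medium and low criteria overlap (for example, distinct_days = 10 with n_active = 2), so this sketch assumes the low check takes precedence; that ordering is an assumption, not something this page states.

```python
def verdict_confidence(distinct_days, n_active):
    """Map snapshot depth and active-listing count to low/medium/high."""
    # Assumed precedence: any "low" condition wins over "medium".
    if distinct_days < 7 or n_active < 3:
        return "low"
    if distinct_days >= 21 and n_active >= 8:
        return "high"
    # Everything left has distinct_days in 7-20 or n_active in 3-7.
    return "medium"


print(verdict_confidence(30, 5))   # medium
print(verdict_confidence(10, 2))   # low
```

With low checked first, whatever is neither low nor high necessarily satisfies the medium criteria, so no fourth case is needed.
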
Honest limits

Related: How it works (7-step workflow) · Glossary (term definitions) · API (JSON bundles per model).