Methodology
Every signal NextComp shows is built from named inputs, a specific formula, and an explicit confidence rule. This page is the canonical reference. It is built into the static site and updated on the same schedule as the data.
Last updated 2026-06-22 14:16 UTC
Data sources
CarGurus retail listings
Coverage: 21,448 active listings across 120 tracked trims
Cadence: Snapshot every 2 hours; nightly diff to detect price changes, relists, and delistings.
Schema: listing_latest, listing_snapshots, listing_enrichment, listing_dealer_spec
Bring a Trailer auctions
Coverage: 1,122 completed auctions ingested (statuses: sold, reserve not met (RNM), withdrawn)
Cadence: Daily ingest at 03:30 PT. Bid logs scraped opportunistically (Playwright, capped at 30/run).
Schema: auction_results (source='bat'), auction_bids (source='bat')
Cars & Bids auctions
Coverage: Closed auctions matching our 11 model rules (Porsche GT3, Ferrari SF90/296, Lamborghini Huracán/Aventador/Revuelto, McLaren 750S/765LT, Nissan GT-R)
Cadence: Daily ingest at 03:45 PT via past-auctions XHR interception.
Schema: auction_results (source='cnb')
Cross-source linkage graph
Coverage: 29 auction → retail relists tracked
Cadence: Rebuilt nightly after both auction sources finish ingest.
Schema: relist_linkages (PK: auction_source, auction_id, retail_listing_id)
Signal formulas
Authenticity score
Inputs: status, repeat_flag, no_reserve, comments, vin, seller_username, winner_username + bid log + relist_linkages
Formula: Start at 100. Subtract for each signal that fires: RNM (-30), withdrawn (-25), retail relist within 90d at +5% (-25 to -35), bat_repeat_flag (-8), multi-appearance VIN (-10), self-dealt (-40), dealer winner (-20), bidder-seller cluster (-15), top-bidder concentration ≥50% (-15), unique-bidders <3 (-20), thin engagement (-10). Floor at 0, cap at 100.
Confidence: High when bid log has ≥3 bids; medium with status known but no bids; low when missing critical fields. The confidence label and individual factor contributions are stored as JSON and shown as a tooltip on every /bat row.
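The subtract-and-clamp arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the production scorer: the signal key names and the `authenticity_score` function are hypothetical, and because the page does not say how the retail-relist penalty scales between 25 and 35, the caller passes it in directly.

```python
# Penalty sizes are copied from the formula above; the key names are hypothetical.
PENALTIES = {
    "rnm": 30,
    "withdrawn": 25,
    "bat_repeat_flag": 8,
    "multi_appearance_vin": 10,
    "self_dealt": 40,
    "dealer_winner": 20,
    "bidder_seller_cluster": 15,
    "top_bidder_concentration": 15,  # top bidder holds >= 50% of bids
    "few_unique_bidders": 20,        # fewer than 3 unique bidders
    "thin_engagement": 10,
}


def authenticity_score(fired, relist_penalty=0):
    """Return (score, factors) for one auction.

    fired: iterable of signal names that fired (keys of PENALTIES).
    relist_penalty: 0 if no qualifying retail relist, else a value in 25..35.
    How the 25..35 range is chosen is not specified, so it is an input here.
    """
    factors = {name: -PENALTIES[name] for name in fired}
    if relist_penalty:
        factors["retail_relist"] = -relist_penalty
    raw = 100 + sum(factors.values())
    score = max(0, min(100, raw))  # floor at 0, cap at 100
    return score, factors


score, factors = authenticity_score({"rnm", "dealer_winner"})
print(score, factors)
```

Returning the per-factor contributions alongside the score mirrors the stored JSON breakdown described above, so a tooltip could be rendered from the same structure.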
Fair value anchor
Inputs: BaT 6-month median sold price (when ≥3 sold) → confidence-adjusted ask (median ask × (1 − expected_haircut)) → hedonic median.
Formula: Per-listing hedonic residual is computed from a per-model OLS on log-price ~ mileage + age + key spec flags. The residual z-score drives the buy_signal pill (Strong Buy → Over Market).
Confidence: Depends on `distinct_days` of snapshot history and `n_active`. Surfaced in NC-0205's verdict block as low/medium/high.
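As a rough illustration of the anchor inputs, the sketch below reads the arrows in the Inputs line as a fallback order, which is an assumption rather than something this page states. The function name, argument names, and the rule for falling through to the hedonic median (no asks available) are all hypothetical; only the ≥3 sold threshold and the median ask × (1 − expected_haircut) adjustment come from the text.

```python
from statistics import median


def fair_value_anchor(bat_sold_prices, asks, expected_haircut, hedonic_median=None):
    """Pick a fair value anchor and report which source was used.

    bat_sold_prices: BaT sold prices from the last 6 months.
    asks: current retail asking prices for the model.
    expected_haircut: fraction, e.g. 0.10 for an expected 10% discount to ask.
    hedonic_median: precomputed hedonic median, used as the last resort.
    """
    if len(bat_sold_prices) >= 3:
        return median(bat_sold_prices), "bat_median"
    if asks:
        # Confidence-adjusted ask: median ask x (1 - expected_haircut).
        return median(asks) * (1 - expected_haircut), "adjusted_ask"
    # Assumed fall-through condition: no auction depth and no asks.
    return hedonic_median, "hedonic_median"


print(fair_value_anchor([100_000, 200_000, 300_000], [500_000], 0.10))
print(fair_value_anchor([100_000, 200_000], [400_000, 600_000], 0.10))
```

Returning the source label with the value keeps it visible which tier of the chain produced a given anchor.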
Relist linkage
Inputs: auction_results.vin = listing_latest.vin
Formula: For every VIN that appears on both sides, compute gap_days = retail_first_seen − auction_sold_date. Keep rows where 0 ≤ gap_days ≤ 365. Markup = (retail_price − sold_price) / sold_price.
Confidence: High (VIN exact match). Spec+seller+time fallback intentionally deferred to keep the false-positive rate near zero.
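For a single VIN matched on both sides, the gap and markup arithmetic might look like the following sketch. The function name and the returned dictionary shape are hypothetical; the 0 to 365 day window and the markup formula are taken from the text above.

```python
from datetime import date


def relist_linkage(auction_sold_date, sold_price, retail_first_seen, retail_price):
    """Return gap_days and markup for one VIN-matched pair, or None if out of window."""
    gap_days = (retail_first_seen - auction_sold_date).days
    if not 0 <= gap_days <= 365:
        # Retail listing predates the auction, or appeared more than a year later.
        return None
    markup = (retail_price - sold_price) / sold_price
    return {"gap_days": gap_days, "markup": markup}


link = relist_linkage(date(2026, 1, 10), 200_000, date(2026, 2, 9), 220_000)
print(link)
```

In this example the car sold at auction for 200,000 and reappeared at retail 30 days later at 220,000, a markup of 10%.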
Dealer auction-flip score
Inputs: relist_linkages + listing_latest + listing_enrichment
Formula: Per dealer: n_active_sourced (linkages still active), median_premium_pct (median markup over auction close), median_gap_days, stale_share (share of n_active_sourced with days on market ≥ 60). No black-box index; each metric stands alone.
Confidence: Median is descriptive when n_total_sourced ≥ 3; lower N is shown as the raw count without a percentile.
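A minimal sketch of the per-dealer roll-up, under stated assumptions: the input row shape (`active`, `markup`, `gap_days`, `dom`) and the function name are hypothetical, medians are taken over all of a dealer's linkages, and only the two medians are gated on n_total_sourced ≥ 3. The page does not say whether stale_share is gated the same way.

```python
from statistics import median


def dealer_flip_metrics(linkages):
    """Summarise one dealer's auction-sourced inventory.

    linkages: list of dicts with keys
      active (bool), markup (fraction), gap_days (int), dom (days on market).
    """
    active = [row for row in linkages if row["active"]]
    metrics = {
        "n_total_sourced": len(linkages),
        "n_active_sourced": len(active),
        "median_premium_pct": None,
        "median_gap_days": None,
        "stale_share": None,
    }
    if len(linkages) >= 3:
        # Medians are only reported once there are at least 3 sourced cars.
        metrics["median_premium_pct"] = 100 * median(row["markup"] for row in linkages)
        metrics["median_gap_days"] = median(row["gap_days"] for row in linkages)
    if active:
        stale = sum(1 for row in active if row["dom"] >= 60)
        metrics["stale_share"] = stale / len(active)
    return metrics


rows = [
    {"active": True, "markup": 0.05, "gap_days": 10, "dom": 70},
    {"active": True, "markup": 0.10, "gap_days": 20, "dom": 30},
    {"active": False, "markup": 0.20, "gap_days": 40, "dom": 90},
]
print(dealer_flip_metrics(rows))
```

Each metric is returned separately rather than folded into one index, in keeping with the "each metric stands alone" rule.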
Should-I-act-now verdict
Inputs: market_state.state, cuts_share, sell_through_pct, expected_haircut_pct, inv_change_pct, BaT 6mo median, distinct_days
Formula: Hold off when n_active<2. Otherwise: Wait if Softening or (cuts ≥40% AND sell-through <10%); Negotiate if Stable with cuts ≥20% or Heating with asks >+5% over BaT median; Buy now if Heating and asks within ±5% of BaT median; Patient buyer otherwise.
Confidence: Verdict and confidence are computed independently. Thin-data models still get a recommendation but the confidence chip is visible (low/medium/high) so the reader knows to weight it less.
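The verdict rules can be read as an ordered decision list, sketched below. Several details are assumptions: rules are evaluated top to bottom with the first match winning, cuts_share and sell_through_pct are expressed in percentage points, and `ask_vs_bat_pct` (how far asks sit above or below the BaT 6-month median, in percent) is a hypothetical derived input that is None when no BaT median exists.

```python
def verdict(state, n_active, cuts_share, sell_through_pct, ask_vs_bat_pct=None):
    """Return the act-now verdict for one model.

    state: market_state.state, one of "Softening", "Stable", "Heating".
    cuts_share, sell_through_pct: percentage points (assumed units).
    ask_vs_bat_pct: asks relative to the BaT median in percent, or None.
    """
    if n_active < 2:
        return "Hold off"
    if state == "Softening" or (cuts_share >= 40 and sell_through_pct < 10):
        return "Wait"
    if state == "Stable" and cuts_share >= 20:
        return "Negotiate"
    if state == "Heating" and ask_vs_bat_pct is not None:
        if ask_vs_bat_pct > 5:
            return "Negotiate"
        if ask_vs_bat_pct >= -5:
            return "Buy now"
    return "Patient buyer"


print(verdict("Heating", n_active=12, cuts_share=5, sell_through_pct=30, ask_vs_bat_pct=-3))
print(verdict("Stable", n_active=12, cuts_share=45, sell_through_pct=5))
```

The function takes no confidence input, reflecting the rule that the verdict and its confidence chip are computed independently.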