Data

Learn about the data used in NeoVIN, including its sources and how it is processed.

Data Sources Used by NeoVIN

NeoVIN is powered by a comprehensive and authoritative dataset that enables accurate decoding of Vehicle Identification Numbers (VINs). The data is sourced from multiple streams, each contributing unique and critical information about vehicles. This multi-source approach ensures that NeoVIN can deliver precise and reliable VIN decoding for a wide range of vehicles, including those that are no longer in active production or inventory.

The primary data sources for NeoVIN include:

  • Build Specifications (Window Sticker Data)
  • OEM Inventories
  • MarketCheck Crawled Data
  • Raw Specs Provider

1. Build Specifications (Window Sticker Data)

Build Specifications, also known as Window Sticker Data, form the backbone of NeoVIN's decoding capabilities. This dataset captures the original configuration of vehicles as specified by manufacturers at the time of vehicle release. It includes detailed information such as:

  • Factory-installed options
  • Equipment packages
  • Manufacturer's Suggested Retail Price (MSRP)
  • Engine, transmission, and drivetrain specifications

This data is collected from OEM build sheets and window stickers, providing a definitive snapshot of the vehicle as built. It is structured and complete, offering the highest level of accuracy within the NeoVIN decoding pipeline.

This dataset is essential for accurately identifying vehicle features, options, and configurations, making it a critical component of the NeoVIN service.

Window Sticker Data Availability Stats (as of Sept 17, 2025):

  • Total VINs: 100M+ (via MarketCheck)
  • Coverage: 2012–present (varies by make and availability)

US Market:

Make             Count  Country
Ford        12,414,411  US
Toyota      12,293,220  US
Chevrolet   10,648,229  US
Nissan      10,625,410  US
Jeep         8,536,238  US
Kia          7,971,297  US
Hyundai      6,420,690  US
Ram          5,430,024  US
GMC          3,883,056  US
Subaru       3,771,572  US
Dodge        3,678,081  US
Lexus        2,335,356  US
Chrysler     1,774,283  US
Mitsubishi   1,289,168  US
Buick        1,107,635  US
Cadillac     1,002,201  US
Lincoln        605,183  US
Genesis        366,107  US
Fiat           168,125  US
Alfa Romeo     127,526  US
BMW             23,353  US
Scion            1,203  US

Canadian Market:

Make             Count  Country
Ram            780,559  CA
Chevrolet      611,014  CA
Jeep           576,128  CA
Toyota         497,508  CA
Ford           480,605  CA
GMC            406,781  CA
Nissan         364,303  CA
Dodge          337,351  CA
Mitsubishi     147,227  CA
Buick          103,449  CA
Chrysler        99,614  CA
Lexus           92,135  CA
Cadillac        69,268  CA
Fiat            13,628  CA
Lincoln         13,318  CA
Alfa Romeo       7,467  CA
Kia              4,197  CA
Hyundai          3,635  CA
Genesis          1,378  CA
BMW              1,105  CA
Scion              369  CA
Subaru              43  CA

2. OEM Inventories

NeoVIN integrates active and historical inventory data sourced directly from OEMs (Original Equipment Manufacturers). This dataset includes:

  • Active and historical vehicle inventories
  • Vehicle attributes and configurations such as engine, transmission, and drivetrain
  • Factory-installed options
  • Equipment packages
  • Manufacturer's Suggested Retail Price (MSRP)

Data Availability Stats (as of Sept 17, 2025):

  • Total VINs: 28M+
  • Coverage: 2018–present (varies by make and availability)

US Market:

Make             Count  Country
Toyota       8,108,325  US
Jeep         2,661,119  US
Chevrolet    2,424,040  US
Mazda        1,841,059  US
Ram          1,675,923  US
Volkswagen   1,590,464  US
Hyundai      1,326,143  US
BMW          1,023,195  US
Audi           919,720  US
GMC            872,266  US
Kia            696,426  US
Ford           542,193  US
Dodge          307,160  US
Land Rover     296,816  US
Buick          222,857  US
Chrysler       188,820  US
Cadillac       185,595  US
Nissan         117,401  US
Jaguar          47,683  US
Maserati        16,648  US
Acura           14,672  US

Canadian Market:

Make             Count  Country
Ram            272,750  CA
Audi           143,994  CA
Jeep           115,124  CA
Dodge           43,180  CA
Chevrolet       42,849  CA
GMC             18,673  CA
Chrysler        15,864  CA
Buick            8,826  CA
Cadillac         5,680  CA
Fiat             3,198  CA

3. MarketCheck Crawled Data

MarketCheck provides a vast dataset of vehicle information collected from various online sources, including dealership websites, classifieds, and online marketplaces. This dataset is particularly valuable for:

  • Historical vehicle data
  • Off-market vehicles
  • Optional equipment and trim variations
  • Price intelligence and market analysis

This data is collected through web crawling and aggregation, ensuring that NeoVIN has access to a wide range of vehicle information beyond what is available from OEMs. It includes both new and used vehicles, providing insights into market trends and vehicle availability.

Stats (as of Sept 17, 2025):

  • Total VINs: 243,878,947
  • Historical Coverage: 2015–present
  • Top Sources:
    • carmax.com (88,039 used)
    • autonation.com (61,448 new, 23,633 used)
    • lithia.com (59,513 new, 35,783 used)
    • ...and many others

This vast dataset empowers NeoVIN to decode VINs for vehicles no longer in active inventory, including discontinued and off-lease models.

4. Raw Specs Provider

NeoVIN also utilizes a raw specs provider that offers foundational vehicle specification data. This provider supplies detailed information about vehicles, including:

  • Makes, models, and versions
  • Options and equipment
  • Squish VIN-level records

This data source is essential for providing a baseline of vehicle specifications, which NeoVIN then enriches with proprietary modules to produce detailed decoding outputs. It ensures that even vehicles with limited or no OEM data can be accurately decoded.

This raw specs provider is particularly useful for vehicles that may not have comprehensive build specifications available, allowing NeoVIN to fill in gaps and provide a more complete picture of the vehicle's attributes.
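
For readers unfamiliar with the term, a "squish VIN" is commonly formed from VIN positions 1–8 plus positions 10–11, dropping the check digit (position 9) and the serial number (positions 12–17). The Python sketch below illustrates that common convention. The function name `squish_vin` is hypothetical, the VIN shown is only a sample value, and whether the raw specs provider keys its records in exactly this way is an assumption.

```python
def squish_vin(vin: str) -> str:
    """Return the 10-character squish VIN for a 17-character VIN.

    Assumes the common convention: positions 1-8 plus positions 10-11.
    The check digit (position 9) and serial (positions 12-17) are dropped.
    """
    vin = vin.strip().upper()
    if len(vin) != 17:
        raise ValueError("expected a 17-character VIN")
    # Python slices are 0-indexed: [:8] is positions 1-8, [9:11] is 10-11.
    return vin[:8] + vin[9:11]


# Sample VIN for illustration only.
print(squish_vin("1HGCM82633A004352"))  # 1HGCM8263A
```

Because the serial number is removed, many individual vehicles share one squish VIN, which is why it works as a lookup key for baseline specifications rather than for per-vehicle build data.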

These sources can overlap: the same VIN may appear in window sticker data, OEM inventories, and dealer listings. When it does, build specifications data always takes precedence.
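
The precedence rule can be sketched as a field-by-field merge in Python. This is a minimal illustration, not NeoVIN's actual implementation: the source labels, field names, and sample values are hypothetical, and only "build specifications first" comes from the description above. The relative order of the remaining sources is an assumption made for the example.

```python
# Hypothetical source labels. Only "build_specs comes first" is stated in the
# documentation; the order of the other three is assumed for illustration.
SOURCE_PRIORITY = ["build_specs", "oem_inventory", "crawled_listings", "raw_specs"]


def merge_records(records: dict) -> dict:
    """Merge per-source attribute dicts for one VIN.

    For each field, keep the value from the highest-priority source that
    provides a non-None value; lower-priority sources only fill gaps.
    """
    merged = {}
    for source in SOURCE_PRIORITY:
        for field, value in records.get(source, {}).items():
            if value is not None and field not in merged:
                merged[field] = value
    return merged


# Made-up sample data for one VIN seen in two sources.
sample = {
    "oem_inventory": {"trim": "XLT", "msrp": "41000"},
    "build_specs": {"trim": "XLT Sport", "engine": "2.7L V6"},
}
print(merge_records(sample))
```

In this sketch the `trim` value comes from the build specs record, while `msrp` is filled from the OEM inventory record because the build specs record does not supply it.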

Summary

NeoVIN's data architecture is built to leverage multiple authoritative sources, ensuring accurate and comprehensive VIN decoding across a wide range of vehicles. By integrating OEM build specifications, OEM inventory data, MarketCheck crawled datasets, and raw specs provider information, NeoVIN delivers detailed insights into vehicle configurations, options, features, and equipment.