Scrapinghub Reference Documentation

Vehicle Extraction (beta)

If you request vehicle extraction and the extraction succeeds, the vehicle field is available in the query result:

from autoextract.sync import request_raw

query = [{
    'url': 'https://example.com/vehicle',
    'pageType': 'vehicle'
}]
results = request_raw(query, api_key='[api key]')
print(results[0]['vehicle'])

Vehicle is a subclass of product, so it has all the product fields plus some fields that are specific to vehicles.

The fields specific to vehicle are listed below:

Name

Type

Description

vehicleIdentificationNumber

String

The Vehicle Identification Number (VIN), a unique identifier that is different for every vehicle.

mileageFromOdometer

Dictionary with value and unitCode fields

The mileage of the vehicle. value is an integer indicating the distance travelled by the vehicle; unitCode is a string, either SMI for miles or KMT for kilometers.

vehicleTransmission

String

The vehicle transmission, that is, the type of component used for transmitting power from a rotating power source to the wheels or other relevant components.

fuelType

String

The type of fuel suitable for the engine of the vehicle.

vehicleEngine

Dictionary with raw string field

Information about the engine or engines of the vehicle. The raw field contains the raw text present on the site, without any parsing.

color

String

The exterior color of the car.

vehicleInteriorColor

String

The interior color of the car.

availableAtOrFrom

Dictionary with raw string field

The place where the car is located. The raw field contains the raw text present on the site, without any parsing.

numberOfDoors

Integer

The number of doors in the car.

vehicleSeatingCapacity

Integer

Seating capacity of the car.

fuelEfficiency

List of dictionaries with raw string field

A measure of the fuel efficiency of the vehicle. It can be expressed as distance per unit of fuel (e.g. 20 miles per gallon) or fuel per unit of distance (e.g. 8 liters per 100 km). The raw field contains the raw text present on the site, without any parsing.
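As an illustration of the mileageFromOdometer structure described above, the following sketch converts a reading to kilometers. The helper name mileage_in_km is hypothetical and not part of the autoextract client library; the only assumptions are the two unitCode values listed above and the standard definition of the mile (1.609344 km).

```python
# Hypothetical helper; not part of the autoextract client library.
KM_PER_MILE = 1.609344  # international statute mile


def mileage_in_km(mileage):
    """Return the odometer reading in kilometers, or None if unavailable."""
    if not mileage or 'value' not in mileage:
        return None  # the field is optional and may be missing entirely
    unit = mileage.get('unitCode')
    if unit == 'KMT':
        return float(mileage['value'])
    if unit == 'SMI':
        return mileage['value'] * KM_PER_MILE
    return None  # unknown or missing unit code


print(mileage_in_km({'value': 25000, 'unitCode': 'KMT'}))
```

Returning None for a missing field or an unrecognized unit keeps the caller from mixing unconverted values with converted ones.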

The following product fields are also extracted for vehicles:

Name

Type

Description

name

String

The name of the vehicle.

offers

List of dictionaries with price, currency, availability and regularPrice string fields

Offers of the vehicle. All fields are optional, but currency is present only if price is also present. price is a string containing a valid number (a dot is the decimal separator); it is the price the customer has to pay after any discount or special offer. currency is the currency as given on the website, without extra normalization (for example, both “$” and “USD” are possible currencies). regularPrice is the price before any discount or special offer; it is present only when it differs from price. availability is the vehicle availability; currently it can be either "InStock" or "OutOfStock". "InStock" includes the following cases: in-stock, limited availability, pre-sale (the item is available for ordering and delivery before general availability), pre-order (the item is available for pre-order, but will be delivered when generally available), and in-store-only (the item is available only at physical locations). "OutOfStock" includes the following cases: out-of-stock, discontinued, and sold-out.

sku

String

Stock Keeping Unit identifier for the vehicle assigned by the seller.

mpn

String

Model of the vehicle. It is issued by the manufacturer and is the same across different websites for a given vehicle.

gtin

List of dictionaries with type and value string fields

Standardized GTIN product identifier, which is unique for a product across different sellers. The following types are supported: isbn10, isbn13, issn, ean13, upc, ismn, gtin8, gtin14. gtin14 corresponds to the former names EAN/UCC-14, SCC-14, DUN-14, UPC Case Code and UPC Shipping Container Code. ean13 also includes the JAN (Japanese Article Number). For example: [{'type': 'isbn13', 'value': '9781933624341'}]

brand

String

Brand or manufacturer of the vehicle.

breadcrumbs

List of dictionaries with optional name and link string fields

A list of breadcrumbs (a specific navigation element) with optional name and URL.

mainImage

String

A URL or data URL value of the main image of the vehicle.

images

List of strings

A list of URL or data URL values of all images of the vehicle (may include the main image).

description

String

Description of the vehicle.

aggregateRating

Dictionary with ratingValue and bestRating float fields and a reviewCount int field

ratingValue is the average rating value. bestRating is the best possible rating value. reviewCount is the number of reviews or ratings for the vehicle. All fields are optional, but at least one of reviewCount or ratingValue is present.

additionalProperty

List of dictionaries with name and value fields

A list of vehicle properties or characteristics. The name field contains the property name, and the value field contains the property value.

probability

Float

Probability that the requested page is a single vehicle page.

url

String

URL of the page from which this vehicle was extracted.
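Because price and regularPrice in offers are strings rather than numbers, a consumer usually has to parse them before doing arithmetic. The sketch below is one way to do that with Python's decimal module; offer_discount is a hypothetical helper name, not part of the autoextract client library, and it assumes only the field semantics described for offers above.

```python
from decimal import Decimal, InvalidOperation


def offer_discount(offer):
    """Hypothetical helper: return (price, regular_price, discount).

    Each element is a Decimal, or None when the underlying field is
    missing or cannot be parsed as a number.
    """
    def to_decimal(text):
        if text is None:
            return None
        try:
            return Decimal(text)
        except InvalidOperation:
            return None

    price = to_decimal(offer.get('price'))
    regular = to_decimal(offer.get('regularPrice'))
    # regularPrice is only present when it differs from price.
    discount = regular - price if price is not None and regular is not None else None
    return price, regular, discount


offer = {'price': '42000', 'currency': 'USD',
         'availability': 'InStock', 'regularPrice': '48000'}
print(offer_discount(offer))
```

Decimal is used instead of float so that prices such as "19999.99" are not affected by binary rounding. The currency value is left untouched because it is not normalized.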

All fields are optional, except for url and probability. Fields without a valid value (null or empty array) are excluded from extraction results.
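Since only url and probability are guaranteed to be present, code that reads a vehicle dictionary should not index optional fields directly. The following is a minimal sketch of defensive access; the helper name describe_vehicle and the 0.5 probability threshold are illustrative assumptions, not values recommended by the API.

```python
def describe_vehicle(vehicle, min_probability=0.5):
    """Hypothetical helper: pick a few fields, tolerating missing ones.

    min_probability is an arbitrary illustrative threshold.
    """
    # url and probability are always present, so direct indexing is safe.
    if vehicle['probability'] < min_probability:
        return None
    return {
        'url': vehicle['url'],
        # Optional fields: fall back to None or an empty container.
        'name': vehicle.get('name'),
        'vin': vehicle.get('vehicleIdentificationNumber'),
        'engine': vehicle.get('vehicleEngine', {}).get('raw'),
        'images': vehicle.get('images', []),
        'properties': {p['name']: p['value']
                       for p in vehicle.get('additionalProperty', [])},
    }


minimal = {'url': 'https://example.com/vehicle', 'probability': 0.95}
print(describe_vehicle(minimal))
```

A vehicle that carries only the two mandatory fields still produces a complete dictionary, with None or empty containers standing in for the excluded fields.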

Below is an example response with all vehicle fields present:

[
  {
    "vehicle": {
      "name": "Vehicle name",
      "offers": [
        {
          "price": "42000",
          "currency": "USD",
          "availability": "InStock",
          "regularPrice": "48000"
        }
      ],
      "sku": "Vehicle sku",
      "mpn": "Vehicle model",
      "vehicleIdentificationNumber": "4T1BE32K25U056382",
      "mileageFromOdometer": {
        "value": 25000,
        "unitCode": "KMT"
      },
      "vehicleTransmission": "manual",
      "fuelType": "Petrol",
      "vehicleEngine": {
        "raw": "4.4L "
      },
      "availableAtOrFrom": {
        "raw": "New york"
      },
      "color": "black",
      "vehicleInteriorColor": "Silver",
      "numberOfDoors": 5,
      "vehicleSeatingCapacity": 6,
      "fuelEfficiency": [
        {
          "raw": "45 mpg (city)"
        }
      ],
      "gtin": [
        {
          "type": "ean13",
          "value": "978-3-16-148410-0"
        }
      ],
      "brand": "vehicle brand",
      "breadcrumbs": [
        {
          "name": "Level 1",
          "link": "http://example.com"
        }
      ],
      "mainImage": "http://example.com/image.png",
      "images": [
        "http://example.com/image.png"
      ],
      "description": "vehicle description",
      "aggregateRating": {
        "ratingValue": 4.5,
        "bestRating": 5.0,
        "reviewCount": 31
      },
      "additionalProperty": [
        {
          "name": "property 1",
          "value": "value of property 1"
        }
      ],
      "probability": 0.95,
      "url": "https://example.com/vehicle"
    },
    "webPage": {
      "inLanguages": [
        {"code": "en"},
        {"code": "es"}
      ]
    },
    "query": {
      "id": "1564747029122-9e02a1868d70b7a2",
      "domain": "example.com",
      "userQuery": {
        "pageType": "vehicle",
        "url": "https://example.com/vehicle"
      }
    },
    "algorithmVersion": "20.8.1"
  }
]
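To show how the top-level structure of such a response can be navigated, the sketch below parses a trimmed copy of the example above and flattens each result into a small summary. The helper name summarize_results and the choice of summary keys are hypothetical; only the field names come from the response format shown above.

```python
import json

# Trimmed copy of the example response, kept short for illustration.
RESPONSE_TEXT = """
[{"vehicle": {"name": "Vehicle name",
              "fuelEfficiency": [{"raw": "45 mpg (city)"}],
              "gtin": [{"type": "ean13", "value": "978-3-16-148410-0"}],
              "probability": 0.95,
              "url": "https://example.com/vehicle"},
  "query": {"userQuery": {"pageType": "vehicle",
                          "url": "https://example.com/vehicle"}}}]
"""


def summarize_results(response_text):
    """Hypothetical helper: flatten each result into a small summary dict."""
    summaries = []
    for result in json.loads(response_text):
        # The vehicle key is only available when extraction succeeded.
        vehicle = result.get('vehicle', {})
        summaries.append({
            'requested_url': result['query']['userQuery']['url'],
            'name': vehicle.get('name'),
            'fuel_efficiency': [item['raw']
                                for item in vehicle.get('fuelEfficiency', [])],
            'gtin_by_type': {g['type']: g['value']
                             for g in vehicle.get('gtin', [])},
        })
    return summaries


for summary in summarize_results(RESPONSE_TEXT):
    print(summary)
```

The same function can be applied to the list returned by request_raw after serializing it with json.dumps, or adapted to take the parsed list directly.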