General

To create your first configuration, click the Create button.

In the next step, you will see the General tab, which contains the main settings, including the configuration name and the URL of the site that you want to scrape.

You can use one of 3 options:

  • Start from a single page

  • Use a file with a list of URLs

  • Download the URLs from a sitemap

Start from a single page - specify the URL of the page that you want to scrape data from.

Use a file with a list of URLs - provide a file that contains all the URLs that you want to scrape.

Download the URLs from a sitemap - use the sitemap to scrape all the products from the specified site.
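
To illustrate the sitemap option, here is a minimal Python sketch of how product URLs can be read from a sitemap that follows the sitemaps.org format. It is not the application's own code. The function name urls_from_sitemap and the sample sitemap content are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; a real sitemap would come from the store itself.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/product/red-shirt</loc></url>
  <url><loc> https://example.com/product/blue-shirt </loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""


def urls_from_sitemap(xml_text):
    """Return every non-empty <loc> value from a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        text = (loc.text or "").strip()
        if text:
            urls.append(text)
    return urls


if __name__ == "__main__":
    for url in urls_from_sitemap(SAMPLE_SITEMAP):
        print(url)
```

Note that a sitemap usually lists every page of the site, not only the product pages, so the resulting list may include pages such as "about".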

This section allows setting the following parameters:

  • Scan depth - defines how deep the scraper scans the site, starting from the entry page
    *0 - the scraper checks only the page that is set in the Entry URL field.
    *1 - the scraper scans all the pages on the 1st level (blogs, accounts, about, delivery, related products, etc.)
    *100 - unlimited

  • Number of threads - is used to speed up the scraping process. You can set several threads to run at the same time. We recommend using the default value, 3, to avoid being banned by the target site

  • Wait up to - the time that the application will wait for a response from the server

IMPORTANT NOTE: The free trial version allows only 1 thread and a Wait up to value of 1 second, and you can scrape up to 100 products

  • Result Data format - a required parameter that defines the format of the data that you will get after scraping. Since different shopping carts use different file formats for import, the software prepares a ready-to-use file. At the moment, the supported formats are WooCommerce, Magento, Shopify, and PrestaShop, or you can set a Custom format
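
The Scan depth values described above can be pictured with a short Python sketch. It is only an illustration, not the application's actual algorithm. It assumes that "1st level" means the pages linked directly from the Entry URL, and it uses a hypothetical in-memory map of links (LINKS) instead of real pages. The function name pages_to_scan is also hypothetical.

```python
from collections import deque

# The list above describes a Scan depth of 100 as "unlimited".
UNLIMITED_DEPTH = 100

# Hypothetical site structure: each page maps to the pages it links to.
LINKS = {
    "/": ["/blog", "/about", "/product-1"],
    "/blog": ["/blog/post-1"],
    "/about": [],
    "/product-1": ["/product-2", "/"],
    "/blog/post-1": [],
    "/product-2": ["/product-3"],
    "/product-3": [],
}


def pages_to_scan(entry_url, scan_depth, links=LINKS):
    """Return the pages reached from entry_url within scan_depth link hops."""
    max_depth = None if scan_depth >= UNLIMITED_DEPTH else scan_depth
    seen = {entry_url}
    order = [entry_url]
    queue = deque([(entry_url, 0)])
    while queue:
        url, depth = queue.popleft()
        # Do not follow links from pages that already sit at the depth limit.
        if max_depth is not None and depth >= max_depth:
            continue
        for linked in links.get(url, []):
            if linked not in seen:  # skip pages that were already queued
                seen.add(linked)
                order.append(linked)
                queue.append((linked, depth + 1))
    return order


if __name__ == "__main__":
    for depth in (0, 1, 100):
        print(depth, pages_to_scan("/", depth))
```

Under these assumptions, depth 0 returns only the entry page, depth 1 adds the pages it links to, and depth 100 keeps following links until no new pages are found.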

The next section is used for scraping variations:

  • Fill variations empty fields*

  • Export column with row type (product/variation)*

  • Prevent duplicate variations - we recommend checking this option if you scrape products with variations, to avoid duplicates

*The first 2 options are activated if you select the Custom data format

Here you can specify the configuration name - it is filled automatically when you enter the URL, but you can change it if needed.

Note that this tool allows you to set up separate configurations for different stores, so using the store name or another specific parameter in the configuration name will help you identify which configuration was used for which store.

You can also add comments in the Comment section.

Advanced settings are used for the specific

  • User agent - you can specify the user agent to identify yourself (e.g. Mozilla) or skip this option

  • Wait up to - use this setting if the page takes a long time to load. We recommend setting 1-5 seconds

  • Scroll page to the bottom before data parsing - check this option to make sure that the scraper scans the page to the bottom

  • Always load images - check this option if you want to load the images
    In most cases, this option is used for Magento stores. Magento uses Fotorama for the Media Gallery, so check this option in order to load all the images
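
To show what the User agent and Wait up to settings mean at the HTTP level, here is a minimal Python sketch based on the standard urllib.request module. It is not how the application is implemented. The names build_request, fetch and wait_up_to_seconds are hypothetical, and treating Wait up to as a request timeout is an assumption made for this example.

```python
from urllib.request import Request, urlopen


def build_request(url, user_agent=None):
    """Create a request; send a User-Agent header only when one is given."""
    headers = {}
    if user_agent:
        headers["User-Agent"] = user_agent
    return Request(url, headers=headers)


def fetch(url, user_agent=None, wait_up_to_seconds=5):
    """Download a page, giving up if the server does not respond in time."""
    request = build_request(url, user_agent)
    # Assumption: "Wait up to" is modelled here as the request timeout.
    with urlopen(request, timeout=wait_up_to_seconds) as response:
        return response.read()


if __name__ == "__main__":
    # Building the request does not contact the server.
    request = build_request("https://example.com/", user_agent="Mozilla/5.0")
    print(request.get_header("User-agent"))
```

If the User agent field is skipped, no custom header is set in this sketch, and the library's default user agent is sent when the request is made.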