Creating a dataset from a remote source (URL, API, FTP)

Updated by Patrick Smith

As a source of data for a dataset, you can add a file from your computer, a URL (a remote server or API), or an FTP server (using the FTP, FTPS, or SFTP protocol).

The size limit for a file is 240 MB. If your file is too big, you can compress it before uploading it.
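
As an illustration of the size check described above, here is a minimal Python sketch that compresses a file only when it exceeds the limit. It rests on several assumptions that are not stated on this page: the limit is read as 240 × 1,000,000 bytes, gzip is taken to be one of the supported compressed formats (check Supported file formats first), and the limit is assumed to apply to the compressed file. The function name `prepare_for_upload` is hypothetical.

```python
import gzip
import shutil
from pathlib import Path

# Assumption: 240 MB is interpreted as 240 * 1,000,000 bytes.
# The platform's exact threshold may be computed differently.
SIZE_LIMIT_BYTES = 240 * 1000 * 1000


def prepare_for_upload(path, limit=SIZE_LIMIT_BYTES):
    """Return the path to upload: the original file if it fits, else a gzip copy."""
    source = Path(path)
    if source.stat().st_size <= limit:
        return source

    # Assumption: gzip is an accepted compressed format.
    compressed = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)

    if compressed.stat().st_size > limit:
        raise ValueError(f"{compressed.name} still exceeds the limit after compression")
    return compressed
```

Calling `prepare_for_upload("mydata.csv")` would return either the original path or the path of a new `mydata.csv.gz` file next to it.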

For more information on compressed or uncompressed file formats, see Supported file formats.

Uploading a file from a URL (remote server or API)

You can connect the platform to:

  • A remote server via its URL, to import files stored there
  • A remote source that exposes data records over an API

Opendatasoft supports HTTP and HTTPS URLs to files, for example "http://example.org/mydata.csv".

Opendatasoft supports the following authenticated connection methods: login/password, API key, OAuth2 (client_credentials grant type), and credentials supplied in an HTTP header.

  1. In Catalog > Datasets, click on the New dataset button
  2. In the wizard that opens, under the Retrieve a file section, select From a URL
  3. Configure your HTTP connection
  4. From the preview of the first 20 records that appears, configure the source
  5. Configure the dataset information or use the prefilled values:
    • In the Dataset name field, enter the title for this dataset
    • In the Dataset technical identifier field, enter a meaningful identifier for this dataset
    • If you want anyone with access to your workspace to be able to explore the dataset, toggle off Access restricted to allowed users and groups

If you are using an API that was provided to you, it must meet the following requirements:
- It must be a REST API called with the "GET" method, preferably returning JSON.
- It must be reachable over the internet.
- It must not be paginated or incremental: all results must be returned in a single call.
- It must be accessible to the Opendatasoft platform through simple authentication (permanent API key, fixed HTTP header, etc.).
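
To check in advance whether an API is likely to meet these requirements, you can reproduce the kind of call they describe: a single "GET" request, authenticated with a fixed header, that returns JSON. The sketch below uses only the Python standard library. The header name `X-API-Key`, the function names, and the expectation that the response is a JSON list of records are all assumptions made for illustration; use whatever your API actually documents.

```python
import json
import urllib.request


def build_request(url, api_key):
    """Build a single GET request that carries a permanent key in a fixed header."""
    # "X-API-Key" is a hypothetical header name; replace it with the one your API expects.
    headers = {"X-API-Key": api_key, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def parse_records(body):
    """Decode a JSON response body and return its records as a list."""
    payload = json.loads(body)
    # Assumption: the API returns all records as one top-level JSON list.
    if not isinstance(payload, list):
        raise ValueError("expected a JSON list of records in a single response")
    return payload


def fetch_records(url, api_key, timeout=30):
    """Send the request and return every record from that one call."""
    with urllib.request.urlopen(build_request(url, api_key), timeout=timeout) as response:
        return parse_records(response.read())
```

If `fetch_records` needs a second call to get the remaining results, the API is paginated and does not satisfy the single-call requirement.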

Uploading a file from an FTP server

Opendatasoft supports FTP, FTPS, or SFTP URLs to:

  • Files, for example: "ftp://example.org/my_dir/mydata.csv"
  • Folders, for example: "ftp://example.org/my_dir/"
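
The difference between the two kinds of URL can be sketched in a few lines of Python. This is only an illustration: it assumes that a trailing slash is what marks a folder, as in the examples above, and the function name `classify_ftp_url` is hypothetical. How the platform itself tells files from folders is not described here.

```python
from urllib.parse import urlsplit

# Schemes taken from the protocols listed above.
SUPPORTED_SCHEMES = {"ftp", "ftps", "sftp"}


def classify_ftp_url(url):
    """Return "file" or "folder" for an FTP, FTPS, or SFTP URL."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("the URL has no host name")
    # Assumption: a path that is empty or ends with "/" points to a folder.
    if parts.path == "" or parts.path.endswith("/"):
        return "folder"
    return "file"


print(classify_ftp_url("ftp://example.org/my_dir/mydata.csv"))  # file
print(classify_ftp_url("ftp://example.org/my_dir/"))            # folder
```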

  1. In Catalog > Datasets, click on the New dataset button
  2. In the wizard that opens, under the Retrieve a file section, select From an FTP server
  3. Configure your connection
     SSH certificate authentication is available for SFTP servers. To configure it:
     - Paste the entire private key file into the Private Key field
     - The passphrase applies to the private key and is optional
  4. From the preview of the first 20 records that appears, configure the source
  5. Configure the dataset information or use the prefilled values:
    • In the Dataset name field, enter the title for this dataset
    • In the Dataset technical identifier field, enter a meaningful identifier for this dataset
    • If you want anyone with access to your workspace to be able to explore the dataset, toggle off Access restricted to allowed users and groups

Opendatasoft does not support implicit FTPS, as it is considered deprecated.

After you create a dataset, an interface for that dataset opens. Only users with the right permissions, either "Create dataset" or "Edit dataset", can use this interface. There you can process the data, configure the dataset and its visualizations, and publish the dataset.

If you update your source file and republish, the entire file will be loaded. However, if the source is a folder and not a file, Opendatasoft will take into account only the newly uploaded files within that folder.

If you delete a file from the FTP folder, you need to perform a small piece of upkeep. To the right of the resource, click on Clean cache before republishing, or else the change will not be reflected in the related dataset. For more information about FTP folders, please read this section about sourcing multiple files stored on an FTP server.
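
The behavior described above for folder sources can be pictured with a toy model in Python. This is not how the platform is implemented; it is a simplified sketch in which a "cache" remembers the files already loaded, and the names `republish` and `clean_cache` are hypothetical.

```python
def republish(folder_listing, cache):
    """Return (updated_cache, newly_loaded) for one republish of a folder source."""
    # Only files that are not yet in the cache are loaded.
    newly_loaded = sorted(set(folder_listing) - cache)
    return cache | set(newly_loaded), newly_loaded


def clean_cache():
    """Forget every previously loaded file (stands in for the Clean cache action)."""
    return set()


# First publication: both files are loaded.
cache, loaded = republish(["jan.csv", "feb.csv"], set())

# A new file appears: only that file is loaded.
cache, loaded = republish(["jan.csv", "feb.csv", "mar.csv"], cache)

# jan.csv is deleted from the folder: nothing new is loaded,
# and the cache still remembers jan.csv.
cache, loaded = republish(["feb.csv", "mar.csv"], cache)

# After cleaning the cache, a republish reflects the folder as it is now.
cache, loaded = republish(["feb.csv", "mar.csv"], clean_cache())
```

In this model, the deleted file keeps contributing to the dataset until the cache is cleaned, which is why the cleanup step is needed before republishing.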

Creating a dataset using dedicated connectors

You can create datasets based on data from remote services for which we have created dedicated connectors.

For the list of available connectors, see here. Some connectors are only available on demand and may be subject to pricing.
