Creating a dataset with media files

It is possible to create a dataset with images or other media files.

For example, this ability allows a regional tourism board to provide the public with sets of thematic or promotional pictures, each with its corresponding information and metadata.

To create the dataset, you first prepare a source containing the media files, and then add that source either as an archive file or via URLs.

File formats

The platform is able to work with the following file formats:

Images (.gif, .png, .jpeg, .jpg, .tiff, .bmp, .svg)
PDF files (.pdf)
GTFS files (.zip)

All formats considered images by the platform will be imported as such. A thumbnail will be generated for these formats, and the "Images" visualization activated. For more information, see Configuring the Images visualization.

For other formats, such as PDF and GTFS files, no thumbnail will be generated, and the "Images" visualization will not be available. Users will only be able to download these files.
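Before preparing a source, it can help to check which files will be treated as images and which will only be downloadable. The following is a minimal sketch based solely on the extensions listed above. The function name classify_media and the case-insensitive matching are assumptions made for illustration; the platform may decide differently, for example by inspecting the file content.

```python
from pathlib import PurePosixPath

# Extensions listed in this section as images (thumbnail + "Images" visualization).
IMAGE_EXTENSIONS = {".gif", ".png", ".jpeg", ".jpg", ".tiff", ".bmp", ".svg"}
# Extensions listed in this section as download-only (PDF and GTFS files).
OTHER_EXTENSIONS = {".pdf", ".zip"}


def classify_media(file_name: str) -> str:
    """Return 'image', 'download-only', or 'unsupported' for a file name.

    Hypothetical helper: matching is by extension only and ignores case,
    which is an assumption, not documented platform behavior.
    """
    suffix = PurePosixPath(file_name).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in OTHER_EXTENSIONS:
        return "download-only"
    return "unsupported"
```

Running this over the file list of a source gives a quick preview of which records should get a thumbnail.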

Sourcing multimedia files with an archive file

This method consists of building an archive file with the multimedia files, and then importing it into the platform.

For more information about which compressed file formats are supported, see Supported compressed file formats.

Step 1: Building the archive file

The archive file must contain:

The media files
A CSV file listing the media files and their metadata

No matter the format of the media files, they must be zipped together with the CSV file. For instance, if the media files are already zipped files, they must be zipped again together with the CSV file.

We recommend keeping all images at the same level in the archive file. However, if images are in subdirectories, make sure to write the full path of each file, relative to the root of the archive, in the CSV file.
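As an illustration, the archive can be built with a short script. This is a sketch under assumptions: the function name build_archive, the folder layout, and the choice of a ZIP archive are illustrative, and the script only relies on Python's standard zipfile module. It stores each file under its path relative to the source folder, which is the path that would need to appear in the CSV file when subdirectories are used.

```python
import zipfile
from pathlib import Path


def build_archive(source_dir: str, csv_name: str, archive_path: str) -> list[str]:
    """Zip every file under source_dir (media files and CSV) into archive_path.

    Returns the names as stored in the archive. Hypothetical helper for
    illustration; the names and layout are assumptions.
    """
    root = Path(source_dir)
    if not (root / csv_name).is_file():
        raise FileNotFoundError(f"{csv_name} not found in {source_dir}")

    target = Path(archive_path).resolve()
    stored = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in source_dir.
            if not path.is_file() or path.resolve() == target:
                continue
            # Store paths relative to the archive root, with forward slashes,
            # so they match the paths written in the CSV file.
            arcname = path.relative_to(root).as_posix()
            archive.write(path, arcname)
            stored.append(arcname)
    return stored
```

For example, build_archive("media", "media.csv", "media.zip") would zip the contents of a local media folder, including media files that are themselves ZIP files.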

Step 2: Creating the CSV file for the archive file

The CSV file must contain at least one column with the names of the media files. It can contain other columns, which will be treated as additional fields.

Example of a CSV file to create a dataset with media files:

Caption;Title;File
Caption of PNG file;Media 1;file_name.png
Caption of PDF file;Media 2;file_name.pdf
Caption of ZIP file;Media 3;file_name.zip
Caption of SVG file;Media 4;file_name.svg

Caption	Title	File
Caption of PNG file	Media 1	1-10.png
Caption of PDF file	Media 2	1-20.pdf
Caption of ZIP file	Media 3	2-10.zip
Caption of SVG file	Media 4	2-20.svg

In this example:

The column "File" indicates the names of the media files
The columns "Title" and "Caption" are additional fields
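The CSV file can also be generated programmatically. The sketch below uses Python's standard csv module to produce semicolon-separated text like the example above. The function name media_csv and the check that the "File" column is filled in are assumptions added for illustration; they are not requirements stated by the platform beyond the need for a column with the file names.

```python
import csv
import io


def media_csv(rows: list[dict[str, str]], file_column: str = "File") -> str:
    """Render rows as semicolon-separated CSV text.

    Hypothetical helper: raises ValueError if there are no rows, if the
    file column is missing, or if a row has no file name.
    """
    if not rows:
        raise ValueError("at least one row is required")
    fieldnames = list(rows[0].keys())
    if file_column not in fieldnames:
        raise ValueError(f"missing column: {file_column}")

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter=";", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        if not row.get(file_column):
            raise ValueError(f"row without a file name: {row}")
        writer.writerow(row)
    return buffer.getvalue()
```

The returned text can be written to a file placed next to the media files before the archive is built.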

Step 3: Sourcing the archive file

You can use the archive file you created as a source and upload it to the platform.

In Catalog > Datasets, click on the New dataset button.
In the wizard that opens, select From my computer under the "Retrieve a file" section.
Either drop the archive file or click Browse to locate the archive file in your local filesystem.
From the preview of the first 20 records that opens, configure the source.
Configure the dataset information or use the prefilled values.

Sourcing media files via URLs

This method consists of sourcing a file, in any of the Supported formats, that contains the URLs of media files stored on a remote server, and then using a processor to define the media files and extract their metadata.

For this method, Opendatasoft supports the HTTP protocol and its secure version, HTTPS. Each URL must point to a single file.
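The URLs can be checked before the file is sourced. The following sketch uses Python's standard urllib.parse module to keep only HTTP and HTTPS URLs that have a host and a non-empty path. The function name is_supported_media_url is hypothetical, and treating a non-empty path as a sign that the URL targets a single file is a rough heuristic assumed here, not a platform rule.

```python
from urllib.parse import urlparse


def is_supported_media_url(url: str) -> bool:
    """Return True for http(s) URLs that have a host and a non-empty path.

    Hypothetical helper: it does not contact the server, so it cannot
    confirm that the URL actually resolves to one downloadable file.
    """
    parsed = urlparse(url.strip())
    has_supported_scheme = parsed.scheme in {"http", "https"}
    has_host = bool(parsed.netloc)
    # Heuristic (assumption): a bare domain such as http://website.com/
    # does not point to a specific file.
    has_file_path = bool(parsed.path.strip("/"))
    return has_supported_scheme and has_host and has_file_path
```

Applying this check to every value of the column that holds the URLs flags entries such as FTP links or bare file names before they reach the File processor.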

Step 1: Create a dataset

In Catalog > Datasets, click on the New dataset button.
In the wizard that opens, select the desired method under the Retrieve a file section.
From the preview of the first 20 records that appears, configure the source.
Configure the dataset information or use the prefilled values.

Step 2: Process the URLs

Once the dataset is created, click on the Processing tab.
Click on the Add a processor button.
Choose the File processor in the Generic operations section.
In the File processor area, indicate which field contains the URLs of the media files.
(optional) In the File processor area, select the Extract metadata check box to import the related metadata of the images.

Example of a CSV file used to create a dataset with media files:

Caption;Title;File
Caption of PNG file;Media 1;http://website.com/file_name.png
Caption of PDF file;Media 2;http://website.fr/file_name.pdf
Caption of ZIP file;Media 3;http://another-website.com/file_name.zip
Caption of SVG file;Media 4;http://website.com/file_name.svg

Caption	Title	File
Caption of PNG file	Media 1	http://website.com/file_name.png
Caption of PDF file	Media 2	http://website.fr/file_name.pdf
Caption of ZIP file	Media 3	http://another-website.com/file_name.zip
Caption of SVG file	Media 4	http://website.com/file_name.svg

In this example:

The column "File" contains the URLs of the media files (this is the column to select in the File processor)
The columns "Title" and "Caption" are additional fields

Displaying images

Once the images are imported into the platform, they can be displayed in two different ways:

Through the default Images visualization tab: An image gallery displays all the images and their metadata.
Through a slideshow, which is an Opendatasoft widget you can add in any code area of the platform. For example, you can add it in the Custom view of the dataset or in a content page. In that case, images are displayed one by one.