Reference your datasets on data.gouv.fr
This article is for producers of French open data who publish on the Opendatasoft platform and also want their datasets referenced on https://www.data.gouv.fr to make them easier to discover.
The method defined jointly by the Etalab and Opendatasoft teams is based on the Opendatasoft DCAT export and the data.gouv.fr DCAT harvester.
The steps involved are as follows:
Fill in the metadata for each dataset
Select the datasets to publish and copy the corresponding DCAT export link
Set up the harvester in your organization's profile on data.gouv.fr
Prerequisites:
- An Opendatasoft portal configured to allow anonymous users
- An "Organisation" profile on data.gouv.fr
Step 1: Fill in the metadata
Metadata to fill in
In Opendatasoft, the metadata you need to fill in for your datasets are listed below. These metadata are also used to calculate the quality score on data.gouv.fr.
ODS metadata template | ODS metadata field | Matching data.gouv.fr metadata field | How to fill it out |
Standard | Description | Description | |
Standard | License | Licence | For a public administration, these licenses are recommended: Licence Ouverte version 2.0 or ODC Open Database License (ODbL) version 1.0. For more information, see here (in French). |
DCAT | Accrual periodicity | Fréquence | One of the following values: |
DCAT | Temporal coverage start date + Temporal coverage end date | Couverture temporelle | Fill out the start and end dates for the data. |
DCAT | Spatial | Couverture territoriale | |
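Before setting up the harvester, it can help to check which of these metadata are actually present in a DCAT export. The following is a minimal sketch, not an official tool. It assumes that the export is RDF/XML with `dcat:Dataset` elements, and that the metadata in the table map to the standard Dublin Core properties `dct:description`, `dct:license`, `dct:accrualPeriodicity`, `dct:temporal`, and `dct:spatial`. That mapping, the `missing_metadata` helper, and the sample document are illustrative assumptions; the real export of your portal may be structured differently.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}

# Assumed mapping between the metadata in the table and DCAT properties.
EXPECTED = {
    "Description": "dct:description",
    "License": "dct:license",
    "Accrual periodicity": "dct:accrualPeriodicity",
    "Temporal coverage": "dct:temporal",
    "Spatial": "dct:spatial",
}


def _is_filled(element):
    """A property counts as filled if it has text, attributes, or children."""
    return bool((element.text or "").strip() or element.attrib or len(element))


def missing_metadata(rdf_xml):
    """Return {dataset URI: [labels of metadata not found]} for an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    report = {}
    for dataset in root.iter("{%s}Dataset" % NS["dcat"]):
        uri = dataset.get("{%s}about" % NS["rdf"], "(no URI)")
        report[uri] = [
            label
            for label, prop in EXPECTED.items()
            # Search descendants too, since a license may sit on a distribution.
            if not any(_is_filled(e) for e in dataset.findall(".//" + prop, NS))
        ]
    return report


# Hypothetical sample export with one dataset and partial metadata.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Dataset rdf:about="https://example.org/dataset/a">
    <dct:description>Population by municipality</dct:description>
    <dct:license rdf:resource="https://example.org/licence"/>
    <dct:accrualPeriodicity>annual</dct:accrualPeriodicity>
  </dcat:Dataset>
</rdf:RDF>"""

report = missing_metadata(SAMPLE)
for uri, missing in report.items():
    print(uri, "is missing:", ", ".join(missing) or "nothing")
```

The check only tests for presence. It does not validate that a value is one data.gouv.fr accepts, so the quality score shown on data.gouv.fr remains the reference.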
Step 2: Create the DCAT export link
Option 1: Reference all of the datasets on your portal
Go to your portal's catalog (<the URL of your portal>/explore).
Find the "RDF/XML (DCAT)" export link at the bottom left of your screen.
Right-click the link and copy its address. This is the URL you will use when setting up the harvester on data.gouv.fr.
Option 2: Reference a subset of your data
This option is preferable if you don't want to reference all of your portal's datasets. For example, you may wish to exclude the SIRENE database or other data published by third-party organizations. This step can be quite technical, so don't hesitate to contact Opendatasoft support if you need assistance.
In this case, you'll need to create the export URL with:
The root <the URL of your portal>/api/v2/catalog/exports/dcat
Plus the parameters to filter the export, based on ODSQL syntax (Opendatasoft API query language)
The resulting link is what you will use when setting up the harvester on data.gouv.fr.
Examples:
Here's an example of a query to retrieve the DCAT export from the https://public.opendatasoft.com portal, excluding the "Opendatasoft" producer:
https://public.opendatasoft.com/api/v2/catalog/exports/dcat?where=publisher!="Opendatasoft"
Here's an example query to retrieve the DCAT export from the https://public.opendatasoft.com portal, excluding the "Opendatasoft" producer and filtering on the "Population" keyword:
https://public.opendatasoft.com/api/explore/v2.1/catalog/exports/dcat?where=publisher!="Opendatasoft" AND keyword="Population"&lang=en
To check that a filter returns the datasets you expect, you can first run the same query with a CSV export: https://public.opendatasoft.com/api/v2/catalog/exports/csv?where=publisher!="Opendatasoft"
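The ODSQL filters above contain quotes, spaces, and operators, which are easy to get wrong when typed by hand. The sketch below assembles the export URL from the root and the `where` and `lang` parameters shown in the examples, and percent-encodes the filter. The `build_dcat_export_url` helper is a hypothetical name introduced for illustration, not part of any Opendatasoft library.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_dcat_export_url(portal_url, where=None, lang=None):
    """Build a DCAT export URL for a portal, optionally filtered with ODSQL."""
    # Root taken from the article: <portal URL>/api/v2/catalog/exports/dcat
    root = portal_url.rstrip("/") + "/api/v2/catalog/exports/dcat"
    params = {}
    if where:
        params["where"] = where
    if lang:
        params["lang"] = lang
    if not params:
        return root
    # quote_via=quote encodes spaces as %20 instead of "+".
    return root + "?" + urlencode(params, quote_via=quote)


url = build_dcat_export_url(
    "https://public.opendatasoft.com",
    where='publisher!="Opendatasoft" AND keyword="Population"',
    lang="en",
)
print(url)

# Decoding the query string gives back the original ODSQL filter.
decoded = parse_qs(urlsplit(url).query)
print(decoded["where"][0])
```

Encoding the filter avoids relying on the browser or the harvester to repair spaces and quotes in the URL.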
Step 3: Create the harvester on data.gouv.fr
For this final step, follow the documentation at data.gouv.fr: https://guides.etalab.gouv.fr/data.gouv.fr/publier-jeu-de-donnees/#publier-un-catalogue-de-donnees-existant-par-moissonnage (in French).
Remember to select the DCAT harvesting option.
Once done, what will be available on data.gouv.fr?
The metadata for the datasets you've referenced.
Export files for the datasets you've referenced. For all datasets, CSV, XLSX, and JSON exports will be available as resources on data.gouv.fr. If the dataset contains geographic shapes, the GeoJSON export will also be available.
Other resources. If you wish to add other files to be available as resources on data.gouv.fr, you can add them to your dataset's alternative exports (as a file, or via URL).