Federating an Opendatasoft dataset
Federating a dataset published with Opendatasoft is a core feature of the data community we're helping to build. It's an efficient way to enrich your own data, and it gives others a way to discover and reuse your open data.
What is a federated dataset?
You can think of a federated dataset as a hologram of, or a window onto, another, distant dataset. Even though the dataset is stored on and maintained by a different workspace, you can see and consult the data locally. In particular, you can use a federated dataset to enrich one of your own datasets!
Federated data has two key qualities:
Federated data updates automatically in your portal. Because federated data is "merely" a view onto the provider's dataset, if the provider updates their dataset, those changes are immediately reflected in your federated data.
Federated data does not count against your data volume quota. Again, because federated data is not literally copied into your portal, it cannot count against your quota.
Just remember that, for the same reasons, you can't manipulate the data in the same way you would local data. However, you can still filter the data to use only the parts you're interested in, as well as specify your own local metadata and visualizations for that data.
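To make the "view, not a copy" idea concrete, here is a minimal conceptual sketch in Python. It is not how Opendatasoft implements federation. The FederatedView class, its field names, and the sample records are all hypothetical and exist only to illustrate the two qualities above together with local filtering and local metadata.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Record = dict[str, Any]


def keep_all(record: Record) -> bool:
    """Default filter: keep every record."""
    return True


@dataclass
class FederatedView:
    """Hypothetical toy model of a federated dataset (illustration only)."""

    source: list[Record]  # reference to the provider's records, never copied
    keep: Callable[[Record], bool] = keep_all  # local filter on the source
    local_metadata: dict[str, str] = field(default_factory=dict)

    def records(self) -> list[Record]:
        # Evaluated on every call, so provider-side changes are visible here.
        return [r for r in self.source if self.keep(r)]


# Made-up sample data standing in for the provider's dataset.
provider_records = [
    {"station": "A", "bikes": 12},
    {"station": "B", "bikes": 0},
]

view = FederatedView(
    source=provider_records,
    keep=lambda r: r["bikes"] > 0,
    local_metadata={"title": "Stations with bikes available"},
)

print(len(view.records()))

# The provider updates their dataset; the view reflects it without any copy.
provider_records.append({"station": "C", "bikes": 5})
print(len(view.records()))
```

In this sketch the view holds no records of its own, which is why nothing would count against a quota and why there is nothing local to transform. Only the filter and the metadata belong to the consumer.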
How to federate a dataset
In Catalog > Datasets, click on the New dataset button.
In the interface that opens, select the desired option under "Retrieve a dataset":
Use Opendatasoft network to get a list of all public datasets published on all Opendatasoft domains.
Use My catalog to get a list of all datasets published on any of your current workspaces.
Select the dataset you need, using filters and search terms to help you find it.
Once you have selected the dataset, a preview of its first 20 records opens, in which you can filter the data from the source dataset. Then click Continue.
Configure the dataset information or use the prefilled values:
In the Dataset name field, enter the title for this dataset.
In the Dataset technical identifier field, enter a meaningful identifier for this dataset.
If you want anyone with access to your domain to be able to explore the dataset, toggle off Access restricted to allowed users and groups.
Remember, because the data is not actually duplicated, it cannot be transformed the way a local dataset can. As such, federated datasets do not have the Processing tab.
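Once the federated dataset is published, you may want to check its records programmatically. The sketch below only builds a request URL. It assumes that your portal exposes the Explore API v2.1 records endpoint and that a federated dataset can be queried there like a local one, neither of which this article states. The domain, dataset identifier, and filter expression are hypothetical placeholders.

```python
from typing import Optional
from urllib.parse import urlencode


def records_url(
    domain: str,
    dataset_id: str,
    where: Optional[str] = None,
    limit: int = 20,
) -> str:
    """Build a records URL (assumed Explore API v2.1 path) for a dataset."""
    params: dict[str, object] = {"limit": limit}
    if where:
        # Filter expression, comparable to the filter set in the preview step.
        params["where"] = where
    query = urlencode(params)
    return (
        f"https://{domain}/api/explore/v2.1/catalog/datasets/"
        f"{dataset_id}/records?{query}"
    )


# Hypothetical portal domain and dataset technical identifier.
url = records_url("portal.example.org", "my-federated-dataset", where="year >= 2020")
print(url)
```

The default limit of 20 mirrors the size of the preview shown during federation. Fetching the URL is left out so that the example runs without network access.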
Frequently asked questions about federations
Can I override the existing metadata and visualizations?
Yes. When retrieving an Opendatasoft dataset, you can override the metadata and visualization configuration. These new values are specific to your new, federated dataset, and do not modify the original.
If you want to override the title, click Override next to the title and enter the desired title.
If you want to override other metadata, go to the Information tab of the edition interface, click Override and enter the desired value.
If you want to override visualizations, go to the Visualizations tab of the edition interface, click Override and edit the desired visualization configuration.
After editing, you can always restore the original value from the source dataset by clicking Return to original value.
If you have not overridden these values, the metadata of federated datasets will be updated once a day. Other modifications on the original dataset, such as visualization configurations or the dataset schema, will not trigger an automatic update. If you wish to include them in your federated dataset (and haven't overridden them), you must unpublish and republish the federated dataset for the latest modifications to be visible.
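The interplay between overrides, the daily metadata update, and Return to original value can be summarized with a small conceptual model. This is an illustrative sketch, not Opendatasoft's implementation. The FederatedMetadata class, its method names, and the sample values are hypothetical.

```python
class FederatedMetadata:
    """Hypothetical model: local overrides layered on top of source metadata."""

    def __init__(self, source_values: dict[str, str]) -> None:
        self._source = dict(source_values)      # values from the source dataset
        self._overrides: dict[str, str] = {}    # values set locally

    def override(self, key: str, value: str) -> None:
        self._overrides[key] = value

    def return_to_original(self, key: str) -> None:
        # Dropping the override makes the source value visible again.
        self._overrides.pop(key, None)

    def refresh_from_source(self, source_values: dict[str, str]) -> None:
        # Stands in for the once-a-day metadata update; overrides are untouched.
        self._source = dict(source_values)

    def effective(self, key: str) -> str:
        if key in self._overrides:
            return self._overrides[key]
        return self._source[key]


meta = FederatedMetadata({"title": "Original title", "description": "Original description"})
meta.override("title", "My local title")

# The provider edits their metadata; the daily update picks it up.
meta.refresh_from_source({"title": "Renamed upstream", "description": "Updated description"})
print(meta.effective("title"))        # the override still wins
print(meta.effective("description"))  # not overridden, so it follows the source

meta.return_to_original("title")
print(meta.effective("title"))        # back to the current source value
```

The model covers metadata only. As noted above, changes to visualization configurations or to the schema of the original dataset are not picked up automatically and require unpublishing and republishing the federated dataset.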
Can I add sources or processors to a federated dataset?
No. As explained above, though you can filter the original data or override some metadata, other actions such as adding sources or processors are not available for federated datasets. These actions are either disabled in the back office, as in the grayed-out Add a new source button shown below, or simply do not appear, as with the Processing tab.
Once a federation has been created, can a different user modify it?
When a federated dataset is created, the permissions used to access the remote dataset are tied to the user who initially configured the federation. But later, a different user may wish to modify the filters, change the remote dataset, or simply recover the federation from a user no longer in your organization.
To do this, the subsequent user modifies the federation's permissions settings and assigns their own permissions to the federation in place of those of the federation's creator.
First, go to that dataset's Source tab. Click on the source, or on View source in the three-dot menu that appears when you hover over the source. The following view is then displayed, where you can click on the Permissions button in the top-right corner.
This displays the following window, where you click Edit permissions to modify whose permissions are used for the configuration. In our example, if the permissions were previously those of Robert Martin, you can choose to either use your own permissions or to not use any permissions.
Note, however, that depending on the remote dataset's security settings, it may not be accessible using your permissions or no permissions. In that case, you will see the error message below.
To make the modification, you need to request further permissions from the admin in charge of the catalog that contains the remote dataset. Naturally, this is your own admin if the dataset is in your own catalog. For datasets found elsewhere in the Opendatasoft network, you must identify the source catalog and contact its admin.