An introduction to the Explore API

If you're already familiar with APIs or the Explore API, you can go directly to the Explore API v2 documentation. Otherwise, read on for more details.

An API, or Application Programming Interface, is a tool designed to allow different software systems to communicate with each other. If you want to use data stored somewhere online, the point of an API is to allow you to interact with that data in a way the source will understand. And if you want to share your data with others, an API allows you to define what kind of access they have.

Opendatasoft provides several different APIs to interact with the platform, but the main API used to access a given workspace's public data is our Explore API.

What the Explore API allows you to do

Opendatasoft's Explore API gives you access to public data on the Opendatasoft platform. As such, the Explore API allows you to perform three kinds of actions:

  • Explore: Ask for records and fields that you would like to see. The data is given to you in a JSON object.

  • Export: Export the entire dataset according to specified conditions. To specify the conditions, you use ODSQL, our own query language, which is very similar to SQL.

  • Analyze: Combine data within a dataset and perform simple analysis on it.
    For example, you might query a dataset that contains schools, the number of students at those schools, and the region where each school is located, and combine that information by asking for the total number of students per region.

Interacting with the API

So, how do you actually go about using the API? The answer is at once simple and complicated.

As we saw above, using the API consists of sending it requests and receiving responses. In API-speak, you make a "call" or "request": your call is sent to the Opendatasoft server, which, in the case of Opendatasoft, answers with a JSON object.

It's easy to see this in action in your browser. The API can be asked to give you a list of all of the public datasets in a workspace. https://data.opendatasoft.com/pages/home/ is the URL for Opendatasoft's Data Hub, a source of over 30,000 public datasets published by public and private sector organizations. Its domain is therefore "data.opendatasoft.com."

Open a browser window, and paste this URL into it: http://data.opendatasoft.com/api/v2/catalog/datasets. What you see is a JSON object containing a kind of list of the datasets, with links to further JSON objects. And remember that you can replace "data.opendatasoft.com" with any Opendatasoft domain, and you'll see the data for that domain. That's the API in action!

Of course, on its own, this isn't very interesting. But when other tools are used, the API can be a powerful way of interacting with the data. Read on for more details.

In practice, you'll want dedicated tools to make working with the API easier. For example, API calls can be made using platforms such as Postman that are built to interact with APIs. If you're a developer, you can use curl or Python's Requests library.
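As a rough illustration of the browser experiment described above, here is a minimal Python sketch that builds the same catalog URL and, optionally, fetches it. It uses only the standard library rather than Requests so that it runs without installing anything. The function names `catalog_url` and `fetch_json` are made up for this example, and the use of `https` is an assumption.

```python
import json
import urllib.request


def catalog_url(domain: str) -> str:
    """Return the URL listing the public datasets of an Opendatasoft domain."""
    # Assumption: the domain is reachable over https.
    return f"https://{domain}/api/v2/catalog/datasets"


def fetch_json(url: str, timeout: float = 10.0):
    """Send a GET request and decode the JSON body of the response."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


url = catalog_url("data.opendatasoft.com")
print(url)  # https://data.opendatasoft.com/api/v2/catalog/datasets

# To actually send the request (requires network access), uncomment:
# catalog = fetch_json(url)
# print(type(catalog))
```

Swapping "data.opendatasoft.com" for another Opendatasoft domain should give the catalog URL for that domain.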

Exploring the data

As we saw above, exploring the data is one of three kinds of actions you can take with the Explore API. By exploring the data, we mean that you can request records from a public dataset in order to process and use them on your end.

There are methods you can use—and limits to be aware of—when specifying what data you want to be given:

  • Select: You can select a specific range of columns

  • Where: You can filter the data according to a condition

  • Order by: You can use this to sort according to a designated column

  • Limit/offset: You can limit what records are returned, or else jump directly to a specific record

  • Group by: You can group data according to certain field values or functions applied to these fields.
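To make the list above more concrete, here is a sketch of how these clauses might be combined into a single request URL. The portal, dataset, and field names are borrowed from the query translation examples later in this article; the helper `records_url` is hypothetical and only assembles the URL, it does not send anything.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Portal and API path taken from the examples in this article.
BASE = "https://documentation-resources.opendatasoft.com/api/explore/v2.1"


def records_url(dataset_id, *, select=None, where=None, order_by=None,
                group_by=None, limit=None, offset=None):
    """Build a /records URL from optional ODSQL clauses (hypothetical helper)."""
    clauses = {
        "select": select,
        "where": where,
        "order_by": order_by,
        "group_by": group_by,
        "limit": limit,
        "offset": offset,
    }
    # Keep only the clauses that were actually provided.
    params = {name: value for name, value in clauses.items() if value is not None}
    path = f"{BASE}/catalog/datasets/{quote(dataset_id, safe='')}/records"
    return f"{path}?{urlencode(params)}" if params else path


url = records_url(
    "les-arbres-remarquables-de-paris",
    select="libellefrancais, hauteur_en_m",
    where="libellefrancais='Platane'",
    order_by="hauteur_en_m desc",
    limit=10,
)
print(url)
# Decode the query string again to check what the server would receive.
print(parse_qs(urlsplit(url).query))
```

Letting `urlencode` handle the escaping avoids hand-editing quotes and spaces in the ODSQL expressions.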

Note that you can only explore part of the data at a time, because API calls are limited to 100 records per call. Using limit/offset allows you to manage which part of the data you're examining.
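The following sketch shows one way to plan a series of calls that walks through a result set 100 records at a time. It only computes the limit/offset pairs and assumes you already know how many records you want in total; the name `page_params` is invented for this example.

```python
from typing import Dict, Iterator

MAX_PER_CALL = 100  # per-call limit stated in this article


def page_params(total_records: int, page_size: int = MAX_PER_CALL) -> Iterator[Dict[str, int]]:
    """Yield the limit/offset pair for each call needed to cover total_records."""
    if not 1 <= page_size <= MAX_PER_CALL:
        raise ValueError(f"page_size must be between 1 and {MAX_PER_CALL}")
    for offset in range(0, total_records, page_size):
        # The last page may be shorter than the others.
        yield {"limit": min(page_size, total_records - offset), "offset": offset}


pages = list(page_params(250))
print(pages)
# [{'limit': 100, 'offset': 0}, {'limit': 100, 'offset': 100}, {'limit': 50, 'offset': 200}]
```

Each dictionary can be merged into the query parameters of one call.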

Exporting the data

API calls are limited to 100 records per call. But if you need to handle all of the data at once, you are able to export the entirety of the data.

The same methods listed under "Exploring the data" can be used to tailor your requests.
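As a sketch, an export request might be assembled as shown below. It assumes the export format is appended to the "/exports" path as a final segment (for example "/exports/csv"); check the Explore API documentation for the formats your portal offers. The helpers `export_url` and `download` are hypothetical, and `download` is defined but not called here because it needs network access.

```python
import shutil
import urllib.request
from urllib.parse import parse_qs, urlencode, urlsplit


def export_url(domain, dataset_id, fmt="csv", **odsql):
    """Build an export URL; keyword arguments are ODSQL clauses such as where=..."""
    # Assumption: the format is the last path segment after /exports.
    path = (f"https://{domain}/api/explore/v2.1/catalog/datasets/"
            f"{dataset_id}/exports/{fmt}")
    params = {name: value for name, value in odsql.items() if value is not None}
    return f"{path}?{urlencode(params)}" if params else path


def download(url, destination, timeout=60.0):
    """Stream the export to a local file without loading it all into memory."""
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(destination, "wb") as output:
        shutil.copyfileobj(response, output)


url = export_url(
    "documentation-resources.opendatasoft.com",
    "les-arbres-remarquables-de-paris",
    "csv",
    where="circonference_en_cm>300",
)
print(url)
```

Streaming to disk is a deliberate choice here, since a full export can be much larger than a 100-record page.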

Analyzing the data

After exploring or exporting the data, you may wish to perform some basic analysis on it. This is called aggregating, and you can use different functions to combine, or aggregate, the data in productive ways:

  • avg (average)

  • count

  • count distinct

  • envelope

  • bbox

  • max (maximum)

  • median

  • min (minimum)

  • percentile

  • sum

These functions are applied to "groups" that can be defined by the method "Group by," described above.
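For instance, the schools example from the start of this article (total number of students per region) could be expressed with a select clause and a group_by clause. The sketch below only builds the query parameters; the field names "students" and "region" and the helper `aggregation_params` are hypothetical, and only the single-argument functions from the list above are handled.

```python
from urllib.parse import urlencode

# Subset of the aggregation functions listed above that take one field.
SIMPLE_FUNCTIONS = {"avg", "count", "max", "median", "min", "sum"}


def aggregation_params(function, field, group_field, alias):
    """Return select/group_by parameters for one aggregate per group."""
    if function not in SIMPLE_FUNCTIONS:
        raise ValueError(f"unsupported function: {function}")
    return {
        "select": f"{function}({field}) as {alias}",
        "group_by": group_field,
    }


params = aggregation_params("sum", "students", "region", "total_students")
print(params)
# {'select': 'sum(students) as total_students', 'group_by': 'region'}

# The encoded form can be appended to a /records URL after a "?".
print(urlencode(params))
```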

Aggregation comes in handy if, for example, your dataset is a list of expenses and their dates, and you want to obtain the total expenses for each month. In this case, you might group the expenses by month, and perform a sum of the expenses column. In this way, even with a small amount of analysis you can begin to understand and make real use of the raw data. And, depending on how you use the API, you are able to do it in a standardized, automated, or scalable way.
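To show what "group by month, then sum" computes, here is a small local Python equivalent. It does not call the API; the expense rows are invented purely for illustration, and `total_per_month` is a made-up name.

```python
from collections import defaultdict
from datetime import date

# Invented sample data: (date of expense, amount).
expenses = [
    (date(2024, 1, 5), 120.0),
    (date(2024, 1, 20), 80.0),
    (date(2024, 2, 3), 45.5),
    (date(2024, 2, 17), 54.5),
    (date(2024, 3, 1), 200.0),
]


def total_per_month(rows):
    """Group rows by year-month and sum the amounts in each group."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[day.strftime("%Y-%m")] += amount
    return dict(totals)


print(total_per_month(expenses))
# {'2024-01': 200.0, '2024-02': 100.0, '2024-03': 200.0}
```

With the API, the grouping and summing happen on the server, so only one row per group has to be transferred.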

Why upgrade to version 2?

Differences and advantages of v2:

Use paradigm

  • V1: 3 main endpoints for catalog and dataset: "/search" to retrieve data, "/analyse" to use aggregation functions, and "/download" to export data.

  • V2: 2 main endpoints for catalog and dataset: "/records" to retrieve data or perform analyses on a sample of the dataset (10,000 records max), and "/exports" to export the full dataset in the various available formats. Both endpoints use our ODSQL language, which provides, among other things, aggregation functions.

Exports

  • V1: Not all export formats are available. Group_by is not supported.

  • V2: All export formats are available. Group_by is supported.

Internal and external uses

  • V1: Only used by the old ODS tools.

  • V2: Used by the Studio and other external services (WFS, CSW, AUTOMATION, ...).

URL encoding

  • V1: Some special characters need to be escaped, for instance '#'.

  • V2: No need to escape special characters.

Parameter mapping (each V1 parameter is followed by its V2 equivalent):

  • q, sort: use select, where, order_by, and group_by from our ODSQL language instead.

  • dataset: the dataset is now part of the endpoint; it is no longer a parameter.

  • rows and start: become limit and offset (same meaning).

  • refine.<facet_name>=<facet_value> and exclude.<facet_name>=<facet_value>: become refine=<facet_name>:<facet_value> and exclude=<facet_name>:<facet_value>.

  • lang, timezone: no changes.
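The mapping above is mechanical enough to sketch in code. The function below is a hypothetical helper, not part of any Opendatasoft library: it renames the parameters that map one-to-one, moves the dataset out of the parameters, and reports anything (such as q or sort) that has to be rewritten by hand in ODSQL.

```python
from urllib.parse import urlencode


def translate_v1_params(v1):
    """Return (dataset_id, v2_params, keys_needing_manual_rewrite)."""
    renamed = {"rows": "limit", "start": "offset"}
    unchanged = {"lang", "timezone"}
    dataset_id = None
    v2 = {}
    manual = []
    for key, value in v1.items():
        if key == "dataset":
            dataset_id = value  # becomes part of the endpoint path
        elif key in renamed:
            v2[renamed[key]] = value
        elif key in unchanged:
            v2[key] = value
        elif key.startswith(("refine.", "exclude.")):
            kind, facet = key.split(".", 1)
            # refine.<facet>=<value> becomes refine=<facet>:<value>
            v2.setdefault(kind, []).append(f"{facet}:{value}")
        else:
            manual.append(key)  # e.g. q and sort need an ODSQL rewrite
    return dataset_id, v2, manual


v1_query = {
    "dataset": "les-arbres-remarquables-de-paris",
    "rows": "100",
    "refine.dateplantation": "1700",
    "q": "circonference_en_cm>300",
}
dataset_id, v2_params, manual = translate_v1_params(v1_query)
print(dataset_id, v2_params, manual)
# les-arbres-remarquables-de-paris {'limit': '100', 'refine': ['dateplantation:1700']} ['q']

# doseq=True expands the refine list into repeated refine=... parameters.
print(urlencode(v2_params, doseq=True))
```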

Query translation:

Example with this portal: https://documentation-resources.opendatasoft.com

V1 paths below are relative to /api/records/1.0, and V2 paths are relative to /api/explore/v2.1.

  • V1: /search?dataset=les-arbres-remarquables-de-paris&q=#search(libellefrancais,'Platane')&sort=hauteur_en_m
    V2: /catalog/datasets/les-arbres-remarquables-de-paris/records?where=libellefrancais='Platane'&order_by=hauteur_en_m desc

  • V1: /search?dataset=les-arbres-remarquables-de-paris&rows=100&refine.dateplantation=1700&q=circonference_en_cm>300
    V2: /catalog/datasets/les-arbres-remarquables-de-paris/records?limit=100&refine=dateplantation:1700&where=circonference_en_cm>300

  • V1: /search/?dataset=arbresremarquablesparis2011&geofilter.bbox=48.811385499847546,2.1673965454101562,48.901740646573025,2.5306320190429688&fields=geom_x_y
    V2: /catalog/datasets/arbresremarquablesparis2011/records?where=in_bbox(geom_x_y,48.811385499847546,2.1673965454101562,48.901740646573025,2.5306320190429688)

Where to now?

Hopefully by now you better understand why an API can be useful, and have a sense of what Opendatasoft allows you to do with the Explore API.

No doubt using an API isn't for everyone. But with a little work, the API can allow you to better understand—and efficiently use—your own data or other public data.

To answer your questions and help you on your way, we invite you to dive into our documentation for the Explore API.