4 Data import

Arabesque also allows you to import your own datasets before building a readable flowmap: (1) flow data in the form of an origin-destination (OD) matrix (adjacency list in long format, as a .csv file), which can be aggregated if necessary, and (2) a nodes dataset.

For this tutorial, we will use as an example the historical trade flows listed in the RICardo database.

For more information about the dataset and its use with Arabesque, see References.

Arabesque accepts input files in the following formats: .CSV, .JSON and .geoJSON.

4.1 Links/flow dataset import

Arabesque requires at least one origin-destination (links/edges/flow) dataset to be loaded. It must be a matrix in long format, stored as a comma-separated .CSV file.

4.1.1 Origin, Destination and unique flow matrix

You must also declare the three fields required for flow mapping: the origin locations, the destination locations and the flow values.

If the OD matrix is temporal or available for different categories, you must also choose an aggregation method.

On the Arabesque homepage, load at least one flow dataset.

  • Click on the browse button

Application

Statistical dataset

Loading data SAGEO_RICardo_edges_small.csv.

The data must be in long format, with at least 3 columns: origin, destination, flow.

NOTE: Remember that the data must be in long format, with at least 3 columns identifying the origin, the destination and the volume of each flow.
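If your flows are stored as a wide (square) matrix rather than in long format, they need to be reshaped before loading. The sketch below is one possible way to do this with the Python standard library; it is not part of Arabesque. The sample matrix, the column names `origin`, `destination`, `flow` and the helper names `wide_to_long` and `to_csv` are assumptions chosen for illustration.

```python
import csv
import io

# Hypothetical wide OD matrix: the first column holds the origin,
# the other column headers are destinations. Values are made up.
WIDE_CSV = """origin,A,B,C
A,0,5,2
B,3,0,0
C,1,4,0
"""


def wide_to_long(wide_text, drop_zero=True):
    """Turn a wide OD matrix into (origin, destination, flow) rows."""
    reader = csv.reader(io.StringIO(wide_text))
    header = next(reader)
    destinations = header[1:]
    rows = []
    for record in reader:
        if not record:  # skip blank lines
            continue
        origin, values = record[0], record[1:]
        for destination, value in zip(destinations, values):
            flow = float(value)
            if drop_zero and flow == 0:
                continue  # empty cells of the matrix carry no flow
            rows.append((origin, destination, flow))
    return rows


def to_csv(rows):
    """Serialize long-format rows as comma-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["origin", "destination", "flow"])
    writer.writerows(rows)
    return buffer.getvalue()


long_rows = wide_to_long(WIDE_CSV)
long_csv = to_csv(long_rows)
print(long_csv)
```

Each non-zero cell of the matrix becomes one line of the long file, which matches the three-column layout described above.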

4.1.2 Origin, Destination and multiscalar matrix

If the flow data are multiscalar (e.g. flows that concern several social groups, several goods transported or that occur on several dates), aggregation procedures are offered when importing the dataset into Arabesque.

By default, the sum function is applied when nothing else is specified. However, the user can choose to apply an average, minimum, maximum or median function calculated on all the matrices or graphs provided.

It is also possible to choose a single date or to aggregate the data, according to a given function, over a period or for categories.

This aggregation function is important because it defines the default flowmap that Arabesque proposes on entry: the percentage of links, nodes and interactions depicted, the intensity of the colors and the opacity of the corresponding symbols (see the Data processing chapter).

Note: This aggregation does not interfere with the geo-visualization possibilities that will remain available for all existing types.
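To make the effect of the aggregation choice concrete, here is a minimal sketch of how several flows sharing the same origin and destination can be reduced to a single value. It only illustrates the idea and is not Arabesque's own implementation. The records, the `year` field and the names `AGGREGATORS` and `aggregate_flows` are assumptions made for this example.

```python
import statistics
from collections import defaultdict

# Hypothetical long-format records: (origin, destination, year, flow).
records = [
    ("France", "Italy", 1900, 10.0),
    ("France", "Italy", 1901, 30.0),
    ("France", "Italy", 1902, 20.0),
    ("Italy", "France", 1900, 4.0),
    ("Italy", "France", 1902, 8.0),
]

# The five functions named in the text, mapped to standard-library callables.
AGGREGATORS = {
    "sum": sum,
    "mean": statistics.mean,
    "min": min,
    "max": max,
    "median": statistics.median,
}


def aggregate_flows(rows, method="sum", years=None):
    """Reduce all flows of each (origin, destination) pair to one value.

    `years` optionally restricts the rows to a single date or a period.
    """
    grouped = defaultdict(list)
    for origin, destination, year, flow in rows:
        if years is not None and year not in years:
            continue
        grouped[(origin, destination)].append(flow)
    reducer = AGGREGATORS[method]
    return {pair: reducer(values) for pair, values in grouped.items()}


by_sum = aggregate_flows(records)                      # default: sum
by_median = aggregate_flows(records, method="median")
only_1900 = aggregate_flows(records, years={1900})     # a single date
print(by_sum, by_median, only_1900)
```

The same input yields different link values depending on the function, which is why the choice affects the default flowmap.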

4.2 Nodes/vertex dataset import

If you have locational data associated with your ODs, you can load the corresponding node files with “Import Location”, otherwise you can use predefined locations with “Preset Location”.

If you select “Import Location”, you must load a .GEOJSON or .CSV file, then choose the ID of the nodes and their lat/long geographic coordinates.

EXAMPLE: Application on RICardo dataset.

Loading SAGEO_RICardo_nodes.csv data

The data must be in long format, with at least 3 columns identifying the place, the latitude (Y) and the longitude (X).
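Before loading a nodes file, it can help to check that every row has an identifier and usable coordinates. The following sketch shows one way to do that outside Arabesque. The column names `id`, `lat` and `long`, the sample rows and the function `read_nodes` are hypothetical; adapt them to the headers of your own file.

```python
import csv
import io

# Hypothetical nodes file with made-up identifiers and coordinates.
NODES_CSV = """id,lat,long
N1,10.0,20.0
N2,-5.5,100.25
N3,95.0,10.0
N4,abc,3.0
"""


def read_nodes(text, id_col="id", lat_col="lat", lon_col="long"):
    """Return ({id: (lat, long)}, [rejected ids]) for a nodes CSV."""
    valid, rejected = {}, []
    for row in csv.DictReader(io.StringIO(text)):
        node_id = row.get(id_col)
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            rejected.append(node_id)  # missing or non-numeric coordinate
            continue
        # Latitude (Y) lies in [-90, 90], longitude (X) in [-180, 180].
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            valid[node_id] = (lat, lon)
        else:
            rejected.append(node_id)
    return valid, rejected


valid_nodes, rejected_nodes = read_nodes(NODES_CSV)
print(valid_nodes)
print(rejected_nodes)
```

Rows listed as rejected are worth correcting in the source file, since a node without valid coordinates cannot be placed on the map.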

If you do not have a geometry file, you can use standard reference codes (e.g. INSEE codes for French communes, ISO codes for countries) to geolocate your nodes automatically. See Preset.

4.3 Preset nodes dataset

Example of pre-selection of French municipalities.

After loading the link and node files, Arabesque automatically joins the two files on their common attributes.
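Because the join relies on values shared by both files, a quick pre-check of the identifiers can save a failed import. The sketch below lists the places that appear in the links but have no matching node. It assumes that the origin and destination values of the links are meant to match the node IDs; the sample data and the function `find_unmatched_places` are illustrative and do not describe how Arabesque performs the join internally.

```python
# Hypothetical links as (origin, destination, flow) and hypothetical node IDs.
links = [
    ("A", "B", 5.0),
    ("B", "A", 3.0),
    ("A", "C", 2.0),
]
node_ids = {"A", "B"}


def find_unmatched_places(link_rows, known_ids):
    """Return the sorted origins/destinations absent from the node IDs."""
    missing = set()
    for origin, destination, _flow in link_rows:
        for place in (origin, destination):
            if place not in known_ids:
                missing.add(place)
    return sorted(missing)


unmatched = find_unmatched_places(links, node_ids)
print(unmatched)
```

An empty result means every link endpoint has a corresponding node; otherwise the listed codes need to be added to the nodes file or corrected in the links file.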

4.5 Import a flowmap project

Import a previously made flowmap by loading a project file in .zip format.