Creating a dataset and uploading the data

Let's have a look at the available data before we define the corresponding dataset object.

Customers table

The data we will visualize in this tutorial represents our customers. The anonymized table contains each customer's internal ID, city, address, sex, age group and, most importantly, the latitude and longitude of the address. The table also contains the code of the neighborhood to which the address belongs, which we'll use later.

The CSV file can be downloaded here: customers.csv

Name               Title                 Data type
customer_id        Customer ID           integer
neighborhood_code  Neighborhood code     string
city               Customer's city       string
address            Customer's address    string
sex                Customer's sex        string
age_group          Customer's age group  string
lat                Address latitude      latitude
lng                Address longitude     longitude

Download the CSV file and put it in the /data folder of your dump.
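
Before defining the dataset, it can help to confirm that the file matches the table above. The following is a minimal sketch, not part of the tool: the helper name check_customers is hypothetical, and it assumes a comma-separated file with a header row that uses the column names from the table. It checks that all columns are present, that customer_id values are unique, and that lat and lng are valid coordinates. The sample rows are invented for illustration.

```python
import csv
import io

# Column names taken from the customers table.
EXPECTED_COLUMNS = [
    "customer_id", "neighborhood_code", "city", "address",
    "sex", "age_group", "lat", "lng",
]


def check_customers(csv_file):
    """Return a list of problems found in an open customers CSV file.

    Assumes a comma-separated file whose first line is a header row.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    seen_ids = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        customer_id = row["customer_id"]
        if customer_id in seen_ids:
            problems.append(f"line {line_no}: duplicate customer_id {customer_id}")
        seen_ids.add(customer_id)
        try:
            lat, lng = float(row["lat"]), float(row["lng"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: lat/lng is not a number")
            continue
        if not (-90 <= lat <= 90 and -180 <= lng <= 180):
            problems.append(f"line {line_no}: lat/lng out of range")
    return problems


# Invented sample rows, used only to demonstrate the check.
sample = io.StringIO(
    "customer_id,neighborhood_code,city,address,sex,age_group,lat,lng\n"
    "1,N01,Sampletown,1 Example St,f,25-34,49.19,16.61\n"
    "2,N02,Sampletown,2 Example St,m,35-44,49.20,16.60\n"
)
print(check_customers(sample))
```

To check the real file, pass an open file object instead of the sample, for example open("data/customers.csv", newline="", encoding="utf-8"), adjusting the path to your dump directory.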

Creating a dataset

Now, we will create the corresponding dataset. The dataset object differs from other metadata objects in two ways:

  • it contains properties with featureTitle and featureSubtitle settings
  • it has a ref object instead of content

The properties.featureTitle and properties.featureSubtitle properties specify the content of the tooltip shown when hovering over the dataset's features in the map. In this case, it will be the customer_id and the address.

Now, to the ref object. The type of the dataset is dwh, and the subtype is geometryPoint, because the table represents customers' addresses (points) that have a latitude and longitude. The table's primaryKey is the customer_id property. In the visualizations object, we say that we want to visualize it as a dotmap and a heatmap (the only two available for geometryPoint). The dataset is not categorizable by default, and none of its properties are filterable, which means they will not appear in filters (more about filters later). Its data is also excluded from full-text search, which is controlled by the fullTextIndex property.

The zoom object at the end can be used to modify the zoom levels for the dotmap visualization. This can be handy when there are many dots, which could cause performance problems.
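
The settings described above can be collected into a Python dictionary and serialized to JSON. This is only a sketch under stated assumptions: the key names featureTitle, featureSubtitle, ref, type, subtype, primaryKey, visualizations, categorizable, filterable and fullTextIndex come from the text, but their nesting and value shapes, the top-level name and type keys, and the helper name build_customers_dataset are assumptions that should be checked against the dataset reference. The zoom object is left out because its keys are not given here.

```python
import json

# Property names, titles and data types copied from the customers table.
COLUMNS = [
    ("customer_id", "Customer ID", "integer"),
    ("neighborhood_code", "Neighborhood code", "string"),
    ("city", "Customer's city", "string"),
    ("address", "Customer's address", "string"),
    ("sex", "Customer's sex", "string"),
    ("age_group", "Customer's age group", "string"),
    ("lat", "Address latitude", "latitude"),
    ("lng", "Address longitude", "longitude"),
]


def build_customers_dataset():
    """Assemble a dataset definition from the settings described in the text.

    The structure is an assumption, not the confirmed schema.
    """
    return {
        "name": "customers",  # assumed top-level key
        "type": "dataset",    # assumed top-level key
        "properties": {
            # Tooltip content; the value shape is assumed.
            "featureTitle": {"type": "property", "value": "customer_id"},
            "featureSubtitle": {"type": "property", "value": "address"},
        },
        "ref": {
            "type": "dwh",
            "subtype": "geometryPoint",
            "primaryKey": "customer_id",
            # The two visualizations named in the text; list shape is assumed.
            "visualizations": [{"type": "dotmap"}, {"type": "heatmap"}],
            "categorizable": False,
            "fullTextIndex": False,
            # No property is filterable in this dataset.
            "properties": [
                {"name": name, "title": title, "type": dtype, "filterable": False}
                for name, title, dtype in COLUMNS
            ],
            # The zoom object is omitted: its keys are not specified in the text.
        },
    }


print(json.dumps(build_customers_dataset(), indent=4))
```

Generating the definition from the column list keeps the property names in sync with the CSV header, which is the main reason to build it in code rather than by hand.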

Using your text editor, save this dataset as customers.json to the /metadata/datasets subdirectory in your dump directory.

When you run the status command, the dataset and the corresponding CSV file are listed as new.

Use addMetadata to add the dataset to the project, and pushProject to upload the CSV file.