Analyzing street networks through graph machine learning

a study on how to encode them.

Abstract

Urban road networks reflect several environmental, social, and economic factors that evolve at different speeds and at different times. Sometimes these factors are linearly related and traceable; in other cases street patterns remain formal manifestations. This study is an attempt to analyze the geometric properties of street networks through graph machine learning. Using unsupervised and supervised models, two ways of encoding street networks are tested: representing them as graphs with the primal and with the dual approach. The study concludes with some advantages and disadvantages of each encoding method and opens a discussion on the prospects of using graph machine learning methods when analyzing or generating street patterns.

street networks, graph machine learning, encoding, primal/dual graphs

1. Introduction

Street networks have commonly been treated as sets of linear elements that connect locations and intersect at junctions. They can be modeled as graphs in two major ways: the primal and the dual approach. In the first, quite intuitively, vertices represent junctions and edges represent streets, while in the second the roles are reversed: vertices represent streets and edges represent the junctions where they meet. While it is easy to represent the topology of street patterns through a graph, it is still not straightforward how best to describe their geometric features.
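The difference between the two approaches can be sketched on a toy network. This is a minimal illustration in plain Python, not the project's code: the junction names, coordinates, and the helper functions primal_graph and dual_graph are hypothetical.

```python
from itertools import combinations

# Hypothetical toy network: junction ids mapped to (x, y) coordinates in metres.
junctions = {"A": (0, 0), "B": (100, 0), "C": (100, 100), "D": (200, 0)}
# Each street segment connects two junctions.
streets = [("A", "B"), ("B", "C"), ("B", "D")]


def primal_graph(junctions, streets):
    """Primal graph: junctions are vertices, streets are edges."""
    adjacency = {j: set() for j in junctions}
    for u, v in streets:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def dual_graph(streets):
    """Dual graph: streets are vertices, linked when they share a junction."""
    adjacency = {s: set() for s in streets}
    for s, t in combinations(streets, 2):
        if set(s) & set(t):  # the two streets meet at a common junction
            adjacency[s].add(t)
            adjacency[t].add(s)
    return adjacency


primal = primal_graph(junctions, streets)
dual = dual_graph(streets)
```

In this toy case junction B has three neighbours in the primal graph, while in the dual graph each of the three streets is adjacent to the other two because they all meet at B.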

2. Two ways of encoding graphs

Encoding strategy

1. encoded as primal graphs.
(4 features per node/1 feature per edge)
Node (junction) features:
1. Average intersection angle
2. Maximum intersection angle
3. Minimum intersection angle
(Max/min angle proportion)
4. Length proportion
(Max / min length of edges starting at the node)

Edge (street) feature:
Length
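The primal node features listed above could be computed roughly as follows. This is a sketch under stated assumptions, not the project's implementation: streets are treated as straight lines between junctions, an "intersection angle" is taken to be the angle between two consecutive streets around a junction, and the function name node_features and the dictionary keys are hypothetical.

```python
import math


def node_features(node, coords, neighbours):
    """Angle and length features for one junction (needs at least one neighbour).

    Assumption: streets are straight segments, and intersection angles are the
    gaps between consecutive streets when sorted by direction around the node.
    """
    x0, y0 = coords[node]
    directions, lengths = [], []
    for n in neighbours:
        dx, dy = coords[n][0] - x0, coords[n][1] - y0
        directions.append(math.degrees(math.atan2(dy, dx)) % 360.0)
        lengths.append(math.hypot(dx, dy))
    directions.sort()
    count = len(directions)
    if count == 1:
        angles = [360.0]  # dead end: a single street leaves one full-turn gap
    else:
        angles = [
            (directions[(i + 1) % count] - directions[i]) % 360.0
            for i in range(count)
        ]
    return {
        "mean_angle": sum(angles) / len(angles),
        "max_angle": max(angles),
        "min_angle": min(angles),
        "length_ratio": max(lengths) / min(lengths),
    }


# Hypothetical T-junction: B connects to A (west), C (north), and D (east).
coords = {"A": (-100, 0), "B": (0, 0), "C": (0, 50), "D": (100, 0)}
features_b = node_features("B", coords, ["A", "C", "D"])
```

For this T-junction the gaps between streets are 90, 90, and 180 degrees, and the longest incident street is twice as long as the shortest.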

2. encoded as dual graphs.
(2 features per node)
Node (street) features:
1. Length
2. Bearing
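The two dual node features can be derived from the end points of a street segment. The sketch below assumes straight segments in projected (metric) coordinates and a compass bearing measured clockwise from north; it also folds the bearing into the range 0 to 180 degrees because a street has no inherent direction. These conventions and the name street_features are assumptions, not taken from the project.

```python
import math


def street_features(start, end):
    """Return (length, bearing) of a straight street segment.

    Assumptions: coordinates are (x, y) in metres with y pointing north;
    the bearing is clockwise from north and folded into [0, 180).
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dx, dy)) % 180.0
    return length, bearing


east_west = street_features((0, 0), (100, 0))
west_east = street_features((100, 0), (0, 0))
diagonal = street_features((0, 0), (10, 10))
```

With this folding, a segment gets the same bearing regardless of which end point is listed first.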

3. Dataset preparation

Samples from the created dataset

The aim in creating the dataset was to obtain samples that can be visually distinguished into typologies and labeled, so that supervised models could be used. The proposed strategy was to follow the classification by Southworth and Ben-Joseph (2003), which groups street patterns into five typologies. The machine learning models were first tested on a smaller dataset at three scales, with square tiles of 600 m, 1200 m, and 1800 m per side. The smallest scale gave better results, so tiles of 600 m were used for the final dataset.

The street network samples were retrieved from the publicly available road network data of OpenStreetMap https://www.openstreetmap.org via the OSMnx Python package https://github.com/gboeing/osmnx.
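Retrieving one square tile with OSMnx might look like the sketch below. It is an illustration rather than the project's code: the helper names, the "drive" network type, and the way tiles are centred are assumptions. With dist_type="bbox", OSMnx interprets dist as the distance from the centre point to each side of the bounding box, so a 600 m tile corresponds to dist=300.

```python
def tile_half_width(tile_size_m):
    """Distance from the tile centre to each side of a square tile, in metres."""
    return tile_size_m / 2


def download_tile(lat, lon, tile_size_m=600):
    """Download the street network inside a square tile centred on (lat, lon).

    Assumption: drivable streets only (network_type="drive").
    OSMnx is imported lazily because it is a third-party package and the
    call needs network access.
    """
    import osmnx as ox

    return ox.graph_from_point(
        (lat, lon),
        dist=tile_half_width(tile_size_m),
        dist_type="bbox",
        network_type="drive",
    )
```

Calling download_tile with the latitude and longitude of a tile centre would return a networkx graph whose nodes are junctions and whose edges are street segments, that is, a primal representation.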

4. Unsupervised model

Unsupervised models were the first approach used to test the encoding strategies. The chosen model is based on "Unsupervised Inductive Graph-Level Representation Learning via Graph-Graph Proximity" (Bai et al. 2019). This model generates graph-level embeddings by training on a set of graphs so that the embeddings best preserve graph-graph proximity. The authors also state that the generated embeddings can provide meaningful visualizations in a two-dimensional space, in addition to supporting traditional tasks such as graph classification and graph similarity/distance computation.
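The idea of preserving graph-graph proximity can be illustrated with a small loss function: embeddings are good when the distance between two graph embeddings matches a target distance between the two graphs. This is a simplified sketch, not the authors' implementation; the use of squared Euclidean distance, the target values, and all names are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def proximity_loss(embeddings, target_distance):
    """Mean squared gap between embedding distances and target graph distances."""
    total = 0.0
    for (i, j), target in target_distance.items():
        total += (squared_distance(embeddings[i], embeddings[j]) - target) ** 2
    return total / len(target_distance)


# Hypothetical target distances between three street-network graphs.
targets = {("g1", "g2"): 1.0, ("g1", "g3"): 9.0, ("g2", "g3"): 4.0}

# Embeddings that reproduce the targets exactly give zero loss.
good = {"g1": [0.0, 0.0], "g2": [1.0, 0.0], "g3": [3.0, 0.0]}
# Moving g2 distorts two of the three pairwise distances.
worse = {"g1": [0.0, 0.0], "g2": [2.0, 0.0], "g3": [3.0, 0.0]}

loss_good = proximity_loss(good, targets)
loss_worse = proximity_loss(worse, targets)
```

Training would adjust the model that produces the embeddings so that this kind of loss decreases over many graph pairs.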

t-SNE visualization of primal-encoded graphs

In the t-SNE visualization of the primal encoding scenario, we can observe that the embeddings generated by this model produce a meaningful representation. Besides differences in topology, the assigned features also allow the model to capture subtle geometric nuances. The model is not successful in differentiating between fragmented parallel and warped parallel patterns, which suggests that the selected features are not sufficient when dealing with curved patterns.

t-SNE visualization of dual-encoded graphs

In the t-SNE visualization of the dual encoding scenario, the groups of the five typologies are not clearly distinguishable. However, street networks that are visually very similar are grouped together, which suggests that this might be a successful way of analyzing patterns of a similar topology.

5. Supervised model – an attempt to build a typology identifier

For this classification task two models were tested: GCN (graph convolutional network) and DGCNN (deep graph convolutional neural network). GCN failed to train satisfactory models, while the results from DGCNN were acceptable. We can observe that the chosen features increase accuracy by one third. Removing the features that encode length does not reduce accuracy. In this case the features that encode intersection angles seem to matter more than the ones encoding length, raising the hypothesis that angles are more important than lengths for classifying typologies.
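To make the role of node features in such models concrete, the sketch below implements one graph convolution step in the style of Kipf and Welling's GCN, followed by a mean readout that turns node vectors into a single graph vector. It is a plain-Python illustration with hypothetical names and toy values, not the DGCNN architecture or the code used in this study.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalisation by node degree.
    norm = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    in_dim, out_dim = len(weights), len(weights[0])
    # Aggregate features from neighbours.
    aggregated = [
        [sum(norm[i][k] * features[k][f] for k in range(n)) for f in range(in_dim)]
        for i in range(n)
    ]
    # Linear transform followed by ReLU.
    return [
        [
            max(0.0, sum(aggregated[i][f] * weights[f][o] for f in range(in_dim)))
            for o in range(out_dim)
        ]
        for i in range(n)
    ]


def mean_readout(node_embeddings):
    """Average node vectors into one graph-level vector."""
    n = len(node_embeddings)
    return [sum(column) / n for column in zip(*node_embeddings)]


# Toy graph: two connected nodes, one feature each, identity weight.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
hidden = gcn_layer(adjacency, features, weights)
graph_vector = mean_readout(hidden)
```

Dropping a feature, as in the comparison with and without length features, would correspond to removing a column of the feature matrix and the matching row of the weight matrix.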

DGCNN accuracy for primal encoding

DGCNN accuracy for primal encoding, no node features

DGCNN accuracy for dual encoding

Testing the trained model on unseen and unlabeled graphs also provides interesting insights into street networks where typologies cannot be clearly defined. We can observe that gridiron typologies usually get identified successfully. Warped parallel, on the other hand, is the least successful one, partly because it had the fewest samples in the dataset.

Testing the typology classifier

6. Generating

Through testing these two machine learning models, we come to the conclusion that the selected features are able to encode the geometric nature of street networks to a certain extent, and might therefore also be used for generating them. In particular, our proposed dual representation provides an easy way of encoding street networks and decoding them back, and can be used as an input for graph generative models.
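Decoding a street from a length and a bearing can be sketched as follows. The sketch assumes a known start point, straight segments, projected (metric) coordinates, and a bearing measured clockwise from north; none of these conventions is stated in the project, and the function name decode_street is hypothetical.

```python
import math


def decode_street(start, length, bearing_deg):
    """Return the end point of a straight street given its start, length, and bearing.

    Assumptions: (x, y) in metres with y pointing north; bearing clockwise from north.
    """
    bearing = math.radians(bearing_deg)
    return (
        start[0] + length * math.sin(bearing),
        start[1] + length * math.cos(bearing),
    )


end_east = decode_street((0.0, 0.0), 100.0, 90.0)
end_north = decode_street((10.0, 10.0), 50.0, 0.0)
```

A full decoder would also need the connectivity of the dual graph to decide where each street starts, which this fragment leaves out.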

Different models are used for generating graphs, the most popular being variational autoencoders (VAEs), generative adversarial networks (GANs), and autoregressive models. Autoregressive models are more sophisticated and can also be combined with VAEs and GANs for better performance. Notable examples are GraphRNN and GRAN.

Generated graph topologies from the GRAN model

Conclusions

This study has been an exploration of graph machine learning methods applicable to the geometric analysis of street networks. It shows that these models can be accessible even to non-AI professionals thanks to the existence of open-source libraries and demos.

It is up to architects and urban planners to think about how these methods can be used in their practice. In my opinion the analysis of street networks should continue in a chicken-and-egg dynamic, thinking about how street networks affect and can be affected by different factors, be they environmental, economic, or even pragmatic ones such as building area or function.

In this sense both encoding strategies might be useful in further studies, be that analysis or generation of street networks.

The accompanying code is available under: https://github.com/Eridaa/Analyzing-street-networks-through-graph-machine-learning .

Analyzing street networks through graph machine learning is a project of IAAC, Institute for Advanced Architecture of Catalonia developed in the Master in Advanced Computation for Architecture & Design 2021/22 by student: Erida Bendo and faculty: David Andres Leon.
