Dataset generation for informed space allocation through machine learning

01 // CONTEXT – Space Allocation Problem

In architecture, the space allocation problem is the well-known task of locating and sizing different spaces according to a set of requirements or constraints (e.g., the relationships between them), which are usually topological and geometrical.

It has been present in space planning for a long time. As early as 1929, Le Corbusier was thinking about the location of spaces and their connections using a bubble diagram. Diagrams of this kind are commonly used by architects to sketch and design buildings, as they are the simplest method of approaching this problem. The bubbles represent the spaces, and they can be rearranged to explore different distributions and relations between them, as shown in Le Corbusier’s diagram.

Since the introduction of Computer-Aided Design (CAD), many studies have tried to automate and simplify this problem. In 1963, Armour and Buffa published the first research to approach the space allocation problem through CAD (Armour, G.C., & Buffa, E.S. (1963). A heuristic algorithm and simulation approach to relative location of facilities. Management Science, 9, 294-309).

Since then, the problem has been approached in many different ways. In 1971, the Architecture Machine Group developed YONA, a tool based on graph theory that allows the user to design. Also based on graph theory is Magnetizing Floor Plans, a tool for Grasshopper by Decoding Spaces, which solves the problem using evolutionary algorithms. Evolutionary algorithms are a common way of solving space allocation; Joel Simon used them in the project Evolving Floor Plans, which optimizes the use of hallways. Graph theory can also be combined with machine learning, as HouseGAN++ shows: it introduces some concepts of graph theory to generate floor plans using GANs. Image generation is a powerful tool, which was also used by Stanislas Chaillou in AI + Architecture. Finally, a combination of these two methods, evolutionary algorithms and artificial intelligence, is used in Finch3D, a tool that makes adaptive design much easier for the user.

However, what all these studies have in common is that they address the space allocation problem in two dimensions. This means that some information about the space is missing, and it could be said that the problem being solved is floor plan allocation rather than space allocation.

Taking this into account, and as OMA’s diagrammatic section of the Seattle library shows, space and its distribution need to be seen not only in the horizontal plane but also in the vertical plane in order to fully understand a building. Designing requires understanding space in three dimensions.



The main goal of this thesis is to research how to translate the space allocation problem from 2D to 3D.

The idea for this research comes from the search for a more informed spatial design from the early stages of the design process. Nowadays, many simulation and analysis tools make it possible to predict and simulate building performance. However, they do not seem to be used to develop an informed design that actually incorporates that analysis, but only for a few small brush strokes. With a tool such as the one researched here, this analysis could be incorporated directly into the design process as a constraint or a parameter to take into account.

Nevertheless, this tool is not intended to replace the architect or to magically solve the whole design process. Ideally, a tool like the one being researched improves the design process, so that the architect can evaluate different options, make well-founded decisions, and develop an informed design.


The chosen method for configuring and designing spaces in 3D combines graph theory with machine learning. The first step would be to generate a dataset, the second to build and train the required neural network, and the last to use the trained model to automatically generate space distributions. This research focuses on the first step, and on researching how the generated dataset would have to be encoded and how the required neural network would work.

The expected result, then, is an algorithm that automatically generates a dataset of spaces and the connections between them in 3D, together with a way of encoding the generated geometrical data. This dataset would be used to train a Graph Neural Network (GNN), which handles the spaces and their relationships, and a 3D Convolutional Neural Network (CNN), which handles the boundary and allocates the spaces within it.


Automated Floor Plan Design: Generation, Simulation and Optimization – Eugenio Miguel de Sousa Rodrigues

Automated Floor Plan Design has a goal similar to that of this research: to develop a tool for a more informed design that is useful from the space planning phase onwards. It uses evolutionary algorithms to create design solutions, and in a second phase it goes further by assessing and optimizing them according to thermal performance. Its approach to and understanding of multi-level space allocation is what has been considered most in this research. However, although it is conceived for several levels, spaces are still conceived and allocated in two dimensions.

LOC[ai]: Partially Assisted Artificial Intelligence Tool for Architecture – Oana Taut

LOC[ai] has three clearly defined steps: data collection, which obtains the design parameters; space optimization, through multi-objective genetic optimization; and space experience design, using artificial intelligence.

In this research, the space allocation problem is already solved in three dimensions. Based on graph theory, it allocates a 3D bubble diagram through genetic optimization, which is afterwards translated into a 3D space scheme. The workflow followed in LOC[ai] is the same one followed in this research for the generation of the dataset, except that this research uses physics simulation instead of genetic optimization.

Evolutionary approach for spatial architecture layout design enhanced by an agent-based topology finding system – Zifeng Guo, Biao Li

This approach combines two steps for allocating spaces. In the first stage, it uses an agent-based topology system to allocate spheres and capsules in 3D through attraction and repulsion forces. Those spheres are then translated into a box grid system, which is optimized through evolutionary algorithms to get cleaner rooms.

This research is very advanced in terms of allocating spaces in 3D, as its results are extremely close to the spaces ultimately desired in architecture.

Graph2Plan: Learning Floorplan Generation from Layout Graphs – Ruizhen Hu, Zeyu Huang, Yuhan Tang, Oliver Van Kaick, Hao Zhang, Hui Huang

Graph2Plan’s workflow serves as the neural network reference. This research develops an interface where users can generate their desired house in 2D, given an input boundary and their preferences. This is possible thanks to a Graph Neural Network, which reads the input graph obtained from existing floorplans, and a Convolutional Neural Network, which reads the input raster image with the boundary. By combining both, the spaces are allocated considering the input graph and the boundary.



The projects selected above are all based on graph theory for allocating space, but they can be divided into two groups: those that solve space allocation in 2D and those that do it in 3D. As already explained, the goal of this research is to translate 2D space allocation into 3D. It follows the ideas of LOC[ai] and the agent-based topology finding system to generate the dataset, and then intends to apply Graph2Plan’s neural network to the 3D dataset.

This research aims to automate the generation of a dataset of 3D spaces, which would then be used to train a neural network. The dataset was generated using Kangaroo Physics, a physics engine available as a Grasshopper component, which makes it possible to apply forces and constraints to allocate the spaces represented in a 3D bubble diagram.

Generation workflow

The process consists of the following steps: collect parameters and constraints, generate a bubble diagram, allocate spaces, and transform bubbles into boxes.

The inputs for the final model would mostly be user preferences. Therefore, to generate the dataset, it is necessary to set some parameters and constraints that would ultimately be set by the user. These parameters include the desired rooms, their sizes, the connections between them, the level on which they would be located, and so on. They are collected in an Excel sheet and then read in Grasshopper. However, some constraints are not defined by the user alone, such as the volumetric boundary of the building and its entrance, which are geometrical constraints. Accordingly, these constraints are set in Grasshopper as the actual volume and as an attractor point in the desired place. Following this same attractor-point method, further constraints, such as the orientation of the rooms, could be applied in a future development.
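As a minimal sketch of what such a parameter table could look like once it is read, the snippet below parses a CSV export using only Python's standard library. The column names (`name`, `area`, `height`, `level`, `connects_to`), the `Room` class and the sample rooms are assumptions made for illustration; they are not the sheet used in this research.

```python
import csv
import io
from dataclasses import dataclass, field

# Hypothetical CSV export of the parameter sheet; column names are assumed.
SHEET = """name,area,height,level,connects_to
hall,12,3,0,living;kitchen
living,30,3,0,hall;kitchen
kitchen,15,3,0,hall;living
bedroom,16,3,1,hall
"""

@dataclass
class Room:
    name: str
    area: float
    height: float
    level: int
    connects_to: list = field(default_factory=list)

def read_rooms(text):
    """Parse one Room per row of the sheet."""
    rooms = []
    for row in csv.DictReader(io.StringIO(text)):
        rooms.append(Room(
            name=row["name"],
            area=float(row["area"]),
            height=float(row["height"]),
            level=int(row["level"]),
            connects_to=[c for c in row["connects_to"].split(";") if c],
        ))
    return rooms

def connection_edges(rooms):
    """Undirected connections as sorted name pairs, without duplicates."""
    edges = set()
    for room in rooms:
        for other in room.connects_to:
            edges.add(tuple(sorted((room.name, other))))
    return sorted(edges)

rooms = read_rooms(SHEET)
edges = connection_edges(rooms)
```

Storing each connection once as a sorted pair means that a link declared on either room (or on both) produces a single edge.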

Once the Excel sheet is read in Grasshopper, the bubble diagram can be generated with the defined spaces and constraints. However, these bubbles are initially created to satisfy the desired area of each room. To avoid cubic spaces, whose height equals their length, those spheres need to be subdivided according to the desired height of each space. Once all the spaces are ready and connected according to the input, they are allocated inside the input building volume using Kangaroo.
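One possible reading of this subdivision can be sketched as follows. It assumes that a bubble's horizontal cross-section stands for the room area, and that a bubble taller than the desired height is replaced by several smaller spheres whose diameter equals that height. Both the rule and the function name `subdivide_bubble` are assumptions for illustration, not the exact Grasshopper definition.

```python
import math

def subdivide_bubble(area, height):
    """Return the radii of the spheres that stand in for one room.

    Assumed rule: a single sphere whose cross-section equals the room
    area is kept if it fits under the desired height; otherwise it is
    replaced by spheres of diameter `height` covering the same area.
    """
    r_full = math.sqrt(area / math.pi)
    if 2 * r_full <= height:
        return [r_full]
    r_small = height / 2
    count = math.ceil(area / (math.pi * r_small ** 2))
    return [r_small] * count

living = subdivide_bubble(area=30, height=3)   # several spheres of radius 1.5
closet = subdivide_bubble(area=4, height=3)    # one sphere is already low enough
```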

Kangaroo Physics makes it possible to reproduce attraction and collision forces for assembling the rooms. To allocate the bubbles inside the input building volume, some of these forces were used as constraints: attraction forces to keep the rooms together, and to keep together the subdivided spheres of the same room; repulsion to prevent them from colliding; a boundary collision to keep the spheres inside the volume; attractor points to specify the entrance of the building; and planes onto which the bubbles are kept, so that they stay on the desired level.
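The interplay of these forces can be illustrated outside Grasshopper with a small position-based relaxation. The sketch below is a stand-in written in plain Python, not Kangaroo's solver or API: linked spheres are pulled until they touch, overlapping spheres are pushed apart, centers are clamped to a box-shaped boundary, and each center is snapped onto its level plane. The function name `relax`, the box boundary and the sample values are assumptions, and the entrance attractor is left out for brevity.

```python
import math

def relax(centers, radii, links, box_min, box_max, level_z, iterations=200):
    """Position-based relaxation of sphere centers (illustrative only)."""
    pts = [list(c) for c in centers]
    n = len(pts)
    linked = {frozenset(pair) for pair in links}
    for _ in range(iterations):
        for i in range(n):
            for j in range(i + 1, n):
                delta = [pts[j][k] - pts[i][k] for k in range(3)]
                dist = math.sqrt(sum(d * d for d in delta)) or 1e-9
                rest = radii[i] + radii[j]
                connected = frozenset((i, j)) in linked
                # Attraction acts on linked pairs only; repulsion on any overlap.
                if (connected and dist > rest) or dist < rest:
                    shift = 0.5 * (dist - rest) / dist
                    for k in range(3):
                        pts[i][k] += shift * delta[k]
                        pts[j][k] -= shift * delta[k]
        for i in range(n):
            # Boundary collision: keep each sphere inside the box in x and y.
            for k in range(2):
                low, high = box_min[k] + radii[i], box_max[k] - radii[i]
                pts[i][k] = min(max(pts[i][k], low), high)
            # Level plane: snap the center onto the height of its floor.
            pts[i][2] = level_z[i]
    return pts

# Three rooms on one level, linked as a chain 0-1-2.
placed = relax(
    centers=[(1, 1, 0), (6, 2, 0), (10, 6, 0)],
    radii=[1, 1, 1],
    links=[(0, 1), (1, 2)],
    box_min=(0, 0, 0),
    box_max=(12, 12, 3),
    level_z=[0, 0, 0],
)
```

After relaxation, linked spheres end up tangent to each other while unlinked ones merely avoid overlapping, which is the behaviour the bubble diagram needs.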

Thanks to the combination of these forces, the bubbles can be allocated inside the input building volume considering the connections and relationships between them. However, the spheres are just a diagram, and architectural spaces are usually orthogonal, so those spheres have to be translated into boxes. This is done by dividing the whole building into a grid of 1x1x1 cells and generating the real spaces from the allocated spheres. At the end of this process, the real spaces (the nodes) and the connections between them are ready to be encoded and used for the GNN, and floorplans can also be obtained by translating the 3D spaces back into 2D.
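A simplified version of this translation can be sketched as follows: each unit cell of the grid is assigned to the sphere that contains its center, and two rooms are considered connected when their cells share a face. The assignment rule, the function names and the sample spheres are assumptions for illustration; the actual Grasshopper definition may resolve cells differently.

```python
import math

def voxelize_rooms(spheres, size):
    """Assign unit cells to rooms.

    spheres: list of (room, (x, y, z), radius); size: (nx, ny, nz) cells.
    A cell goes to the sphere that contains its center most deeply.
    """
    nx, ny, nz = size
    grid = {}
    for ix in range(nx):
        for iy in range(ny):
            for iz in range(nz):
                center = (ix + 0.5, iy + 0.5, iz + 0.5)
                best = None
                for room, c, r in spheres:
                    gap = math.dist(center, c) - r  # negative inside the sphere
                    if gap <= 0 and (best is None or gap < best[0]):
                        best = (gap, room)
                if best is not None:
                    grid[(ix, iy, iz)] = best[1]
    return grid

def room_adjacency(grid):
    """Pairs of rooms whose cells share a face."""
    pairs = set()
    for (ix, iy, iz), room in grid.items():
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            other = grid.get((ix + dx, iy + dy, iz + dz))
            if other is not None and other != room:
                pairs.add(tuple(sorted((room, other))))
    return pairs

cells = voxelize_rooms(
    [("a", (1, 1, 0.5), 1.2), ("b", (3, 1, 0.5), 1.2)],
    size=(4, 2, 1),
)
adjacent = room_adjacency(cells)
```

Dropping the vertical index of each cell gives the floorplan of a level, which corresponds to translating the 3D spaces back into 2D.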

These steps are the basis for automatically generating the dataset, which uses real plots from Madrid city center in order to get real building boundaries rather than randomized ones. This also makes it possible to include new samples in case there is some bias in the boundaries, or to replicate the process in case another city has a very specific urban configuration that needs to be taken into account.


In short, graphs are structures made of vertices or nodes, and edges connecting them.

A graph neural network (GNN) is a type of neural network that processes data described by graphs. Graph convolution, the type of GNN considered here, predicts the features of a node from the features of its neighbors. In this particular case, the input would be a graph, and the GNN would generate boxes taking the given size into account as a feature.
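To make the idea of graph convolution concrete, the sketch below applies one propagation step of the form relu(D^-1/2 (A + I) D^-1/2 X W), the normalization commonly used in graph convolutional networks, written with plain Python lists. The three-room graph, the single area feature and the fixed weight are assumptions for illustration; in a trained model the weights would be learned.

```python
import math

def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: relu(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so that each node also keeps its own features.
    a_hat = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    f_in, f_out = len(weights), len(weights[0])
    # Aggregate neighbor features, then apply the shared linear map and ReLU.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(f_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(f_in)))
             for o in range(f_out)] for i in range(n)]

# Hypothetical graph: a hall (node 0) connected to two other rooms.
adjacency = [[0, 1, 1],
             [1, 0, 0],
             [1, 0, 0]]
areas = [[12.0], [30.0], [15.0]]   # one feature per node: the room area
hidden = gcn_layer(adjacency, areas, weights=[[1.0]])
```

Each output row mixes a room's own area with the areas of the rooms it is connected to, which is how neighbor information reaches a node.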

In addition, introducing the building volume as a boundary would require a 3D Convolutional Neural Network, which could read the building boundary after voxelizing it.
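As a minimal sketch of such a voxelization, the snippet below turns an extruded plot footprint into a binary occupancy grid, the kind of volumetric input a 3D CNN can read. It assumes that the building volume is a vertical extrusion of a 2D polygon and uses unit cells; the function names and the L-shaped footprint are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test for a point against a closed 2D polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def voxelize_extrusion(footprint, height, size):
    """Occupancy grid indexed as grid[z][y][x]; 1 marks a cell inside the volume."""
    nx, ny, nz = size
    return [[[1 if iz + 0.5 < height
              and point_in_polygon(ix + 0.5, iy + 0.5, footprint) else 0
              for ix in range(nx)]
             for iy in range(ny)]
            for iz in range(nz)]

# Hypothetical L-shaped plot, extruded two units high, in a 4x4x3 grid.
footprint = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
occupancy = voxelize_extrusion(footprint, height=2, size=(4, 4, 3))
filled = sum(v for layer in occupancy for row in layer for v in row)
```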


As explained above, the goal of this research is not to develop the model and train the neural network, but to research how it would work and how the data of the generated dataset would have to be encoded.

The first step is to feed the data generated in Grasshopper to the GNN, which generates the spaces given the input graph. At the same time, the building boundary volume is voxelized so that the 3D CNN can interpret the boundaries. This would make it possible to allocate the spaces generated by the GNN inside the boundary.

The only input needed for the GNN would be the graph, defined by the room count and the connections between rooms, from which it would generate the room boxes. For the 3D CNN, the inputs would be the building volume and the room boundaries as an image, plus constraints such as the entrance location, to finally obtain the spaces distributed inside the boundary.

The parameters included in the dataset are the ones introduced in the Excel sheet at the beginning, the initial geometrical constraints, and the generated room boundaries. However, they had to be encoded as numbers in order to use them in the neural network.
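One way such an encoding could look is sketched below: each room becomes a row of scaled numeric features, and the connections become a symmetric adjacency matrix. The chosen features (area and level), the scaling by a maximum value and the function name `encode_sample` are assumptions for illustration, not the encoding used for the dataset.

```python
def encode_sample(rooms, edges, max_area, max_level):
    """Encode rooms and connections as numbers.

    rooms: list of (name, area, level); edges: list of (name, name).
    Returns (node features, adjacency matrix).
    """
    index = {name: i for i, (name, _, _) in enumerate(rooms)}
    # Node features: area and level scaled to the 0..1 range.
    features = [[area / max_area, level / max_level]
                for _, area, level in rooms]
    n = len(rooms)
    adjacency = [[0] * n for _ in range(n)]
    for a, b in edges:
        adjacency[index[a]][index[b]] = 1
        adjacency[index[b]][index[a]] = 1
    return features, adjacency

features, adjacency = encode_sample(
    rooms=[("hall", 12, 0), ("living", 30, 0), ("bedroom", 15, 1)],
    edges=[("hall", "living"), ("hall", "bedroom")],
    max_area=30,
    max_level=1,
)
```

Scaling the values to a common range keeps features with large magnitudes, such as areas, from dominating the others during training.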

In a further development, after training the model, anyone could use it as a tool by providing a building volume as the boundary and specifying their preferences. The tool would then automatically output the room distribution in 3D inside the given volume.


As the examples above show, the space allocation problem is a widely known issue that has been tackled in very different ways. However, only recently has it been solved in 3D with great results. These approaches have used evolutionary algorithms, which make the process slow and computationally heavy. That is why it would be worthwhile to translate some of the knowledge from the machine learning solutions into 3D, which is what this research aims to do.

Nevertheless, with the use of machine learning and raster images, the precision and sizes of the rooms are lost along the way, since the output is an approximation of what the neural network has learned rather than a geometrical volume.

In this research, there is still a long way to go in developing the neural networks that could generate the spaces. In theory, however, the machine learning method could work better than the evolutionary algorithm approach, as the dataset generation retains some of the strengths of that approach and the trained model would then generate spaces instantly.

Not only is there development left in the neural network, but the dataset generation can also be improved. The generation workflow is easy to replicate, so it would be advisable to repeat it with different building types and uses, to reduce bias in the future model.

Another aspect to improve in the dataset generation is the introduction of longitudinal corridors. Because of the configuration of the considered spaces, the corridors in this dataset are created as halls. Nevertheless, since the spaces are divided into smaller spheres, it should not be difficult to align them in a more linear way, as in the typical corridor typology, by applying linear forces.

The main goal of this whole process is to enable informed spatial design, so it would also be good to add more constraints in the dataset generation, such as views, orientation or any other parameter that could be analyzed and used for space planning.


1. Rodrigues, Eugénio. (2014). Automated Floor Plan Design: Generation, Simulation, and Optimization.
2. Ruizhen Hu, Zeyu Huang, Yuhan Tang, Oliver Van Kaick, Hao Zhang, and Hui Huang. (2020). Graph2Plan: Learning Floorplan Generation from Layout Graphs.
3. Nelson Nauata, Sepidehsadat Hosseini, Kai-Hung Chang, Hang Chu, Chin-Yi Cheng, and Yasutaka Furukawa. (2021). House-GAN++: Generative Adversarial Layout Refinement Networks.
4. Simon, Joel. Evolving Floorplans.
5. Taut, Oana. (2020). LOC[ai]: Partially Assisted Artificial Intelligence Tool for Architecture.
6. Guo, Zifeng and Li, Biao. (2017). Evolutionary approach for spatial architecture layout design enhanced by an agent-based topology finding system.
7. Gavrilov, Egor and Schneider, Sven and Dennemark, Martin and Koenig, Reinhard. (2020). Computer-aided approach to public buildings floor plan generation. Magnetizing Floor Plan Generator.
8. Nourian, Pirouz. (2016). Configraphics. Graph Theoretical Methods for Design and Analysis of Spatial Configurations
9. Saha, Nirvik. (2020). The Space Allocation Problem.
10. Chaillou, Stanislas. (2019). AI + Architecture. Towards a New Approach
11. Kipf, Thomas. (2016). Graph Convolutional Networks.
12. Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, and Maosong Sun. (2021). Graph Neural Networks: A Review of Methods and Applications.
13. Benjamin Sánchez-Lengeling, Emily Reif, Adam Pearce, Alexander B. Wiltschko. (2021). A Gentle Introduction to Graph Neural Networks.
14. Menzli, Amal. (2021). Graph Neural Network and Some of GNN Applications – Everything You Need to Know.


SPATIAL ADAPTIVE DESIGN is a project of IAAC, Institute for Advanced Architecture of Catalonia, developed in the Master in Advanced Computation for Architecture & Design in 2020/21 by:

Students: Jaime Cordero Cerrillo

Thesis Advisor: Oana Taut