Latent Space Walk through Generated Churches

Neural networks and other machine learning applications are so integrated into our daily lives that we hardly notice them. These technologies focus on automating work we would otherwise do manually, so that we can shift our attention to more stimulating subjects. The same holds in architecture, where a lot of work has gone into using machine learning techniques to handle basic tasks, leaving the designer more time for creative expression. This frees us to use our brain’s capability to intuitively express designs and 3D forms, and to expand on them. This leads us to the question:
How can we recreate this capability of creative expansion using Neural Networks?
The aim of this paper is to apply machine learning techniques to develop design tools that could deal with geometric data, so as to traverse the creative expanse of the design field.

  • To explore the potential of 3DGAN NNs in architectural design as an efficient method in generating new design solutions.
  • To unveil the possible use of latent space operations as a design methodology.
  • To identify the directions in which this methodology can be taken forward as a new creative design tool.

Workflow

The project followed a workflow involving curating and preparing 3D datasets, data conversions, and many iterations of training. Multiple models were tested, each with several runs using different hyperparameters, until the best combination was found.

Workflow diagram

Model I 3DGAN: Keras

Model architecture diagram showing hidden layers

Keras implementation of the architecture suggested in the paper: Wu, Zhang, Xue, Freeman, Tenenbaum (2016), “Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling” [NIPS]

IKEA Table Dataset
Column Capital Dataset

The first run involved a Keras implementation of the architecture suggested in the MIT paper[5]. This model was run on two different datasets: an IKEA furniture dataset and a column capital dataset. Using the matplotlib library in Python, we were able to visualize the output from the predicted values, which gave a deeper understanding of the model's performance and of the generated outputs. The outputs were not refined, but the model was still able to generate capitals and tables that were distinguishable.

Generated Tables 0-2000 epochs
Generated Column Capitals 0-2000 epochs

Model II 3DGAN: Tensorflow

Model architecture diagram showing hidden layers

TensorFlow 1 implementation of the architecture suggested in the NIPS paper, based on the project: rp2707, Greg K (wasd12345) (2017), “coms4995-project”

The second model was a TensorFlow 1 version of the same architecture. Although this implementation was more refined, the outputs still showed no signs of progress. Furthermore, the two networks ran into mode collapse and the model was unable to generate variations. This was a critical problem for the project, as latent space operations require the model to generate different outputs.

Model II outputs showing mode collapse

Mode Collapse?!

“Mode collapse occurs when the generator produces an especially plausible output, the generator may learn to produce only that output. In fact, the generator is always trying to find the one output that seems most plausible to the discriminator.” One solution to this is the Wasserstein method. “Wasserstein GAN, or WGAN, is a type of generative adversarial network that minimizes an approximation of the Earth-Mover’s distance (EM). It leads to more stable training than original GANs with less evidence of mode collapse.” With the Wasserstein method, the GAN can be optimized so that the generated outputs show variation: the model trains its weights so that it moves away from generating a single solution. This resolved the issue of mode collapse in our models.
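For reference, the Wasserstein objective with gradient penalty can be written as below. This is the standard formulation from the improved WGAN literature (Gulrajani et al., 2017), given here as an assumption about what the models in this project optimize rather than a transcription of their code. D is the critic (discriminator), G the generator, P_r the data distribution, P_g the generator distribution, and λ the penalty weight (commonly set to 10).

```latex
% Critic (discriminator) loss with gradient penalty
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]

% Generator loss
L_G = -\,\mathbb{E}_{z \sim p(z)}\big[D(G(z))\big]

% Samples on which the penalty is evaluated
\hat{x} = \epsilon\, x + (1 - \epsilon)\, \tilde{x}, \qquad \epsilon \sim U[0, 1]
```

Because the critic outputs an unbounded score instead of a probability, the generator keeps receiving a useful gradient even when the critic is strong, which is one reason this setup tends to be less prone to mode collapse.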

Model III 3DIWGAN – 32 resolution

Model architecture diagram showing hidden layers

Further developed TensorFlow 1 implementation of the architecture used in the NIPS paper, with gradient penalty and the Wasserstein method applied during loss calculations. Architecture based on Smith, Meger (2017), “Improved Adversarial Systems for 3D Object Generation and Reconstruction” and Wiegand (2018), “Eine Einführung in Generative Adverserial Network(GAN)”

Model III was a 3D IWGAN written with the TensorFlow 1 library. The model was set up for inputs at 32 voxel resolution. Along with the new model, a new dataset was introduced: 3D geometry of churches sourced from Thingiverse, an online repository of 3D geometries for printing. The churches provided a dataset that is large enough and that keeps recognizable features even at lower resolutions.
The first set of iterations was based on the parameters from the original 3DGAN paper[5].

Original mesh and voxelized geometry of resolution 32³
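The voxelization step shown in the figure can be sketched as follows. This is a minimal NumPy illustration, assuming the mesh has already been sampled into surface points; the function name `voxelize_points` and the random sphere used as test data are hypothetical stand-ins, not the project's actual conversion pipeline.

```python
import numpy as np

def voxelize_points(points, resolution=32):
    """Convert an (N, 3) array of surface points into a binary occupancy grid."""
    points = np.asarray(points, dtype=np.float64)
    # Shift to the origin and scale uniformly so the longest side fills the grid.
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    if extent == 0:
        extent = 1.0  # degenerate input: all points coincide
    scaled = (points - mins) / extent * (resolution - 1)
    idx = np.rint(scaled).astype(int)
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1  # mark occupied cells
    return grid

# Points on a unit sphere stand in for points sampled from a church mesh.
rng = np.random.default_rng(0)
pts = rng.normal(size=(5000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
grid = voxelize_points(pts, resolution=32)
print(grid.shape, int(grid.sum()))
```

Uniform scaling keeps the proportions of the building intact, at the cost of leaving part of the grid empty for tall or elongated models.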

The outputs were rather poor: the generations retained the massing features but were missing the finer architectural details.

Generated Churches 0-1500 epochs

Model IV 3DIWGAN – 64 resolution: Run I

Model architecture diagram showing hidden layers

Further developed TensorFlow 1 implementation of the architecture used in the NIPS paper, with gradient penalty and the Wasserstein method applied during loss calculations.
Voxel dimensions increased to 64³, with the input dataset upsampled to accommodate finer architectural details.

Upscaled dataset from 32³ to 64³ resolution
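As a rough illustration of the change in grid size, the sketch below upsamples an occupancy grid from 32³ to 64³ by nearest-neighbour repetition. The function `upsample_voxels` and the box-shaped test grid are hypothetical; the project's own upscaling method is not documented here and may have differed.

```python
import numpy as np

def upsample_voxels(grid, factor=2):
    """Nearest-neighbour upsampling: every voxel becomes a factor^3 block."""
    out = grid
    for axis in range(3):
        out = np.repeat(out, factor, axis=axis)
    return out

# A simple box stands in for a voxelized church at 32^3.
low = np.zeros((32, 32, 32), dtype=np.uint8)
low[10:20, 12:18, 0:25] = 1
high = upsample_voxels(low)
print(low.shape, "->", high.shape)
```

Repetition alone only changes the grid size and adds no new information. Capturing finer detail would require re-voxelizing the original meshes at the higher resolution.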

The 3D IWGAN was updated to accommodate an improved dataset at a resolution of 64 x 64 x 64. This meant adding hidden layers at the output of the generator and at the input of the discriminator to reshape the generated sample. For this, a new dataset of churches at the higher resolution of 64 x 64 x 64 was made.
The first run with this model used the same parameters as the 3DGAN[5] paper and, as expected, the model tended to leave out finer details. This might be because the parameters used in the paper were optimal for generating furniture and other objects with larger, more distinguishable details. The model therefore required fine tuning through trial and error until a suitable range of parameters was identified. After increasing the learning rates, we observed that the model learned more and produced more detailed outputs, but at higher epochs the loss values started to climb and the GAN became unstable again.

Generated Churches with increased learning rates

Model IV 3DIWGAN – 64 resolution: Run II

Generated Churches with new features

After a number of iterations, an optimal range was reached within which the model kept improving with more training. The loss values kept decreasing and the outputs showed more positive results. The process was then continued on an extended dataset, with 11 rotations introduced for each geometry in the input dataset. The new generations showed more variation and a greater mix of features.
With the optimal parameters at high epochs, we were able to generate new church geometries with unique features that were not seen in the input dataset. This showed great potential for generating new geometries based on existing 3D geometry, which could be extended and built upon to define many new design methodologies.
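The rotation-based extension of the dataset described above can be sketched as follows. This is a minimal NumPy illustration under two assumptions that are not stated in the source: the rotations are taken about the vertical axis of the grid (here the last array axis), and the 11 rotations are equal 30 degree steps that, together with the original, cover a full turn. The function names are hypothetical.

```python
import numpy as np

def rotate_voxels_z(grid, angle_deg):
    """Rotate a cubic occupancy grid about its last axis (nearest neighbour)."""
    n = grid.shape[0]
    c = (n - 1) / 2.0  # centre of the grid in index coordinates
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Index coordinates of every output cell in the horizontal plane.
    xs, ys = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    # Inverse mapping: find the source cell for each output cell (avoids holes).
    src_x = cos_t * (xs - c) + sin_t * (ys - c) + c
    src_y = -sin_t * (xs - c) + cos_t * (ys - c) + c
    ix = np.rint(src_x).astype(int)
    iy = np.rint(src_y).astype(int)
    inside = (ix >= 0) & (ix < n) & (iy >= 0) & (iy < n)
    out = np.zeros_like(grid)
    out[xs[inside], ys[inside], :] = grid[ix[inside], iy[inside], :]
    return out

def augment_with_rotations(grid, n_rotations=11):
    """Return n_rotations rotated copies at equal angular steps (assumed scheme)."""
    step = 360.0 / (n_rotations + 1)
    return [rotate_voxels_z(grid, step * k) for k in range(1, n_rotations + 1)]

# An off-centre box stands in for a voxelized church.
church = np.zeros((16, 16, 16), dtype=np.uint8)
church[2:6, 3:9, 0:10] = 1
rotated = augment_with_rotations(church)
print(len(rotated), rotated[0].shape)
```

Parts of a model that rotate outside the grid are cut off, so in practice the geometry would need some margin around it.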

Generated geometries with distinct features

Model IV 3DIWGAN – 64 resolution: Run III

For the latent space operations to be meaningful, one priority was to be able to map input geometries into the latent space. This would let us take an input 3D geometry and perform latent vector operations on it.

Several papers, such as embedGAN[8], address this issue. Their solution is to further optimize a random latent input vector so as to reduce the distance between the input and the generation. With this approach we ran further iterations and were able to generate geometries with features almost identical to those of the input dataset. In the figure, the generations on the left are similar to the dataset, and moving to the right, more distinct features and unique solutions can be observed.
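The idea of optimizing a latent vector so that its generation matches a given input can be sketched as below. This is a toy example only: a fixed linear map `W` stands in for the trained generator so that the gradient can be written by hand, whereas with a real TensorFlow model the gradient with respect to the latent vector would come from automatic differentiation. All names and sizes are hypothetical.

```python
import numpy as np

def embed_in_latent_space(generate, grad_wrt_z, target, z_init, lr=0.1, steps=500):
    """Gradient descent on z to minimise ||generate(z) - target||^2."""
    z = z_init.copy()
    for _ in range(steps):
        residual = generate(z) - target
        z -= lr * grad_wrt_z(z, residual)
    return z

# Toy stand-in for a trained generator: a fixed linear map from latent to data space.
rng = np.random.default_rng(0)
latent_dim, data_dim = 8, 64
W = rng.normal(size=(data_dim, latent_dim)) / np.sqrt(data_dim)

def generate(z):
    return W @ z

def grad_wrt_z(z, residual):
    # Gradient of ||W z - x||^2 with respect to z.
    return 2.0 * W.T @ residual

# The "input geometry" is produced from a latent vector the optimiser never sees.
target = generate(rng.normal(size=latent_dim))
z_start = rng.normal(size=latent_dim)  # random initial latent vector
z_fit = embed_in_latent_space(generate, grad_wrt_z, target, z_start)

start_err = np.linalg.norm(generate(z_start) - target)
final_err = np.linalg.norm(generate(z_fit) - target)
print(start_err, final_err)
```

Once an input has a latent vector of its own, it can take part in the same interpolation and arithmetic operations as any generated sample.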

Generated solutions, from features identical to the dataset (left) to more distinct solutions (right)

Latent Space Walk

This enabled latent space interpolation between inputs as well as between new generations. A latent space can be imagined as a 3D space representing a distribution onto which the generations are mapped. Two points in this distribution represent two generations, and a path between them marks the transition between the two geometries.
By moving along this path we can generate all the geometries between these two points. A spherical linear interpolation between the selected generations is showcased on the right. This is exciting because, with sufficient data, the method offers a way to find the intermediates between any two geometries that have completely different shape, size and topology.
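Spherical linear interpolation between two latent vectors can be sketched in NumPy as follows. The function names are hypothetical, and the latent size of 200 is the one used in the 3DGAN paper, taken here as an assumption about this project's model. Each vector on the resulting path would be passed through the generator to obtain one intermediate geometry.

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical linear interpolation between latent vectors z0 and z1 at fraction t."""
    z0 = np.asarray(z0, dtype=np.float64)
    z1 = np.asarray(z1, dtype=np.float64)
    cos_omega = np.dot(z0, z1) / (np.linalg.norm(z0) * np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))  # angle between the vectors
    if np.isclose(np.sin(omega), 0.0):
        # Parallel or opposite vectors: fall back to plain linear interpolation.
        return (1.0 - t) * z0 + t * z1
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

def latent_walk(z0, z1, steps=8):
    """Evenly spaced latent vectors along the spherical path from z0 to z1."""
    return np.stack([slerp(z0, z1, t) for t in np.linspace(0.0, 1.0, steps)])

rng = np.random.default_rng(0)
z_a = rng.normal(size=200)  # latent vector of the first selected generation
z_b = rng.normal(size=200)  # latent vector of the second selected generation
path = latent_walk(z_a, z_b, steps=8)
print(path.shape)
```

Compared with a straight line between the two points, the spherical path keeps the intermediate vectors at a norm similar to that of typical samples from a Gaussian prior, which is a common reason for preferring it in latent space walks.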

Spherical Linear interpolation between selected solutions

Vector Arithmetic

Vector arithmetic is another way to explore the latent space: we can identify features within the distribution by calculating the mean vectors of geometries that share similar features.

Feature selection

The result of this vector arithmetic can then be used to create new generations with those features. This is an interesting idea, as it allows for intuitive and targeted generation.
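A minimal sketch of this kind of latent vector arithmetic is given below, assuming a feature is captured as the difference between the mean latent vector of geometries that have it and the mean of those that do not. The synthetic latent vectors, the `true_offset` used to build them, and the function names are all hypothetical.

```python
import numpy as np

def feature_direction(with_feature, without_feature):
    """Mean latent vector of samples with a feature minus the mean of those without."""
    return np.mean(with_feature, axis=0) - np.mean(without_feature, axis=0)

def add_feature(z, direction, strength=1.0):
    """Move a latent vector along a feature direction."""
    return z + strength * direction

# Synthetic latent vectors: samples "with the feature" are shifted by a known offset.
rng = np.random.default_rng(0)
latent_dim = 16
true_offset = np.ones(latent_dim)
without_feature = rng.normal(scale=0.1, size=(50, latent_dim))
with_feature = rng.normal(scale=0.1, size=(50, latent_dim)) + true_offset

direction = feature_direction(with_feature, without_feature)
z_new = add_feature(without_feature[0], direction)

cosine = direction @ true_offset / (
    np.linalg.norm(direction) * np.linalg.norm(true_offset)
)
print(round(float(cosine), 3))
```

Averaging over several selected samples is what makes the direction usable: differences unrelated to the chosen feature tend to cancel out in the mean.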

Vector arithmetic between selected features

Use Case I

As the last step of the exploration, we identified a small test dataset of architectural data. We chose buildings designed by ZHA and prepared a dataset of recreated models from SketchUp Warehouse, which we embedded into the latent space to perform interpolations.

Recreated ZHA models as test dataset

Though the outputs of the interpolations were very crude, they still showed signs of promise. We could observe some intermediates between different buildings in the latent space walk. Considering that we used only 8 very distinct models, we believe that with a higher resolution and a large enough dataset, such as what is available to any architectural practice, this method has great value and potential.

Interpolations between selected buildings

Further Steps

An extension of this can be a 2D-3D reconstruction as in the mentioned paper. This method uses a paired dataset, like a 2D DCGAN, and can therefore be made into a 3DCGAN, which has wide application in the field of design.

Potential workflow for architectural design

Conclusion

The combined workflow will enable new opportunities in the field of design. The reconstruction phase can be altered to accept any data that can be embedded in a 2D image format, which will enable us to create generations in a more controlled manner. The result is a tool in which the architect or designer initiates the creation, while the neural network generates solutions well beyond what they could create themselves.

Further improvements in 3D generative neural networks can thus open up new territory in the field of design.

Bibliography


1. MACHINE HALLUCINATIONS-an examination of architecture in a posthuman design ecology –
Matias del Campo – University of Michigan, Sandra Manninger – University of Michigan, Alexa Carlson – University of Michigan, Marianne Sanche – University of Pennsylvania, Leetee Jane Wang – University of Pennsylvania
Paper
2. Neural 3D Mesh Renderer – Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada,
The University of Tokyo, RIKEN
Paper
3. Upload Any Object and Evolve it: Injecting Complex Geometric Patterns into CPPNs for Further Evolution
Jeff Clune, Anthony Chen, Hod Lipson
Paper
4. Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
Jiajun Wu, Chengkai Zhang, Tianfan Xue, William T. Freeman, Joshua B. Tenenbaum
Paper
5. rp2707, Greg K (wasd12345) (2017) “coms4995-project”
Github
6. Smith, Meger (2017) “Improved Adversarial Systems for 3D Object Generation and Reconstruction”
Github
7. Wiegand (2018) “Eine Einführung in Generative Adverserial Network(GAN)”
Github
8. Improved Adversarial Systems for 3D Object Generation and Reconstruction
Paper
9. DECOR-GAN: 3D Shape Detailization by Conditional Refinement
Paper | Github

CREDITS

Architectural Intermediates: An exploration of Neural Networks for 3D Geometry Generation is a project of IAAC, Institute for Advanced Architecture of Catalonia, developed at the Master in Advanced Computation for Architecture & Design in 2020/21 by

Students: Krishnanunni Vijayakumar and Aleksander Mastalski as the final semester thesis project.

Thesis Guide: Oana Taut