DEPTH CITY

Estimating depth using the DPT model

Design Intention

To create 3D space from a 2D image

Method:

Depth estimation is a crucial step toward inferring scene geometry from 2D images. The goal of monocular depth estimation is to predict a depth value for each pixel, given only a single RGB image as input.

WORKFLOW

Model Dataset

Various datasets containing depth information are not compatible in terms of scale and bias. This is due to the diversity of measuring tools, including stereo cameras, laser scanners, and light sensors. MiDaS introduces a new loss function that absorbs these diversities, thereby eliminating compatibility issues and allowing multiple datasets to be used for training simultaneously.
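
As a rough illustration, the alignment at the heart of that loss can be sketched as follows: a least-squares scale and shift map the prediction onto each ground truth before the error is measured. This is a simplified sketch of the idea, not the exact published loss (which operates in disparity space and adds a gradient-matching term); the function name and tensor shapes are our own.

```python
import torch

def scale_shift_invariant_mae(pred, target, mask):
    """Sketch of a scale- and shift-invariant loss in the spirit of MiDaS.

    pred, target, mask: (H, W) tensors; mask marks pixels with valid
    ground truth. A least-squares scale s and shift t align the prediction
    to the target, so depth data with arbitrary scale and bias becomes
    comparable across datasets.
    """
    m = mask.float()
    # Closed-form solution of  min_{s,t}  sum m * (s*pred + t - target)^2
    a00 = (m * pred * pred).sum()
    a01 = (m * pred).sum()
    a11 = m.sum()
    b0 = (m * pred * target).sum()
    b1 = (m * target).sum()
    det = a00 * a11 - a01 * a01
    s = (a11 * b0 - a01 * b1) / det
    t = (a00 * b1 - a01 * b0) / det
    # Mean absolute error over valid pixels after alignment
    return (m * (s * pred + t - target).abs()).sum() / a11
```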

WORKFLOW I

We start from a 2D image and run depth estimation using the DPT model, convert the result into a 3D point cloud, import it into Rhino and Grasshopper, and voxelize it there.

The Model

The following model is based on MiDaS DPT-Hybrid, whose backbone is a Vision Transformer.
To understand this process, recall that various datasets containing depth information are not compatible in terms of scale and bias because of their measuring tools, such as stereo cameras and light sensors.
MiDaS introduces a new loss function that absorbs these diversities, allowing multiple datasets to be used for training simultaneously. In brief, MiDaS is a machine learning model that estimates depth from an arbitrary input image.
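
A minimal sketch of how such a model can be loaded and run through torch.hub (the input filename is a placeholder; note that MiDaS predicts relative inverse depth, not metric depth):

```python
import cv2
import numpy as np
import torch

# Load MiDaS DPT-Hybrid and its matching input transform from torch.hub
midas = torch.hub.load("intel-isl/MiDaS", "DPT_Hybrid")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").dpt_transform

img = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)

with torch.no_grad():
    pred = midas(transform(img))              # (1, H', W') relative inverse depth
    depth = torch.nn.functional.interpolate(  # resize back to the input resolution
        pred.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze().cpu().numpy()

np.save("input_depth.npy", depth)             # raw values, reused for the point cloud
```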

2D RGB IMAGES

Dataset

The first exploration is based on 250 2D RGB images, in other words monocular images, downloaded from Google.


Side-by-side comparison of image and depth map

Then we process the images to obtain the depth estimation and prepare the data to recreate a point cloud.

Here is the depth estimation applied to our dataset, before conversion into a point cloud.

Depth to Point Cloud


Here we can visualize the next step, which recreates a point cloud based on the depth estimation.
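
A sketch of this back-projection under a simple pinhole-camera assumption (the field of view is a guess, since the source photos carry no calibration, and the MiDaS output is inverted to obtain depth-like values):

```python
import numpy as np

def depth_to_point_cloud(depth, fov_deg=60.0):
    """Back-project a depth map into a 3D point cloud (pinhole model)."""
    z = 1.0 / np.maximum(depth, 1e-6)                 # inverse depth -> depth-like
    h, w = z.shape
    f = 0.5 * w / np.tan(0.5 * np.radians(fov_deg))   # focal length in pixels
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - 0.5 * w) * z / f
    y = (v - 0.5 * h) * z / f
    return np.stack([x, -y, z], axis=-1).reshape(-1, 3)

pts = depth_to_point_cloud(np.load("input_depth.npy"))
np.savetxt("cloud.xyz", pts, fmt="%.4f")              # plain-text points for Rhino
```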

Rhino & GH

The process of recreating a point cloud is not straightforward. We scaled the point cloud and used a 64×64 grid to find the closest point per cell, which refines the resolution of the cloud.
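
The Grasshopper definition itself is not reproduced here, but the closest-point idea can be sketched in Python as follows (grid size and plan axes are assumptions):

```python
import numpy as np

def grid_refine(points, n=64):
    """Keep, for each cell of an n x n plan grid, the point nearest the
    cell centre -- a sketch of the refinement step we ran in Grasshopper."""
    lo = points[:, :2].min(axis=0)
    hi = points[:, :2].max(axis=0)
    cell = (hi - lo) / n
    idx = np.minimum(((points[:, :2] - lo) / cell).astype(int), n - 1)
    centres = lo + (idx + 0.5) * cell                 # centre of each point's cell
    d2 = ((points[:, :2] - centres) ** 2).sum(axis=1)
    best = {}
    for i, key in enumerate(map(tuple, idx)):
        if key not in best or d2[i] < d2[best[key]]:
            best[key] = i
    return points[sorted(best.values())]

refined = grid_refine(np.loadtxt("cloud.xyz"))        # at most 64 x 64 points
```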

A visualization of the entire dataset, one step before the resolution post-processing.

The final step of this exploration is to recreate a voxel model from the depth estimation map.
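
Conceptually, the voxelization snaps every point to a regular grid and keeps the unique occupied cells; a minimal sketch (the voxel size is an assumption):

```python
import numpy as np

def voxelize(points, voxel_size=0.5):
    """Quantize points to a regular grid and return the occupied cells.

    Each returned row is the minimum corner of an occupied voxel; in
    Grasshopper a box of edge `voxel_size` is drawn at each position.
    """
    ijk = np.floor(points / voxel_size).astype(int)
    return np.unique(ijk, axis=0) * voxel_size
```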

WORKFLOW II

DEPTH ESTIMATION FOR COMPLEX SPACE TRANSLATION

In this workflow, we started with 2D images of buildings in various styles by famous architects worldwide. We again ran depth estimation using the DPT model, then converted the results into 3D point clouds.

COMPLEX SPACE TRANSLATION

Here is a selection of famous architectural styles

BARCELONA PAVILION, MIES VAN DER ROHE

FALLINGWATER HOUSE, FRANK LLOYD WRIGHT

VILLA SAVOYE, LE CORBUSIER

ROLEX CENTER, SANAA

GUGGENHEIM BILBAO, FRANK GEHRY

GALAXY SOHO, ZAHA HADID ARCHITECTS

COMPLEX SPACE TRANSLATION

WORKFLOW III

DEPTH ESTIMATION FROM 2D STYLE GAN  

2D STYLE GAN Credits: CHARBEL BALISS AND SOPHIE MOORE

In this workflow we follow the same process as the first workflow, except that instead of arbitrary 2D images we use 2D StyleGAN images generated by another model.

2D GAN


MESHING ATTEMPTS

Further Steps

  • Create our own dataset to train a 3D GAN, to recreate and visualize our results in the latent space
  • Refine the depth estimation values for the point cloud
  • Test a model that can create a mesh from the point cloud (see the sketch after this list)
  • Mix 2D GANs with the depth estimation workflow to support the early design stages with more creative and unexpected results
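
One possible route for the meshing step, sketched with Open3D's Poisson surface reconstruction (filenames and parameters are assumptions, not a finalized pipeline):

```python
import numpy as np
import open3d as o3d

pts = np.loadtxt("cloud.xyz")                          # cloud exported in Workflow I
pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pts))

# Poisson reconstruction needs oriented normals
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.5, max_nn=30))

mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=8)
o3d.io.write_triangle_mesh("mesh.obj", mesh)           # importable back into Rhino
```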

 

Depth City is a project of IAAC, Institute for Advanced Architecture of Catalonia developed at MaCAD (Masters in Advanced Computation for Architecture & Design) in 2022 by Salvador Calgua, Pablo Antuña Molina, Jumana Hamdani, and faculty: Oana Taut, Aleksander Mastalski.