As a group of students with no theoretical or practical understanding of AI, we felt an immediate lack of clarity and direction when we were briefed on a studio project about AI in architecture. Not only were the tools and concepts of AI still foreign to us, but the studio also asked us to diverge from our past experience: we were to develop a tool or process to be disseminated to the architecture community rather than a defined project in and of itself. Our initial ideas applied to an urban scale, following the idea that AI in architecture suggests smart cities. At the outset, we wanted to create a tool that would showcase the identities of the city’s inhabitants, making cultural backgrounds visible and therefore celebrated. However, it quickly occurred to us that this would require the involuntary participation of citizens who might not want their identity publicly showcased. This triggered the realization that it is necessary to consider ethics when dealing with AI, the gathering of data, and the disruption that our proposal would cause to a city in which anonymity is a trademark. As a group we engaged in early debates about what culture means and whether we could quantify such an abstract concept through hard-coded data. The naivety of our first approach showed a deep unawareness of how AI functions and what its limitations are; we were allured by StyleGAN’s ethereal aesthetic and thought of implementing it in our project. This led us to question the excessive use of novel technologies and their overall necessity, and pushed us to think more critically about the value of our proposed tools. From early on, our aim was to embrace citizens as creators and editors of the urban morphology. We were inspired by the idea of a tool that generates heatmaps of pedestrian paths within an urban context, and by how these paths and heatmaps could inform design decisions in architectural and urban projects.
Here, the anonymity of each individual is not jeopardized, and citizens become direct agents of design in the future of their city. The underlying aim is to encourage urban interventions that increase pedestrian traffic, and our stated bias is the belief that pedestrian-centric development creates better cities. While keeping to an urban setting, we abandoned the abstract classification of people and their cultures to focus on geo-data and global patterns of pedestrian behaviour, an attempt at a binary project that approaches human behaviour with a factual mindset.


The sharp shift in focus set us on a quest for a tool that is genuinely useful for city planning, yet has a handful of shortcomings. Having settled on our concept, we first searched for recorded pedestrian data in cities around the world. We immediately learnt how difficult it is to access publicly available pedestrian data, and that we were limited to a small list of cities. We chose New York City as our research sample and looked into OpenStreetMap (OSM) and Google Places for available information on pedestrian statistics, which included datasets of people visiting a given amenity and the “street score”. These metrics were accessed through a third party, the Grasshopper plug-in Urbano, which scaled the “street score” and “amenity scores” into a metric called “hits”. It is still opaque to us where exactly this data comes from, since it is not only collected by Google Places in an unpublished fashion but also then distilled by Urbano and scaled according to specific windows of time. We soon realized that this was the extent of the information we would manage to scrape from the web, which left us with a problematic and still lingering thought: we are forced to rely on third-party data whose validity we cannot be certain of, yet which we have to accept in order to move on with our research.

Unlike many fellow metropolises, New York offers a NYCDOT pedestrian dataset as an alternative to OSM/Google Places data. This dataset is publicly available, and the methodology of its collection is completely transparent. For over 10 years, the city of New York has included pedestrian traffic in the mission of its Department of Transportation, and as such it has collected pedestrian data as thoroughly as it has for other modes of transport. New York City relies on the MTA metro system to collect data on daily riders; for pedestrians, however, it deploys city workers twice a year to count the number of people crossing a particular street during the morning and afternoon rush hours at 100 locations throughout all five boroughs. This method of data collection means that local pedestrian traffic, not only commuters, is taken into account. However, we were still limited to pedestrian information for only 100 micro-locations. Naturally, we understood the danger of obtaining a generalized result from a dataset that does not represent the pedestrian behaviour of an entire city. We saw this as an opportunity to apply our recently acquired machine learning skills, and trained a simple ANN regression model to estimate pedestrian counts for other locations that were not covered by NYCDOT.

Our model worked by coupling the data from OSM, Google Places, and NYCDOT. We assigned a constant radius to each of the counting points provided by NYCDOT, and within this radius we collected the number of amenities (restaurants, stores, bars, pubs, etc.) as well as the number of public transport stations. We found a direct correlation between the number of amenities and the number of pedestrians, which held for the majority of the city, with a few exceptions in areas where public transport became scarce. Through this process we could then select any point in the city and ask the model to predict the number of pedestrians that might be present at that location. The results for the more central areas of the city, namely Manhattan, Brooklyn and the more densely populated areas of Queens, were highly accurate. Areas with sparser population and less public transportation, further out on the periphery of the city, began to show signs of error in our algorithm. Although we treated the collected data as factual, which is an erroneous assumption in itself, our role as its interpreters became most involved when we organized it for our ANN regression training. Data became interpretable, contextualized, and therefore mutable. Our assumption was that an amenity, a recordable point of interest according to Google Places or OSM, ought to exist for there to be pedestrian traffic. And although this is mostly true for a large part of the city, it was not true for some areas.
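
The feature-gathering step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the studio's actual pipeline: the function names (`haversine_m`, `count_within`, `features`, `fit_line`), the 400 m default radius, and any coordinates passed in are hypothetical. The ANN itself is not reproduced; an ordinary least-squares line on a single feature stands in for it, purely to show how counts within a radius could be turned into a prediction.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_within(center, points, radius_m):
    """Number of (lat, lon) points lying within radius_m of center."""
    return sum(
        1 for p in points
        if haversine_m(center[0], center[1], p[0], p[1]) <= radius_m
    )


def features(center, amenities, stations, radius_m=400.0):
    """Feature vector for one location: [amenity count, station count].

    The 400 m default is an assumed value, not one taken from the text.
    """
    return [
        count_within(center, amenities, radius_m),
        count_within(center, stations, radius_m),
    ]


def fit_line(xs, ys):
    """Least-squares fit y = intercept + slope * x (a stand-in for the ANN)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

In use, one would compute `features(...)` for each counting location, fit the model against the recorded counts, and then call `features(...)` on any other point in the city to obtain an estimate. A model of this kind can only reproduce the assumption built into its inputs: where no amenity is recorded, it has little basis for predicting pedestrians.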

At this point we had become highly aware of the faultiness of relying solely on second-hand web data, and understood the importance of generating our own dataset through alternative methods, which prompted us to explore data generation through computer simulations that mimic pedestrian behaviour. In our initial trials, we used agent-based simulators such as the PedSim plug-in for Grasshopper 3D to see how we could capture the logic of a pedestrian. These simulators required an entry and an exit gate, and generated pedestrians that walked in a straight line, one after the other, producing eerie images of orderly walking. The pedestrians walked as close as possible to the buildings, did not get distracted by other elements of the city, and were mostly interested in going from point A to point B. Pedestrian behaviour is chaotic and very hard to predict, and we quickly found limitations in trying to predict it using simple simulators. Having evaluated our limitations, we figured that our opportunity lay in trying to bridge the gap between web-scraped information, the results of computed agent simulators, and pre-developed physics-based simulators that use flocking functions for multi-agent interaction. This meant that we would take our second-hand augmented dataset and place it in yet another black box, the pre-computed flock. We found Unity, a game-development platform, to be the most appropriate for our objective. Unity offered the possibility of integrating all of our data, visualizing our pedestrians, and furthermore publishing our tool as a gamified platform accessible to non-expert users.
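
To make the idea of a flocking function more concrete, here is a minimal sketch of one update step in the style of the classic boids rules (separation, alignment, cohesion) combined with a pull towards an exit gate. It is written in Python for brevity rather than as a Unity script, and it is not the pre-computed flock referred to above. The `Agent` class, the `step` function, every weight, the neighbourhood radius, and the 1.4 speed cap are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Agent:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def step(agents, goal, dt=0.1, radius=2.0, max_speed=1.4,
         w_goal=1.0, w_sep=1.5, w_align=0.5, w_coh=0.3):
    """Advance every agent by one time step and return the new list.

    All weights and thresholds are assumed values chosen for illustration.
    """
    updated = []
    for a in agents:
        neighbours = [
            b for b in agents
            if b is not a and math.hypot(b.x - a.x, b.y - a.y) < radius
        ]
        # Goal seeking: unit vector pointing towards the exit gate.
        gx, gy = goal[0] - a.x, goal[1] - a.y
        g = math.hypot(gx, gy) or 1.0
        ax, ay = w_goal * gx / g, w_goal * gy / g
        if neighbours:
            n = len(neighbours)
            # Separation: push away from each neighbour, more strongly when close.
            for b in neighbours:
                dx, dy = a.x - b.x, a.y - b.y
                d = math.hypot(dx, dy) or 1e-6
                ax += w_sep * dx / (d * d)
                ay += w_sep * dy / (d * d)
            # Alignment: steer towards the neighbours' mean velocity.
            ax += w_align * (sum(b.vx for b in neighbours) / n - a.vx)
            ay += w_align * (sum(b.vy for b in neighbours) / n - a.vy)
            # Cohesion: steer towards the neighbours' centre of mass.
            ax += w_coh * (sum(b.x for b in neighbours) / n - a.x)
            ay += w_coh * (sum(b.y for b in neighbours) / n - a.y)
        vx, vy = a.vx + ax * dt, a.vy + ay * dt
        # Cap the speed so that agents cannot accelerate without bound.
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        updated.append(Agent(a.x + vx * dt, a.y + vy * dt, vx, vy))
    return updated
```

Calling `step` repeatedly on a list of agents moves them towards the gate while they react to one another, unlike the single-file walkers of the simpler simulators. Accumulating the visited positions on a grid would be one way to derive a heatmap from such a run.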