In the last room we explored and collected over 500 architectural layouts from three specific cohorts within the building code's evolution. In this room we will experiment with artificial architectural learning on our collected material.
CHAPTER 4 OVERVIEW
Chapter 4: Artificial Architectural Intelligence
“One doesn’t discover new lands without consenting to lose sight, for a very long time, of the shore.”
Having extracted what we must assume to be too few plans to achieve sufficient precision in an evolutionary cause-effect comparison between our three databases, let us instead focus on one cohort. In the previous room I claimed that these historic plans were valuable as pieces in the puzzle of understanding the relationship between updates in regulation and code and their resulting consequences. I have also put forth the need to extract these patterns as an important step toward better securing architectural learning in the future. However, our previous research suggests certain critical roadblocks hindering immediate extraction of these patterns at the needed scale. Assuming such roadblocks are removed in the future, through continued digital innovation and digitisation, let us move further and test a machine's ability to learn architectural representation.
4.1 First Trial: An Interactive Demonstration
Below is an interactive demonstration of the pix2pix algorithm1. The preparation and training were completed in four stages, named Levels 1-4. This demonstration uses an early version of the trained "Level 4" cGAN algorithm. We will address this in more detail in part 4.3. You can also press "random" to explore preconfigured solutions from the Level 3 database. You can test the demo by drawing your own footprint, windows, and an entrance. Choose the colours from the tool section to draw in the input section. Press "process" to have the algorithm generate from your input.
1. The demonstration builds on work originating with Christopher Hesse (2017). Mr. Hesse has also created several other interesting demonstrations using the pix2pix algorithm, including an architectural façade generator demo based on 400 building facades. For more information visit: https://affinelayer.com/pixsrv/
4.2 Conditional Generative Adversarial Networks: A Brief Explanation
The above demonstration is part of an earlier test phase of this chapter's experiment. As we can see, the result is still not satisfactory. This connects to our lack of architectural plans, as addressed at the beginning and end of chapter 3 in our previous room. Here, the cGAN model attempts to generate novel plans based on prepared material from our 2008-2019 database. Before going over the process, let us first briefly look at how the algorithm is structured to perform the image generation.
A GAN model2 uses two neural networks, known as the generative and the discriminative network, positioned to compete against each other. The discriminative network is trained on the dataset of our prepared architectural layouts to gain an understanding of the material. The model then encourages the generative network to continually create new images of its own. The generator continually shows its output to the discriminator, in the hope of tricking the discriminator into believing it is producing something akin to the real plans the discriminator has been trained to recognise. The discriminator's job is thus to determine whether the input it receives comes from the training set or is a fake.
Through a feedback mechanism called backpropagation, the generative network becomes better and better at tricking the discriminator, and the discriminator becomes better at spotting fakes. If all goes well, this back and forth trains the generator to become increasingly capable, making it skilled in the production of new plan drawings. Such plan drawings are novel, yet they embody the patterns of the database material.
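To make the mechanics concrete, here is a deliberately tiny, self-contained sketch of this adversarial loop in Python with NumPy. It is not the pix2pix model used in this chapter: the "plans" are reduced to single numbers drawn around a target value, both networks are reduced to one-line functions, and the backpropagation gradients are written out by hand. All names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# "Real plans" are reduced to single numbers drawn around a target value.
real_mean, batch, lr = 4.0, 64, 0.05

# Generator: x_fake = w_g * z + b_g (maps random noise z to a sample).
w_g, b_g = 1.0, 0.0
# Discriminator: D(x) = sigmoid(a * x + b) (probability that x is real).
a, b = 0.0, 0.0

for step in range(3000):
    # --- Discriminator update: push D(real) -> 1 and D(fake) -> 0 ---
    x_real = rng.normal(real_mean, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = w_g * z + b_g
    d_real = sigmoid(a * x_real + b)
    d_fake = sigmoid(a * x_fake + b)
    # Hand-derived gradients of -[log D(real) + log(1 - D(fake))]
    grad_a = np.mean(-(1 - d_real) * x_real + d_fake * x_fake)
    grad_b = np.mean(-(1 - d_real) + d_fake)
    a -= lr * grad_a
    b -= lr * grad_b

    # --- Generator update: push D(fake) -> 1 (non-saturating loss) ---
    z = rng.normal(0.0, 1.0, batch)
    x_fake = w_g * z + b_g
    d_fake = sigmoid(a * x_fake + b)
    # Backpropagation: the gradient of -log D(fake) flows through the
    # discriminator and back into the generator's parameters.
    dx = -(1 - d_fake) * a
    w_g -= lr * np.mean(dx * z)
    b_g -= lr * np.mean(dx)

# After training, the generator's output should cluster near the real mean.
print(round(b_g, 1))
```

In the actual experiment the two networks are deep convolutional networks operating on plan images, and the framework computes the gradients automatically; the structure of the alternating loop, however, is the same.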
The illustration above shows a simplified version of the configuration of the GAN model. However, we are using a modified version of the GAN model called cGAN.3 “In CGAN labels act as an extension to the latent space z to generate and discriminate images better.”4 (The latent space z is shown as the “random noise” in the illustration above.) In part 4.3 we will see how such labels are prepared and coupled to each plan layout.
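The quoted sentence can be made concrete with a few lines of Python: a class label is one-hot encoded and concatenated onto the noise vector z before it enters the generator. This is a hedged sketch only; the dimensions, the category count, and the category index are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

z_dim = 100      # assumed dimensionality of the latent noise vector z
n_classes = 5    # assumed number of label categories

z = rng.normal(size=z_dim)   # the "random noise" from the illustration

label = 3                    # e.g. the category index of one plan type
one_hot = np.zeros(n_classes)
one_hot[label] = 1.0

# In a cGAN, the label extends the latent space: the generator receives
# the concatenated vector, and the discriminator receives the label
# alongside the image it is judging.
conditioned_input = np.concatenate([z, one_hot])
print(conditioned_input.shape)   # (105,)
```

Note that in the pix2pix variant used in this chapter, the conditioning input is not a class vector but the paired label image itself.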
Importantly, the experiments in this chapter utilise and train the machine learning algorithm first created by Phillip Isola et al.5 through research at the Berkeley AI Research Laboratory. Specifically, we are using an adaptation of this research, written for the open-source machine learning platform TensorFlow6 and created by Christopher Hesse7. To access, adjust and utilise these tools, we will be using the Python programming language.8
2. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (2014), “Generative Adversarial Nets”, University of Montreal
3. Mehdi Mirza, Simon Osindero (2014), “Conditional Generative Adversarial Nets”, University of Montreal
4. Jonathan Hui (2018), GAN — CGAN & InfoGAN
5. Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros (2018), “Image-to-Image Translation with Conditional Adversarial Networks”, Berkeley AI Research Laboratory
6. TensorFlow, developed by Google Brain Team, Visit https://www.tensorflow.org
7. Christopher Hesse, Tensorflow adaptation of Phillip Isola et al., Image-to-Image cGAN, Accessed: April 2020, Available at: https://github.com/affinelayer/pix2pix-tensorflow
8. Python, high-level general-purpose programming language, first developed by Guido van Rossum, More information available at: https://www.python.org
4.3 Data Preparation and Training
The preparation of our database is directly connected to the quality of our result. The 2008-2019 cohort had the greatest number of high-quality plans available and was therefore chosen for testing. This database, and every plan drawing within it, eventually went through four stages of preparation and processing. This process was the result of experimentation, specifically with the amount of standardisation a small database set needs before the machine can start to grasp the underlying patterns.
For every plan, a labelled twin must also be created. With the twin copy and the label, we can choose how the algorithm should "categorise" the different areas of the image. The decision to use a single large label over the entire 3-room apartment was therefore critical. Other work with plan layouts and pix2pix has used more granular labelling very successfully for layout synthesis.9 This is effective, and could afford us a great degree of automation and new working approaches and methods in the very near future.10 For our research, however, extensive labelling would not test the ability to grasp the overall architectural composition. Efforts have therefore continually been made to keep the labelling simple.
Level 1 | Minimal Preparation
Level 2 | Layout Trace
Level 3 | New Label Strategy
Level 4 | New Layout Strategy
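To illustrate what a labelled twin does, here is a minimal Python sketch that converts a colour-coded twin image into a class-index map, the form such a pipeline typically consumes. The palette colours and category names are assumptions for illustration, not the exact scheme used in these four levels.

```python
import numpy as np

# Hypothetical colour scheme for the label twin (the RGB values are
# assumptions, not the exact colours used in the experiment).
PALETTE = {
    (255, 255, 255): 0,  # background
    (200, 200, 200): 1,  # apartment footprint (the single large label)
    (0, 0, 255): 2,      # windows
    (255, 0, 0): 3,      # entrance
}

def label_twin_to_classes(twin):
    """Convert an (H, W, 3) RGB label image into an (H, W) class-index map."""
    h, w, _ = twin.shape
    classes = np.zeros((h, w), dtype=np.int64)
    for colour, idx in PALETTE.items():
        mask = np.all(twin == np.array(colour, dtype=twin.dtype), axis=-1)
        classes[mask] = idx
    return classes

# Tiny 1x4 example image: background, footprint, window, entrance.
twin = np.array([[[255, 255, 255], [200, 200, 200], [0, 0, 255], [255, 0, 0]]],
                dtype=np.uint8)
print(label_twin_to_classes(twin))  # [[0 1 2 3]]
```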
In the end, across all four preparatory stages, 1,500 apartment layouts were produced, including the twin copies. Given the time available, each plan's scale was kept relative to itself. The work was carried out with great precision, keeping each plan true to its accurate and original representation. This was highly prioritised, particularly through the second-level layout trace, but also throughout each successive alteration.
All four preparatory level sets were run through training sessions of 50 to 1,200 epochs, attempting to balance the training and improve the generator output. ”An epoch is defined as one cycle through a training dataset, where the samples in a training dataset are used to update the model weights in mini-batches.”11
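The quoted definition translates into simple arithmetic. A hedged example, using our Level 4 count of 125 plans and an assumed mini-batch size of 8:

```python
import math

# Our Level 4 set: 125 plan layouts (counting only the inputs, not twins).
dataset_size = 125
batch_size = 8          # assumed mini-batch size, for illustration only

# One epoch = one full pass over the training set, processed in mini-batches.
steps_per_epoch = math.ceil(dataset_size / batch_size)
print(steps_per_epoch)  # 16 weight updates per epoch

# At the extremes of the 50-1,200 epoch range used across the level sets:
print(50 * steps_per_epoch, 1200 * steps_per_epoch)  # 800 and 19200 updates
```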
The quality of the output was evaluated based on the coherence achieved with the traits and general characteristics found in the original training plan layouts. Such a comparison is made possible by feeding a precise copy of the twin label back into the cGAN. We would then inspect whether something akin to the original plan, connected to the original label, returns in the newly generated plan. Given the low number of plans, none of the levels produced sufficient detail to warrant further exploration into the interesting topic of validation measurements and quality. This is certainly a topic in need of much exploration in future research. Indeed, our precise formulations and explanations to the computers are central challenges that we will address later. For now, when evaluating the artificial learning and training procedure, the following aspects of the algorithm's output were given focus:
Program – Connectivity – Circulation – Proportion – Orientation
Program: Ability to produce coherent 3-room apartment programs. Example: A coherent 3-room apartment would, relative to our cohort, contain 2 bedrooms. A non-coherent 3-room apartment, relative to the cohort, could for example have 1 or 3 bedrooms, or the bedrooms could lack windows. Alternatively, it could be missing a bathroom, etc.
Connectivity: Ability to compose relative coherence with regards to adjacency between different types of spaces. Example: A meaningful connection, relative to the cohort, could be a hallway placed adjacent to a main entrance and a bathroom. A non-meaningful connection could be a hallway placed in a disconnected fashion, adjacent to nothing but the kitchen, leaving its function severely limited and seemingly misunderstood.
Circulation: Ability to produce circulation within the apartment program. Circulation shapes the movement-patterns of the users. A measurement for evaluation would therefore be to what extent the algorithm can grasp the pattern of circulation, relative to the cohort, and produce openings and overall circulation within the apartment.
Proportions: Ability to utilise footprint to organise room proportions meaningfully. Example: A meaningfully proportioned bedroom, relative to the cohort, will usually have a length and width that allows the placement of some sort of bed, with a clearance for entering the room. A non-meaningful utilisation of the footprint, relative to the cohort, could be an excessively enlarged bathroom, taking up substantial amounts (as an example >30%) of the total footprint.
Orientation: Ability to orient walls and space with relative coherence to the cohort. Example: No plans within the cohort had curved walls. Therefore, the ability to produce spaces and walls, relative to the characteristic of the cohort, could serve as an evaluation criterion.
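The evaluation described above was qualitative: the twin label is fed back in, and the generated plan is inspected against the five criteria. Purely as an illustration of what future validation measurements might look like, here is a Python sketch of two crude quantitative proxies: a per-pixel distance for the label round-trip, and the footprint-fraction check mentioned under Proportions. Both functions and the 30% threshold are assumptions for illustration, not measures used in this experiment.

```python
import numpy as np

def mean_l1(original, generated):
    """Mean per-pixel L1 distance between an original plan image and the
    plan generated when its twin label is fed back into the cGAN.
    Both arguments are (H, W, C) arrays scaled to the range [0, 1]."""
    return float(np.mean(np.abs(original.astype(np.float64)
                                - generated.astype(np.float64))))

def area_fraction(class_map, room_class):
    """Share of the apartment footprint occupied by one room type, given an
    (H, W) map of class indices where 0 marks pixels outside the footprint."""
    footprint = class_map != 0
    return float(np.sum(class_map[footprint] == room_class)
                 / np.sum(footprint))

# Round-trip proxy on toy 2x2 "plans": one of the 2*2*3 = 12 values
# differs by 0.5, so the mean distance is 0.5 / 12.
orig = np.zeros((2, 2, 3))
gen = orig.copy()
gen[0, 0, 0] = 0.5
print(mean_l1(orig, gen))            # 0.041666...

# Proportion check: the bathroom (class 2) covers 4 of 9 footprint pixels,
# i.e. about 44% -- above the illustrative 30% threshold from the text.
plan = np.array([[0, 1, 1, 1],
                 [0, 2, 2, 1],
                 [0, 2, 2, 1]])
print(area_fraction(plan, 2) > 0.30)  # True
```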
9. Example: Weixin Huang, Hao Zheng (2018), “Architectural Drawings Recognition and Generation through Machine Learning”, Tsinghua University, University of Pennsylvania
10. Stanislas Chaillou (2019), 3-stack plan creation pipeline, “AI+Architecture: Towards a New Approach”, Harvard Graduate School of Design
11. Jason Brownlee (2019), “How to Code the GAN Training Algorithm and Loss Functions”, Accessed: March 2020, Available at: https://machinelearningmastery.com/how-to-code-the-generative-adversarial-network-training-algorithm-and-loss-functions/
4.4 Training Results: Level 3 and Level 4

The first two levels (Level 1 and Level 2) did not produce coherent plan-like outputs. We will therefore focus on Level 3 and Level 4. Below is an animation of a training session from Level 3. As we can see, the labelled parts and the structural perimeter are easily captured through training with our small dataset. However, the interior architectural composition of programmatically rational walls, kitchen, and bathroom layouts does not appear in the result.
The illustration below shows three different attempts at adjusting and improving the generator output. Even though we are approaching representations of plans, we are still not meaningfully close to reproducing the characteristics and qualities of program, connectivity, circulation, proportions, or orientation.
After the unsuccessful attempt at grasping architectural representation at Level 3, the decision was made to run one more database experiment. The animation below shows a training run of the preparatory work within Level 4. This dataset contained only 125 prepared architectural plan layouts, or a total of 250 drawings if we include their twin labelled counterparts. From the animation below, however, we slowly start to see a much greater convergence towards a logical architectural plan layout.
In the final training sessions of Level 4, the development and evolution seen through illustrations 1-9 seem indicative of an increased ability to pinpoint the logic of architectural composition. From the 9th illustration below, we see convergence towards a coherent 3-room program. In finer detail, it includes continual small strides towards meaningful circulation, connectivity, orientation, and possibly proportion. Clearly, it is still far from the precision and quality we would have hoped to attain, but the tests, after training on only a very small 125-plan-layout dataset, quite possibly show a clear path of progress.
4.5 Speculative Hypothesis
The final result, along with other similar international projects,10,12 seems to suggest both the existence of, and the opportunity to find, unique patterns within sets of architectural representation. So far, we have not been able to test more than one cohort. However, the assumption that each cohort is unique in itself seems highly reasonable. We know that the building code is continually updated, and that each update leads to some uncertain change in the way we construct buildings from that point on. We also know that the inherent results can be represented through geometric patterns, which have been explored in this thesis through pixels on our two-dimensional screen. Therefore, I would like to suggest the following speculative13 hypothesis:
• We have a continually growing set of distinct periods in time that can be defined by their specific set of building codes. We can call these periods in the building code (p).
• We also have the output from construction projects – the entirety of buildings produced – within a period (p). We can call this complete output, and the set of all project information, (a).
• Thereafter, we have projects ready for learning: the available projects and their corresponding plan layouts, sections, and so on. This is the available data from the output (a), and we can call it (n).
• From the research done so far, we should therefore expect, when applying learning analysis to (n), that (n) converges towards a specific unifying pattern or code as it approaches (a).
• We may then call the continually converging, and increasingly precise, result an approximation of the building DNA, with respect to (p).
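The bullets above can be condensed into notation (a sketch only, reusing the thesis' own symbols (p), (a) and (n); the mapping f stands for whatever pattern a learning algorithm extracts):

```latex
% For a building-code period p, let a(p) be the complete built output of
% that period, and let n \subseteq a(p) be the portion available for
% learning. Writing f(n) for the pattern extracted from n, the hypothesis
% reads:
f(n) \;\longrightarrow\; \mathrm{bDNA}(p)
\qquad \text{as} \qquad n \to a(p).
```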
12. See chapter three: notes 1, 4, 5, and 6.
13. For the arguments on the benefits of, the need for, and the innovations generated by speculation, I refer to the work of Karl Popper, "The Logic of Scientific Discovery", Routledge Classics print 2002, ISBN: 978041527843.
4.6 Beyond Experimental Results
Our experiment leaves something to be desired. The result, however, even if not the one we hoped for, is still valuable, especially with respect to the learning outcome of the experiment at large. Regarding the particular machine learning network that was applied, a larger number of plans is clearly needed for the machine to grasp the inherent pattern satisfactorily. As previously mentioned, another similar study, from Harvard in 2019, had time to generate a dataset with over 700 plan layouts. Consequently, it produced far better results, showing a more precise geometric understanding of the concept of an architectural layout.
However, what seems likely, and is possibly a core finding we can extract from the previous experiment, is the possibility of successfully utilising a computer's ability to synthesise learning over a large amount of architectural data and information. It is another step in the direction of proving the feasibility of having AI grasp the underlying pattern of our work – a proof of concept for the ideas to follow.
It seems, then, that we have also identified a somewhat familiar bottleneck for successful learning in the previous experiment: the number of plans, and the time needed to prepare them for training. The preparation for training is an important subject in itself. It becomes very clear that in order to properly activate our past, particularly at the scale that is likely needed, we have to initiate and put forth an official AEC AI-Standard.
4.7 An Official AEC AI-Standard
We have a finding that keeps repeating: the lack of an official AI-standard. This appears to be a necessary stepping stone for activating the memories of the past in our architectural learning.
How could we then build a path into the future leading to the creation of such a standard? What characteristics and functions would our standard need? What would a successful project for producing, distributing, and maintaining such a standard look like? Many more questions could be asked, but for now let us address some of them.
Projects and efforts proximate to our standard have long existed, namely the idea of capturing building information in a digital format. The concept of BIM has been around since the 1970s14, and today we are accustomed to directly manipulating drawings and building information digitally. Platforms such as Revit and ArchiCAD are but a few front-end options utilising a large range of the BIM concept. Substantial work has also gone into the project of creating the interoperable IFC standard.
Initiatives such as the International Alliance for Interoperability, and organisations such as Standards Norway and buildingSMART, have also long worked towards less friction and more efficiency in our digital transitions. buildingSMART is a large actor within this space and works towards better standards in the building industry. However, work specifically targeted at achieving a large-scale official working standard for utilising AI currently seems limited.15 Actors and products, such as dRofus, are varyingly active in the AEC space of AI standardisation. However, our unique large-scale challenge likely requires a national or similarly authoritative initiative to successfully drive the project through to success.
Let us also briefly look at plausible success factors for an official AI-standard. This thesis has experimented with a single machine learning model from 2018, itself building on a different research paper from 2014. Looking beyond this narrow scope, an astounding number of new papers arrive every day, and with them new opportunities. One could therefore reasonably argue that a defining design characteristic, and a key success factor for our new working standard, is its ability to adapt to the changing landscape of innovation within the artificial intelligence field. The project is therefore multi-disciplinary and rooted with many different stakeholders; after its inception, it would likely be converted to a continual operation providing further development and maintenance. This should allow for flexible changes to the entire digital library whenever needed. Designing with this adaptability will likely be the only sure way of having a digitally active data and memory bank that stays both relevant and easily available for use. Certainly, further research is needed. Presumably, we may also find helpful inspiration and already established solutions within a variety of other industries and international projects.
As we move on from chapter 4, we will for now assume the claim that this specialised and official AI-standard is needed. In the next chapter, we will ask "what if", and explore what we could do given such a standard. We will also explore how it could allow us to answer the initial question of restructuring the building code to safeguard architectural quality.
14. Charles Eastman et al. (1974), An Outline of the Building Description System, Research Report No. 50, Carnegie-Mellon University, Institute of Physical Planning
15. Note: Oracle Construction and Engineering joining buildingSMART International, 2019, Article, Accessed: April 2020, Available at: https://www.buildingsmart.org/oracle-joins-as-a-strategic-member-of-buildingsmart-international/
• The 2008-2019 TEK10+TEK17 database was prepared for training. Over 1,500 separate layout drawings were created through four successive preparatory levels.
• Through each level, the machine learning algorithm was trained on the new database. At Level 3, interesting geometric representations started to appear. At Level 4, we began seeing logical compositions of apartment layouts.
• We have established a speculative hypothesis. It suggests the existence of a building DNA (bDNA) for each meaningfully different building code period.
• A crucial stepping stone to activating a profound and large-scale architectural learning is seemingly the initiation of an official AEC AI-Standard.
“Like the standing wave in front of a rock in a fast-moving stream, a city is a pattern in time.”