Thorpe Lane Lofts (BIM dataset)
 Overall floor plans showing apartment unit schemes
 Unit types and floor plans, which yielded the extensive dataset used in this project.
Dynamo Data Exports
  A more advanced approach to exporting apartment unit data is to extract the information from a series of PDF documents containing the unit floor plans (printed from either CAD or BIM). The PDFs are first filtered with binary thresholding, which removes most of the symbols and some of the text. Additional filtering extracts and segments the doors, and a text recognition filter removes the rest of the text. The remaining linework is converted into Hough lines, the model is tuned to improve accuracy, and the lines are stored. Another connected component analysis is then used to identify room and unit shapes, which are classified using the room and unit names segmented earlier. Additional processing yields a dataset similar to the one extracted from the BIM model.
  However, there are more direct ways to convert PDF floor plan data into a BIM model, which are explored further in the Kasita Site Plan Generation Tool post.
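  As a rough illustration of two of the steps above (binary thresholding and connected component analysis), here is a minimal pure-Python sketch. It is not the pipeline used in this project: the toy grid `PLAN`, the `cutoff` value and the function names are all assumptions made for the example, and the door segmentation, text recognition and Hough line steps are left out entirely.

```python
from collections import deque

# Toy grayscale "scan" (hypothetical data): 0 = dark wall ink,
# 180 = faint symbol, 255 = blank paper.
PLAN = [
    [0,   0,   0, 0,   0,   0, 0],
    [0, 255, 255, 0, 255, 180, 0],
    [0, 255, 255, 0, 255, 255, 0],
    [0,   0,   0, 0,   0,   0, 0],
]

def threshold(gray, cutoff=128):
    """Binary threshold: 1 where a pixel is darker than cutoff, else 0."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]

def label_components(binary, target=0):
    """Label 4-connected regions of target-valued pixels.

    Returns (labels, count); labels is 0 wherever the pixel is not target.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != target or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == target
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

walls = threshold(PLAN)                      # the faint symbol drops out here
room_labels, room_count = label_components(walls)
print(room_count, "enclosed regions found")
```

  In this toy case the faint symbol falls below the ink threshold and disappears, and each region enclosed by wall pixels gets its own label, which is the kind of shape a later step could pair with a room name.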
  Additional processing was needed before the data could be analyzed. The coordinate data for several of the units was reoriented so that all unit types shared the same direction. Each unit was then recentered around the main project origin and its centroid recalculated. Certain fields stored the x and y coordinate values for the start and end points of each recorded line separately, so I formed tuples and segments from them and added the new fields to the tables.
  I also calculated the percentage of space each room occupies within its unit and the number of wall segments each room had, and added these fields to the tables.
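  The preprocessing described above could look something like the following sketch. The field names (`x1`, `y1`, `x2`, `y2`), the sample coordinates and the helper names are hypothetical, and the area-weighted centroid and shoelace area are assumed here as reasonable choices rather than taken from the original tables.

```python
def to_segments(rows):
    """Combine separate x/y fields into ((x1, y1), (x2, y2)) segments."""
    return [((r["x1"], r["y1"]), (r["x2"], r["y2"])) for r in rows]

def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def centroid(points):
    """Area-weighted centroid of a simple polygon."""
    a = cx = cy = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        cross = x1 * y2 - x2 * y1
        a += cross                      # a accumulates twice the signed area
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return (cx / (3.0 * a), cy / (3.0 * a))

def recenter(points, origin=(0.0, 0.0)):
    """Translate a boundary so its centroid lands on origin."""
    cx, cy = centroid(points)
    dx, dy = origin[0] - cx, origin[1] - cy
    return [(x + dx, y + dy) for x, y in points]

# Hypothetical 10 x 8 unit containing one 4 x 5 room.
unit = [(20, 30), (30, 30), (30, 38), (20, 38)]
room = [(20, 30), (24, 30), (24, 35), (20, 35)]

unit_centered = recenter(unit)
room_share = 100.0 * polygon_area(room) / polygon_area(unit)
wall_rows = [{"x1": 20, "y1": 30, "x2": 24, "y2": 30}]
print(unit_centered, room_share, to_segments(wall_rows))
```

  The number of wall segments for a room is then just the length of its segment list.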
  From there, I plotted the points to ensure the new tuple data was preprocessed correctly.
  The next step was to create shape descriptors from the plotted boundaries for both the unit types and each room. Each descriptor is built by measuring the distances from the centroid of a plotted shape to the points where lines radiating from the centroid intersect the boundary; the resulting sequence of distances is the descriptor. This process was also visualized to verify that it was programmed correctly.
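  A minimal sketch of such a radial descriptor is shown below. The number of rays, the function name and the choice to keep the farthest intersection along each ray (which matters only for non-convex shapes) are assumptions for illustration, not details confirmed by the project.

```python
import math

def radial_descriptor(points, center, n_rays=8, eps=1e-9):
    """Distances from center to the boundary along n_rays evenly spaced rays.

    points is a closed polygon given as a list of (x, y) vertices.
    Assumption: if a ray crosses the boundary more than once, the
    farthest crossing is kept. A ray with no crossing yields 0.0.
    """
    ox, oy = center
    descriptor = []
    for k in range(n_rays):
        theta = 2.0 * math.pi * k / n_rays
        dx, dy = math.cos(theta), math.sin(theta)
        farthest = 0.0
        for i in range(len(points)):
            ax, ay = points[i]
            bx, by = points[(i + 1) % len(points)]
            ex, ey = bx - ax, by - ay          # edge direction
            wx, wy = ax - ox, ay - oy          # center -> edge start
            denom = dx * ey - dy * ex
            if abs(denom) < eps:               # ray parallel to this edge
                continue
            t = (wx * ey - wy * ex) / denom    # distance along the ray
            u = (wx * dy - wy * dx) / denom    # position along the edge
            if t >= 0.0 and -eps <= u <= 1.0 + eps:
                farthest = max(farthest, t)
        descriptor.append(farthest)
    return descriptor

# A 2 x 2 square centered on the origin.
SQUARE = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
print(radial_descriptor(SQUARE, (0.0, 0.0), n_rays=8))
```

  For the square, rays toward the edge midpoints return the half-width and rays toward the corners return the half-diagonal, so the descriptor alternates between the two values. Plotting that sequence as a line gives the kind of curve that can be compared across unit types.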
 Unit plots.
  To aid the eventual classification of new unit layouts, I graph-matched the existing shape descriptors (visually, on a line histogram). I am currently working to add room adjacency as another field for reference.
 The shape descriptors for each room were also measured and compared.
  The remaining data was run through a principal component analysis to identify prominent features. When matching graphs, each shape descriptor was scored on a similarity index against the other descriptors in its category, which made classifying new lines possible. The combined PCA and similarity indexes were weighted according to model validity.
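  To make the scoring idea concrete, here is a hedged sketch of a similarity index blended with a second score. The similarity measure (one over one plus Euclidean distance), the weights `w_sim` and `w_pca`, the unit type labels and the placeholder PCA scores are all assumptions for illustration; the post does not specify how the actual indexes were computed or weighted.

```python
import math

def similarity(a, b):
    """Map the Euclidean distance between two descriptors into (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))

def similarity_index(candidate, references):
    """Score candidate against every labelled reference descriptor."""
    return {label: similarity(candidate, ref) for label, ref in references.items()}

def combine(sim_scores, pca_scores, w_sim=0.6, w_pca=0.4):
    """Weighted blend of two score tables (the weights are hypothetical)."""
    return {k: w_sim * sim_scores[k] + w_pca * pca_scores[k] for k in sim_scores}

def best_match(scores):
    """Label with the highest combined score."""
    return max(scores, key=scores.get)

# Hypothetical reference descriptors for two unit types.
references = {
    "Type A": [1.0, 1.0, 1.0, 1.0],
    "Type B": [1.0, 2.0, 1.0, 2.0],
}
new_layout = [1.0, 1.9, 1.0, 2.1]

sim = similarity_index(new_layout, references)
pca = {"Type A": 0.5, "Type B": 0.5}   # placeholder PCA-derived scores
print(best_match(combine(sim, pca)))
```

  With equal placeholder PCA scores the ranking is driven by the similarity index alone; in practice the weights would be set from how well each component validates.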
  I tested three new unit layouts by drawing their outer boundaries and extracting this information in Dynamo. The new weighted machine learning model predicted which unit types the new schemes would be most similar to. It also predicted the shapes and placement of the new rooms, which wouldn’t necessarily match the assigned unit type if the new boundary edges deviate significantly in some places.
 Results of shape descriptor analysis. 
 Newly generated rooms.
  Next steps: The new room boundaries are being preprocessed so that when Dynamo imports the numbers, the 3D BIM walls will be centered on the boundary lines. There is also a relatively simple process to ensure the rooms are labeled correctly in BIM.
  Eventually, I aim to explore the furniture, cabinet, fixture and door layouts for each unit using methods similar to those outlined above.
  Additionally, I am currently expanding the PDF extraction method so that it arrives at conclusions similar to those of the BIM extraction method(s). The goal is then to package a pipeline that takes PDFs and generates new schemes in Revit.