Image-based Modeling of Haussmannian Facades

This paper describes techniques and algorithms for Haussmannian façade modeling. Although buildings are complex artificial objects that are difficult to interpret, Haussmannian buildings have a more consistent typology and compositional rhythm. By incorporating architectural knowledge of the Haussmannian façade into an image analysis process, façade structure information can be inferred automatically. Moreover, to further refine the façade analysis, an image synthesis process is integrated, creating a feedback loop that produces more stable results. With this methodology, a solid technique and process for image-based modeling of Haussmannian or similar building façades has been established.

IGN is responsible for the 3D reconstruction of building roofs from ground and satellite images, as well as for the planimetry that supports the geo-referencing of these buildings. In addition, IGN provides BATI3D, a 3D polyhedral representation of the building bounding boxes delimited by the roofs and the planimetry. Each building is thus bounded by a set of vertical faces which, most of the time, correspond to the facades of the building (or are at least parallel to the real facades). These transparent faces have to be replaced by real facades obtained from images taken from the ground by cars or other sensors. Therefore, 3D reconstruction of facades from images is crucial.
We decided to start with the analysis of Haussmannian buildings, as they are the most widespread in Paris. Compared to buildings of other styles, such as Art Deco or Art Nouveau, Haussmannian buildings present rather regular and consistent elements that are better suited to automated analysis. For example, they consist of multiple highly similar floors, with significant repetition of architectural elements and well-defined dimensions constrained by the street width, and so exhibit a high degree of consistency. These characteristics meet the requirements of both construction laws and aesthetic perception. Even though these buildings are less complex, previous methods failed on them.
Our analysis process starts from single façade images. Although we could handle several images of the same façade, the aim is to explore the maximum we can achieve from a single one, and therefore to limit the analysis to a 2D approach that could later be extended to 3D.
Meanwhile, our main goal in this modeling effort is to derive a 3D model from a single façade image that contains multiple colors, materials, shapes and textures, is saturated with significant reflections, and is partially occluded by public facilities and trees. Hence, single features are not sufficient as cues for image recognition and subsequent segmentation. Consequently, we use a hybrid, multiple-cue approach: the composition of colors, shapes and textures forms a context for the recognition of building elements, much like the interpretation of human languages. Our image-based modeling of Haussmannian facades makes three major contributions: (1) an image analysis-synthesis loop for better reconstruction; (2) a joint color and edge profile-based method for building typology determination; (3) a hybrid method for architectural element recognition.
The remainder of this paper is organized as follows. Section II reviews related topics and work in this field. Section III describes the techniques and process established for façade structure determination and architectural element recognition. Section IV presents our future work on full 3D façade reconstruction. Results are given in Section V and the conclusion is drawn in Section VI.

Chun Liu and André Gagalowicz
Project MIRAGES, INRIA Paris-Rocquencourt

II. RELATED WORK
Because building typologies vary enormously, a wide variety of analysis methods has been proposed. Generally, there are two approaches that follow different philosophies and procedures but share the assumption that windows are the key elements for façade interpretation.

Implicit Window Models
Methods in the first category are designed to recognize windows first, by incorporating implicit window models [4,5,7]. Jan Cech and Radim Sara [7] developed a window detection algorithm that uses machine learning with a window image database. This direction seems promising, except that establishing and customizing window databases is problematic. In the robotics domain, there is similar work on detecting rectangular objects in indoor environments, which assumes that the objects are rectangular, consistent in color, and maintain a certain width-to-height ratio.

Explicit Window Models
In the second category, window models are given more explicitly. In one of them, windows are considered to be dark regions on the façade plane compared to the façade walls, so region-based segmentation is useful for window detection. Alternatively, horizontal and vertical profiling [3] of the gray image can reveal the repetition pattern of windows, because windows should produce pixel values very different from those of walls, so that the façade typology can be inferred in terms of floors and tiles.
Another assumption is that, in simple facades, window frames are rectangles that produce significant edges against the background wall. As a consequence, edge profiling can also be a good solution for segmentation based on window recognition. Particularly for modern box-like office buildings, the repetition rhythm of windows can be obtained by analyzing similarities, as in Pascal Muller's paper [2], or by analyzing the periodicity with the FFT.
However, these assumptions about windows do not always hold in the real, complex situations we encounter every day in cities. Complex man-made objects such as Haussmannian buildings, or any other type of residential building, exhibit too many edges and too many luminance, color and texture variations.

Color Based Window Recognition
With the rapid progress of digital cameras, high-resolution color images can be obtained easily. Such color images offer more information, and therefore more cues for recognition, classification and segmentation, provided the lighting is sufficient. With interaction from a human operator, semi-automatic window and façade recognition is quite successful.

Overview
In our approach, two steps are performed in sequence. First, the facade typology is inferred by segmenting the facade vertically into floors and horizontally into tiles. Second, the different architectural elements are detected using different image descriptors. Note that a single, pre-rectified facade image is used in our context.

Determination of Facade Typology
Our assumption here is that windows always reflect or display colors that differ from the dark background facade wall, either the blue of the sky or the various colors of surrounding objects. This can be perceived in the hue image, which shows the global color contrast. By checking for the occurrence of windows without recognizing them, the facade typology can be determined as a floor and tile segmentation. We also use the architectural knowledge that windows in Haussmannian buildings always extend down to the bottom of the floor. Because hue distinguishes windows from walls, it highlights the typology of the facade and suppresses other details that could harm the global analysis. The hue information is projected onto the X and Y axes to form horizontal and vertical profiles respectively, which facilitate floor and tile segmentation. In this way, we reduce a 2D segmentation problem to 1D problems, which are much easier and more efficient in terms of processing time. A good analogy is a binary image with a checkerboard pattern. Projecting this binary image along the X and Y axes generates two binary signals, with 1 corresponding to white blocks and 0 to black ones. From them, we can easily tell the sizes and locations of the different blocks. Together, these two pieces of information indicate the typology of the patterned image.
Furthermore, in Haussmannian buildings, windows are always repeated and well aligned horizontally and vertically. This rigid structure greatly helps the segmentation.
The following subsections describe the methods and procedures used for vertical floor segmentation and horizontal tile segmentation.
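The projection analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "checkerboard" is taken to be a regular grid of white blocks separated by black gaps, as windows on a wall would be (a strictly alternating checkerboard would give flat profiles), and the helper names `make_grid_image`, `projection_profiles`, `binarize` and `runs` are hypothetical.

```python
def make_grid_image(n_rows, n_cols, block_h, block_w, gap):
    """Binary image with white (1) blocks on a black (0) background."""
    height = gap + n_rows * (block_h + gap)
    width = gap + n_cols * (block_w + gap)
    img = [[0] * width for _ in range(height)]
    for r in range(n_rows):
        for c in range(n_cols):
            y0 = gap + r * (block_h + gap)
            x0 = gap + c * (block_w + gap)
            for y in range(y0, y0 + block_h):
                for x in range(x0, x0 + block_w):
                    img[y][x] = 1
    return img


def projection_profiles(image):
    """Return (x_profile, y_profile): column sums and row sums."""
    y_profile = [sum(row) for row in image]
    x_profile = [sum(col) for col in zip(*image)]
    return x_profile, y_profile


def binarize(profile):
    """1 where the profile is non-zero, 0 elsewhere."""
    return [1 if v > 0 else 0 for v in profile]


def runs(signal, level):
    """List of (start, length) for each run of samples equal to `level`."""
    out, start = [], None
    for i, v in enumerate(list(signal) + [None]):  # None closes a trailing run
        if v == level and start is None:
            start = i
        elif v != level and start is not None:
            out.append((start, i - start))
            start = None
    return out


img = make_grid_image(n_rows=2, n_cols=3, block_h=3, block_w=4, gap=2)
x_profile, y_profile = projection_profiles(img)
print(runs(binarize(x_profile), 1))  # block columns: positions and widths
print(runs(binarize(y_profile), 1))  # block rows: positions and heights
```

The two lists of runs recover the block positions and sizes along each axis, which is the information the 1D profiles are meant to carry for a real hue image.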

Floor Segmentation
As the first step, the hue image is projected onto the Y axis; the vertical color changes are highlighted in the resulting profile signal, which should ideally be a square wave. However, this would only be possible in the absence of decorations. Since such elements do exist in facade images, the profile signal can be significantly corrupted in various ways, mostly near the top of a window, where decorations are located. As a consequence, when the square-wave-like profile is scanned from top to bottom, the rising edges cannot be trusted. We therefore use the falling edges to segment floors, as they are more reliable in most circumstances.
Before falling edges are taken as floor borders, the profile signal must be processed to minimize the influence of two kinds of distortion. The first is caused by luminance variations. Due to the scale of the facade, the luminance varies significantly along the facade plane. At the top, the roof is very close to the sky and reflects an enormous amount of light, while the bottom is somewhat darker because many occlusions occur there. The luminance thus declines from top to bottom and, as a result, the color contrast also shrinks from the roof to the ground floor. Hence, a significant drifting line can be perceived in the profile signal. The second distortion is pulse noise caused by the presence of many small objects other than windows on the facade.
To cope with these distortions, we use several signal processing techniques in 1D rather than 2D, for the sake of simplicity. For the first distortion, we normalize the profile by dividing it by a luminance line estimated from a smoothed version of the profile signal itself. The second distortion is minimized by filtering the profile signal with a customized 1D morphological operation, which suppresses both positive and negative pulse noise. Morphological filtering is used because we do not know the noise distribution, but we do know the extent of the pulse noise from the size of the small objects on the facade. With these two treatments, the profile signal is normalized against the luminance degradation and cleaned of pulse noise.
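The two treatments can be illustrated with a minimal sketch. It rests on assumptions that are not in the paper: the luminance line is estimated here with a plain centered moving average, the structuring element is a flat segment of odd length, and all function names are hypothetical.

```python
def _slide(signal, size, op):
    """Apply `op` (min or max) over a centered window clipped at the borders."""
    half = size // 2
    n = len(signal)
    return [op(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def erode(signal, size):
    return _slide(signal, size, min)


def dilate(signal, size):
    return _slide(signal, size, max)


def opening(signal, size):
    """Erosion then dilation: removes positive pulses narrower than `size`."""
    return dilate(erode(signal, size), size)


def closing(signal, size):
    """Dilation then erosion: fills negative pulses narrower than `size`."""
    return erode(dilate(signal, size), size)


def pulse_filter(signal, size):
    """Average of the opening-closing and closing-opening cascades."""
    oc = closing(opening(signal, size), size)
    co = opening(closing(signal, size), size)
    return [(a + b) / 2 for a, b in zip(oc, co)]


def normalize_luminance(profile, width, eps=1e-9):
    """Divide the profile by its moving average (assumed luminance estimate)."""
    half = width // 2
    n = len(profile)
    out = []
    for i, p in enumerate(profile):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        trend = sum(profile[lo:hi]) / (hi - lo)
        out.append(p / max(trend, eps))
    return out


# A square wave with one positive and one negative pulse.
noisy = [0] * 10 + [1] * 10 + [0] * 10
noisy[4] = 1    # small bright object on the wall
noisy[15] = 0   # small dark object inside a window
print(pulse_filter(noisy, 3))
```

The structuring element length plays the role of the known extent of the small objects: pulses narrower than it are removed, while the wider window plateaus are preserved.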
At this stage, we are able to segment floors using the falling edges of the hue profile along the Y axis, since windows in Haussmannian buildings extend to the floor bottom, where the abrupt color change appears. The facade image is therefore segmented into floors by taking those falling edges as separation lines. Special care must be taken with roofs and ground floors, where colors fluctuate unexpectedly, but errors there can be corrected by a voting scheme based on floor height. The detailed procedure for floor segmentation is presented below.
(a) Luminance Drifting Line Removal: from the Y-axis profile of the hue image, construct two step signals from the local maxima and the local minima respectively, and average the two step curves. Normalize the profile by dividing it by this estimate.
(b) Pulse Noise Filtering: the output is the average of the opening-closing and closing-opening cascaded filters, which suppresses both positive and negative noise.
(c) Rising and Falling Edge Extraction: to avoid tedious thresholding or zero-crossing searches, a scheme borrowed from receiver theory in telecommunication systems is used. The normalized and filtered profile is integrated along the X axis (producing approximately triangular shapes). Within a fixed-width sliding window, the signals are aligned and local maxima with strong confidence are picked. These maxima correspond to falling edges, while the rising edges are extracted by searching for the local minimum between two maxima. A binary signal can then be reconstructed from the positions of the local maxima and minima. This gives a rough estimate of the color switching rhythm along the Y axis, in which the falling edges correspond to floor separations.
(d) Floor Separation Validation: the recovered binary profile contains some false floor separations caused by smaller architectural elements such as balconies and by unexpected vertical color changes. These are eliminated by a two-stage validation process. First, if two pulses are too close, they are merged into one. Second, the distances between horizontal edges are checked collectively: if a distance is much smaller than the typical one, the edge is ignored, and if it is too large, the interval is split in two. The second stage is only necessary when there are too many color fluctuations.
(e) Floor Separation Refinement: each separation line is moved by up to a certain distance along the hue image profile to reach a local maximum, so that the separation locks onto its correct position.
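Step (c) can be illustrated as follows. This is a sketch of one possible reading, not the authors' implementation: the profile is assumed to be already normalized and filtered, its mean is subtracted before integration so that the running integral rises over windows and falls over walls, and the names `integrate`, `falling_edges` and `rising_edges` as well as the `half_window` parameter are hypothetical.

```python
def integrate(signal):
    """Running integral of the mean-removed signal; out[i] sums samples 0..i-1."""
    mean = sum(signal) / len(signal)
    out = [0.0]
    for v in signal:
        out.append(out[-1] + (v - mean))
    return out


def falling_edges(signal, half_window):
    """Indices where the integral peaks inside a sliding window.

    A peak at index i means samples before i were high and sample i is low,
    so i is the first low sample after a high run (a falling edge).
    """
    acc = integrate(signal)
    edges = []
    for i in range(1, len(signal)):
        lo = max(0, i - half_window)
        hi = min(len(acc), i + half_window + 1)
        if acc[i] > acc[i - 1] and acc[i] == max(acc[lo:hi]):
            edges.append(i)
    return edges


def rising_edges(signal, falls):
    """Minimum of the integral between each pair of consecutive falling edges."""
    acc = integrate(signal)
    return [min(range(a, b + 1), key=acc.__getitem__)
            for a, b in zip(falls, falls[1:])]


profile = [0] * 5 + [1] * 5 + [0] * 5 + [1] * 5 + [0] * 5
falls = falling_edges(profile, half_window=3)
print(falls)                         # floor separation candidates
print(rising_edges(profile, falls))  # rising edges between them
```

On this ideal square wave the integral is a triangle wave, so no threshold has to be tuned; the window width only controls how close two accepted maxima may be. The validation and refinement of steps (d) and (e) are not covered here.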

Tile Segmentation
Similarly, tile segmentation is achieved using the hue profile along X. Because windows in Haussmannian facades are rigidly repeated and well aligned, the window pattern is more prominent here. As a consequence, local maxima indicate the presence of windows and mark their positions, and the tile separation lines are obtained as the middle lines between these window maxima. The procedure is described below.
(a) Window Width Estimation: as window rectangles are repeated regularly, the FFT of the first-order derivative of the profile should indicate the window width, which corresponds to the highest peak in the FFT spectrum.
(b) Window Existence Estimation: a moving average filter is applied to the profile signal, with the sliding window size set to the window width. The filtered output is then integrated once more along the X axis and the local maxima are picked; they correspond to the presence of windows.
(c) Window Existence Validation: since some small objects (rain pipes, for example) produce strong color changes on the facade plane, some false lines are also detected. They can be removed because they lie very close to the border of the facade.
(d) Tile Separation: once all window-indicating lines are available, the tile separation lines are obtained immediately as the midlines between adjacent ones.
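Steps (b) to (d) can be sketched as below, under simplifying assumptions: the window width is taken as given (step (a) is not reproduced), the local maxima are picked directly on the smoothed profile without the additional integration, and false lines are discarded with a plain border margin. The names `smooth`, `window_centers`, `tile_separators` and the `min_height` and `border` parameters are hypothetical.

```python
def smooth(profile, width):
    """Moving average over `width` samples centered on each position (zero padded)."""
    half = width // 2
    n = len(profile)
    out = []
    for i in range(n):
        lo, hi = i - half, i - half + width
        out.append(sum(profile[max(0, lo):min(n, hi)]) / width)
    return out


def window_centers(profile, width, min_height=0.5, border=0):
    """Local maxima of the smoothed profile, ignoring those near the facade border."""
    s = smooth(profile, width)
    centers = []
    for i in range(1, len(s) - 1):
        is_peak = s[i] > s[i - 1] and s[i] >= s[i + 1]
        inside = border <= i < len(s) - border
        if is_peak and inside and s[i] >= min_height:
            centers.append(i)
    return centers


def tile_separators(centers):
    """Midlines between adjacent window positions."""
    return [(a + b) // 2 for a, b in zip(centers, centers[1:])]


# Three 8-pixel-wide windows on a 60-pixel-wide facade profile.
profile = [0] * 60
for start in (5, 25, 45):
    for x in range(start, start + 8):
        profile[x] = 1

centers = window_centers(profile, width=8)
print(centers)                   # [9, 29, 49]
print(tile_separators(centers))  # [19, 39]
```

Matching the averaging window to the window width turns each plateau into a single peak, which is why the width estimate of step (a) is needed before this stage.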

Detection of Architectural Elements
After the facade typology has been determined in terms of floor and tile segmentation, further details are detected to obtain a complete description of the facade. These details include windows, balconies and various decorations. Window detection is central to the detection of architectural elements, because windows mark the regions of the tile where residents are active, and all other elements are positioned around those regions to highlight them functionally and decoratively.
The various elements can be detected separately according to their own image characteristics. As each element appears at multiple positions, its detection can be cross-verified among multiple instances: if one instance gives a strange result, it can be corrected using the other instances. For example, if a few windows are badly estimated, they are corrected by aligning them with the others horizontally and vertically. A relatively accurate detection can therefore be achieved.
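One simple way to realize this cross-verification is sketched below. The paper does not state which statistic is used; the median per column and per row is an assumption made here because it is robust to a few outliers. The function name `synchronize` and the `(x, y, w, h)` box layout are hypothetical.

```python
from statistics import median


def synchronize(boxes):
    """Align window estimates across a regular grid of tiles.

    boxes[r][c] is an (x, y, w, h) estimate for floor r and tile column c.
    Horizontal position and width are shared within a column; vertical
    position and height are shared within a floor.
    """
    n_rows, n_cols = len(boxes), len(boxes[0])
    col_x = [median(boxes[r][c][0] for r in range(n_rows)) for c in range(n_cols)]
    col_w = [median(boxes[r][c][2] for r in range(n_rows)) for c in range(n_cols)]
    row_y = [median(boxes[r][c][1] for c in range(n_cols)) for r in range(n_rows)]
    row_h = [median(boxes[r][c][3] for c in range(n_cols)) for r in range(n_rows)]
    return [[(col_x[c], row_y[r], col_w[c], row_h[r]) for c in range(n_cols)]
            for r in range(n_rows)]


# 3 floors x 3 tiles; the middle estimate is badly off.
boxes = [[(x, y, 20, 30) for x in (10, 50, 90)] for y in (5, 45, 85)]
boxes[1][1] = (58, 40, 12, 38)
print(synchronize(boxes)[1][1])  # (50, 45, 20, 30)
```

With an odd number of instances per row and column, the median simply returns the value shared by the well-estimated windows, so the single outlier is pulled back onto the grid.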
Furthermore, a second scheme ensures the accuracy of the detection. First, a rough detection of the architectural elements is established globally using various image descriptors. The detection is then refined using local information. For instance, a window is detected by color difference, but the final estimates of its size and location are based on edges in the facade image.

Window Aperture Detection
Window detection is very difficult because real scenes are very complex. Windows have glass panes, which intrinsically absorb red light and reflect their surroundings, including the sky above or the objects facing them. In addition, windows may have shutters, internal or external, with their own colors and shapes, which may be open, partially closed or fully closed. Hence, windows on facades do not have a unique shape, color or material.
One way to solve the problem is to use machine learning with a boosting scheme [7], because multiple cues (shapes and colors) can help distinguish windows from other elements. However, the diversity of windows makes the window database unrealistically large, and occlusion may do significant harm.
Our assumption here is that window shutters are mostly open and that facade walls are made of a very different material from windows. Windows should therefore have a very different color from walls, and in the hue image, window aperture regions differ strongly from walls. Window apertures are detected by estimating these regions individually, verifying them collectively in the horizontal and vertical directions, and refining them with local edges.

Balcony Detection
Balconies are also difficult to detect because they do not have prominent features such as shape, color or material. In Haussmannian buildings, wrought iron balconies are the most common. They add a further difficulty because they are neither plain nor solid but perforated. Moreover, the bases and consoles of balconies share the same material as the facade wall. Other types of features must therefore be sought, and architectural knowledge must be used to facilitate the detection.
(a) Wrought Iron Guard Detection: in most Haussmannian buildings, a wrought iron guard is located in the lower part of the tile, around the window, to protect residents from falling. The guards are cast in various forms and are perforated, so there are no solid areas that present a unique color or shape. However, because the image pixels around wrought iron switch very rapidly between wrought iron and other content, we may use pixel activity as a feature. We use the FFT in a sliding window to evaluate whether a pixel belongs to a wrought iron part. Afterwards, a 2D morphological operation is applied to remove holes inside the region. Candidate regions are also examined by area and position: for instance, a wrought iron region should lie in the lower part of a tile and should not be very small. A rough estimate of the wrought iron regions is thus obtained, and it can be further refined using local edge information. In addition, individual and collective balconies can be distinguished, because the wrought iron regions carry this information.
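The activity feature can be illustrated in one dimension. The sketch below is an assumption-laden simplification: it works on a single image row, uses a direct DFT instead of an FFT for brevity, and measures activity as the spectral energy in the upper half of the positive-frequency bins. The names `high_freq_energy` and `activity` and the window size are hypothetical, and the morphological hole filling and the area and position checks are not included.

```python
import cmath


def high_freq_energy(window):
    """Spectral energy of `window` in its upper frequency bins (DC excluded)."""
    n = len(window)
    first_bin = max(1, n // 4)
    energy = 0.0
    for k in range(first_bin, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, v in enumerate(window))
        energy += abs(coeff) ** 2
    return energy


def activity(row, size):
    """High-frequency energy of every length-`size` window sliding along the row."""
    return [high_freq_energy(row[i:i + size]) for i in range(len(row) - size + 1)]


# Flat wall, then rapidly alternating pixels (ironwork), then flat wall again.
row = [0] * 16 + [0, 1] * 8 + [0] * 16
act = activity(row, size=8)
mask = [a > 8.0 for a in act]    # hypothetical threshold
print(mask.index(True), len(mask) - 1 - mask[::-1].index(True))
```

Flat regions give zero energy whatever their brightness, while the alternating stretch concentrates its energy in the highest bin, so the measure responds to texture rather than to color.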

Roof Treatment
The top section of the facade is normally a mansard roof, which is painted deep blue or gray and can easily be differentiated from normal floors. Dormer windows are also common in mansard roofs; they can be marked using color-histogram-based segmentation.

Door Entrance Detection
Every building has a prominent entrance door for access and security control. Its detection is difficult because the ground floor hosts various shops with various colors and shapes. However, door types and door colors are limited in number, thanks to building construction regulations. Hence, we can set up a small door database and use color histogram intersection to match and mark doors.

At this stage, we have a 2D description of facades. To obtain a full 3D facade reconstruction, we need two kinds of information. The first is depth information, which can be obtained by analyzing 3D scan data of the facade together with the 2D classification of the tile parts (walls, balconies, window parts, ...) obtained from image analysis. The second is texture information, which can easily be extracted from images. We will investigate an analysis/synthesis approach in order to obtain precise reconstructions tuned to real data.
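Returning to the door matching step, color histogram intersection can be sketched as follows. The example assumes hue values scaled to [0, 1), a non-empty pixel list, and a database given as a plain dictionary; the bin count, the function names and the database entries are hypothetical and carry no real data.

```python
def hue_histogram(hues, bins=8):
    """Normalized histogram of hue values in [0, 1); `hues` must be non-empty."""
    hist = [0] * bins
    for h in hues:
        hist[min(int(h * bins), bins - 1)] += 1
    total = sum(hist)
    return [count / total for count in hist]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms, 0.0 if disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def best_match(candidate, database):
    """Name of the database entry whose histogram overlaps the candidate most."""
    return max(database, key=lambda name: intersection(candidate, database[name]))


# Hypothetical two-entry door database built from synthetic hue samples.
database = {
    "green_door": hue_histogram([0.30, 0.32, 0.35, 0.33]),
    "blue_door": hue_histogram([0.60, 0.62, 0.65, 0.61]),
}
region = hue_histogram([0.31, 0.34, 0.36, 0.62])
print(best_match(region, database))
```

Because the score depends only on the color distribution, it tolerates moderate changes in door shape and framing, which suits a small database of regulated door colors.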

Fig. 1. Input facade image and its hue image.

Fig. 2. Profiling for facade image segmentation. Left: profiling of a checkerboard pattern image. Right: profiling of the hue image of a facade.

Fig. 3. Signal processing on the Y profile of the hue image. From top to bottom: original profile signal, drifting line removed, filtered with morphological filters, integration along the X axis, and the final falling edge detection.

Fig. 4. Signal processing on the X profile of the hue image. From top to bottom: original profile signal, smoothed with moving average filtering, and the final separation line detection.

Fig. 5. Window Aperture Detection. Left: initial estimation of the window aperture within each window tile; Middle: after centroid and size synchronization; Right: final result after local edge refinement.

(a) Window Aperture Centroid and Size Initial Estimation: within each tile, binarize the hue image to mark the window aperture region, then use a morphological operation to remove noise. Only objects larger than a certain area are considered to be window apertures, and their size (height and width) and position (X and Y coordinates) are estimated. As mentioned before, this size and position information can be synchronized along floors horizontally and vertically. In this way, an initial estimate of the window aperture size and centroid is obtained.
(b) Window Aperture Centroid and Size Refinement: the actual border of a window aperture produces significant edges on the facade, and those edges help refine the initial estimate. In practice, each window aperture border line is moved by up to a certain amount to reach a local maximum. Once again, the aperture centroid and size information is synchronized horizontally and vertically.
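The initial estimation of step (a) can be sketched for a single tile as below. Several choices are assumptions rather than the paper's method: the binarization uses a fixed threshold, noise is rejected only through the minimum area (no morphological operation), regions are 4-connected, and the largest surviving region is kept. The function name and parameters are hypothetical.

```python
from collections import deque


def largest_region_box(tile, threshold, min_area):
    """Bounding box (x, y, w, h) of the largest region with hue > threshold.

    `tile` is a 2D list of hue values. Regions smaller than `min_area`
    pixels are ignored. Returns None when no region qualifies.
    """
    rows, cols = len(tile), len(tile[0])
    seen = [[False] * cols for _ in range(rows)]
    best, best_area = None, 0
    for y in range(rows):
        for x in range(cols):
            if seen[y][x] or tile[y][x] <= threshold:
                continue
            # Flood fill one connected region, collecting its pixel coordinates.
            queue = deque([(y, x)])
            seen[y][x] = True
            ys, xs = [], []
            while queue:
                cy, cx = queue.popleft()
                ys.append(cy)
                xs.append(cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and tile[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = len(ys)
            if area >= min_area and area > best_area:
                best_area = area
                best = (min(xs), min(ys),
                        max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
    return best


# An 8 x 10 tile: dark wall, one bright aperture, one isolated noise pixel.
tile = [[0.1] * 10 for _ in range(8)]
for y in range(2, 7):
    for x in range(3, 7):
        tile[y][x] = 0.8
tile[0][0] = 0.9
print(largest_region_box(tile, threshold=0.5, min_area=4))  # (3, 2, 4, 5)
```

The centroid follows directly from the box, and the per-tile boxes are what the horizontal and vertical synchronization then operates on.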

Fig. 6. Balcony Guard Detection. Left: after activity evaluation; Middle: augmentation of the activity; Right: after morphological operation.

(b) Balcony Base Detection: balcony bases are located just below the wrought iron guard and project outwards from the facade plane. As a result, there are strong edges at both the top and the bottom of the base on the facade plane. By extracting these edges, the base can be outlined.
(c) Balcony Consoles Detection: once the wrought iron guard and the base have been found, we know where the consoles should appear, namely just below the base. They can be highlighted against the wall by histogram equalization and then marked using the profiling techniques described earlier.

Fig. 7. Balcony Base and Consoles Detection. Left: after histogram equalization; Middle: binarized version used to extract the bottom border of the base; Right: consoles are highlighted with crosses.

Fig. 10 shows the facade analysis results obtained from the top left input. The top right image shows the precision of the automatic tiling of this facade: the reader may verify that the bottom of each tile corresponds exactly to a floor bottom. The center left image displays the automatic detection of the window frames of the four major floors. The center right image adds the detection of the dormer window of the mansard floor. The main building entrance has been successfully detected on

Fig. 10. Image analysis process. From left to right and top to bottom: original image, global structure determination, window aperture estimation, dormer window detection, door entrance detection and balcony detection.