Generating Website Layouts with Deep Learning


Software development is commonly divided into two kinds: front end and back end. Back end development refers to what goes on “under the hood,” so to speak. It’s everything that the user of the application or website doesn’t see. By contrast, front end development consists of everything that the user sees and interacts with. Front end development includes webpage design, and for years web pages have been created by web designers using a variety of toolkits, some simpler than others. Recent advancements in deep learning are making it possible to auto-generate web pages and websites, which could make the process of creating a website much easier and quicker.

The ability of a neural network to auto-generate a webpage comes from combining several other AI capabilities. Neural networks have been able to recognize images and generate language, to limited degrees, for a while now, and the combination of these abilities makes the generation of web pages possible. Instead of a single image being passed into a neural network, a proposed layout, grid, or wireframe of the webpage is passed into the network as a screenshot, along with HTML tags. The network is capable of recognizing discrete elements of the web page, such as toolbars, image areas, and text fields. It interprets this layout and fills it in with the HTML and CSS code needed to make the website usable by the average person.

HTML code is used to create the structure or framework of a website. It indicates which elements of a website go where. When you see a website with links, headers, text fields, images, comment sections, and search functions, that layout is described by HTML code. CSS code defines how the HTML is presented to the user, specifying things like font size, font style, and other visual elements.

A blog post written by Emil Wallner on the FloydHub Blog broke down the process of generating code from design mockups into several steps. The neural network interprets the general structure of the website from the input layout, and then fills in the structure of the page with HTML code. One thing to note is that the neural network doesn’t construct the HTML code from scratch. In the first version, it based its code on the patterns and conventions used by a simple website designed to display Hello World. The model predicted the matching HTML tags one by one, token by token. After approximately 300 epochs of training, the network reproduced a simple Hello World webpage.
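To make the idea of token-by-token prediction concrete, here is a minimal sketch in plain Python. It is not Wallner's code: the tokenizer, the `<START>`/`<END>` markers, and the lookup table standing in for a trained network are all assumptions made for illustration. It shows how one markup string becomes a set of (prefix, next token) training pairs, and how a decoding loop rebuilds the page one token at a time.

```python
import re

# Hypothetical sequence markers (not taken from the original post).
START, END = "<START>", "<END>"

def tokenize(markup):
    """Split markup into tags and words, wrapped in start/end markers."""
    tokens = re.findall(r"<[^>]+>|[^<\s]+", markup)
    return [START] + tokens + [END]

def training_pairs(tokens):
    """Pair every prefix of the sequence with the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def generate(predict_next, max_len=50):
    """Greedy decoding: keep asking for the next token until END appears."""
    tokens = [START]
    while len(tokens) < max_len:
        next_token = predict_next(tokens)
        if next_token == END:
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the START marker

markup = "<html><body><h1>Hello World!</h1></body></html>"
tokens = tokenize(markup)
pairs = training_pairs(tokens)

# Stand-in for a trained network: it has simply memorized which token
# follows each prefix. A real model would also look at the screenshot.
lookup = {tuple(prefix): nxt for prefix, nxt in pairs}
predicted = generate(lambda prefix: lookup[tuple(prefix)])

print(" ".join(predicted))
```

A real network replaces the lookup table with learned weights, which is why it needs many epochs of training before its predictions match the target markup.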

Deep Learning is used by Google in its voice and image recognition algorithms, by Netflix and Amazon to decide what you want to watch or buy next, and by researchers at MIT to predict the future.  

Moving on to a more complex task, Wallner scaled up the approach used in the Hello World model to create a dummy website filled in with dummy images and text. The markup was represented using word embeddings for the network’s input and one-hot encoding for its output. The word embeddings were then passed through a Long Short-Term Memory (LSTM) layer. The images were prepared at the same time by flattening the extracted image features into one continuous list. The image features and markup features were then concatenated, and a decoder used the combined image-markup features to predict the next tag in the sequence.
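The data preparation described above can be sketched in plain Python. This is only an illustration under several assumptions: the tiny vocabulary, the random (untrained) embedding table, and the 2x2 placeholder feature map are all invented here, and the LSTM that would process the embeddings before concatenation is left out to keep the example short.

```python
import random

# Hypothetical vocabulary; a real model would build it from the dataset.
vocab = ["<START>", "<div>", "</div>", "<img>", "<p>", "</p>", "<END>"]
token_to_id = {tok: i for i, tok in enumerate(vocab)}

def one_hot(token):
    """Output encoding: a vector with a single 1 at the token's index."""
    vec = [0] * len(vocab)
    vec[token_to_id[token]] = 1
    return vec

# Input encoding: one dense vector per token. These values are random
# placeholders; in training they would be learned.
EMBED_DIM = 4
rng = random.Random(0)
embedding_table = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
                   for _ in vocab]

def embed(tokens):
    """Look up the embedding vector for each token in the sequence."""
    return [embedding_table[token_to_id[t]] for t in tokens]

def flatten(feature_map):
    """Turn a 2D grid of image features into one continuous list."""
    return [value for row in feature_map for value in row]

def combine(image_features, token_embeddings):
    """Attach the same image features to every markup timestep."""
    return [image_features + emb for emb in token_embeddings]

feature_map = [[0.1, 0.9], [0.4, 0.2]]   # placeholder for CNN output
context = ["<START>", "<div>", "<img>"]  # markup generated so far

decoder_input = combine(flatten(feature_map), embed(context))
target = one_hot("</div>")               # the tag the decoder should predict

print(len(decoder_input), len(decoder_input[0]), target)
```

Each timestep handed to the decoder now carries both what the page looks like and what markup has been written so far, which is the information it needs to choose the next tag.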

In the final portion of Wallner’s experiment, websites generated with the Bootstrap web development framework were used as the dataset. Bootstrap is a front end framework that lets web developers assemble pages out of pre-established modules, somewhat like using building blocks to create the website. The neural network examines the features of the layout and then maps them to the modules that are part of Bootstrap. This meant that CSS and HTML could now be handled together, and that the size of the vocabulary could be decreased.
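One way to picture why module-level tokens shrink the vocabulary is a small compiler that expands each token into Bootstrap markup. The sketch below is an assumption-laden illustration, not the token set from Wallner's experiment: the token names (`row`, `col`, `button`, `text`) and the templates are hypothetical, although the CSS classes used are standard Bootstrap classes.

```python
# Hypothetical layout tokens mapped to Bootstrap markup. The network only
# has to predict these few tokens instead of every HTML tag and CSS rule.
TEMPLATES = {
    "row": '<div class="row">{}</div>',
    "col": '<div class="col">{}</div>',
    "button": '<button class="btn btn-primary">Button</button>',
    "text": "<p>Placeholder text</p>",
}

def compile_layout(node):
    """Expand a (token, children) tree into an HTML string."""
    name, children = node
    inner = "".join(compile_layout(child) for child in children)
    # Leaf templates contain no "{}" placeholder, so `inner` is ignored.
    return TEMPLATES[name].format(inner)

# A predicted layout: one row with two columns.
layout = ("row", [
    ("col", [("text", []), ("button", [])]),
    ("col", [("text", [])]),
])

html = compile_layout(layout)
print(html)
```

With this setup the model's output vocabulary is just the keys of `TEMPLATES` plus whatever is used to mark nesting, and the styling comes along for free with each module.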

Once more, an LSTM served as the recurrent neural network, enabling the system to carry information across more than a couple of timesteps. This was necessary for the network to retain information about aspects of the front end design, such as the position and color of text and images. The result was an auto-generated web page that almost perfectly mimicked the layout specified by Bootstrap.
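The way an LSTM holds on to information over many timesteps can be shown with a single-unit cell written in plain Python. This is a toy sketch, not the network from the experiment: the weights are hand-picked assumptions (not trained) so that the cell stores a signal seen at the first timestep and keeps it while later inputs are zero.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One timestep of a single-unit LSTM cell with scalar weights."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate
    c = f * c_prev + i * g   # cell state: keep old memory, add new
    h = o * math.tanh(c)     # hidden state exposed to the next layer
    return h, c

# Hand-picked (assumed) weights: the forget gate stays close to 1, and the
# input gate opens only when x is 1.
weights = {
    "wf": 0.0, "uf": 0.0, "bf": 6.0,
    "wi": 12.0, "ui": 0.0, "bi": -6.0,
    "wo": 0.0, "uo": 0.0, "bo": 6.0,
    "wg": 4.0, "ug": 0.0, "bg": 0.0,
}

def run(inputs):
    """Feed a sequence through the cell and return the final (h, c)."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, weights)
    return h, c

# A signal at the first timestep, followed by 20 timesteps of silence.
h, c = run([1.0] + [0.0] * 20)
# The same length of sequence with no signal at all.
h_idle, c_idle = run([0.0] * 21)

print(round(c, 3), c_idle)
```

The cell state stays high long after the signal has passed, while the idle run stays at zero. In a trained network the gates learn when to store and when to discard information instead of having that behavior fixed by hand.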

Sophisticated applications of deep learning systems allow the front end of websites to be created with less effort, potentially freeing up resources and allowing developers to focus on other aspects of web development.