Sketchup is an AI-UI project that generates prototype application markup from hand-drawn UI sketches.
Sketchup takes a sketch of an application UI that looks something like this:
Then uses TensorFlow object detection to identify UI elements within the sketch and their locations. The identified elements are transformed into prototype markup for a target technology stack. Currently, the only implemented target is the web.
This project was inspired by work done by the Microsoft AI Lab in a project called Sketch 2 Code (code repository) and an original paper and project by Tony Beltramelli called pix2code (code repository).
The original dataset consists of 149 images labeled with 10 classes and bounding boxes indicating the location of the instance of that class within the image.
The images are hand-drawn sections of (or elements within) prototype user interfaces from hypothetical software applications. They are lower fidelity and less consistent than traditional wireframes and represent the kind of sketches often created on the fly during the ideation phases of a project.
The original images are located in the repo here and the original labels and bounding boxes are here. The set of labels is summarised here.
The dataset is augmented, as explained below.
After augmentation the data is separated into train, test and validation splits in the ratio 70:25:5. Train and test are used to train the object detection model; validation is held out for final analysis.
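Conceptually, the split is just a shuffled slice of the augmented manifest (described later in this README). The sketch below is illustrative only; the project's own scripts perform the actual partitioning, and the manifest structure is assumed.

```python
# Illustrative only: the project's scripts perform the real split. The manifest
# is assumed here to be a JSON list of labelled image entries.
import json
import random

with open("data/dataset.json") as f:
    entries = json.load(f)

random.shuffle(entries)
n = len(entries)
train = entries[:int(n * 0.70)]               # 70%
test = entries[int(n * 0.70):int(n * 0.95)]   # 25%
validation = entries[int(n * 0.95):]          # 5%
```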
The dataset is made available thanks to the Microsoft AI Lab under the MIT License. The AI Lab repository can be found here and the data here.
Sketchup was built using macOS and Python 3.6.7 on top of the TensorFlow Object Detection APIs.
I have used pipenv for Python dependency management, but it isn't strictly required.
Using Sketchup requires the TensorFlow models research module.
Install instructions are available here, but these are the steps I took.
git clone [email protected]:tensorflow/models.git
Note: at the time this prototype was built, the repo was at
d7ce21fa4d3b8b204530873ade75637e1313b760. Optionally:
cd models
git reset --hard d7ce21fa4d3b8b204530873ade75637e1313b760
At the time of writing, a bug exists that requires a workaround if using Python 3. In
models/research/object_detection/model_lib.py, on line 418, the following change
must be made:
- eval_config, category_index.values(), eval_dict)
+ eval_config, list(category_index.values()), eval_dict)
Refer to this issue if more details are required.
The COCO API is required for our evaluation metrics.
cd models/research
git clone [email protected]:cocodataset/cocoapi.git
cd cocoapi/PythonAPI
pipenv install Cython numpy pycocotools
pipenv run make
cp -r pycocotools ../..
Now, pycocotools should be present: models/research/pycocotools.
The TensorFlow Object Detection API uses Protobufs to configure model and training parameters. To compile the protobuf libs:
brew install protobuf
cd models/research
protoc object_detection/protos/*.proto --python_out=.
Sketchup lives in the TensorFlow models/research directory so that it
can access object detection utilities.
cd models/research
git clone [email protected]:PhilipCastiglione/sketchup.git
cd sketchup
pipenv install
So that training doesn't start from scratch, which takes a very long time for object detection models, we use a pretrained model from the TensorFlow detection model zoo as a starting point for our training.
curl http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz > models/model/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz
tar -xzvf models/model/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz -C models/model/
This example also demonstrates the use of a pretrained model for transfer learning (though using GCP).
If you are using macOS and pipenv (or virtualenv), then depending on how you have
installed Python, a terrible workaround may be required for matplotlib.
If so, you will see this error when matplotlib is invoked:
RuntimeError: Python is not installed as a framework. The Mac OS X backend will not be able to function correctly if Python is not installed as a framework. See the Python documentation for more information on installing Python as a framework on Mac OS X. Please either reinstall Python as a framework, or try one of the other backends. If you are Working with Matplotlib in a virtual enviroment see 'Working with Matplotlib in Virtual environments' in the Matplotlib FAQ
To deal with this you can:
pipenv run python
import matplotlib
matplotlib.matplotlib_fname()
#=> <path to matplotlibrc>
exit()
vi <path to matplotlibrc>
# comment out backend macosx
# add backend: TkAgg
Refer to this issue and this SO answer if more details are required.
First, augment the original image dataset.
pipenv run python scripts/augment_dataset.py
Note that the current default is to only include the "Button" label; the flag
--all-labels can be passed to include all 10 classes. This flag will also
need to be passed to the training script.
This will take the original 149 images in ./data/original_images/ and apply
every combination of six stochastic augmentations. Bounding boxes are also
transformed equivalently.
Augmentations are applied using this library as seen here.
A total of 9,536 images will now be in ./data/images/ and an augmented
manifest will be at ./data/dataset.json.
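As a rough sketch of how an image and its bounding boxes are augmented together (imgaug is assumed here purely for illustration and may not be the library the project actually uses; the file name and box coordinates are made up):

```python
# Illustrative sketch only; the project's augmentation script does the real work.
import imageio
import imgaug as ia
from imgaug import augmenters as iaa

image = imageio.imread("data/original_images/0.png")  # hypothetical file name
boxes = ia.BoundingBoxesOnImage(
    [ia.BoundingBox(x1=10, y1=20, x2=120, y2=60, label="Button")],  # made-up box
    shape=image.shape,
)

# Two stochastic augmentations; a deterministic copy of the sequence guarantees
# the image and its boxes receive identical transforms.
seq = iaa.Sequential([
    iaa.Affine(rotate=(-5, 5), scale=(0.9, 1.1)),
    iaa.GaussianBlur(sigma=(0.0, 1.0)),
]).to_deterministic()

image_aug = seq.augment_image(image)
boxes_aug = seq.augment_bounding_boxes([boxes])[0]
```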
It is recommended that you visualize labelled training data.
pipenv run python scripts/visualize_input_dataset.py 20
This will display 20 training images randomly sampled from the full set, with bounding boxes drawn around labeled elements in each image. The bounding box colors are regenerated on each run.
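For reference, drawing a single labeled box over an image boils down to something like the following (a minimal sketch; the real script reads boxes from the manifest, and the coordinates shown here are hypothetical):

```python
# Minimal sketch of drawing one labeled bounding box; coordinates are made up.
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

image = Image.open("data/images/272.png")
fig, ax = plt.subplots()
ax.imshow(image)

# Hypothetical box as (x_min, y_min), width, height in pixel coordinates.
ax.add_patch(patches.Rectangle((10, 20), 110, 40, fill=False, edgecolor="red"))
ax.text(10, 15, "Button", color="red")
plt.show()
```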
TensorFlow Object Detection requires a particular format for data, which we produce from our augmented images and json manifest.
pipenv run python scripts/convert_dataset_for_tf.py
This will produce ./data/train.record and ./data/test.record, which are used
by TensorFlow, and ./data/validation.txt, which contains a list of images held
out for evaluation of the trained model.
Details on TFRecord files can be found here.
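As a hedged sketch of what the conversion produces, each record is a tf.train.Example whose feature keys follow the Object Detection API's expected format. The dimensions, coordinates and label values below are placeholders; box coordinates are normalised to [0, 1].

```python
# Sketch of one record in the Object Detection API's expected format (TF 1.x).
# The conversion script builds these for real; values here are placeholders.
import tensorflow as tf

with tf.gfile.GFile("data/images/272.png", "rb") as f:
    encoded_png = f.read()

example = tf.train.Example(features=tf.train.Features(feature={
    "image/encoded": tf.train.Feature(bytes_list=tf.train.BytesList(value=[encoded_png])),
    "image/format": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"png"])),
    "image/height": tf.train.Feature(int64_list=tf.train.Int64List(value=[480])),
    "image/width": tf.train.Feature(int64_list=tf.train.Int64List(value=[640])),
    "image/object/bbox/xmin": tf.train.Feature(float_list=tf.train.FloatList(value=[0.10])),
    "image/object/bbox/ymin": tf.train.Feature(float_list=tf.train.FloatList(value=[0.20])),
    "image/object/bbox/xmax": tf.train.Feature(float_list=tf.train.FloatList(value=[0.35])),
    "image/object/bbox/ymax": tf.train.Feature(float_list=tf.train.FloatList(value=[0.30])),
    "image/object/class/text": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"Button"])),
    "image/object/class/label": tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
}))

with tf.python_io.TFRecordWriter("data/example.record") as writer:
    writer.write(example.SerializeToString())
```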
Training the model can take a long time. The object detection APIs use checkpoints so that the model can be trained incrementally. To train the model, we specify the (maximum) step number to train up to (100 in this example).
pipenv run python scripts/train_model.py 100
If the data was augmented with the --all-labels flag, it will need to be passed here as well.
TensorFlow will train from the last checkpoint (if available) up to the specified
step. Numerous files will be created in ./models/model/; the key ones are:
- model.ckpt-100.data-00000-of-00001
- model.ckpt-100.index
- model.ckpt-100.meta
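If you want to confirm which checkpoint a subsequent run would resume from, TensorFlow can report the latest one in the model directory:

```python
# Print the most recent checkpoint prefix in models/model (TF 1.x).
import tensorflow as tf

print(tf.train.latest_checkpoint("models/model"))
#=> models/model/model.ckpt-100
```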
You may want to observe training using TensorBoard:
pipenv run tensorboard --logdir=models/model
Once the model has been trained, we can export a frozen inference graph that we can use for prediction.
pipenv run python scripts/export_trained_model.py
The latest checkpoint will be exported, with the key output being
./models/model/frozen_inference_graph.pb.
Details on exporting models can be found here.
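For the curious, loading the frozen graph and running a single detection pass in TensorFlow 1.x looks roughly like this (a condensed sketch; the tensor names follow the Object Detection API's exported graphs, and predict.py handles this properly):

```python
# Condensed sketch of running the frozen graph directly (TF 1.x).
import numpy as np
import tensorflow as tf
from PIL import Image

graph_def = tf.GraphDef()
with tf.gfile.GFile("models/model/frozen_inference_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

image = np.array(Image.open("data/images/272.png").convert("RGB"))

with tf.Session(graph=graph) as sess:
    boxes, scores, classes = sess.run(
        ["detection_boxes:0", "detection_scores:0", "detection_classes:0"],
        feed_dict={"image_tensor:0": image[np.newaxis, ...]},
    )
```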
Generate a prediction for an image by passing a path to that image. Here we use an image from our validation set.
pipenv run python scripts/predict.py data/images/272.png
A timestamped folder will be created in ./predictions containing
image.png and detections.json.
To visualize a prediction, pass the timestamp to the script.
pipenv run python scripts/visualize_prediction.py 1542554276
This will display the provided image and detected elements with bounding boxes, class names and confidence scores indicated.
For a prediction, a simple prototype web UI can be generated. Pass a timestamp to the script.
pipenv run python scripts/generate_prediction_web_spike.py 1542554276
open predictions/1542554276/prediction.html
This will display the detected elements mapped to HTML in your default web browser. Hover over an element to view its location based on its detected bounding box.
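The mapping from detections to markup is conceptually simple; the sketch below illustrates the idea (the detections.json structure and the class-to-tag mapping shown here are assumptions, not the script's actual format):

```python
# Rough sketch of turning detections into absolutely positioned HTML elements.
# The detections.json structure and class-to-tag mapping are assumed.
import json

TAG_FOR_CLASS = {"Button": "<button>Button</button>"}  # hypothetical mapping

with open("predictions/1542554276/detections.json") as f:
    detections = json.load(f)  # assumed: list of {"class": ..., "box": [x, y, w, h]}

elements = []
for det in detections:
    x, y, w, h = det["box"]
    tag = TAG_FOR_CLASS.get(det["class"], "<div></div>")
    elements.append(
        f'<div style="position:absolute; left:{x}px; top:{y}px; '
        f'width:{w}px; height:{h}px;" title="({x}, {y})">{tag}</div>'
    )

with open("prediction_sketch.html", "w") as f:
    f.write("<html><body>" + "".join(elements) + "</body></html>")
```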
This library is available as open source software under the terms of the MIT License.