
❈ PhotoBooth Lite on Raspberry Pi with TensorFlow Lite

blog.tensorflow.org

Posted by Lucia Li, TensorFlow Lite Intern

Illustration of the Smart Photo Booth application running in real time.
We’re excited to showcase the experience of building a Smart Photo Booth application on Raspberry Pi with TensorFlow (we're not open-sourcing the code yet). The application automatically captures and records smiling faces, and you can interact with it using speech commands. Thanks to the TensorFlow Lite framework, the application handles smiling face detection and speech command recognition in real time.

Why should we build an application on Raspberry Pi?

Raspberry Pi is not only a widely used embedded platform, but also tiny and inexpensive. We decided to use TensorFlow Lite because it is designed specifically for mobile and IoT devices, which makes it a perfect fit for Raspberry Pi.

What do we need to build the Photo Booth App Demo?

We implemented our Photo Booth app on a Raspberry Pi 3B+ with 1GB of RAM, running a 32-bit ARMv7 operating system. Since the application takes both image and audio input, we also need a camera and a microphone, plus a monitor for display. The total cost is under $100 USD. The details are listed below:
  • A Raspberry Pi ($35)
    ‣ Parameters:
    » Quad core 64-bit processor clocked at 1.4GHz.
    » 1GB LPDDR2 SDRAM.
  • A camera to capture image (~$15+)
  • A microphone to sample audio data (~$5+)
  • A 7-inch monitor (~$20+)
In our Photo Booth application, two key technologies are involved. From the camera image input, we need to detect whether there is a smiling face. From the microphone audio input, we need to recognize whether the speech command “yes” or “no” was spoken.

How do we detect smiling faces?

Using a single model to detect faces and predict the smiling score with both high accuracy and low latency is difficult. Thus, we detect a smiling face in three steps:
Smiling Face Detection Workflow
  1. Apply a face detection model to detect whether there is a face in the given image.
  2. If there is a face, crop it from the original image.
  3. With the cropped face image, apply a facial attribute classification model to measure if it is a smiling face.
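The three-step workflow can be sketched in a few lines of Python. The real models and thresholds are not published, so every function body below is a stand-in stub; only the control flow mirrors the pipeline described above:

```python
import numpy as np

def detect_face(image):
    """Stand-in for the TFLite face detection model: returns a bounding
    box (x, y, w, h) and 6 landmarks, or None if no face is found."""
    h, w = image.shape[:2]
    box = (w // 4, h // 4, w // 2, h // 2)   # dummy centered box
    landmarks = [(w // 2, h // 2)] * 6       # dummy landmark positions
    return box, landmarks

def crop_face(image, box):
    """Step 2: cut the detected face out of the original frame."""
    x, y, w, h = box
    return image[y:y + h, x:x + w]

def smile_score(face):
    """Stand-in for the facial attribute classifier: returns a
    probability in [0, 1] that the face is smiling."""
    return float(face.mean()) / 255.0        # dummy score

def process_frame(frame, threshold=0.5):
    detection = detect_face(frame)           # step 1: detect a face
    if detection is None:
        return False
    box, _landmarks = detection
    face = crop_face(frame, box)             # step 2: crop the face
    return smile_score(face) > threshold     # step 3: classify the crop
```

In the real application, `detect_face` and `smile_score` would each invoke a TensorFlow Lite interpreter; the threshold value here is arbitrary.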
We tried various options to reduce the latency for detecting a smiling face:
  1. To reduce memory use and speed up execution, we leveraged post-training quantization from the TensorFlow model optimization toolkit. The toolkit's tutorial shows how easy it is to apply to your own TensorFlow Lite model.
  2. We resized the original camera image with its aspect ratio fixed. The compression ratio can be 4 or 2 depending on the original size; we aim to keep the image under 160x160 (the originally designed input size is 320x320). Smaller inputs significantly reduce inference time, as shown in the table below. In our application, the camera captures 640x480 images, so we resize them to 160x120.
  3. Instead of using the original image for facial attribute classification, we cropped standard faces and discarded the background, reducing the input size while keeping the useful information.
  4. We used multi-threaded inference.
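The fixed-aspect-ratio downscaling in step 2 can be sketched as follows; the choice of integer ratios 2 and 4 follows the text, while the helper name and loop structure are ours:

```python
def downscale_size(width, height, max_side=160):
    """Pick an integer compression ratio (2 or 4) that keeps the aspect
    ratio fixed and brings the longer side to at most `max_side`."""
    ratio = 1
    while max(width, height) // ratio > max_side and ratio < 4:
        ratio *= 2
    return width // ratio, height // ratio
```

For the 640x480 camera frames in the article, this yields the 160x120 target mentioned above; a 320x320 input is halved to 160x160.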

The table below shows the effect of the strategies we applied. We used the TensorFlow Lite benchmark_model tool to evaluate the performance of the face detection model on Raspberry Pi.
Face Detection Latency Comparison
The whole smiling face detection pipeline, including the three steps above, costs 48.1ms on average with a single thread, which means we achieved real-time smiling face detection.

Face detection

Our face detection model consists of an 8-bit quantized, modified MobileNet v1 body and an SSD-Lite head with a 0.25 depth multiplier. Its size is only a little over 200KB. Why is this model so small? First, a TensorFlow Lite model is based on FlatBuffers, which is smaller than the protobuf-based TensorFlow format. Second, we applied 8-bit quantization. Third, our modified MobileNet v1 has fewer channels than the original. Like most face detection models, our model outputs the position of a bounding box and 6 landmarks: the left eye, right eye, nose tip, mouth center, left ear tragion, and right ear tragion. We also apply non-maximum suppression to filter repeated faces. The inference time of our face detection TensorFlow Lite model is about 30ms, which means it can detect a face on Raspberry Pi in real time.
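The non-maximum suppression step mentioned above can be sketched with the standard greedy algorithm in NumPy (this is the textbook version, not necessarily the app's exact implementation; boxes are assumed to be in (x1, y1, x2, y2) format):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedily keep the highest-scoring box, dropping overlapping duplicates."""
    order = np.argsort(scores)[::-1]   # indices, best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep
```

Two detections of the same face overlap heavily, so only the higher-scoring one survives; a face elsewhere in the frame is untouched.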
Example of the bounding box and 6 landmarks.

Face cropper

The detected face may appear at various orientations and sizes. To unify them for better classification, we rotate, crop, and resize the original image. The input to this function is the positions of the 6 landmarks we get from the face detection model. From these landmarks, we can compute the rotation Euler angles and the resizing ratios, giving us a 128x128 standard face. The figure below shows an example of our face cropper function: the blue bounding box is the output of the face detection model, while the red bounding box is our calculated cropping box. For pixels outside the image, we duplicate the border line.
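Two pieces of the face cropper can be illustrated concretely: deriving the in-plane rotation from the eye landmarks, and replicating the border for pixels that fall outside the frame. This is a simplified sketch under our own naming; the full cropper also handles out-of-plane angles and resizing:

```python
import math
import numpy as np

def roll_angle(left_eye, right_eye):
    """In-plane rotation angle (radians) that would make the eye line
    horizontal; eye positions come from the detection model's landmarks."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.atan2(dy, dx)

def crop_with_edge_padding(image, x, y, size):
    """Crop a size x size patch at (x, y); pixels outside the image
    duplicate the nearest border line, as the article describes."""
    pad = size  # generous padding so the crop always fits
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    return padded[y + pad : y + pad + size, x + pad : x + pad + size]
```

`np.pad(..., mode="edge")` is exactly the "duplicate the borderline" behavior: a crop window hanging off the top-left corner is filled with copies of the first row and column.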
Face Cropper Illustration

Face attribute classification

Our face attribute classification model is also an 8-bit quantized MobileNet model. Given a 128x128 standard face as input, the model outputs a float from 0 to 1 predicting the smiling probability. The model also outputs a 90-d vector to predict age from 0 to 90. Its inference time on Raspberry Pi is around 30ms.
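The 90-d age head can be decoded into a single age estimate by taking the expectation over the age bins. The article does not specify the decoding, so this softmax-expectation readout is an assumption (a common choice for binned regression heads):

```python
import numpy as np

def expected_age(age_logits):
    """Decode a 90-d age head output into one age estimate by taking
    the expectation over bins 0..89 (assumed readout, not confirmed
    by the original article)."""
    probs = np.exp(age_logits - age_logits.max())  # stable softmax
    probs /= probs.sum()
    return float(np.dot(probs, np.arange(len(age_logits))))
```

A sharply peaked head (one bin dominating) decodes to that bin's age; a flat head decodes to the mean of the range.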

How to recognize speech commands?

Real-time speech commands recognition can also be divided into three steps:
  1. Pre-processing: we use a sliding window to store the latest 1s of audio data, shifting by 512 frames from the previous recording.
  2. Inference: given a 1s audio input, we can apply a speech command recognition model to get probabilities for four categories (“yes”/“no”/“silence”/“unknown”).
  3. Post-processing: we average the current inference result with previous ones. When the average probability of one word exceeds a threshold, we decide that a speech command was detected.
The three steps are explained in detail below.

Pre-processing

We use PortAudio, an open-source library, to get audio data from the microphone. The following figure shows how we store the audio data.
Audio Stream Processing
Since our model is trained on 1s of audio sampled at 16kHz, our data buffer holds 16,000 samples. The data buffer serves as a circular buffer: we update 512 frames at a time and record an offset that indicates the end of the last update. When the tail of the buffer is full, we continue writing from the head. To get the audio data for inference, we start reading at the offset and wrap around to the frame just before it.
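The circular buffer just described can be sketched in plain Python (PortAudio callbacks and sample widths omitted; the class and method names are ours):

```python
class AudioRingBuffer:
    """1s circular buffer of 16kHz samples, updated 512 frames at a time."""

    def __init__(self, capacity=16000):
        self.buf = [0] * capacity
        self.offset = 0                 # end of the last update

    def write(self, frames):
        """Append one chunk (e.g. 512 frames from PortAudio), wrapping
        from the tail of the buffer back to the head when it fills."""
        for sample in frames:
            self.buf[self.offset] = sample
            self.offset = (self.offset + 1) % len(self.buf)

    def read_ordered(self):
        """Oldest-to-newest view for inference: start reading at the
        offset and wrap around to the frame just before it."""
        return self.buf[self.offset:] + self.buf[:self.offset]
```

Because `read_ordered` re-orders the wrapped storage, the model always sees the most recent second of audio in chronological order.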

Speech command recognition

The speech command recognition model we used can be found in many public TensorFlow examples. It is composed of an audio_spectrogram op, MFCC features, 2 convolutional layers, and 1 fully-connected layer. The input to this model is 1s of audio data sampled at 16kHz. The training dataset is public, so you can also train the model yourself. It contains 30 categories of speech commands; since we only need “yes” and “no”, all other categories are labeled “unknown”. Additionally, we used other methods to improve latency:
  1. We cut the number of channels in half. The compressed TensorFlow Lite model is about 1.9 MB.
  2. We used 4 output channels in the last fully-connected layer instead of the usual 12, since we only need 4 categories.
  3. We used multi-threaded inference.
In training, we set the background volume to 0.3 to improve the model's noise tolerance. We also set both the silence percentage and the unknown percentage to 25% to balance the training set.


Audio Stream Post-processing
Since the audio data we capture may cover only part of a word, a single prediction is not precise enough. We therefore average the current result with previous results recorded no more than 1.5s earlier, which significantly improves real-time keyword detection. The number of previous results we can keep depends on the inference time: our model's inference time on Raspberry Pi is about 160ms, so we can keep at most 9 previous results.
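The 1.5s averaging window can be sketched as a small smoother over the four category probabilities (“yes”/“no”/“silence”/“unknown”). The 1.5s window follows the text; the threshold value and all names here are our assumptions:

```python
import collections

class CommandSmoother:
    """Average per-category probabilities over results from the last
    1.5s and report a command when one category's mean crosses a
    threshold (threshold value is an assumption)."""

    def __init__(self, window_ms=1500, threshold=0.7):
        self.window_ms = window_ms
        self.threshold = threshold
        self.history = collections.deque()   # (timestamp_ms, probs) pairs

    def update(self, now_ms, probs):
        """Add one inference result; return the winning category index,
        or None if no category's average clears the threshold."""
        self.history.append((now_ms, probs))
        # Evict results recorded more than window_ms before `now_ms`.
        while self.history and now_ms - self.history[0][0] > self.window_ms:
            self.history.popleft()
        n = len(self.history)
        avg = [sum(p[i] for _, p in self.history) / n
               for i in range(len(probs))]
        best = max(range(len(avg)), key=avg.__getitem__)
        return best if avg[best] >= self.threshold else None
```

With a 160ms inference time, about 9 results fit inside the 1.5s window, matching the count given above.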

What's next?

We hope to open source the code for this example soon in the TensorFlow Lite GitHub repository. For more information about getting started with TensorFlow Lite, please see the TensorFlow Lite documentation and reference examples. Please let us know what you think, or share your TensorFlow Lite use case with us.


Lucia Li, Renjie Liu, Tiezhen Wang, Shuangfeng Li, Lawrence Chan, Daniel Situnayake, Pete Warden....

Read the full article (external source: https://blog.tensorflow.org/2020/01/photobooth-lite-on-raspberry-pi-with-tensorflow-lite.html)

