For most people, communicating by speaking is a matter of course. But there is still a small group of people in the world who, for congenital or acquired reasons, cannot do what the rest of us take for granted. For many people with hearing and speech impairments, sign language is their way of communicating with one another.

The problem is that while sign language lets them communicate with each other, to most people who are used to speaking it might as well be an alien language. How to easily convert sign language into spoken language has become a new research topic in recent years, and a new algorithm from Google's AI researchers may offer a solution.

The new technique combines some clever, efficient methods with the growing efficiency of machine learning, so that high-precision hand and finger tracking can be achieved with nothing more than a mobile phone, which opens up many new possibilities.


“The most advanced methods currently rely on powerful desktop environments, while our approach runs in real time on a mobile phone and can even scale to tracking multiple hands,” Google researchers wrote in an official blog post. Robust hand tracking is a challenging computer vision task, because hands often occlude themselves during motion and lack high-contrast patterns.

On top of that, hand movements tend to be fast and subtle, which is not the kind of thing computers are good at tracking in real time. Even SignAll, which uses multiple cameras and depth sensing, struggles to track every motion.

Given these constraints, the researchers set out to minimize the amount of data the system has to sift through, in order to improve both its response speed and its tracking accuracy.

First, they gave up trying to detect the size and position of the entire hand and instead had the system locate only the palm. The palm is not only the most distinctive and reliable part of the hand, it is also roughly rectangular, which means the system does not have to deal with many complex patterns.

Once the palm has been recognized, the fingers extending from it are identified and analyzed separately. Another algorithm assigns 21 coordinates to that region, roughly mapping the knuckles and fingertips, and when part of the hand is occluded or out of frame the system can still estimate those points from the size and angle of the palm.
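Google released this hand-tracking pipeline as part of its open-source MediaPipe framework. The article itself does not name a library, so the following is only a rough sketch, assuming the mediapipe and opencv-python packages, of what reading out the 21 landmarks per detected hand looks like in practice.

```python
# Sketch: reading the 21 hand landmarks from a single image with MediaPipe Hands.
# Illustrative only; the article does not specify this exact API.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

image = cv2.imread("hand.jpg")                # hypothetical input image
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB frames

with mp_hands.Hands(static_image_mode=True,
                    max_num_hands=2,
                    min_detection_confidence=0.5) as hands:
    results = hands.process(rgb)

if results.multi_hand_landmarks:
    for hand in results.multi_hand_landmarks:
        # Each detected hand carries exactly 21 normalized (x, y, z) landmarks:
        # the wrist, then four points per finger from base to tip.
        for idx, lm in enumerate(hand.landmark):
            print(idx, round(lm.x, 3), round(lm.y, 3), round(lm.z, 3))
```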

To make finger recognition work, the researchers had to manually annotate roughly 30,000 hand images, taken in a wide range of poses and lighting conditions, with these 21 coordinate points. As always, a powerful machine learning system has to be fed data before it can do anything.
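The article does not describe how those annotations were stored; purely as an illustration, a single labeled training example might look something like this, pairing an image file with its 21 keypoints.

```python
# Hypothetical record format for one manually labeled training image:
# an image file plus 21 (x, y) keypoints in pixel coordinates.
annotation = {
    "image": "palm_00042.jpg",   # hypothetical file name
    "landmarks": [               # 21 points: wrist + 4 per finger
        [412, 630],                                       # 0: wrist
        [350, 560], [310, 500], [290, 450], [275, 410],   # 1-4: thumb
        # ... the remaining 16 knuckle and fingertip points ...
    ],
}
```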

From those 21 points it is straightforward to determine the pose of the hand and to match that pose against known sign-language gestures, from simple letters and numbers to signs with specific word meanings. The result is a fast and accurate gesture recognition algorithm that runs on a smartphone rather than a desktop.
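The article does not spell out how poses are matched to gestures. As a minimal sketch of the idea, and not Google's actual classifier, the snippet below applies simple geometric rules to the 21 landmarks from the previous example to decide which fingers are extended, which is already enough to tell a few number signs apart.

```python
# Minimal sketch: rule-based gesture matching from 21 hand landmarks.
# Indexing follows the usual 21-point convention (0 = wrist,
# 8/12/16/20 = index..pinky fingertips); thresholds are illustrative.
from typing import List, Tuple

Point = Tuple[float, float]      # normalized (x, y), origin at top-left

FINGER_TIPS = [8, 12, 16, 20]    # index, middle, ring, pinky tips
FINGER_PIPS = [6, 10, 14, 18]    # corresponding middle joints

def count_extended_fingers(lm: List[Point]) -> int:
    """A finger counts as extended if its tip sits above its middle joint
    (smaller y in image coordinates). The thumb is ignored for simplicity."""
    return sum(1 for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
               if lm[tip][1] < lm[pip][1])

def classify_number_sign(lm: List[Point]) -> str:
    """Map the number of extended fingers to a very rough number gesture."""
    return {0: "fist", 1: "one", 2: "two", 3: "three", 4: "four"}[
        count_extended_fingers(lm)
    ]
```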

This algorithm could also improve existing recognition systems, but there is still a long way to go before AI truly understands sign language, which is a distinctive form of communication that combines gestures, facial expressions, and other subtle details. Still, things are moving in a better and better direction.

Finally, the Google researchers wrote: “We want to provide this hand-perception capability to the broader research and development community, and we expect the emergence of creative use cases to stimulate new applications and new research directions.”

Source: Verywell Health