How It Works

#HowTall is brought to you by Oben, Inc

Your speech contains abundant information about you, perhaps more than you realize. When we hear someone’s voice, we can usually tell their gender and often their age range. Gender and age estimation from speech has been successfully automated using current machine learning and speech processing technology. One fascinating task that a machine can do, and that the human brain is not designed for, is height estimation. That’s right. While, at this point in time, you are better than a machine at identifying a face in an image or recognizing a word that you hear, you might not be able to predict a person’s height simply from hearing their voice.

How It Works: Height estimation

How we predict your height from your voice

Your vocal system, which consists of your lungs, vocal cords, and mouth, functions like a musical instrument. The shape of your vocal tract creates unique voice characteristics. That’s why you can recognize a friend’s voice on the phone. Inspired by this relationship, we use your speech to trace back the shape of your vocal tract. The length of a person’s vocal tract is generally proportional to his or her height. This is how we can “reverse engineer” your height from your voiceprint.
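
To make the link between voice and vocal tract length concrete, here is a minimal Python sketch. It assumes a textbook simplification that is not described on this page: the vocal tract is treated as a uniform tube closed at one end, whose n-th resonance (formant) sits at (2n - 1) * c / (4 * L). The function names and the formant values are hypothetical and only illustrate the idea; this is not necessarily how #HowTall computes anything.

```python
# Assumption: uniform-tube (quarter-wavelength resonator) model of the vocal tract.
SPEED_OF_SOUND_CM_S = 35000.0  # approximate speed of sound in warm, humid air


def tract_length_from_formant(formant_hz, n):
    """Length in cm of a uniform closed-open tube whose n-th resonance is formant_hz."""
    return (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * formant_hz)


def estimate_tract_length(formants_hz):
    """Average the length implied by each formant (first formant is n = 1)."""
    estimates = [
        tract_length_from_formant(f, n)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


# Made-up formant values for a neutral vowel, for illustration only.
length_cm = estimate_tract_length([500.0, 1500.0, 2500.0])
print(length_cm)
```

A longer tube gives lower resonances, which is why lower formants hint at a longer vocal tract and, on average, a taller speaker.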

Training the system

We train our height estimator using a set of short recordings from a few hundred speakers. For each recording, a speech representation (a set of features) is extracted so that the computer can work with it. These features capture the shape of the speaker’s vocal tract and the stiffness of the speaker’s vocal cords. We group the speakers based on their heights. For each height range, the system learns a “model”, or general description, of the speech characteristics of all speakers in that range.
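
The training step can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual #HowTall implementation: the page does not say what kind of model is learned, so the sketch assumes a simple diagonal Gaussian per height range, 10 cm height bins, and feature vectors that have already been extracted. All names and numbers are hypothetical.

```python
from collections import defaultdict


def height_group(height_cm, bin_cm=10):
    """Map a height to a range label such as '160-169 cm' (bin width is an assumption)."""
    low = int(height_cm // bin_cm) * bin_cm
    return f"{low}-{low + bin_cm - 1} cm"


def fit_diagonal_gaussian(vectors, var_floor=1e-3):
    """Per-dimension mean and variance of the feature vectors (assumed model type)."""
    n = len(vectors)
    dims = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    var = [
        max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, var_floor)
        for d in range(dims)
    ]
    return {"mean": mean, "var": var}


def train(recordings):
    """Group (features, height_cm) pairs by height range and fit one model per range."""
    grouped = defaultdict(list)
    for features, height_cm in recordings:
        grouped[height_group(height_cm)].append(features)
    return {group: fit_diagonal_gaussian(vecs) for group, vecs in grouped.items()}


# Toy data: two-dimensional feature vectors with made-up values.
toy_recordings = [
    ([1.0, 2.0], 162), ([1.2, 2.2], 168),
    ([2.0, 3.0], 181), ([2.4, 3.4], 185),
]
models = train(toy_recordings)
print(sorted(models))
```

Note that a height range with no recordings gets no model at all, which is why coverage of the training data matters.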

Predicting height

Once the models for each height range are learned, the system is ready to predict a speaker’s height from his or her voice. The features of the voice are extracted, and the machine evaluates how likely it is that those features belong to each model. The height group that gives the highest likelihood value is the final prediction.
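
The prediction step described above amounts to a maximum-likelihood choice among the height groups. Here is a minimal sketch, again assuming diagonal Gaussian models (the page does not name the model type) and using hypothetical group labels, parameters, and feature values.

```python
import math


def log_likelihood(features, model):
    """Log-likelihood of a feature vector under a diagonal Gaussian (assumed model)."""
    total = 0.0
    for x, mean, var in zip(features, model["mean"], model["var"]):
        total += -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return total


def predict_group(features, models):
    """Return the height group whose model gives the highest likelihood."""
    return max(models, key=lambda group: log_likelihood(features, models[group]))


# Hypothetical, already-trained models for two height ranges.
models = {
    "160-169 cm": {"mean": [1.1, 2.1], "var": [0.01, 0.01]},
    "180-189 cm": {"mean": [2.2, 3.2], "var": [0.04, 0.04]},
}
print(predict_group([1.15, 2.0], models))
```

Working with log-likelihoods rather than raw likelihoods avoids numerical underflow when the feature vectors have many dimensions; the winning group is the same either way.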

How It Works: Age estimator and gender predictor

An age estimator and a gender predictor can be trained in the same fashion. Age ranges are the groups for the age estimator, while ‘male’ and ‘female’ are the two groups for the gender predictor.

How we can improve the system

Our current system was developed from only a few minutes of recordings, and they don’t cover some height or age ranges. It is difficult to learn a model for a range when no data is available for it. The more training data we have, the more accurate the predictions become. If you’d like to help us improve the system, please send us an email at contact @ Oben . me and we will send you a set of sentences to read. Currently, our algorithm works best for people aged 20 and over.

You can also sign up for updates here: http://eepurl.com/bwO3mL
