Talking to Your Dog with Ember

A presentation at EmberConf 2020 in March 2020 by Robert Wagner

Slide 1

Talking to Your Dog With Ember

Slide 2

Who Am I?

CEO of Ship Shape

• Ember.js Learning Core Team
• ember-math-helpers, ember-shepherd, Shepherd.js
• Love all dogs and have a Frenchie named Odie (Instagram: @odielafrenchie)

robbie@shipshape.io | https://shipshape.io
TWITTER: @RWWAGNER90 GITHUB: @RWWAGNER90 INSTAGRAM: @ODIELAFRENCHIE

Slide 3

Motivation

Odie is constantly barking at something. He will bark playfully at the fireplace tools when he wants to be chased around, ferociously at anyone entering the house, and incessantly when squeaking his ball, as if he is trying to talk to it. Understanding why he barks the way he does was the major motivation behind this talk.

Slide 4

Demo

Slide 5

The Idea

Chuck Carpenter, Ship Shape’s COO, and I were working one day when the EmberConf CFP came out. We were brainstorming ideas and Odie kept barking and running around. We thought “it would be nice to know what Odie is thinking and why he is barking”, and the idea to create an app to decode dog barks was born.

Slide 6

The Process

Web Audio API
The Web Audio API provides methods for working with audio and video in JavaScript.

Determining Dog Bark Types
Studies have shown that both dogs and humans can tell the difference between lonely, happy, and aggressive barks, and dogs have been shown to react differently to the barks of dogs they are familiar with.

Deciphering Audio Data
Since there is a scientifically proven difference between types of dog barks, there must be some way we can use the Web Audio API to try to decode them.

Slide 7

What is Web Audio API? ¯_(ツ)_/¯

Slide 8

When In Doubt, Google It

• Several real-time analysis examples
• Docs from Mozilla
• Stack Overflow posts

Slide 9

Web Audio API

The Web Audio API provides an AnalyserNode, which has several helpful methods for analyzing audio and getting frequency and waveform data from it.

fftSize and frequencyBinCount
fftSize represents the window size in samples that is used when performing a Fast Fourier Transform, and frequencyBinCount is always 1/2 of fftSize.

getByteFrequencyData()
Copies the current frequency data into a Uint8Array. The frequency data is composed of integers on a scale from 0 to 255.

getByteTimeDomainData()
Copies the current waveform or time-domain data into a Uint8Array. The data is composed of integers 0-255 which map from -1 to +1, so 128 is 0.
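
The byte scales described on this slide can be sketched as a few small helpers. This is only an illustration, not code from the talk: the function names are hypothetical, and the bin-to-frequency formula (bin index times sample rate divided by fftSize) is the usual FFT relationship rather than something stated on the slide.

```typescript
// Hypothetical helpers for illustration; none of these names come from the talk.

/** Map one getByteTimeDomainData() value (0-255) onto roughly -1..+1, so 128 becomes 0. */
function byteToAmplitude(value: number): number {
  return (value - 128) / 128;
}

/** frequencyBinCount is always half of fftSize. */
function frequencyBinCount(fftSize: number): number {
  return fftSize / 2;
}

/**
 * Approximate starting frequency in Hz of one bin from getByteFrequencyData().
 * Assumption: bins are evenly spaced from 0 Hz up to half the sample rate.
 */
function binToFrequency(binIndex: number, sampleRate: number, fftSize: number): number {
  return (binIndex * sampleRate) / fftSize;
}
```

With an fftSize of 2048 there would be 1024 bins, and at a 48000 Hz sample rate each bin would span about 23.4 Hz.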

Slide 10

First Attempt

01. createMediaElementSource
Initially, I was trying to use a media element as a source, which worked, and was intuitive, but it only provided the playing audio, not the full buffer.

02. Frequency
I was incorrectly getting only the frequency data, which does not have a time component, so to get the data over time we had to switch to getByteTimeDomainData.

03. Small Data Snippets
The analyser methods only provide frequency and waveform data for the small section of audio they are called on, but we need data for the whole file.

04. How To Use The Data?
Once we do have the data, what does it mean and how do we use it?

Slide 11

Second Attempt

01. OfflineAudioContext
OfflineAudioContext allows us to use an entire audio file loaded into a buffer, rather than only analyzing the part of audio that is currently playing.

02. getByteTimeDomainData
We’re using a script processor to run getByteTimeDomainData onaudioprocess. This allows us to get time domain data for the whole file.

03. Upload Files / Use Microphone
Leveraging ember-file-upload, we can support file uploads with ease, and we can use getUserMedia to access microphone data.

04. Visualizations
Heavily borrowing from Visualizing Audio #3 Time Domain Summary, I was able to add a visualization of the audio data, so we can see where it spikes.
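
One possible shape for the time domain summary behind the visualization is sketched below. It assumes the whole-file samples are already available as a Float32Array of values between -1 and +1 (for example from an AudioBuffer channel); the function and type names are hypothetical, and the min/max-per-column approach is an assumption about the technique, not the app's actual code.

```typescript
// Hypothetical sketch: reduce a whole file's samples to one min/max pair per drawn column.

interface ColumnSummary {
  min: number;
  max: number;
}

/**
 * Split the samples into equally sized buckets and keep each bucket's extremes.
 * Returns at most `columns` entries (fewer when the samples do not divide evenly).
 */
function summarizeTimeDomain(samples: Float32Array, columns: number): ColumnSummary[] {
  const summary: ColumnSummary[] = [];
  if (columns <= 0 || samples.length === 0) {
    return summary;
  }
  const bucketSize = Math.ceil(samples.length / columns);
  for (let start = 0; start < samples.length; start += bucketSize) {
    const end = Math.min(start + bucketSize, samples.length);
    let min = Infinity;
    let max = -Infinity;
    for (let i = start; i < end; i++) {
      const value = samples[i];
      if (value < min) min = value;
      if (value > max) max = value;
    }
    summary.push({ min, max });
  }
  return summary;
}
```

Each entry could then be drawn as one vertical line from min to max, which makes loud sections stand out as tall columns.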

Slide 12

Dog Bark Science

Disclaimer: I am not an expert in dog barks. This project draws on several different scientific studies and reflects my personal interpretation of their results. It can certainly be improved, and may not be 100% accurate.

Slide 13

Dog Bark Science

Studies have shown that dog barks at most shelters fall somewhere in the ~250-4000 Hz range. All breeds seem to have some component of their bark registering in the ~1000-2000 Hz range, and barks register between 80-90 decibels from a distance of 5 meters.

Slide 14

Bark Types

There are seemingly infinite combinations of potential types of dog barks. You could have any pitch, duration, and number of total barks, as well as subtly different inflections on each, so how do we determine meaning from such a huge set of possibilities?

Alert
Rapid barking at a mid-range pitch can signal to the pack that there is a problem or something to investigate.

Greeting/Playful
Stutter barks or rising pitch barks can signal a dog is happy and wanting to play. A single bark can be a greeting.

Distress
Barking with long periods of time between each utterance can mean “Is anyone there? I’m lonely and in need of companionship”.
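
The descriptions above could be turned into a rough rule-based classifier along these lines. Everything specific here is an assumption made for illustration: the type and function names, the gap thresholds (0.5 and 2 seconds), and the order in which the rules are checked are hypothetical and are not taken from the talk or the studies it cites.

```typescript
// Hypothetical rule-of-thumb classifier; thresholds and rule order are illustrative guesses.

type Pitch = "low" | "mid" | "high";
type BarkType = "alert" | "greeting" | "playful" | "distress" | "unknown";

interface Bark {
  time: number; // seconds from the start of the recording
  pitch: Pitch;
}

const PITCH_RANK: Record<Pitch, number> = { low: 0, mid: 1, high: 2 };

function classifyBarks(barks: Bark[], rapidGap = 0.5, longGap = 2): BarkType {
  if (barks.length === 0) return "unknown";
  // A single bark can be a greeting.
  if (barks.length === 1) return "greeting";

  let totalGap = 0;
  let rises = 0;
  let falls = 0;
  for (let i = 1; i < barks.length; i++) {
    totalGap += barks[i].time - barks[i - 1].time;
    const delta = PITCH_RANK[barks[i].pitch] - PITCH_RANK[barks[i - 1].pitch];
    if (delta > 0) rises++;
    if (delta < 0) falls++;
  }
  const averageGap = totalGap / (barks.length - 1);

  // Long periods of time between utterances: distress.
  if (averageGap >= longGap) return "distress";
  // Pitch that only ever rises: playful.
  if (rises > 0 && falls === 0) return "playful";
  // Rapid barking, mostly at a mid-range pitch: alert.
  const midCount = barks.filter((bark) => bark.pitch === "mid").length;
  if (averageGap <= rapidGap && midCount * 2 > barks.length) return "alert";
  return "unknown";
}
```

For example, three mid-pitch barks 0.3 seconds apart would come out as "alert" under these assumed thresholds, while the same barks spaced 3 seconds apart would come out as "distress".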

Slide 15

Mapping Data To Bark Types

Let’s apply what we learned about the different types of dog barks to what we learned about the science behind each bark type, in order to identify the type from the audio data.

Limiting Frequency
Since dog barks only make sounds in the ~250-4000 Hz range, we can throw out data > 4000 Hz.

Finding dB Spikes
We can find spikes in the waveform data above 0.55 and assume the barks occurred there.

Modes
We can take the mode of the frequency range per bark occurrence to determine the pitch of that bark.
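
The three steps on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's implementation: the function names and the minGap parameter are hypothetical, the 0.55 threshold is assumed to apply to the absolute amplitude on the -1..+1 scale, and frequency bins are assumed to be evenly spaced at sampleRate / fftSize.

```typescript
// Hypothetical sketch of the three steps; names and the minGap parameter are assumptions.

/** Highest frequency-bin index worth keeping when data above maxHz is thrown out. */
function maxBinForFrequency(maxHz: number, sampleRate: number, fftSize: number): number {
  return Math.floor((maxHz * fftSize) / sampleRate);
}

/**
 * Return the sample indices where a spike starts. A sample is "loud" when its
 * absolute amplitude exceeds the threshold; a new spike is recorded only when
 * more than `minGap` quiet samples separate it from the previous loud sample.
 */
function findSpikes(samples: ArrayLike<number>, threshold = 0.55, minGap = 0): number[] {
  const spikes: number[] = [];
  let lastLoud = -Infinity;
  for (let i = 0; i < samples.length; i++) {
    if (Math.abs(samples[i]) > threshold) {
      if (i - lastLoud - 1 > minGap) {
        spikes.push(i);
      }
      lastLoud = i;
    }
  }
  return spikes;
}

/** Most frequent label in a list, e.g. the dominant pitch range during one bark. */
function mode(values: string[]): string | undefined {
  const counts: Record<string, number> = Object.create(null);
  let best: string | undefined;
  let bestCount = 0;
  for (const value of values) {
    const count = (counts[value] || 0) + 1;
    counts[value] = count;
    if (count > bestCount) {
      best = value;
      bestCount = count;
    }
  }
  return best;
}
```

Because a waveform swings above and below the threshold many times within one bark, a larger minGap would likely be needed on real recordings so that a single bark is not counted several times.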

Slide 16

Integrating With Ember

Ember helps us to take our vanilla JS implementation to the next level. We gain the flexibility to structure our code into components, the power of Glimmer, and a huge set of available addons, which allow us to bolt on PWA functionality with ease.

ember-service-worker
Service workers allow our app to work offline, so it can continue to be used when connection is spotty, and ember-service-worker makes installing them a breeze.

ember-web-app
ember-web-app creates a manifest that allows us to specify the name, description, icon, etc. for the app and makes it installable on various devices.

Opinionated
Ember’s strong conventions help us to structure our app cleanly, quickly install packages as addons, and leverage the power of Glimmer components.

Slide 17

Future Work

Add More Bark Types
We only have four rough categories currently, but there are 10 basic types at a minimum, as well as other nuanced bark types, which could be supported for more exact output. The one I particularly want to support is the lonely bark, to identify episodes of separation anxiety.

Refine Frequency Ranges
Our current low, mid, and high frequency calculation is a linear distribution of the three, but we should refine it to have more buckets, like mid-high, mid-low, etc. for better results.

Add Talk Back Feature
After fully mapping out the bark types and frequency ranges, it should also be possible to add a “talk back” feature, which could take barking audio samples and tweak them to respond to your dog.
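
A linear split of the bark frequency range into labeled buckets might look like the sketch below. It assumes the buckets evenly divide the ~250-4000 Hz range mentioned earlier in the talk; the function name and the exact boundaries are hypothetical. Taking the labels as a parameter means that going from three buckets to five is only a change of input.

```typescript
// Hypothetical sketch: evenly divide minHz..maxHz into one bucket per label.

function frequencyBucket(
  hz: number,
  labels: string[],
  minHz = 250,
  maxHz = 4000
): string | undefined {
  // Frequencies outside the assumed bark range get no label.
  if (labels.length === 0 || hz < minHz || hz > maxHz) {
    return undefined;
  }
  const width = (maxHz - minHz) / labels.length;
  // Clamp so that exactly maxHz falls into the last bucket.
  const index = Math.min(Math.floor((hz - minHz) / width), labels.length - 1);
  return labels[index];
}
```

For example, with the labels low, mid, and high each bucket is 1250 Hz wide, while adding mid-low and mid-high narrows each bucket to 750 Hz.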

Slide 18

Try Wüf Now

Visit https://wuf.plus to try it now!

Slide 19

Sources

• https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API
• https://stackoverflow.com/questions/24083349/understanding-getbytetimedomaindata-and-getbytefrequencydata-in-web-audio
• https://www.petsafe.net/learn/10-translated-barks-know-what-your-dog-is-saying
• https://www.sciencemag.org/news/2014/08/dogs-glean-information-each-others-barks
• http://apprentice.craic.com/tutorials/32

Slide 20

Thank you for your attention. https://shipshape.io/ / E-mail: robbie@shipshape.io