Hey everyone! I'm working on a project where I need to build a sound event detector that runs efficiently on smartphones. The goal is to identify a specific 1-second jingle that acts as a wake signal, similar to how "OK Google" or "Hey Siri" works, except that the trigger is a musical cue rather than a spoken command. When the app detects this jingle, it should switch the microphone into "active mode". It's crucial that the system recognizes only this exact jingle and isn't fooled by similar sounds or imitations. I've looked into sound event detection, but I would love your insights on the best models or techniques for implementing this on mobile platforms. Any suggestions? Thanks in advance!
1 Answer
For the app to activate when it detects your jingle, it will most likely need to run in the foreground with microphone permission. It’s doable with robust audio processing, but keep in mind that the app itself has to analyze the audio stream continuously rather than just listen passively. Most traditional wake word systems process sound on dedicated low-power hardware, whereas your app will need to work more like Shazam, processing audio continuously in the foreground. One of the main challenges is detecting the jingle reliably when other environmental sounds interfere, which can be tricky.
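To make the "process audio continuously" idea a bit more concrete, here is a minimal sketch of one simple option: sliding a stored copy of the jingle over each incoming audio buffer and scoring the match with normalized cross-correlation. This is only an illustration, not a recommendation of the best model. The function names `ncc_peak` and `is_jingle` and the 0.7 threshold are my own assumptions, and it assumes mono float samples recorded at the same sample rate as the reference jingle.

```python
import numpy as np

def ncc_peak(buffer: np.ndarray, template: np.ndarray) -> float:
    """Best normalized cross-correlation score of template within buffer.

    Scores are roughly in [-1, 1]; values near 1 mean a close match.
    """
    n = len(template)
    t = template - template.mean()          # zero-mean reference jingle
    t_norm = np.linalg.norm(t)
    if len(buffer) < n or t_norm == 0.0:
        return 0.0

    # Raw correlation at every lag where the template fits entirely.
    raw = np.correlate(buffer, t, mode="valid")

    # Mean-removed energy of each buffer window, via cumulative sums:
    # sum((x - mean)^2) = sum(x^2) - (sum(x))^2 / n
    csum = np.cumsum(np.concatenate(([0.0], buffer)))
    csum_sq = np.cumsum(np.concatenate(([0.0], buffer ** 2)))
    win_sum = csum[n:] - csum[:-n]
    win_sq = csum_sq[n:] - csum_sq[:-n]
    win_energy = np.maximum(win_sq - win_sum ** 2 / n, 1e-12)

    scores = raw / (np.sqrt(win_energy) * t_norm)
    return float(scores.max())

def is_jingle(buffer: np.ndarray, template: np.ndarray,
              threshold: float = 0.7) -> bool:
    """Hypothetical decision rule: the threshold is an assumed value to tune."""
    return ncc_peak(buffer, template) >= threshold
```

In an app you would call `is_jingle` on overlapping buffers of about two seconds so the 1-second jingle always fits inside at least one buffer. Be aware that plain time-domain correlation tends to degrade with reverberation and phone speaker or microphone coloration, so a spectral fingerprint or a small trained classifier may hold up better in practice; treat this as a baseline to test against.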
Yeah, I can see how it must be difficult with background noise. Have you explored using audio markers or specific sound profiles to help with detection?