Super vague AI question for MIR 

Super vague AI question:

I have a pile of wav files and I would like some AI to come in and do an unsupervised analysis of them: it comes up with some ways in which they're distinct from each other, then gives each file a score of 1-5 for how close it is to each pole of whatever it decides makes them distinct. So it decides on some pronounced x-factor and then scores the files accordingly.

My questions are:
1. What's the terminology to describe this process?

2. Is there code lurking around (preferably in python or SuperCollider) that can do the machine learning/analysis/(see question 1)?

3. Are the results of it likely to be something that humans can perceive? (Or is there a way to encourage that?)


@celesteh
Background: The better an algorithm is at solving certain problems, the worse it is at the rest of the possible problems (the No Free Lunch theorems).

And for an algorithm to treat the data as sound or music, you need to tell it that it's music, and it needs to understand what to do with that. For example, you could submit the WAV file to an image algorithm and get interesting results, but, to your question 3, not necessarily something a human would perceive as similarity.

@celesteh
#1: You're looking for a distance or similarity metric that works for human perception of sound or music. If you use a model that's already trained, it may have been trained with supervision, but since you don't train it yourself, the process is unsupervised from your perspective. You can then organize the files and recompute the distances between them with a higher-dimensional model or with k-means clustering as suggested (both are forms of unsupervised fitting of a model to your data).
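
To make the clustering idea concrete, here is a minimal sketch in plain Python, not a recommendation of any particular tool. It assumes each file has already been reduced to a small feature vector; the numbers in `features` are made up for illustration, and `two_means` and `pole_scores` are hypothetical helper names. It runs k-means with k=2 to find two "poles", then scores each file from 1 to 5 by where it falls along the line between them.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def two_means(features, iterations=20):
    """k-means with k=2. Deterministic start: the first point and the
    point farthest from it serve as the initial centroids."""
    c0 = features[0]
    c1 = max(features, key=lambda f: distance(f, c0))
    for _ in range(iterations):
        group0 = [f for f in features if distance(f, c0) <= distance(f, c1)]
        group1 = [f for f in features if distance(f, c0) > distance(f, c1)]
        if not group0 or not group1:
            break  # degenerate data, keep the current centroids
        c0, c1 = mean(group0), mean(group1)
    return c0, c1

def pole_scores(features):
    """Score each vector 1-5 by its position along the axis that joins
    the two cluster centroids (1 = nearest one pole, 5 = the other)."""
    c0, c1 = two_means(features)
    axis = [b - a for a, b in zip(c0, c1)]
    proj = [sum((x - a) * d for x, a, d in zip(f, c0, axis)) for f in features]
    lo, hi = min(proj), max(proj)
    if hi == lo:
        return [3] * len(features)  # no spread along the axis
    return [1 + round(4 * (p - lo) / (hi - lo)) for p in proj]

# Hypothetical, already-normalized features for five files (invented values).
features = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0], [0.5, 0.4]]
scores = pole_scores(features)
print(scores)
```

Whether the resulting axis means anything to a listener depends entirely on the features that go in, which is the point of #3. In practice the features would come from an audio analysis library rather than being typed in by hand.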

@celesteh
#2: IDK what's out there, sorry!
#3, expanding on #1: this is why you need something that models human perception of sound or music. Consider a fugue. A human can recognize it as many layers and repetitions of the same theme at different tempos, pitches, etc. However, pitch and tempo are both obscured in the raw samples of a WAV file. So the result depends on what the algorithm looks for. Hence a music algorithm beats an image algorithm here. And I'd bet a temporal analysis would fall somewhere in between.
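
A small sketch of what "obscured" means here, using only the Python standard library. The sample rate, tone length, and the helpers `tone` and `peak_frequency` are assumptions chosen for illustration. Two synthetic tones share a pitch but differ in phase: compared sample by sample they look very different, while a plain discrete Fourier transform shows they share the same dominant frequency.

```python
import cmath
import math

SAMPLE_RATE = 4000  # Hz, assumed for this illustration
N = 400             # 0.1 s of audio, so each DFT bin is 10 Hz wide

def tone(freq_hz, phase=0.0):
    """Synthetic sine tone as a list of raw samples."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE + phase)
            for n in range(N)]

def peak_frequency(samples):
    """Frequency (Hz) of the strongest bin of a naive O(N^2) DFT."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * SAMPLE_RATE / n

a = tone(440)                 # A4
b = tone(440, phase=math.pi)  # same pitch, inverted waveform

# Mean absolute difference of the raw samples: large, despite equal pitch.
sample_gap = sum(abs(x - y) for x, y in zip(a, b)) / N

print(sample_gap)
print(peak_frequency(a), peak_frequency(b))
```

An algorithm that compares raw samples would call these two tones dissimilar, while one that looks at the spectrum would call them the same note. Real music is far messier than a sine wave, so this only illustrates the principle.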
