Generate unique drum samples using artificial intelligence (audialab.com)
81 points by belter on Feb 11, 2023 | 26 comments


Different idea, but I really want someone to make a good drummer AI. Specifically, it would listen to a key component of your song (probably bass or guitar) and generate a wide array of beats (in MIDI format) to match the music. You’d provide input regarding style and intensity, and it’d be available as a plug-in for Logic, etc.

I am fully aware of Apple’s drummers in GarageBand and Logic, but they’re weak in my opinion. I’m looking for a quantum leap forward in the same way ChatGPT makes Siri look like a toy.


Check out "Band-in-a-Box". Some of the real band drummer samples are fantastic and adapt in a pretty harmonically interesting way to the chord progression you've set up.


Awesome, will do, thanks!


I’m one of the co-founders of Audialab, the company behind this tech. We packaged it up into a plug-in called “Emergent Drums”.

If you’re curious, our original generative models used GANs, and we’re incorporating diffusion approaches now.

Drums are the start - we’re currently training models for instruments, synths, vox, foley, etc.


As a musician, I'd be more interested in style transfer: you give it a snare (or a multitrack drum loop!), and you tell it to generate a related hi-hat (or a complete kit!).

There is no shortage of samples and sample packs (millions), and most pros are picky: context is king, and style transfer is more contextual.

For instruments/synths/vox, "playability" is important, so the best approach IMO is cloning a sample into a playable MIDI instrument, like midi-ddsp, midi2params or Mawf.


Have you tried out Logic’s drummers? They aren’t style transfer per se, but they are AI that drums in a particular style, and you can control how it works.


I've only seen it on YouTube. Yes, this is still generation, but I like the level of control. If Emergent Drums has this degree of parametric adjustment (I did not test it), it could be useful (to me).


It's unusual to use the term "royalty-free" for a music-creation tool. That term is usually used for purchases of sample libraries, where the seller retains the copyright. It's a given that you own the things you create with a tool, so saying they're royalty-free is superfluous.

I couldn't find a license or terms of service on the website, so I must ask: who owns the copyright on the generated samples?


This is all so new the law hasn't caught up yet. No one knows who (if anyone) will end up with copyright on AI generated art.

Perhaps "royalty-free" are the terms they're offering if the law does eventually allow an AI (or its creators) to own a copyright.


I see where you're going, but I don't see the legal issue as relevant to my question. The answer I was hoping for was that the company makes no claim toward any rights that may exist in the work the tool generates. That way, regardless of how the law treats generated art, the company won't end up with rights in stuff that their customers made. Compare "royalty-free," which implies the company retains everything except a claim for royalties.

It's too bad the cofounder hasn't answered.


Nice work. This is a far more tractable problem than other generative sound projects of late. Creating one individual sample still keeps the human in the driver’s seat.

In a similar vein, it would be very cool to use AI with synthesizers. Text to synth, but use an actual synthesizer as an intermediary step. Don’t just create a sound from nothing, tune the knobs in the right ways. Start out with subtractive synthesis and additive synthesis.
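To make the "tune the knobs" idea concrete, here is a toy sketch of fitting one synth parameter to a target sound. Everything in it is an assumption made for illustration: the one-oscillator subtractive voice (`render`), the crude `brightness` measure, and the brute-force search in `fit_cutoff` are hypothetical stand-ins, not how any real text-to-synth system works. A real system would need a text-to-target step and far more knobs.

```python
import math

SAMPLE_RATE = 8000  # Hz; kept low so the toy runs quickly (assumption)

def render(cutoff_hz, n=2000, freq_hz=110.0):
    """Toy subtractive voice: naive sawtooth into a one-pole low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / SAMPLE_RATE)
    out, y = [], 0.0
    for i in range(n):
        phase = (i * freq_hz / SAMPLE_RATE) % 1.0
        saw = 2.0 * phase - 1.0      # raw oscillator, range [-1, 1)
        y += alpha * (saw - y)       # low-pass: the "knob" is the cutoff
        out.append(y)
    return out

def brightness(signal):
    """Crude brightness proxy: total sample-to-sample change over total level."""
    change = sum(abs(b - a) for a, b in zip(signal, signal[1:]))
    level = sum(abs(s) for s in signal) or 1.0
    return change / level

def fit_cutoff(target, candidates):
    """Pick the candidate cutoff whose rendered sound best matches the target."""
    goal = brightness(target)
    return min(
        candidates,
        key=lambda c: abs(brightness(render(c, n=len(target))) - goal),
    )

# Usage: pretend the target came from elsewhere, then recover the knob setting.
target = render(1200.0)
best = fit_cutoff(target, range(200, 4000, 100))
print("best cutoff (Hz):", best)
```

The point of the sketch is that the output is a knob setting on a real synthesis chain rather than raw audio, so the result stays editable by hand.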


I feel like this poses an interesting problem: if this AI, without any input samples, generates an output that is indistinguishable from a known sample, can you still say it's royalty free? I'm not really sure what the legal status quo is here in other related fields with similar issues.


If you ask Stable Diffusion or Midjourney to generate a recent Disney character image, it’s not royalty-free.

For a drum sample, I guess some are so generic and simple that no one can claim ownership. Still, if you manage to reproduce a particular sample, perhaps because of an overfitted model, then you may have some issues. However, the music industry seems to be fine with sampling.


> However, the music industry seems to be fine with sampling.

Yes, as long as the sampled artists are paid and credited.


It's not an interesting problem. We in the US already have a legal definition of similarity for the purposes of copyright law, and I assume other countries do as well.


I would be more interested if AI could interpolate drum breaks and create new ones for me.

I produce hip-hop, and those 909-sounding drums seemed a bit bland to me.


This would be golden. Imagine being able to text prompt for breaks, for example: A "Think" break, but with bongos and using the "Apache" pattern and no reverb.


I tend to find repetitive sound effects in games immersion-breaking when an action is repeated (a known concept, I think). I've wondered why procedural generation (even something as rudimentary as a few dozen random envelopes, pitch shifts, or crossfades applied ahead of time to recorded samples) isn't used more often to mitigate that.


Yes, game devs in the know will have a bank of random samples for, e.g., a footstep, and will then also tweak volume and pitch to increase variability.
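A minimal sketch of that approach, assuming samples are plain lists of floats in [-1, 1]. The names `resample` and `play_variation` and the default ranges are made up for illustration, and the pitch shift is a naive resample (it also changes duration), not what any particular audio engine does.

```python
import random

def resample(samples, pitch_ratio):
    """Naive pitch shift via linear-interpolation resampling.

    A ratio above 1.0 raises the pitch and shortens the sound.
    """
    if not samples:
        return []
    out_len = max(1, int(len(samples) / pitch_ratio))
    out = []
    for i in range(out_len):
        pos = i * pitch_ratio
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out

def play_variation(bank, rng, pitch_range=0.06, gain_range=0.2):
    """Pick a random sample from the bank, then jitter its pitch and volume."""
    base = rng.choice(bank)
    pitch_ratio = 1.0 + rng.uniform(-pitch_range, pitch_range)
    gain = 1.0 - rng.uniform(0.0, gain_range)  # only attenuate, to avoid clipping
    return [s * gain for s in resample(base, pitch_ratio)]

# Usage: two placeholder "footstep" recordings, ten slightly different steps.
rng = random.Random(42)
footsteps = [[0.0, 1.0, 0.5, -0.5, 0.0] * 20, [0.0, -1.0, 0.25, 0.0] * 25]
steps = [play_variation(footsteps, rng) for _ in range(10)]
```

Keeping the ranges small (a few percent of pitch, a fraction of the gain) is what keeps every variation plausible as the same footstep.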


Though I'm sure some games actually do this, to a lesser or greater extent [0], I've wondered why it's not more common.

I think the answer comes down to the fact that it would take, at minimum, one highly skilled audio engineer with specific audio programming skills to develop reliable variations that sound good. The sounds need to be realistic and appropriate literally every time, despite also being "random". This process would be deceptively hard, and take a significant amount of time.

How great or necessary is the payoff in most cases? For example, Dota 2 is widely regarded as having superb audio engineering: experienced players can tell by ear alone what's going on, even in frantic ten-player battles involving 100+ different heroes and hundreds of abilities. Dota doesn't use any procedural generation at all, and I've never heard anyone complain despite the thousands of hours they put in.

[0] - https://splice.com/blog/procedural-audio-video-games/


Some games certainly do that and most audio engines will allow the sound designers to add random variations, along with randomly selecting between different samples.

There is only so much you can do with pitch changes and creative use of filters, and having multiple samples for each sound can burn memory (and sound designer hours) fairly quickly. As you point out there is a wide variance in the technical skills of sound designers (some love playing with the game engine side, some are better at making sounds and throwing them over the wall).


There's almost no info on the website, though. An interactive demo would be great, or failing that, some audio examples of its output or a video of someone using it would help.



Reminds me of GUK.ai, which uses AI to generate generic samples.

Interesting that Audialab is focused on drums only.


Reminds me of Mozart using dice rolls to compose his waltzes :)


https://youtu.be/n4WpyNMfQ2Q

No need to roll dice when you can generate everything. All one needs is taste that is compatible with niche audiences.



