December 08, 2006

wextract

It's sometimes useful to be able to extract a bunch of little audio files from one big audio file in .wav format. You know the time boundaries of the pieces that you want, based on a time-aligned transcription or other annotation. You don't want to do all the extractions by hand using an interactive waveform editor, and you'd like a lightweight, scriptable program that can do it for you.

After looking around for a while, I couldn't find any suitable free software solution to this problem. (I was especially disappointed that sox doesn't have the ability to extract a piece of a file.) I used to deal with this problem in three steps: using sox to make a copy of the file with no header, then using a generic little seek-read-write program to extract the desired chunk of bytes (whose size and location depend on the sampling frequency, sample type and number of channels), and then using sox to put a .wav header back on. But it's annoying to have to produce the headerless copy -- especially if you're dealing with a 600-MB file. And if you give the wrong sampling frequency or channel count, the byte offsets are wrong and the extracted piece comes out wrong too.

So I decided to make a generic little seek-read-write program that understands simple .wav headers. If you need such a thing too, here's wextract.c. If you don't want or need it, ignore this post.
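For readers curious about the arithmetic, here is a minimal sketch of how a time span maps to a byte range once the header has been read. This is not the code in wextract.c; the names wav_info, parse_wav_header and span_to_bytes are made up for illustration, and the sketch assumes the simplest possible layout: a canonical 44-byte header on an uncompressed PCM file, with no extra chunks before the data.

```c
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_BYTES 44  /* assumption: canonical header, no extra chunks */

/* The few header fields needed to turn times into byte positions. */
typedef struct {
    uint16_t channels;
    uint32_t sample_rate;      /* frames per second */
    uint16_t bits_per_sample;
    uint32_t data_bytes;       /* size of the audio data after the header */
} wav_info;

/* .wav fields are little-endian; read them byte by byte so the
   result does not depend on the host's byte order. */
static uint16_t le16(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse a canonical header. Returns 0 on success, -1 if the layout
   is anything other than the simple case assumed here. */
static int parse_wav_header(const unsigned char hdr[WAV_HEADER_BYTES],
                            wav_info *out) {
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0)
        return -1;
    if (memcmp(hdr + 12, "fmt ", 4) != 0 || memcmp(hdr + 36, "data", 4) != 0)
        return -1;
    if (le16(hdr + 20) != 1)   /* format tag 1 = uncompressed PCM */
        return -1;
    out->channels        = le16(hdr + 22);
    out->sample_rate     = le32(hdr + 24);
    out->bits_per_sample = le16(hdr + 34);
    out->data_bytes      = le32(hdr + 40);
    return 0;
}

/* Convert a start time and duration (in seconds) into a byte offset
   from the start of the file and a byte length. The length is clipped
   at the end of the data. Returns 0 on success, -1 on a bad request. */
static int span_to_bytes(const wav_info *w, double start_s, double dur_s,
                         uint32_t *offset, uint32_t *length) {
    uint32_t frame_bytes = (uint32_t)w->channels * (w->bits_per_sample / 8u);
    if (frame_bytes == 0 || start_s < 0.0 || dur_s < 0.0)
        return -1;

    /* Round to whole frames so a piece never starts mid-sample. */
    uint32_t start_frame = (uint32_t)(start_s * w->sample_rate + 0.5);
    uint32_t n_frames    = (uint32_t)(dur_s * w->sample_rate + 0.5);

    uint32_t start_bytes = start_frame * frame_bytes;
    uint32_t want_bytes  = n_frames * frame_bytes;
    if (start_bytes >= w->data_bytes)
        return -1;             /* the span starts past the end of the audio */
    if (want_bytes > w->data_bytes - start_bytes)
        want_bytes = w->data_bytes - start_bytes;

    *offset = WAV_HEADER_BYTES + start_bytes;
    *length = want_bytes;
    return 0;
}
```

With an offset and length in hand, the rest is the generic part: seek to the offset, copy that many bytes, and write a fresh header in front of them with the data size updated. Reading the sampling frequency and channel count from the header, rather than taking them from the command line, is what removes the chance of typing the wrong ones.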

[Update -- boy, is my face red! Keenan Pepper points out that among its umpteen options, sox actually does have a function "trim" which does exactly what I wanted. In my defense, I can only say that in addition to misreading the documentation, I asked a couple of other sox users, who also failed to find this function... Anyhow, I learned how .wav headers work, said he lamely.]

[Meanwhile, Bill Poser was seized by a fit of hackery, and modified wextract to use libsndfile, which I had considered and rejected as too complicated. He even packed it up using GNU autoconf and all. If you want to see how to use libsndfile, or to use it for some other purpose, the tarball is here. Warning: you'll have to install libsndfile first -- and you'll also have to execute ldconfig as root, since the libsndfile "make install" command unaccountably fails to do that. If none of that makes sense to you, consider yourself lucky and move on.]

Posted by Mark Liberman at December 8, 2006 09:15 AM