
Using the Speex codec in Java

JSpeex is a Java implementation of the Speex codec. Out of the box, Java Sound only gives you uncompressed PCM or the venerable G.711 encodings (A-law and µ-law). If you want to store a larger data set compactly, well, you’re out of luck. That was a problem for me, as I wanted to store a lot of spoken data in little space. Luckily, the Speex codec is built for exactly that, but getting it into my project wasn’t as easy as I had hoped.

Let’s get this clear first: jspeex is not an actively maintained project. The last code update I can find is from two years ago, so it is not too surprising that it didn’t work off the shelf and that the documentation was a bit off. Nevertheless, it works just fine for my purposes, so I’m writing this up in case anyone else needs it.

Adding the Dependency

jspeex is not hosted on Maven Central, so to use it as a Maven dependency you also have to add a third-party repository that hosts it. For copy-paste purposes, that boils down to the following additions to your pom.xml:

<repositories>
  <repository>
    <id>sonatype-snapshots</id>
    <name>sonatype-snapshots</name>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  </repository>
</repositories>
<dependencies>
  <dependency>
    <groupId>org.mobicents.external.jspeex</groupId>
    <artifactId>jspeex</artifactId>
    <version>0.9.7</version>
  </dependency>
</dependencies>

Recording and Playback

jspeex integrates into the Java Sound API as a service provider, so no configuration is needed to get it running. However, because of how audio format conversion works, you cannot record or play back Speex directly. The AudioSystem DataLines appear to need PCM and cannot accept Speex. Therefore, we need to jump through a small hoop to get recording and playback working.

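// jspeex classes live in org.xiph.speex.spi; the rest is javax.sound.sampled.
// Assumed to exist already: micStream (a PCM AudioInputStream from the microphone),
// sampleRate (its sample rate) and bos (the OutputStream to write to).

// Create target Speex AudioFormat. In this case, mono sound.
AudioFormat speexFormat = new AudioFormat(SpeexEncoding.SPEEX, sampleRate, -1, 1, -1, -1, false);
// Convert the PCM stream to Speex
AudioInputStream speexStream = AudioSystem.getAudioInputStream(speexFormat, micStream);
// Can now write it out like any other audio stream
AudioSystem.write(speexStream, SpeexFileFormatType.SPEEX, bos);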
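Since the provider is picked up without any configuration, it can be handy to check that it really was found. One way, as a minimal sketch using only the standard Java Sound API, is to ask AudioSystem which encodings it can convert plain PCM into. The class and method names here are made up for illustration. With jspeex on the classpath I would expect its Speex encodings to show up next to the built-in ones; without it, only the JDK's own encodings are listed.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of jspeex.
class ListTargetEncodings {

    /** Names of every encoding Java Sound can convert 16-bit mono PCM into. */
    static List<String> targetEncodings(float sampleRate) {
        // Signed, little-endian, 16-bit mono PCM: the same shape as the microphone format.
        AudioFormat pcm = new AudioFormat(sampleRate, 16, 1, true, false);
        List<String> names = new ArrayList<>();
        for (AudioFormat.Encoding encoding : AudioSystem.getTargetEncodings(pcm)) {
            names.add(encoding.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints whatever the installed conversion providers offer for 16 kHz PCM.
        System.out.println(targetEncodings(16000f));
    }
}
```

If nothing Speex-related appears in the output, the jar is most likely not on the runtime classpath.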

For recording, a simple conversion is all that is needed: create a Speex AudioFormat with the same sample rate as the microphone stream and convert the stream to it. Playback needs the inverse: converting the Speex stream back to PCM. This primitive player handles both regular audio files and Speex-encoded ones:

public static void play(InputStream is) throws Exception
    {
        
        // AudioSystem requires a buffered stream.
        BufferedInputStream bis = new BufferedInputStream(is);
        AudioInputStream audioStream = AudioSystem.getAudioInputStream(bis);
        
        // Speex requires a reformat into PCM.
        if (audioStream.getFormat().getEncoding() instanceof SpeexEncoding) {
            AudioFormat format = audioStream.getFormat();
            AudioFormat pcmFormat = new AudioFormat(format.getSampleRate(), 16, format.getChannels(), true, false);
            audioStream = AudioSystem.getAudioInputStream(pcmFormat, audioStream);
        }
        
        // It works better if the playback is in 44.1 kHz for some reason. Let's do that.
        AudioFormat outputFormat = new AudioFormat(44100, 16, audioStream.getFormat().getChannels(), true, false);
        audioStream = AudioSystem.getAudioInputStream(outputFormat, audioStream);
        
        // Pure playback and cleanup past this point
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, outputFormat);
        SourceDataLine sourceLine = (SourceDataLine) AudioSystem.getLine(info);
        sourceLine.open(outputFormat);
        sourceLine.start();
        
        // Copy decoded audio to the line until the stream is exhausted.
        // An IOException propagates to the caller (the method declares
        // "throws Exception") rather than being swallowed inside the loop.
        byte[] abData = new byte[4096];
        int nBytesRead;
        while ((nBytesRead = audioStream.read(abData, 0, abData.length)) > 0)
        {
            sourceLine.write(abData, 0, nBytesRead);
        }

        sourceLine.drain();
        sourceLine.stop();
        sourceLine.close();
        audioStream.close();
    }
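The "reformat into PCM" step in the player does not have to be tied to SpeexEncoding at compile time. As a rough sketch, assuming only the standard Java Sound API, the check can be turned around: anything that is not already signed PCM gets converted. The names PcmConversionSketch, toPcm and sampleUlawStream are my own, and the example uses the JDK's built-in µ-law encoding as a stand-in for Speex so that it runs without jspeex installed.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;

// Hypothetical helper class, not part of jspeex.
class PcmConversionSketch {

    /** Returns the stream unchanged if it is already signed PCM, otherwise asks Java Sound to decode it. */
    static AudioInputStream toPcm(AudioInputStream in) {
        AudioFormat format = in.getFormat();
        if (AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())) {
            return in;
        }
        // Same target as in the player: 16-bit, signed, little-endian, source rate and channels.
        AudioFormat pcmFormat = new AudioFormat(format.getSampleRate(), 16, format.getChannels(), true, false);
        // Throws IllegalArgumentException if no installed provider supports the conversion.
        return AudioSystem.getAudioInputStream(pcmFormat, in);
    }

    /** Builds a tiny in-memory mu-law stream, standing in for a compressed input. */
    static AudioInputStream sampleUlawStream() {
        byte[] data = new byte[160]; // 20 ms of 8 kHz mono mu-law, one byte per frame
        AudioFormat ulaw = new AudioFormat(AudioFormat.Encoding.ULAW, 8000f, 8, 1, 1, 8000f, false);
        return new AudioInputStream(new ByteArrayInputStream(data), ulaw, data.length);
    }

    public static void main(String[] args) {
        AudioInputStream pcm = toPcm(sampleUlawStream());
        System.out.println(pcm.getFormat());
    }
}
```

Whether this is nicer than the instanceof check is a matter of taste; it mainly keeps the player compiling when jspeex is only a runtime dependency.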

Recording while Streaming

This is an edge case that perhaps only applies to me: I needed to record the audio while also streaming it to a remote location. Luckily for us, AudioInputStreams are very much regular input streams, so I simply needed the venerable PipedInputStream/PipedOutputStream pair to copy and save simultaneously.

int sampleRate = 16000;
AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);

TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
line.open(format);
line.start();

// Now, we want to duplicate this line data into two sources.
PipedOutputStream pos = new PipedOutputStream();
PipedInputStream pis = new PipedInputStream(pos);

AudioInputStream audioForSpeechRecogniser = new AudioInputStream(pis, line.getFormat(), AudioSystem.NOT_SPECIFIED);

byte[] buf = new byte[4096];
int read = -1;

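// bos is the OutputStream for the local recording; abort is a flag
// (presumably an AtomicBoolean) that another thread sets to stop recording.
while ((read = line.read(buf, 0, buf.length)) > -1)
{
    if (abort.getAndSet(false))
    {
        line.stop();
        line.close();
        break;
    }

    try
    {
        if (read > 0)
        {
            // The reader of pis must run on another thread, otherwise this
            // write blocks as soon as the pipe's small buffer is full.
            pos.write(buf, 0, read);
            bos.write(buf, 0, read);
        }
    }
    catch (IOException e)
    {
        break;
    }
}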

pos.close();
audioForSpeechRecogniser.close();

As we can see, all I really needed to pipe my stream to two locations was a read loop that writes each chunk to both of them. Simple.
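To try the same read loop without a microphone, here is a small self-contained sketch that tees an in-memory byte array into a pipe and a local buffer. TeeSketch and teeThroughPipe are names I made up for this illustration, and the consumer thread simply collects bytes where the real setup would hand them to the remote side. The one thing to watch is that the pipe has a small fixed buffer, so the reading end has to run on its own thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of jspeex.
class TeeSketch {

    /** Copies input into a pipe (drained by a consumer thread) and a local buffer; returns {piped, saved}. */
    static byte[][] teeThroughPipe(byte[] input) {
        try {
            InputStream source = new ByteArrayInputStream(input);
            PipedOutputStream pos = new PipedOutputStream();
            PipedInputStream pis = new PipedInputStream(pos);
            ByteArrayOutputStream saved = new ByteArrayOutputStream();
            ByteArrayOutputStream piped = new ByteArrayOutputStream();

            // Stand-in for the remote consumer: drains the pipe until end-of-stream.
            Thread consumer = new Thread(() -> {
                try {
                    byte[] chunk = new byte[512];
                    int n;
                    while ((n = pis.read(chunk)) != -1) {
                        piped.write(chunk, 0, n);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            consumer.start();

            // The same loop as above: every chunk goes to both destinations.
            byte[] buf = new byte[4096];
            int read;
            while ((read = source.read(buf, 0, buf.length)) != -1) {
                pos.write(buf, 0, read);
                saved.write(buf, 0, read);
            }
            pos.close();      // signals end-of-stream to the consumer
            consumer.join();  // wait until the pipe has been fully drained
            pis.close();
            return new byte[][] { piped.toByteArray(), saved.toByteArray() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Both returned arrays should be byte-for-byte copies of the input, which is exactly what the recording loop relies on.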

Vlad: