Facebook wav2letter++


May be a little OT, but of general interest.

The Facebook AI Research (FAIR) Speech team is sharing the first fully convolutional speech recognition system. It uses convolutional neural networks (CNNs) for acoustic modeling and language modeling, and is reproducible. The team says that wav2letter++ is composed only of convolutional layers, which yields performance that’s competitive with recurrent architectures.


Downloaded the repo and got stuck on ArrayFire. Will have to find some time to get past that and actually run it.

Being fully compiled probably helps its speed (vs., say, DeepSpeech, which does some bits in Python).


ArrayFire can be installed from a binary, so I moved forward from there. flashlight kept being a PITA: I ended up having to build it with make -j1, or else the build broke for some reason.
The google test/log/flag bits are super quick to install.
Not sure if you can run it on a Ryzen, since they require MKL?
CUDA compilation uses gcc5, while I was building everything else with gcc7, so I had to export CC, CXX, CFLAGS, CXXFLAGS, NVCC and NVCCFLAGS before it started to compile. Then it couldn't decide whether it wanted to use -std=c++11 or not?
Anyways…seems to be compiling now. Will see later if it finishes successfully.
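
For anyone hitting the same gcc5/gcc7 mismatch, here is a rough sketch of what those exports might look like. The compiler names (gcc-7, g++-7, g++-5) and the flag values are assumptions for a Debian/Ubuntu-style setup, not the exact values used above, and whether a given build actually picks up NVCC and NVCCFLAGS depends on its build scripts.

```shell
# Host compilers for the non-CUDA parts (names are assumptions; adjust to your distro).
export CC=gcc-7
export CXX=g++-7

# Flags for the host compilers; forcing -std=c++11 here is an assumption.
export CFLAGS="-O2"
export CXXFLAGS="-O2 -std=c++11"

# CUDA compiler, told to use the older g++ as its host compiler.
# -ccbin is nvcc's short form of --compiler-bindir.
export NVCC=nvcc
export NVCCFLAGS="-ccbin g++-5 -std=c++11"
```

Run these in the same shell session before configuring, so the build sees them.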

Bonus: apparently it overwhelmed the machine I was compiling on, and it ran out of memory. Trying again, with -j1 on this make as well.
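
If -j1 everywhere is too slow, one middle ground is to pick the job count from available memory instead of core count. A minimal sketch follows; pick_jobs is a hypothetical helper, and the 2 GiB-per-job budget in the usage comment is a guess, not a measured figure for this build.

```shell
# pick_jobs MEM_KB CORES PER_JOB_KB
# Prints a make job count: as many jobs as fit in memory,
# capped at the core count and never less than 1.
pick_jobs() {
    mem_kb=$1
    cores=$2
    per_job_kb=$3
    jobs=$((mem_kb / per_job_kb))
    if [ "$jobs" -gt "$cores" ]; then
        jobs=$cores
    fi
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}

# Usage on Linux (assumes roughly 2 GiB, i.e. 2097152 kB, per compile job):
#   mem_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
#   make -j"$(pick_jobs "$mem_kb" "$(nproc)" 2097152)"
```

On a machine that is already short on memory this falls back to a single job, which is the same as -j1.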

(Several tries later) Managed to build the binaries. Haven't had time to run them yet, though.