@wsnyder would you mind if I recode some of this in C++ (and if so, which standard?), and maybe use mmap for reading the files (which requires a POSIX host, and a 64-bit platform for large files)? I'm trying to diff 20GB VCD files, and just the signal matching takes forever due to the O(N^2) algorithm.
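
To make the signal-matching part concrete, here is a rough sketch of the direction I have in mind. It assumes signals are paired by their full hierarchical name, which may not be exactly what the current code does. `Signal` and `match_signals` are hypothetical names for illustration, not taken from the existing source. Indexing one file's signals in a hash map makes the matching roughly O(N + M) on average instead of comparing every pair.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical signal record; the real tool's data structures may differ.
struct Signal {
    std::string name;  // full hierarchical name, e.g. "top.cpu.clk" (assumed match key)
    std::string code;  // VCD identifier code used in the value-change section
};

// Returns (index in a, index in b) for every signal in `a` whose name also
// appears in `b`. Builds a hash index over `b` once, then does one lookup
// per signal in `a`.
std::vector<std::pair<std::size_t, std::size_t>>
match_signals(const std::vector<Signal>& a, const std::vector<Signal>& b) {
    std::unordered_map<std::string, std::size_t> by_name;
    by_name.reserve(b.size());
    for (std::size_t j = 0; j < b.size(); ++j) {
        // emplace keeps the first entry if a name occurs more than once
        by_name.emplace(b[j].name, j);
    }

    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t i = 0; i < a.size(); ++i) {
        auto it = by_name.find(a[i].name);
        if (it != by_name.end()) {
            matches.emplace_back(i, it->second);
        }
    }
    return matches;
}
```

This only needs C++11, so it would not force a recent compiler. Signals present in only one file are simply left out of the result and would need separate reporting.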