Hi Johan,
Just wanted to report back that it works really well.
Glad to hear it works.
It is about 50% slower than running the md5 command on the command line of my Mac.
My final solution is close to the one you proposed [1]: I decided to use a small buffer as well, because it was faster than calling md.update() for every single byte.
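
For illustration, here is a minimal sketch of the buffered approach, not the actual code behind [1]. The class name Md5BufferSketch, the method md5Hex and the 4096-byte BUFFER_SIZE are all hypothetical; the only assumption is that the input is available as a java.io.InputStream. Each read fills a byte array, and the digest is updated once per block rather than once per byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not taken from the BaseX sources.
class Md5BufferSketch {
  // Assumed buffer size; the real code may use a different constant.
  private static final int BUFFER_SIZE = 4096;

  // Reads the stream block by block and returns the MD5 digest as lowercase hex.
  static String md5Hex(InputStream in) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      byte[] buf = new byte[BUFFER_SIZE];
      int n;
      // One update() call per block instead of one call per byte.
      while ((n = in.read(buf)) != -1) {
        md.update(buf, 0, n);
      }
      StringBuilder sb = new StringBuilder();
      for (byte b : md.digest()) {
        sb.append(String.format("%02x", b & 0xff));
      }
      return sb.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required to be present on every Java platform implementation.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
    System.out.println(md5Hex(in)); // prints 900150983cd24fb0d6963f7d28e17f72
  }
}
```

Because the helper only depends on InputStream, the same loop would in principle serve both file input and main-memory data wrapped in a ByteArrayInputStream.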
Using NIO channels gives us better performance:
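
    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.security.MessageDigest;

    String path = ...
    RandomAccessFile raf = new RandomAccessFile(path, "r");
    FileChannel ch = raf.getChannel();
    ByteBuffer buf = ByteBuffer.allocate(IO.BLOCKSIZE);
    final MessageDigest md = MessageDigest.getInstance("md5");
    do {
      // read the next block; -1 signals end of file
      final int n = ch.read(buf);
      if(n == -1) break;
      md.update(buf.array(), 0, n);
      // reset position and limit so the next read can fill the whole buffer
      buf.clear();
    } while(true);
    raf.close();
    System.out.println(Token.string(Token.hex(md.digest(), true)));
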
But I am not sure how smoothly this would integrate with the rest of our streaming architecture, as we are also streaming main-memory objects. I'll keep it in mind, though.
Cheers, Christian
[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/ba...