We run the script that uses this functionality embedded in a Java application. I noticed that the first time the code runs after a cold start, this log message appears.
Hi Johan,
> Just wanted to report back that it works really well.
Glad to hear it works.
> It is about 50% slower
> than running the md5 command on the command line of my Mac.
My final solution is close to the one you proposed [1]: I decided to
use a small buffer as well, because it was faster than calling
md.update() for every single byte.
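
For illustration, the buffered approach could look roughly like the
sketch below. It uses only JDK classes and is not the actual code
behind [1]; the class name BufferedMd5 and the 4096-byte buffer size
are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative helper (hypothetical name): MD5 of a file via a buffered read loop. */
final class BufferedMd5 {
  /** Assumed buffer size for this sketch; not taken from the BaseX sources. */
  static final int BUFFER_SIZE = 4096;

  static String md5Hex(final Path path) throws IOException, NoSuchAlgorithmException {
    final MessageDigest md = MessageDigest.getInstance("MD5");
    final byte[] buffer = new byte[BUFFER_SIZE];
    try(InputStream in = Files.newInputStream(path)) {
      int n;
      // feed the digest one chunk at a time instead of one byte at a time
      while((n = in.read(buffer)) != -1) {
        md.update(buffer, 0, n);
      }
    }
    // convert the 16 digest bytes to a lower-case hex string
    final StringBuilder hex = new StringBuilder();
    for(final byte b : md.digest()) hex.append(String.format("%02x", b & 0xff));
    return hex.toString();
  }
}
```

Calling BufferedMd5.md5Hex(path) returns the digest as a 32-character
hex string. The buffer size mostly trades memory for fewer update()
calls.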
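Using NIO channels gives us better performance:

  // JDK classes used: java.io.RandomAccessFile, java.nio.ByteBuffer,
  // java.nio.channels.FileChannel, java.security.MessageDigest;
  // IO and Token are BaseX classes
  String path = ...
  final MessageDigest md = MessageDigest.getInstance("md5");
  final ByteBuffer buf = ByteBuffer.allocate(IO.BLOCKSIZE);
  // try-with-resources closes the file and its channel
  try(RandomAccessFile raf = new RandomAccessFile(path, "r");
      FileChannel ch = raf.getChannel()) {
    while(true) {
      // read the next chunk; -1 signals the end of the file
      final int n = ch.read(buf);
      if(n == -1) break;
      md.update(buf.array(), 0, n);
      // clear() (not flip()) resets position and limit, so the next
      // read can fill the whole buffer again
      buf.clear();
    }
  }
  System.out.println(Token.string(Token.hex(md.digest(), true)));
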
But I am not sure how smoothly this would integrate with the rest of
our streaming architecture, as we are also streaming main-memory
objects. I'll keep it in mind, though.
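
In case someone wants to try the channel variant outside of BaseX, a
self-contained sketch could look like this. It relies on JDK classes
only; the class name ChannelMd5 and the 4096-byte buffer size are
assumptions, and the hex conversion replaces our Token helpers.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative helper (hypothetical name): MD5 of a file read through a FileChannel. */
final class ChannelMd5 {
  /** Assumed buffer size for this sketch. */
  static final int BUFFER_SIZE = 4096;

  static String md5Hex(final Path path) throws IOException, NoSuchAlgorithmException {
    final MessageDigest md = MessageDigest.getInstance("MD5");
    final ByteBuffer buf = ByteBuffer.allocate(BUFFER_SIZE);
    try(FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
      while(ch.read(buf) != -1) {
        buf.flip();      // make the bytes just read available to the digest
        md.update(buf);  // consumes everything between position and limit
        buf.clear();     // reset position and limit for the next read
      }
    }
    // convert the 16 digest bytes to a lower-case hex string
    final StringBuilder hex = new StringBuilder();
    for(final byte b : md.digest()) hex.append(String.format("%02x", b & 0xff));
    return hex.toString();
  }
}
```

Here the buffer is flipped before it is handed to the digest and
cleared afterwards, so every read can use the full capacity.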
Cheers,
Christian
[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/func/hash/HashFn.java