…looks good :)
On Thu, Jan 8, 2015 at 10:15 PM, jean-marc Mercier jeanmarc.mercier@gmail.com wrote:
Christian,
Excellent! Thanks a lot!
  prof:time(local:savebin((1 to 10000000), $binfile)),
  prof:time(local:loadbin($binfile))
The output is now: 930.71 ms 907.04 ms
2015-01-08 22:11 GMT+01:00 Christian Grün christian.gruen@gmail.com:
Hi Jean-Marc,
- However, deserialization seems to perform awfully, or I do not know
how to do it properly.
I haven't tried the query, but my guess is that the binary data is streamed again and again. The stream:materialize function should help you [1]:
let $data := stream:materialize(file:read-binary($file)) ...
Christian
[1] http://docs.basex.org/wiki/Streaming_Module
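A minimal sketch of how this suggestion might be applied to the local:loadbin function from the original message. It assumes the stream:materialize function of the Streaming Module linked in [1], as available in the BaseX version discussed in this thread. The only change is that the binary is materialized once, before any integer is unpacked:

```xquery
(: Sketch only: local:loadbin with the file contents materialized once up front. :)
declare function local:loadbin($file as xs:string) {
  (: read the file and materialize it, so it is not streamed again per access :)
  let $data := stream:materialize(file:read-binary($file))
  (: the first 4 bytes hold the number of stored integers :)
  let $size := bin:unpack-integer($data, 0, 4)
  let $seq :=
    for $i in 1 to $size
    (: integer number $i starts at byte offset $i * 4 :)
    return bin:unpack-integer($data, $i * 4, 4)
  return count($seq)
};

local:loadbin("Bin.dat")
```

"Bin.dat" is the file name used in the original test call; it must have been written by local:savebin beforehand.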
declare function local:savebin($seq, $file as xs:string) {
  file:write-binary($file, bin:join((
    bin:pack-integer(count($seq), 4),
    $seq ! bin:pack-integer(., 4)
  )))
};

declare function local:loadbin($file as xs:string) {
  let $data := file:read-binary($file)
  let $size := bin:unpack-integer($data, 0, 4)
  let $seq :=
    for $i in 1 to $size
    return bin:unpack-integer($data, $i * 4, 4)
  return count($seq)
};

prof:time(local:savebin((1 to 100000), "Bin.dat")),
prof:time(local:loadbin("Bin.dat"))
output :
46.38 ms 10775.12 ms 100000
For comparison, deserializing the sequence (1 to 10000000), stored in a file as one big string and split with fn:tokenize, takes about 10 seconds for 100 times as many items, i.e. roughly 100 times faster. Did I get something wrong?
2015-01-08 16:44 GMT+01:00 Christian Grün christian.gruen@gmail.com:
This approach stores the integers as a string and then casts each token from string to integer to deserialize it. For large integer lists (I am dealing with lists of about 134 MB), this is quite time- and memory-consuming.
I was wondering whether there is a more efficient way to store and retrieve lists of atomic values in BaseX?
One alternative is to store the integers in a binary file:
let $size := 4
let $data := bin:join(
  for $n in 1 to 100
  return bin:pack-integer($n, $size)
)
return db:store('db', 'integers.bin', $data)
This way, every integer occupies the supplied number of bytes (here: 4, which allows 2^32 distinct integer values).
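Reading the integers back could look roughly like the sketch below. It is not part of the original reply: it assumes that the binary was stored as 'integers.bin' in the database 'db' as shown above, that db:retrieve is available in the BaseX version discussed here, and that the width used for packing (4 bytes) is known to the reader. The value is materialized once with stream:materialize before it is unpacked:

```xquery
(: Sketch only: unpack fixed-width integers from a stored binary resource. :)
let $size := 4
(: fetch the raw resource and materialize it before random access :)
let $data := stream:materialize(db:retrieve('db', 'integers.bin'))
(: number of integers = total byte length divided by bytes per integer :)
let $count := bin:length($data) idiv $size
for $i in 0 to $count - 1
return bin:unpack-integer($data, $i * $size, $size)
```

Unlike the file-based variant, this layout has no length header, so the count is derived from the byte length of the resource.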