Hi Andy,
Nice use of syntax (though you have to lose the semicolon, of course).
Visually, I like the arrow operator a lot. It reads like a pipeline:
"https://wiki.mozilla.org/images/f/ff/Example.json.gz" => fetch:binary() => archive:extract-text()
I also think this could be a bug, or at least a good improvement to make, since the docs say gzip archives can be created. Christian, do you think we should file an issue for this?
--Marc
On Tue, Jan 26, 2016 at 9:51 PM, Andy Bunce bunce.andy@gmail.com wrote:
Hi Marco,
I get the same. This works:
"https://wiki.mozilla.org/images/f/ff/Example.json.gz" !fetch:binary(.) !archive:extract-text(.)
But this returns an empty entry:
"https://wiki.mozilla.org/images/f/ff/Example.json.gz" !fetch:binary(.) !archive:entries(.)
<archive:entry xmlns:archive="http://basex.org/modules/archive"/>
I expected to see "example.json".
Could this be a bug?
/Andy
On 26 January 2016 at 18:51, Maximilian Gärber mgaerber@arcor.de wrote:
Hi,
I think this should work; I use it for OData requests from IIS.
I need to dig through the source, but I used one of the extract-binary functions.
Regards, Max
On 26.01.2016 at 16:04, "Marc van Grootel" marc.van.grootel@gmail.com wrote:
Well, shelling out wasn't so hard, even on Windows. With Cygwin tools it's simply:
proc:execute('gunzip', $path-to-gzipped-file)
It worked quite transparently: it extracts the file and removes the .gz file. A pure XQuery solution would be nice, but for now I'm okay.
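For anyone who wants to check that the call succeeded, here is a slightly fuller sketch. It assumes that proc:execute returns a result element with output, error and code children (as described in the BaseX Process Module docs) and that gunzip is on the PATH. The file path and the error name are made-up placeholders.

```xquery
(: Hypothetical location of the downloaded file; adjust as needed. :)
declare variable $path-to-gzipped-file := '/tmp/Example.json.gz';

(: Shell out to gunzip; the result element carries the exit code. :)
let $result := proc:execute('gunzip', $path-to-gzipped-file)
return
  if ($result/code = '0')
  (: gunzip replaces foo.gz with foo, so strip the suffix to get the new path. :)
  then replace($path-to-gzipped-file, '\.gz$', '')
  (: Hypothetical error name; the message is whatever gunzip wrote to stderr. :)
  else error(xs:QName('local:gunzip-failed'), string($result/error))
```

On success this returns the path of the extracted file, which can then be passed on to whatever reads the file.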
Cheers,
On Tue, Jan 26, 2016 at 3:13 PM, Marc van Grootel marc.van.grootel@gmail.com wrote:
Hi,
I hoped I could use the Archive Module to also extract gzipped files. I need to fetch/sync large XML files from a web service that offers gzip encoding (to be nice to the web server).
My first attempt was to explicitly get the .gz file via the URL and then treat it like an archive binary (extracting it with the recipe from the Archive Module page). The entries XML I get is empty, so I suppose I cannot read .gz files this way.
My second attempt was to specify Accept-Encoding: gzip, which indeed delivers the XML as a binary. But I would probably run into the same issue when trying to extract it.
Is there a way to extract gzip-encoded files without having to shell out to some kind of unzipper?
Cheers, --Marc