[basex-talk] Whatever happened to DeepFS

Andy Bunce bunce.andy at gmail.com
Wed Nov 16 00:34:34 CET 2011


I am mainly interested in images (usually JPEG) and audio (usually MP3).
I don't know much about ExifTool, but it seems to be a Perl library. Nothing
wrong with that :-), but it sounds like a heavy choice to wrap in a Java package?

XML Calabash has a cx:metadata-extractor extension step; for images it is a
thin shell around Drew Noakes' library of the same name
<http://www.drewnoakes.com/code/exif/>. The step is mentioned at
<http://xmlcalabash.com/download/>.

MP3 is trickier, but https://github.com/mpatric/mp3agic looks like a
possible candidate to me.
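
To make the MP3 case a bit more concrete, here is a minimal sketch of what a
transducer for audio tags might do. It is only an illustration under stated
assumptions: it handles the simple ID3v1 trailer (the last 128 bytes of a
file), not the ID3v2 frames shown in the quoted example, and it does not use
mp3agic or any BaseX API. The function names and the "ID3v1" folder name are
hypothetical; the <folder>/<fact> output merely imitates the FSML snippet
quoted below. It is written in Python purely for brevity.

```python
from xml.sax.saxutils import escape, quoteattr

# ID3v1 layout: b"TAG" followed by fixed-width fields, then one genre byte.
# 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes in total.
ID3V1_FIELDS = (
    ("Title", 30),
    ("Artist", 30),
    ("Album", 30),
    ("Year", 4),
    ("Comment", 30),
)


def parse_id3v1(trailer: bytes) -> dict:
    """Return the non-empty ID3v1 text fields, or {} if no tag is present."""
    if len(trailer) != 128 or trailer[:3] != b"TAG":
        return {}
    facts = {}
    offset = 3
    for name, width in ID3V1_FIELDS:
        raw = trailer[offset:offset + width]
        offset += width
        # Fields are padded with NUL bytes (sometimes spaces).
        text = raw.split(b"\x00", 1)[0].decode("latin-1").strip()
        if text:
            facts[name] = text
    return facts  # the trailing genre byte is ignored in this sketch


def read_id3v1(path: str) -> dict:
    """Read the last 128 bytes of a file and parse them as an ID3v1 tag."""
    with open(path, "rb") as f:
        f.seek(0, 2)  # jump to the end to learn the file size
        if f.tell() < 128:
            return {}
        f.seek(-128, 2)
        return parse_id3v1(f.read(128))


def facts_to_fsml(facts: dict) -> str:
    """Render facts as an FSML-like <folder> element (hypothetical shape)."""
    lines = ['<folder name="ID3v1">']
    for name, value in facts.items():
        lines.append("  <fact name=%s>%s</fact>" % (quoteattr(name), escape(value)))
    lines.append("</folder>")
    return "\n".join(lines)
```

Calling facts_to_fsml(read_id3v1("some.mp3")) would give a <folder> block that
could sit inside a <file> element. Real-world files mostly carry ID3v2, which
is where a library such as mp3agic would come in.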

/Andy

On Tue, Nov 15, 2011 at 2:48 PM, Alexander Holupirek <
alexander.holupirek at uni-konstanz.de> wrote:

>
> On 14.11.2011, at 21:48, Andy Bunce wrote:
>
> > It is the metadata extraction part that is non trivial.
> > So packaging the libraries and calls for that sounds like a great way to
> go.
> >
> > /Andy
> >
> > On Mon, Nov 14, 2011 at 7:22 PM, John D. Mitchell <jdmitchell at gmail.com>
> wrote:
> > On Nov 14, 2011, at 11:17 , Alexander Holupirek wrote:
> > [...]
> > > If you also want to have the extractor functionality ... we thought
> about packaging [2] it for BaseX and make it available as XQuery functions.
>  Just give us a hint and we will get going.
> >
> > ++
> >
> > Cheers,
> > John
>
> Thanks for your feedback.  We decided to go for the packaging approach and
> to provide an EXPath package [0] in order to produce a FSML database of a
> given file hierarchy.
>
> It would be interesting to hear what kind of file types are relevant for
> you.
> The idea is to have transducer code [1] that, for example, extracts ID3
> information for audio files:
>
>   <file name="LockerBleiben.mp3" suffix="mp3" st_mode="0100644"
> st_size="4585915" st_mtime="1320945388000" st_uid="1000" st_gid="1000"
> st_nlink="1" bsid="70622d84-f4f7-4b90-95e2-9e1821e8d283">
>      <folder name="ID3v2">
>        <fact name="Title">Locker Bleiben</fact>
>        <fact name="Artist">Die Fantastischen Vier</fact>
>        <fact name="Composer">Andreas Rieke/Michael DJ Beck/Thomas
> Dürr/Michael B. Schmidt</fact>
>        <fact name="Album">Lauschgift</fact>
>        <fact name="Track">15/20</fact>
>        <fact name="PartOfSet">1/1</fact>
>        <fact name="Year">1995</fact>
>        <fact name="Genre">Hip Hop/Rap</fact>
>        <fact name="Compilation">1</fact>
>        <fact name="Comment">(iTunPGAP) 0</fact>
>        <fact name="EncodedBy">iTunes 8.0.2</fact>
>      </folder>
>      <folder name="Cover">
>        ...
>      </folder>
>    </file>
>
> Currently I think about using exiftool[1] by Phil Harvey to include
> metadata about numerous multi-media files.
> Extract full text and publisher metadata from PDF files, etc.
>
> If you have something special or want to comment on this, I'm all ears.
>
> Thanks,
>        Alex
>
>
> [0] EXPath Packaging: http://docs.basex.org/wiki/Packaging
> [1] "Transducer", a term coined by Gifford et al., Semantic File Systems:
> http://dl.acm.org/citation.cfm?id=121138
> [1] http://www.sno.phy.queensu.ca/~phil/exiftool/