Hi Jon Morehouse,
You can use BaseX to create a single database (a collection) that holds all of your XML files; see [1] for details. You can also create a full-text index, which speeds up keyword lookups within text nodes; see [2].
Afterwards, you can query the database, e.g. via the REST API [3] or one of the client APIs [4, 5].
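As a rough illustration of the REST route, here is a minimal sketch in Python (the same idea carries over to PHP). It builds an XQuery that counts, per document, how many text nodes match each keyword, and wraps it in a REST URL. Everything specific here is an assumption and not something from your setup: the database name "xmlfiles", the server address http://localhost:8984/rest, the admin/admin credentials, and the helper names are all hypothetical. Note that it counts matching text nodes, not individual word occurrences, and I have not tuned it for millions of files.

```python
import base64
import urllib.parse
import urllib.request

# XQuery sketch: for each document in the collection, sum the number of text
# nodes that match each keyword, then list documents with the most hits first.
QUERY_TEMPLATE = """\
for $doc in collection(%(db)s)
let $hits := sum(
  for $k in (%(keywords)s)
  return count($doc//text()[. contains text { $k }])
)
where $hits > 0
order by $hits descending
return <hit path="{ base-uri($doc) }" count="{ $hits }"/>
"""


def xquery_string(value):
    """Quote a Python string as an XQuery string literal."""
    # Inside an XQuery string literal only '&' and the quote character
    # need escaping.
    return '"' + value.replace("&", "&amp;").replace('"', '""') + '"'


def build_keyword_query(database, keywords):
    """Return the XQuery text for the given database name and keyword list."""
    return QUERY_TEMPLATE % {
        "db": xquery_string(database),
        "keywords": ", ".join(xquery_string(k) for k in keywords),
    }


def build_rest_url(base_url, database, query):
    """Return a REST URL that passes the query via the 'query' parameter."""
    return "%s/%s?%s" % (
        base_url.rstrip("/"),
        urllib.parse.quote(database),
        urllib.parse.urlencode({"query": query}),
    )


def run_query(url, user, password):
    """Send the request with HTTP basic authentication and return the body.

    Assumes a running BaseX HTTP server; not called in this sketch.
    """
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    request = urllib.request.Request(
        url, headers={"Authorization": "Basic " + token.decode("ascii")}
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a server running, usage would look something like run_query(build_rest_url("http://localhost:8984/rest", "xmlfiles", build_keyword_query("xmlfiles", keywords)), "admin", "admin"), where keywords is the list you load from MySQL.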
How large is a single XML file?
Regards,
Lukas
[1] http://docs.basex.org/wiki/Commands#CREATE_DATABASE
[2] http://docs.basex.org/wiki/Full-Text
[3] http://docs.basex.org/wiki/REST
[4] http://docs.basex.org/wiki/Clients
[5] https://github.com/BaseXdb/basex-api/tree/master/src/main/php
On Jan 1, 2012, at 9:12 AM, Jon Morehouse wrote:
I am new to BaseX and am excited to be moving forward with this product. Right now I am setting up a website where I want to be able to query across millions of XML files. For instance, if each file contains different keywords, I would like to match every file against a list of, say, my top 50 keywords, to find which files contain the most keywords, most often. Would something like this be possible with BaseX? It seems like it would be a simple piece of XQuery called from PHP (the list of keywords comes from MySQL), but with each XML document being its own file, would it be possible to search across each and every database/file?
Jon Morehouse Moeller High School Class of 2009 Pepperdine University 2009-2010 University of Southern California Class of 2013
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk