We have set up a system with about 17 million BaseX databases, but my operating system does not allow 17 million subdirectories in one directory (the one set in the .basex file). To get around this, we set up four BaseX servers on four different machines; the name of a BaseX database tells us which server to query.
As this is a workaround, it would be nice either to have a hierarchy of databases, so that we can store them in more manageable directories, or to be able to run more than one BaseX server on a single machine (for instance, listening on different ports).
For more on why we did this: Vandeghinste and Augustinus (2014). Making Large Treebanks Searchable. The SONAR case. In Marc Kupietz, Hanno Biber, Harald Lüngen, Piotr Bański, Evelyn Breiteneder, Karlheinz Mörth, Andreas Witt, Jani Takhsha (eds.), Proceedings of the 2nd workshop on Challenges in the management of large corpora (CMLC-2) at the Ninth International Conference on Language Resources and Evaluation (LREC). Reykjavik, Iceland. pp. 15-20.
Is there another way to solve this?
thanks, v.
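A minimal sketch of the two ideas in the message above: routing a database name to one of four servers, and spreading databases over a directory hierarchy. The message does not say how names are mapped to servers, so the hash-based rule, the host names, and the helper names `server_for` and `bucket_path` are all assumptions made for illustration. BaseX itself is not claimed to support the nested layout.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical hosts: the message only says there are four machines.
SERVERS = [
    ("basex1.example.org", 1984),
    ("basex2.example.org", 1984),
    ("basex3.example.org", 1984),
    ("basex4.example.org", 1984),
]


def _digest(db_name: str) -> bytes:
    """Stable hash of the database name (same result on every run)."""
    return hashlib.sha1(db_name.encode("utf-8")).digest()


def server_for(db_name: str) -> tuple:
    """Pick the (host, port) responsible for a database name.

    Assumed rule: hash the name and take it modulo the server count.
    """
    index = int.from_bytes(_digest(db_name)[:8], "big") % len(SERVERS)
    return SERVERS[index]


def bucket_path(db_name: str, levels: int = 2) -> PurePosixPath:
    """Relative path that nests the database under hashed subdirectories.

    Each level uses two hex digits, so each level has at most 256 entries.
    """
    hex_digest = _digest(db_name).hex()
    parts = [hex_digest[2 * i:2 * i + 2] for i in range(levels)]
    return PurePosixPath(*parts, db_name)
```

With two levels there are 256 × 256 = 65,536 leaf directories, so 17 million databases would average roughly 260 per directory.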
Hello Vincent,
The maximum number of directories is defined by the file system, not the OS. It would be interesting to know which file system you use. As far as I know, the current file systems for the major operating systems (ext4, NTFS) all support an unlimited number of directories.
It is already possible to run multiple BaseX servers on a single machine. Make sure to also set a different event port for each one (using the command-line option -e or the config file).
Cheers, Dirk
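As a rough illustration of the advice above, the following sketch builds one command line per server instance, giving each its own server port and event port. Only the -e option is named in this thread. The `basexserver` executable name, the `-p` option for the server port, the attached `-p1984` spelling, and the starting port number are assumptions that should be checked against the installed BaseX version. The helper `server_commands` is hypothetical.

```python
def server_commands(count, base_port=1984, executable="basexserver"):
    """Return one argument list per instance, with non-overlapping ports.

    Assumed layout: instance i gets server port base_port + 2*i and
    event port base_port + 2*i + 1.
    """
    commands = []
    for i in range(count):
        port = base_port + 2 * i
        event_port = port + 1
        # -p (server port) is an assumption; -e (event port) is from the thread.
        commands.append([executable, f"-p{port}", f"-e{event_port}"])
    return commands


# Print the command lines instead of starting anything.
for cmd in server_commands(4):
    print(" ".join(cmd))
```

Each instance would presumably also need its own database directory (the path set in its .basex file) for this to relieve a per-directory limit; that part is not shown here.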
On Tue, Sep 30, 2014 at 1:33 PM, Vincent Vandeghinste <vincent@ccl.kuleuven.be> wrote:
[...]
basex-talk@mailman.uni-konstanz.de