Hi Christian,
Thank you for your answer. The problem is tricky in that I trained a POS tagger that produces a binary file (the model), which is then used to produce new, non-binary output. I do not know whether this matters when I move everything to a different machine, even though the locale is UTF-8 on both (on my Mac it is just utf-8, on the CentOS 6 server it is en_us.utf8). Interestingly, at the command line I get the right output on both machines. The problem arises only when I use proc:execute. I should add that I tested this with eXist-db and got the same error.
Thank you for the support.
Joseph
On 17 Nov 2015, at 19:34, Christian Grün christian.gruen@gmail.com wrote:
Hi Joseph,
It shouldn’t have anything to do with RESTXQ (RESTXQ is nothing more than a plain API for server middleware). I assume that your local machine and the server use different default encodings, which is why you need to specify different encodings with proc:execute.
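One way the defaults can differ, offered only as an assumption since the thread does not confirm it, is that a command started by a server process does not inherit the locale variables of a login shell. The sketch below imitates that with `env -i`, which starts a child with an empty environment. The locale name `en_US.UTF-8` is a placeholder, not a value taken from either machine.

```shell
# Locale variable as seen by the current shell.
echo "current LANG: ${LANG:-unset}"

# env -i starts the child with an empty environment, roughly imitating a
# process launched by a service that did not inherit the login settings.
bare_lang=$(env -i /bin/sh -c 'echo "${LANG:-unset}"')
echo "bare LANG: $bare_lang"

# A wrapper script can set the locale itself instead of relying on inheritance.
# en_US.UTF-8 is an assumption; pick a name listed by `locale -a` on the server.
fixed_lang=$(env -i LANG=en_US.UTF-8 /bin/sh -c 'echo "${LANG:-unset}"')
echo "fixed LANG: $fixed_lang"
```

If the bare case matches what the server does, exporting the locale at the top of the bash script called by proc:execute would be one thing to try.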
Hope this helps, Christian
I have been able to solve my problem on my local machine by specifying ISO-8859-1 in proc:execute. In this case the bash command called in a RESTXQ function returns the right characters. However, it is not clear to me why I need to specify that encoding (instead of utf8) in order to get the Greek characters displayed.
Unfortunately, this does not work on my server. I checked that bash uses utf8 there, and indeed if I run the same scripts there, the results are displayed properly. However, within RESTXQ the same command (called via proc:execute) returns only question marks.
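One possible explanation, not confirmed anywhere in this thread, is a byte-preserving round trip: if UTF-8 bytes are read as ISO-8859-1 and later written out as ISO-8859-1 again, the original bytes come through unchanged and a UTF-8 client displays them correctly. The sketch below shows only that general property with `iconv`, using the single letter "λ"; it says nothing about what BaseX does internally.

```shell
# "λ" in UTF-8 is the two bytes 0xCE 0xBB (octal 316 273).
original=$(printf '\316\273' | od -An -tx1 | tr -d ' \n')

# Step 1: misread those bytes as ISO-8859-1 and re-encode them as UTF-8.
# Each byte becomes its own character, which is the classic mojibake.
misread=$(printf '\316\273' | iconv -f ISO-8859-1 -t UTF-8 \
  | od -An -tx1 | tr -d ' \n')

# Step 2: write the misread text back out as ISO-8859-1.
roundtrip=$(printf '\316\273' | iconv -f ISO-8859-1 -t UTF-8 \
  | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

echo "original:  $original"
echo "misread:   $misread"
echo "roundtrip: $roundtrip"
```

The round trip ends with the same bytes it started with, which is why a "wrong" encoding can appear to work when a matching conversion happens on the way out.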
Do you think this can be due to a problem inside RESTXQ or something else?
Thanks, Joseph
On 09 Nov 2015, at 15:18, Jens Erat jens.erat@uni-konstanz.de wrote:
Please report the encoding of your actual files. What is the output of `env | grep LANG` when it is run from within your bash script? If you redirect the output to a file, what does `file -i [output-file]` report (`file` applies some magic to discover file formats and encodings)?
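These checks could be collected in a small script like the sketch below. The Greek sample word and the temporary file are stand-ins for the real tagger output, and the byte count is an extra check that is not part of the original suggestion.

```shell
# Stand-in for the real tagger output: the Greek word "λόγος" as UTF-8 bytes,
# written with octal escapes so this script itself stays pure ASCII.
outfile=$(mktemp)
printf '\316\273\317\214\316\263\316\277\317\202\n' > "$outfile"

# Locale settings visible to the script (prints a note if none are set).
env | grep -E '^(LANG|LC_)' || echo "no LANG/LC_* variables set"

# Ask `file` for its guess, if it is installed.
if command -v file >/dev/null 2>&1; then
  file -i "$outfile"
fi

# Byte count as a rough cross-check: five Greek letters plus a newline take
# 11 bytes in UTF-8, but would take only 6 in a single-byte Greek encoding.
bytes=$(wc -c < "$outfile" | tr -d ' ')
echo "bytes: $bytes"

rm -f "$outfile"
```

Running the same script once from a terminal and once through proc:execute makes it easier to compare the two environments side by side.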
On 09.11.2015 at 14:06, meumapple wrote: Hi All,
I have been able to run the command by putting it in a bash file and then running this file with proc:execute. It works. The problem now is the encoding of the text returned by proc:execute. In my bash terminal I can display Greek characters, but they are garbled when returned by the proc:execute function. I also passed utf-8 as a parameter to the function, but nothing changes. How can I solve this? Thanks.
Joseph
On 09 Nov 2015, at 10:41, Jens Erat jens.erat@uni-konstanz.de wrote:
Hi Joseph,
As far as I understand the documentation, BaseX `proc:system` does not use a shell interpreter by default. `<` and the pipe `|` are features offered by your shell, not by the operating system.
Instead of passing a single parameter, pass the command as a parameter to `/usr/bin/env sh -c`, which will make BaseX run a shell that interprets I/O redirection:
proc:system('/usr/bin/env', ('sh', '-c', 'opennlp postagger model.bin < file.txt'))
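The difference can be reproduced outside BaseX. The following is a minimal sketch, assuming a POSIX system; `wc` stands in for `opennlp`, and the temporary directory and file contents are made up for illustration.

```shell
# Work in a throwaway directory so nothing existing is touched.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/file.txt"

# Direct execution, no shell involved in parsing the arguments: '<' arrives
# as an ordinary argument, so wc tries to open a file literally named "<"
# and reports an error.
direct_status=0
( cd "$tmpdir" && /usr/bin/env wc -l '<' file.txt ) >/dev/null 2>&1 \
  || direct_status=$?

# Through sh -c: the shell parses '<' and connects file.txt to stdin.
via_shell=$( cd "$tmpdir" && /usr/bin/env sh -c 'wc -l < file.txt' | tr -d ' ' )

echo "direct exit status: $direct_status"
echo "lines read via shell: $via_shell"

rm -rf "$tmpdir"
```

The quoted `'<'` in the direct call plays the role of the argument that a non-shell launcher hands to the program unchanged.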
This might be a reasonable thing to provide as a built-in function. I'm also not sure whether one can connect some kind of streams or strings to STDIN, STDOUT and STDERR.
Regards from Lake Constance, Germany, Jens
On 09.11.2015 at 10:19, Christian Grün wrote:
Any suggestions?
Any example? ;) Christian
Thanks.
Joseph
Begin forwarded message:
From: meumapple meumapple@gmail.com Date: 6 November 2015 20:52:08 CET To: BaseX basex-talk@mailman.uni-konstanz.de Subject: Bash command
Hi all,
I am using proc:system to run a bash command, but it fails. The command looks like opennlp postagger model.bin < file.txt. I suspect that the problem is the less-than sign, which is converted into &lt;. Is there any way to keep the less-than sign and pass it to bash unchanged? Thanks.
Joseph
-- Jens Erat Universität Konstanz Kommunikations-, Infomations-, Medienzentrum (KIM) Abteilung Basisdienste D-78457 Konstanz Mail: jens.erat@uni-konstanz.de