Hi Erol,
I probably don't have time to wait for the script to finish, and I think it does much more than we need to get to the core of the problem. If you manage to isolate the problem you want us to solve, feel free to give it another try.
Christian
On Sat, Jul 5, 2014 at 1:46 AM, Erol Akarsu eakarsu@gmail.com wrote:
Christian,
I made a slight modification to the script: it now pulls only 10 sites and produces output. I have attached the new script and the result you will get when you run it.
On Fri, Jul 4, 2014 at 7:13 PM, Christian Grün christian.gruen@gmail.com wrote:
Hi Erol,
thanks for the script. Do you think you could provide us with a minimized example?
Christian
On Fri, Jul 4, 2014 at 11:43 PM, Erol Akarsu eakarsu@gmail.com wrote:
Christian,
My XQuery script and result are attached. You will see incorrectly encoded characters shown as "?".
On Fri, Jul 4, 2014 at 3:44 PM, Christian Grün christian.gruen@gmail.com wrote:
Hi Erol,
I am trying to fetch a page, http://www.ruya-tabirleri.com/ruyada-adam-gormek.html, that has Turkish characters in it. When I pull it in BaseX, the page is returned with charset=windows-1254, but as you can see, the Turkish characters are not rendered correctly:
Please give us more details on how you are trying to retrieve the web page.
One solution could look as follows:
html:parse(fetch:text('http://www.ruya-tabirleri.com/ruyada-adam-gormek.html', 'windows-1254'), map { 'encoding': 'UTF-8' })
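A variant of the same idea, as a minimal sketch: fetch the raw bytes and decode them explicitly before handing the string to the HTML parser. It assumes the server really delivers windows-1254 bytes and that TagSoup is on the classpath, so that html:parse returns cleaned-up XML. The variable names $url, $raw and $text are only illustrative.

```xquery
(: Sketch only: assumes the page is served as windows-1254 bytes. :)
let $url  := 'http://www.ruya-tabirleri.com/ruyada-adam-gormek.html'
(: Retrieve the response body as binary, without any decoding. :)
let $raw  := fetch:binary($url)
(: Decode the bytes with the charset announced by the page. :)
let $text := convert:binary-to-string($raw, 'windows-1254')
(: Parse the decoded string and return the title for a quick check. :)
return html:parse($text)//*:title
```

If the returned title still contains "?" characters, the text was probably damaged in an earlier step, for example by a conversion elsewhere in the script or by the console used to display the result.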
Cheers, Christian
<html lang="tr-TR">
<head profile="http://gmpg.org/xfn/11">
<title>R?a Tabirleri, R?a Tabiri, R?a Yorumlar? R?a, r?atabirleri, ruya tabirleri</title>
<meta name="keywords" content="r?a tabirleri, r?a yorumlar? r?a tabiri, r?a, r?atabirleri, haberci r?a, r?a t?leri, r?a nedir, aksenov, forsa, atat?k, evliya ?lebi, bur?ar, astroloji"/>
<meta name="description" content="r?a tabirleri, r?a tabiri, r?a yorumlar? r?a, ruya tabirleri, r?atabirleri, r?alar?? bu sitede anlam kazanacak."/>
<meta http-equiv="Content-Language" content="tr"/>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254"/>
<meta name="Author" content="Ruya-tabirleri.com"/>
<meta name="robots" content="index, follow"/>
<meta name="ROBOTS" content="ALL"/>
<meta name="googlebot" content="index, follow"/>
<meta name="Revisit-After" content="1 Days"/>
<meta name="RATING" content="General"/>
<meta name="copyright" content="r?a tabirleri"/>
<link rel="stylesheet" href="http://www.ruya-tabirleri.com/img/ruya1.css" type="text/css"/>
<script type="text/javascript" src="http://apis.google.com/js/plusone.js"/>
</head>