Christian,
I made a slight modification to the script: only 10 sites are pulled, and output is produced. I have attached the new script and the result you will get when you run it.
On Fri, Jul 4, 2014 at 7:13 PM, Christian Grün christian.gruen@gmail.com wrote:
Hi Erol,
thanks for the script. Do you think you could provide us with a minimized example?
Christian
On Fri, Jul 4, 2014 at 11:43 PM, Erol Akarsu eakarsu@gmail.com wrote:
Christian,
My XQuery script and its result are attached. You will see incorrectly encoded characters shown as "?".
On Fri, Jul 4, 2014 at 3:44 PM, Christian Grün <christian.gruen@gmail.com> wrote:
Hi Erol,
I am trying to fetch a page, http://www.ruya-tabirleri.com/ruyada-adam-gormek.html, that has Turkish characters in it. When I pull it in BaseX, the page is returned with charset=windows-1254, but as you can see, the Turkish characters are not rendered correctly, as shown here:
Please give us more details on how you are trying to retrieve the web page.
One solution could look as follows:
html:parse(fetch:text('http://www.ruya-tabirleri.com/ruyada-adam-gormek.html', 'windows-1254'), map { 'encoding': 'UTF-8' })
Cheers, Christian
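The effect of passing the charset explicitly, as in the snippet above, can be reproduced outside BaseX. The following is only a minimal Python sketch, not the script from this thread: the sample string is made up for illustration (it is not copied from the page), and cp1254 is Python's codec name for windows-1254. It shows that bytes sent as windows-1254 decode cleanly with that charset, while decoding them as UTF-8 loses the Turkish letters.

```python
# Illustration only: this sample string is an assumption, not page content.
sample = "rüya tabirleri, atatürk, evliya çelebi"

# Bytes as a server would send them for charset=windows-1254
# (Python's codec name for that charset is cp1254).
raw = sample.encode("cp1254")

# Decoding with the declared charset restores the Turkish letters.
decoded_ok = raw.decode("cp1254")

# Decoding the same bytes as UTF-8 fails on the non-ASCII bytes;
# errors="replace" substitutes U+FFFD for each undecodable sequence.
decoded_bad = raw.decode("utf-8", errors="replace")

print(decoded_ok)
print(decoded_bad)
```

The second line printed contains replacement characters where the Turkish letters were, which is similar in kind to the "?" characters in the output quoted in this thread.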
<html lang="tr-TR">
<head profile="http://gmpg.org/xfn/11">
<title>R?a Tabirleri, R?a Tabiri, R?a Yorumlar? R?a, r?atabirleri, ruya tabirleri</title>
<meta name="keywords" content="r?a tabirleri, r?a yorumlar? r?a tabiri, r?a, r?atabirleri, haberci r?a, r?a t?leri, r?a nedir, aksenov, forsa, atat?k, evliya ?lebi, bur?ar, astroloji"/>
<meta name="description" content="r?a tabirleri, r?a tabiri, r?a yorumlar? r?a, ruya tabirleri, r?atabirleri, r?alar?? bu sitede anlam kazanacak."/>
<meta http-equiv="Content-Language" content="tr"/>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1254"/>
<meta name="Author" content="Ruya-tabirleri.com"/>
<meta name="robots" content="index, follow"/>
<meta name="ROBOTS" content="ALL"/>
<meta name="googlebot" content="index, follow"/>
<meta name="Revisit-After" content="1 Days"/>
<meta name="RATING" content="General"/>
<meta name="copyright" content="r?a tabirleri"/>
<link rel="stylesheet" href="http://www.ruya-tabirleri.com/img/ruya1.css" type="text/css"/>
<script type="text/javascript" src="http://apis.google.com/js/plusone.js"/>
</head>