I am trying to fetch a web page, http://www.ruya-tabirleri.com/ruyada-adam-gormek.html, that contains Turkish characters. When I retrieve it in BaseX, the page is returned with charset=windows-1254, but as you can see below, the Turkish characters are not rendered correctly:
content="r?a tabirleri, r?a yorumlar? r?a tabiri, r?a, r?atabirleri, haberci r?a, r?a t?leri, r?a nedir, aksenov, forsa, atat?k, evliya ?lebi, bur?ar, astroloji"/>
<html lang="tr-TR"> <head profile="http://gmpg.org/xfn/11"> <title>R?a Tabirleri, R?a Tabiri, R?a Yorumlar? R?a, r?atabirleri, ruya tabirleri</title> <meta name="keywords" content="r?a tabirleri, r?a yorumlar? r?a tabiri, r?a, r?atabirleri, haberci r?a, r?a t?leri, r?a nedir, aksenov, forsa, atat?k, evliya ?lebi, bur?ar, astroloji"/> <meta name="description" content="r?a tabirleri, r?a tabiri, r?a yorumlar? r?a, ruya tabirleri, r?atabirleri, r?alar?? bu sitede anlam kazanacak."/> <meta http-equiv="Content-Language" content="tr"/> <meta http-equiv="Content-Type" content="text/html; charset=windows-1254"/> <meta name="Author" content="Ruya-tabirleri.com"/> <meta name="robots" content="index, follow"/> <meta name="ROBOTS" content="ALL"/> <meta name="googlebot" content="index, follow"/> <meta name="Revisit-After" content="1 Days"/> <meta name="RATING" content="General"/> <meta name="copyright" content="r?a tabirleri"/> <link rel="stylesheet" href="http://www.ruya-tabirleri.com/img/ruya1.css" type="text/css"/> <script type="text/javascript" src="http://apis.google.com/js/plusone.js "/> </head>
Hi Erol,
Please give us more details on how you are trying to retrieve the web page.
One solution could look as follows:
html:parse(fetch:text('http://www.ruya-tabirleri.com/ruyada-adam-gormek.html', 'windows-1254'), map { 'encoding': 'UTF-8' })
Cheers, Christian
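For readers following along, the one-liner above can be split into separate steps. This is only a minimal sketch, not necessarily how Christian would structure it: it assumes the server really delivers windows-1254 bytes and that TagSoup is available on the BaseX classpath so that html:parse can handle non-well-formed HTML. The variable names are illustrative, and the 'encoding' option from the original call is left out here because the text has already been decoded by fetch:text.

```xquery
(: Sketch only: the same idea as the one-liner above, split into steps. :)
let $url  := 'http://www.ruya-tabirleri.com/ruyada-adam-gormek.html'
(: Decode the raw bytes as windows-1254, the charset the page declares. :)
let $text := fetch:text($url, 'windows-1254')
(: Parse the already decoded string as HTML. :)
let $doc  := html:parse($text)
(: *:title matches the title element with or without a namespace. :)
return $doc//*:title/string()
```

If the title still shows question marks, the bytes sent by the server may not match the declared charset, or the console displaying the result may not be using an encoding that can represent Turkish characters.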
basex-talk@mailman.uni-konstanz.de