new tellico user

Mathieu BELLEVILLE mathieu.belleville at free.fr
Thu Feb 15 02:37:51 MST 2007


Robby Stephenson <robby at ...> writes:

> 
> Hi Mathieu,
> 
> On Thursday 11 January 2007 5:20, mathieu.belleville at ... wrote:
> > -The allocine.fr data source works in UTF-8, but the allocine web site
> > expects latin-1. If I modify the python code to construct a latin-1
> > search url, then it is tellico that complains that the XML is not
> > OK (indeed, it may contain some latin-1 instead of UTF-8, so it is out of
> > spec).
> 
> As long as the XML specifies the encoding, I think you should be able to use 
> whatever you want. Is the warning a malformed XML error, or something about 
> a character entity? Can you give me an example of a search, I'll try to see 
> if I can make it a bit more consistent.
> 
> > -The export to pilot-DB: my palm (tungsten T5) is set to latin-1. As far
> > as I know, there is no possibility to set the palm to UTF-8. In the
> > tellico export dialog, there was a section about locale, but it was
> > greyed out, so I could not modify the locale to latin-1 instead of UTF-8.
> 
> That's a bit of a relic from when I coded the export dialog. The locale 
> setting only applies to text exports, and the pilot-DB is a binary one, so 
> it gets disabled. I should probably special-case the pilot-DB. In any case, 
> you're not the first to hit that, you can actually modify the tellicorc 
> file directly to do what you need.
> http://periapsis.org/tellico/doc/hidden-options.html#hidden-export-options-pilotdb
> 
> > All in all, a very good piece of software: felicitations.
> 
> Thanks!
> Robby
> 

I found out about the hidden option for pilot-db export. Issue solved.

Regarding the allociné.fr search:
A search for "bronzés" should give two or three answers from allociné, but it
returns nothing, either from tellico or from the command line.
It works if you specify "bronzes", since allociné seems to perform the search
both with and without the accent.
I then tried to specify the encoding of self.__title at line 295 of the python
script, where it says:
 self.__getHTMLContent(self.__searchURL % urllib.quote(self.__title))
The new line is something like (from memory):
 self.__getHTMLContent(self.__searchURL %
 urllib.quote(self.__title.encode('latin-1')))
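To illustrate what that change does to the search URL, here is a minimal sketch. It is not taken from the allocine script: it uses Python 3's urllib.parse.quote instead of the Python 2 urllib.quote call above, and the helper name quote_title is made up for this example.

```python
from urllib.parse import quote


def quote_title(title, encoding):
    """Percent-encode a search title using the given byte encoding.

    Hypothetical helper for illustration only; the real script calls
    urllib.quote directly.
    """
    return quote(title, encoding=encoding)


# "bronz\u00e9s" is "bronzés"; the escape keeps this file ASCII-only.
title = "bronz\u00e9s"

utf8_query = quote_title(title, "utf-8")      # e-acute becomes two bytes
latin1_query = quote_title(title, "latin-1")  # e-acute becomes one byte

print(utf8_query)
print(latin1_query)
```

The accented character ends up as %C3%A9 in the UTF-8 form and as %E9 in the latin-1 form, so a site that decodes the query as latin-1 would only recognise the second one.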

With that change the script did deliver some XML results, but tellico
complained that the result was invalid (sorry, I do not have the exact message
at hand).
My personal guess is that the iso-8859-1 title quoted by urllib ends up in the
XML, and that reading it back as UTF-8 produces a bad result.
I am not a python specialist, so I did not explore any further.
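The guess above can be checked in isolation with a small sketch. It uses Python's own xml.etree.ElementTree parser, not the parser tellico uses, so it only shows the general XML rule and not tellico's exact behaviour. The element name and the parse_title helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# "bronzés" encoded as latin-1: the e-acute is the single byte 0xE9.
title_bytes = "bronz\u00e9s".encode("latin-1")


def parse_title(xml_bytes):
    """Return the text of the root element, or None if the XML is rejected.

    Hypothetical helper for illustration only.
    """
    try:
        return ET.fromstring(xml_bytes).text
    except ET.ParseError:
        return None


# No declaration: the parser assumes UTF-8 and the lone 0xE9 byte is invalid.
undeclared = b"<title>" + title_bytes + b"</title>"

# Same bytes, but the declaration says how they are encoded.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>' + undeclared

print(parse_title(undeclared))
print(parse_title(declared))
```

If this is indeed what happens, there would be two ways out: either the script declares iso-8859-1 in the XML it prints, or it encodes the title to latin-1 only for the search URL and keeps writing UTF-8 everywhere else.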

Mathieu.





