Discussion:
We need a solution about .po files && UTF-8
Carlos Perelló Marín
2002-04-27 22:33:28 UTC
An introduction for gnome-hackers readers...


We have a problem with l10n strings for GNOME 2.0. On all systems with
glibc < 2.2 the autorecode feature of glibc is not available, so all .po
files should be encoded as UTF-8. We need the maintainers' opinions on this
problem because it's a critical bug that needs a fix before the GNOME 2.0
release.

We have several options:

1.- Store all .po files for GNOME 2.0 at cvs.gnome.org as UTF-8
2.- Recode all .po files as UTF-8 at distribution time.

The "problems" are:

Option 1 can cause problems for translators because of the lack of
UTF-8-capable editors for updating the files.

Option 2 forces all maintainers to use gettext >= 0.11 when they run
make dist, and we would have to add a rule to run the recode.

Here is the link to the thread with more information about this issue:

http://mail.gnome.org/archives/gnome-i18n/2002-April/msg00075.html


Cheers.
--
Carlos Perelló Marín
mailto:***@gnome-db.org
mailto:***@hispalinux.es
http://www.gnome-db.org
http://www.Hispalinux.es
Valencia - Spain
Diego Sevilla Ruiz
2002-04-27 22:38:35 UTC
Hi, Carlos:

Option 2 seems to be worse, because many packages may forget to make the
change...

Also, what editors are there that support UTF-8? Can I play with emacs?

Best regards.
diego

--
Diego Sevilla Ruiz http://ditec.um.es/~dsevilla ***@um.es \ /\
Dpto. Ingeniería y Tecnología de Computadores http://ditec.um.es ) ( ')
Visiting Extreme! Computing Lab http://extreme.indiana.edu ( / )
Indiana University, Bloomington http://www.iub.edu \(__)|
Havoc Pennington
2002-04-27 23:13:08 UTC
Post by Carlos Perelló Marín
Option 2 forces all maintainers to use gettext >= 0.11 when they run
make dist, and we would have to add a rule to run the recode.
Would this mean that when compiling from CVS, translations would be broken?

Havoc
Carlos Perelló Marín
2002-04-27 23:30:13 UTC
Post by Havoc Pennington
Post by Carlos Perelló Marín
Option 2 forces all maintainers to use gettext >= 0.11 when they run
make dist, and we would have to add a rule to run the recode.
Would this mean that when compiling from CVS, translations would be broken?
Good point. With the option we had in mind, yes. Translations
would be broken unless you run make dist first, so we need to
revise the solution or find another option.

Also, I think a maintainer usually does a cvs ci after a make dist, so
all .po files would be committed as UTF-8...

So we have two problems here :-(.

I think the best solution is to store everything as UTF-8, but please
tell me if you have a better fix.

Cheers.
Carlos Perelló Marín
2002-04-27 23:32:39 UTC
Hi
Post by Diego Sevilla Ruiz
Option 2 seems to be worse, because many packages may forget to make the
change...
Also, what editors are there that support UTF-8? Can I play with emacs?
Gedit2
gtranslator (but someone told me that it does not work as well as it
should).
kbabel
emacs (if you read the gnome-i18n thread, you will see that it seems to
have some problems).
...


Cheers.
Jody Goldberg
2002-04-28 00:06:12 UTC
Post by Carlos Perelló Marín
Post by Havoc Pennington
Would this mean that when compiling from CVS, translations would be broken?
Good point. With the option we had in mind, yes. Translations
would be broken unless you run make dist first, so we need to
revise the solution or find another option.
That would seem to make the choice fairly clear.
po files should be in utf8.

Given the choice between requiring developers to have an ultra-modern
libc and having translators use a restricted subset of tools, the latter
seems less restrictive. The former means that some platforms will
not be able to run dists (do all the *BSDs support this?)

Does this mean
bind_textdomain_codeset (GETTEXT_PACKAGE, "UTF-8");
would no longer be required ?
Owen Taylor
2002-04-28 00:20:07 UTC
Post by Jody Goldberg
Post by Carlos Perelló Marín
Post by Havoc Pennington
Would this mean that when compiling from CVS, translations would be broken?
Good point. With the option we had in mind, yes. Translations
would be broken unless you run make dist first, so we need to
revise the solution or find another option.
That would seem to make the choice fairly clear.
po files should be in utf8.
Given the choice between requiring developers to have an ultra-modern
libc and having translators use a restricted subset of tools, the latter
seems less restrictive. The former means that some platforms will
not be able to run dists (do all the *BSDs support this?)
Does this mean
bind_textdomain_codeset (GETTEXT_PACKAGE, "UTF-8");
would no longer be required ?
It's still required because otherwise recent gettext()
and GNU libc will translate the translations back from
UTF-8 to the encoding of the locale.

Regards,
Owen
Carlos Perelló Marín
2002-04-28 00:20:47 UTC
Post by Jody Goldberg
Post by Carlos Perelló Marín
Post by Havoc Pennington
Would this mean that when compiling from CVS, translations would be broken?
Good point. With the option we had in mind, yes. Translations
would be broken unless you run make dist first, so we need to
revise the solution or find another option.
That would seem to make the choice fairly clear.
po files should be in utf8.
Given the choice between requiring developers to have an ultra-modern
libc and having translators use a restricted subset of tools, the latter
seems less restrictive. The former means that some platforms will
not be able to run dists (do all the *BSDs support this?)
Does this mean
bind_textdomain_codeset (GETTEXT_PACKAGE, "UTF-8");
would no longer be required ?
No, we will still need this line, because with glibc >= 2.2 the
translations are recoded to the encoding of your default locale; for
example, in Spain we use ISO-8859-15, which is not UTF-8.
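
To make that recoding concrete, here is a small illustration that uses
iconv as a stand-in for the conversion glibc performs; it does not call
gettext itself. The sample string and the helper name utf8_sample are made
up for the example, and it assumes an iconv that knows ISO-8859-15:

```shell
#!/bin/sh
# Illustration only: iconv stands in for the recoding done by gettext/glibc
# when no output codeset has been bound for the text domain.

# Hypothetical sample: "Perello" with an o-acute, written as UTF-8 bytes
# (the o-acute is the two bytes 0xC3 0xB3, given here in octal).
utf8_sample() { printf 'Perell\303\263'; }

# As stored in a UTF-8 .po/.gmo file: wc reports 8 bytes.
utf8_sample | wc -c

# After recoding to an ISO-8859-15 locale: wc reports 7 bytes, because the
# o-acute is now the single byte 0xF3, which is not valid UTF-8 on its own.
utf8_sample | iconv -f UTF-8 -t ISO-8859-15 | wc -c
```

Calling bind_textdomain_codeset (GETTEXT_PACKAGE, "UTF-8") asks gettext to
return the strings in UTF-8 regardless of the locale's encoding, so the
application never sees the 7-byte form.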

We would not need this line if all GNOME users changed their locale to
LOCALE.UTF-8 or something like that, but IMHO it's really difficult to
change it for all GNOME 2.0 users.

cheers.
R.I.P. Deaddog
2002-04-28 00:21:14 UTC
Post by Jody Goldberg
That would seem to make the choice fairly clear.
po files should be in utf8.
Given the choice between requiring developers to have an ultra-modern
libc and having translators use a restricted subset of tools, the latter
seems less restrictive. The former means that some platforms will
not be able to run dists (do all the *BSDs support this?)
Does this mean
bind_textdomain_codeset (GETTEXT_PACKAGE, "UTF-8");
would no longer be required ?
bind_textdomain_codeset is still vital. Gnect from gnome-games 1.90.2
tarball lacks this call, and localized strings are not displayed
as a result, even when the po file is UTF-8 encoded.
--
Abel Cheung
GPG Key: (0xC67186FF) http://deaddog.org/gpg.asc
Karl Eichwalder
2002-04-28 03:22:41 UTC
Post by Carlos Perelló Marín
emacs (if you read the gnome-i18n thread, you will see that it seems to
have some problems).
...
Using Emacs from CVS head (plus the settings I posted) works for quite
a few languages, though. Emacs 21.2 features limited UTF-8 support.

BTW, I don't see a problem with shipping .gmo files UTF-8 encoded while
the .po files stay unchanged. Converting the .gmo files can happen
either at 'make dist' or at 'make install' time.
--
***@suse.de (work) / ***@gmx.net (home): |
http://www.suse.de/~ke/ | ,__o
Free Translation Project: | _-\_<,
http://www.iro.umontreal.ca/contrib/po/HTML/ | (*)/'(*)
Rodrigo Moya
2002-04-28 11:56:13 UTC
Post by Karl Eichwalder
Post by Carlos Perelló Marín
emacs (if you read the gnome-i18n thread, you will see that it seems to
have some problems).
...
Using Emacs from CVS head (plus the settings I posted) works for quite
a few languages, though. Emacs 21.2 features limited UTF-8 support.
BTW, I don't see a problem with shipping .gmo files UTF-8 encoded while
the .po files stay unchanged. Converting the .gmo files can happen
either at 'make dist' or at 'make install' time.
Yes, indeed, the rule for creating the .gmo files from the .po files can
be modified to run the conversion to UTF-8 at the time the .gmo files are
generated. Thus, translations won't be broken when installing sources
from CVS.

cheers
Carlos Perelló Marín
2002-04-28 12:23:43 UTC
Post by Rodrigo Moya
Post by Karl Eichwalder
Post by Carlos Perelló Marín
emacs (if you read the gnome-i18n thread, you will see that it seems to
have some problems).
...
Using Emacs from CVS head (plus the settings I posted) works for quite
a few languages, though. Emacs 21.2 features limited UTF-8 support.
BTW, I don't see a problem with shipping .gmo files UTF-8 encoded while
the .po files stay unchanged. Converting the .gmo files can happen
either at 'make dist' or at 'make install' time.
Yes, indeed, the rule for creating the .gmo files from the .po files can
be modified to run the conversion to UTF-8 at the time the .gmo files are
generated. Thus, translations won't be broken when installing sources
from CVS.
The problems here are:

1.- A maintainer always does a cvs ci after a make dist, so all .po files
would be UTF-8 by then and would be committed as UTF-8 unless the
maintainer takes care to avoid it.
2.- Everyone who wants to compile the module from CVS would need
gettext >= 0.11 to be able to compile it correctly.

IMHO it is much easier to recode a file at translation time than at
build time (but that's only my opinion).

Cheers.


P.S.: I will send an email to gnome2-release-***@gnome.org asking for a
final decision, so please send all your arguments here to let them
choose the best solution.
Karl Eichwalder
2002-04-28 12:44:34 UTC
Post by Carlos Perelló Marín
1.- A maintainer always does a cvs ci after a make dist, so all .po files
would be UTF-8 by then and would be committed as UTF-8 unless the
maintainer takes care to avoid it.
No. Conversion will happen in a pipe:

msgconv -o - --to UTF-8 LL.po | msgfmt -o LL.gmo -
Post by Carlos Perelló Marín
2.- Everyone who wants to compile the module from CVS would need
gettext >= 0.11 to be able to compile it correctly.
I don't think the lack of gettext 0.11.x will necessarily cause a
fatal error. If 'msgconv' is missing, just fall back to 'cat'. This
should work quite well for most of those who don't have a reasonable
glibc or gettext version installed (I bet they are mainly interested in
English messages only).
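
A minimal sketch of that fallback, assuming a POSIX shell. The helper name
po_as_utf8 is hypothetical and not part of gettext or intltool; only
msgconv, msgfmt and cat are real commands, and LL.po is a placeholder:

```shell
#!/bin/sh
# Hypothetical helper: write a .po file to stdout, recoded to UTF-8 when
# msgconv (gettext >= 0.11) is installed, and unchanged otherwise.
po_as_utf8() {
    if command -v msgconv >/dev/null 2>&1; then
        # msgconv also rewrites the charset in the header entry.
        msgconv -o - -t UTF-8 "$1"
    else
        # Older gettext: pass the file through as-is.
        cat "$1"
    fi
}

# Intended use in a .gmo rule (left commented out, LL.po is a placeholder):
# po_as_utf8 LL.po | msgfmt -o LL.gmo -
```

Because the recoded text only travels through the pipe, the .po file on
disk stays exactly as the translator committed it. With the cat fallback
the .gmo keeps the original encoding, so it is only a best-effort path.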
Carlos Perelló Marín
2002-04-28 14:07:07 UTC
kmaraas has just told me that we can depend on gettext 0.11 without
problems so the main question now is...

Kenneth, is there an easy hack in the intltool package to execute something
like "msgconv -o - --to UTF-8 LL.po | msgfmt -o LL.gmo -" instead of
the current procedure for generating the .gmo files?

If it works without problems, then I will change my mind about this issue
and agree with this solution.

P.S.: We should also change the .po header from the local encoding to
UTF-8.

Cheers.
Karl Eichwalder
2002-04-28 14:14:25 UTC
Post by Carlos Perelló Marín
kmaraas has just told me that we can depend on gettext 0.11 without
problems so the main question now is...
Great!
Post by Carlos Perelló Marín
P.S.: We should also change the .po header from the local encoding to
UTF-8.
This is one of the reasons I'm voting for msgconv -- you'll get header
adjustments for free:

***@tux:~/Projects> grep charset sh-utils-2.0.11.de.po
"Content-Type: text/plain; charset=ISO-8859-1\n"

***@tux:~/Projects> /gnu/bin/msgconv -o - --to UTF-8 sh-utils-2.0.11.de.po \
| grep charset
"Content-Type: text/plain; charset=UTF-8\n"
ERDI Gergo
2002-04-28 14:23:43 UTC
Post by Carlos Perelló Marín
Kenneth, is there an easy hack in the intltool package to execute something
like "msgconv -o - --to UTF-8 LL.po | msgfmt -o LL.gmo -" instead of
the current procedure for generating the .gmo files?
if this is for gnome2, those apps will use glib-gettextize's
po/Makefile.in.in so we can definitely tweak stuff like .gmo creation
rules.

-- 
.--= ULLA! =---------------------. `We are not here to give users what
\ http://cactus.rulez.org \ they want' -- RMS, at GUADEC 2001
`---= ***@cactus.rulez.org =---'
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next (/.)
Owen Taylor
2002-04-28 14:25:39 UTC
Post by Carlos Perelló Marín
An introduction for gnome-hackers readers...
We have a problem with l10n strings for GNOME 2.0. On all systems with
glibc < 2.2 the autorecode feature of glibc is not available, so all .po
files should be encoded as UTF-8. We need the maintainers' opinions on this
problem because it's a critical bug that needs a fix before the GNOME 2.0
release.
1.- Store all .po files for GNOME 2.0 at cvs.gnome.org as UTF-8
2.- Recode all .po files as UTF-8 at distribution time.
GTK+ will continue to use plan 1. here, as it does currently (and GLib
will be switched over to do the same whenever I get a bit of time to
do it.)

Distributing files that differ from what is in CVS is
(in my opinion) a bad idea ... files in a distributed tarball
should be one of:

- Source files, identical to what the maintainer uses
- Files generated from the source files.

Plus, if tarballs are different from what people are installing
from CVS, handling bugs and QA-ing what we release is going to
get harder.

Yes, it's a little more difficult for translators, but there
are several decent solutions (use the conversion scripts,
use a Unicode-capable editor, use a UTF-8 locale with a locale
friendly editor.)

Regards,
Owen
Carlos Perelló Marín
2002-04-28 14:48:43 UTC
Post by Owen Taylor
Post by Carlos Perelló Marín
An introduction for gnome-hackers readers...
We have a problem with l10n strings for GNOME 2.0. On all systems with
glibc < 2.2 the autorecode feature of glibc is not available, so all .po
files should be encoded as UTF-8. We need the maintainers' opinions on this
problem because it's a critical bug that needs a fix before the GNOME 2.0
release.
1.- Store all .po files for GNOME 2.0 at cvs.gnome.org as UTF-8
2.- Recode all .po files as UTF-8 at distribution time.
GTK+ will continue to use plan 1. here, as it does currently (and GLib
will be switched over to do the same whenever I get a bit of time to
do it.)
Distributing files that differ from what is in CVS is
(in my opinion) a bad idea ... files in a distributed tarball
It seems that we have an option that solves this issue. The CVS contents
and the dist tar.gz will have the same .po files; the recoding will happen
only when the .gmo file is generated, and that file is not at
cvs.gnome.org, so the problem goes away.

The only requirement is that everyone who wants to install GNOME 2.0 from
sources will need gettext >= 0.11 because, as cactus has reminded me, we
can modify glib-gettextize to produce the .gmo files directly as UTF-8.


Bye
Owen Taylor
2002-04-28 15:13:05 UTC
Carlos Perelló Marín <***@gnome-db.org> writes:
Sander Vesik
2002-04-28 15:41:50 UTC
Sander Vesik
2002-04-28 15:58:07 UTC