The LaTeX hyperref package has a CJKbookmarks option, but as far as I could work out, it simply passes the strings through in their current encoding and does not convert them to Unicode.
One option to get the bookmarks right is to use \texorpdfstring and to provide both encodings, one for the text and one for the bookmarks.
Another option is to let hyperref pass the strings through in their current encoding and to convert them afterwards. The hyperref package stores the bookmarks in a .out file. Assuming the text is encoded in GB2312, the following script, bookmark-gb2utf.pl, can be used to convert the bookmarks to the correct encoding (UTF-16BE with a leading byte order mark, written as octal escapes).
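#!/usr/bin/perl
use strict;
use warnings;
use Encode;

# Return one byte as a three-digit octal escape, e.g. "\376".
sub octal {
    my ($t) = @_;
    return sprintf "\\%03o", ord($t);
}

# Convert a GB2312 title to octal-escaped UTF-16BE with a leading byte
# order mark; titles that contain only ASCII are returned unchanged.
sub convert {
    my ($t) = @_;
    if ($t =~ /[\x80-\xFF]/) {
        Encode::from_to($t, "GB2312", "UTF-16BE");
        # /s lets "." also match a 0x0A byte inside the UTF-16BE data.
        $t =~ s/(.)/octal($1)/ges;
        $t = "\\376\\377" . $t;
    }
    return $t;
}

# Rewrite the second brace group (the title) on each line of the .out file.
# The braces are escaped so that the pattern is unambiguous to newer Perls.
while (<>) {
    s/([^}]*\}\{)([^}]*)/$1 . convert($2)/e;
    print;
}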
The script could be used as follows:
latex test.tex
./bookmark-gb2utf.pl < test.out > test-utf.out
mv test-utf.out test.out
latex test.tex
The second run of latex reads back in the (converted) bookmarks from test.out.
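To make the conversion step concrete, here is a minimal sketch that applies the same transformation to a single title. The helper name gb2312_to_pdf_octal is hypothetical and not part of bookmark-gb2utf.pl, and unlike the script it converts every string, including ASCII-only ones. The sample bytes D6 D0 CE C4 are the GB2312 encoding of the two characters U+4E2D U+6587.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of the original script): turn a string of
# GB2312 bytes into the octal-escaped UTF-16BE form written to the .out file.
sub gb2312_to_pdf_octal {
    my ($bytes) = @_;
    # Decode the GB2312 bytes to characters, then re-encode as UTF-16BE bytes.
    my $utf16 = encode("UTF-16BE", decode("GB2312", $bytes));
    # Prefix the byte order mark FE FF, then escape every byte as \ooo.
    return join "", map { sprintf "\\%03o", $_ } 0xFE, 0xFF, unpack("C*", $utf16);
}

# GB2312 bytes for U+4E2D U+6587.
print gb2312_to_pdf_octal("\xD6\xD0\xCE\xC4"), "\n";
```

This should print \376\377\116\055\145\207, that is, the byte order mark FE FF followed by the UTF-16BE bytes 4E 2D 65 87, each written in octal.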
Sven Verdoolaege