且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

在Perl中,如何将字节数组转换为Unicode字符串?

更新时间:2023-11-15 19:41:34

当然可以.如果你有字节数组

Of course, it is possible. If you have the byte array

my @bytes = (0xce, 0xb1, 0xce, 0xb2, 0xce, 0xb3);

您需要首先将它们组合成一串八位字节:

you need to first combine those into a string of octets:

my $x = join '', map chr, @bytes;

然后,您可以使用 utf8 :: decode 将其转换为UTF-8 就地:

Then, you can use utf8::decode to convert that to UTF-8 in place:

utf8::decode($x)
    or die "Failed to decode UTF-8";

您还可以使用编码:: decode_utf8 .

#!/usr/bin/env perl

use 5.020; # why not?!
use strict;
use warnings;

use Encode qw( decode_utf8 );
use open qw(:std :utf8);

my @bytes = (0xce, 0xb1, 0xce, 0xb2, 0xce, 0xb3);
my $x = join '', map chr, @bytes;

say "Using Encode::decode_utf8";
say decode_utf8($x);

utf8::decode($x)
    or die "Failed to decode in place";

say "Using utf8::decode";
say $x;

输出:

C:\Temp> perl tt.pl  
Using Encode::decode_utf8                      
αβγ                                            
Using utf8::decode                             
αβγ

Encode允许您在许多字符编码之间进行转换.它的功能允许您指定在编码/解码操作失败的情况下发生的情况.使用utf8::decode时,您只能显式检查成功/失败.

Encode allows you to convert among many character encodings. Its functions allow you to specify what happens in case the encoding/decoding operations fail whereas with utf8::decode you are limited to explicitly checking success/failure.