且构网


In Haskell, how do I get the number of bytes in a UTF-8 string?

Updated: 2023-11-15 19:45:58

You can use the excellent utf8-string package for this.

import qualified Data.ByteString as BS
import qualified Data.ByteString.UTF8 as UTF8

numBytesUtf8 :: String -> Int
numBytesUtf8 = BS.length . UTF8.fromString

Then, to use your example,

ghci> numBytesUtf8 "Hello Snowman ☃!"
18

Of course, you probably should not be doing this in the first place. UTF8.fromString and BS.length are the functions you want here, but if you care how many bytes the encoding takes, your strings probably ought to be bytestrings already.
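
As a rough sketch of that last point, assuming you can encode once at the boundary and keep the value as a ByteString afterwards, the byte count and the character count are then both a single call away. The names greeting, byteCount and charCount are made up for illustration; BS.length and UTF8.length come from the bytestring and utf8-string packages.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.UTF8 as UTF8

-- Illustrative value: encode the String to UTF-8 once and keep the
-- resulting ByteString around instead of the original String.
greeting :: BS.ByteString
greeting = UTF8.fromString "Hello Snowman ☃!"

-- Number of bytes in the UTF-8 encoding.
byteCount :: Int
byteCount = BS.length greeting

-- Number of characters decoded from those bytes.
charCount :: Int
charCount = UTF8.length greeting

main :: IO ()
main = do
  print byteCount  -- 18: the snowman takes three bytes
  print charCount  -- 16
```

The two counts differ only because ☃ (U+2603) is encoded as three bytes; for pure ASCII input they would be the same.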