#Lua UTF-8 Module

See the list of Lua standard library modules for more related APIs.

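Variable	Description
`utf8.charpattern`	Pattern string that matches exactly one UTF-8 byte sequence, assuming the subject is a valid UTF-8 string: `"[\0-\x7F\xC2-\xFD][\x80-\xBF]*"`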

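As a rough illustration of `utf8.charpattern`, the sketch below splits a string into characters with `string.gmatch`. It assumes the source file is saved as UTF-8 and that the input is already valid UTF-8, since the pattern itself does not validate anything; the variable names are illustrative only.

```lua
-- Split a string into UTF-8 characters with utf8.charpattern.
-- Assumption: the input is valid UTF-8; the pattern does not check this.
local s = "Lua 编程"
local chars = {}
for ch in string.gmatch(s, utf8.charpattern) do
    chars[#chars + 1] = ch
end

print(#s)                        -- byte length: 10
print(#chars)                    -- character count: 6
print(table.concat(chars, "|"))  -- L|u|a| |编|程
```
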
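Function	Description
utf8.char	Builds a UTF-8 string from code points
utf8.codes	Iterates over the byte positions and code points of a UTF-8 string
utf8.codepoint	Returns the code points of the characters in a UTF-8 string
utf8.len	Returns the number of characters in a UTF-8 string
utf8.offset	Returns the byte position at which a given character of a UTF-8 string starts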

#utf8.char

utf8.char (···)

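Description

Builds a UTF-8 string from Unicode code points: each integer argument is encoded as a UTF-8 byte sequence, and the sequences are concatenated.

Parameters

... - zero or more integer code points

Returns

The resulting UTF-8 string

Example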

local text = utf8.char(
    -- "Primers " (ASCII)
    80, 114, 105, 109, 101, 114, 115, 32,

    -- "编程伙伴" (Chinese)
    0x7F16,  -- 编
    0x7A0B,  -- 程
    0x4F19,  -- 伙
    0x4F34,  -- 伴

    32,  -- space

    -- "https://xplanc.org" (ASCII)
    104, 116, 116, 112, 115, 58, 47, 47, 120, 112, 108, 97, 110, 99, 46, 111, 114, 103
)

print(text)

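To make the encoding step more concrete, here is a small sketch that encodes a few code points one at a time and records how many bytes each occupies. The sample values are chosen only for illustration.

```lua
-- Byte length of the UTF-8 encoding for code points of different sizes.
local samples = { 0x41, 0xE9, 0x7F16, 0x1F600 }  -- A, é, 编, and an emoji
local lengths = {}
for i, cp in ipairs(samples) do
    local encoded = utf8.char(cp)
    lengths[i] = #encoded  -- the # operator counts bytes, not characters
    print(string.format("U+%04X -> %d byte(s)", cp, #encoded))
end
```

The four samples should come out as 1, 2, 3 and 4 bytes respectively, which is why byte positions and character positions diverge in the functions that follow.
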
#utf8.codes

utf8.codes (s [, lax])

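Description

Returns an iterator function that walks the UTF-8 string s one character at a time. An invalid byte sequence raises an error.

Parameters

s - the UTF-8 string to iterate over
lax - if true, lifts the code-point validity checks, so surrogates and values above 0x10FFFF (up to 0x7FFFFFFF) are accepted; defaults to false

Returns

An iterator function; each iteration yields the byte position of a character and its code point

Example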

for pos, code in utf8.codes("Primers 编程伙伴") do
    print(pos, code, utf8.char(code))
end

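Because `utf8.codes` raises an error on invalid input, untrusted strings are often decoded inside `pcall`. The following is a minimal sketch of that idea; the `decode` helper is hypothetical and not part of the `utf8` module.

```lua
-- Hypothetical helper: collect byte positions and code points,
-- returning nil plus the error message if s is not valid UTF-8.
local function decode(s)
    local out = {}
    local ok, err = pcall(function()
        for pos, cp in utf8.codes(s) do
            out[#out + 1] = { pos = pos, cp = cp }
        end
    end)
    if not ok then
        return nil, err
    end
    return out
end

local items = decode("a编b")
for _, item in ipairs(items) do
    print(item.pos, string.format("U+%04X", item.cp))
end
-- The positions are byte offsets: 1, 2, 5

print(decode("ok\xFF"))  -- nil plus an error message (wording may vary by Lua version)
```
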
#utf8.codepoint

utf8.codepoint (s [, i [, j [, lax]]])

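Description

Returns the code points (as integers) of all characters in the UTF-8 string s that start between byte positions i and j, inclusive. An invalid byte sequence raises an error.

Parameters

s - the UTF-8 string to read
i - start position (in bytes); defaults to 1
j - end position (in bytes); defaults to i
lax - if true, lifts the code-point validity checks, so surrogates and values above 0x10FFFF (up to 0x7FFFFFFF) are accepted; defaults to false

Returns

The code points, one return value per character

Example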

print(utf8.codepoint("Primers 编程伙伴", 1, -1))

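The defaults for i and j are easy to overlook, so the sketch below shows what `utf8.codepoint` returns with and without them, and then round-trips the result through `utf8.char`. It assumes the source file is saved as UTF-8.

```lua
local s = "编程伙伴"

-- With no i/j, only the character starting at byte 1 is decoded.
print(utf8.codepoint(s))        -- 32534 (0x7F16, 编)

-- i and j are byte positions; i must point at the start of a character.
print(utf8.codepoint(s, 4))     -- 31243 (0x7A0B, 程)
print(utf8.codepoint(s, 4, 9))  -- code points of 程 and 伙

-- Round trip: decoding everything and re-encoding gives the original string.
local roundtrip = utf8.char(utf8.codepoint(s, 1, -1))
print(roundtrip == s)           -- true
```
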
#utf8.len

utf8.len (s [, i [, j [, lax]]])

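Description

Returns the number of UTF-8 characters in the string s that start between byte positions i and j, inclusive.

Parameters

s - the UTF-8 string to measure
i - start position (in bytes); defaults to 1
j - end position (in bytes); defaults to -1
lax - if true, lifts the code-point validity checks, so surrogates and values above 0x10FFFF (up to 0x7FFFFFFF) are accepted; defaults to false

Returns

The number of characters; if an invalid byte sequence is found, returns fail (nil) followed by the byte position of the first invalid byte

Example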

print(utf8.len("编程伙伴"))

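The difference between byte length and character count, and the behavior on invalid input, can be sketched as follows. The sample strings are illustrative, and the source file is assumed to be saved as UTF-8.

```lua
local s = "Primers 编程伙伴"

print(#s)              -- 20 bytes (8 ASCII bytes + 4 characters of 3 bytes each)
print(utf8.len(s))     -- 12 characters

-- Count only the characters that start at or after byte 9.
print(utf8.len(s, 9))  -- 4

-- Invalid input: the first result is nil, the second is the offending byte position.
local n, bad_pos = utf8.len("ab\xFFcd")
print(n, bad_pos)      -- nil   3
```

Unlike `utf8.codes` and `utf8.codepoint`, `utf8.len` reports invalid input through its return values instead of raising an error, which makes it a convenient validity check.
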
#utf8.offset

utf8.offset (s, n [, i])

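Description

Returns the byte position at which the n-th character of the UTF-8 string s starts, counting from byte position i. A negative n counts characters before position i; when n is 0, the result is the start of the character that contains byte i.

Parameters

s - the string to operate on
n - which character to locate, relative to position i
i - start position (in bytes); defaults to 1 when n is non-negative and to #s + 1 otherwise

Returns

The byte position, or fail (nil) if the requested character is neither in the string nor immediately after its end

Example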

print(utf8.offset("编程伙伴", 3))       -- byte position of the third UTF-8 character (伙): 7
print(utf8.offset("编程伙伴", 3, 4))    -- byte position of the third character counting from byte 4 (伴): 10

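A common use of `utf8.offset` is taking a substring by character index instead of byte index. The `utf8_sub` helper below is hypothetical (it is not part of the `utf8` module) and is only a sketch: it assumes s is valid UTF-8 and that 1 <= first <= last <= utf8.len(s).

```lua
-- Hypothetical helper: substring by character index (1-based, inclusive).
-- Assumes s is valid UTF-8 and 1 <= first <= last <= utf8.len(s).
local function utf8_sub(s, first, last)
    local start_byte = utf8.offset(s, first)
    -- Start of the character after `last`; equals #s + 1 at the end of s.
    local next_byte = utf8.offset(s, last + 1)
    return string.sub(s, start_byte, next_byte - 1)
end

print(utf8_sub("编程伙伴", 2, 3))   -- 程伙

-- A negative n counts back from the end of the string.
print(utf8.offset("编程伙伴", -1))  -- 10, the start of 伴
```
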
#Further Reading

UTF-8 Support - Lua 5.4 Reference Manual