加入收藏 | 设为首页 | 会员中心 | 我要投稿 阜阳站长网 (https://www.0558zz.com/)- 科技、建站、内容创作、云计算、网络安全!
当前位置: 首页 > 数据库 > MsSql > 正文

将SQL Server中的文本值从UTF8转换为ISO 8859-1

发布时间:2021-03-30 10:12:42 所属栏目:MsSql 来源:互联网
导读:我在SQL Server中有一个使用utf8 SQL_Latin1_General_CP1_CI_AS编码的列.如何转换和保存ISO 8859-1编码的文本?我想在SQL Server的查询中做一些事情.有小费吗? Ol. Gostei do jogo. Quando “baixei” at achei que no iria curtir muito 我编写了一个函数来

我在SQL Server中有一个使用utf8 SQL_Latin1_General_CP1_CI_AS编码的列.如何转换和保存ISO 8859-1编码的文本?我想在SQL Server的查询中做一些事情.有小费吗?

Ol. Gostei do jogo. Quando “baixei” at achei que no iria curtir muito

解决方法

我编写了一个函数来修复存储在varchar字段中的UTF-8文本.

要检查固定值,您可以像这样使用它:

CREATE TABLE #Table1 (Column1 varchar(max))

INSERT #Table1
VALUES ('Ol. Gostei do jogo. Quando "baixei" at achei que no iria curtir muito')

SELECT *,NewColumn1 = dbo.DecodeUTF8String(Column1)
FROM Table1
WHERE Column1 <> dbo.DecodeUTF8String(Column1)

输出:

Column1
-------------------------------
Ol. Gostei do jogo. Quando "baixei" at achei que no iria curtir muito

NewColumn1
-------------------------------
Olá. Gostei do jogo. Quando "baixei" até achei que no iria curtir muito

代码:

CREATE FUNCTION dbo.DecodeUTF8String (@value varchar(max))
RETURNS nvarchar(max)
AS
BEGIN
    -- Transforms a UTF-8 encoded varchar string into Unicode
    -- By Anthony Faull 2014-07-31
    DECLARE @result nvarchar(max);

    -- If ASCII or null there's no work to do
    IF (@value IS NULL
        OR @value NOT LIKE '%[^ -~]%' COLLATE Latin1_General_BIN
    )
        RETURN @value;

    -- Generate all integers from 1 to the length of string
    WITH e0(n) AS (SELECT TOP(POWER(2,POWER(2,0))) NULL FROM (VALUES (NULL),(NULL)) e(n)),e1(n) AS (SELECT TOP(POWER(2,1))) NULL FROM e0 CROSS JOIN e0 e),e2(n) AS (SELECT TOP(POWER(2,2))) NULL FROM e1 CROSS JOIN e1 e),e3(n) AS (SELECT TOP(POWER(2,3))) NULL FROM e2 CROSS JOIN e2 e),e4(n) AS (SELECT TOP(POWER(2,4))) NULL FROM e3 CROSS JOIN e3 e),e5(n) AS (SELECT TOP(POWER(2.,5)-1)-1) NULL FROM e4 CROSS JOIN e4 e),numbers(position) AS
    (
        SELECT TOP(DATALENGTH(@value)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM e5
    )
    -- UTF-8 Algorithm (http://en.wikipedia.org/wiki/UTF-8)
    -- For each octet,count the high-order one bits,and extract the data bits.,octets AS
    (
        SELECT position,highorderones,partialcodepoint
        FROM numbers a
        -- Split UTF8 string into rows of one octet each.
        CROSS APPLY (SELECT octet = ASCII(SUBSTRING(@value,position,1))) b
        -- Count the number of leading one bits
        CROSS APPLY (SELECT highorderones = 8 - FLOOR(LOG( ~CONVERT(tinyint,octet) * 2 + 1)/LOG(2))) c
        CROSS APPLY (SELECT databits = 7 - highorderones) d
        CROSS APPLY (SELECT partialcodepoint = octet % POWER(2,databits)) e
    )
    -- Compute the Unicode codepoint for each sequence of 1 to 4 bytes,codepoints AS
    (
        SELECT position,codepoint
        FROM
        (
            -- Get the starting octect for each sequence (i.e. exclude the continuation bytes)
            SELECT position,partialcodepoint
            FROM octets
            WHERE highorderones <> 1
        ) lead
        CROSS APPLY (SELECT sequencelength = CASE WHEN highorderones in (1,2,3,4) THEN highorderones ELSE 1 END) b
        CROSS APPLY (SELECT endposition = position + sequencelength - 1) c
        CROSS APPLY
        (
            -- Compute the codepoint of a single UTF-8 sequence
            SELECT codepoint = SUM(POWER(2,shiftleft) * partialcodepoint)
            FROM octets
            CROSS APPLY (SELECT shiftleft = 6 * (endposition - position)) b
            WHERE position BETWEEN lead.position AND endposition
        ) d
    )
    -- Concatenate the codepoints into a Unicode string
    SELECT @result = CONVERT(xml,(
            SELECT NCHAR(codepoint)
            FROM codepoints
            ORDER BY position
            FOR XML PATH('')
        )).value('.','nvarchar(max)');

    RETURN @result;
END
GO

(编辑:阜阳站长网)

【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容!

    推荐文章
      热点阅读