WeChat File Transfer Assistant (web version): Chinese encoding problem - V2EX
    chenliang0571 · 228 days ago · 1324 views

    Does anyone know what the original Chinese encoding is here, and how to decode it?

    The string returned in the JSON:

    …è“èè’è”±¨…‘°#TODS2023§ ” #TodsTheItalianPortrait ‘Walter Chiapponi 

    Actual content:

    流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023 秋冬 男士系列。 #TodsTheItalianPortrait 创意总监:Walter Chiapponi 

    I asked several AIs, and most of them suggested decoding it like this:

    const iconv = require('iconv-lite');
    const garbledText = "…è“èè’è”±¨…‘°#TODS2023§ ”\n#TodsTheItalianPortrait\n\n‘Walter Chiapponi";
    const buf = Buffer.from(garbledText, 'binary');
    // const decodedText = iconv.decode(buf, 'windows-1252');
    // const decodedText = iconv.decode(buf, 'latin1');
    // const decodedText = iconv.decode(buf, 'gbk');
    const decodedText = iconv.decode(buf, 'utf-8');
    console.log(decodedText);

    But this is the actual output; only a small part of the content is decoded:

    流"&轮精! `R!9`R&}#TODS20239 士系 #TodsTheItalianPortrait ::aWalter Chiapponi 
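
    A likely reason the attempt above recovers only part of the text: in Node.js, the 'binary' encoding is an alias for 'latin1', which keeps only the low byte of each character. Anything above U+00FF (such as …, U+2026) is truncated instead of being mapped back to its windows-1252 byte. A minimal sketch of that effect using only Node's built-in Buffer; the helper name lowBytes is made up for illustration:

```javascript
// Hypothetical helper: list the bytes that Buffer.from(text, 'binary') produces.
function lowBytes(text) {
  // 'binary' is an alias for 'latin1' in Node.js.
  return Array.from(Buffer.from(text, 'binary'));
}

// U+2026 (…) is byte 0x85 in windows-1252, but 'binary' truncates it to 0x26 ('&').
console.log(lowBytes('\u2026')); // [ 38 ]
// U+201C (left double quotation mark) is byte 0x93 in windows-1252, but becomes 0x1c here.
console.log(lowBytes('\u201c')); // [ 28 ]
// Characters up to U+00FF keep their value, e.g. è (U+00E8).
console.log(lowBytes('\u00e8')); // [ 232 ]
```

    This would explain the stray & characters in the output above: each … in the garbled string ends up as byte 0x26.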

    request url

    https://filehelper.weixin.qq.com/cgi-bin/mmwebwx-bin/webwxsync?sid=9jOOuKz7HAavYjQY&skey=%40crypt_f9b8121a_46a00ef5be0a151552b6da4e4a72d526&pass_ticket=Pr31qvGLTC%252FORqib4BZsCAhorw3BuGqE1br%252FNUxuFFQSjJKCW5AJlqsth2T3UdkxflKlNmfUnuEQXEBe%252FE8mcQ%253D%253D

    response

    { "BaseResponse": { "Ret": 0, "ErrMsg": "" }, "AddMsgCount": 1, "AddMsgList": [ { "MsgId": "100930987469004064", "FromUserName": "@c4dcd4010dc50e5ee03f32ae786701de", "ToUserName": "filehelper", "MsgType": 1, "Content": "…è“èè’è”±¨…‘°#TODS2023§ ”<br/>#TodsTheItalianPortrait<br/><br/>‘Walter Chiapponi", "Status": 3, "ImgStatus": 1, "CreateTime": 1740397130, "VoiceLength": 0, "PlayLength": 0, "FileName": "", "FileSize": "", "MediaId": "", "Url": "", "AppMsgType": 0, "StatusNotifyCode": 0, "StatusNotifyUserName": "", "RecommendInfo": { "UserName": "", "NickName": "", "QQNum": 0, "Province": "", "City": "", "Content": "", "Signature": "", "Alias": "", "Scene": 0, "VerifyFlag": 0, "AttrStatus": 0, "Sex": 0, "Ticket": "", "OpCode": 0 }, "ForwardFlag": 0, "AppInfo": { "AppID": "", "Type": 0 }, "HasProductId": 0, "Ticket": "", "ImgHeight": 0, "ImgWidth": 0, "SubMsgType": 0, "NewMsgId": 100930987469004064, "OriContent": "", "EncryFileName": "" } ], "ModContactCount": 0, "ModContactList": [], "DelContactCount": 0, "DelContactList": [], "ModChatRoomMemberCount": 0, "ModChatRoomMemberList": [], "Profile": { "BitFlag": 0, "UserName": { "Buff": "" }, "NickName": { "Buff": "" }, "BindUin": 0, "BindEmail": { "Buff": "" }, "BindMobile": { "Buff": "" }, "Status": 0, "Sex": 0, "PersonalCard": 0, "Alias": "", "HeadImgUpdateFlag": 0, "HeadImgUrl": "", "Signature": "" }, "ContinueFlag": 0, "SyncKey": { "Count": 14, "List": [ { "Key": 1, "Val": 940546031 }, { "Key": 2, "Val": 897439235 }, { "Key": 3, "Val": 940546023 }, { "Key": 11, "Val": 940546048 }, { "Key": 19, "Val": 44482 }, { "Key": 23, "Val": 1740396794 }, { "Key": 24, "Val": 1740397130 }, { "Key": 25, "Val": 897439235 }, { "Key": 27, "Val": 308443 }, { "Key": 201, "Val": 1740397130 }, { "Key": 203, "Val": 1740396590 }, { "Key": 206, "Val": 101 }, { "Key": 1000, "Val": 1740395520 }, { "Key": 1001, "Val": 1740395522 } ] }, "SKey": "", "SyncCheckKey": { "Count": 14, "List": [ { "Key": 1, "Val": 940546031 }, { "Key": 2, "Val": 
897439235 }, { "Key": 3, "Val": 940546023 }, { "Key": 11, "Val": 940546048 }, { "Key": 19, "Val": 44482 }, { "Key": 23, "Val": 1740396794 }, { "Key": 24, "Val": 1740397130 }, { "Key": 25, "Val": 897439235 }, { "Key": 27, "Val": 308443 }, { "Key": 201, "Val": 1740397130 }, { "Key": 203, "Val": 1740396590 }, { "Key": 206, "Val": 101 }, { "Key": 1000, "Val": 1740395520 }, { "Key": 1001, "Val": 1740395522 } ] } } 
    Addendum 1    228 days ago

    windows-1252 test results

    > iconv.decode(iconv.encode('…è“èè’è”±¨…‘°#TODS2023§ ”#TodsTheItalianPortrait‘Walter Chiapponi', 'windows-1252'), 'utf-8')
    '畅轮廓精致细节和自由律动,糅呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait创总监:Walter Chiapponi'
    > iconv.decode(iconv.encode('流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait 创意总监:Walter Chiapponi', 'utf-8'), 'windows-1252')
    '…è“èè’è”±¨…‘°#TODS2023§ ”#TodsTheItalianPortrait ‘Walter Chiapponi'
    > iconv.decode(iconv.encode(iconv.decode(iconv.encode('流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait 创意总监:Walter Chiapponi', 'utf-8'), 'windows-1252'), 'windows-1252'), 'utf-8')
    '畅轮廓精致细节和自由律动,糅呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait 创总监:Walter Chiapponi'
    Addendum 2    228 days ago

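    Take the following fragment:
    “‰‰”±¤è…è§è°‰

    Encoding it as windows-1252 and then decoding it as utf-8 gives:
    iconv.decode(iconv.encode('“‰‰”±¤è…è§è°‰', 'windows-1252'), 'utf-8')
    当?微信版本?支?展示该内容,请?级至最新版本。

    A web search shows that the correct text is:
    当前微信版本不支持展示该内容,请升级至最新版本。 ("The current WeChat version does not support displaying this content; please upgrade to the latest version.")

    Some of the characters cannot be decoded.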


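    A closer look at 当 and 前 shows:

    当 => [0xe5, 0xbd, 0x93] => “
    前 => [0xe5, 0x89, 0x8d] => ‰ (0x8d has no character assigned in windows-1252, so the encoding step goes wrong here)

    In windows-1252, bytes 81, 8D, 8F, 90 and 9D are unused ( https://zh.wikipedia.org/wiki/Windows-1252 ).

    The raw network packets show that the string contains some invisible characters, for example \x8D and \x81.

    So to recover the correct bytes, after iconv.encode('', 'windows-1252') the values at the affected positions have to be replaced with 81, 8D, 8F, 90 or 9D.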
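
    The repair described above can be sketched without iconv-lite by building the windows-1252 table by hand and letting the five unassigned bytes pass through as the control characters with the same value (U+0081, U+008D, U+008F, U+0090, U+009D), which is how the WHATWG Encoding Standard treats them. This is a rough sketch rather than a tested fix for the real API response: it assumes the invisible characters in the string are exactly those control characters, and the names CP1252_HIGH, utf8ToMojibake and repairMojibake are made up for illustration.

```javascript
// windows-1252 code points for bytes 0x80..0x9F. The five unassigned bytes
// (81, 8D, 8F, 90, 9D) map to the control character with the same value.
// ASSUMPTION: the garbled string carries those bytes as U+0081, U+008D, etc.
const CP1252_HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Byte -> code point under windows-1252 (identity outside 0x80..0x9F).
function byteToCodePoint(b) {
  return b >= 0x80 && b <= 0x9f ? CP1252_HIGH[b - 0x80] : b;
}

// Reverse table: code point -> original byte.
const CODE_POINT_TO_BYTE = new Map();
for (let b = 0; b <= 0xff; b++) {
  CODE_POINT_TO_BYTE.set(byteToCodePoint(b), b);
}

// Simulates the damage: UTF-8 bytes read one at a time as windows-1252.
function utf8ToMojibake(text) {
  let out = '';
  for (const b of Buffer.from(text, 'utf8')) {
    out += String.fromCodePoint(byteToCodePoint(b));
  }
  return out;
}

// Undoes it: map every character back to its byte, then decode as UTF-8.
function repairMojibake(garbled) {
  const bytes = [];
  for (const ch of garbled) {
    const b = CODE_POINT_TO_BYTE.get(ch.codePointAt(0));
    if (b === undefined) {
      throw new Error(`not a windows-1252 character: U+${ch.codePointAt(0).toString(16)}`);
    }
    bytes.push(b);
  }
  return Buffer.from(bytes).toString('utf8');
}

const original = '当前微信版本不支持展示该内容,请升级至最新版本。';
console.log(repairMojibake(utf8ToMojibake(original)) === original); // true
```

    The round trip only works if the invisible characters are still present in the string. Once they have been stripped, or replaced by ?, the original bytes cannot be recovered.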

    4 replies    2025-02-25 02:19:48 +08:00
    ntedshen    #1    228 days ago
    现(utf8)=e78eb0=°(latin1)
    chenliang0571    #2    OP    228 days ago
    @ntedshen That doesn't seem right?
    > iconv.encode('现', 'utf-8')
    <Buffer e7 8e b0>

    > iconv.encode('°', 'latin1')
    <Buffer e7 3f b0>
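    ntedshen    #3    228 days ago
    @chenliang0571
    https://cs.stanford.edu/people/miles/iso8859.html
    3f is a question mark

    You don't really need to worry about this, though. All you need to know right now is that the encoding is wrong; there is no way the API would hand you a Latin character set and leave you to deal with the Chinese yourself...
    Check whether the content type is missing utf8.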
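    chenliang0571    #4    OP    228 days ago
    @ntedshen
    request:content-type:application/json;charset=
    response:content-type:text/plain
    ---

    I've figured out the cause: in windows-1252, bytes 81, 8D, 8F, 90 and 9D are unused ( https://zh.wikipedia.org/wiki/Windows-1252 ).

    So when the UTF-8 bytes of the Chinese text below are decoded as windows-1252, re-encoded as windows-1252 and then decoded as utf-8 again, some of the Chinese characters come out wrong.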

    iconv.decode(iconv.encode(iconv.decode(iconv.encode('流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023 秋冬 男士系列。#TodsTheItalianPortrait 创意总监:Walter Chiapponi', 'utf-8'), 'windows-1252'), 'windows-1252'), 'utf-8')

    畅轮廓精致细节和自由律动,糅呈现#TODS2023 秋冬 男士系列。#TodsTheItalianPortrait 创总监:Walter Chiapponi
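
    On the content-type point raised in reply 3: the response is text/plain with no charset, so whatever reads the body has to guess the encoding. A minimal sketch of decoding the raw bytes explicitly, assuming the server really sends UTF-8 (not confirmed in this thread) and that the body is available as bytes; decodeBodyAsUtf8 is a hypothetical helper, not part of any WeChat API:

```javascript
// Hypothetical helper: decode raw response bytes as UTF-8 explicitly instead of
// relying on a charset guessed from the Content-Type header.
function decodeBodyAsUtf8(bytes) {
  // fatal: true makes invalid UTF-8 throw instead of silently becoming U+FFFD.
  return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
}

// e7 8e b0 is the UTF-8 encoding of 现, as noted in reply 1.
const body = Uint8Array.from([0xe7, 0x8e, 0xb0]);
console.log(decodeBodyAsUtf8(body)); // 现
```

    With fetch, the bytes could come from new Uint8Array(await response.arrayBuffer()), which avoids any intermediate string decoding.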