
1 ushuz Dec 7, 2013 via iPhone Convert it to str with str() |
3 Hackathon Dec 7, 2013
Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> a = u'\xb2\xe2\xca\xd4'
>>> b = a.encode('raw_unicode_escape')
>>> print b
测试
>>> c = a.encode('latin1')
>>> print c
测试
>>> |
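The session above is Python 2 on a Windows console. A minimal Python 3 sketch of the same round trip follows. It assumes the original bytes were GBK, which the thread does not state but the printed output suggests. The variable name `recovered` is illustrative.

```python
# A text string whose code points are really byte values.
a = '\xb2\xe2\xca\xd4'

# Both codecs map every code point below 256 to the single byte of the same value.
b = a.encode('raw_unicode_escape')
c = a.encode('latin1')

# Assumption: the bytes are GBK, as a Chinese-locale Windows console would expect.
recovered = c.decode('gbk')
print(recovered)
```

In Python 2, `print b` writes the raw bytes and the console does the GBK decoding. Python 3 needs the explicit `decode` step.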
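6 lnehe Dec 7, 2013 I have never managed to understand Python's character encoding problems... |
7 F0ur Dec 7, 2013 Python's character encoding has always been something I need to study - - |
8 9hills Dec 7, 2013 This is not a character encoding problem <_< |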
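9 VYSE Dec 7, 2013 The '\xb2\xe2\xca\xd4' in the title is already encoded bytes. Prefixing it with u and then converting it with encode is actually rather odd, but the fact that latin1 can still encode it suggests Python made some trade-offs based on the OS environment; on a system whose default encoding is English it surely could not be converted. The one in the postscript is UTF-8, so print '\xe5\xbe\xae\xe4\xbf\xae'.decode('utf-8') is all you need. When \x appears inside u'' it no longer denotes a byte but is equivalent to \u00XX. For example, u'\xe5\xbe\xae\xe4\xbf\xae' is really u'\u00e5\u00be\u00ae\u00e4\u00bf\u00ae', which denotes code point XX in the Unicode character table rather than a byte, so the meaning changes completely. In any case, bytes showing up inside a unicode string is really odd. |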
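The difference between \x in a byte string and \x in a text string can be checked directly. This is a small Python 3 sketch (where `b'...'` plays the role of the Python 2 `'...'` and `'...'` the role of `u'...'`); the variable names are made up for illustration.

```python
# Six UTF-8 bytes: decoding them yields two characters.
utf8_bytes = b'\xe5\xbe\xae\xe4\xbf\xae'
decoded = utf8_bytes.decode('utf-8')

# In a text literal, \xNN names code point U+00NN, not a byte,
# so this string holds six Latin-range characters.
as_text = '\xe5\xbe\xae\xe4\xbf\xae'
same = as_text == '\u00e5\u00be\u00ae\u00e4\u00bf\u00ae'

print(len(decoded), len(as_text), same)
```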
10 shenGun Dec 11, 2013 http://docs.python.org/2/howto/unicode.html Latin-1, also known as ISO-8859-1, is a similar encoding. Unicode code points 0-255 are identical to the Latin-1 values, so converting to this encoding simply requires converting code points to byte values; if a code point larger than 255 is encountered, the string can’t be encoded into Latin-1. The documentation points out that Unicode code points 0-255 are the same as Latin-1 values 0-255. So u'\xb2\xe2\xca\xd4'.encode('Latin-1') gives '\xb2\xe2\xca\xd4', which seems to still be the original encoded bytes. |
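The quoted rule is easy to verify. A short Python 3 sketch, with illustrative variable names:

```python
# Every code point 0-255 encodes to the single byte with the same value.
identical = all(chr(i).encode('latin-1') == bytes([i]) for i in range(256))

# A code point above 255 has no Latin-1 byte, so encoding fails.
try:
    '\u6d4b'.encode('latin-1')
    failed = False
except UnicodeEncodeError:
    failed = True

print(identical, failed)
```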
11 borneo Dec 15, 2013 hey man. by the way, keep it compatible with Python 2+3. http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/ |
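One way to follow that advice is a small helper that behaves the same on both versions. This is only a sketch: `to_text`, `PY2` and `text_type` are hypothetical names, not something from the thread or the linked article.

```python
import sys

PY2 = sys.version_info[0] == 2
# `unicode` exists only on Python 2; the name is looked up only when PY2 is true.
text_type = unicode if PY2 else str  # noqa: F821


def to_text(value, encoding='utf-8'):
    """Return value as a text string, decoding byte strings with `encoding`."""
    if isinstance(value, text_type):
        return value
    return value.decode(encoding)


print(to_text(b'\xb2\xe2\xca\xd4', 'gbk'))
```

The caller still has to know which encoding the bytes are in; the helper only hides the difference between the two versions' string types.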
12 yingluck OP |
13 lzjun Aug 11, 2016 |