Help: a question about Python encoding


I extracted a string from a web page: x=u'\u7535\u8bdd\u89c6\u9891\u4f1a\u8bae\u64cd\u4f5c\u6d41\u7a0b'. The page is known to be encoded as gb2312. I want to see what x says in Chinese. How do I do that?


8 replies • 2017-03-07 10:55:05 +08:00

falseen

Mar 5, 2017

If you're on Python 2, just print(x). On Python 3 this problem doesn't come up.

maiganne

Mar 5, 2017

@falseen Thanks, print x does display it. But how do I make x itself a string that displays normally?

falseen

Mar 5, 2017

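In Python 2 this is simply how the repr of a unicode string looks, and you can't change that. It is a perfectly normal string; what you see is just the escaped form (\uXXXX code points, not UTF-8 bytes), and you can do any normal operation on it. If you absolutely must have it shown as Chinese characters everywhere, the only option is Python 3.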
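To illustrate the repr-versus-print distinction, here is a minimal sketch written for Python 3 (an assumption, since the thread is about Python 2). It uses ascii() as a stand-in for the escaped form that Python 2's repr() shows for a unicode object. The name `escaped` is only for illustration.

```python
# The string from the original question; the u'' prefix is also accepted by Python 3.3+.
x = u'\u7535\u8bdd\u89c6\u9891\u4f1a\u8bae\u64cd\u4f5c\u6d41\u7a0b'

# ascii() escapes every non-ASCII character, similar to what Python 2's
# repr() shows for a unicode object at the interactive prompt.
escaped = ascii(x)
print(escaped)

# print() writes the characters themselves (the terminal must be able to show Chinese).
print(x)

# The escaped form and the printed form describe the same 10 characters.
print(len(x))
```

Either way the object is the same; only its presentation differs.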

wolong

Mar 5, 2017

I couldn't be bothered to figure out Python 2's encoding issues, so I switched to 3.

dant

Mar 6, 2017

x is a sequence of Unicode code points (its type is unicode in Python 2 and str in Python 3).
You can convert it to a byte sequence with x.encode() (the result's type is str in Python 2 and bytes in Python 3).
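A rough sketch of that text-to-bytes conversion, assuming Python 3. The variable names `utf8_bytes` and `gbk_bytes` are made up for the example.

```python
x = u'\u7535\u8bdd\u89c6\u9891\u4f1a\u8bae\u64cd\u4f5c\u6d41\u7a0b'

# Encoding turns code points into bytes; the codec decides the byte layout.
utf8_bytes = x.encode('utf-8')  # 3 bytes per character for these code points
gbk_bytes = x.encode('gbk')     # 2 bytes per character for these code points

print(type(x).__name__, type(utf8_bytes).__name__)  # str bytes
print(len(x), len(utf8_bytes), len(gbk_bytes))      # 10 30 20

# Decoding with the same codec gives back the original text.
print(utf8_bytes.decode('utf-8') == x)  # True
print(gbk_bytes.decode('gbk') == x)     # True
```

The text itself has no encoding; an encoding only appears once it is turned into bytes.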

chez

Mar 6, 2017

x.encode('utf-8')

alex0721

Mar 6, 2017

Probably x.encode('gbk')....
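Where the page's gb2312 encoding comes in: it only matters when going between raw bytes and text. A hedged sketch for Python 3, where `raw` stands in for bytes as they might arrive from such a page (here simply produced by encoding x, not fetched from anywhere):

```python
x = u'\u7535\u8bdd\u89c6\u9891\u4f1a\u8bae\u64cd\u4f5c\u6d41\u7a0b'

# Simulated page bytes; a real page would deliver these over the network.
raw = x.encode('gb2312')

# For these characters, GBK produces the same bytes as GB2312.
print(raw == x.encode('gbk'))  # True

# Decoding with the page's codec recovers the text.
print(raw.decode('gb2312') == x)  # True

# Decoding with the wrong codec fails: these bytes are not valid UTF-8.
try:
    raw.decode('utf-8')
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
print(wrong_codec_failed)  # True
```

Once the bytes are decoded, x is plain text and the page's original encoding no longer plays a role.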

crazypig14

Mar 7, 2017

电话视频会议操作流程 ("telephone and video conference operating procedure").... it's utf8