[Python] convert Unicode String to int

Posted on 2020-02-09 (updated 2020-02-11)

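If you have a Unicode code point and want to know which character it produces, use chr (it takes the code point as an int, not UTF-8 encoded bytes):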
https://docs.python.org/3/library/functions.html#chr

For example:

>>> chr(24465)
'徑'
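
As a small sketch of how chr behaves at the edges: the argument can be written in decimal or hex, and values outside the Unicode range raise an exception. The helper name `safe_chr` is hypothetical and only used here for illustration.

```python
# Decimal and hex literals are the same int, so both give the same character.
assert chr(0x5F91) == chr(24465) == '徑'


def safe_chr(code_point):
    """Hypothetical helper: return the character, or None if out of range."""
    try:
        return chr(code_point)
    except (ValueError, OverflowError):
        # ValueError: negative values or anything above 0x10FFFF.
        # OverflowError: ints too large for chr to even inspect.
        return None


print(safe_chr(0x5F91))    # 徑
print(safe_chr(0x110000))  # None
```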

To find the code point that a character maps to, use ord:
https://docs.python.org/3/library/functions.html#ord

For example:

>>> ord('徑')
24465
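
Since ord only accepts a single character, converting a whole string to ints means applying it per character. A minimal sketch, assuming you want a list of code points and a way back (the variable names are illustrative):

```python
text = '徑耄'

# ord works on one character at a time, so map it over the string.
code_points = [ord(ch) for ch in text]
print(code_points)  # [24465, 32772]

# chr is the inverse, so joining the results restores the original string.
restored = ''.join(chr(n) for n in code_points)
print(restored == text)  # True
```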

Note that when you encrypt and decrypt text, you usually first encode the text to a binary representation with a character encoding. Unicode text can be encoded with different encodings, each with its own advantages and disadvantages. These days the most commonly used encoding for Unicode text is UTF-8, but others exist too.

In Python 3, binary data is represented in the bytes object, and you encode text to bytes with the str.encode() method and go back by using bytes.decode():

>>> 'Hello World!'.encode('utf8')
b'Hello World!'
>>> b'Hello World!'.decode('utf8')
'Hello World!'

bytes values are really just sequences, like lists and tuples and strings, but consisting of integer numbers from 0-255:

>>> list('Hello World!'.encode('utf8'))
[72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33]
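
For ASCII text the byte values happen to match the code points, which can hide the difference between the two. A short sketch with a non-ASCII character shows that the ord value and the UTF-8 bytes are different things:

```python
char = '徑'
utf8_bytes = char.encode('utf-8')

print(ord(char))                   # 24465 (one code point)
print(list(utf8_bytes))            # [229, 190, 145] (three UTF-8 bytes)
print(len(char), len(utf8_bytes))  # 1 3

# Decoding the bytes gives the character back.
assert utf8_bytes.decode('utf-8') == char
```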

Personally, when encrypting, I would encode the text first and then encrypt the resulting bytes.

If all this seems overwhelming or hard to follow, perhaps these articles on Unicode and character encodings can help out:

What every developer needs to know about Unicode
Ned Batchelder’s Pragmatic Unicode
Python’s Unicode HOWTO

After the basic conversion, you may also need to turn the int into hex. For example:

>>> ord('耄')
32772
>>> hex(ord('耄'))
'0x8004'
>>> str(hex(ord('耄')))  # hex() already returns a str, so str() is a no-op
'0x8004'
>>> str(hex(ord('耄')))[2:]
'8004'
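
Slicing off the '0x' prefix works, but the built-in format function can produce the bare hex digits directly, and int with base 16 goes the other way. A sketch of that round trip (variable names are illustrative):

```python
code_point = ord('耄')

# 'x' gives lowercase hex digits with no '0x' prefix, so no slicing is needed.
hex_digits = format(code_point, 'x')
print(hex_digits)  # 8004

# '04X' pads to at least four uppercase digits, as in the usual U+XXXX notation.
u_notation = f'U+{code_point:04X}'
print(u_notation)  # U+8004

# int(..., 16) parses the hex digits back into the code point.
round_trip = int(hex_digits, 16)
print(round_trip, chr(round_trip))  # 32772 耄
```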

You can also use printf-style formatting, where %x gives lowercase and %X gives uppercase hex digits:

"0x%x" % 255 # => 0xff

"0x%X" % 255 # => 0xFF

The Python documentation says: “keep this under your pillow”: http://docs.python.org/library/index.html

CJK Unified Ideographs
https://en.wikipedia.org/wiki/CJK_Unified_Ideographs

Max的程式語言筆記
