How to get the unicode of a character in Python? [duplicate]

Question

I want to get the Unicode code point of Chinese, Vietnamese Han-Nom, and Japanese characters. I tried this code:

text = "𬖰"

br = text.encode("unicode-escape")

print(br)

and got

b'\\U0002c5b0'

But how can I get something like U+2C5B0 or U2C5B0 instead?

FYI: "The Unicode of a character" doesn't really make much sense. Unicode is a whole big standard with many different parts to it. You want the Unicode code point. — deceze, Commented Jul 23 at 7:42
Yea ... in "Unicode of a character", "Unicode" could also mean the UTF-8 encoding, UTF-16 encoding and so on. And even "of a character" is ambiguous in many contexts. (Character in what character set?) I recommend you do some background reading on Unicode so that you know and understand the correct terminology ... and then >use< it. — Stephen C, Commented Aug 6 at 2:27

blhsing · Accepted Answer · 2024-08-06 01:33:09Z

2

You can use the ord function to get the character's numeric code point, then format it with the 04X specifier in an f-string. This displays the code point as uppercase hexadecimal digits, zero-padded to at least 4 digits (longer code points such as 2C5B0 are not truncated):

print(f'U+{ord(text):04X}')
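Note that ord accepts only a single character and raises TypeError for longer strings. As a minimal sketch for text with several characters, the same format can be applied to each character in turn. The helper name code_points and its prefix parameter are hypothetical, introduced here only for illustration:

```python
def code_points(s, prefix="U+"):
    """Return the formatted code point of each character in s.

    Hypothetical helper, not part of the standard library.
    """
    # Iterating over a str yields one code point at a time, so ord()
    # always receives a single character, even outside the BMP.
    return [f"{prefix}{ord(ch):04X}" for ch in s]


text = "𬖰"
print(code_points(text))       # ['U+2C5B0']
print(code_points(text, "U"))  # ['U2C5B0']
print(code_points("A"))        # ['U+0041']
```

Passing "U" as the prefix gives the U2C5B0 form mentioned in the question.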


edited Aug 6 at 1:33

answered Jul 23 at 7:31

blhsing


1

print(f'U+{ord(text):04X}') may be a better version, conforming to conventions used to display codepoints.
– Andj
Commented Aug 6 at 1:17
