Newest 'cjk' Questions

1 vote

0 answers

50 views

Why can't CLion correctly read Chinese strings from a TXT file?

As a Chinese university student in a computer-related major, I am working on a major assignment set by my teacher: creating a restaurant management system. However, when I finished my code, ...

BreakingBad6

11

asked Dec 6 at 6:16

0 votes

1 answer

73 views

Unicode string not being read correctly [closed]

I'm reading an HTML file using Java and am having some trouble with a Unicode character. The problematic statement is: <span class="xml-lang" lang="cmn-Hant" xml:lang="cmn-...

Sander Smith

1,437

asked Nov 27 at 18:01

0 votes

1 answer

51 views

Using PD4ML not able to render few Chinese characters in PDF

I am trying to generate a PDF using PD4ML, but when it is rendered a few Chinese characters appear as ? in the output file. Below is the code snippet I use to generate the PDF, for reference. try {...

Sachin

1

asked Oct 24 at 13:15

0 votes

0 answers

97 views

How to extract clean Japanese text from a folder of PDFs in Python

This is my code import os import PyPDF2 # set the directory where the PDF files are located pdf_directory = '/Users/humnerohit/Desktop/test_pdf_files' # loop through each file in the directory for ...

rohit

1

asked Oct 18 at 12:27

1 vote

0 answers

29 views

Geocoding - Japanese addresses started to time out without URL encoding from 12th Oct 2024

Here is an example of a call from Golang without URL encoding. https://maps.googleapis.com/maps/api/geocode/json?address=大田区北千束３丁目&key=xxxx (please replace xxxx with your key). The above was returning results until 11th ...

Neeraj Kumar

17

asked Oct 18 at 1:37

0 votes

1 answer

44 views

Which font/setting to correctly display U+2E976 in Firefox and everywhere in Linux [closed]

I'm trying to figure out which font I have to install to be able to see the characters on this page: https://en.wiktionary.org/wiki/%F0%AE%A5%B6#Chinese 𮥶 which is U+2E976 I'm on Linux Mint. I've ...

ReaderGuy42

37

asked Sep 23 at 22:26

0 votes

0 answers

29 views

Download pdf in Japanese shows garbled text on production for rails application using Wickedpdf gem, working perfectly fine locally

The issue is shown in the screenshot. I have tried multiple things: I downloaded the Japanese fonts onto the production server. I added this to the PDF layout: <!DOCTYPE html>...

Safi Ullah

1

asked Sep 5 at 18:09

1 vote

1 answer

88 views

Making a user-friendly input for subdividing a square into coordinates

I am working on a program (in Python) that involves cutting a square(s) into smaller pieces. The user has to enter a 'code', which the program will automatically convert into the coordinates for each ...

Leo

35

asked Sep 1 at 18:23

0 votes

0 answers

176 views

How to convert a column of Japanese words into a column of Hiragana words in Excel

I have a large column of Japanese words (some are entirely in hiragana, katakana or kanji, some are a mix of kanji and hiragana, and some are a mix of kanji and katakana), and I would like to convert ...

kanachan

101

asked Aug 30 at 11:16

3 votes

1 answer

76 views

Regex for Parsing Japanese Parliamentary Speeches in Python

I'm a beginner in Python and am working on a project to preprocess Japanese text data for argument mining. I need to extract metadata (e.g., parliamentary session, date, speaker) and the speech ...

Ana17

31

asked Aug 13 at 10:32

1 vote

0 answers

49 views

Gluon JavaFX TextField cannot input Chinese/Japanese/Korean characters on iOS

I am developing an iOS application using Gluon Mobile but have encountered a problem. When a TextField gains focus, the keyboard pops up, but this keyboard is different from the one configured in the ...

54562635qqcom

11

asked Aug 8 at 16:33

0 votes

1 answer

54 views

How to get the unicode of a character in Python? [duplicate]

I want to get the Unicode code point of Chinese, Vietnamese Han-Nom, and Japanese characters. I've tried this code: text = "𬖰"; br = text.encode("unicode-escape"); print(br); and got b'\\...

Charlie Fan

9

asked Jul 23 at 7:30
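
For the question above, one common way to read a character's code point in Python is the built-in `ord`. The sketch below is not taken from the question or its answers; the helper name `code_point_label` is made up for illustration.

```python
def code_point_label(ch: str) -> str:
    """Return the 'U+XXXX' label for a single character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # ord() gives the integer code point; format it as at least 4 hex digits.
    return f"U+{ord(ch):04X}"


print(code_point_label("中"))  # U+4E2D
```

Characters outside the Basic Multilingual Plane are still a single character to Python, so the same call should work for them and simply yields five hex digits.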

0 votes

1 answer

47 views

Decoding multibyte non-Unicode characters through codecvt fails

I have experimented with std::codecvt on MSVC and encountered an issue with multibyte character encodings ‒ it cannot convert back from valid multibyte sequences, even when those can be produced when ...

IS4

13.1k

asked Jun 30 at 1:03

0 votes

0 answers

26 views

Clear IME input onfocus in JavaScript

Currently I'm having problems with IME. I am designing a chat and messaging application with Vue3. I use a div with :editcontenable="true" as input. Everything was ok but there was a problem ...

thanh dang

1

asked Jun 17 at 9:01

1 vote

1 answer

62 views

MeCab doesn't seem to return the correct response

I just installed MeCab and tried to run it in Node. I took the example from https://github.com/hecomi/node-mecab-async#readme to check whether I had installed it properly, but I guess something went ...

moremondocane

11

asked May 29 at 23:10

0 votes

0 answers

21 views

Using Limelight with WordPress

I want to install Limelight on WordPress to add some Japanese language features such as a kana converter, a Japanese to romaji converter, etc. However, GitHub only describes how to install it, and does ...

Mr. Tom Bahadur

1

asked May 21 at 22:49

0 votes

0 answers

29 views

Image-to-Hanzi library

I'm trying to create an 'ascii' (unicode, but you know) image renderer with higher fidelity for terminals. You can see most current implementations treat each textual character as a pixel, which ...

Hashbrown

13k

asked May 18 at 8:27

1 vote

0 answers

279 views

Difference of special token handling of the BertTokenizer's batch_decode() and decode() method?

For BertTokenizer, I am trying to decode sentences produced after tokenization. Here is my code: from transformers import BertTokenizer ref = '這件衣服皺巴巴的，幫我燙一下吧。' our = '衣服皺了，幫我燙一燙' tokenizer = ...

Raptor

54.1k

asked May 16 at 12:09

1 vote

1 answer

72 views

Errors with the node module pinyin

I have a node.js script in which I want to Romanize Chinese characters. There is a node module pinyin. However, when I try it, I get the error: throw err; ^ Error: Cannot find module '/...

user3250335

477

asked May 10 at 21:05

0 votes

0 answers

98 views

Some kanji characters cannot be displayed when using TCPDF

I am using TCPDF latest version (6.7.5) to export PDF documents. In the document there are some Japanese kanji characters that cannot be displayed. I have tried using fonts like cid0jp, cid0cs, cid0cs,...

thinh1995

1

asked May 9 at 14:25

0 votes

1 answer

67 views

terminal stdin moveCursor misalignment with wide characters

I need to programmatically move terminal's caret position with NodeJS. I am using process.stdout.moveCursor(x, y) which works fine on normal ASCII characters with 1 character width. However, this ...

SyndRain

3,695

asked Apr 20 at 1:05
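
The misalignment described above usually comes from counting characters rather than terminal columns. The question is about Node.js; the following is only a Python sketch of the general idea, and `display_width` is a hypothetical helper. It assumes the terminal draws East Asian wide and fullwidth characters in two columns, which is common but not guaranteed.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate how many terminal columns `text` occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks add no column of their own
        # 'W' (wide) and 'F' (fullwidth) characters usually take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


print(display_width("abc"), display_width("日本語"))
```

A cursor offset computed from this width, rather than from the string length, is more likely to land on the intended cell.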

1 vote

2 answers

83 views

Convert fullwidth numbers into normal numbers in Python

I have data in an Excel file (only 1 column) containing several Japanese characters followed by fullwidth numbers. I want to convert these numbers into normal numbers. いつもありがとう８９０ございます ...

monnomm

13

asked Apr 17 at 16:35
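
One way to approach the conversion above is a translation table that touches only the ten fullwidth digits. This is a minimal sketch, not the accepted answer; the names `FULLWIDTH_DIGITS` and `to_ascii_digits` are made up here.

```python
# Map the ten fullwidth digits (U+FF10..U+FF19) to ASCII '0'..'9'.
FULLWIDTH_DIGITS = {0xFF10 + i: ord("0") + i for i in range(10)}


def to_ascii_digits(text: str) -> str:
    """Replace fullwidth digits and leave every other character unchanged."""
    return text.translate(FULLWIDTH_DIGITS)


print(to_ascii_digits("いつもありがとう８９０ございます"))
```

`unicodedata.normalize("NFKC", text)` is a broader alternative, but it also rewrites other compatibility characters (for example half-width katakana), so the narrow table may be the safer choice when only digits should change.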

1 vote

0 answers

59 views

Comparing Kangxi Radicals in SQL Server

I have an SQL Server database table containing Japanese Kanji and Radicals. Running a SELECT on this table for a Kanji returns a single row as I expect. However if I run the same Query for a Kangxi ...

Ger

11

asked Apr 15 at 4:42

2 votes

1 answer

108 views

regex for the pattern of one optional space before Chinese words in lua

I tried using string.match("Í",'%s?[\u{4e00}-\u{9FFF}]+'), which is similar to how this is done in JS and other languages. But it matches an unwanted character such as the 'Í' above. The official ...

An5Drama

527

asked Apr 11 at 9:40

0 votes

0 answers

13 views

javadoc generated document annotation content Chinese garbled code

The Chinese annotation content in the javadoc-generated documentation is garbled. The part in the red frame is garbled, and the rest is normal. Editor: IDEA, JDK 1.8. javadoc -d doc -encoding UTF-8 -charset UTF-8 -docencoding UTF-...

Jambo

1

asked Apr 9 at 10:05

1 vote

0 answers

82 views

Tiptap Editor: applying marks cancels the composition of Korean characters when the mark is created for the first time

When I apply marks in Tiptap Editor, it cancels the composition of Korean characters when the mark is created for the first time. This video demonstrates the situation. First line, which is without a mark,...

woohyun.park

11

asked Apr 9 at 8:50

0 votes

3 answers

356 views

How to use pandas read_fwf with Japanese characters in data

I am trying to read data from text files that contain Japanese city names. Each line contains 32 bytes, with the name column being 22 bytes. When I try to use pandas.read_fwf(), the results are wrong ...

blue

9

asked Apr 5 at 23:28
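
If the widths in the question above are counted in bytes while the reader counts decoded characters, one workaround is to slice each line as bytes first and decode afterwards. The sketch below rests on several assumptions not stated in the excerpt: the file is encoded as cp932, the 22-byte name is the first field, and the remaining 10 bytes form one field. `split_record` is a hypothetical helper.

```python
def split_record(line: bytes, name_width: int = 22, encoding: str = "cp932"):
    """Split one fixed-width record by byte offsets, then decode each field."""
    name = line[:name_width].decode(encoding).rstrip()
    rest = line[name_width:].decode(encoding).strip()
    return name, rest


# Hypothetical 32-byte record: a 22-byte name field followed by 10 more bytes.
record = "東京".encode("cp932").ljust(22) + b"0000012345"
print(split_record(record))
```

The resulting tuples could then be passed to `pandas.DataFrame` once every line has been split.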

0 votes

0 answers

140 views

Japanese (Mozc) input not working in browser created by Selenium

I am currently learning Japanese and web scraping, so I thought this would be a perfect match. But somehow when I type inside the new Firefox browser, which opened after running the following code: from ...

SaibotiX

1

asked Mar 19 at 10:29

0 votes

1 answer

89 views

Problems in displaying polyphonic Chinese characters

I attempted to display polyphonic Chinese characters in Dart using the BpmfIansui-Regular.ttf font, which is designed for traditional Chinese characters with BoPoMoFo annotations. For instance, the ...

Shinjou Fang

19

asked Mar 19 at 1:54

1 vote

2 answers

265 views

HL7 encoding characters in non-ASCII strings

I have a question about how to handle HL7v2 encoding characters that appear when using a non-standard (non-7-bit ASCII) character set. As an example, this is a part of an HL7v2 message: MSH|^~\&|appl|...

Krister Valtonen

101

asked Mar 14 at 10:01

1 vote

1 answer

88 views

Closing HTML tags insert "phantom" space in Chinese text

(Note: I can't provide an image or a sample file; uploads and imgur are blocked from this location.) Images added. I recently received a Word document translation into Chinese (traditional) of one of ...

Jeff Zeitlin

10.7k

asked Mar 5 at 20:26

1 vote

1 answer

83 views

Halfwidth vs Fullwidth Forms in JIS X 208

I am trying to make sense of the following explanation on the Wikipedia page: ASCII and JISCII punctuation (shown here with a yellow background) may use alternative mappings to the Halfwidth and ...

malat

12.4k

asked Mar 1 at 8:02

2 votes

0 answers

99 views

JavaFX 21 TextField and Japanese input

I have a Java 21 desktop application with JavaFX and I want to support Japanese text input. No problem I guess with a Japanese keyboard, but I have problems (on Mac OS X) with keyboard emulation. I ...

Riccardo Mazzei

43

asked Feb 26 at 7:35

1 vote

0 answers

114 views

Netbeans: UTF-8 characters Chinese Japanese (CJK) text do not print in Netbeans 20 (Java MAVEN) Output window

This is my Code: package hello.learningjava; public class NewClassTEST { public static String aMadeUpString() { return ("Holy moly cow, 你好よくできました！"); } } Image of ...

777

11

asked Feb 11 at 9:56

3 votes

1 answer

460 views

Issue with Converting Full-Width Japanese Numeric Characters and Dots in JavaScript

I'm facing an issue with converting full-width Japanese numeric characters and dots to their half-width in a JavaScript input handling function. The goal is to allow users to input numeric values with ...

siya

55

asked Jan 30 at 10:30

3 votes

0 answers

71 views

How to make Excel sort by a hierarchy? A to Z sort for Japanese isn't working as expected, Custom Sort is too narrow

I am trying to sort names written in Japanese (mainly Katakana) in Excel. However the A to Z sort is not following the pattern that I am expecting: the sort appears to put dakuten and handakuten ...

1caiser

33

asked Jan 27 at 2:32

-1 votes

1 answer

53 views

Dir.bat equivalent that recognises Chinese characters

When I use either my dir.bat file or the below command, it works but if there are Chinese characters, it'll replace them with "?". cmd /c dir /b > "%temp%\Dir.txt" & ...

Deuteronomy93

1

asked Jan 20 at 3:16

1 vote

1 answer

392 views

Zero-width space and <wbr> have no effect on Japanese line-breaking

As I understand it from the documentation, <wbr> and the Unicode zero-width space character, &ZeroWidthSpace; or &#8203;, are functionally equivalent, and are supposed to suggest line-...

ddbrierton

13

asked Jan 18 at 16:23

0 votes

0 answers

59 views

mysqldump and restore with traditional Chinese characters

I am trying to take a MySQL dump from Docker with the command: docker exec testsigma_mysql /usr/bin/mysqldump -u root --password=root --default-character-set=utf8mb4 testsigma_opensource > backup.sql ...

weiling shao

11

asked Jan 11 at 9:14

1 vote

0 answers

35 views

How does the js Selection.modify function differentiate between words in Chinese?

The function itself somehow understands when it is necessary to highlight 1 character, when 2, and when all 3. I wonder how? Obviously this is not done using regular expressions. There are no ...

a4356

45

asked Jan 9 at 21:14

0 votes

1 answer

46 views

mysql and imported data in PDF::API2 .pdf file garbled

Using PDF::API2 I wrote a perl script to create an invoice PDF file with Japanese text. The file encoding is utf8 and I have "use utf8" in the header. If I declare the variables inside the ...

moring

11

asked Jan 8 at 6:34

1 vote

0 answers

49 views

Mismatched Character Highlighting in Chrome Browser Search

During a search on a Japanese text page in Chrome, I experienced an unexpected result. I used the ⌘+f function to search for the ﾑ (Unicode 0xFF91) character. Chrome, however, highlighted ム (Unicode ...

Bhuvan

4,167

asked Dec 26, 2023 at 8:38

0 votes

0 answers

26 views

Stop Android Studio from thinking Korean characters are typos

I love keeping the "Problems" section of my IDE clean of errors, since besides Compile errors, Warnings and Hints often catch things for me. However I'm working on a multilingual project, ...

Dr-Bracket

5,405

asked Dec 26, 2023 at 6:58

0 votes

1 answer

64 views

English to Japanese Translation feature in the app

I am currently working on an English to Japanese translation feature in a Kotlin app in Android Studio. My problem is that even though some of the translation is working, there is translated text that is ...

Rey Mark Enriquez

1

asked Dec 19, 2023 at 5:04

0 votes

0 answers

63 views

Printing Non-English Characters Appears Broken in Node.js Serial Port Printer

I'm encountering an issue with a Node.js application that communicates with a serial port printer. When attempting to print non-English characters (such as accented letters or characters from ...

Apurba

1

asked Dec 16, 2023 at 5:33

0 votes

0 answers

61 views

C++: Inputting & Outputting UTF-8 on Windows?

I'm not familiar with Windows at all. I'm struggling to write a function which reads from a file containing Chinese characters & does some regex. Roughly: std::ifstream t(input_file); std::...

user22683446

asked Dec 3, 2023 at 7:09

0 votes

0 answers

145 views

How to show `\` instead of special character `₩` in path separator for Windows Forms text-box?

My Windows Forms app displays a file path in a text-box. For our customers in Korea, this is shown as C:₩Users₩abc123₩my₩path instead of C:\Users\abc123\my\path. However, their command prompts, ...

Ed Graham

4,675

asked Nov 23, 2023 at 12:24

2 votes

0 answers

43 views

In MATLAB, how can I display both Chinese characters and LaTeX symbols in a label? [duplicate]

How can I display both Chinese characters and LaTeX symbols in a label? If I use the LaTeX interpreter, I get an error: close clear x = 0:0.01:2*pi; y = sin(x); figure; plot(x, y); xlabel( '$\text{...

Chin Ching CHAN

21

asked Nov 7, 2023 at 3:14

1 vote

1 answer

124 views

Problem with size of custom MFC control when Language is set to Japanese

I look after the DeepSkyStacker application. If the system language is set to English, then DeepSkyStacker sets its own language to English, and its "Processing" panel (on the right) is ...

David Partridge

315

asked Nov 3, 2023 at 16:05

0 votes

1 answer

62 views

Javascript indexOf/charAt not working for Japanese half-width Katakana

In my codebase I have this code, surprisingly it returns 1: 'ﾄｹﾞ'.indexOf('ｹ') // Returns 1 The character ｹ doesn't seem to appear in string ﾄｹﾞ. I also tried to run this code: 'ﾄｹﾞ'.charAt(1) // ...

code đờ

615

asked Nov 2, 2023 at 4:51
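
A likely explanation for the result above is that the voiced half-width syllable is stored as two code points: the base letter followed by a separate half-width voiced sound mark. The question is about JavaScript; this Python sketch only illustrates the same code point layout, and the escaped string is an assumed reconstruction of the one in the excerpt.

```python
import unicodedata

# Assumed reconstruction of the string from the question:
# HALFWIDTH KATAKANA LETTER TO, LETTER KE, then the VOICED SOUND MARK.
text = "\uFF84\uFF79\uFF9E"

print(len(text))             # 3 code points
print(text.find("\uFF79"))   # 1: KE is stored on its own, before the mark

# NFKC folds KE + voiced mark into the single fullwidth letter GE (U+30B2).
normalized = unicodedata.normalize("NFKC", text)
print(normalized == "\u30C8\u30B2")
```

Normalizing both the haystack and the needle before searching is one way to stop an unvoiced letter from matching inside its voiced form.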
