1,141 questions
1
vote
0
answers
50
views
Why can't CLion correctly read Chinese strings from a TXT file?
As a Chinese university student studying computer-related majors, I am working on a major assignment assigned by my teacher to create a restaurant management system. However, when I finished my code, ...
0
votes
1
answer
73
views
Unicode string not being read correctly [closed]
I'm reading an HTML file using Java and am having some trouble with a Unicode character. The problematic statement is:
<span class="xml-lang" lang="cmn-Hant" xml:lang="cmn-...
0
votes
1
answer
51
views
Using PD4ML, not able to render a few Chinese characters in PDF
I am trying to generate a PDF using PD4ML, but when it renders, a few Chinese characters appear as ? in the output file.
Below is the code snippet I use to generate the PDF, for reference.
try {...
0
votes
0
answers
97
views
How to extract clean Japanese text from the PDF folder in Python
This is my code
import os
import PyPDF2
# set the directory where the PDF files are located
pdf_directory = '/Users/humnerohit/Desktop/test_pdf_files'
# loop through each file in the directory
for ...
1
vote
0
answers
29
views
Geocoding - Japanese addresses started to time out without URL encoding from 12th Oct 2024
Here is an example of a call from Golang without URL encoding.
https://maps.googleapis.com/maps/api/geocode/json?address=大田区北千束3丁目&key=xxxx(please change to key)
The above was returning results until 11th ...
0
votes
1
answer
44
views
Which font/setting to correctly display U+2E976 in Firefox and everywhere in Linux [closed]
I'm trying to figure out which font I have to install to be able to see the characters on this page:
https://en.wiktionary.org/wiki/%F0%AE%A5%B6#Chinese
𮥶 which is U+2E976
I'm on Linux Mint. I've ...
0
votes
0
answers
29
views
Download pdf in Japanese shows garbled text on production for rails application using Wickedpdf gem, working perfectly fine locally
Here is the issue in the screenshot
I have tried multiple things:
I have tried to download the Japanese fonts locally on the production server
Added this in the pdf layout:
<!DOCTYPE html>...
1
vote
1
answer
88
views
Making a user-friendly input for subdividing a square into coordinates
I am working on a program (in Python) that involves cutting a square(s) into smaller pieces.
The user has to enter a 'code', which the program will automatically convert into the coordinates for each ...
0
votes
0
answers
176
views
How to convert a column of Japanese words in a column of Hiragana words in Excel
I have a large column of Japanese words (some are entirely in hiragana, katakana or kanji, some are a mix of kanji and hiragana, and some are a mix of kanji and katakana), and I would like to convert ...
3
votes
1
answer
76
views
Regex for Parsing Japanese Parliamentary Speeches in Python
I'm a beginner in Python and am working on a project to preprocess Japanese text data for argument mining. I need to extract metadata (e.g., parliamentary session, date, speaker) and the speech ...
1
vote
0
answers
49
views
Gluon Javafx textfield cannot input Chinese/Japanese/Korean characters on IOS
I am developing an iOS application using Gluon Mobile but have encountered a problem. When a TextField gains focus, the keyboard pops up, but this keyboard is different from the one configured in the ...
0
votes
1
answer
54
views
How to get the unicode of a character in Python? [duplicate]
I want to get the Unicode code points of Chinese, Vietnamese Hán-Nôm, and Japanese characters.
I've tried this code:
text = "𬖰";
br = text.encode("unicode-escape");
print(br);
and got
b'\\...
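For reference, a minimal sketch of one way to get code points in Python, assuming the goal is the `U+XXXX` form. The helper name `code_points` is hypothetical; `ord` is the built-in that returns a character's code point. The sample character U+2E976 is written as an escape so the file encoding does not matter.

```python
def code_points(text: str) -> list[str]:
    """Return each character's code point as a 'U+XXXX' string."""
    # ord() gives the integer code point; 04X pads to at least four hex digits.
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points("中A"))         # ['U+4E2D', 'U+0041']
print(code_points("\U0002E976"))  # ['U+2E976']
```

Characters outside the Basic Multilingual Plane are a single character in a Python 3 `str`, so no surrogate handling is needed here.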
0
votes
1
answer
47
views
Decoding multibyte non-Unicode characters through codecvt fails
I have experimented with std::codecvt on MSVC and encountered an issue with multibyte character encodings ‒ it cannot convert back from valid multibyte sequences, even when those can be produced when ...
0
votes
0
answers
26
views
clear IME input onfocus in Javascript
Currently I'm having problems with IME.
I am designing a chat and messaging application with Vue3. I use a div with :editcontenable="true" as input. Everything was ok but there was a problem ...
1
vote
1
answer
62
views
MeCab doesn't seem to return the correct response
So I just installed MeCab and tried to run it in Node.
I took the example from this URL https://github.com/hecomi/node-mecab-async#readme to check whether I installed it properly, but I guess something went ...
0
votes
0
answers
21
views
Using Limelight with WordPress
I want to install Limelight on WordPress to add some Japanese language features such as a kana converter, a Japanese-to-romaji converter, etc. However, GitHub only describes how to install it, and does ...
0
votes
0
answers
29
views
Image-to-Hanzi library
I'm trying to create an 'ascii' (unicode, but you know) image renderer with higher fidelity for terminals.
You can see most current implementations treat each textual character as a pixel, which ...
1
vote
0
answers
279
views
Difference of special token handling of the BertTokenizer's batch_decode() and decode() method?
For BertTokenizer, I am trying to decode sentences produced after tokenization. Here is my code:
from transformers import BertTokenizer
ref = '這件衣服皺巴巴的,幫我燙一下吧。'
our = '衣服皺了,幫我燙一燙'
tokenizer = ...
1
vote
1
answer
72
views
Errors with the node module pinyin
I have a node.js script in which I want to Romanize Chinese characters. There is a node module pinyin. However, when I try it, I get the error:
throw err;
^
Error: Cannot find module '/...
0
votes
0
answers
98
views
Some kanji characters cannot be displayed when using TCPDF
I am using TCPDF latest version (6.7.5) to export PDF documents. In the document there are some Japanese kanji characters that cannot be displayed.
I have tried using fonts like cid0jp, cid0cs, cid0cs,...
0
votes
1
answer
67
views
terminal stdin moveCursor misalignment with wide characters
I need to programmatically move terminal's caret position with NodeJS. I am using process.stdout.moveCursor(x, y) which works fine on normal ASCII characters with 1 character width.
However, this ...
1
vote
2
answers
83
views
Convert Full width numbers into Normal numbers in python
I have data in an Excel file (only 1 column) where there are several Japanese characters followed by fullwidth numbers. I want to convert these numbers into normal numbers.
いつもありがとう８９０ございます
...
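A minimal sketch of one possible approach, assuming only the fullwidth digits U+FF10 to U+FF19 should change and all other text should stay as it is. The names `FULLWIDTH_DIGITS` and `to_ascii_digits` are hypothetical; `str.translate` is the standard string method.

```python
# Map FULLWIDTH DIGIT ZERO..NINE (U+FF10..U+FF19) to ASCII '0'..'9'.
FULLWIDTH_DIGITS = {0xFF10 + i: ord("0") + i for i in range(10)}


def to_ascii_digits(text: str) -> str:
    """Replace fullwidth digits with ASCII digits; leave other characters alone."""
    return text.translate(FULLWIDTH_DIGITS)


print(to_ascii_digits("いつもありがとう\uff18\uff19\uff10ございます"))
```

`unicodedata.normalize("NFKC", text)` also converts fullwidth digits, but it rewrites other compatibility characters too (for example half-width katakana), so the explicit table is the narrower choice.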
1
vote
0
answers
59
views
Comparing Kangxi Radicals in SQL Server
I have an SQL Server database table containing Japanese Kanji and Radicals. Running a SELECT on this table for a Kanji returns a single row as I expect. However if I run the same Query for a Kangxi ...
2
votes
1
answer
108
views
regex for the pattern of one optional space before Chinese words in lua
I tried using string.match("Í",'%s?[\u{4e00}-\u{9FFF}]+'), which is similar to how we work in JS or other languages. But it will match one unnecessary character like the above 'Í'.
The official ...
0
votes
0
answers
13
views
Chinese comment text is garbled in javadoc-generated documentation
Chinese comment text in the documentation generated by javadoc is garbled.
The part in the red frame is garbled, and the rest is normal.
Editor: IDEA, JDK 1.8.
javadoc -d doc -encoding UTF-8 -charset UTF-8 -docencoding UTF-...
1
vote
0
answers
82
views
Tiptap Editor applying marks cancels composition of Korean characters when the mark is created for the first time
When I apply a mark in the Tiptap editor, it cancels the composition of Korean characters the first time the mark is created.
This video demonstrates the situation. The first line, which has no mark,...
0
votes
3
answers
356
views
How to use pandas read_fwf with japanese characters in data
I am trying to read data from text files that contain Japanese city names. Each line contains 32 bytes, with the name column being 22 bytes.
When I try to use pandas.read_fwf(), the results are wrong ...
0
votes
0
answers
140
views
Japanese (Mozc) input not working in browser created by Selenium
I am currently learning Japanese and web scraping, so I thought the two would be a perfect match. But somehow when I type inside the new Firefox browser, which opened after running the following code:
from ...
0
votes
1
answer
89
views
Problems in displaying polyphonic Chinese characters
I attempted to display polyphonic Chinese characters in Dart using the BpmfIansui-Regular.ttf font, which is designed for traditional Chinese characters with BoPoMoFo annotations. For instance, the ...
1
vote
2
answers
265
views
HL7 encoding characters in non-ASCII strings
I have a question about how to handle HL7v2 encoding characters that appear when using a non-standard (non-7-bit ASCII) character set. As an example, this is part of an HL7v2 message:
MSH|^~\&|appl|...
1
vote
1
answer
88
views
Closing HTML tags insert "phantom" space in Chinese text
(Note: I can't provide an image or a sample file; uploads and imgur are blocked from this location.) Images added.
I recently received a Word document translation into Chinese (traditional) of one of ...
1
vote
1
answer
83
views
Halfwidth vs Fullwidth Forms in JIS X 208
I am trying to make sense of the following explanation on the Wikipedia page:
ASCII and JISCII punctuation (shown here with a yellow background) may
use alternative mappings to the Halfwidth and ...
2
votes
0
answers
99
views
JavaFX 21 TextField and Japanese input
I have a Java 21 desktop application with JavaFX and I want to support Japanese text input. No problem I guess with a Japanese keyboard, but I have problems (on Mac OS X) with keyboard emulation. I ...
1
vote
0
answers
114
views
Netbeans: UTF-8 characters Chinese Japanese (CJK) text do not print in Netbeans 20 (Java MAVEN) Output window
This is my Code:
package hello.learningjava;
public class NewClassTEST {
public static String aMadeUpString() {
return ("Holy moly cow, 你好 よくできました!");
}
}
Image of ...
3
votes
1
answer
460
views
Issue with Converting Full-Width Japanese Numeric Characters and Dots in JavaScript
I'm facing an issue with converting full-width Japanese numeric characters and dots to their half-width in a JavaScript input handling function. The goal is to allow users to input numeric values with ...
3
votes
0
answers
71
views
How to make Excel sort by a hierarchy? A to Z sort for Japanese isn't working as expected, Custom Sort is too narrow
I am trying to sort names written in Japanese (mainly Katakana) in Excel. However the A to Z sort is not following the pattern that I am expecting: the sort appears to put dakuten and handakuten ...
-1
votes
1
answer
53
views
Dir.bat equivalent that recognises Chinese characters
When I use either my dir.bat file or the below command, it works but if there are Chinese characters, it'll replace them with "?".
cmd /c dir /b > "%temp%\Dir.txt" & ...
1
vote
1
answer
392
views
Zero-width space and <wbr> have no effect on Japanese line-breaking
As I understand it from the documentation, <wbr> and the Unicode zero-width space character (U+200B) are functionally equivalent, and are supposed to suggest line-...
0
votes
0
answers
59
views
mysqldump and restore with traditional Chinese characters
I am trying to take a MySQL dump from Docker with the command:
docker exec testsigma_mysql /usr/bin/mysqldump -u root --password=root --default-character-set=utf8mb4 testsigma_opensource > backup.sql
...
1
vote
0
answers
35
views
How does the js Selection.modify function differentiate between words in Chinese?
The function itself somehow understands when it is necessary to highlight 1 hieroglyph, when 2, and when all 3. I wonder how? Obviously this is not done using regular expressions. There are no ...
0
votes
1
answer
46
views
mysql and imported data in PDF::API2 .pdf file garbled
Using PDF::API2 I wrote a perl script to create an invoice PDF file with Japanese text.
The file encoding is utf8 and I have "use utf8" in the header.
If I declare the variables inside the ...
1
vote
0
answers
49
views
Mismatched Character Highlighting in Chrome Browser Search
During a search on a Japanese text page in Chrome, I got an unexpected result. I used ⌘+F to search for the ﾑ (U+FF91) character. Chrome, however, highlighted ム (Unicode ...
0
votes
0
answers
26
views
Stop Android Studio from thinking Korean characters are typos
I love keeping the "Problems" section of my IDE clean of errors, since besides Compile errors, Warnings and Hints often catch things for me.
However I'm working on a multilingual project, ...
0
votes
1
answer
64
views
English to Japanese Translation feature in the app
I am currently working on an English-to-Japanese translation feature in a Kotlin app in Android Studio, but my problem is that even though some of the translation is working, there is translated text that is ...
0
votes
0
answers
63
views
Printing Non-English Characters Appears Broken in Node.js Serial Port Printer
I'm encountering an issue with a Node.js application that communicates with a serial port printer. When attempting to print non-English characters (such as accented letters or characters from ...
0
votes
0
answers
61
views
C++: Inputting & Outputting UTF-8 on Windows?
I'm not familiar with Windows at all
I'm struggling to write a function which reads from a file containing Chinese characters & does some regex.
Roughly:
std::ifstream t(input_file);
std::...
0
votes
0
answers
145
views
How to show `\` instead of special character `₩` in path separator for Windows Forms text-box?
My Windows Forms app displays a file path in a text-box. For our customers in Korea, this is shown as C:₩Users₩abc123₩my₩path instead of C:\Users\abc123\my\path. However, their command prompts, ...
2
votes
0
answers
43
views
In MATLAB, how can I display both Chinese characters and LaTeX symbols in a label? [duplicate]
How can I display both Chinese characters and LaTeX symbols in a label?
If I use the LaTeX interpreter, there are some errors:
close
clear
x = 0:0.01:2*pi;
y = sin(x);
figure;
plot(x, y);
xlabel( '$\text{...
1
vote
1
answer
124
views
Problem with size of custom MFC control when Language is set to Japanese
I look after the DeepSkyStacker application.
If the system language is set to English, then DeepSkyStacker sets its own language to English, and its "Processing" panel (on the right) is ...
0
votes
1
answer
62
views
JavaScript indexOf/charAt not working for Japanese half-width Katakana
In my codebase I have this code. Surprisingly, it returns 1:
'ﾄｹﾞ'.indexOf('ｹ') // Returns 1
The character ｹ doesn't seem to appear in the string ﾄｹﾞ.
I also tried to run this code:
'ﾄｹﾞ'.charAt(1) // ...
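A small sketch of what is probably going on, written in Python rather than JavaScript (all characters involved are in the Basic Multilingual Plane, so the indices should match). It assumes the string really is half-width: ﾄ (U+FF84), ｹ (U+FF79), and the separate half-width voiced sound mark ﾞ (U+FF9E). Under that assumption the string is three characters long and ｹ sits at index 1.

```python
import unicodedata

# Half-width: ﾄ, ｹ, and the standalone voiced sound mark ﾞ.
halfwidth = "\uff84\uff79\uff9e"
print(len(halfwidth))            # 3
print(halfwidth.find("\uff79"))  # 1: the bare ｹ is its own character

# NFKC maps to full-width forms and composes ケ + dakuten into the single character ゲ.
composed = unicodedata.normalize("NFKC", halfwidth)
print(len(composed))             # 2
print(composed.find("\u30b1"))   # -1: ケ (U+30B1) no longer appears on its own
```

Normalizing both the haystack and the needle before searching is one way to get consistent answers. Whether NFKC is acceptable depends on whether the half-width forms need to be preserved.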