1

I have a file containing Chinese text, and I want to copy that text to another file. However, the Chinese characters in the output file come out garbled. Note that my code already uses "UTF8" as the encoding:

BufferedReader br = new BufferedReader(new FileReader(inputXml));
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append("\n");
line = br.readLine();
}
String everythingUpdate = sb.toString();

Writer out = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream(outputXml), "UTF8"));

out.write("");
out.write(everythingUpdate);
out.flush();
out.close();
2
  • 2
    Was your input file encoded in UTF-8? Does the FileReader use UTF-8 when you check getEncoding()? How did you check the output, and does your text viewer support UTF-8?
    – gerrytan
    Commented Mar 13, 2013 at 5:36
  • 2
    Read the input file using the encoding which it uses. You can check a file's encoding in many editors.
    – longhua
    Commented Mar 13, 2013 at 5:39
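
If no editor is at hand, one way to act on the comments above is to test whether the input bytes decode as strict UTF-8. This is only a rough sketch: the class and method names (Utf8Check, isValidUtf8) are made up for illustration. A successful decode is good evidence for UTF-8 but not proof, while a failure does tell you the file uses some other encoding (GBK, Big5, ...).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original question or answers.
public class Utf8Check {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of silently
                    // substituting U+FFFD for bad input.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Convenience overload that reads the whole file into memory first,
    // so it is only suitable for files of moderate size.
    static boolean isValidUtf8(Path file) throws IOException {
        return isValidUtf8(Files.readAllBytes(file));
    }
}
```

If isValidUtf8 returns false for the input file, reading it as UTF-8 cannot work, and the reader needs to be given the file's real encoding.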

2 Answers

3

The answer from @hyde is valid, but I have two extra notes, which I point out in the code below.

Of course, it is up to you to reorganize the code to suit your needs.

// try-with-resources guarantees that both streams are closed properly.
// Your code never closes the reader, and if an exception is thrown
// the writer is not closed either.
try (BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(inputXml), "UTF-8"));
    PrintWriter out = new PrintWriter(new OutputStreamWriter(new FileOutputStream(outputXml), "UTF-8"))) {
    String line = reader.readLine();

    while (line != null) {
        // It is highly recommended to use the host's line separator rather
        // than a hard-coded "\n". println() already appends
        // System.getProperty("line.separator"), so writing the separator
        // explicitly as well would insert extra blank lines.
        out.println(line);
        line = reader.readLine();
    }
} catch (IOException e) {
    e.printStackTrace();
}
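
For completeness, the same copy can also be sketched with java.nio.file.Files (Java 7+), which takes the charset as a constant instead of a string. The class and method names (Utf8Copy, copyUtf8) are mine, not from the question, and this assumes the input really is UTF-8. A useful side effect is that the reader returned by Files.newBufferedReader throws a MalformedInputException on invalid bytes instead of silently replacing them.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, shown only as an illustration.
public class Utf8Copy {

    // Copies a text file line by line, decoding and encoding as UTF-8.
    static void copyUtf8(Path input, Path output) throws IOException {
        // try-with-resources closes both streams, even on an exception.
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                // newLine() writes the platform's line separator.
                writer.newLine();
            }
        }
    }
}
```

You would call it as Utf8Copy.copyUtf8(Paths.get(inputXml), Paths.get(outputXml)). Note that every output line ends with the platform separator, so the line endings may differ from those of the input file.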
2

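You should not use FileReader in a case like this, as it does not let you specify the input encoding and always decodes with the platform default charset (constructors that take a Charset were only added in Java 11). Construct an InputStreamReader on a FileInputStream instead.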

Something like this:

BufferedReader br = 
        new BufferedReader(
            new InputStreamReader(
                new FileInputStream(inputXml), 
                "UTF8"));