Setting the default Java character encoding



This article is translated from: Setting the default Java character encoding


How do I properly set the default character encoding used by the JVM (1.5.x) programmatically?




I have read that -Dfile.encoding=whatever used to be the way to go for older JVMs… I don’t have that luxury, for reasons I won’t get into.




I have tried:



System.setProperty("file.encoding", "UTF-8");


And the property gets set, but it doesn’t seem to cause the final getBytes call below to use UTF8:



    System.setProperty("file.encoding", "UTF-8");

    byte[] inbytes = new byte[1024];

    FileInputStream fis = new FileInputStream("response.txt");
    fis.read(inbytes);
    FileOutputStream fos = new FileOutputStream("response-2.txt");
    String in = new String(inbytes, "UTF8");
    // getBytes() with no argument encodes with the JVM default charset
    fos.write(in.getBytes());
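
A likely explanation for the behaviour above is that the JVM resolves its default charset once, early on, and caches it, so a later change to the property is not picked up. The following is a minimal sketch for checking this on a given JVM; the class and method names are illustrative and not part of the original question.

```java
import java.nio.charset.Charset;

// Illustrative check; the class and method names are made up for this sketch.
class DefaultCharsetCheck {

    // Changes file.encoding at runtime and reports whether the default charset followed.
    static boolean defaultChangesAfterSetProperty() {
        Charset before = Charset.defaultCharset();
        // Pick a value that differs from the current default so a change would be visible.
        String other = before.name().equals("UTF-8") ? "ISO-8859-1" : "UTF-8";
        String saved = System.getProperty("file.encoding");
        try {
            System.setProperty("file.encoding", other);
            return !Charset.defaultCharset().equals(before);
        } finally {
            // Put the property back the way it was found.
            if (saved != null) {
                System.setProperty("file.encoding", saved);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("changed by setProperty: " + defaultChangesAfterSetProperty());
    }
}
```

If the second line reports false, the no-argument getBytes() call in the question keeps using whatever charset the JVM started with, regardless of the property value.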

#1

Reference:

https://stackoom.com/question/1WAJ/设置默认的Java字符编码


#2


I have a hacky way that definitely works!!



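// Needs java.lang.reflect.Field and java.nio.charset.Charset.
// Relies on a private JDK field, so it only works on JVMs that have it and permit the access.
System.setProperty("file.encoding", "UTF-8");
Field charset = Charset.class.getDeclaredField("defaultCharset");
charset.setAccessible(true);
// Clear the cached default so that it is computed again on next use
charset.set(null, null);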


This tricks the JVM into thinking that the default charset has not been set yet, so it sets it again, this time to UTF-8, at runtime!




#3


We were having the same issues.




We methodically tried several suggestions from this article (and others) to no avail.




We also tried adding -Dfile.encoding=UTF8, and nothing seemed to work.




For people who are having this issue, the following article finally helped us track it down. It describes how the locale setting can break Unicode/UTF-8 in Java/Tomcat:






http://www.jvmhost.com/articles/locale-breaks-unicode-utf-8-java-tomcat






Setting the locale correctly in the ~/.bashrc file worked for us.
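
To see what the JVM actually picked up from the environment, a small report like the one below can be run before and after changing the locale. This is only a sketch: the class name is illustrative, and LANG and LC_ALL are the usual POSIX locale variables, which may or may not be the ones that matter in a particular setup.

```java
import java.nio.charset.Charset;

// Illustrative helper; prints the locale-related settings the JVM sees.
class LocaleEncodingReport {

    static String describe() {
        StringBuilder sb = new StringBuilder();
        // Environment variables are null when they are not set.
        sb.append("LANG=").append(System.getenv("LANG")).append('\n');
        sb.append("LC_ALL=").append(System.getenv("LC_ALL")).append('\n');
        sb.append("file.encoding=").append(System.getProperty("file.encoding")).append('\n');
        sb.append("defaultCharset=").append(Charset.defaultCharset().name()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running it from the same shell or service account that starts Tomcat matters, since a different account may load a different ~/.bashrc.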







#4


I have tried a lot of things, but the sample code here works perfectly.





Link






The crux of the code is:



String s = "एक गाव में एक किसान";
String out = new String(s.getBytes("UTF-8"), "ISO-8859-1");
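
For context on what those two lines do: the string is encoded as UTF-8, and each resulting byte is then reinterpreted as one ISO-8859-1 character. The result is not readable text but a byte-for-byte carrier that can be converted back. The answer does not say where this is useful; presumably it is meant for a channel that assumes ISO-8859-1, which is an assumption here. A minimal sketch with illustrative names, using StandardCharsets (Java 7+) instead of charset name strings:

```java
import java.nio.charset.StandardCharsets;

// Illustrative names; shows that the conversion is reversible.
class Latin1Carrier {

    // One char in the result per UTF-8 byte of the input.
    static String utf8BytesAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses utf8BytesAsLatin1.
    static String latin1BackToUtf8(String carried) {
        return new String(carried.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First word of the sample string above, written with Unicode escapes.
        String s = "\u090f\u0915";
        String carried = utf8BytesAsLatin1(s);
        System.out.println("chars in original: " + s.length());
        System.out.println("chars in carrier:  " + carried.length());
        System.out.println("round trip ok:     " + latin1BackToUtf8(carried).equals(s));
    }
}
```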

#5


I can’t answer your original question, but I would like to offer you some advice: don’t depend on the JVM’s default encoding.




It’s always best to explicitly specify the desired encoding (such as “UTF-8”) in your code.




That way, you know it will work even across different systems and JVM configurations.
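
Applied to a file copy like the one in the question, that advice could look roughly like the sketch below. It assumes Java 8 or later (try-with-resources, StandardCharsets, UncheckedIOException), which the 1.5 JVM in the question does not have; there, the charset name "UTF-8" would be passed as a string instead. The class and method names are illustrative.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

// Illustrative sketch; never consults the JVM default charset.
class ExplicitCharsetCopy {

    // Decodes the source as UTF-8 and encodes the target as UTF-8.
    static void copyAsUtf8(File source, File target) throws IOException {
        try (Reader in = new InputStreamReader(new FileInputStream(source), StandardCharsets.UTF_8);
             Writer out = new OutputStreamWriter(new FileOutputStream(target), StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            // Write only the characters actually read in each pass.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }

    // Self-check on temporary files: true if the bytes survive the copy unchanged.
    static boolean selfCheck() {
        try {
            File src = File.createTempFile("response", ".txt");
            File dst = File.createTempFile("response-2", ".txt");
            src.deleteOnExit();
            dst.deleteOnExit();
            byte[] original = "caf\u00e9 \u090f\u0915".getBytes(StandardCharsets.UTF_8);
            Files.write(src.toPath(), original);
            copyAsUtf8(src, dst);
            return Arrays.equals(original, Files.readAllBytes(dst.toPath()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("copy preserved bytes: " + selfCheck());
    }
}
```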




#6


I think a better approach than setting the platform’s default character set, especially as you seem to have restrictions on affecting the application deployment, let alone the platform, is to call the much safer String.getBytes("charsetName").






That way your application is not dependent on things beyond its control.
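
A minimal sketch of that call, written so that it also compiles on an old JVM such as 1.5. The wrapper class and method name are illustrative; the checked exception is part of the getBytes(String) signature.

```java
import java.io.UnsupportedEncodingException;

// Illustrative wrapper around String.getBytes(String charsetName).
class ExplicitGetBytes {

    static byte[] utf8Bytes(String s) {
        try {
            return s.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is written as an escape to keep this source ASCII-only.
        byte[] bytes = utf8Bytes("\u00e9");
        System.out.println("byte count: " + bytes.length);
    }
}
```

The byte count here is the same on every platform, whereas the no-argument getBytes() may return one byte or two for the same string depending on the default charset.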




I personally feel that String.getBytes() should be deprecated, as it has caused serious problems in a number of cases I have seen, where the developer did not account for the default charset possibly changing.

