Let's start with string literals. The Java Language Specification describes a string literal as follows (3.10.5 String Literals):
A string literal is a reference to an instance of class String (§4.3.1, §4.3.3).
Moreover, a string literal always refers to the same instance of class String. This
is because string literals – or, more generally, strings that are the values of constant
expressions (§15.28) – are “interned” so as to share unique instances, using the
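method String.intern.
In other words, a string literal is a reference to a String instance, and identical string literals always refer to the same String instance. This is because strings that are the values of constant expressions are interned in a string pool shared across the whole JVM, so a single instance can be reused everywhere.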
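The rule quoted above can be checked with reference comparison. The following is only a minimal sketch; the class name `LiteralIdentityDemo` and its method names are made up for illustration and are not part of the original test code.

```java
public class LiteralIdentityDemo {
    // A compile-time constant; its value is interned just like a literal.
    static final String CONST = "test";

    // Two occurrences of the same literal refer to one String instance.
    public static boolean sameLiteral() {
        String a = "test";
        String b = "test";
        return a == b;
    }

    // "te" + "st" is a constant expression, so the compiler folds it to "test".
    public static boolean constantExpression() {
        return ("te" + "st") == CONST;
    }

    // part is a non-final local variable, so part + "st" is not a constant
    // expression; the concatenation happens at run time and creates a new String.
    public static boolean runtimeConcatenation() {
        String part = "te";
        String joined = part + "st";
        return joined == CONST;
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());          // true
        System.out.println(constantExpression());   // true
        System.out.println(runtimeConcatenation()); // false
    }
}
```

Only values of constant expressions are interned automatically; a string built at run time is a separate object until intern() is called on it.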
The Java Virtual Machine Specification describes string literals in more detail (5.1 The Run-Time Constant Pool):
A string literal is a reference to an instance of class String, and is derived
from a CONSTANT_String_info structure (§4.4.3) in the binary representation of
a class or interface. The CONSTANT_String_info structure gives the sequence of
Unicode code points constituting the string literal.
The Java programming language requires that identical string literals (that
is, literals that contain the same sequence of code points) must refer to the
same instance of class String (JLS §3.10.5). In addition, if the method
String.intern is called on any string, the result is a reference to the same
class instance that would be returned if that string appeared as a literal. Thus, the
following expression must have the value true:
("a" + "b" + "c").intern() == "abc"
To derive a string literal, the Java Virtual Machine examines the sequence of
code points given by the CONSTANT_String_info structure.
– If the method String.intern has previously been called on an instance of
class String containing a sequence of Unicode code points identical to that
given by the CONSTANT_String_info structure, then the result of string literal
derivation is a reference to that same instance of class String.
– Otherwise, a new instance of class String is created containing the sequence
of Unicode code points given by the CONSTANT_String_info structure; a
reference to that class instance is the result of string literal derivation. Finally,
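the intern method of the new String instance is invoked.
From the description above we can see that a string literal is derived from a CONSTANT_String_info structure, which holds the sequence of Unicode code points that make up the literal. The Java language requires that string literals made up of the same Unicode code points refer to the same String instance.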
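String s = "test";
For this statement, the JVM first examines the corresponding CONSTANT_String_info structure and obtains the sequence of Unicode code points it contains, that is, the code points that make up "test". If intern() has previously been called on a String instance containing the same sequence of Unicode code points, the result is a reference to that already interned instance in the string pool. Otherwise, a new String instance containing that sequence is created and a reference to it is returned; intern() is then invoked on the new instance, which adds it to the string pool.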
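The relationship between intern() and literal derivation can be observed from Java code. This is a rough sketch only: `InternDemo` and its method names are hypothetical, and it shows the visible behavior, not how HotSpot implements the pool internally.

```java
public class InternDemo {
    // A String built from a char array at run time is a distinct object from
    // the literal, but intern() returns the instance the literal refers to.
    public static boolean internReturnsLiteralInstance() {
        String built = new String(new char[] {'t', 'e', 's', 't'});
        String literal = "test";
        return built != literal && built.intern() == literal;
    }

    // The expression that the JVM specification says must be true.
    public static boolean specExpressionHolds() {
        return ("a" + "b" + "c").intern() == "abc";
    }

    public static void main(String[] args) {
        System.out.println(internReturnsLiteralInstance()); // true
        System.out.println(specExpressionHolds());          // true
    }
}
```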
$ java -version
openjdk version "1.7.0-internal-debug"
OpenJDK Runtime Environment (build 1.7.0-internal-debug-absfree_2016_11_06_18_52-b00)
OpenJDK 64-Bit Server VM (build 23.2-b09-jvmg, mixed mode)
First, using a static final string constant:
public class Test {
    private static final String s = "test";

    public static String combineStr(String s1, String s2) {
        return s1 + s2;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10000000; i++) {
            combineStr(s, s);
        }
    }
}
The corresponding bytecode of main:
public static void main(java.lang.String[]);
   0: iconst_0           // push 0 (the initial value of i) onto the operand stack
   1: istore_1           // store the top of the operand stack (0) into local variable i
   2: iload_1            // load local variable i onto the operand stack
   3: ldc           #6   // push 10000000 onto the operand stack
   5: if_icmpge     22   // compare i with 10000000; jump to 22 if i is greater than or equal
   8: ldc           #7   // push "test" onto the operand stack
  10: ldc           #7   // push "test" onto the operand stack
  12: invokestatic  #8   // Method combineStr:(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
  15: pop
  16: iinc          1, 1
  19: goto          2
  22: return
Next, passing the string literal directly:
public static void main(String[] args) {
    for (int i = 0; i < 10000000; i++) {
        combineStr("test", "test");
    }
}
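The bytecode of main is the same as before:
public static void main(java.lang.String[]);
   0: iconst_0           // push 0 (the initial value of i) onto the operand stack
   1: istore_1           // store the top of the operand stack (0) into local variable i
   2: iload_1            // load local variable i onto the operand stack
   3: ldc           #6   // push 10000000 onto the operand stack
   5: if_icmpge     22   // compare i with 10000000; jump to 22 if i is greater than or equal
   8: ldc           #7   // push "test" onto the operand stack
  10: ldc           #7   // push "test" onto the operand stack
  12: invokestatic  #8   // Method combineStr:(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
  15: pop
  16: iinc          1, 1
  19: goto          2
  22: return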
We can dig further and see what the two versions look like at the native code level. With the HSDIS plugin (Basic Disassembler Plugin for HotSpot), we can obtain the assembly code that main is compiled to:
$ java -XX:+PrintAssembly -Xcomp -XX:CompileCommand=dontinline,*Test.main -XX:CompileCommand=compileonly,*Test.main Test
. . .
# {method} 'main' '([Ljava/lang/String;)V' in 'Test'
# parm0: rsi:rsi = '[Ljava/lang/String;'
# [sp+0x20] (sp of caller)
. . .
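mov %eax,-0x16000(%rsp)     ; stack overflow check
push %rbp                   ; save the previous frame's base pointer
sub $0x10,%rsp              ; allocate this frame
xor %ebp,%ebp
mov $0xeb84d068,%rsi        ; pass the address of "test" in rsi
mov $0xeb84d068,%rdx        ; pass the address of "test" in rdx
callq 0x00007fc1550ce160    ; call combineStr()
In fact, both ways of passing the arguments compile to the assembly above. In other words, passing a string literal and passing a static constant are effectively the same: in both cases the address of the corresponding string is passed directly. That address has already been obtained before main actually executes. For the reason, see what the specification says about implicit object creation (Java Language Specification, 12.5 Creation of New Class Instances):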
. . .
A new class instance may be implicitly created in the following situations:
• Loading of a class or interface that contains a String literal (§3.10.5) may create
a new String object to represent that literal. (This might not occur if the same
String has previously been interned (§3.10.5).)
. . .
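The same conclusion can be checked at the Java level without disassembling anything. This is a minimal sketch under my own naming (`ConstantVsLiteralDemo`, `Other`, and the method names are hypothetical): it verifies that a static final constant and a literal with the same content are one reference, including when the literal appears in a different class.

```java
public class ConstantVsLiteralDemo {
    private static final String S = "test";

    // A separate class (compiled to its own class file) containing the same literal.
    static class Other {
        static String literal() {
            return "test";
        }
    }

    // Reference comparison of the two arguments, as received by the callee.
    static boolean sameReference(String s1, String s2) {
        return s1 == s2;
    }

    // Passing the constant and passing the literal hand over the same reference.
    public static boolean constantAndLiteralMatch() {
        return sameReference(S, "test");
    }

    // The literal in another class resolves to the same interned instance.
    public static boolean sharedAcrossClasses() {
        return S == Other.literal();
    }

    public static void main(String[] args) {
        System.out.println(constantAndLiteralMatch()); // true
        System.out.println(sharedAcrossClasses());     // true
    }
}
```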
I hope this answers the question from the original poster, @郭无心. If anything here is wrong, please point it out :)