
Why is the string literal considered a primitive type in JavaScript?

The official documentation, as well as many articles on the internet, says that 'some string' is a primitive value, meaning that a copy is created each time we assign it to a variable.

However, the question "How to force JavaScript to deep copy a string?" (and its answer) demonstrates that V8 does not actually copy a string, even when the substr method is called.

It would also be insane to copy strings every time we pass them into functions. In languages like C#, Java, or Python, the String data type is definitely a reference type.

Furthermore, this link shows the class hierarchy, and we can see HeapObject in it after all: https://thlorenz.com/v8-dox/build/v8-3.25.30/html/d7/da4/classv8_1_1internal_1_1_sliced_string.html

Finally, after inspecting

let copy = someStringInitializedAbove

in Devtools it is clear that a new copy of that string has not been created!

So I am pretty sure that strings are not copied on assignment. But I still do not understand why so many articles, like "JS Primitives vs Reference", say that they are.

Sagid asked Apr 25 '20 15:04




1 Answer

Fundamentally, because the specification says so:

string value

primitive value that is a finite ordered sequence of zero or more 16-bit unsigned integer values

The specification also defines that there are String objects, as distinct from primitive strings. (Similarly there are primitive number, boolean, and symbol types, and Number and Boolean and Symbol objects.)

Primitive strings follow all the rules of other primitives. At a language level, they're treated exactly the way primitive numbers and booleans are. For all intents and purposes, they are primitive values. But as you say, it would be insane for a = b to literally make a copy of the string in b and put that copy in a. Implementations don't have to do that, because primitive string values are immutable (just like primitive number values). You can't change any characters in a string; you can only create a new string. Since no code can observe whether two variables share the same underlying storage, the engine is free to share it. If strings were mutable, the implementation would have to make a copy when you did a = b (but if they were mutable, the spec would be written differently).
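The immutability point can be checked directly. The following is a minimal sketch, assuming a standards-conforming engine such as Node.js or a current browser; the function and variable names are made up for illustration.

```javascript
// Wrapped in a function so the "use strict" directive applies
// regardless of how the snippet is loaded.
function immutabilityDemo() {
  "use strict";
  const original = "hello";
  let alias = original; // an engine is free to share storage here

  // "Changing" a string always yields a new string value;
  // the value held by `original` is untouched.
  alias = alias.toUpperCase();

  // Writing to an index throws a TypeError in strict mode
  // (in loose mode the write is silently ignored).
  let threw = false;
  try {
    original[0] = "J";
  } catch (err) {
    threw = err instanceof TypeError;
  }
  return { original, alias, threw };
}

const demoResult = immutabilityDemo();
console.log(demoResult.original); // hello
console.log(demoResult.alias);    // HELLO
console.log(demoResult.threw);    // true
```

Because no operation can alter the characters of an existing string, whether a = b copies the characters or merely shares them makes no difference that code could detect.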

Note that primitive strings and String objects really are different things:

const s = "hey";
const o = new String("hey");

// Here, the string `s` refers to is temporarily
// converted to a string object so we can perform an
// object operation on it (setting a property).
s.foo = "bar";
// But that temporary object is never stored anywhere,
// `s` still just contains the primitive, so getting
// the property won't find it:
console.log(s.foo); // undefined

// `o` is a String object, which means it can have properties
o.foo = "bar";
console.log(o.foo); // "bar"

So why have primitive strings? You'd have to ask Brendan Eich (and he's reasonably responsive on Twitter), but I suspect it was so that the definition of the equality operators (==, ===, !=, and !==) didn't have to either be something that could be overloaded by an object type for its own purposes, or be special-cased for strings.
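To illustrate how the comparison operators treat the two kinds of string, here is a small sketch (the variable names are arbitrary). Primitive strings compare by value, while String objects, like all objects, compare by identity.

```javascript
const primA = "hey";
const primB = "he" + "y"; // built separately, same characters
const objA = new String("hey");
const objB = new String("hey");

console.log(typeof primA);    // string
console.log(typeof objA);     // object

console.log(primA === primB); // true  (primitives compare by value)
console.log(objA === objB);   // false (two distinct objects)
console.log(objA == objB);    // false (still an identity comparison)

console.log(primA == objA);   // true  (the object is converted to a primitive)
console.log(primA === objA);  // false (different types)
```

This only shows the observable behaviour; it says nothing about the original design motivation, which remains a guess.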

So why have String objects? Having String objects (and Number objects, and Boolean objects, and Symbol objects), along with rules saying when a temporary object version of a primitive is created, makes it possible to define methods on primitives. When you do:

console.log("example".toUpperCase());

in specification terms, a String object is created (by the GetValue operation) and then the property toUpperCase is looked up on that object and (in the above) called. Primitive strings therefore get their toUpperCase (and other standard methods) from String.prototype and Object.prototype. But the temporary object that gets created is not accessible to code except in some edge cases,¹ and JavaScript engines can avoid literally creating the object outside of those edge cases. The advantage to that is that new methods can be added to String.prototype and used on primitive strings.
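The prototype lookup described above can be observed from ordinary code. A brief sketch, using only standard built-ins (the variable name is arbitrary):

```javascript
const text = "example";

// A primitive string resolves its properties through String.prototype.
console.log(Object.getPrototypeOf(text) === String.prototype);        // true
console.log(text.toUpperCase === String.prototype.toUpperCase);       // true

// Object.prototype sits further up the same chain.
console.log(text.hasOwnProperty === Object.prototype.hasOwnProperty); // true

// toUpperCase is inherited, not an own property; length is an own property.
console.log(Object.prototype.hasOwnProperty.call(text, "toUpperCase")); // false
console.log(Object.prototype.hasOwnProperty.call(text, "length"));      // true
```

None of these checks reveal whether the engine actually allocated a temporary String object, which is what leaves engines free to skip it.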


¹ "What edge cases?" I hear you ask. The most common one I can think of is when you've added your own method to String.prototype (or similar) in loose mode code:

// Loose mode: `this` is always coerced to an object inside the method.
Object.defineProperty(String.prototype, "example", {
    value() {
        console.log(`typeof this: ${typeof this}`);
        console.log(`this instanceof String: ${this instanceof String}`);
    },
    writable: true,
    configurable: true
});

"foo".example();
// typeof this: object
// this instanceof String: true

There, the JavaScript engine was forced to create the String object because this can't be a primitive in loose mode.

Strict mode makes it possible to avoid creating the object, because in strict mode this isn't required to be an object type; it can be a primitive (in this case, a primitive string):

"use strict";
// Strict mode: `this` is passed through as the primitive string.
Object.defineProperty(String.prototype, "example", {
    value() {
        console.log(`typeof this: ${typeof this}`);
        console.log(`this instanceof String: ${this instanceof String}`);
    },
    writable: true,
    configurable: true
});

"foo".example();
// typeof this: string
// this instanceof String: false

T.J. Crowder answered Sep 18 '22 16:09