Thursday, February 28, 2013

Yet another blog post about Object Oriented Programming and JavaScript

As mentioned in my previous blog post, I'm doing some extensive JavaScript programming lately. It almost looks like I have become a religious JavaScript fanatic these days, but I can assure you that I'm not. :-)

Moreover, I used to be a teaching assistant for TU Delft's concepts of programming languages course a few years ago. In that course, we taught several programming languages that are conceptually different from the first programming language students have to learn, which is Java. Java was mainly used to give students some basic programming knowledge and to teach them the basics of structured programming and object oriented programming with classes.

One of the programming languages covered in the 'concepts of programming languages' course was JavaScript. We wanted to show students that it's also possible to do object-oriented programming in a language without classes, but with prototypes. Prototypes can be used to simulate classes and class inheritance.

When I was still a student I had to do a similar exercise, in which I had to implement an example case with prototypes to simulate the behaviour of classes. It was an easy exercise, and the explanation that was given to me looked easy. I also did the exercise quite well.

Today, I have to embarrassingly admit that there are a few bits (a.k.a. nasty details) that I did not completely understand, and they were driving me nuts. These are the bits that almost nobody tells you about and that are omitted in most explanations that you will find on the web or that are given by teachers. Because the "trick" that allows you to simulate class inheritance by means of prototypes is often already given, we don't really think about what truly happens. One day you may have to think about all the details and then you will probably run into the same frustrating moments as I did, because the way prototypes are used in JavaScript is not logical.

Furthermore, because this subject is not as trivial as people may think, there are dozens of articles and blog posts on the web, in which authors are trying to clarify the concepts. However, I have seen that a lot of these explanations are very poor, do not cover all the relevant details (and are sometimes even confusing) and implement solutions that are suboptimal. Many blog posts also copy the same "mistakes" from each other.

Therefore, I'm writing yet another blog post in which I will explain what I think about this and what I did to solve this problem. Another benefit is that this blog article is going to save me from telling others the same stuff over and over again. Moreover, considering my past career, I feel that it's my duty to do this.

Object Oriented programming


The title of this blog post contains: 'Object Oriented' (which we can abbreviate with OO). But what is OO anyway? Interestingly enough, there is no single (formal) definition of OO and not everyone shares the same view on this.

A possible (somewhat idealized) explanation is that OO programs can be considered a collection of objects that interact with each other. Objects encapsulate state (or data) and behaviour. Furthermore, they often have an analogy to objects that exist in the real world, such as a car, a desk, a person, or a shape, but this is not always the case.

For example, a person can have a first name and last name (data) and can walk and talk (behaviour). Shapes could have the following properties (state): a color and, depending on the type of shape, a width and height (for a rectangle) or a radius (for a circle). From these properties, we can calculate the area and the perimeter (behaviour).

When designing OO programs, we can often derive the kinds of objects that we need from the nouns, and the way they behave and interact from the verbs, in a specification that describes what a program should do.

For example, consider the following specification:
We have four shapes. The first one is a red colored rectangle that has a width of 2 and a height of 4. The second shape is a green square having a width and height of 2. The third shape is a blue circle with a radius of 3. The fourth shape is a yellow colored circle with a radius of 4. Calculate the areas and perimeters of the shapes.

This specification may result in a design that looks as follows:



The figure above shows a UML object diagram, containing four rectangular boxes. Each of these boxes represents an object. The top section of a box contains the name of the object, the middle section contains its properties (called attributes in UML) and state, and the bottom section contains its behaviour (called operations in UML).

To implement an OO program we can use an OO programming language, such as C++, Java, C#, JavaScript, Objective C, Smalltalk and many others, which is what most developers often do. However, it's not required to use an OO programming language to be Object Oriented. You can also keep track of the representation of the objects yourself. In C for example (which is not an OO language), you can use structs and pointers to functions to achieve roughly the same thing, although the compiler is not able to statically check whether you're doing the right thing.

Class based Object Oriented programming


In our example case, we have designed only four objects, but in larger programs we may have many circles, rectangles and squares. As we may observe from the object diagram shown earlier, our objects have much in common. We have designed multiple circles (we have two of them) and they have exactly the same attributes (a color and a radius) and behaviour. The only difference between each circle object is their state, e.g. they have a different color and radius value, but the way we calculate the area and perimeter remains the same.

From that observation, we could say that every circle object belongs to the same class. The same thing holds for the rectangles and squares. As it's very common to have multiple objects that have the same properties and behaviour, most OO programming languages require developers to define classes that capture these common properties. Objects in class based OO languages are (almost) never created directly, but by instantiation of a particular class.

For example, consider the following code fragment implemented in the Java programming language that defines a Person class (every person has a first and a last name from which a full name can be generated):

public class Person
{
    public String firstName, lastName;
    
    public Person(String firstName, String lastName)
    {
        this.firstName = firstName;
        this.lastName = lastName;
    }
    
    public String generateFullName()
    {
        return firstName + " " + lastName;
    }
}

Basically, the above class definition consists of three blocks. The upper block defines the attributes (first and last name). The middle block defines a constructor that is used to create a person object; every person is supposed to have a first and a last name. The bottom block is an operation (called a method in Java) that generates the person's full name from its first and last name.

By using the new operator in Java in combination with the constructor, we can create objects that are instances of a Person class, for example:

Person sander = new Person("Sander", "van der Burg");
Person john = new Person("John", "Doe");

And we can use the generateFullName() method to display the full names of the persons. The way this is done is common for all persons:

System.out.println(sander.generateFullName()); /* Sander van der Burg */
System.out.println(john.generateFullName()); /* John Doe */

Instantiating classes has a number of advantages over directly creating objects with their state and behaviour:

  • The class definition ensures that every object instance has its required properties and behaviour.
  • The class definition often serves as a contract. We cannot give an object a property or method that is not defined in a class. Furthermore, the constructor forces users to provide a first and last name. As a result, we cannot create a person object with no first and last name.
  • As objects are always instances of a class, we can easily determine whether an object belongs to a particular class. In Java this can be done using the instanceof operator:
    sander instanceof Person; /* true */
    sander instanceof Rectangle; /* false */
    
  • The behaviour, i.e. the methods, can be shared among all object instances, as they are the same for every object. For example, we don't have to implement the function that generates the full name for every person object that we create.

Returning to our previous example with the shapes: Apart from defining classes that capture commonalities of every object instance (i.e. we could define classes for Rectangles, Squares and Circles), we can also observe that these shape classes have commonalities of their own. For example, every shape (regardless of whether it's a rectangle, square or circle) has a color, and for each shape we can calculate an area and perimeter (although the way that's done differs for each type).

We can capture common properties and behaviour of classes in so-called super classes, allowing us to treat all shapes the same way. By adapting the earlier design of the shapes using classes and super classes, we could end up with a new design that will look something like this:



In the above figure a UML class diagram is shown:
  • The class diagram looks similar to the previous UML object diagram. The main difference is that the rectangular boxes are classes (not objects). Their sections still have the same meaning.
  • The arrows denote extends (or inheritance) relationships between classes. Inheriting from a super class (parent) creates a sub class (child) that extends the parent's attributes and behaviour.
  • On top, we can see the Shape class capturing common properties (color) and behaviour (calculate area and perimeter operations) of all shapes.
  • A Rectangle is a Shape extended to have a width and height. A Circle is a Shape having a radius. Moreover, both use different formulas to calculate their areas and perimeters. Therefore, the calculateArea() and calculatePerimeter() operations from the Shape class are overridden. In fact, they are abstract in the Shape class (because there is no general way to do it for all shapes), forcing us to do so.
  • The Square class is a child class of Rectangle, because we can define squares as rectangles with equal widths and heights.

When we invoke a method of an object, first its class is consulted. If it's not provided by the class and the class has a parent class, then the parent class is consulted, then its parent's parent and so on. For example, if we run the following Java code fragment:

Square square = new Square(2);
System.out.println(square.calculateArea()); /* 4 */

The calculateArea() method call is delegated to the calculateArea() method provided by the Rectangle class (since the Square class does not provide one), which calculates the square object's area.

The instanceof operator in Java also takes inheritance into account. For example, the following expressions both yield true:

square instanceof Rectangle; /* true, Square class inherits from Rectangle */
square instanceof Shape; /* true, Square class indirectly inherits from Shape */

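However, the following expression yields false, since the square object is not an instance of Circle or any of its sub classes (this assumes that the variable is declared with the static type Shape, because the Java compiler rejects an instanceof test between two unrelated class types):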

square instanceof Circle; /* false */

Prototype based Object Oriented programming


Apart from its length, this blog post has so far described the attractive parts of OO programming: OO programs often (ideally?) draw a good analogy to what happens in the real world. Class based OO languages enable reuse and sharing. Moreover, they offer some means of enforcing that objects are constructed the right way, i.e. that they have their required properties and intended behaviour.

Most books and articles explaining OO programming immediately use the term classes. This is probably due to the fact that most commonly used OO languages are class based. I have intentionally omitted the term classes for a while. Moreover, not everyone thinks that classes are needed in OO programming.

Some people don't like classes -- because they often serve as contracts, they can also be too strict sometimes. To deviate from a contract, either inheritance must be used that extends (or restricts?) a class or wrapper classes must be created around them (e.g. through the adapter design pattern). This could result in many layers of glue code, significantly complicating programs and growing them unnecessarily big, which is not uncommon for large systems implemented in Java.

There is also a different (and perhaps a much simpler) way to look at OO programming. In OO languages such as JavaScript (and Self, which greatly influenced JavaScript) there are no classes. Instead, we create objects, their properties and behaviour directly.

In JavaScript, most language constructs are significantly simpler than in Java:

  • Objects are associative arrays in which each member refers to other objects.
  • Arrays are objects with numerical indexes.
  • Functions are also objects and can be assigned to variables and object members. This is how behaviour can be implemented in JavaScript. Functions also have attributes and methods. For example, the toString() method returns the function's implementation code and the length property holds the number of parameters the function declares.
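The three points above can be tried out with a few lines of code. This is only a minimal sketch: the variable names are made up for illustration, and console.log is assumed as the output channel (instead of the document.write used elsewhere in this post) so that it also runs outside a browser.

```javascript
/* Objects behave like associative arrays: members can be read by key. */
var person = { firstName: "Sander", lastName: "van der Burg" };
var viaDot = person.firstName;       /* "Sander" */
var viaKey = person["lastName"];     /* "van der Burg" */

/* Arrays are objects whose keys are numerical indexes. */
var colors = ["red", "green", "blue"];
var arrayType = typeof colors;       /* "object" */
var secondColor = colors["1"];       /* "green", the index is a property key */

/* Functions are objects too: they can be stored in object members and
   have members of their own, such as length (a property, not a method). */
function add(a, b) {
    return a + b;
}

person.combine = add;
var sum = person.combine(1, 2);      /* 3 */
var arity = add.length;              /* 2 */
var source = add.toString();         /* the source text of add */

console.log(viaDot, viaKey, arrayType, secondColor, sum, arity);
```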

"Simpler" languages such as JavaScript have a number of benefits. They are easier to learn, as fewer concepts have to be understood and remembered, and easier to implement for language implementers. However, as we create objects, their state and behaviour directly, you may wonder how we can share common properties and behaviour among objects or how we can determine whether an object belongs to a particular class. There is a different mechanism to achieve such goals: delegation to prototypes.

In prototype based OO languages, such as Self and JavaScript, every object has a prototype. Prototypes are also objects (having their own prototype). The only exception in JavaScript is the object at the end of the chain (normally Object.prototype), whose prototype is a null reference. When requesting an attribute or invoking a method of an object, first the object itself is consulted. If the object does not implement it, it consults its prototype, then the prototype's prototype and so on.
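The lookup rule can be observed directly in a runtime that supports the ECMAScript 5 functions Object.create() and Object.getPrototypeOf() (both are discussed at the end of this post). The following is a small sketch under that assumption; the greeter and alice objects are hypothetical examples.

```javascript
/* A prototype object that carries shared behaviour. */
var greeter = {
    greet: function() {
        return "Hello, " + this.name;
    }
};

/* Object.create() yields a new object with the given prototype. */
var alice = Object.create(greeter);
alice.name = "Alice";

/* greet is not a member of alice itself, so the lookup is delegated. */
var ownGreet = alice.hasOwnProperty("greet");   /* false */
var greeting = alice.greet();                   /* "Hello, Alice" */

/* Walking up the chain: alice -> greeter -> Object.prototype -> null */
var firstStep = Object.getPrototypeOf(alice) === greeter;
var secondStep = Object.getPrototypeOf(greeter) === Object.prototype;
var chainEnd = Object.getPrototypeOf(Object.prototype);

console.log(greeting, ownGreet, firstStep, secondStep, chainEnd);
```

Note that hasOwnProperty() is itself found through delegation: it lives two levels up, on Object.prototype.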

Although prototype based OO languages do not provide classes, we can simulate class behaviour (including inheritance) through prototypes. For example, we can simulate the two particular person objects that are instances of a Person class (as shown earlier) as follows:



The above figure shows a UML object diagram (not a class diagram, since we don't have classes ;) ) in which we simulate class instantiation:
  • As explained earlier in this blog post, objects belonging to a particular class have common behaviour, but their state differs. Therefore, the attributes and their state have to be defined and stored in every object instance (as can be observed from the object diagram from the fact that every person has its own first and last name property).
  • We can capture common behaviour among persons in a common object, that will serve as the prototype of every person. This object serves the equivalent role of a class. Moreover, since each person instance can refer to exactly the same prototype object, we also have sharing and reuse.
  • When we invoke a method on a person object, such as generateFullName(), the method invocation is delegated to the Person prototype object, which will give us the person's full name. This offers us exactly the same behaviour as we have in a class based OO language.

By using multiple layers of indirection through prototypes we can simulate class inheritance. For example, to define a super class of a class, we set the prototype's prototype to an object capturing the behaviour of the super class. The following UML object diagram shows how we can simulate our earlier example with shapes:



As can be observed from the picture: if we invoke the calculateArea() method on the square object (shown at the bottom), then the invocation is delegated to the square prototype's prototype (which is an object representing the Rectangle class). That method will calculate the area of the square for us.

We can also use prototypes to determine whether a particular object is an instance of a (simulated) class. In JavaScript this is done by walking the prototype chain to see whether it contains the very object that represents the simulated class. The comparison is by identity: an unrelated object that merely has the same properties does not count.
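Such a check could be sketched as follows, assuming an ECMAScript 5 runtime. The hasInChain() helper and the personClass object are hypothetical names introduced for this illustration; isPrototypeOf() is a standard method that performs the same walk.

```javascript
/* The object that plays the role of the Person class. */
var personClass = {
    generateFullName: function() {
        return this.firstName + " " + this.lastName;
    }
};

/* Hypothetical helper: does classObject occur in obj's prototype chain?
   Objects are compared by identity, not by comparing their members. */
function hasInChain(obj, classObject) {
    var proto = Object.getPrototypeOf(obj);

    while (proto !== null) {
        if (proto === classObject) {
            return true;
        }
        proto = Object.getPrototypeOf(proto);
    }

    return false;
}

var sander = Object.create(personClass);
sander.firstName = "Sander";
sander.lastName = "van der Burg";

/* Same members, but personClass is not in its prototype chain. */
var lookalike = {
    firstName: "Sander",
    lastName: "van der Burg",
    generateFullName: personClass.generateFullName
};

console.log(hasInChain(sander, personClass));      /* true */
console.log(hasInChain(lookalike, personClass));   /* false */
console.log(personClass.isPrototypeOf(sander));    /* true */
console.log(personClass.isPrototypeOf(lookalike)); /* false */
```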

Simulating classes in JavaScript


So far I think the concepts of simulating classes and class inheritance through prototypes are clear. In short, there are three things we must remember:

  • A class can be simulated by creating a singleton object capturing its behaviour (and state that is common to all object instances).
  • Instantiation of a class can be simulated by creating an object having a prototype that refers to the simulated class object and by calling the constructor that sets its state.
  • Class inheritance can be simulated by setting a simulated class object's prototype to the simulated parent class object.

The remaining thing I have to explain is how to implement our examples in JavaScript. And this is where the pain/misery starts. The main reason for my frustration is the fact that every object in JavaScript has a prototype, but we cannot (officially) see or touch them directly.

You may wonder why I used the word 'officially'. In fact, there is a way to see or touch prototypes in Mozilla-based browsers, through the hidden __proto__ object member, but this is a non-standard feature that does not officially exist, should not be used and does not work in many other JavaScript implementations. So in practice, there is no way to access an object's prototype directly.

So how do we 'properly' work with prototypes in JavaScript? We must use the new operator, which looks very much like Java's new operator, but don't let it fool you! In Java, new is called in combination with the constructor defined in a class to create an object instance. Since we don't have classes in JavaScript, nor language constructs that allow us to define constructors, it achieves its goal in a different way:

function Person(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
}

var sander = new Person("Sander", "van der Burg");
var john = new Person("John", "Doe");

In the above JavaScript code fragment, we define the constructor of the Person class as an (ordinary) function that sets a person's first and last name. We have to use the new operator in combination with this function to create a person object.

What does JavaScript's new operator do? It creates an empty object, calls the constructor function that we have provided with the given parameters, and sets this to the empty object that it has just created. The result of the new invocation is an object having a first and last name property, and a prototype containing a constructor property that refers to our constructor function.

So how do we create a person object that has the Person class object as its prototype, so that we can share its behaviour and determine to which class an object belongs? In JavaScript, we can set the prototype of an object to be constructed as follows:

function Person(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
}

Person.prototype.generateFullName = function() {
    return this.firstName + " " + this.lastName;
};

var sander = new Person("Sander", "van der Burg");
var john = new Person("John", "Doe");

/* Shows: Sander van der Burg */
document.write(sander.generateFullName() + "<br>\n");

/* Shows: John Doe */
document.write(john.generateFullName() + "<br>\n");

To me the above code sample looks a bit weird. In the code block after the function definition, we adapt the prototype object member of the constructor function? What the heck is this and how does this work?

In fact, as I have explained earlier, functions are also objects in JavaScript and we can also assign properties to them. The prototype property is actually not the prototype of the function (as I have said: prototypes are invisible in JavaScript). In our case, it's just an ordinary object member.

However, the prototype member of a function is used by the new operator that creates objects. If we call new in combination with a function that has a prototype property, then the resulting object's real prototype refers to that prototype object. We can use this to allow an object instance's prototype to refer to a class object. Moreover, the resulting prototype always refers to the same prototype object (namely Person.prototype), which allows us to share common class properties and behaviour. Got it? :P

If you didn't get it (for which I can't blame you): The result of the new invocations in our last code fragment exactly yields the object diagram that I have shown in the previous section containing person objects that refer to a Person prototype.
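To make the role of the prototype member more tangible, the behaviour of new can be approximated by an ordinary function. This is only a sketch: construct() is a hypothetical helper, it relies on the ECMAScript 5 function Object.create(), and it ignores details of the real operator that do not matter here.

```javascript
/* Hypothetical helper that mimics `new ctor(...)` for ordinary functions. */
function construct(ctor, args) {
    /* 1. Create an empty object whose real prototype is ctor.prototype. */
    var obj = Object.create(ctor.prototype);

    /* 2. Call the constructor function with `this` set to that object. */
    var result = ctor.apply(obj, args);

    /* 3. If the constructor returned an object itself, that one is used. */
    var returnedObject = (typeof result === "object" && result !== null) ||
        typeof result === "function";

    return returnedObject ? result : obj;
}

function Person(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
}

Person.prototype.generateFullName = function() {
    return this.firstName + " " + this.lastName;
};

var john = construct(Person, ["John", "Doe"]);

console.log(john.generateFullName());                          /* John Doe */
console.log(john instanceof Person);                           /* true */
console.log(Object.getPrototypeOf(john) === Person.prototype); /* true */

/* Person.prototype is not the prototype of the Person function itself. */
console.log(Object.getPrototypeOf(Person) === Function.prototype); /* true */
```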

Simulating class inheritance in JavaScript


Now that we "know" how to create objects that are instances of a class in JavaScript, there is an even bigger pain: how do we simulate class inheritance in JavaScript? This was something that really depressed me.

As explained earlier, to create a class that has a super class, we must set the prototype of a class object to point to the super class object. However, since we cannot access or change objects' prototypes directly, it's a bit annoying to do this. Let's say that we want to create an object that is an instance of a Rectangle (which inherits from Shape) and calculate its area. A solution that I have seen in quite a few articles on the web is:

/* Shape constructor */
function Shape(color) {
    this.color = color;
}

/* Shape behaviour (does nothing) */
Shape.prototype.calculateArea = function() {
    return null;
};

Shape.prototype.calculatePerimeter = function() {
    return null;
};

/* Rectangle constructor */
function Rectangle(color, width, height) {
    Shape.call(this, color); /* Call the superclass constructor */
    this.width = width;
    this.height = height;
}

/* Rectangle inherits from Shape */
Rectangle.prototype = new Shape();
Rectangle.prototype.constructor = Rectangle;

/* Rectangle behaviour */
Rectangle.prototype.calculateArea = function() {
    return this.width * this.height;
};

Rectangle.prototype.calculatePerimeter = function() {
    return 2 * this.width + 2 * this.height;
};

/* Create a rectangle instance and calculate its area */
var rectangle = new Rectangle("red", 2, 4);
document.write(rectangle.calculateArea() + "<br>\n");

The "trick" to simulate class inheritance they describe is to instantiate the parent class (without any parameters), setting that object as the child class prototype object and then adding the child class' properties to it.

The above solution works in most cases, but it's not very elegant and a bit inefficient. Indeed, the resulting object has a prototype that refers to the parent's prototype, but we have a number of undesired side effects too. By running the base class' constructor we do some unnecessary work: it assigns undefined properties to a newly created object that we don't need. Because of this, the Rectangle prototype now looks like this:



It also stores undefined class properties in the Rectangle class object, which is unnecessary. We only need an object that has a prototype referring to the parent class object, nothing more. Furthermore, a lot of people build checks into constructors that throw exceptions if certain parameters are unspecified, in which case calling the parent constructor without parameters fails outright.

A better solution would be to create an empty object whose prototype refers to the parent's class object, which we can extend with our child class properties. To do that we can use a dummy constructor function that just returns an empty object:
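Both side effects can be reproduced with a few lines. In this sketch, StrictShape and StrictRectangle are hypothetical variants of the classes above in which the constructor validates its parameter; they are not part of the original example.

```javascript
/* Case 1: the Shape constructor from the example stores an undefined color. */
function Shape(color) {
    this.color = color;
}

var shapeAsPrototype = new Shape(); /* no parameters, as in the "trick" */
var hasOwnColor = shapeAsPrototype.hasOwnProperty("color"); /* true */
var colorValue = shapeAsPrototype.color;                    /* undefined */

/* Case 2: a hypothetical constructor that checks its parameters. */
function StrictShape(color) {
    if (typeof color !== "string") {
        throw new Error("A shape requires a color");
    }
    this.color = color;
}

function StrictRectangle(color, width, height) {
    StrictShape.call(this, color);
    this.width = width;
    this.height = height;
}

var errorMessage = null;

try {
    /* Instantiating the parent without parameters now throws. */
    StrictRectangle.prototype = new StrictShape();
} catch (e) {
    errorMessage = e.message;
}

console.log(hasOwnColor, colorValue, errorMessage);
```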

function F() {};

Then we set the prototype property of the dummy function to the Shape constructor function's prototype object member (the object representing the Shape class):

F.prototype = Shape.prototype;

Then we call the new operator in combination with F (our dummy constructor function). We'll get an empty object having the Shape class object as its prototype. We can use this object as a basis for the prototype that defines the Rectangle class:

Rectangle.prototype = new F();

Then we must fix the Rectangle class object's constructor property to point to the Rectangle constructor. The new empty object has no constructor property of its own, so a lookup is delegated to the Shape class object and currently yields the parent's constructor function:

Rectangle.prototype.constructor = Rectangle;

Finally, we can add our own class methods and properties to the Rectangle prototype, such as calculateArea() and calculatePerimeter(). Still got it? :P

Since the earlier procedure is so weird and complicated (I don't blame you if you don't get it :P), we can also encapsulate the weirdness in a function called inherit() that will do this for any class:

function inherit(parent, child) {
    function F() {}; 
    F.prototype = parent.prototype; 
    child.prototype = new F();
    child.prototype.constructor = child;
}

The above function takes a parent constructor function and a child constructor function as parameters. It points the child constructor function's prototype property to an empty object whose prototype points to the super class object. After calling this function, we can extend the subclass prototype object with class members of the child class. By using the inherit() function, we can rewrite our earlier code fragment as follows:

/* Shape constructor */
function Shape(color) {
    this.color = color;
}

/* Shape behaviour (does nothing) */
Shape.prototype.calculateArea = function() {
    return null;
};

Shape.prototype.calculatePerimeter = function() {
    return null;
};

/* Rectangle constructor */
function Rectangle(color, width, height) {
    Shape.call(this, color); /* Call the superclass constructor */
    this.width = width;
    this.height = height;
}

/* Rectangle inherits from Shape */
inherit(Shape, Rectangle);

/* Rectangle behaviour */
Rectangle.prototype.calculateArea = function() {
    return this.width * this.height;
};

Rectangle.prototype.calculatePerimeter = function() {
    return 2 * this.width + 2 * this.height;
};

/* Create a rectangle instance and calculate its area */
var rectangle = new Rectangle("red", 2, 4);
document.write(rectangle.calculateArea() + "<br>\n");

The above example is the best solution I can recommend to properly implement simulated class inheritance.
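The same function also works for deeper hierarchies. As a sketch, the Square class from the class diagram shown earlier could be added as follows. The Square constructor and its parameters are my own assumption based on that diagram, and console.log is used instead of document.write so that the fragment also runs outside a browser.

```javascript
function inherit(parent, child) {
    function F() {}
    F.prototype = parent.prototype;
    child.prototype = new F();
    child.prototype.constructor = child;
}

function Shape(color) {
    this.color = color;
}

function Rectangle(color, width, height) {
    Shape.call(this, color);
    this.width = width;
    this.height = height;
}

inherit(Shape, Rectangle);

Rectangle.prototype.calculateArea = function() {
    return this.width * this.height;
};

Rectangle.prototype.calculatePerimeter = function() {
    return 2 * this.width + 2 * this.height;
};

/* Square constructor (assumed): a rectangle with equal width and height */
function Square(color, width) {
    Rectangle.call(this, color, width, width);
}

/* Square inherits from Rectangle and adds no behaviour of its own */
inherit(Rectangle, Square);

var square = new Square("green", 2);

console.log(square.calculateArea());      /* 4, delegated to Rectangle */
console.log(square.calculatePerimeter()); /* 8, delegated to Rectangle */
console.log(square instanceof Rectangle); /* true */
console.log(square instanceof Shape);     /* true */
console.log(square.constructor === Square); /* true */
```

Unlike with the earlier "trick", the Rectangle and Square class objects do not carry any undefined color, width or height properties.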

Discussion


In this lengthy blog post I have explained two ways of doing object oriented programming: with classes and with prototypes. Both approaches make sense to me. Each of them has advantages and disadvantages. It's up to developers to make a choice.

However, what bothers me is the way prototypes are implemented in JavaScript and how they should be used. From my explanation, you will probably conclude that it completely sucks and makes things extremely complicated. This is probably the main reason why there is so much stuff on this subject on the web.

Some people argue that classes should be added to JavaScript. For me personally, there is nothing wrong with using prototypes. The only problem is that JavaScript does not allow people to use them properly. Instead, JavaScript exposes itself as a class based OO language, while it's prototype based. The Self language (which influenced JavaScript) for instance, does not hide its true nature.

Another minor annoyance is that if you want to properly simulate class inheritance in JavaScript, the best solution is probably to steal my inherit() function described here. You can probably find dozens of similar functions in many other places on the web.

You may wonder why JavaScript sucks when it comes to OO programming. JavaScript was originally developed by Netscape as part of their web browser, in a rough period known as the browser wars, in which there was heavy competition between Netscape and Microsoft.

At some point Netscape bundled the Java platform with its web browser, allowing developers to embed Java Applets in web pages, which are basically advanced computer programs. They also wanted to offer a lightweight alternative for less technical users that had to look like Java and had to be implemented in a short time span.

They didn't want to design and implement a new language completely from scratch. Instead, they took an existing language (probably Self) and adapted it to have curly braces and Java keywords, to make it look a bit more like Java. Although Java and JavaScript have some syntactic similarities, they are in fact fundamentally different languages. JavaScript hides its true nature, such as the fact that it's prototype based. Nowadays, its popularity has significantly increased and we use it for many other purposes in addition to the web browser.

Fortunately, the latest ECMAScript standard and recent JavaScript implementations ease the pain a little. New implementations have the Object.create() function, allowing you to directly create an object with a given prototype. There is also an Object.getPrototypeOf() function, which gives you read-only access to the prototype of any object. However, to use these functions you need a modern browser or JavaScript runtime (which a lot of people don't have). It will probably take several years before everybody has updated their browsers to versions that support these.
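With these two functions, the dummy constructor function is no longer needed. The following sketch shows how the inherit() function could look in such a runtime; the name inheritWithCreate() is made up here to distinguish it from the earlier version, and the Circle class is my own assumption based on the class diagram.

```javascript
/* Hypothetical ECMAScript 5 variant of inherit(): no dummy constructor. */
function inheritWithCreate(parent, child) {
    child.prototype = Object.create(parent.prototype);
    child.prototype.constructor = child;
}

function Shape(color) {
    this.color = color;
}

/* Circle constructor (assumed) */
function Circle(color, radius) {
    Shape.call(this, color);
    this.radius = radius;
}

inheritWithCreate(Shape, Circle);

Circle.prototype.calculateArea = function() {
    return Math.PI * this.radius * this.radius;
};

var circle = new Circle("blue", 3);

/* The prototypes can now be inspected (but not changed) directly. */
var directPrototype = Object.getPrototypeOf(circle) === Circle.prototype;
var parentPrototype = Object.getPrototypeOf(Circle.prototype) === Shape.prototype;

console.log(directPrototype, parentPrototype, circle instanceof Shape);
console.log(circle.calculateArea());
```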

References


In the beginning of this blog post, I gave a somewhat idealized explanation of Object Oriented programming. There is no uniform definition of Object Oriented programming and its definition seems to be still in motion. On the web I found an interesting blog post written by William Cook, titled: 'A Proposal for Simplified, Modern Definitions of "Object" and "Object Oriented"', which may be interesting to read.

Moreover, this is not the only blog post about programming language concepts that I wrote. A few months ago I also wrote an unconventional blog post about (purely) functional programming languages. It's a bit unorthodox, because I draw an analogy to package management.

Finally, Self is a relatively unknown language for the majority of developers in the field. Although it started a long time ago and it's considered an ancient language, to me it looks very simple and powerful. It's a good lesson for people who want to design and implement an OO language. I can recommend that everybody have a look at the following video lecture from Stanford University about Self. Moreover, Self is still maintained and can be obtained from the Self language website.