Excerpt from Concepts of Object-Oriented Programming with Visual Basic by Steven Roman, published by Springer-Verlag. ISBN: 0-387-94889-9

Copyright © 1999 by The Roman Press, Inc. All Rights Reserved. You may view and print this document for your own personal use only. No portion of this document may be sold or incorporated into any other document for any reason.

The Basics of Object-Oriented Programming

Data Types

We begin our discussion of object-oriented programming with a more familiar concept - that of a data type. What is a data type? For instance, what is the integer data type?

One possible answer is that, for Visual Basic, the integer data type is the set of integers from -32,768 to 32,767. However, this answer is not very stimulating and will not lead us to object-oriented ideas.

To provide a more fruitful answer, recall that all data are stored in a computer in binary form, as strings of 0's and 1's. Moreover, to a computer (that is, to the CPU), a binary string is just a string of 0's and 1's - no more and no less. The CPU knows how to push these strings around but does not attach a meaning or interpretation to them. It might be fair to say that the CPU recognizes only one data type, namely, the bit data type!

With this view in mind, we can say that the integer data type is a certain way of interpreting binary words - 16-bit binary words in the case of Visual Basic. Consider, for example, the 16-bit binary word 0100 0001 0100 0001. To the CPU, this is nothing more than a string of bits. To Visual Basic, it is also nothing more than that, until we give it an interpretation.

For instance, to tell Visual Basic to interpret this word as a string data type, we would write

Dim X As String
X = "AA"

since the ANSI code for "AA" is 0100 0001 0100 0001. To tell Visual Basic to interpret this word as an integer data type, we would write

Dim X As Integer
X = 16705

since the word in question is the binary representation of the integer 16705.
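As a quick check of this arithmetic, the following sketch rebuilds the integer 16705 from the character code of "A". The procedure name ShowSameBits is our own, chosen only for illustration; Asc, Hex and Debug.Print are standard Visual Basic. We assume, as in the discussion above, that each character contributes one byte to the 16-bit word.

```vb
Sub ShowSameBits()
    Dim Code As Integer

    Code = Asc("A")                      ' 65, which is 0100 0001 in binary

    ' Two copies of that byte, side by side, form the 16-bit word
    Debug.Print Code * 256 + Code        ' prints 16705
    Debug.Print Hex(Code * 256 + Code)   ' prints 4141 (hexadecimal)
End Sub
```

The hexadecimal form 4141 makes the two identical bytes easy to see, since each hexadecimal digit stands for four bits.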

Now comes the key point. Interpreting a binary word as a particular data type (such as integer or string) implicitly gives that word certain properties and operations. For instance, a binary word of the integer data type has the sign property, which can take the values positive, negative or zero. In Visual Basic, the sign property is realized through the Sgn function, as in the following code:

Select Case Sgn(X)
    Case 1
        MsgBox "Positive"
    Case 0
        MsgBox "Zero"
    Case -1
        MsgBox "Negative"
End Select

On the other hand, when the same binary word is interpreted as a string, it does not have the sign property. It does, however, have the length property, realized in Visual Basic by the Len function. Moreover, binary words of the integer data type have the usual arithmetic operations, such as negation, addition, subtraction and multiplication, whereas binary words of the string data type have the concatenation operation, for instance.
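To make the contrast concrete, here is a minimal sketch that exercises a few of the properties and operations just mentioned. The procedure and variable names are hypothetical; the functions Sgn and Len and the operators - and & are standard Visual Basic.

```vb
Sub ShowTypeOperations()
    Dim N As Integer
    Dim S As String

    N = 16705
    S = "AA"

    ' Properties and operations of the integer data type
    Debug.Print Sgn(N)      ' prints 1 (N is positive)
    Debug.Print -N          ' prints -16705 (negation)

    ' Properties and operations of the string data type
    Debug.Print Len(S)      ' prints 2 (length)
    Debug.Print S & S       ' prints AAAA (concatenation)
End Sub
```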

Since data types would not be of much use without their concomitant properties and operations, it makes sense to define a data type as a way of interpreting binary strings together with these properties and operations. Thus, the properties and operations are included as part of the definition of data type.

As we will see, this definition of data type has far-reaching consequences. It may seem at first that it makes no difference whether we include the properties and operations as part of the definition, or think of them as separate from the data type, as long as they are there. However, the purpose of including the properties and operations within the definition is more than cosmetic, since it allows for the abstraction of the concept of data type and signals the beginning of a new philosophy when thinking about coding issues. Part of this new philosophy is termed encapsulation.

Encapsulation

The idea of encapsulation is to contain (or "encapsulate") in one neat bundle the properties and behaviors (operations) that characterize an object. This serves three useful purposes:

· It permits the protection of these properties and behaviors from outside tampering by exposing only those portions that are needed in order to use the properties and behaviors.

· It allows the inclusion of validation code to help catch errors in the use of the exposed interface.

· It frees the user from having to know the details of how the properties and behaviors are implemented.

In a sense, all of learning involves encapsulation of concepts. Let us consider an example that involves the way computers store signed (that is, positive, zero and negative) integers and do arithmetic with these integers. Please bear with me, even if some of this seems a bit irrelevant at first - it will only take a couple of paragraphs.

As you undoubtedly know, an integer is stored in the memory of a PC as a string of 0's and 1's - called a binary string. This string needs to be interpreted in order to have meaning. In some languages, such as C, we may designate, for instance, that the string should be interpreted as a signed integer or as an unsigned integer, the difference being whether or not the number is allowed to be negative. This distinction is important because we have only limited space, and so, for instance, disallowing negative numbers gives us more room for nonnegative numbers.

In any case, Visual Basic does not provide us with this option. All integers are considered to be signed integers and are represented in the computer's memory in a form called two's-complement representation. The reason for the "fuss" is that there is no way to directly represent a negative sign in the computer's memory, and so a portion of the binary string itself must be used to represent the negative sign.

For simplicity, let us consider 8-bit binary numbers. An 8-bit binary number has the form a7a6a5a4a3a2a1a0, where each of the ai's (i = 0, ..., 7) is a 0 or a 1. We can think of it as occupying eight adjacent one-bit cells in memory, with a7 in the leftmost cell and a0 in the rightmost.

As an example, consider the binary numbers

x = 11110000 and y = 00001111

In the two's-complement representation, the leftmost bit a7 (called the most significant bit) is the sign bit. If the sign bit is 1, the number is negative. If the sign bit is 0, the number is nonnegative (positive or zero). Thus, x is negative and y is positive.

The formula for converting a two's-complement representation a7a6a5a4a3a2a1a0 of a number to a decimal representation is

decimal rep. = -128a7 + 64a6 + 32a5 + 16a4 + 8a3 + 4a2 + 2a1 + a0

(note that the coefficients are just successive powers of 2). Thus, for instance, the decimal representation of the number x given above is

x = -128 + 64 + 32 + 16 = -16

and the decimal representation of y is

y = 8 + 4 + 2 + 1 = 15
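The conversion formula can be sketched in Visual Basic as follows. The function name TwosComplement8 is hypothetical, and for brevity the sketch assumes that its argument is a string of exactly eight characters, each of which is "0" or "1" (no validation is performed).

```vb
Function TwosComplement8(Bits As String) As Integer
    Dim i As Integer
    Dim Weight As Integer
    Dim Total As Integer

    Weight = 128                     ' weight of the leftmost bit, a7
    For i = 1 To 8
        If Mid$(Bits, i, 1) = "1" Then
            If i = 1 Then
                Total = Total - Weight   ' the sign bit counts as -128
            Else
                Total = Total + Weight   ' the other bits count as +64, +32, ..., +1
            End If
        End If
        Weight = Weight \ 2          ' integer division: 128, 64, 32, ..., 1
    Next i

    TwosComplement8 = Total
End Function
```

With this function, TwosComplement8("11110000") returns -16 and TwosComplement8("00001111") returns 15, in agreement with the hand calculations above.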

To take the negative of a number when it is represented in two's-complement form, we must take the complement of each bit (that is, change each 0 to a 1 and each 1 to a 0) and then add 1. For instance, to form the negative of the number x, we first take the complement

00001111

which, in decimal form, is

8 + 4 + 2 + 1 = 15

and then we add 1, to get 16 (which is indeed the negative of -16).
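The same two steps can be written with Visual Basic's bitwise operators. In this sketch, which is only one of several ways to do it, the hypothetical function Negate8 takes an 8-bit pattern stored as a number from 0 to 255 and returns the pattern of its negative. The mask &HFF simply discards everything above the lowest eight bits.

```vb
Function Negate8(ByVal Pattern As Integer) As Integer
    ' Step 1: complement each bit (Not), keeping only the low 8 bits
    ' Step 2: add 1, again keeping only the low 8 bits
    Negate8 = ((Not Pattern And &HFF) + 1) And &HFF
End Function
```

For example, Negate8(&HF0) returns &H10. The pattern 11110000 (that is, -16) becomes 00010000 (that is, 16), and applying Negate8 a second time returns the original pattern.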

Hopefully, at this point you are saying to yourself, "What is the point of this discussion? What does it have to do with object-oriented programming? I didn't buy this book to get a math lesson! As a programmer, I don't have to worry about these details. I just write code like

x = -16
y = -x

and let the computer and the programming language worry about which representation to use and how to perform the given operations."

If you are saying this, then you get the point! The details of how signed integers are interpreted by the computer (and the compiler) and how their properties and operations are implemented are encapsulated in the integer data type and are thus hidden from the user. Only those portions of the properties and operations that we need in order to work with integers are exposed outside of the data type. These portions form the public interface for the integer data type.

Moreover, encapsulation protects us from making errors. For instance, if we had to do our own negating by taking Boolean complements and adding 1, we might forget to add 1! The encapsulated data type takes care of this automatically.

Encapsulation has yet another important feature. Any code that is written using the public interface will remain valid even if the internal workings of the integer data type are changed for some reason, as long as the interface is not changed. For instance, if we move the code to a computer that stores integers in one's-complement representation, then the internal procedure for implementing the operation of negation in the integer data type will have to be changed, but from the programmer's point of view, nothing has changed. The code

x = -16
y = -x

is just as valid as before. What a relief.

Abstract Data Types

Encapsulation is so useful for data types that it makes sense to apply the concept to as many other objects as possible. Perhaps the best way to make this idea clear is through an example.

Consider a hypothetical teacher who wishes to write a program to keep students' exam grades for a particular course (say there will be three exams during the semester). Why not define an abstract data type named Student and give it some properties and operations?

The Student data type will have the properties FullName, StudentID, Exam1, Exam2 and Exam3. The FullName and StudentID properties are strings and the Exam1, Exam2 and Exam3 properties are real numbers. The Student data type has two operations: Average, which returns a weighted average of the three exam grades (the third exam is the final), and Pass, which returns yes if the student passes the course or no if the student does not pass.

From now on, we will generally refer to operations as methods, because it is the term used in Visual Basic and it is better suited to abstract data types. While we are on the subject of terminology, in object-oriented programming circles, properties are also called resources, attributes or member variables and methods are also called behaviors, services, operations, member functions or responsibilities (yuck!).

An abstract data type is an abstraction of a basic (or shall we say concrete) data type, such as the integer data type. It has properties and operations, but it does not have quite the same concept of interpretation. To understand the differences between abstract and concrete data types, consider that the code

Dim X As Integer
X = 200

can be interpreted as follows:

Line 1: Let X be a variable that refers to a 16-bit area of memory that will be interpreted as an integer.

Line 2: Fill the area of memory referred to by X with the binary string 0000 0000 1100 1000, which is thus the binary representation of the integer 200.

For an abstract data type, we must be a little less "concrete." The code

Dim Donna As Student
Set Donna = New Student

can be interpreted as follows:

Line 1: Let Donna be a variable that will refer to an abstract "object" of type Student (rather than to a memory location).

Line 2: Create an object of that type and let Donna refer to that object.

From a programmer's perspective, the notion of a 16-bit area of memory is concrete - it is something we can visualize. On the other hand, the notion of an object doesn't bring anything concrete to mind. Instead, we may simply think of an object as a "black box" that Visual Basic manages in some arcane fashion. The concrete side of an object consists of its properties and methods, not where it is "located" in memory. In a sense then, an object is defined or characterized completely by its public interface!

From a practical point of view, since we seldom think about areas of memory when dealing with, say, an integer variable, there is little difference between a concrete and an abstract data type, although the latter tends to be more complex, in that it has many more properties and methods.

It is also interesting to note that abstract data types are built upon concrete data types. For instance, objects of the abstract Student data type have a FullName property, which takes as its value a member of the concrete String data type.

As we will see, properties of an object can also be of type Object, that is, an object can have a property whose value is another object. We will refer to such a property as an object property. For instance, an object of type Student can have a property of type Teacher. These objects in turn can have properties of type Object, and so on. Eventually, the chain of object properties must terminate in objects whose properties have concrete data types (such as integer or string).

The existence of object properties allows for the creation of object hierarchies in Visual Basic, an example of which is shown below. In this case, an object of type Student has three properties of type Object - Advisor, Courses and Transcript. The Courses object is a special type of object, known as a collection object. Collection objects are designed to hold other objects. This is important since, unless an object is either referred to by a variable or contained in a collection, it will be destroyed by Visual Basic.

The Courses collection object contains objects of type Course, that is, individual courses. Each object of type Course has an object property called Professor, which in turn has an object property called Emolument.

Object hierarchies can get quite complicated, but they have great advantages. For instance, they provide a logical structure to the program. We will discuss object hierarchies in more detail later in this chapter.

Classes

In many languages, including Visual Basic, abstract data types are implemented through classes. In general terms, a class is just a description of the properties and methods that define an abstract data type. In Visual Basic, a class is an actual code module that describes these properties and methods. Thus, a class is a template for making objects of a certain type. It is important not to confuse the template for building objects with the objects themselves. Put another way, it is important not to confuse the description of an object with the object itself.

Defining a Class in Visual Basic
To define a class in Visual Basic, we just insert a new class module and assign it a name in the Properties dialog box, for example, CStudent. Note that, as is customary, we prefix the letter "C" (for Class) to the class name.

If we want to think in a truly object-oriented manner, we should think of the class CStudent itself as a member of an abstract data type called Class (perhaps it should be CClass, but this is not customary). One of the properties of the class Class is Name. Simply put, every class has a name. The Class data type is managed by the Visual Basic IDE (integrated development environment) and has three properties: Name, Instancing and Public, as you can see from the Properties window of a class module, shown in Figure 1.1. (We will discuss the other two properties in the chapter on OLE automation.) When we use the Insert Class menu option in Visual Basic, we are actually creating a new object of type Class. Visual Basic then assigns to the object's Name property the default value Class1 (or Class2, etc.).

Figure 1.1

It is good programming practice to describe a newly created class in the General Declarations section of the class module. Thus, for the class CStudent, we might write

'' (In General Declarations Section)
' The Student Class CStudent
'
' Author: Steven Roman
' Date of last revision: Oct 1, 1066
'
' Properties:
' FullName
' StudentID
' Exam1, Exam2, Exam3
'
' Methods:
' Average
' Pass

(Code lines that begin with a double apostrophe are comments to the reader and are not intended to be included in the actual code. Thus, for instance, we will indicate the location of code by lines such as the first line of code above.)

Defining a Property in Visual Basic
In Visual Basic, a property is described by a public variable or by an exposed private variable (we will clarify this soon). Thus, we can make the following property declarations in the General Declarations section of the CStudent class module:

'' (In General Declarations Section)
' Properties
Public FullName As String
Public StudentID As String
Public Exam1 As Single
Public Exam2 As Single
Public Exam3 As Single

Defining a Method in Visual Basic
In Visual Basic, a method is just a public procedure (subroutine or function). Thus, the Average and Pass methods for the CStudent class are defined as follows:

Public Function Average() As Single

    ' Exam scores must lie between 0 and 100
    If Exam1 >= 0 And Exam1 <= 100 And _
       Exam2 >= 0 And Exam2 <= 100 And _
       Exam3 >= 0 And Exam3 <= 100 Then
        Average = 0.25 * Exam1 + 0.25 * Exam2 + 0.50 * Exam3
    Else
        '' Code here to display an error message
    End If

End Function

Public Function Pass(pCutOff As Single) As Boolean

    ' Pass a student if average >= cutoff
    If pCutOff < 0 Or pCutOff > 100 Then
        '' Code here to display error message
    Else
        If Average >= pCutOff Then
            Pass = True
        Else
            Pass = False
        End If
    End If

End Function
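As a brief illustration of how these methods might be called, consider the following sketch. It assumes that the project contains the CStudent class module with the public variables and the two methods given above; the procedure name TestStudent and the exam scores are made up for the example.

```vb
Sub TestStudent()
    Dim Donna As CStudent
    Set Donna = New CStudent

    Donna.FullName = "Donna Smith"
    Donna.Exam1 = 80
    Donna.Exam2 = 90
    Donna.Exam3 = 70

    ' 0.25 * 80 + 0.25 * 90 + 0.50 * 70 = 77.5
    Debug.Print Donna.Average      ' prints 77.5
    Debug.Print Donna.Pass(65)     ' prints True, since 77.5 >= 65
End Sub
```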

Exposing Properties Through the Property Let/Set/Get Procedures
There is one major problem with our implementation of the CStudent class - it is too public and violates encapsulation principles. It is clear that methods allow for the inclusion of validation code to "protect" them from untoward usage. For instance, we have included validation code to prevent negative exam scores and scores that exceed 100.

On the other hand, the value of a property set through a public variable can be set from anywhere in the project and is not subject to validation. For instance, student IDs may be required to be strings of length 10, but there is nothing to prevent us, or someone coding another portion of the application, from accidentally entering a StudentID of length 9. This might produce an error down the road, when the property is used. However, at that point, it might be very difficult to locate the offending code.

In order to prevent this, proper encapsulation dictates that we hide direct access to the member variables, by declaring them Private. Then we expose each property by means of two special methods - Property Let (or Property Set, in the case of properties of type object) to set the property and Property Get to retrieve the property. This will allow us to include validation code to prevent the aforementioned peccadillo. Here is how the StudentID property should be coded, for instance:

'' (In Declarations Section)
' Private member variable
Private mStudentID As String

Public Property Let StudentID(pID As String)

    ' Validate ID
    If Len(pID) = 10 Then
        mStudentID = pID
    Else
        '' Code here to raise an error
    End If

End Property

Public Property Get StudentID() As String

    '' Getting the property requires no validation
    StudentID = mStudentID

End Property

Note that the private variable holds the property, which is exposed through public methods. Thus, the public interface consists entirely of methods, which can contain validation code. This has the further advantage that we can easily make the StudentID property read-only by leaving out the Property Let method. In this case, the property can be set only through its private member variable, from within the class module. This would not be possible using property variables alone.
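One possible way to fill in the error-raising placeholder is with the Raise method of Visual Basic's Err object, as sketched below. The error number (vbObjectError + 1000), the source string and the message text are arbitrary choices made for this illustration, as is the name of the test procedure.

```vb
'' (In the CStudent class module)
Private mStudentID As String

Public Property Let StudentID(pID As String)
    If Len(pID) = 10 Then
        mStudentID = pID
    Else
        ' Reject the value and report the problem to the caller
        Err.Raise vbObjectError + 1000, "CStudent.StudentID", _
            "StudentID must be exactly 10 characters long"
    End If
End Property

Public Property Get StudentID() As String
    StudentID = mStudentID
End Property

'' (In a standard module)
Sub TestStudentID()
    Dim Donna As CStudent
    Set Donna = New CStudent

    On Error Resume Next
    Donna.StudentID = "123456789"        ' only 9 characters
    If Err.Number <> 0 Then
        Debug.Print "Rejected: " & Err.Description
        Err.Clear
    End If
    On Error GoTo 0

    Debug.Print Len(Donna.StudentID)     ' prints 0: the bad value was never stored
End Sub
```

The point of the design is that the offending assignment is caught at the moment it is made, rather than much later, when the property is finally used.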

In the theory of object-oriented programming, the Property Let (and Set) method is referred to as an update method, since it updates the value of the property, and the Property Get method is referred to as an accessor method, since it accesses the value of a property. (Sometimes both property procedures are referred to as accessor methods.) These two methods are the key to encapsulation, since their presence implies that the public interface consists only of methods, which can perform validation.

As mentioned in the introduction to the book, for the sake of simplicity and to save space, we will reluctantly violate encapsulation principles in our examples, by often declaring properties using public variables.

Objects

A class is just a description of some properties and methods and does not have a life of its own. In fact, if we run a program that contains only an empty form, along with a class module filled with code, nothing much will happen - the code in the class module will not execute. To obtain something useful, we must create an instance of the class, officially known as an object. Creating an instance of a class is referred to as instancing, or instantiating, the class. (For some reason, one sometimes sees the inaccurate phrase instance of an object. The object is the instance.) We may create as many instances of a class as desired.

Instancing a class, that is, creating an object, is a two-step process in Visual Basic, because it is first necessary to declare a variable to use as a reference to the prospective object. Moreover, there are two forms of object creation - explicit creation and implicit creation.

Explicit Object Creation
To explicitly create an object of type CStudent, we use the code

Dim Donna As CStudent
Set Donna = New CStudent

The first line declares a variable named Donna to be of type CStudent. The second line asks Visual Basic to create an object of type CStudent and assign Donna as a reference to that object.

It is very important to keep a clear distinction between an object and a variable that refers to that object. It is the variable that we use in code. In fact, variables are the only means we have to use objects - we never really "see" the object itself.

To drive this point home, note that if we write

Dim Steve As CStudent
Set Steve = Donna

then Steve and Donna both point to (that is, refer to) the same object. Thus, we have two object variables but only one object.

The fact that object variables provide only references to objects can lead to confusion when many different object variables reference the same object. One area of potential problems arises when it comes time to destroy an object, since Visual Basic will not do so until all references to that object are removed. Thus, for instance, while the line

Set Donna = Nothing

removes the Donna reference to the object, the object is not destroyed, since Steve still holds a reference to that object.
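The Is operator, which compares two object references, offers a simple way to watch this happen. The following is a minimal sketch that assumes the CStudent class module is part of the project; the procedure name is hypothetical.

```vb
Sub TestReferences()
    Dim Donna As CStudent
    Dim Steve As CStudent

    Set Donna = New CStudent
    Set Steve = Donna

    Debug.Print Steve Is Donna       ' prints True: two variables, one object

    Set Donna = Nothing
    Debug.Print Donna Is Nothing     ' prints True: Donna no longer refers to anything
    Debug.Print Steve Is Nothing     ' prints False: the object survives through Steve
End Sub
```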

Despite these facts, one often hears an expression such as "the object Donna" rather than "the object referenced by Donna." Since the former usage is so common, we will feel free to use it as well, with the hope that no confusion will arise between a variable that points to an object and the object itself.

Instance Variables and Member Variables
There are some additional points that we should emphasize here. The line

Set Donna = New CStudent

not only creates an object, but assigns to that object its own copy of the member variables (both public and private) of the class CStudent. Simply put, each object gets its own copy of the member variables. These copies are referred to as the instance variables of the object. Thus, there is a distinction between member variables and instance variables. In a sense, member variables are never actually used as variables but serve as a "prototype" for the instance variables.

Thus, for example, if we write

Dim Bill As CStudent
Set Bill = New CStudent

then Donna and Bill will each have their own separate variables named FullName, StudentID, Exam1, Exam2 and Exam3. To refer to Donna's instance variables, the variable name must be qualified, as in

Donna.FullName

In fact, the expression Donna.FullName is called a fully qualified property name.

On the other hand, since Donna and Steve (from the previous subsection) point to the same object, they share that object's instance variables. Thus, the code

Steve.Exam1 = 20

implies also that Donna.Exam1 equals 20. This is in contradistinction to the situation for ordinary variables, where, for example, if we execute the code

X = 5
Y = X
Y = 10

then the value of X is still 5.
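The difference between sharing an object and owning a separate one can be seen in a short sketch. It assumes the CStudent class module with the public variable Exam1 declared as in the text; the procedure name is our own.

```vb
Sub TestSharedInstance()
    Dim Donna As CStudent
    Dim Steve As CStudent
    Dim Bill As CStudent

    Set Donna = New CStudent
    Set Bill = New CStudent
    Set Steve = Donna            ' Steve and Donna refer to one object

    Steve.Exam1 = 20

    Debug.Print Donna.Exam1      ' prints 20: same object, same instance variable
    Debug.Print Bill.Exam1       ' prints 0: Bill's own copy still has its default value
End Sub
```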

Method names must also be qualified, to indicate which instance variables (if any) are to be used in the code for that method. Thus, to execute the Average method for Donna, we write

Donna.Average

The Pass method requires a parameter, as in

If Donna.Pass(65) Then MsgBox "You Passed!"

It is important to emphasize that, while each instance of a class gets its own copy of the member variables, all instances share the same method code from the class. Thus, the lines

Donna.Average

and

Bill.Average

will cause the same lines of code to be executed. Of course, any references to member variables are replaced by the corresponding instance variables for the object in question. For example, when Donna.Average is executed, Visual Basic uses Donna's instance variables Donna.Exam1, Donna.Exam2 and Donna.Exam3, whereas when Bill.Average is executed, Visual Basic uses Bill's instance variables Bill.Exam1, Bill.Exam2 and Bill.Exam3.

Method code sharing is one of the most important advantages of object-oriented programming. In short, methods contain reusable code.

The As Object Syntax
If you have been programming with Visual Basic, then you may be familiar with the following syntax for declaring an object variable:

Dim Donna As Object

This line declares Donna as a variable of the generic abstract data type Object and thus allows Donna to reference any object. To illustrate, consider the following code:

' Declare Donna as a generic object variable
Dim Donna As Object

' Set Donna to reference a CStudent object
Set Donna = New CStudent
Donna.FullName = "Donna Smith"
MsgBox Donna.FullName

' Now set Donna to reference a Visual Basic form!
Set Donna = Forms.Item(0)
MsgBox Donna.Name

After setting the object variable Donna to point to a CStudent object, we then set the variable to point to a form object! (Forms.Item(0) refers to the first loaded form. We will discuss the Forms collection later in the chapter.)

It might seem that the use of generic object variables simplifies programming, since we don't have to decide ahead of time what type of object a given variable will reference. However, there is a performance penalty to pay for using generic object variables. We will say a few words about this issue now and postpone a thorough discussion until the chapter on OLE automation, where the issue is most keenly felt.

Visual Basic can determine the type of object that is referenced by a variable (also known as resolving a variable reference or binding a variable) either at compile time or at run time. Resolving references at compile time produces a more efficient executable file, since the file does not need to contain the extra code needed to do the referencing.

When we use specific object types rather than generic object types, such as CStudent instead of Object, or TextBox instead of Control, or Integer instead of Variant, Visual Basic can bind the variables at compile time, which is more desirable. This is particularly important when there are lots of object variables, or when using OLE automation objects, as we will see in a later chapter. The moral is: For improved performance, be as specific as possible when declaring variables.
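The following sketch contrasts the two kinds of declaration. It assumes the CStudent class module with its FullName property; the procedure and variable names are hypothetical. The TypeName function, which is part of Visual Basic, reports the type of the object to which a variable currently refers.

```vb
Sub TestBinding()
    Dim Specific As CStudent     ' specific type: bound at compile time
    Dim Generic As Object        ' generic type: bound at run time

    Set Specific = New CStudent
    Set Generic = Specific       ' both variables now refer to the same object

    Specific.FullName = "Donna Smith"    ' checked when the code is compiled
    Debug.Print Generic.FullName         ' looked up when the line executes
    Debug.Print TypeName(Generic)        ' prints CStudent

    ' A misspelled member name on the generic variable would compile
    ' without complaint and fail only when the line is executed:
    '     Debug.Print Generic.FulName
End Sub
```

Thus, besides the performance penalty, generic object variables postpone the detection of some simple mistakes until run time.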

Implicit Object Creation
As we have seen, the second line in the following code

Dim Donna As CStudent
Set Donna = New CStudent

asks Visual Basic to create an object of type CStudent and assign Donna as a reference to that object. Visual Basic provides an interesting and often useful alternative to this explicit object creation, which we will refer to as implicit object creation.

Implicit object creation is done by replacing the two lines above with the following single line:

Dim Donna As New CStudent

The effect of this line is to declare Donna as a variable of type CStudent, but it does not immediately create an instance of the CStudent class. However, Visual Basic will create an instance of CStudent and point Donna to that instance the first time the variable Donna is used. For example, the line

Donna.FullName = "Donna Smith"

will cause Visual Basic to create a CStudent object (that is, assuming this is the first line in which the variable Donna is used).

We will see examples of the use of both explicit and implicit object creation when we discuss object properties a bit later in this chapter.

It is probably true that explicit object creation is the better programming practice, since we have precise control over when objects are created. Under implicit object creation, the point at which an object is created depends upon when it is first referenced. Thus, we might add a seemingly innocuous new line of code to a program that was written many months earlier, and thereby unwittingly change the point at which an object is created. Generally, this is not a problem. However, as we will see, when an object is first created, Visual Basic fires a special event called the Initialize event for that object. If we have placed some time-sensitive code in that event, a change in the creation time of an object could potentially cause problems.
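To see when the object actually comes into being, we can place a line of code in the class's Initialize event, as in the sketch below. The Debug.Print messages and the procedure name TestImplicitCreation are our own additions for illustration; Class_Initialize is the event procedure that Visual Basic supplies for class modules.

```vb
'' (In the CStudent class module)
Private Sub Class_Initialize()
    Debug.Print "CStudent object created"
End Sub

'' (In a standard module)
Sub TestImplicitCreation()
    Dim Donna As New CStudent

    Debug.Print "Variable declared"      ' no object exists yet
    Donna.FullName = "Donna Smith"       ' first use: the object is created here
    Debug.Print "Property set"
End Sub
```

Running TestImplicitCreation prints "Variable declared", then "CStudent object created", and finally "Property set", which shows that the object is created at the first use of the variable and not at its declaration.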

On the other hand, we will see in the chapter on OLE automation that there are circumstances that require the use of implicit object creation. Thus, a good general working rule is to use explicit object creation (even though it takes a few extra lines of code) unless implicit object creation is required.

Referencing Public Variables and Procedures
For the sake of reference, let us pause to collect in one place the rules for declaring and referencing Public variables and procedures in Visual Basic. Note that the rules depend on the type of module in which the variable or procedure is defined.

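· Public variables can be declared in the General Declarations section of a module only - not within procedures. (Public procedures are likewise defined at the module level, since one procedure cannot be defined inside another.)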
· Public variables and procedures can be defined in any type of module (form, standard or class).
· A public variable or procedure defined in a standard module can be used throughout the project without qualification.
· A public variable or procedure defined in a form module must be qualified with the form name before it can be addressed. For example, if a form module named frmMain has a public variable or procedure named Pub, then the proper syntax for accessing this variable or procedure from other parts of the project is frmMain.Pub.
· A public variable or procedure defined in a class module must be qualified before it can be addressed. However, unlike the case for form modules, if a class named TestClass has a public variable or procedure named Pub, then we cannot address this variable or procedure as TestClass.Pub, for the simple reason that TestClass is not an object; it is a class (or template for an object). To qualify a public variable or procedure from a class module, you must first instance that class and then use the instance name as a qualifier, as in

Dim ctest As New TestClass
Call ctest.Pub


· You cannot define a public fixed-length string or array (or constant or Declare) in a form or class module.

Note that procedures and properties in different modules can have the same name. As long as the fully qualified names are different, there is no problem accessing the correct procedure or property. Some authors suggest that you should not use the same name in different modules, because it can cause confusion. However, this advice precludes taking advantage of an important programming technique known as overloading. In fact, Visual Basic does this very thing all the time, as in, for example, frmMain.ScaleHeight and frmOther.ScaleHeight.
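The qualification rules listed above can be summarized in a short sketch. The module name modGlobals and the variable AppTitle are hypothetical, as is the procedure TestQualification; frmMain, TestClass and Pub are the names used in the rules. Note that the form and the class each declare a member named Pub without conflict.

```vb
'' (In a standard module named modGlobals)
Public AppTitle As String

'' (In the form module frmMain)
Public Pub As Integer

'' (In the class module TestClass)
Public Pub As Integer

'' (In any other module of the project)
Sub TestQualification()
    Dim ctest As TestClass
    Set ctest = New TestClass

    AppTitle = "Grades"      ' standard module: no qualifier needed
    frmMain.Pub = 1          ' form module: qualify with the form name
    ctest.Pub = 2            ' class module: qualify with an instance variable

    Debug.Print frmMain.Pub  ' prints 1
    Debug.Print ctest.Pub    ' prints 2
End Sub
```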