Introducing new form of parametric polymorphism
in object oriented programming languages
Author: Iancu Mihai Cãputã
Mentor: dr. Simona Claudia Motogna
[Nowadays software development tools have to provide effective means of data manipulation with minimal development time. As types represent the meaning of raw data, this paper focuses upon taking types to another level in an object oriented dynamically type-safe programming language, to increase language flexibility and productivity.]
CONTENT:
1. Background
2. Introducing type unbound variables
3. Type unbound variables and parametric polymorphism
4. Implications of type unbound variables
4.1 Type safety
4.2 Strong typing
4.3 Limitations
4.4 Thread safety
4.5 Lifetime and variable storage
6. Future work
7. References
Mainstream object oriented programming languages, such as .NET languages or Java, support both kind of type-checks (statically and dynamically) in order to increase productivity and to allow the programmer to write code in a non-intrusive manner.
Static type checks or static typing means that the type–checker, which may or may not be part of the compiler, performs an analysis over the source-code at compile time to ensure that certain type-constraints are not being violated.
In C#, for instance, static type checks are made when resolving the overloading of methods, or when performing an upcast (casting from derivate type into base type).
Dynamic type checks (aka. Runtime type checks) characterizes a dynamically typed programming language which is one where type constraints are being checked at runtime, before the usage of a variable/object of a certain type, in a given context (function call, variable assignments, casting, etc).
In C#, dynamic type checks are made when a downcast (casting from base type into a derivate type) is performed, in order to ensure that the backing object that is cast into the derivate type, is instance of the derivate type or of another type that derives from that derivate type. This check can only be done at runtime, except some scenarios in which a smart compiler can figure out that the cast is possible.
By taking the dynamically-typing even further, productivity can be increased considerably. This can be achieved by introducing the concept of “Type-unbound variables” which induces a new form of parametric polymorphism and exemplify it on a programming language prototype, called X Language or shortly X. It targets the .NET framework 2.0 and the syntax is almost identical to C# 2.0’s. This language is under development.
Also in this paper I will often make references to C#, which is very popular amongst developers, but most of these references are traceable in most of the OOPL on the mainstream. (Java, Delphi.NET, C++.NET …)
2. Introducing type unbound variables
In C#, when you declare a variable, you must specify its type. This is a hint for the compiler so as it will be able to enforce type-safety. That variable will be bound to its type throughout its entire scope. This means that if you want to change its type, at runtime, you won’t be able to do that.
Type unbound variables refer to variables that aren’t bound to a certain type. At the moment i of execution the variable has the type Ti and at the moment i+1 it can have any other type, Ti+1 which is not necessarily ad-hoc polymorphic with Ti.
In almost every Object Oriented programming language, there is a system class, Type, which is used to manipulate the concept of “Type”. Information about types (system or user defined ones) is accessible at runtime throughout instances of this class. In C# there are mechanisms (Reflection [3]) for creating objects based on information held by these instances. This is the only way you can create instances of a variable type at runtime.
This is quite tedious and the usage of System.Type class in order to do that, does not intervene to the programmer in a natural manner. This class hints us more to a class schema or an object runtime inspector rather than to a usable type.
In X language, which implements the concept of type unbound variables, the behavior of System.Type class instances is similar to the one of a type identifier. We can use an instance of the System.Type class (which holds meta-informations about a type) just as if it were a type. [Figure1]
Figure 1: Usage of a type unbound variable (a)
…
System.Type T;
T = int;
T a = 10; //a is a valid integer with the value of 10
…
3. Type unbound variables and parametric polymorphism
Polymorphism is a programming language feature that allows values of different data types to be handled using a uniform interface.
Christopher Strachey identified in 1967, two fundamentally different kinds of polymorphism: ad-hoc and parametric.
Ad-hoc polymorphism is when the range of actual types that can be used is finite, and the combinations must be specified individually prior to use. In object-oriented programming this is realized through object-inheritance (objects of different types can be handled uniformly through an interface or a common base class called superclass).
Parametric polymorphism enables a function or a data type to be written generically so that it can handle values identically without depending on their type. For example, in Microsoft Visual C++ parametric polymorphism is realized with templates or in C# with generics. [ Figure 2 ]
Figure 2: C# Generics
…
List<int> myList = new List<int>();
myList.Add(1);
myList.Add(2);
myList.Add(3);
…
This kind of parametric polymorphism is resolved statically at compile time.
„In .NET 2.0, generics have native support in IL (intermediate language) and the CLR itself. When you compile generic C# server-side code, the compiler compiles it into IL, just like any other type. However, the IL only contains parameters or place holders for the actual specific types. In addition, the metadata of the generic server contains generic information.
The client-side compiler uses that generic metadata to support type safety. When the client provides a specific type instead of a generic type parameter, the client's compiler substitutes the generic type parameter in the server metadata with the specified type argument. This provides the client's compiler with type-specific definition of the server, as if generics were never involved. This way the client compiler can enforce correct method parameters, type-safety checks, and even type-specific IntelliSense. „
Juval Lowy - An Introduction to C# Generics [2]
In C# the abstract data type List is available as a generic type. When you use the class List to specify the type of a variable, or to derive from it, you have to specialize it by telling the compiler what kind of elements this list will handle.
Parametric polymorphism comes inherently in X when using an instance of System.Type class as a type identifier for a method’s return type, a method’s argument or class member. [Figure 3]
Figure 3: A generic type (parametric polymorphism)
class GenericList
{
protected System.Type itemType;
public ItemType
{
get { return this.itemType;}
set {this.itemType = value;}
}
public GenericList(System.Type itemType)
{
ItemType = itemType;
}
public void Add(ItemType item)
{
//...
}
public void Remove(ItemType item)
{
//...
}
public ItemType operator [](int index)
{
//...
}
…
}
Parametric polymorphism comes from the fact that by changing the ItemType of the GenericList instance, it will handle different lists of items, without having to change any code.
When specializing this kind of “genericity” (by supplying a valid System.Type instance for each Type variable in the class), you will not create a new type, but a new behavior.
Suppose that you have a class, named MyClass, written by a third party. You want to create a proxy for MyClass, named MyClassProxy which delegates all the methods calls to a Remote method call server[this illustrates the design pattern Proxy, and is often used in RMI (remote method invocation), RPC(remote procedure call) ]. MyClassProxy looks identical in terms of method signatures to MyClass, but those types are not ad-hoc polymorphic since instances of these types cannot be treated uniformly through an interface or a base class that exposes their methods. Suppose that you have to write down code that dynamically decides whether it uses objects of MyClass or objects of MyClassProxy and does a certain task. In order to achieve that, in C# 2.0, you’ll either have to the write the code that does the job, twice, firstly for MyClass, and secondly for MyClassProxy, or as an alternative you will have to extract that code into a generic method, but that is a bit intrusive, and sometimes it is not quite straight-forward.
In X programming language the solution comes from the usage of the parametric polymorphism in the form of type-unbound variables. You declare a variable of type System.Type, and you fill it with MyClass or MyClassProxy accordingly. Then you use that variable to declare the instances of MyClass/MyClassProxy, and operate with them just as if you don’t have to decide whether to use MyClass or MyClassProxy.
[Figure 4]
Figure 4: Instance of a variable type (obj)
…
System.Type T;
if (bUseProxy) //we need to use the proxy class
{
T = MyClassProxy;
}
else
{
T = MyClass;
}
T obj; // create an instance of MyClass or MyClassProxy depending on the previous
decision
… // use obj no matter what type underlies it
4. Implications of type unbound variables
When implementing this new form of parametric polymorphism several aspects must be kept in mind regarding: type safety, strong typing, thread safety and limitations of type unbound variables.
- For each variable that is about to be used, at the moment of execution, the underlying type must be known, otherwise an exception will be thrown.
- Whenever a method argument is a type unbound variable, that method’s overloading resolution must be done at runtime
- Member access of a type unbound variable is resolved at runtime hence all the validations upon that member must be done at runtime
- Method calls of type unbound variables are late bound
- Each operator is subjected to all the constraints methods are subjected to
- When changing the underlying type of a type-unbound variable, a policy vis-à-vis the current value of the variable needs to be adopted: Since every type has a default value, the value of the variable will be reset to this default value.
-Variable types (variables of type System.Type) cannot be used in class hierarchies definition (cannot be used as base types)
-Variable types cannot be used to define delegate types (function pointer types) [Figure 5]
Figure 5:
class D
{
}
class A
{
Type typeVar = D;
class C:typeVar //this is not allowed
{
}
delegate typeVar myDelegate(D b, A a) ;// this is not allowed
delegate int myDelegate(typeVar b, A a);//this is not allowed either
}
- Type instances cannot be used to define entities (variables/members) of a larger domain of visibility.
ex:
class C
{
private Type
memberType;
public memberType member1; //this is not allowed since ‘member1’ has a
//larger domain of visibility than ‘memberType’
protected memberType member2; //this is not allowed since ‘member2’ has a
//larger domain of visibility than ‘memberType’
}
System.Type instances can be regarded as shared resources once they were used to create instances of a type. The problem of thread safety arises here.
Suppose that a type-unbound variable ‘V’ is used by thread ‘Th1’, and thread ‘Th2’ wants to change the underlying type of ‘V’. The system should expose a mechanism through which the programmer could be able to ensure thread-safety.
In order to ensure this, the System.Type class from X, exposes a variable counter, which indicates the number of type unbound variables of that type. When the variable counter is 0, the type can be changed without causing any havoc.
Please note that this variable counter is different than the reference counter which indicates how many instances of that type are being referenced at a moment of execution.
4.5. Lifetime and variable storage:
The lifetime of a Type-unbound variable is not determined by the lifetime of the type instance that was used to define it.
Example:
{
Type T;
T a = new T();
StartThread(a); //start a thread and pass ‘a’ as parameter
}// when T runs out of scope, the instance that was passed to the thread doesn’t get //garbage collected
The type-unbound variable storage is subjected to all the policies, type-bound variables (the regular ones) are subjected to.
Firstly, it is noticed that type-unbound variables are not “syntactic sugar”, in order to implement this feature several extensions must be supported by the core of the virtual machine and covered by the design of the compiler.
There are plenty of ways to implement such feature, and many performance- related policies can be applied.
Implementations and
In the following piece of code I’ll try to exemplify how this feature, one can implement. The code is written in C++, and covers only the surface of the concept: how to implement instructions in the virtual processor, which provide method invoking and member access of type-unbound variables.
//This structure holds metainfos about a method,
//this kind of metainfos are also usefull for reflecting
//upon a method.
struct Method
{
void* address; //the address of the method, this is determined at load-time
bool isVirtual;//this field specifies wheter or not the method is virtual
//... //any other metainfos
};
//This struct holds metainfos about a type which is synonim with class
struct Type
{
Method** methodes;//the methods, the type contains, can be organized as a dictionary
//that maps a method hash to a Method
};
//This structure contains information about an Instance of a class.
//If the type member of this structure can be chagned, we are
//talking about a type unbound variable
struct Instance
{
Type* type;//the type of the instance (the class to which it belongs)
void* vft;//virtual function table
void* storage;//data storage
//...//other related infos
};
// this function represents the call instruction supported by the virtual processor.
void call(void* method)
{
}
// this function resolves the collision of methodes that have same methodHash (same //name).
Method* OverloadingResolution(Type* type, unsigned long methodHash, Instance** params)
{
}
//This function represents the extended call instruction suported by the virtual //processor.
//It perfoms a dynamic call.This means that the method cannot be determined at compile //time, and it has to be searched through the methods of the underlying type, by the //methodHash.
//This hash is determined at compile time by applying some sort of a hash function
//on the method name.
void xcall(Instance* _this,unsigned long methodHash, Instance** params,unsigned int paramCount)
{
void* methodAddress = 0;//the method address that needs to be determined
//resolve overloads collision
Method* methodInfo = OverloadingResolution(_this->type,methodHash,params);
if (methodInfo==0)
throw new Exception();//the type does not contain a definiton for that method
if (methodInfo->IsVirtual)//if the method is virtual, then we need to lookup its
//adress into the virtual function table, (still by its //hash)
{
methodAddress = _this->vft[methodInfo];
}
else//if the method is not virtual then its address is the one held by the
//methodInfo structure
//returned by the method overloading
{
methodAddress = methodInfo->address;
}
//perform a "’this’ call”
push(_this);
for (int i=0;i<paramCount;++i)
{
push(params[i]);
}
call(methodAddress);
}
//This function provides access o a member , it returns its address from a specific storage
void* xMemberAccess(Instance* _this, unsigned long memberHash)
{
//perform a look-up by the memberHash, into the Type of Instance, to find the
//address of the member into the Instance storage
//this access is type checked, which means that if the Tpye does not contain a
//definiton
//for the specified member, an exception will be thrown.
}
Firstly I intend to develop the X Language, to illustrate this concept and several others, and then to define type constraints that will allow the static type-checker to resolve some validations which are currently done by the runtime type-checker.
I would like to evolve the intellisense of the development environment X Language will integrate in, so as it will work in the case of type-unbound variables and variable types.
[1] L. Cardelli and P. Wagner: On Understanding Types, Data Abstraction and
Polymorphism, Technical Report CS-85-14, Brown University, Department of Computer Science, 1985
[2] Juval Lowy: An Introduction to C# Generics, Visual Studio 2005 Technical Articles,
2005 : http://msdn2.microsoft.com/en-us/library/ms379564(vs.80).aspx
[3] .NET Framework Developer’s Guide about reflection: http://msdn2.microsoft.com/en-us/library/cxz4wk15.aspx
[4] Philip Wadler, Stephen Blott: How to make ad-hoc polymorphism less ad hoc, from
Proc. 16th ACM Symposium on Principles of Programming Languages, (January, 1989)