Essence# Syntax: Self Expressions

A self expression is essentially a (possibly cascaded) message sent to self, except that the pseudo-variable “self” is omitted. The value of the “invisible self” that receives the message is established by configuring the compiler to set the pseudo-variable self to the desired value during execution of the self expression.

A self expression is allowed just one message chain having just one message receiver–and the receiver must not be specified syntactically.

Conceptually, the syntactically unspecified–and therefore implied–receiver of the message(s) in the message chain of a self expression is the pseudo-variable self. That’s because the same compiler infrastructure used to set the pseudo-variable self to the correct value during the execution of a method or block is also used to set the value of the implied, unspecified receiver of the message(s) in the message chain of a self expression, and also because a self expression is parsed by the parser simply by adding an actual parse node for the pseudo-variable self as the receiver of the message(s) in the message chain of the self expression. That’s possible because the parser implements self expressions as their own “root parse node” or “grammatical start symbol.”

Any syntactically-valid expression which sends a message to an operand can be converted into a self expression simply be removing whatever syntactical construct is the receiver of the message (a.k.a, the leftmost operand in an expression.) The following examples show two pairs of expressions, where the first member of the pair uses self expression syntax and the second one does not:

Configuring a class

"Using 'self expression' syntax:"

        superclass: Object; 
        instanceVariableNames: #(red green blue)

 

"Using 'expression' syntax:"

        Color 
                superclass: Object; 
                instanceVariableNames: #(red green blue)

 

Adding methods to a class (using method literals to define the method):

"Using 'self expression' syntax:"

        protocol: #accessing 
        method:
                [## red
                        ^red
                ]

 

"Using 'expression' syntax:"

        aClass
                protocol: #accessing 
                method:
                        [## red
                                ^red
                        ]

 

Prior Art

The inspiration for self expressions comes from Tirade, a data representation language invented by Göran Krampe. The main reason that Essence# uses self expressions instead of Tirade is simply because, once you have a full parser/compiler, it is significantly simpler to implement self expressions than to implement Tirade. Given a full compiler, implementing self expressions only involves adding a few, relatively small methods to the parser and compiler. And it also technically doesn’t require adding new syntax to the language, since self expressions only use syntactical forms that would have to exist in any case; the only innovation is to permit simpler syntax that omits an otherwise-required syntactical construct. So self expressions were “the simplest thing that could possibly work.”

That said, Tirade would be a much superior solution to the problem of programming-language neutral data interchange. But that’s not the problem that self expressions are intended to solve.

Essence# Syntax: Method Declarations

method declaration is the syntactical construct used to define a method, including its name and its logic. In Essence#, the name of a method is typically referred to as its method selector; or even just its selector.

The EBNF for a method declaration:

MethodDeclaration   = [MethodHeaderToken, [ClassSpecification]], 
                      MethodHeader, ExecutableCode;
MethodHeaderToken   = OptionalWhitespace, "##", OptionalWhitespace;
MethodHeader        = UnaryMethodHeader | 
                               BinaryMethodHeader | 
                               KeywordMethodHeader;
UnaryMethodHeader   = UnaryMessageSelector;
BinaryMethodHeader  = BinaryMessageSelector, OptionalWhiteSpace, BindableIdentifier;
KeywordMethodHeader = KeywordMethodHeaderSegment, 
                      {Whitespace, KeywordMethodHeaderSegment};
ClassSpecification  = QualifiedIdentifier,OptionalWhitespace,">>", OptionalWhitespace;

 

At the beginning of a method declaration, the parser will recognize “##” (two immediately-adjacent hash characters) as a lexical token called a method header token, and will interpret the token to mean “what follows is a method header.”

Whitespace is permitted, but not required, both preceding and following the method header token.

A method header token may optionally appear as the first token of any method declaration; but it is not required if the method declaration is the root of the parse tree.

However, if a method header token occurs as the first (non-whitespace) token following a “[” token, then the parser will have no choice but to interpret what follows as the remaining tokens of a method literal (which must be terminated by a “]” token eventually.) And in that case, the “##” (method header) token may be required:

If the initial token following a “[” (BlockBegin) token is either a binary message selector or a keyword, then the source code enclosed within the “[” and “]” tokens will be parsed as a method declaration even though there is no leading method header token, and the entire construct (including the “[” and “]” tokens) will be interpreted as a method literal, and not as a block literal. So in either of those cases, no method header token is required.

However, if the first token following a “[” token is a unary message selector (which might instead just be a variable name,) and if there is no leading method header token in between the “[” token and the unary message selector, then the parser will not interpret the construct as a method literal, but will instead interpret it as a block. So, when a method declaration occurs as part of a method literal, and said method declaration has a unary method header,  the only way to get the parser to interpret the construct as a method literal, and not as a block, is to use a method header token as a prefix to the unary method header.

Note: The method header token will also be required as a prefix to a binary method header, if the binary selector is the “|” (vertical bar) token. That constraint is required in order to avoid syntactical ambiguity, due to the fact that a vertical bar token may also be the initial token of a variable declaration list.

If and only if a method header is preceded by the “##” (method header) token, the name of the class which is to be used as the environment for binding variable references when compiling the method may be specified preceding the method header. But in that case, the token “>>” must then be used as a separator between the class name and the method header.

Whitespace is permitted but not required in between the class name and the “>>” token, and in between the “>>” token and the method header.

Following the method header there must be an executable code construct. An executable code construct defines the method’s logic.  Colloquially, an executable code construct is referred to as a method body. A method bodyhas the same exact syntactical structure as a block body.

There are three different types of method header: A unary method header, a binary method header and akeyword method header.

A method declaration with a unary method header must be invoked using a unary message.

A method declaration with a binary method header must be invoked using a binary message.

A method declaration with a keyword method header must be invoked using a keyword message.

Examples of method declarations using all three types of method header are shown below (none of which have a method header token as a prefix):

Method declaration using a unary method header:

printString
        | stream |
        stream := String new writeStream.
        self printOn: stream.
        ^stream contents

 

Method declaration using a binary method header:

@ y
        ^Point x: self y: y

 

Method declaration using a keyword method header:

displayOn: aGraphicsContext at: aPoint
        aGraphicsContext displayString: self at: aPoint

 

If a method header token (“##”) precedes it, then a class specification construct may optionally precede any of the three types of method header. If present, the class specified by the class specification will be used by the compiler as the behavioral context in which the method will be compiled. In other words, the instance variables defined by the specified class, the class variables defined by the specified class and the global variables imported by the specified class will be used to bind any variables referenced by the method that aren’t either method parameters or local variables.

Here are the same three method declarations constructed to have an optional method header token and class specification construct as a prefix to the method header:

Method declaration using a unary method header and an optional class specification:

## Object>>printString
        | stream |
        stream := String new writeStream.
        self printOn: stream.
        ^stream contents
 

 

Method declaration using a binary method header and an optional class specification:

## Number>> @ y
        ^Point x: self y: y

 

Method declaration using a keyword method header and an optional class specification:

## String>>displayOn: aGraphicsContext at: aPoint
        aGraphicsContext displayString: self at: aPoint

 

Any method declaration (whether it uses a unary method header, a binary method header or a keyword method header, and whether or not it uses an optional class specification) may optionally begin with a method header token. The reason the method header token is optional is because its purpose is either to separate one method declaration from another in a sequence of method declarations, or else to distinguish a method literal from ablock. Outside of those two cases, it has no purpose, function or meaning. Its presence or absence has no effect on the semantics of the method.

Here’s an example showing two method declarations separated by an intervening method header token:

Duration>>asDays
        ^self ticks / TicksPerDay

## 

Duration>>asHours 
        ^self ticks / TicksPerHour

Essence# Syntax: Method Literals

A method literal is a method declaration surrounded by enclosing syntax so that it can be embedded as a literal value in an Essence# expression.

The EBNF for method literals:

MethodLiteral       = "[", MethodDeclaration, "]";

 

A method literal must be enclosed between a single beginning “[” character and a single ending “]” character, making its syntax rather similar to that of a block. The key difference between the syntax of a block and the syntax of a method literal is that the construct that immediately follows the beginning “[” character must be unambiguously a method header. And that, in fact, is one of the reasons that the “##” (method header) tokenexists, and is required in some cases, but optional in others: When the “##” token is required, it’s because its absence would create syntactical ambiguity, such that it would not be possible for the parser to distinguish ablock from a method literal.

For more information on the syntax of a method declaration, please see the article on that topic.

Methods defined in Essence# class libraries declare methods as method literals, instead of as method declarations that are the root of their respective parse trees. Using method literals for that purpose obviates any need to encode method names as filenames; or alternatively, it obviates any need to define a special syntax for dealing with sequences of method declarations, or for syntactically embedding method declarations inside of class declarations. So there’s no need for a special “file in” syntax, nor any need for a special parser that can consume a special “file in format.”.

That said, the ANSI standard does require that a conforming implementation support the Smalltalk Interchange Format. Essence# does not currently support that format, but will do so before it leaves beta.

Using Class Specifications

A method declaration may optionally use a class specification construct–but only if a method header token is also used. That means a method literal may also use a class specification construct, since its syntax is defined as an embedding of a method declaration enclosed in between the tokens “[” and “]”.

The presence or absence of the class specification construct may change the behavior of the compiler:

If there is no class specification in the method header, then whether or not the compiler will attempt to bind non-local variable references depends upon how the compiler is invoked. If the compiler is not provided with a binding context for non-local variables when it’s invoked, and if there is no class specification in the method header to provide one, then the compiler won’t check whether any references to non-local variables might be undeclared (however, that check is always performed whenever a method is added to a class or trait.)

On the other hand, if the method header includes a class specification, then the compiler will always attempt to bind references to non-local variables, using whichever class is specified by the class specification construct as the binding context. In that case, any undeclared variables will be treated as compilation errors.

In other words, the compiler interprets the presence of a class specification construct in a method header as a command to verify that there would be no undeclared variables referenced by the method it’s compiling, if that method were to be added to the specified class. Conversely, it interprets the absence of a class specification as a command to defer any such checks until the method is actually added to a class or trait.

When compiling either self expressions or “do its” (initializers or scripts,) the compiler is not configured to provide any default binding context for method declarations–and therefore is also not configured to do so formethod literals. That’s because there’s no way to know a priori what the “right” binding context might be in such cases.

Since methods are checked for any references to undeclared variables when they are added to a class or to a trait (which is usually the proper time, because that’s when the right binding context is known absolutely,) there are no system integrity issues raised by this binding paradigm. And that’s why the method literals in “methods.instance” and “methods.class” files don’t use class specification constructs in their method headers. There’s no need, really.

However, there are compilation use cases other than compiling “methods.instance” and “methods.class” files. And some of those use cases do require that the compiler bind all variable references during initial compilation–which is why class specification syntax is present as on option for method headers.

Multiple Object Spaces In Essence#

What is an Object Space?

An object space is an object that encapsulates the execution context of an Essence# program. It is also responsible for initializing and hosting the Essence# run time system, including the dynamic binding subsystem that animates/reifies the meta-object protocol of Microsoft’s Dynamic Language Runtime (DLR.)

Any number of different object spaces may be active at the same time. Each one creates and encapsulates its own, independent execution context. The compiler and the library loader operate on and in a specific object space. Blocks and methods execute in the context of a specific object space. Essence# classes, traits and namespaces are bound to a specific object space. Even when a class, trait or namespace is defined in the same class library and the same containing namespace, they are independent and separate from any that might have the same qualified names that are bound to a different object space.

In spite of that, it is quite possible for an object bound to one object space to send messages to an object bound to a different object space. One way to do that would be to use the DLR’s hosting protocol. That’s because an Essence# object space is the Essence#-specific object that actually implements the bulk of the behavior required by a DLR language context, which is an architectural object of the DLR’s hosting protocol.

The C# class EssenceSharp.ClientServices.ESLanguageContext subclasses the DLR class Microsoft.Scripting.Runtime.LanguageContext, and thereby is enabled to interoperate with the DLR’s hosting protocol. But an instance of EssenceSharp.ClientServices.ESLanguageContext’s only real job is to serve as a facade over instances of the C# class EssenceSharp.Runtime.ESObjectSpace. And EssenceSharp.Runtime.ESObjectSpace is the class that reifies an Essence# object space.

So, if you are only interested in using Essence#, and have no interest in using other dynamic languages hosted on the DLR, there is no need to use a DLR language context in order to invoke the Essence# compiler and run time system from your own C#, F# or Visual Basic code. You can use instances of EssenceSharp.Runtime.ESObjectSpace directly. The only disadvantage of that would be that using other DLR-hosted languages would then require a completely different API (e.g, using an IronPython library from Essence# code requires using the DLR hosting protocol, and hence requires using a DLR language context).

The advantages of using instances of EssenceSharp.Runtime.ESObjectSpace directly would be a much richer API that is far more specific to Essence#.

You can get the object space for the current execution environment by sending the message #objectSpace to any Essence# class (even to those that represent CLR types.) And the Essence# Standard Library includes a definition for an Essence# class that represents the Essence#-specific behavior of instances of the C# class EssenceSharp.Runtime.ESObjectSpace. It’s in the namespace CLR.EssenceSharp.Runtime, and so can be found at %EssenceSharpPath%\Source\Libraries\Standard.lib\CLR\EssenceSharp\Runtime\ObjectSpace.

There are many ways that Essence# object spaces might be useful. One example would be to use one object space to host programming tools such as as browsers, inspectors and debuggers, but to have the applications on which those tools operate be in their own objects spaces. That architecture would isolate the programming tools from any misbehavior of the applications on which they operate–and vice versa.

To get additional insight into the concept of object spaces and how they might be used to good effect, the paper Virtual Smalltalk Images: Model and Applications is highly recommended.

Using Reflection On The Essence# Code Base

Just because there is as yet no Essence# GUI library, and therefore no native Essence# code browsing tools, doesn’t mean that the intrinsic reflecting capabilities of Essence# can’t be used. In fact, scripts are provided in the shared scripts folder that provide at least some of the functionality traditionally provided by code browsers:

ShowAllMethods: The ShowAllMethods.es script can be used to print out the names and declaring class or trait of all the methods of a class or trait. The subject class or trait must be passed in as an argument, as in the following example which will print out the names and declaring class or trait of all the methods of class Array to the Transcript:

es ShowAllMethods -a Array | more

ShowAllMessagesSent: The ShowAllMessagesSent.es script can be used to print out all the messages sent by each method of a class or trait. The output is cross-referenced by the sending methods, and each such method specifies the class or trait that declares it. The subject class or trait must be passed in as an argument, as in the following example which will print out the names of all the messages sent by each method of class Array to the Transcript:

es ShowAllMessagesSent -a Array | more

ShowAllSenders: The ShowAllSenders.es script can be used to print out the names and declaring class or trait of all the methods in the object space that send a specified message. The subject message selector must be passed in as an argument, as in the following example which will print out to the Transcript the names and declaring class or trait of all the methods in the object space that send the message do:

es ShowAllSenders -a #do: | more

ShowAllSendersInHierarchy: The ShowAllSendersInHierarchy.es script can be used to print out the names and declaring class or trait of all the methods of a specified class that send a specified message. The subject message selector and the subject class or trait must both be passed in as arguments, as in the following example which will print out to the Transcript the names and declaring class or trait of all the methods of OrderedCollection that send the message do:

es ShowAllSendersInHierarchy -a #do: -a OrderedCollection | more

ShowUnimplementedMessages: The ShowUnimplementedMessages.es script can be used to print out the names of all the messages sent by the methods of a specified class or trait to the pseudo-variable self for which the specified class or trait has no implementing methods. The subject class or trait must be passed in as an argument, as in the following example which will print out to the Transcript the names of any messages sent to self by the class OrderedCollection that send messages for which OrderedCollection has no implementing methods:

es ShowUnimplementedMessages -a OrderedCollection | more

Note: Classes that represent CLR types typically have virtual Essence# methods that don’t need to be formally declared, because the Essence# dynamic binding system will automatically bind to and invoke the methods of a CLR type, provided those methods have less than two parameters. Messages sent in order to invoke such methods of CLR types will unavoidably show up as “unimplemented messages” when using the ShowUnimplementedMessages.es script.

ShowTraitUsageConflicts: The ShowTraitUsageConflicts.es script can be used to print out the name and declaring trait of all methods which were excluded from a trait usage expression due to the fact that methods with the same selectors were declared by two or more of the traits combined in a trait usage expression. The subject class or trait must be passed in as an argument, as in the following example which will print out to the Transcript methods excluded from the trait usage of ReadStream because two or more of the traits used by ReadStream had the same method selector:

es ShowTraitUsageConflicts -a ReadStream | more

 

Not My Type: Dealing with CLR types in Essence#

There are three issues that you will encounter when using Essence# that involve CLR types:

  1. Obtaining a CLR type as a value that can be stored in a variable, passed as a parameter or sent messages;
  2. Converting a value from one CLR type to another; and
  3. Creating instances of CLR generic types.

Obtaining A CLR Type As An Essence# Value

The Essence# class System.Type (in the Essence# namespace CLR.System, which corresponds to the .Net namespace System) has many class messages that answer commonly-used CLR types: For example, the expression Type string evaluates to the .Net object that represents the .Net type System.String and the expression Type timeSpan evaluates to the .Net object that represents the .Net type System.TimeSpan. [Note: Those messages are unique to Essence#; the actual .Net class System.Type doesn’t support them. You can add Essence#-specific instance methods or class methods to any .Net class or struct; they just won’t be available outside of Essence#.]

Of course, if you have an instance of the type, you can just send it the message #getType (which is a message to which all .Net objects will respond.) Another way to get a CLR type is to send the message #instanceType to the Essence# class that represents CLR values having that type.

If the CLR type can’t be obtained using one of the techniques explained above, the fallback is to encode the assembly-qualified name of the type in a string, and then send the string the message #asHostSystemType, as shown in the following example:

'System.IO.ErrorEventArgs, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'
        asHostSystemType

In most cases (but not all,) the #asHostSystemType message will fail if the fully-qualified name of the assembly is not provided. Two exceptions to that would be when the type is in the mscorlib assembly or in the EssenceSharp assembly.

CLR Type Conversion

Formally–in other words, in theory–Essence# is dynamically typed: The “type” of an expression is simply the powerset of all messages that can be sent to the expression without causing a method-binding error. And that is also true in practice, in that the compiler permits any message to be sent to any expression, and in the fact that the compiler permits any expression to be assigned to any variable or passed as any argument in any message, and in the fact that typing errors are only discovered and reported at run time.

But that does not mean that any expression can be passed as any argument in any message to any receiver without causing any run-time type errors, even in cases where the only messages that will be sent to the value of an argument won’t result in a method binding error. That constraint does hold in most other Smalltalk implementations, and it does hold when the method that will be invoked by the message is an Essence# method. But it does not hold when the “method” that will be invoked is a CLR method.

In Essence#, a CLR method is just a “user primitive”–although the “primitive” does not always need to be formally defined as a method in the Essence# source code, because the dynamic binding subsystem automatically binds Essence# messages to CLR methods that require less than 2 arguments, to CLR type constructors that require no arguments, and to any non-private properties or fields of a CLR type. And it is generally the case in all Smalltalk implementations that “primitive methods” will fail if the types of their arguments are not what they require–even if the primitive could have performed its intended operation simply by sending (using Smalltalk dynamic message dispatch) the right messages to the argument. Primitives generally don’t send messages to their arguments using Smalltalk dynamic message dispatch. And the methods of CLR types certainly do not.

Fortunately, the Essence# dynamic binding system will–if it can–automatically do the required type conversions for you, such as number type conversions, String/Symbol conversions, and any conversions defined by implicit or explicit conversion operators defined by a CLR type. It will even convert Essence# blocks into CLR functions with the required parameter type signature.

But the Essence# dynamic binding system isn’t magic. It can’t handle all cases. It’s simply not possible to convert any type into any other type. And even when it is, the run time system may not have the information required to do it correctly. And in other cases, the logic to do the required conversion simply hasn’t been implemented, for one reason or another.

CLR Array Types

One common case where a conversion might be possible, but the dynamic binding system doesn’t attempt to do it, involves arrays. From the point of view of the CLR, an Essence# array is an array whose elements have the type System.Object (an instance of the Essence# class Array,) or the type System.Byte (a ByteArray,) or the type System.UInt16 (a HalfWordArray,) or the type System.UInt32 (a WordArray,) or the type System.UInt64 (a LongWordArray,) or the type System.Single (a FloatArray,) or the type System.Double (a DoubleArray,) or the type System.Decimal (a QuadArray) or the type String (a Pathname.)

Unfortunately, there are many methods of CLR types that require an array having elements of some type other than the ones supported by Essence#. If the CLR method parameter requires an array whose element type is not the same as that of the corresponding argument at run time, then the binding to the method will fail.

Fortunately, if the conversion of the array to one with the required type is possible (because all the elements can be converted to the required CLR type,) you can code that conversion yourself in Essence#. The following example shows how (this is not new functionality; it just hasn’t been documented/explained):

| clrSourceArray elementType arrayType argument|
		
"Step 1: Convert the Essence# array into a CLR array:"
clrSourceArray := #('one' 'two' 'three') asHostSystemArray.

"Step 2: Get the desired element type of the array to be passed as an argument:"
elementType := System.Type string.

"Step 3: Create a CLR array type with the necessary element type:"
arrayType := elementType makeArrayType.

"Step 4: Create the CLR array having the correct size and element type:"
argument:= arrayType new: clrSourceArray size.

"Step 5: Copy the elements of the source array into the array that will be used as an argument:"
1 to: clrSourceArray size do: [:index | argument at: index put: (clrSourceArray at: index)].
^argument

Or you could just send the message #asHostSystemArrayWithElementType: to the Essence# array, as in the following example:

#('one' 'two' 'three')
        asHostSystemArrayWithElementType: System.Type string

CLR Generic Types

The CLR has three types of “generic type”: Open generic types, partially-open generic types and closed generic types. An open generic type is also called a generic type definition. An open generic type is one where all of its generic type parameters remain “open” because none of them have been bound to a specific type argument. A partially-open generic type is one where some, but not all, of the type’s generic type parameters have been bound to specific type arguments. A closed generic type is one where all of the type’s generic type parameters have been bound to specific type arguments. For the most part, a partially-open generic type is essentially the same as an open generic type. The distinction that really matters is between closed generic types and ones that have generic type parameters that aren’t bound to a specific type.

It’s possible to create instances of closed generic types, but it is not possible to create instances of open or partially-open generic types. And that can be a problem, because .Net class libraries typically only provide generic type definitions that define open generic types. So, although you can define an Essence# class that represents a .Net type that is an open generic type definition, you won’t be able to use that Essence# class to create instances of the type by sending it the normal instance creation messages (e.g, #new or #new:). Creating instances of such a type requires providing one or more types that will be used as the type arguments to construct a closed generic type from the open generic type defined by the .Net class library.

Fortunately, you can use Essence# to construct a closed generic type from an open generic type definition, as illustrated in the following example:

| openGenericDictTypeDefinition esClass closedGenericType |
openGenericDictTypeDefinition := 
        'System.Collections.Generic.Dictionary`2' asHostSystemType.
esClass := openGenericDictTypeDefinition asClass.
closedGenericType := esClass 
                        instanceTypeWith: Type string 
                        with: Type string.
dict := closedGenericType new.
dict at: #foo 
	ifPresent: 
                [:value | 
                        System.Console 
                                write: 'The value at #foo is '; 
                                writeLine: value
                   ] 
	ifAbsent: 
                [
                        System.Console writeLine: '#foo is not present'
                ].
dict at: #foo put: #bar.
dict at: #foo 
	ifPresent: 
                [:value | 
                        System.Console 
                                write: 'The value at #foo is '; 
                                writeLine: value
                   ] 
	ifAbsent: 
                [
                        System.Console writeLine: '#foo is not present'
                ].

The Essence# class that represents a CLR type can be obtained by sending the message #asClass to the CLR type object.

If an Essence# class represents a generic type (whether the type is open, partially-open or closed makes no difference,) then you can create a closed generic type by sending one of the following messages to the Essence# class: #instanceTypeWith: aCLRType, #instanceTypeWith: aCLRType with: aCLRType, #instanceTypeWith: aCLRType with: aCLRType with: aCLRType ,  , #instanceTypeWith: aCLRType with: aCLRType with: aCLRType, #instanceTypeWith: aCLRType with: aCLRType with: aCLRType with: aCLRType with: aCLRType or #instanceTypeWithAll: anArrayOfCLRTypes.

One good way to handle the case where the .Net type you want to use is an open generic type definition would be to define a subclass of an Essence# class that represents that type, and then use one of the messages above to construct a closed generic type that will be the instance type of the subclass.

Another good way to do it is simply to use the CLR’s syntax for closed generic types when specifying the instance type of the subclass. The following examples show both approaches:

	"Class creation/configuration using the Essence# #instanceTypeWith: message"
	
	| superclass class |
        superclass := 'System.Collections.Generic.List`1, mscorlib'
                                asHostSystemType asClass.
        class := Class new.
	class 
                superclass: superclass;
                instanceType: (superclass instanceTypeWith: System.Type object).

	"Class creation/configuration using the CLR's syntax for closed generic types"
	| class |
        class := Class new.
	class 
                assemblyName: 'mscorlib';
                hostSystemNamespace: #System.Collections.Generic;
                hostSystemName: 'List`1[[System.Object]]'

Invoking .Net methods that have “out” or “ref” parameters

The latest release of Essence# introduces the ability to invoke .Net methods that have “out” or “ref” parameters. This post will explain what those are, why supporting them is a technical challenge, and how to pass such parameters to .Net methods that use them.

Conceptually, there are only two ways to pass parameters to functions: By value or by reference. Although there are (usually rather old or even ancient) programming languages that pass all parameters “by reference,” most programming languages pass parameters “by value” in the default case. And some languages, such as Essence#, pass all parameters “by value.”

The following example will be used to illustrate the difference in the semantics:

var x = 3;
var y = f(x);

If the variable x is passed to the function f as a “by value” parameter, then it will be impossible for the function f to change the value of the variable x. However, if the variable x is passed to the function f as a “by reference” parameter, then the function f will be able to change the value of the variable x (although it may nevertheless refrain from doing so.)

Although there is more than one way to implement the two different ways of passing parameters, conceptually the difference is that, when arguments are passed “by value,” only the value of the argument is passed in to the function being called, whereas when arguments are passed “by reference,” the address of the argument (typically, the address of a variable) is passed in to the function being called. In the latter case, the called function may use that address to assign a new value to the variable–although it need not do so.

In fact, many languages implement “pass by value” semantics by physically passing in the address of the arguments that are passed “by value,” but nevertheless disallowing assignment to function parameters. As Peter Deutsch famously said, “you can cheat as long as you don’t get caught.” However, neither the CLR nor the DLR’s LINQ expression compiler cheat in that way: Unless a parameter is declared to require “pass by reference” semantics, arguments are passed by copying their values into the activation frame of the function being invoked.

So the problem for languages implemented on the CLR, such as Essence#, that have no syntax for specifying that parameters will be passed “by reference,” is that there’s no straightforward way to invoke the methods of CLR types that have any “pass by reference” parameters, where the intended usage is that the called method will, as part of its normal operation, assign a new value to an argument that is passed by reference.

And there’s an additional complication: Some methods that use “pass by reference” parameters will neither need nor use the initial value of the parameter, because they only set its value, and never “read” it. Conversely, others will use the initial value to compute and set a new value for the argument. And yet others may do one or the other with the same parameter, depending on the state of the receiver and the value of other parameters.

The C# language attempts to reduce the ambiguity regarding the usage model of “by reference” parameters by requiring that the method definer use different syntax to declare parameters, depending on whether the method will or will not use the initial value of a “by reference” parameter to compute and then set the argument’s new value. In the C# syntax, a parameter that is passed by reference must have one of two keywords as a prefix: out or ref.

The C# out keyword causes the compiler to require that the method that declares the parameter must first assign a value to the parameter before the parameter can be used–and before the method can return. That makes it impossible to use or access the initial value of the parameter. In contrast, the C# ref keyword causes the compiler to require that whoever invokes the method must first assign a value to the parameter.

But C# is not the only language used on the CLR, and the CLR itself does not enforce any constraints on the usage of “by reference” parameters. A .Net language’s compiler may do so, but such is not required by the CLR. That means there may exist .Net types that have methods that use “pass by reference” parameters that do not abide by the rules of C#.

The reflection metadata provided by the CLR does reveal, for each method parameter, whether it is a “pass by value” or a “pass by reference” parameter.  It may also reveal whether the intended usage of a “pass by reference” parameter is as a C#-style out parameter. But compilers are not required to provide that particular piece of metadata. The parameter may be used as an out parameter even if the reflection metadata makes no such claim. The only information provided by the reflection metadata that must be accurate is whether or not the parameter requires that its corresponding argument be passed “by value” or “by reference.”

So for now, Essence# deals with .Net method parameters that require that arguments must be passed by reference in one of two ways:

If the actual argument at runtime is anything other than a block or delegate, the dynamic binding subsystem will create and initialize a new variable that will be passed as the argument. Unless the reflection metadata claims that the parameter is an “out” parameter, the variable will be initialized to have the same value as the argument specified by the invoking code. The method will then be invoked with that new variable as the argument corresponding to the “by reference” parameter. When the method completes, the value of the created variable that was passed in as the actual argument will be assigned to the expression that represents the originally-specified argument. That assignment may cause an exception, and it may fail to update the original variable. I’m researching improvements, perhaps using the DLR’s Meta Object Protocol.

However, if the actual argument at runtime is a block or delegate, then the dynamic binding subsystem will create and initialize a new variable that will be passed as the argument (just as it does for the other case.) The method will then be invoked with that new variable as the argument corresponding to the “by reference” parameter (still essentially the same procedure as in the other case.) But when the method completes, the procedure is different: The one-argument block or delegate that was the originally-specified argument will be invoked, with the newly-created variable as its argument. That permits the caller to access the value of the “out” parameter, because its value will be the value of the block (or delegate) argument. Note, however, that the block or delegate must have an arity of one, and if it’s a delegate, its parameter type must be assignable from the parameter type of the invoked method’s parameter.

Example Usage

Let’s assume that we want to invoke the method Get as defined in the C# class named ValueModel shown below:

class ValueModel {
        private static readonly Object notAValue = new Object();

        protected Object value = notAValue;

        public bool HasValue {
                get {return value != notAValue;}
        }

        public void Set(Object newValue) {
                value = newValue;
        }

        public bool Get(out Object value) {
                if (HasValue) {
                        value = this.value;
                        return true;
                }
                return false;
        }
}

If we have an instance of the above ValueModel class in a variable named model (in an Essence# block or method,) we can invoke the Get method as follows:

[:model |
        | value |
        (model get: [:v | value := v])
                ifTrue: [value doSomethingUseful]
]

Or more simply, we could just do:

[:model | model get: [:value | value doSomethingUseful]]

Important: If the .Net method being invoked returns a boolean value, then the block will only be invoked if the .Net method returns true. But if the .Net method has a return type of void, or has a return type other than Boolean, then the block will always be invoked.

Appe’s New ‘Playgrounds’: Back To The Future, One More Time

So Wired thinks Apple’s Playgrounds development paradigm is revolutionary?

Not so much: Smalltalk programmers have been coding with analogous capabilities since before Steve Jobs ever saw his now famous demo at Xerox PARC.

The Smalltalk development environment now has close to 40 years of maturity and experience behind it. And its cool features from 1979 still haven’t all been added to the IDEs of other languages, to say nothing of the ones added since then.

Ref: http://www.wired.com/2014/07/apple-swift/

Static Typing And Interoperability

Static typing inhibits interoperability. It does so between class libraries and their clients, and also between programming languages.

The reason is simply because static typing violates encapsulation in general, and improperly propagates constraints that aren’t justifiable as architectural, design or implementation requirements in particular. It does so either by binding too early, or by binding (requiring) more than it should.

The reason that static typing violates encapsulation is because it fails to permit the object itself from being the sole arbiter of its own behavior, and the sole arbiter of the semantics of any operations that may be applied to it. It does that by forcing the programmer to break the veil of encapsulation of the object by directly referencing its class or type as the static type constraint of variables that will be permitted to contain objects of that type, instead of respecting the privacy of that information, which should remain hidden behind the wall of encapsulation that objects are supposed to provide. And then that information is used by the compiler to force the behavior of objects based on the static type constraint of the variables that reference them, as well as the semantics of the operations applied to them, thus violating encapsulation even more profoundly.

As usually implemented by widely-used programming languages, and therefore as typically used by program developers, static typing confuses the distinction between a class and the type that the class implements. And even in an ideal implementation in some language that almost no one will ever actually use in anger, it violates encapsulation by confusing the variables or parameters that reference a value with the class of the referenced value, the type of the referenced value, or (usually) both.

A class is not a type. It’s an implementation of a type.  Therefore, any system built on that false assumption is intrinsically broken at a very deep level. If you write code based on the idea that classes are types, you need to transform your thinking.

And a variable is not the value it references. Therefore, any system built on that false assumption is intrinsically broken at a very deep level.  The semantics of a variable is the architectural role(s) it plays in the algorithm that uses it. It is not the class of the value of the variable that the variable may be referencing at any particular time, nor the type that that class may be implementing at that particular time (the type that a class implements is not necessarily a static property of the class.)

As a concrete example, consider the usage of an array object in a statically-typed language which must hold objects of different types (or different classes, it makes no difference in this case,) and therefore must use Object as the static type constraint for the array’s element type (assuming there is no less-general common superclass.) Putting elements into such an array is easy, and raises no issues.  However, although it is equally easy to fetch elements from such an array, using the retrieved values for other than the most trivial purposes typically requires the programmer to over-constrain the retrieved value: The programmer is forced to constrain such values to being instances of some predefined type (which often also means constraining them to be instances of some predefined class,) before such values can be used in application-specific or domain-specific ways. The programmer must impose such constraints by casting the value to a specific type or class–which will fail if the programmer misjudges the type of the object, even though there would have been no failure had the programmer been allowed to use the value in the operations that were needed without having to cast the value to some specific type (because those operations would be valid for all the elements of the array, regardless of their type or class.)

Typically, one only needs to send one or two messages to values retrieved from a variable or from a collection. One usually does not need to send every message (or apply every operation) defined by a particular type or class. Therefore, requiring that a value must be an instance of a particular type or class is almost always requiring more than is necessary–and usually, far more. Imposing such unnecessary constraints inhibits interoperability because it reduces the generality of the code, reduces the generality of the entities the code defines, and reduces the generality of the the entities that the code can make use of.

And the violation of encapsulation bubbles up to higher-level entities: Classes, modules, packages, assemblies, class libraries and entire frameworks–which is how it inhibits interoperability, not only between class libraries or application frameworks and their clients, but even between different programming languages.

Static typing dramatically increases the coupling between components, because it not only exposes the specific entities that define and/or implement all the behavior of a value, it also requires that clients use precisely and only those specific defining or implementing entities, and typically prohibits the use of functionally/semantically equivalent alternatives that would have worked as desired. This is true even when interfaces are pervasively used as static type constraints (which is rarely the case in practice, and so should in all fairness be dismissed as a moot point,) and because clients must in any case use only the specific interfaces used and/or defined by the component providing the service they’d like to use.

Pervasively using interfaces as static type constraints, although it definitely helps, nevertheless remains a problem a) because interfaces typically require behavior that just is not needed every time one of their instances is used (which is almost always the case,) b) because interfaces defined by third parties–and the classes and methods that use them–generally cannot be changed in any way by their clients, and c) because it is possible (and even likely) that the various providers of “reusable” (ahem) components will separately define what are conceptually “the same” interfaces for the same purposes, which nevertheless won’t be type-compatible even if they have exactly the same names and require precisely all the same operations (messages, operators, etc.)

Why? Because “reusable” components from independent sources that use what are conceptually “the same” interfaces nevertheless cannot easily interoperate with each other, since two (or more!) interfaces that aren’t defined in the same namespace will be classified by the static type system as different, incompatible types, no matter how identical they are otherwise. That means that, for example, the function provided by class library A from source Alpha that transforms a video stream by applying a user-supplied function to the color of each pixel cannot be used with the function that performs the desired color transformation provided by class library B from source Beta, because the two class libraries each use their own “Color” interface, which is not type compatible in spite of having the exact same name and defining the exact same set of required operations with the exact same semantics.

And that’s a short–and incomplete–example of why and how static typing substantially reduces a programmer’s degree of freedom in combining and reusing code components from different, independent sources, relative to what it could be if programming languages used the open-world assumption instead of the closed-world assumption regarding which operations are valid. Operations should be assumed valid until that assumption is proven false, for reasons that are analogous to those that justify the assumption that a person accused of a crime is innocent until proven guilty, or those that motivate the assumption that the null hypothesis is true until proven false by actual experimental observation.

Most of human progress over the past few centuries is directly attributable to discarding the prior obsession with being able to present absolute proofs, and replacing that paradigm with what we now know and love as the scientific method, which is based on falsifiability instead of being based on provability. Our modern technology only exists because we gave up the idea that our models of the world must be provably correct, and instead started using the paradigm of falsifiability, as embodied by the scientific method.

The programming world is in severe need of a Copernican Revolution so that we can discard the ever-more complicated epicycles and contortions that the static typists go through to try to match the generality, reusability and productivity of dynamic programming languages.

Note: You can vote on this essay on Reddit

How To Develop Code For Essence# Using Your Favorite Smalltalk’s Development Tools

Essence# currently provides no support for any sort of GUI. Nor does it provide any code editing or code browsing tools. You might think that that means that the only way to develop code in Essence# is to use generic text editors, but such is not the case.

You could develop–and debug–your code using, for example, Squeak, Pharo or VisualWorks. That can be done because, unless your development environment’s Smalltalk compiler accepts (or worse, requires) non-standard syntax, the Essence# compiler can and will compile it. It will also accept/compile many of the common non-standard syntax extensions, such as qualified (“dotted”) names (VisualWorks, Smalltalk-X,) dynamic array literals (Squeak, Pharo) or even dynamic dictionary literals (Amber.)

So the heart of the issue is the differences in the standard libraries supported by other Smalltalk implementations. But that issue can be addressed by adding Essence# compatibility classes and methods to the environment where you develop your code, and by adding foreign-environment compatibility classes and methods to your suite of Essence# libraries/namespaces when you port your code to Essence#.

I expect a large body of such inter-Smalltalk compatibility code to be developed over time–some by me, and some of it contributed by others. In Essence# such special-purpose code modules don’t have to be installed by default. They can be partitioned into separate libraries that are only loaded when and as needed. So if you don’t need either the VASmalltalk or the Dolphin compatibility library, just don’t load them.

However, there is one obstacle in the path of this strategy: The file-out formats of other Smalltalk development environments are not at all the same as the Essence# class library format. Given that, it would actually be a lot of work to port code from VisualWorks, Squeak or Pharo to Essence#.

Or more correctly, it used to be a lot of work, because I’m using this post to publish and explain a solution to the problem: Code that will file out a class or class category from VisualWorks, Pharo or Squeak into Essence# class library format. The code can be obtained from the Essence# Tools Repository on GitHub. Although at present only those three development platforms are supported, it should not be too hard to figure out how to port one of them to another Smalltalk development platform. If you do so, please contribute it to the community.

The version of the code export tool for each of the three development environments is named EssenceClassLibraryExporter, and each has the same external API–which will be explained below. The one for Squeak is located here, the one for Pharo is located here, and the one for VisualWorks is located here. Warning: Don’t export code that you do not own, or that is not open source, unless you have the appropriate permission(s.) The VisualWorks class libraries published by Cincom are not open source, although the “Contributed” code may be (don’t assume; verify.)

After you’ve filed in EssenceClassLibraryExporter.st into Pharo, VisualWorks or Squeak, you use it by writing code in a workspace and executing it as a “do It.” Here’s what you need to know in order to do it correctly:

There are two key issues: 1) What code should be exported, and 2) what Essence# class library and namespace should it be exported to.

Because of the way the EssenceClassLibraryExporter works, it’s best to answer the second question first.

Selecting the target Essence# namespace

When exporting from VisualWorks, you probably should use the same namespace name in Essence# that you use in VisualWorks. But when exporting from a Smalltalk development environment that doesn’t have (or doesn’t typically use) namespaces, you’re faced with an architectural decision that you didn’t have to make until such time as you needed to export your code to Essence#. You can, of course, simply export your code to the Smalltalk namespace. But best practice would be to define and use a namespace in Essence# that’s specific to your project (unless you are deliberately extending a namespace authored by someone else.)

Namespaces in Smalltalk have always been controversial. There is a vocal and influential subset of the Smalltalk community that opposes them. So I am going to address that issue upfront:

First, I must point out that all ST80-compliant Smaltalk implementations that support “shared pools” actually do have and use namespaces. That’s precisely what a “shared pool” is: a namespace. Shared pools could be used to do all the things that namespaces do, but for the fact that the ST80-style compilers and code browsers generally used (with the notable exceptions of VisualWorks and Smalltalk-X, and soon Essence#) haven’t been designed to make such usage elegant and easy. But they could be.

And Smalltalk classes also act as namespaces, in that they can define “class variables” that can be used to change the meaning of references to variables defined in the SystemDictionary. And “shared pools” can have the same effect. Consequently, having formal support for modern, first-class namespaces adds no code comprehension or interpretation issues that were not already present even in Smalltalk-80 ca. 1983, as distributed by XSIS.

But the bottom line is that computer science long ago learned that different name binding scopes are necessary–and become ever more necessary as the size of the code base increases, and as the number on independent contributors increases. That’s why objects provide their own “namespaces” for their instance variables, so as to keep each object’s set of named instance variables separate and distinct from that of all other objects, even when the names of the variables are the same. It’s why each method defines and uses its own “namespace” of parameters and local variables–and why blocks do the same. The bottom line is that Essence# (and VisualWorks, and Smalltalk-X, and .Net, and most modern languages) provide namespaces that localize and encapsulate references to non-local variables for analogous reasons, to solve problems that are essentially the same, using a very similar solution, justified by issues and logic that are also largely analogous.

To the extent that multiple name binding scopes cause problems of comprehension or interpretation with respect to the meaning of named variable references, they do so universally for named variable references of all types and in all contexts: Block and method parameters, block and method local variables, function parameters and local variables, the fields of records/structs, the named instance variables of objects, the class variables and shared pools of ST80, the static variables of C++, C# and Java, and modules/assemblies/programs that contain multiple classes and/or functions.

If, on balance, programmers and the code bases they write get more benefit than harm from having separate name binding scopes for class variables, instance variables, parameters and local variables, why would that not also be the case for classes?  Is that not one of the key benefits provided by having separate program files in the case of file-based languages, and separate images in the case of image-based Smalltalk systems? If separate name binding scopes are bad when provided by formal, first-class namespaces, why are they not also bad when implemented by having different images, or by having different program files?

For those reasons, opposition to formal namespaces as first-class objects has always seemed to me to be the equivalent of opposition to compilers by old-school assembly-language coders, opposition to structured-programming by old-school FORTRAN and COBOL coders, opposition to OO by old-school procedural programmers and opposition to dynamic typing by those who think a variable is semantically equivalent to the object it references, and/or that a class is semantically equivalent to the type (concept) it implements.

I’ve used VisualWorks with namespaces for a decade and a half, and have never encountered the issues that opponents of namespaces say they fear.  Not once, not ever. That’s one reason it reminds me so much of opposition to dynamic typing.

Get over it. Move on: Formal namespaces as first class objects solve more problems than they cause. The proof is in the experience of those who’ve been using them for one or more decades now, and in the ever-increasing adoption of them in modern programming systems and languages.

Selecting the target Essence# class library

Conceptually, an Essence# class library is an independently-loadable code module. It roughly corresponds to what other languages/platforms call a “package,” “module,” “parcel” or “assembly” (and these, too, were historically opposed by some; thankfully, they “got over it.”)

It’s important to note that an Essence# class library can contain one or more Essence# namespaces, that the same Essence# namespace can occur (be defined and/or modified by) more than one Essence# class library, that namespaces are an essential and foundational aspect of the .Net platform, and that Essence# namespaces were designed to deal with the fact that namespaces are so central to the way that the .Net framework operates.

You partition your code base into one or more class libraries so that your code modules can be loaded when and as needed as logically-consistent and cohesive units of functionality. You partition your code into one or more namespaces in order to isolate the named entities in your code base from unwanted name clashes with other parts of your own code base, or from unwanted name clashes with code from third parties (of which you may have no knowledge or even interest.) That’s two separate concerns solved by two separate mechanisms (“objects,”) which is proper OO design.

Key point:  You probably should not export your code to the Essence# Standard Library, for the reasons explained in the Essence# documentation section on CodePlex, under the heading Handling multiple releases of Essence#. For one thing, it puts your work at risk of being overwritten when you merge with a new release of Essence# using the Essence# installation program. Using your own private Essence# class library avoids all risk of that happening, and provides several other important benefits that I won’t go into here.

Using the EssenceClassLibraryExporter

The following code snippet will export the class named MyClass to the Essence# class library MyClassLibrary in the Essence# namespace MyNamespace, assuming that MyClass is locally defined in the namespace Smalltalk (i.e., that’s the namespace in which the class is defined in the Pharo, Squeak or VisualWorks image you’re using):

EssenceClassLibraryExporter libraryPath: #('MyClassLibrary').
(EssenceClassLibraryExporter 
        exportingTo: 'MyNamespace' 
        from: Smalltalk)
                exportClass: MyClass

By default, the exported code will be placed in the current working directory of your Smalltalk image. So in the case of the example above, the folder representing/implementing the target Essence# class library would therefore be the current working directory of your Smalltalk image, and the target namespace would be implemented/represented by the folder MyNamespace, which would be created as a direct subfolder in the image’s current working directory.

To use a different folder as the target Essence# class library, send the message #libraryPathPrefix: aPathnameString to the class EssenceClassLibraryExporter before exporting your code, as in the following example:

EssenceClassLibraryExporter libraryPathPrefix: 
        '/Users/myUserName/Documents/Developer/EssenceSharp/Source/Libraries/'.

To export to a nested Essence# namespace, use a qualified named (“dotted notation”) for the target Essence# namespace name, as in the following example, which would export to the nested namespace MyRootNamespace.MyNestedNamesapce, at the absolute path ‘/Users/myUserName/Documents/Developer/EssenceSharp/Source/Libraries/MyClassLibrary/MyNamespace/MyNestedNamespace/’:

EssenceClassLibraryExporter libraryPathPrefix: 
        '/Users/myUserName/Documents/Developer/EssenceSharp/Source/Libraries/'.
EssenceClassLibraryExporter libraryPath: #('MyClassLibrary').
(EssenceClassLibraryExporter 
        exportingTo: 'MyNamespace.MyNestedNamespace' 
        from: Smalltalk)
                exportClass: MyClass

To export all the classes in a class category, replace the message #exportClass: aClass as in the examples above with the message #exportClassCategory: aSymbolNamingAClassCategory, as in the following example:

EssenceClassLibraryExporter libraryPathPrefix: 
        '/Users/myUserName/Documents/Developer/EssenceSharp/Source/Libraries/'.
EssenceClassLibraryExporter libraryPath: #('DevelopmentTools').
(EssenceClassLibraryExporter 
        exportingTo: 'SystemTesting.SUnit' 
        from: Smalltalk)
                exportClassCategory: #'SUnit-Core-Kernel'

If you use the Pharo version of EssenceClassLibraryExporter (in a Pharo image, of course,) then Traits will be handled correctly.