Meta User Manual
A programming environment allowing programmers to be more productive in implementing well-designed and well-tested software in a multitude of languages.
A collection of new languages augmenting pre-existing languages. Examples:
- <C++> is an augmented version of C++.
- <Python> is an augmented version of Python.
- <Java> is an augmented version of Java.
- <HTML> is an augmented version of HTML.
- <Gnuplot> is an augmented version of Gnuplot.
- ... more ...
An environment for unifying families of related languages. Examples:
- unifies object-oriented programming languages.
- unifies type-setting languages.
- unifies plotting/graphing languages.
- ... more ...
A generic mechanism for defining new languages:
- Meta provides a universal syntax for new languages that is structured, consistent, intuitive, concise, expressive, extensible, self-documenting, readable, and writable.
A means of adding useful features of individual base languages to the entire family of languages it belongs to.
The terms Meta-language, augmentation, and unification are central to Meta:
Meta-language: Meta defines topic-specific Meta-languages that augment and unify a set of underlying base languages.
- : for object-oriented programming languages
- : for functional programming languages
- : for type-setting languages
- : for plotting data in 2D/3D graphs
- : for describing family trees
Syntax and semantics: A Meta-language is defined using Meta's generic syntax, and captures the fundamental semantics of some family of related formal languages.
Feature set: Each Meta-language identifies a desired set of features that define the family of languages in question, and guarantees support for all features.
Base languages: Each Meta-language is associated with a family of pre-existing languages referred to as its base languages. The Meta-language ensures that all of the features in its feature set are available in each base language. Even if a base language does not provide direct support for a given feature, the Meta-language finds a way of emulating that feature using other aspects of the base language.
: One critically important Meta-language defined by (and the one within which itself is implemented) is , which augments/unifies object-oriented programming languages. The base languages of include
Perl, with more languages being added over time. Most of this introductory discussion of will focus on .
Augmentation: A Meta-language augments its family of base languages by introducing a new language "above" each base language.
- <C++> is an augmented version of C++.
- <Python> is an augmented version of Python.
- <X> is an augmented version of X, for any base language X.
Semantics: Each new language is a semantic superset of the underlying base language (it adds additional capabilities while ensuring all pre-existing capabilities remain intact). <C++> can do everything C++ can do, but also provides additional features missing from C++. In general, <X> can do everything X can do, but also provides additional features missing from X.
Syntax: The syntax of <C++> is the same as C++ at the level of statements, but new syntax is introduced for everything above statements (e.g. for defining classes, methods, variables, and every other syntactic construct that the Meta-language deems important in an object-oriented programming language). The syntax of <Java> is the same as Java at the level of statements, but new syntax (crucially, exactly the same syntax as for <C++>) is introduced for everything above statements. In general, the syntax of <X> is the same as base language X at the level of statements, but new syntax (shared across all <Xi>) is used for everything above statements.
The compiler, metac, converts a program written in <C++> into a program written in C++, converts a program written in <Java> into a program written in Java, and in general converts a program written in <X> into a program written in base language X. Note that although <Python> and <C++> (for example) use exactly the same syntax for defining concepts above the level of statements, the compiler knows how to convert this same syntax into the very different syntax used by each base language.
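To make the idea of one shared high-level description being turned into several base-language syntaxes concrete, here is a minimal Python sketch. It is not how metac works internally; `ClassSpec`, `MethodSpec`, `emit_python`, and `emit_cpp` are hypothetical names invented for this illustration, and the C++ output assumes every method returns `void`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MethodSpec:
    """Hypothetical description of a method: one name, one body per base language."""
    name: str
    bodies: Dict[str, List[str]]  # base language -> verbatim statement lines


@dataclass
class ClassSpec:
    """Hypothetical language-neutral description of a class."""
    name: str
    methods: List[MethodSpec] = field(default_factory=list)


def emit_python(cls: ClassSpec) -> str:
    """Render the shared description using Python's class/method syntax."""
    lines = [f"class {cls.name}:"]
    for m in cls.methods:
        lines.append(f"    def {m.name}(self):")
        lines.extend("        " + stmt for stmt in m.bodies["python"])
    return "\n".join(lines)


def emit_cpp(cls: ClassSpec) -> str:
    """Render the same description using C++ syntax (void methods assumed)."""
    lines = [f"class {cls.name} {{", " public:"]
    for m in cls.methods:
        lines.append(f"  void {m.name}() {{")
        lines.extend("    " + stmt for stmt in m.bodies["cpp"])
        lines.append("  }")
    lines.append("};")
    return "\n".join(lines)


spec = ClassSpec("Person", [
    MethodSpec("greet", {
        "python": ['print("hi")'],
        "cpp": ['std::cout << "hi";'],
    }),
])
print(emit_python(spec))
print(emit_cpp(spec))
```

Only the statement lines differ per base language; the class and method structure is written once and rendered twice.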
Unification: One immediate consequence of the fact that <Xi> and <Xj> share the exact same syntax above the level of statements is that any program implemented in <Xi> is implicitly also partially implemented in every other <Xj> supported by the Meta-language. This means:
Knowledge Transfer: Suppose a programmer knows the syntax of base language Xi, and the syntax of <Xi>, but does not know base language Xj. Such a programmer is able to start writing Xj programs much more quickly by writing in <Xj>, most of whose syntax the programmer already knows (everything above the level of statements is the same as <Xi>). This means that a <Python> programmer can start writing Eiffel code or Ruby code much more quickly by implementing that code in <Eiffel> or <Ruby>.
Language Migration: Migrating an implementation from one base language to another is vastly easier in the augmented languages than in the base languages themselves. For example, converting a <Java> program to <C++> is much easier than converting a Java program to C++, because the syntax of the <Java> program differs from the <C++> program only at the level of statements, whereas the syntax of Java differs from C++ at every level.
Native methods: Oftentimes, code written primarily in one language can be optimized by having expensive functionality implemented in a more efficient language. For example, Java provides the Java Native Interface (JNI) for linking in C/C++ code, and Perl and most other OO languages provide similar mechanisms for embedding code written in other languages. The Meta-language provides various forms of support (syntax, automated generation of relevant glue code, etc.) to make this possible.
We've noted that the syntax of <X> is the same as base language X at the level of statements. This means that the bodies of methods in <Python> contain pure Python code, and the bodies of methods in <C++> contain pure C++ code. Crucially, the specifics of the syntax allow one to specify the statement-level code associated with a particular method in multiple base languages at the same time (immediately adjacent to one another), so one can (for example) provide both the Python implementation of a method and the C++ implementation. A program written in both Python and C++ is denoted <Python|C++>, and a program written in
C++is denoted <
C++>. In general, for any base languages
, a single meta source file can represent an implementation in all of those base languages (note that only statement-level code needs to be specified for each base language) and such a program is said to be written in
As we will see later, Meta takes this unification of languages even further. Reducing the amount of base language syntax one needs to learn, providing a means of defining method bodies in multiple base languages at the same time, and providing support for native implementations of methods are just the tip of the iceberg with respect to Meta's unification capabilities.
To summarize, <
C++>, etc.). Each of
these languages uses the same new syntax to define high-level object-oriented
concepts (namespaces, classes, methods, fields, etc), but uses the syntax of the
underlying base language for statements.
Each Meta-language identifies a set of features/capabilities that it guarantees support for in all base languages. The following enumerates, at a high-level, some of the features guaranteed by.
- are first-class objects
- accessors: auto-generated getters, setters, mutators, incrementers, adders, etc.
- object serialization: auto-generated methods for writing to protobuf and human-readable streams
- uml diagrams: auto-generated class diagrams from the source code.
- optional fields: no memory if not present (at the expense of more memory and slower access for such fields).
- support for instance/class/static fields
- bit fields: C++ bit fields are provided in all base languages.
- are first-class objects.
- support for methods within namespaces, within classes, and within other methods.
- support for closures.
- support for instance/class/static methods.
- support for operator overloading.
- support for C#-style
- support for throws semantics in method signatures.
- are first-class objects
- support for classes within namespaces and within other classes.
- many auto-generated methods defined on classes.
- each user-defined class has associated auto-generated test class and meta-class.
- are first-class objects
- can be arbitrarily nested.
- are first-class objects
- allow one to capture all dependencies (external files, external urls, external code) of a program, ensuring a hermetic build environment.
UnifiedSyntax: Once a programmer knows syntax and the set of constructs/attributes defined by , they are able to create programs in any base language supported by as long as they know the statement-level syntax of that language. Programmers do not need to know the syntax for defining namespaces, classes, methods, fields, etc., nor do they need to know any of the other things that go into defining an OO program (how to set up test classes, how to access reflection in the base language, etc.). All of these things are abstracted away by . This means a person familiar with
<Xi> is able to learn and code in some other language Xj much more quickly (by writing the program in <Xj> instead of Xj).
Type System: defines a concise, expressive,
intuitive type system that can be directly mapped to the type systems of all
of its base languages. It supports:
- augmented primitive types: int, uint, float, double, char, bool, units,
- int is shorthand for int<32>, and each of int<1>, int<2>, int<3>, ..., int<64> is a legal primitive int type.
- uint is shorthand for uint<32>, and each of uint<1>, uint<2>, uint<3>, ..., uint<63> is a legal primitive unsigned int type.
provides type support for units of measurement, allowing
one to write use tokens like
- native types: introduces a set of types that have direct correlates in each base language. When a program is compiled into a specific base language, a reference to a native type in Meta source code is replaced verbatim by the associated base language type.
- class types:
- union types: a type that is one of an explicit set of types.
- weak references
- by-value, by-reference, and by-pointer.
- mutable and immutable
- template/generic types
- augmented primitive types: int, uint, float, double, char, bool, units, etc.
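The width-parameterized integer types listed above (int<N> and uint<N>) can be illustrated by computing the value range each width can hold. This is a sketch only: it assumes two's-complement representation for signed types, which the text does not state, and `int_range` is a hypothetical helper, not part of Meta. The width limits follow the list above (int up to 64 bits, uint up to 63 bits).

```python
def int_range(bits: int, signed: bool = True) -> tuple:
    """Return (min, max) for int<bits> or uint<bits>.

    Assumes two's-complement for signed values (an assumption of this sketch).
    """
    max_bits = 64 if signed else 63  # limits as listed in the text above
    if not 1 <= bits <= max_bits:
        raise ValueError(f"unsupported width: {bits}")
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


# int is shorthand for int<32>; uint is shorthand for uint<32>.
print(int_range(32))                # (-2147483648, 2147483647)
print(int_range(32, signed=False))  # (0, 4294967295)
print(int_range(8, signed=False))   # (0, 255)
```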
TheLibrary: A collection of classes that are implemented in every base language, with exactly the same interface and semantics, and exhibiting the same complexity profile in all base languages.
- Strings (str), Vectors (vec) and Dictionaries (map)
- Dates and times
- Units of Measurement
The <Language Hierarchy: A program can be written in
Xi>, the extended language that augments base language
, after which it can be incrementally (method by method) extended to provide support for some other base language
. Such a program is said to be implemented in
<. A program written in this language can be compiled into both
. This process can be repeated to add support for
, yielding an implementation in
<. also provides the special language * that defines statement-level constructs. A program written in * has no base-language syntax, but can be compiled into any base-language.
Expressivity: Meta allows one to do more while typing less in a number of ways:
Auto-generated Testing Infrastructure: Every class implicitly defines an associated test class, and every method implicitly defines (one or more) associated test methods. Programmers do not have to deal with any of the "glue" that goes into setting up a test harness, and can focus on writing code and associated unit tests, providing increased readability and conciseness. Most importantly, the code and associated test code are adjacent to one another. Formalizations for test doubles (mocks, stubs, fakes) are also provided, and the associated code is auto-generated.
Auto-generated Reflexivity: The Meta-language guarantees that the core object-oriented concepts (classes, methods, fields, namespaces, etc.) are all first-class objects. It also guarantees support for meta-programming and many forms of reflection. Meta-classes are central to this support.
Auto-generated Build System: Bazel workspaces and BUILD files are automatically generated and maintained, providing all the benefits of Bazel without the programmer needing to know anything about Bazel (but having complete control over every aspect of the build system if they want it).
Auto-generated Object Serialization: Programs specify sufficient information about the state of each field within a class to allow both machine-friendly and human-friendly serialization support to be generated automatically for arbitrary objects. Every instance of every class supports Google's protocol buffers and toString, and in the vast majority of situations a programmer needs to do nothing special (but has complete control over every aspect of this serialization if they want it).
Auto-generated Native Code Support: Programs written in one base language can implement specific methods in a more efficient alternative base language, and the compiler automatically generates all the inter-language glue code needed to make such native methods work properly.
Auto-generated UML visualization: UML class diagrams can be automatically generated for any program. The level of detail the syntax provides when describing object state allows these UML diagrams to be exceptionally precise (and concise).
Source Code Customization and Canonicalization: Users have a significant amount of control over the appearance of programs through user-provided aliases for all attribute keys and feature values, user-provided syntax highlighting cues, etc. At the same time, any Meta program can be converted from a customized form to a canonical representation, and back again (or to some other user's entirely different customized representation).
Literal String Variable Interpolation: Programs can use special syntax (even within base-language code) for formatting strings based on variable interpolation.
Standardized Output: The output produced by base-language compilers, test harnesses, etc. is cleaned up and standardized so that the user sees a similar style of output regardless of which base language a program is compiled and executed within.
Meta-Base Line Mapping: The compiler maintains a line-number mapping between meta source file locations and corresponding base-language file locations. This allows the user to be given line numbers within meta source files even when the actual output is reported in terms of base-language file locations. This is useful in error reporting.
Implementation Variants: Multiple variants of a method can be implemented, and the process of determining which variant performs best (in CPU usage, memory, time, etc.) can be automated.
Aspect-oriented programming: Syntax is provided for identifying and modifying code elsewhere in a program. Note that the fact that every construct in a program is uniquely identifiable makes this especially easy and powerful in Meta.
Profiling: Easy access to profiling information is provided in all base languages, ensuring the output is consistent and uniform.
Test Coverage: Easy access to test coverage information is provided in all base languages, ensuring the output is consistent and uniform.
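As an illustration of the Meta-Base Line Mapping item above, the following Python sketch shows one way such a mapping could be represented. It is not the compiler's actual data structure; `LineMap`, `add_segment`, and `to_meta` are hypothetical names, and the sketch assumes that within each recorded segment base-language lines correspond one-to-one with meta source lines (plausible for method bodies, which are copied verbatim).

```python
import bisect


class LineMap:
    """Hypothetical map from base-language line numbers to meta line numbers."""

    def __init__(self):
        # Sorted list of (base_start, meta_start) pairs, one per segment.
        self._segments = []

    def add_segment(self, base_start: int, meta_start: int) -> None:
        """Record that base lines from base_start onward come from meta_start onward."""
        bisect.insort(self._segments, (base_start, meta_start))

    def to_meta(self, base_line: int):
        """Translate a base-language line number, or return None if unmapped."""
        # Find the last segment whose base_start is <= base_line.
        idx = bisect.bisect_right(self._segments, (base_line, float("inf"))) - 1
        if idx < 0:
            return None
        base_start, meta_start = self._segments[idx]
        return meta_start + (base_line - base_start)


line_map = LineMap()
line_map.add_segment(10, 7)   # base lines 10.. came from meta lines 7..
line_map.add_segment(20, 12)  # base lines 20.. came from meta lines 12..
print(line_map.to_meta(11))   # 8
print(line_map.to_meta(3))    # None
```

An error reported by a base-language tool at line 11 could then be shown to the user as meta line 8.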
Figure 1 shows a simple <Python> program. Before we step through the example in detail (next section) we will discuss some basics of Meta syntax.
Figure 1: person.meta (an example <
Meta syntax is designed to be structured, consistent, intuitive, concise, expressive, readable and writable. Every Meta source file consists of a sequential list of constructs. Every construct is a sequential list of one or more attributes followed by an (often optional) terminator. Each attribute consists of a key/value pair (an attribute key and an attribute value). This simple structure (files are composed of constructs that are composed of attributes that are key/value pairs) plays a role in many of the benefits that Meta provides. In BNF notation, Meta syntax is:
<metafile>   ::- <construct>+
<construct>  ::- <attribute>+ <terminator>?
<attribute>  ::- <attribute_key>? <attribute_value>
<terminator> ::- ';' | 'end' (<kind> <id>?)? ';'
The above BNF, although useful for capturing the essential structure of Meta syntax, glosses over some important details. In particular:
Attributes making up a construct are divided into three kinds, which appear in a specific order and have different roles:
- feature attributes:
- appear before the primary attribute.
- feature keys are always optional.
- feature values come from a pre-defined fixed set of words defined by the Meta-language schema.
- all feature attributes associated with a construct have default values; any feature attribute not given an explicit value when specifying a construct will use this default.
- primary attribute:
- the key is the name of a construct, and is always present (it or an alias).
- the value is a unique identifier (amongst all other constructs in the same parent attribute). Some constructs have optional primary values (if optional and not provided, a unique id is auto-assigned).
- exactly one occurrence of the primary attribute exists in each construct instantiation.
- secondary attributes
- appear after the primary attribute.
- keys are usually required but sometimes optional (when the syntax of the value unambiguously identifies which attribute is being specified).
- values can vary from a simple identifier to a complex block containing arbitrarily nested sub-constructs.
- the schema defined by the Meta-language indicates an attribute value type for each attribute, which dictates what is syntactically legal as the value of each attribute.
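The ordering rules above (feature attributes first, exactly one primary attribute, then secondary attributes) can be captured in a small data model. The Python sketch below is illustrative only; `Attribute`, `Construct`, and `check_order` are hypothetical names, and the feature value `public` is an assumed example rather than one taken from an actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    kind: str            # "feature", "primary", or "secondary"
    key: Optional[str]   # feature keys (and some secondary keys) may be omitted
    value: object


@dataclass
class Construct:
    attributes: List[Attribute] = field(default_factory=list)

    def check_order(self) -> None:
        """Raise ValueError unless attributes are feature*, primary, secondary*."""
        kinds = [a.kind for a in self.attributes]
        if kinds.count("primary") != 1:
            raise ValueError("a construct needs exactly one primary attribute")
        split = kinds.index("primary")
        if any(k != "feature" for k in kinds[:split]):
            raise ValueError("only feature attributes may precede the primary")
        if any(k != "secondary" for k in kinds[split + 1:]):
            raise ValueError("only secondary attributes may follow the primary")


ok = Construct([
    Attribute("feature", None, "public"),     # assumed feature value
    Attribute("primary", "class", "Person"),
    Attribute("secondary", "scope:", []),
])
ok.check_order()  # passes silently
```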
Attribute values can come in a variety of syntactic flavors or attribute types, ranging from simple identifiers to literal strings to blocks of indented text representing arbitrarily nested constructs. Some examples of legal attribute types are:
- id: a simple identifier.
- word: a single word (no spaces).
- type: a type (details later)
- expr: an int, float, string, list, dictionary, callsite, etc.
- simple block: a list of zero or more lines of lexically scoped text.
- complex block: a list of zero or more lexically scoped constructs.
There are two ways that nested syntax can be introduced within Meta: simple blocks and complex blocks.
- simple block: a sequence of arbitrary (lexically scoped) lines that is consumed verbatim.
- complex block: a sequence of nested constructs, which can in turn contain nested constructs.
- syntax: one of the syntactic rules enforced is that if an attribute has a block-valued attribute type, the attribute key must end with a colon.
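The notion of a simple block (indented lines consumed verbatim after a key ending in a colon) can be sketched in Python as follows. This is not the Meta parser; `indent_of` and `read_simple_block` are hypothetical helpers, and the sketch assumes indentation is done with spaces only.

```python
def indent_of(line: str) -> int:
    """Number of leading spaces (tabs are not handled in this sketch)."""
    return len(line) - len(line.lstrip(" "))


def read_simple_block(lines, start, parent_indent):
    """Collect the simple block that follows lines[start].

    Returns (block_lines, next_index). Block lines are every following line
    indented deeper than parent_indent, stored verbatim apart from removal of
    the common indentation.
    """
    block = []
    i = start + 1
    while i < len(lines):
        line = lines[i]
        if line.strip() and indent_of(line) <= parent_indent:
            break  # a line at or above the parent's indentation ends the block
        block.append(line)
        i += 1
    while block and not block[-1].strip():
        block.pop()  # drop trailing blank lines
    widths = [indent_of(l) for l in block if l.strip()]
    cut = min(widths) if widths else 0
    return [l[cut:] if l.strip() else "" for l in block], i


source = [
    "comment:",
    "  A namespace for our demo classes.",
    "scope:",
    "  class Person",
]
block, nxt = read_simple_block(source, 0, 0)
print(block)  # ['A namespace for our demo classes.']
print(nxt)    # 2
```

A complex block would instead hand the collected lines back to the construct parser recursively, rather than storing them verbatim.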
Each Meta-language defines the constructs that are legal for that Meta-language.defines constructs relevant to object-oriented programming languages (namespaces, classes, methods, fields, etc.), while defines constructs relevant to type-setting languages (like document, section, table, graph, paragraph, list, etc). A program consists, syntactically, of a sequence of instantiations of these various constructs. For example, in a program may contain a namespace construct that contains class constructs that each contain method and field constructs. The following is a small subset of the constructs provided by :
- namespace: a space within which symbols can be defined to avoid conflicts.
- class: a collection of state and behavior representing an abstract data type
- method: a named piece of code that operates on an instance of some class.
- field: a single element of state within a class.
- accessor: a method that provides get/set/ref/etc. access to a field.
- lifecycle: a concise encapsulation of functionality for initializing/finalizing a class instance.
- behavior: defines a multi-method (supports dynamic dispatch on more than one receiver).
- category: a syntactic means of grouping related methods/fields within a class together
- assoc: a formal mechanism for identifying external code needed by a specific class.
- resource: a formal mechanism for identifying external resources needed by a program
- command: a mechanism for defining a named entry point into a program
- flag: a mechanism for allowing users to pass information into a program.
- var: a variable declaration
Based on the above, we can provide a more accurate BNF grammar for Meta syntax:
<metafile>        ::- <construct>+
<construct>       ::- <attributes> <terminator>
<attributes>      ::- <feature_attr>* <primary_attr> <secondary_attr>*
<feature_attr>    ::- <attr_key>? <feature_value>
<primary_attr>    ::- <construct_name> <construct_id>?
<secondary_attr>  ::- <attr_key>? <secondary_value>
<terminator>      ::- ';' | 'end' (<construct_name> <construct_id>?)? ';'
<attr_key>        ::- <id>
<construct_name>  ::- <id>
<construct_id>    ::- <word>
<secondary_value> ::- <id> | <xid> | <word> | <lit_expr> | <type> | <callsite> | <simple_block> | <complex_block>
<simple_block>    ::- ':' <newline> (<indentation> <line>)*
<complex_block>   ::- ':' <newline> (<indentation> <construct>)*
<lit_expr>        ::- <literal> | <expr>
<literal>         ::- <number> <unit>? | <string> | <list> | <map>
<expr>            ::- <op> <id> | <id> <op> <id> | '(' <expr> ')'
<id>              ::- [a-zA-Z_][a-zA-Z0-9_]*
<word>            ::- [^\s]+
<string>          ::- '"' [^"]+ '"' | "'" [^']+ "'"
<number>          ::- <int> | <float>
<list>            ::- '[' (<expr> (',' <expr>)*)? ']'
<map>             ::- '[' (<string> ':' <expr> ',')* (<string> ':' <expr> [','])? ']'
<callsite>        ::- '@' <id> ('.' | '!') <id> ( '(' ( <lit_expr> (',' <lit_expr>)* )? ')' )?
<type>            ::- <type_qualifiers>? <type_name> <type_params>?
<type_qualifiers> ::- @#? | &?(#?[usw]?\*)*#?
<type_name>       ::- <id>
<type_params>     ::- '<' <type> ( ',' <type> )* '>'
: How Meta-languages Affect Syntax
Regardless of whether one is writingor or source files, the overall syntax is exactly the same; every source file is a collection of constructs, every construct is a collection of attributes (and optional terminator), and every attribute is a key/value pair.
However, the Meta-language does have a crucial impact on syntax, because the Meta-language defines the set of legal constructs and the set of legal attributes associated with each construct. What is considered a construct in one Meta-language is usually not considered a construct in another. For example, a Meta-language for object-oriented programming defines constructs like class, method, and field, while a Meta-language for plotting defines constructs like plot and line. If a legal plot construct from the latter were copied into a source file of the former, it would not be considered a legal construct, even though it conforms to the general files-are-constructs-are-attributes-are-key/value-pairs structure; the Meta-language defines the legal constructs.
Each Meta-language provides a Meta schema that defines the constructs legal in the language. Note that since constructs consist of attributes, defining a construct means defining the set of attributes that are legal for that construct.
Meta schemas aredocuments in their own right, which means they are described using syntax, which requires a Meta-language that defines the legal constructs/attributes. The Meta-language responsible for defining Meta-languages is denoted . This language contains the following constructs:
- MetaLanguage: a construct representing a self-contained Meta-language schema.
- Construct: a construct for defining a construct. Consists of:
- The construct name
- Documentation about the purpose of the construct.
- The list of attributes making up the construct
- Attribute: a construct for defining an attribute within a construct.
- The attribute kind (feature, primary, or secondary)
- Whether the attribute value is required (usually) or optional (rarely).
- The attribute key. For primary keys this must match the construct name.
- Zero or more aliases for the attribute key.
- The default value of the attribute, if one is not explicitly provided.
- The type of the attribute value (feature-list, id, xid, expr, simple block, complex block, etc.)
- For feature attributes, the type is always a specific set of identifiers, but aliases can be defined for each of these values (see FeatureValue below).
- If the attribute type is a complex block, the list of constructs that are legal within that block is also provided.
- A comment describing the semantics of the attribute.
- FeatureValue: a construct for defining feature values (and their aliases)
- File: a construct containing all constructs within a source file.
- BaseLanguage: a construct for defining a base language supported by a Meta-language.
- Template: a construct for use in compiling a construct into a base language.
Meta schemas have a direct impact on syntax:
primary attribute keys (aka construct names): When the parser is expecting to see a construct, it needs to find a token representing a legal construct name. Since feature attributes can appear before the primary attribute (which specifies the construct name), the parser consumes tokens until it sees a legal construct name, then verifies that all consumed tokens are valid feature attribute keys/values for that construct. Having the construct name in hand also allows the parser to know which secondary attribute keys are legal after the primary key.
secondary attribute keys: Secondary attribute keys are identifiers (and thus tokens). Which tokens identify secondary attributes depends on which construct is currently being parsed, which is identified by the primary attribute.
secondary attribute values: The Meta-language schema indicates the attribute value type of each attribute that is legal for a given construct. This has a profound effect on what syntax is legal. If an attribute type is specified to be a literal string, what is legal for the value is very different than if the attribute type is a complex block.
feature attribute keys and values: Feature values are always words (and thus tokens). Because feature values uniquely identify their associated feature key, feature keys are rarely explicitly provided, but can be. Both feature keys and feature values can have associated aliases (alternative tokens by which the key/value can be referenced). Thus, the set of all feature keys, feature key aliases, feature values, and feature value aliases defined on a construct make up the legal set of tokens that can appear before a primary attribute.
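The token-consumption strategy described above for locating the primary attribute can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual parser: `SCHEMA`, its feature words (`public`, `private`, `abstract`), and `split_construct_head` are all hypothetical.

```python
# Hypothetical schema: construct name -> legal feature tokens for that construct.
SCHEMA = {
    "class": {"features": {"public", "private", "abstract"}},
    "field": {"features": {"public", "private"}},
}


def split_construct_head(tokens, schema):
    """Consume tokens until a legal construct name is seen.

    Returns (feature_tokens, construct_name, remaining_tokens). Raises
    SyntaxError if no construct name is found or if a consumed token is not a
    legal feature token for the construct that was eventually identified.
    """
    consumed = []
    for i, tok in enumerate(tokens):
        if tok in schema:
            legal = schema[tok]["features"]
            bad = [t for t in consumed if t not in legal]
            if bad:
                raise SyntaxError(f"illegal feature tokens for {tok}: {bad}")
            return consumed, tok, tokens[i + 1:]
        consumed.append(tok)
    raise SyntaxError("no construct name found")


print(split_construct_head(["public", "class", "Person"], SCHEMA))
# (['public'], 'class', ['Person'])
```

Note that validation of the consumed tokens has to wait until the construct name is known, because the set of legal feature tokens depends on the construct.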
The best way to learn Meta syntax is by example. The best way to understand how augments and unifies object-oriented programming languages is by example. So, although is more general than , we will focus our attention on and present a number of programs that will allow us to discuss syntax and semantics.
Example 1: Basic Syntax
Figure 2 presents the same <
Python> program as Figure 1, although this
time it is showing only the top-level
construct, with all block-valued attributes collapsed (red ellipsis denote
collapsed nested blocks). Clicking on a block-valued attribute key will toggle
visibility (hiding its block if currently showing it, showing if currently
hidden). The grey box to the left provides additional control over which lexical
scopes of the program are shown. You can hide (H), show (S), and toggle (T) the
lexical scopes associated with any of the block-valued attributes in the
program. For example, try clicking the red
(which shows all constructs), and then clicking the red
H (which hides all blocks again).
Figure 2: person.meta (lexically collapsed, with interactivity)
In parsing a Meta program (or any source file), it is crucial to be able to identify which tokens are attribute keys (and which keys are primary, feature or secondary), as well as which tokens represent feature values and secondary values. Syntax highlighting is especially useful in conveying this information:
feature attribute (rarely used explicitly)
text representing a comment
... (a placeholder for a secondary attribute value that has been collapsed).
With this syntax highlighting information in hand, let's dive into the
details of our example. If Figure 2 is not showing a black box with three
lines (numbered 1, 3, and 19), click the red
in the grey box of Figure 2. The resulting output will help us identify the
attributes making up the sole
namespace demo (on line 1) is the first attribute, with attribute key namespace and attribute value demo. This is the primary attribute of the construct (we know this because the attribute key matches the name of one of the constructs defined by the schema). Primary attributes are critically important to the parser; the primary attribute key identifies the kind of construct being parsed (in this case, a namespace), and the primary attribute value provides a unique identifier by which this new construct can be referenced, amongst all other constructs within the same lexical scope. It is only when the parser identifies the primary attribute that it can establish which feature and secondary attributes are legal for it (and thus which tokens are allowed before and after the primary attribute).
The schema, in its definition of the namespace construct, indicates that the associated primary attribute value must be of type xid. This is an "extended identifier" (one or more identifiers separated by '.'). In our example, the value is demo, which is a valid xid. If some text appeared after namespace that wasn't a valid xid, the compiler would naturally report a syntax error.
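The xid rule (one or more identifiers separated by '.') can be checked with a regular expression built from the <id> pattern given in the grammar. The Python sketch below is illustrative; `XID_RE` and `is_xid` are hypothetical names, not part of the compiler.

```python
import re

# <id> ::- [a-zA-Z_][a-zA-Z0-9_]*  (from the grammar above)
ID = r"[a-zA-Z_][a-zA-Z0-9_]*"

# An xid is one or more ids separated by '.'.
XID_RE = re.compile(rf"{ID}(\.{ID})*")


def is_xid(text: str) -> bool:
    """Return True if text is a syntactically valid extended identifier."""
    return XID_RE.fullmatch(text) is not None


print(is_xid("demo"))         # True
print(is_xid("demo.people"))  # True
print(is_xid("demo."))        # False
```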
comment:... (starting on line 1) represents the second attribute, with comment: being the attribute key, and ... indicating that the attribute value (a nested block) has been collapsed. Clicking on the comment: key (in Figure 1) will open up the collapsed text. The schema specifies that the value associated with a comment: attribute key must be a simple block (zero or more indented lines). Since our example contains a single indented line, the compiler stores the value of the attribute as a list containing one element:
A namespace for our demo classes.
After viewing the content of the comment: value (by clicking it in Figure 1), click on it again to hide the simple block before proceeding to the next attribute.
scope:... (starting on line 3) represents the third attribute, with scope: being the attribute key, and ... once again indicating that the attribute value has been collapsed. Clicking on the scope: key (in Figure 1) will open up the block, revealing the actual attribute value associated with this attribute key. The schema specifies that the attribute type of the scope: attribute of the namespace construct is a complex block. A complex block differs from a simple block (the type of the comment: above) in that a complex block contains indented constructs, while a simple block contains indented lines. Complex blocks imply a recursive verification of syntactic correctness, while simple blocks are verbatim text on which no validation is performed.
We will discuss the content of this complex block after we finish discussing the namespace construct.
end namespace demo; (line 19) marks the end of the namespace construct. There are numerous ways that constructs can be terminated. One could also have used ; or not provided an explicit termination at all. The longer forms are useful for large constructs, while shorter forms are useful for short constructs.
Having stepped through our first construct, let's now turn our attention to its scope: block, which we know is a complex block and thus must contain constructs. Figure 3 shows our entire program again.
Figure 3: person.meta (uncollapsed, with interactivity)
Let's now consider lines 5-17, which define a class construct. Can you identify how many attributes it has? Suggestion: try clicking on scope: in line 5 (and clicking again to reshow the contents).
class Person is the primary attribute, identifying that we are defining a class construct whose name is Person. We know this is a primary attribute both because the color of the attribute key tells us, and because the attribute key matches the name of one of the constructs defined by the schema.
- scope:... is a secondary attribute, with attribute key scope:. Since this key ends with a colon we know it is block-valued. The value of this attribute spans lines 6 to 16. Can you guess, based on the syntax appearing within the scope: block, whether the schema defines the type of this attribute to be a simple block or a complex block?
Given the complicated structure of the text within this block, it probably comes as no surprise that this is a complex block. The
scope:attribute is defined on almost all constructs provided by , and represents the "primary content" of the construct in question. For example, the
scope:attribute of a
namespace construct defines all the classes residing within that namespace, the
scope:attribute of a
class construct defines all the methods and fields (and other constructs deemed useful within a class) that reside within the class, and the
scope:attribute of a
method construct defines the code associated with that method.
We mentioned earlier that each Meta-language has a schema that defines all legal constructs and attributes. If the schema defines a complex-block-valued attribute on a construct, one of the required parts of that definition is the legal set of constructs that can appear within that complex block. Within the
scope:attribute of a
classcan appear, but a
fieldcannot. Within the
scope:attribute of a
fieldconstruct can appear, as can a
methodconstruct, but a
scope:attribute of most constructs in has a complex-block value, we will see later on that one special case is
method, which by default has a simple-block value. Remember that <
X> uses the same syntax as
Xat the level of statements, and this is perfectly captured by using a simple-block for method scopes; simple blocks store a list of lines verbatim, which is all we need for method bodies in <
end class; marks the explicit termination of the construct in line 17. Note that it uses a slightly less verbose syntax than was used for the namespace construct.
Lines 6-16 are the scope of the Person class and define two constructs, one
field and one lifecycle.
Lines 7-8 define a
field (known as an instance/class variable in
Smalltalk, a data member in
C++, a field in
Java, an attribute in
Eiffel, etc.). A field represents named state associated with the instance of some class. Some notes:
field name is the primary attribute, identifying that we are defining a
field construct and specifying the name of the field to be name.
: str after the primary attribute, the next token visible is the colon (':'). Since constructs are lists of attributes and attributes are key/value pairs, it is reasonable to assume that ':' is an attribute key of some secondary attribute defined on the field construct. That is indeed the case; ':' is an alias for the
type attribute (an alias of an attribute key can appear anywhere that the key itself can appear).
The attribute value
str is an example of a Meta type. Types are a ubiquitous and crucial aspect of object-oriented programming languages, and to provide augmentation and unification of existing OO languages the type system must be able to express the entire gamut of types definable in the supported base languages. At the same time, the aim is for types to be as readable and concise as possible, with minimal representations for the most common types. The type system provides support for primitive types, native types, class types, and parameterized types, and is discussed in much more detail later. For now, note that
str is one of the crucial native types, representing an immutable (potentially interned) string of characters.
#:... is the last attribute in the construct, with key
#:. Because it ends with a colon, we know the value is the entire nested block below it (consisting of one line, line 8). Note that
#: is an alias for
comment:, which is defined on every construct. It is a heavily used attribute in programs.
Implicit termination: Note that there is no end-of-construct termination syntax for this field construct (no semicolon, no 'end;' or 'end field;' or 'end field name;', all of which would have been legal terminators). The parser is able to establish that the field construct must be terminated by the definition of the lifecycle in line 10.
Implicit accessor generation: Field constructs do much more than just define state. They also implicitly define a (potentially quite large) collection of accessor methods (getter, setter, mutator, incrementer, indexer, serializer, printer, and various others). This automatic generation of code based on a small syntactic footprint is a common feature, and one we'll see in various other places. For our example, it is important to know that the following methods are generated (this is not a complete list):
- a getter name() that returns a str.
- a setter nameIs(name) that accepts a single str value and assigns it to the underlying field.
The actual field definition is considered private; every access to the field should occur through the accessors.
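To make the accessor convention concrete, here is a minimal hand-written Python sketch of a class exposing the two accessors named above. It is only an illustration: the private attribute name `_name` and the shape of the initializer are assumptions, and the code the compiler actually generates is not reproduced here.

```python
class Person:
    """Hand-written stand-in for a class with a `name` field of type str."""

    def __init__(self, name: str) -> None:
        # Assumption: the underlying state lives in a private attribute.
        self._name = ""
        # All access goes through the accessors, even inside the class.
        self.nameIs(name)

    def name(self) -> str:
        """Getter: return the current value of the field."""
        return self._name

    def nameIs(self, name: str) -> None:
        """Setter: assign a new value to the underlying field."""
        self._name = name


person = Person("Bob")
person.nameIs("Alice")
print(person.name())  # prints Alice
```

Callers never touch `_name` directly, which is what lets the field definition stay private.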
Lines 10-15 define a
lifecycle construct. Semantically, this construct provides a unified mechanism for performing object initialization and cleanup.
A lifecycle construct captures in a single construct the following things:
The semantics of
C++ constructors/destructors (including copy and move constructors and associated assignment operators), etc. A lifecycle indicates how a new instance of an object is initialized, and how instances are cleaned up at end-of-life.
Setup and teardown methods defined on the test class associated with this user-defined class (both per-method and per-class variants). This will be discussed in more detail in later examples.
Static class initialization and meta-class initialization. This too will be discussed in later examples.
lifecycle : In line 10, note that the
lifecycle primary attribute does not appear to have an associated value, since the next token is
params: and is colored like a secondary attribute key. This is not an error, but rather a special case supported by Meta. In the definition of the
lifecycle primary attribute within the schema, its value is marked as optional. An explicit value is allowed, but if one is not provided, the compiler will auto-assign a unique identifier.
params:... In lines 10-12, the
params: attribute is defined (which we know is block-valued because its key ends in a colon). This attribute defines the list of arguments that can be passed into the initializer for this class. Its value is a complex-block that accepts
In lines 11-12, a
var construct is defined:
- var name is the primary attribute.
- : str is a secondary attribute (the ':' token is an alias for
type, similar to the field construct).
- #:... provides a one-line comment for the parameter.
- ; terminates the construct.
scope:... Lines 13-14 define a scope: attribute. Since this scope applies to the class initializer, and initializers are a special kind of method, it shouldn't be too surprising that this scope is simple-valued (remember that
method constructs were one of the few constructs whose
scope: attribute is simple (a list of lines) instead of complex (a list of sub-constructs)).
The single line making up the body of the initializer is the
self.nameIs(name). Note that
nameIs is the name of the setter associated with the
field construct defined in line 7 (i.e. it is implicitly generated by the compiler, so programmers can rely on its existence without having to define it themselves).
end; terminates the lifecycle construct in line 15. It is using a more concise form of the terminator syntax (
end;) than what was used for the class or namespace constructs. Any of these terminators (or none at all) is acceptable; it is entirely programmer preference.
The above program is not useful unless we can compile it. To do so, we use
metac, the compiler. This executable performs much more
than just compilation, with the various actions controlled by sub-commands,
flags and args.
To compile a
metac compile subcommand as
% metac --metalang=oopl --baselang=python compile person.meta
Note that --metalang has -L as an abbrev, and --baselang has -b as an abbrev.
Also, when specifying base languages one can use the full name or any of a
variety of suffixes commonly associated with the baselang. For python, one can
use 'py' instead of 'python', and for
C++, one can use 'cc' or 'ccp' instead of
'C++'. Thus, our command could also be expressed as:
% metac -L oopl -b py compile person.meta
The meta compiler maintains a set of configurable variables, one of which is the default Meta-language to use. The default metalang is 'oopl', so the above can be expressed as:
% metac -b py compile person.meta
Another configurable variable is the default base language. If one most
often works with
Python, one could set the default base language to
in which case the above could be expressed as:
% metac compile person.meta
The default sub-command of
it need not be specified:
% metac person.meta
The .meta suffix is not needed:
% metac person
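The shortening rules above (flag abbreviations, base language suffixes, configured defaults, an omitted sub-command, and an omitted .meta suffix) can be summarized in a small Python sketch. This is not how metac is implemented; the function name `resolve`, the canonical spelling of each base language, and the rule that a name without a dot gets `.meta` appended are all assumptions made for illustration.

```python
import argparse

# Suffix aliases named in the text; the canonical spellings are assumptions.
BASELANG_ALIASES = {
    "python": "python", "py": "python",
    "c++": "c++", "cc": "c++", "ccp": "c++",
}
# A small subset of the sub-commands, enough for the illustration.
SUBCOMMANDS = {"compile", "test", "run"}


def resolve(argv, default_metalang="oopl", default_baselang="python"):
    """Return (metalang, baselang, subcommand, files) for a metac-like argv."""
    parser = argparse.ArgumentParser(prog="metac-sketch")
    parser.add_argument("-L", "--metalang", default=default_metalang)
    parser.add_argument("-b", "--baselang", default=default_baselang)
    parser.add_argument("rest", nargs="*")
    ns = parser.parse_args(argv)

    rest = list(ns.rest)
    # Assumption: an omitted sub-command falls back to 'compile'.
    subcommand = rest.pop(0) if rest and rest[0] in SUBCOMMANDS else "compile"
    # Assumption: a name without any suffix is treated as a .meta file.
    files = [f if "." in f else f + ".meta" for f in rest]
    return ns.metalang, BASELANG_ALIASES[ns.baselang.lower()], subcommand, files


long_form = resolve(["--metalang=oopl", "--baselang=python", "compile", "person.meta"])
short_form = resolve(["person"])
print(long_form == short_form)  # prints True
```

With the defaults configured as described, the longest and shortest command lines resolve to the same request.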
In addition to the compile subcommand, metac has many other subcommands, including:
- canonical: Canonicalize the specified meta files and report diffs.
- checkmap: Verify that a baselang map file is correct.
- checkrep: Verify that the repository is sane.
- clean: Remove a class from the repository.
- compile: Compile specified .meta files.
- config: Print out the config values.
- emacs: Generate an emacs major mode for the current meta language.
- get: Get a .meta file (parse, expand, import, translate and compile).
- help: A summary of all commands.
- html: Create .html files for the specified metafile(s) and their basefiles.
- index: Print out the index of namespaces/classes/methods.
- mirror: Mirror a meta-generated python hierarchy to another directory.
- repl: Invoke a baselang-specific read-eval-print loop (REPL).
- reverse: Reverse compile baselang files into meta files.
- run: Invoke a class-specific entry point.
- setup: Perform specialized baselang-specific WORKSPACE configuration.
- shell: An interactive shell within which any meta2 command can be invoked.
- snapshot: Create a snapshot.
- summarize: Print out a summary of all specified metafiles.
- test: Test specified namespaces/classes/methods.
- uml: Generate a UML diagram of all classes in all specified metafiles.
Example 2: The Syntax Of Unification
Having parsed a simple <
Python> program attribute-by-attribute, we can
now move beyond syntax to discuss all the ways that augments and
unifies object-oriented programming languages. We will be using <
C++> programs for the most part, but the discussion applies equally
to all of the base languages supported by .
There are two core directions we need to explore in discussing what is offered: how it supports augmentation of base languages, and how it supports unification of base languages. In this section we will discuss some basics of how base languages are unified, after which we will focus on augmentation (while relying on the unification syntax introduced here).
We've been discussing an example <
Python> program stored in a file
person.meta. But what would the <
C++> program that
implements the same code look like? Let's move our <
person2.metapy (note that the suffix
explicitly specifies the baselang, removing the need to specify a --baselang
flag). Let's put our new <
C++> implementation in
Figure 4a shows
person2.metapy, and Figure 4b
Figure 4a: person2.metapy
Figure 4b: person2.metacc
A close inspection of Figure 3, when compared with Figure 4a, will reveal a
single difference (which is also different in Figure 4b). In Figure 3, line 13
specifies secondary key
scope:, whereas Figure 4a specifies
and Figure 4b specifies
scope<. The new syntax in Figures 4a and 4b are
examples of base language qualifiers, which supports for every
attribute key. At first glance this new syntax may not seem particularly useful,
since the code in both files would work equally well using
However, we'll shortly see why this new syntax is exceptionally useful. First
though, let's compile our two programs:
% metac person2.metapy
% metac person2.metacc
The resulting base language code is shown in Figures 5a and 5b.
Figure 5a: person2.metapy compiled into
Figure 5b: person2.metacc compiled into
Looking at the generated base language code, we can make a variety of observations:
conciseness: One of the goals is to do more while typing less. Note that our 19 lines and 317 bytes of <
Python> code correspond to 4 python files containing 103 lines and 1950 bytes, while our 19 lines and 319 bytes of <
C++> code produced 8
C++ files containing 142 lines and 3085 bytes. The toy nature of our example over-represents the ratio of sizes between meta code and base-language code, but it is common for meta code to be 3-6 times more concise than its underlying base language implementations. Furthermore, having all related code in a single file, rather than being scattered across many different files in various directories, is another time-saving benefit of meta code.
shared syntax, less learning curve: Note that although Figures 4a and 4b represent different programming languages (<
Python> and <
C++>), they look almost identical, differing in exactly two lines (lines 13 and 14). This stems from the fact that <
Python> syntax is exactly the same as <
C++> syntax above the level of statements. In contrast, compare the syntax of the compiled Python and
C++ programs; they are completely different from one another. A person who learns <
Python> already knows how to write a great deal of <
C++> code; all one needs is to learn statement-level
new features: A look at the generated base-language code reveals a variety of features that we haven't discussed yet. We will explore them in more detail in subsequent examples, but here is a summary of some of them:
Bazel: Note that a
BUILD file is automatically created in both base languages. This is because bazel is used as the base language build system, providing all the benefits of bazel without the programmer needing to know anything about it.
test-classes: Note that in
Python, not only is there a class
demo.Person, there is also a class
demo_test.PersonTest. And for
C++, there is class
demo_test::PersonTest. This is because the compiler automatically generates a test class for every user-defined class. And as we will see in subsequent examples, every method defined on a class can have one or more test methods implicitly defined on the test class, based on test code provided in the source file. The proximity of code and test code improves readability, writability, and conciseness.
meta-classes: Note that in
Python, in addition to class
demo.Person there is also a class
demo.PersonMeta. And in
C++, there are classes
demo::Personmeta. This is because the compiler automatically generates a metaclass for every user-defined class that uses 's reflective features. This metaclass allows to provide various powerful reflection and meta-programming capabilities in all baselanguages, even those not known for their reflectivity (e.g.
Consider Figure 4a and 4b again. They are represented by two different files,
one written in <
Python> and the other in <
C++>. However, with the
support of base language qualifier syntax, we can create a single Meta
program (person3.meta) that provides implementations in both
Python and C++. Such a program is shown in Figure 5.
Figure 5: person3.meta, written in <
If Figure 4a is said to be written in <
Python>, and Figure 4b is written
C++>, it makes sense to say that Figure 5 is written in
Python|C++>. When compiling this program into a base language, we can
metac which base language we want to use. If we
want to compile to
Python, we use:
% metac -b py person3.meta
and if we want to compile to
C++, we use:
% metac -b cc person3.meta
person3.meta contains 21 lines and 358 bytes, and can
automatically produce 12 base language files containing 245 lines and 5035
bytes, providing even more conciseness between programs and their
underlying base language implementations.
C++, what would
be needed to also implement it in, for example,
would be to add the following before line 17:
After doing so, the program can be said to be implemented in
entire hierarchy of programming languages above the base languages it
augments and unifies, as shown in Figure 6.
Figure 6: The Language Hierarchy
Although <Python|C++> is very useful as a means of incrementally
migrating a <
Python> implementation to a <
C++> implementation, if one
wants to maintain the same code in both languages in perpetuity, even the
convenient adjacent syntax gets difficult to maintain (every
change to statement-level code in any baselang scope block needs to be
propagated to all other baselang scope blocks). This becomes more and more
tedious the more languages one is attempting to maintain the implementation in,
for example <
Eiffel>. The obvious solution to this is to
introduce Meta-level syntax for statements, so that a program can be expressed
purely in , with no base language syntax whatsoever. Such a program is thus
written in all base languages at the same time, as it can be compiled down to
any of them. This extended version of in which statements are
implemented as constructs is referred to as *.
Although* is quite powerful, given its ability to implement a program in all base languages simultaneously, that power comes at an expense, one that many programmers may not need or want to incur. Programmers rarely like a language because of how classes or methods or fields are defined. But programmers may become attached to the statement-level syntax of a particular language, and oftentimes base language syntax can be more concise than the corresponding implementation in Meta. * does offer some powerful syntactic benefits with its statement-level constructs, but most programs do not need to be expressed completely in *. Rather, it is useful to use * syntax in those places where it is convenient, and to use base-language syntax where that is more convenient.
Example 3: Test Classes
Having explored some ways in which base language unification is provided, we will now focus more solidly on the many ways that individual base languages are augmented.
Figure 7 builds on Figure 1 (our initial <
Python> example) by adding some
new fields and a method, introducing various new constructs and attributes,
and (most importantly), ensuring that our program is fully tested.
Figure 7: person4.meta (adding unit tests)
Here are some of the changes between Figure 1 and Figure 7:
comment: vs #:: In line 1, we've changed
fields height and weight: In lines 9-12, we've defined two new fields,
weight. Both have meta-type
real<64>, representing a 64-bit floating point number. The token
doubleis an alias for this meta-type, so we could have used
attribute provides (aka ->): In line 15, the structure of
field namehas changed. In Figure 1, this parameter was given an explicit type and comment string, but in Figure 7 these attributes have been replaced with
-> name. The
->token is an alias for the
providesattribute. It is useful in situations where the parameter to a method/initializer is to be used to initialize the value of some field defined on the class. To see why this is beneficial, scroll back up to Figure 1 and note the redundancy in information specified in lines 7-8 (the definition of
field name) and lines 11-12 (the definition of the
nameparameter passed to the Person initializer). Since the
nameparameter is destined to be set as the value of the
namefield, it is unnecessary to specify the type of the parameter or to document it, as the type and documentation are the same as for the field it will initialize.
This is a relatively minor feature, but given how common it is to initialize fields from initializer arguments, it ends up saving real time for programmers, and is yet another way to do more while saying less.
implicit scope:: In Figure 7 there is no longer a
scope:attribute in the lifecycle construct. This is because the only actions that need to be performed in the initializer are the initialization of the three fields from their corresponding parameters, and that is handled by the
-> syntax. The compiler implicitly defines a scope whose code initializes the fields appropriately.
lifecycle setup: In lines 18-19, the lifecycle construct has a new secondary attribute defined on it,
setup:. This attribute (and various others) is used by the test harness. The compiler implicitly defines a
PersonTest class (in namespace
demo_test) to go along with the user-defined
Person class (in namespace
demo). The code in the
setup:block is provided as the body of the setup method defined on that test class.
Remember that in any xUnit compliant testing environment (which is provided in all base languages), the setup method is invoked before each test method is invoked (and a teardown method is invoked after each test method). Although most languages define xUnit compatible testing environments, the exact details of how setup/teardown methods (and the class and namespace equivalents of setup/teardown) are named and implemented vary between base languages. These minutiae are hidden away, allowing the programmer to simply specify
clteardown: attributes to define the relevant code. Thus, a programmer need only learn a single syntax that applies uniformly across all base languages, rather than needing to learn base-language specific details.
For our particular example, the setup method initializes a field named
person (defined on the
demo_test.PersonTest class in line 24) to a new instance of the
demo.Person class we are implementing. Because of this setup code, every test method associated with this class can rely on
person being properly initialized.
Note that line 19 makes reference to the variable
test, even though this variable is not defined anywhere. This is a special pseudo-variable defined by . It is semantically equivalent to
Smalltalk, etc.) and
Distinguishing between user-code and test-code: In, the user code and test code are adjacent to one another. Using
selfin both code blocks can be confusing because what self refers to is different (e.g. an instance of Person or an instance of PersonTest). By providing new syntax for PersonTest receivers, we can make it clear that code in
setup:(et.al.) blocks refers to PersonTest, while code in
scope:(et.al.) blocks refers to Person.
Readability: When performing checks on values in test methods, it is more readable to have the callsite
self.iseq(...)because we are in fact testing whether two values are equal.
Note also in line 19 that we are invoking a method named
PersonTest instance. A special test class is provided from which all test classes inherit, on which a large number of value-comparing methods are defined. This same class is provided in all base languages, with exactly the same interface in all languages. The
iseq method verifies that the first two arguments are equal to one another.
lifecycle test: In lines 20-21, a
test:attribute is defined. This is another simple-block that is related to the test-harness.
One of the things the lifecycle construct does is define an initializer for instances of the class. An initializer is a special kind of method, and is thus amenable to being tested. A method to test the initializer is implicitly added to the test class auto-generated for the user class, and the body of this test method is given by the body of the
test:block found on the lifecycle construct.
Note that having the testing code be adjacent to the code being tested significantly improves readability, and is much more convenient than having to continually juggle multiple files when creating tests. Increased ease means it is more likely that tests will be written and maintained as the code evolves.
In our example, line 21 is defining the test method for the initializer to consist of a single check that the name of our
person instance equals "Bob".
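For readers who think in terms of a base language, the following Python sketch shows roughly how the setup: and test: blocks of the lifecycle construct could map onto an xUnit test class. It uses the standard unittest module and is not the generated code: the base class name `MetaTestCase`, the test method name `test_lifecycle`, and the simplified `Person` class are assumptions made so the sketch is self-contained.

```python
import unittest


class Person:
    """Simplified stand-in for the user-defined class under test."""

    def __init__(self, name):
        self._name = name

    def name(self):
        return self._name


class MetaTestCase(unittest.TestCase):
    """Hypothetical shared base class offering value-comparing helpers."""

    def iseq(self, first, second):
        # Verify that the first two arguments are equal to one another.
        self.assertEqual(first, second)


class PersonTest(MetaTestCase):
    def setUp(self):
        # Plays the role of the setup: block; runs before every test method.
        self.person = Person("Bob")

    def test_lifecycle(self):
        # Plays the role of the test: block on the lifecycle construct.
        self.iseq("Bob", self.person.name())


def run_person_tests():
    """Run the test class and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PersonTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_person_tests()` runs the single test method after `setUp` has created a fresh `person`, mirroring the before-each-test behaviour described above.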
test field person: line 24 defines another field construct, similar to those in lines 7-12, but with one crucial difference. The token test appears before the primary key, and both the color and position of this token imply that it must be a feature value defined on the field construct. That is indeed the case, as field defines feature attribute location with values user, test, meta, and usertest. This feature attribute is defined on most constructs that can appear within a class, and allows us to indicate whether the construct in question applies to the user-defined class (the default), or to the test class or meta class implicitly defined for each user class. By specifying a value of test, we are indicating that this field is defined on
demo_test.PersonTest, not on
demo.Person. Thus, any test method can rely on the existence of this field (and can access it via
method bodyMassIndex: lines 26-35 introduce a new construct, method, which naturally defines a method on
method bodyMassIndex is the primary key, identifying this construct as a method with name
: real<64> is a secondary attribute (
:is an alias for
type here, just as it is for field and var) that identifies the return type of the method. The value of the attribute is a meta-type, and we've already encountered type
#:... in lines 26-27 provides a comment describing the purpose of the method.
scope:... in lines 28-30 provides the (
Python) implementation of the method.
test:... in lines 31-34 provides a block of (
Python) code for testing the correctness of the implementation in the scope: block. The contents are inserted as the body of the implicitly generated method
demo_test.PersonTest.test_bodyMassIndex()that defines for
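Since Figure 7 is not reproduced here, the following Python sketch only suggests what a method body and its adjacent test might amount to once compiled. The formula (weight divided by height squared, in kilograms and metres), the constructor signature, and the sample values are assumptions; only the names bodyMassIndex and test_bodyMassIndex come from the text.

```python
import unittest


class Person:
    """Simplified stand-in with the two real<64> fields as Python floats."""

    def __init__(self, name, height, weight):
        self._name = name
        self._height = height  # assumed to be in metres
        self._weight = weight  # assumed to be in kilograms

    def bodyMassIndex(self):
        # Assumed implementation: the standard BMI formula.
        return self._weight / (self._height ** 2)


class PersonTest(unittest.TestCase):
    def setUp(self):
        self.person = Person("Bob", 2.0, 80.0)

    def test_bodyMassIndex(self):
        # Plays the role of the test: block attached to the method.
        self.assertAlmostEqual(20.0, self.person.bodyMassIndex())


def run_bmi_tests():
    """Run the test class and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PersonTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The point of the sketch is the pairing: one method on the user class, one correspondingly named test method on the test class.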
Invoking the Test Harness
The meta compiler can invoke the test harness in a number of ways, and at varying levels.
To test the
demo.Person class in its entirety, one uses:
% metac --metalang=oopl --baselang=py test demo.Person
In the above command there was no reference to a
demo.Person, along with the meta
language and base language, are sufficient to identify the test code to
By default, invoking the test harness will use the Bazel testing infrastructure, which provides for hermeticity, reproducibility, and numerous other benefits.
metac has been configured with default
oopl, and using -b instead of --baselang, the above
command can be shortened to:
% metac -b py test demo.Person
metac has been configured with a default baselang
python, the above becomes:
% metac test demo.Person
The default action, if a sub-command is not specified when metac
is invoked, parses the command line arguments into three sets: 1) a list of
source files (based on suffixes), 2) a list of legal fully-qualified namespaces,
classes or methods, and 3) a list of args not matching either of the previous two
categories. Because of this, the default action can perform both compilation of Meta
source files and testing of fully-qualified symbols at the same time. One can thus
invoke the test harness with:
% metac demo.Person
and one can compile code and then invoke the test harness with:
% metac person4.meta demo.Person
% metac person4 demo.Person
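The three-way split performed by the default action can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the actual metac logic: the suffix list contains only the suffixes seen in this manual, known symbols are passed in as a plain set, and suffix-less source names such as person4 are not handled.

```python
# Source suffixes that appear in this manual; the real list may be longer.
SOURCE_SUFFIXES = (".meta", ".metapy", ".metacc")


def classify(args, known_symbols):
    """Split args into (source files, fully-qualified symbols, other args)."""
    sources, symbols, others = [], [], []
    for arg in args:
        if arg.endswith(SOURCE_SUFFIXES):
            sources.append(arg)      # compiled first
        elif arg in known_symbols:
            symbols.append(arg)      # then tested
        else:
            others.append(arg)       # passed through untouched
    return sources, symbols, others


known = {"demo", "demo.Person", "demo.Person.bodyMassIndex"}
print(classify(["person4.meta", "demo.Person"], known))
# prints (['person4.meta'], ['demo.Person'], [])
```

Because sources and symbols end up in separate lists, a single invocation can compile the former and then test the latter.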
It is sometimes useful to be able to rely on the native unit testing
infrastructure of a particular base-language rather than on the bazel
testing infrastructure, which can be accessed by
% metac -r demo.Person
To obtain verbose output (in which every method tested, along with the time
taken to test the method and any output produced by the method are also
printed), one can specify
% metac -rv demo.Person
If one wants to test only a single method, one simply provides the fully qualified name of the method:
% metac -rv demo.Person.bodyMassIndex
And if one wants to test an entire namespace, one simply provides the fully qualified name of the namespace:
% metac -rv demo
Example 4: Entry Points, Command Line Interfaces, and Flags
Software isn't useful unless there is a means of getting it running.
Java provides (and available in all base languages,
Figure 8: person5.meta (adding entry points)
Figure 8 extends our running example by introducing a command construct
- Meta Library (what is the set of classes we need?)
- Statement-level constructs