It is unlikely that you will ever design a general-use language like Fortran, C++, ML, or Prolog, but if you become a professional software engineer or software architect, it is highly likely that you will specialize in some problem area, like telecommunications, aviation, banking, or gaming. You will become expert at building systems in your problem area, and you may well design a notation, a language, that helps you and others write solutions to their problems in this area. In this case, you are a designer of a domain-specific language that is used to build domain-specific software architectures.
This chapter introduces these concepts, applying the concepts already learned.
Specific problem areas, e.g., flight-control or telecommunications or banking, use specific hardware architectures, and they also use specific software architectures. When a new model of airplane is designed, the hardware architecture (the airplane hardware, including its computers) is based on a hardware design that has succeeded in the past. (It is too great a risk to start from scratch; it is also better to build on and refine what is known to work.) The software architecture for the plane will also be based on some standard layout that is known to work well.
Software architects use a collection of concepts and techniques to build a new system in an established problem area; this collection is called a domain-specific software architecture. It contains concepts, languages, tools, and methods:
Strictly speaking, the reference requirements form part of the terminology of the application domain, but they are often specially identified because they are treated specially in the implementation methodology.
A language that is designed for discussing problems, behaviors, and solutions within a problem domain is a domain-specific language (DSL). The language's vocabulary includes concepts and notation from the problem domain: the nouns, pronouns, adjectives, verbs, and adverbs of the language. The language lets participants (people and machines) discuss and implement solutions within the domain. Because its vocabulary is limited to the specific domain, a DSL is often useless for discussing and solving problems outside the domain.
A DSL uses concepts familiar to people who work in the domain. For example, say that you must install an alarm system in an office building, and you must discuss the setup with the building's owners and employees. A DSL for sensor-alarm networks would discuss
Compare the lingo of sensor alarms to the lingo you write in Java --- in the latter, the ``nouns'' are numbers, strings, and variables that name numbers, strings, commands, etc. The ``adjectives'' are data types and other declaration modifiers. The ``operations'' are arithmetic, data-structure indexing, function call, etc. ``Actions'' are commands, or groups of commands. ``Events'' can be GUI events or a call to a method to start execution. Java is a ``DSL'' for computation on (arrays of) numbers and strings.
Now, take your view of Java and think again of a domain like sensor alarms, or network protocols, or music composition or gaming. What are the domains of interest, their elements, the features, operations, actions, and events? How many of these are directly implemented (that is, ``understood'') by a computer? How many must be ``refined'' to be computational (understood by a computer)?
A simplistic view is that a DSL is a kind of ``restricted'' programming
language, much like Legal English is ``restricted'' English:
RANGE OF PROGRAMMING LANGUAGES:
More General                                        More Specific
<--------------------------------------------------------->
GPL (C, Java, ML, etc.)           DSL                   GUI
(Here, we treat a GUI as a ``programming language'' because
a user ``programs'' with mouse drags and clicks.)
But a DSL is not exactly a restricted GPL: Consider the relationship between English and algebra --- the notation of the latter is definitely not a mere restriction of the former. It is the restriction of the problem area that is significant.
A DSL lets stakeholders (the participants in a systems project) communicate their ideas (needs, suggestions, solutions, implementations, orders). The DSL is a modelling language that lets us discuss models, structures, and behaviors specialized to a problem domain like telecommunications, banking, transportation, gaming, algebra, typesetting, etc.
If the computer is a ``participant,'' that is, we can use the DSL to tell the computer what to do --- we can program the computer --- then the DSL is a domain-specific programming language.
In terms of domain-specific software architecture, someone might ask you,
For example, ``It would be nice to have a little language to help us lay out the wiring and sensors for a building's alarm system.''
Or, ``It would be nice to have a little language to help us write the protocols for how the movement detectors send/receive messages to/from the other devices and people in the network.''
This kind of wishful thinking can lead to a domain-specific programming language, in particular, a top-down domain-specific programming language.
Excel is a good example --- it has a nice mix of graphical and textual notations that falls within the grasp of a user who has rudimentary math and problem-solving skills. The user can lay out a spreadsheet that computes totals of rows and columns. (If you have never used Excel or a spreadsheet tool, you can read a tutorial here.)
Another good example is Yacc --- a user writes the BNF rules of a language, and this gives the information the Yacc compiler uses to build a parser matching the BNF rules. (Here is a tutorial.)
Another good example is SQL --- without knowing the internal layout of a database, a user can write a query in terms of sets and set operations, and the SQL interpreter executes the query as if it were a data-structures lookup algorithm. (There is a demo and tutorial here.)
HTML lets a user format a web document in terms of paragraphs, lists, and fonts and hides from the user the details of spacing, line breaks, and painting text and pictures. (The document you are reading is ``programmed'' in HTML that I wrote by hand. Use the View/PageSource menu option on your web browser to see the program. Here is a tutorial.)
A top-down DSPL can be used ``standalone'' without a connected general-purpose programming language.
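To make the SQL example above concrete, here is a minimal sketch that uses Python's standard sqlite3 module as the SQL interpreter. The sensor table, its columns, and its rows are hypothetical and were invented only for this illustration. The point is that the query names the set of rows wanted and says nothing about how the search is done.

```python
import sqlite3

# Hypothetical data: a table of sensors in a building alarm system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor (id INTEGER PRIMARY KEY, room TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO sensor (room, kind) VALUES (?, ?)",
    [("lobby", "motion"), ("lobby", "smoke"), ("vault", "motion")],
)

# The query describes a set of rows; the interpreter decides how to find them.
rows = conn.execute(
    "SELECT room FROM sensor WHERE kind = 'motion' ORDER BY room"
).fetchall()
motion_rooms = [room for (room,) in rows]
print(motion_rooms)
conn.close()
```

The Python code here is only scaffolding around the interpreter; the ``program'' in the domain language is the SELECT statement.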
Upon first hearing, it sounds like top-down DSPLs are wonderful --- a language for just my problem that lets me say exactly what I want! --- but in reality, a top-down DSPL is a ``mixed bag'' of assets and drawbacks:
There is another variant of DSPL, one that is used by an experienced programmer who wants to ``extend'' a general-purpose language with concepts specific to the problem domain. In this situation, the DSPL is programmed in the general-purpose, host language as a library of data structures and operations.
This is called a bottom-up DSPL. DSPLs for GUI-building are typically bottom-up DSPLs, because a GUI by itself is useless --- the GUI must be connected to components that do something. Here are some examples of GUI libraries (GUI DSPLs) that are mated to general-purpose languages:
Because of the popularity of GUIs, the DSPL for GUI building is ``married'' to its host language in Visual Basic and Visual C++. (But in the beginning, there were only Basic and C++.)
Experienced programmers naturally become bottom-up DSPL designers, because over time they will assemble a library of custom data structures, functions, and control structures (templates) that they use over and over again to solve problems in the same domain. Eventually, the programs they write consist mostly of the components of their library and less and less of new code, finally reaching the point where the underlying, host programming language acts merely as ``glue'' for connecting the components selected from the library.
At this point, the host language with its library is essentially a bottom-up DSPL, because the library has become ``more important'' to problem solving than the host language itself. What has happened is this:
In practice, bottom-up DSPLs often evolve starting from a dynamic-data-structure host language, like Scheme or Perl or Python or Ruby, because (i) there is less keyword notation to clutter the programs and (ii) there are few preset limitations on combinations of data and control structure. The custom-written library for the problem area is written in the host language, and it is oriented towards encoding ``domain-concepts-as-code'' (nouns as data structures, verbs as operations, adjectives and adverbs as control-structure templates) so that the scenarios discussed in the problem area's DSL can be readily converted into programs using the DSPL. Experienced programmers have good instincts for coding domain concepts as code and saving them as libraries. It is almost a matter of survival --- there is never enough time to build a new solution completely from scratch!
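Here is a minimal sketch of ``domain-concepts-as-code'' for the sensor-alarm domain, written as a small Python library. Every name in it (Sensor, AlarmNetwork, install, sound_alarm, whenever_triggered) is hypothetical and chosen only for illustration; a real library would grow out of the patterns found in real programs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Sensor:
    """Noun: a device installed in a room."""
    name: str
    room: str
    triggered: bool = False


@dataclass
class AlarmNetwork:
    """Noun: a collection of sensors plus a log of what happened."""
    sensors: List[Sensor] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def install(self, name: str, room: str) -> Sensor:
        """Verb: add a sensor to the network."""
        sensor = Sensor(name, room)
        self.sensors.append(sensor)
        return sensor

    def sound_alarm(self, room: str) -> None:
        """Verb: record that an alarm was raised for a room."""
        self.log.append(f"ALARM in {room}")

    def whenever_triggered(self, action: Callable[[str], None]) -> None:
        """Control-structure template: apply an action to each triggered sensor's room."""
        for sensor in self.sensors:
            if sensor.triggered:
                action(sensor.room)


# A "program" in the bottom-up DSPL: mostly library calls, little host-language glue.
net = AlarmNetwork()
net.install("m1", "lobby")
vault_sensor = net.install("m2", "vault")
vault_sensor.triggered = True
net.whenever_triggered(net.sound_alarm)
print(net.log)
```

The last six lines read close to a scenario stated in the domain's own vocabulary, which is the sign that the library is starting to act as a language.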
A bottom-up DSPL has its strengths and weaknesses also:
Try to develop the ``levels'' or ``layers'' of the language, which suggest the syntax (BNF grammar) of the DSL. It can be helpful to draw ad-hoc parse trees/operator trees for the scenarios that you wish to state in the domain. The trees will help you see the levels, the nouns, adjectives, verbs (operations), actions (sentences), and events (calls). If the scenarios read as if they are understandable by a computer (that is, computable), then this can be the basis of a top-down DSPL.
The parts of the scenarios that are not computable are probably part of a DSL modelling/design language, and you must consider how to refine those parts into something that is computable. (That's why methodology is a key part of Domain-Specific Software Architecture.)
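As an illustration of how scenarios suggest a grammar, and how a computable scenario can become a tiny top-down DSPL, here is a sketch for one-line alarm rules. The grammar, the vocabulary (motion, smoke, sound alarm, call guard), and the function names are all assumptions made up for this example.

```python
from typing import List, NamedTuple

# Hypothetical grammar for the rule language:
#   rule   ::= "when" event "in" room "then" action
#   event  ::= "motion" | "smoke"
#   action ::= "sound" "alarm" | "call" "guard"
EVENTS = {"motion", "smoke"}
ACTIONS = {("sound", "alarm"), ("call", "guard")}


class Rule(NamedTuple):
    """Operator tree for one rule: an event, a room (noun), and an action."""
    event: str
    room: str
    action: str


def parse_rule(text: str) -> Rule:
    """Parse one sentence of the rule language, rejecting anything outside the grammar."""
    words = text.lower().split()
    if (len(words) != 7 or words[0] != "when"
            or words[2] != "in" or words[4] != "then"):
        raise ValueError(f"not a rule: {text!r}")
    event, room, action = words[1], words[3], (words[5], words[6])
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {' '.join(action)!r}")
    return Rule(event, room, " ".join(action))


def run(rules: List[Rule], event: str, room: str) -> List[str]:
    """Interpret the rules: list the actions fired by an event in a room."""
    return [r.action for r in rules if r.event == event and r.room == room]


rules = [
    parse_rule("when motion in lobby then sound alarm"),
    parse_rule("when smoke in lobby then call guard"),
]
print(run(rules, "smoke", "lobby"))
```

A sentence the parser rejects is a hint that the scenario uses a concept that has not yet been refined into something computable.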
Here are questions to ask:
Most non-experts have difficulty with nested structures of any form --- sequencing is about the most they can handle. Repetition is also a challenge.
Data structures must be kept simple, resembling real-life, physical structures (a sheet of graph paper, a chest of drawers, a filing cabinet, a dictionary) or resembling the structures that are basic to the domain in which the users are immersed (hallways, buildings, wiring bundles...).
Keep this directive in mind, always:
If the DSPL's users see notation and concepts that lie outside their domain, the users will get lost. (That's why non-programmers don't use Java as a DSPL for spreadsheet building!)
In the case of HTML, the primitive data are text, which is embedded within the program (document), and images, whose filenames are embedded. Both operations and data structures are stated with bracket pairs of the form <op ...> and </op>, where ... can hold arguments to the operation/structure. (Examples: <p> ... </p>, <font face=NAME> ... </font>, <ul> <li> ... </li> </ul>, etc.) There are standard environments for laying out pages, and a user can formulate and name a custom environment.
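The bracket-pair structure described above can be made visible with a short sketch that uses Python's standard html.parser module to print the nesting of a document as an indented outline. The class name OutlinePrinter and the sample markup are hypothetical, and the sketch assumes every opened tag is explicitly closed.

```python
from html.parser import HTMLParser


class OutlinePrinter(HTMLParser):
    """Record each opening tag and each piece of text, indented by nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # An opening bracket starts a new, deeper level.
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # The matching closing bracket ends the level.
        self.depth -= 1

    def handle_data(self, data):
        # Text is the primitive data embedded in the document.
        text = data.strip()
        if text:
            self.lines.append("  " * self.depth + repr(text))


parser = OutlinePrinter()
parser.feed("<ul><li>door sensor</li><li>smoke sensor</li></ul>")
parser.close()
outline = parser.lines
print("\n".join(outline))
```

The outline is in effect the operator tree of the document: tags are the operations and structures, and the text is their data.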
How should we implement the document language? Should it be an extension of an existing programming language? Should it be a new language? Should it be expressed within a GUI so that programs are typed with the mouse plus keystrokes or as a text file, typed solely with a keyboard?
Document languages tend to be static --- they don't need variables, identifiers, loops, and components as much as other languages, and their users tend to be less sophisticated at composition and planning ahead than software people, so a top-down, stand-alone language is sensible here. The language might be GUI-based (e.g., Word) or text-based (e.g., PS or LaTeX or Roff) or both (HTML).
Note how Perl, in effect, adds a regular-expression language bottom-up to a scripting language. Python's mail and internet libraries work this way as well. Java's libraries are less successful in this respect.
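As a small illustration of a pattern language embedded in a host language, here is a sketch that uses Python's standard re module. The log-line format and the names LOG_LINE and parse_log_line are hypothetical. The pattern string is a program in the regular-expression language, and Python serves only as the glue around it.

```python
import re

# Hypothetical log format from an alarm controller: "HH:MM kind room".
LOG_LINE = re.compile(
    r"(?P<time>\d{2}:\d{2}) (?P<kind>motion|smoke) (?P<room>\w+)"
)


def parse_log_line(line):
    """Return the named fields of a log line, or None if it does not match."""
    match = LOG_LINE.fullmatch(line)
    return match.groupdict() if match else None


print(parse_log_line("09:15 motion lobby"))
print(parse_log_line("not a log line"))
```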
Experienced programmers naturally become bottom-up DSPL designers: Over time, they will assemble a library of custom data structures, functions, and control structures (templates) that they use over and over again to solve problems in the same domain. Eventually, the programs they write consist mostly of the components of their library and less and less of new code, finally reaching the point where the underlying, host programming language acts merely as ``glue'' for connecting the components selected from the library.
At this point, the host language with its library is essentially a bottom-up DSPL, because the library has become ``more important'' to problem solving than the host language itself. What has happened is this:
So, if you work in the same problem domain on a regular basis, you will do yourself well to consciously organize your work into a library that can evolve into a bottom-up DSPL. To plan ahead for this, you should review the list of items at the beginning of the chapter that define domain-specific software architecture. Then, for each item, ask yourself, ``How much of this affects what I have to program?'' and more specifically, ``What patterns of programming will I do for this?''
When you build systems in the domain, you will need data structures and control structures and component structures that match the parts of the software architecture. You want to have a good match between each concept, idea, and technique in the domain and a piece of code, so that you can convert the concepts within a domain-specific software architecture to their software coding somewhat mechanically.
Take note of patterns of coding you do --- what repeats over and over in the coding of data structures, control structures, and component structures. Are the patterns important? Do they match the concepts in the domain-specific software architecture?
These exercises should give you strong suggestions about what you should include in the library you develop. Once you start your library, force yourself to use it as much as possible (instead of writing from scratch something similar) and improve it so that it can be used in the future. Your goal should be to do programming mostly by selecting code from your library and ``gluing'' it together with the underlying host language.
If you are having good success at developing and using your library, then you might consider how to make the library as ``stand-alone'' as possible, that is, writing program skeletons and saving them in the library, so that you write a new program by selecting the appropriate skeleton and inserting into it the data structures, operations, and control structures that you also have saved in the library. This means you use the underlying host language only as an ``interface language'' to contact external components that you have not written and you use the host language only as a ``trap door'' to write code that must ``escape'' from the problem domain area.
Implicit in the previous paragraph are the notions of framework and product line. A framework is a library that has one or more program skeletons that one starts with to build a system. The programmer selects the appropriate skeleton and fills in the gaps with a mix of other library code and code custom written in the host language. GUI libraries are almost always organized as frameworks.
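A minimal sketch of a framework in this sense follows: the skeleton fixes the control flow, and the programmer fills the gaps. The names MonitorSkeleton, is_alarming, describe, and SmokeMonitor, and the 0.5 threshold, are hypothetical and serve only to illustrate the idea.

```python
from abc import ABC, abstractmethod


class MonitorSkeleton(ABC):
    """Program skeleton: the loop is fixed; the two abstract methods are the gaps."""

    def run(self, readings):
        alerts = []
        for reading in readings:
            if self.is_alarming(reading):
                alerts.append(self.describe(reading))
        return alerts

    @abstractmethod
    def is_alarming(self, reading):
        """Gap: decide whether a reading should raise an alert."""

    @abstractmethod
    def describe(self, reading):
        """Gap: turn an alarming reading into a message."""


class SmokeMonitor(MonitorSkeleton):
    """One system built from the skeleton by filling in the gaps."""

    def is_alarming(self, reading):
        return reading > 0.5  # hypothetical threshold

    def describe(self, reading):
        return f"smoke level {reading}"


alerts = SmokeMonitor().run([0.1, 0.7, 0.9])
print(alerts)
```

Note that the skeleton calls the programmer's code rather than the other way around, which is typical of how GUI libraries are organized as well.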
A product line is a family of programs that are structured almost exactly the same and differ only in minor customizations. (Consider a product line of cars all based on the same engine-chassis assembly. A software example is Notepad/Wordpad/Word, which are all based on the same structure but have different degrees of customizations for font choices, formatting, and file formats.) A product line of software is built from a library when one and the same skeleton is used for all software products, and the gaps in the skeleton are all filled by other library components.
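The idea of a product line can be sketched in a few lines: one skeleton, with every gap filled by a component chosen from the library. All the names below (make_editor, plain, title_case, as_txt, as_md) and the two ``products'' are hypothetical and far simpler than any real editor.

```python
# Library components: formatters and savers (hypothetical stand-ins).
def plain(text):
    return text


def title_case(text):
    return text.title()


def as_txt(text):
    return ("txt", text)


def as_md(text):
    return ("md", "# " + text)


def make_editor(formatter, saver):
    """The one skeleton shared by every product: format the text, then save it."""
    def save_document(text):
        return saver(formatter(text))
    return save_document


# Two products of the same line; they differ only in the components plugged in.
basic_editor = make_editor(plain, as_txt)
fancy_editor = make_editor(title_case, as_md)

print(basic_editor("hello world"))
print(fancy_editor("hello world"))
```

No host-language code is written per product here; each product is just a choice of library components, which is what distinguishes a product line from ordinary use of a framework.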