Last Updated: 2019-05-16
Authors:
Jens von Pilgrim, Jakub Siberski, Mark-Oliver Reiser,
Torsten Krämer, Ákos Kitta, Sebastian Zarnekow, Lorenzo Bettini, Jörg Reichert, Kristian Duske, Marcus Mews, Minh Quang Tran, Luca Beurer-Kellner
This document contains the N4JS Design and Implementation documentation.
1. Introduction
This document describes design aspects of the N4JS compiler and IDE. It relies on the following N4JS related specifications:
-
N4JS Language Specification [N4JSSpec]
1.1. Notation
We reuse the notation specified in [N4JSSpec].
1.2. IDE Components
The N4JS and N4JSIDE components are organized via features. The following features with their included plugins are defined (the common prefix "org.eclipse.n4js" is omitted from the plugin names):
| Feature | Plugin | Description |
|---|---|---|
| org.eclipse.n4js.lang.sdk | | N4JS core language with parser, validation etc. |
| | org.eclipse.n4js | Xtext grammar with generator and custom code for N4JS, scoping (and binding) implementation, basic validation (and Xsemantics type system). |
| | doc | (in doc folder) General documentation (including web page) written in AsciiDoc |
| | external.libraries | Support for N4JS libraries shipped with the IDE, i.e. core N4JS library and mangelhaft. |
| | ui | UI components for N4JS, e.g., proposal provider, labels, outline, quickfixes. |
| | jsdoc | Parser and model for JSDoc |
| | external.libraries.update | Not included in feature. Updates the external library plugin. |
| org.eclipse.n4js.ts.sdk | | Type system |
| | ts | Xtext grammar with generator and custom code for type expressions and standalone type definitions. |
| | ts.model | Xcore based types model with helper classes etc. |
| | ts.ui | Xtext generated UI for the type system; not really used, as TS files are not editable by users. |
| org.eclipse.n4js.unicode.sdk | | |
| | common.unicode | Xtext grammar with generator and custom code used by all other grammars for proper Unicode support. |
| org.eclipse.n4js.regex.sdk | | Regular expression grammar and UI, used by N4JS grammar and UI |
| | regex | Xtext grammar with generator and custom code used by N4JS grammars for regular expressions. |
| | regex.ui | UI components for regular expressions, e.g., proposal provider, labels, outline, quickfixes. |
| org.eclipse.n4js.sdk | | This feature defines the N4JSIDE. It contains core UI plugins and includes (almost all) other features! |
| | environments | Utility plugin, registers n4scheme for EMF proxy resolution. |
| | model | Xcore based N4JS model with helper classes etc. |
| | product | N4JSIDE main application. |
| | releng.utils | (in releng folder) Contains utility classes only used for building the system, e.g., tools for generating the ANTLR based parser with extended features. |
| | utils | General utilities |
| | utils.ui | General UI utilities |
| org.eclipse.n4js.compiler.sdk | | Compilers and transpilers |
| | generator.common | Not included in feature, logically associated. |
| | generator.headless | N4JS headless generator (i.e. command line compiler). |
| | transpiler | Generic transpiler infrastructure |
| | transpiler.es | Transpiler to compile to EcmaScript |
| org.eclipse.n4js.json.sdk | | N4JS JSON |
| | json | Xtext grammar with generator and custom code for extensible JSON language support. Used in N4JS for the project description in terms of a |
| | json.ui | UI components for extensible JSON language support, e.g., proposal provider, labels, outline. |
| | json.model | Not included in feature, logically associated. Xcore based model for the JSON language. |
| org.eclipse.n4js.semver.sdk | | Semantic version string support. |
| | semver | Parser and tools for semantic version strings. |
| | semver.ui | UI tools for semantic version strings. |
| | semver.model | Not included in feature, logically associated. Xcore model of semantic version strings. |
| org.eclipse.n4js.runner.sdk | | Runners for executing N4JS or JavaScript code |
| | runner | Generic interfaces and helpers for runners, i.e. JavaScript engines executing N4JS or JavaScript code. |
| | runner.chrome | Runner for executing N4JS or JavaScript with Chrome. |
| | runner.chrome.ui | UI classes for launching the Chrome runner via org.eclipse.debug.ui |
| | runner.nodejs | Runner for executing N4JS or JavaScript with Node.js. |
| | runner.nodejs.ui | UI classes for launching the Node.js runner via org.eclipse.debug.ui |
| | runner.ui | Generic interfaces for configuring N4JS runners via the debug UI. |
| org.eclipse.n4js.tester.sdk | | Runners and UI for tests (via mangelhaft). |
| | tester | Generic interfaces and helpers for testers, i.e. JavaScript engines executing N4JS tests (using mangelhaft). |
| | tester.nodejs | Tester based on the Node.js runner for executing mangelhaft tests with Node.js |
| | tester.nodejs.ui | UI for showing test results. |
| | tester.ui | Configuration of tests via the debug UI. |
| org.eclipse.n4js.jsdoc2spec.sdk | | JSDoc 2 Specification |
| | jsdoc2spec | Exporter to generate API documentation with specification test awareness |
| | jsdoc2spec.ui | UI for the API doc exporter |
| org.eclipse.n4js.xpect.sdk | | |
| | xpect | Xpect test methods. |
| | xpect.ui | UI for running Xpect test methods from the N4JSIDE (for creating bug reports). |
| org.eclipse.n4js.smith.sdk | | Feature for internal N4JS IDE plugins only intended for development (for example, the AST Graph view). |
| | smith | Non-UI classes for tools for smiths, that is, tools for developers of the N4JS IDE such as AST views etc. |
| | smith.ui | UI classes for tools for smiths, that is, tools for developers of the N4JS IDE such as AST views etc. |
| org.eclipse.n4js.tests.helper.sdk | | Test helpers. |
| org.eclipse.n4js.dependencies.sdk | | Collection of all external non-UI dependencies, used for local mirroring of update sites. |
| org.eclipse.n4js.dependencies.ui.sdk | | Collection of all external UI dependencies, used for local mirroring of update sites. |
| uncategorized plugins | | |
| | flowgraphs | Control and data flow graph model and computer. |
| Fragments | | Not associated to features, only listed here for completeness |
| | utils.logging | Fragment only; configuration for loggers, in particular for the product and for the tests |
1.2.1. Naming Conventions
In the above sections, tests were omitted. We use the following naming conventions (by example) for test and test helper projects:

| Project | Description |
|---|---|
| project | - |
| project.tests | tests for project, is a fragment |
| project.tests.helper | helper classes used ONLY by tests |
| project.tests.performance | performance tests |
| project.tests.integration | integration tests |
| project.ui | - |
| project.ui.tests | tests for the UI project, fragment of project.ui |
| project.ui.tests.helper | helper classes used ONLY by tests |
| project.ui.tests.performance | - |
| tests.helper | general test helper |
| ui.tests.helper | general UI test helper |
| project.xpect.tests | Xpect tests for the project; despite dependencies to UI, they can be executed as plain JUnit tests |
| project.xpect.ui.tests | Xpect tests for the project; need to be executed as Eclipse plugin tests |

Due to the Maven build, test bundles are located in the subfolder tests (incl. helpers), implementation bundles in plugins, and release engineering related bundles in releng.
2. Eclipse Setup
2.1. Contribute
Eclipse developers who want to develop N4JS itself should use the Oomph Eclipse installer. The N4JS project is listed under "Eclipse Projects/N4JS". This setup installs the correct Eclipse version, creates a new workspace, and clones all projects into it (for details see below).
2.1.1. Eclipse Installer
The recommended way to install the Eclipse IDE and set up the workspace is to use the Eclipse Installer. The installer can be downloaded from https://wiki.eclipse.org/Eclipse_Installer
Run the installer and apply the following steps:
-
change to "Advanced Mode" via the menu (upper-right corner) (no need to move the installer)
-
select a product, e.g. "Eclipse IDE for Eclipse Committers" with product version "Oxygen"
-
double-click the entry Eclipse Projects/N4JS so that it is shown in the catalog view below
-
on the next page, configure paths accordingly. You only have to configure the installation and workspace folder.
-
start installation
The installer will then guide you through the rest of the installation. All plug-ins are downloaded and configured automatically, and so is the workspace, including cloning the git repository.
2.1.2. Manual IDE Configuration
For a manual install, clone the code and import all top-level projects from the docs, features, plugins, releng, testhelpers, and tests folders. Activate the target platform contained in the releng/org.eclipse.n4js.targetplatform/ project.
The N4JS IDE is developed with Eclipse Oxygen (4.7) or later; since the system is based on Eclipse anyway, it is almost impossible to use another IDE to develop Eclipse plugins. The list of required plugins includes:
-
Xtext/Xtend 2.10.0
-
Xcore 1.4.0
-
Xsemantics 1.10.0
-
Xpect 0.1
It is important to use the latest version of Xtext and the corresponding service release of Xcore. You will find the latest version numbers and plugins used in the target platform definition at https://github.com/eclipse/n4js/blob/master/releng/org.eclipse.n4js.targetplatform/org.eclipse.n4js.targetplatform.target
3. Release Engineering
3.1. Nightly build on Eclipse infrastructure
The N4JS IDE, the headless n4jsc.jar, and the N4JS update site are built on the Eclipse Common Build Infrastructure (CBI). For this purpose the N4JS project uses a dedicated Jenkins instance, referred to as a "Jenkins Instance Per Project" (JIPP) in the Eclipse CBI documentation. At this time, the N4JS project’s JIPP is running on the "old" infrastructure, not yet using Docker. This will be migrated at a later point in time.
The N4JS JIPP is available at: https://ci.eclipse.org/n4js/
The nightly build performs the following main steps:
-
compile the N4JS implementation,
-
build the n4jsc.jar, the IDE products for MacOS, Windows, Linux, and the update site,
-
run tests,
-
sign the IDE product for macOS and package it in a .dmg file,
-
deploy the n4jsc.jar, the IDE products and the update sites to the Eclipse download server (i.e. download.eclipse.org),
-
move all artifacts older than 7 days from download.eclipse.org to archive.eclipse.org.
Details about all the above steps can be found in the Jenkinsfile eclipse-nightly.jenkinsfile, located in
the root folder of the N4JS source repository on GitHub.
The most accurate documentation for our JIPP can be found at https://wiki.eclipse.org/IT_Infrastructure_Doc. Note that many other documents do not apply to our JIPP, at the moment, as they refer to the new infrastructure, e.g. https://wiki.eclipse.org/CBI and https://wiki.eclipse.org/Jenkins.
3.2. Build the N4JS IDE from command line
Ensure you have
-
Java 8
-
Maven 3.2.x and
-
Node.js 6
installed on your system.
Clone the repository
git clone https://github.com/Eclipse/n4js.git
Change to the n4js folder:
cd n4js
Run the Maven build:
mvn clean verify
You may have to increase the memory for Maven via export MAVEN_OPTS="-Xmx2048m" (Unix) or set MAVEN_OPTS="-Xmx2048m" (Windows).
3.2.1. Publish maven-tooling org.eclipse.n4js.releng.util
For extending the N4JS language in a different project, the org.eclipse.n4js.releng.util module needs to be published as a Maven plugin. You can deploy this SNAPSHOT artifact to a local folder by providing the property local-snapshot-deploy-folder, pointing to an absolute path in the local file system:

mvn clean deploy -Dlocal-snapshot-deploy-folder=/var/lib/my/folder/local-mvn-deploy-repository

The existence of local-snapshot-deploy-folder will trigger a profile enabling the deploy goal for the project org.eclipse.n4js.releng.util.
3.2.2. Generation of Eclipse help for spec and design document
The HTML pages for the N4JSSpec and N4JSDesign documents are generated from the Asciidoc sources in the projects org.eclipse.n4js.spec and org.eclipse.n4js.design by Asciispec.
Figure The process of creating Eclipse help for N4JSSpec shows the generation process for the N4JSSpec document. The process for N4JSDesign is the same. The following explains the diagram.
- Asciispec is used to compile the N4JSSpec Asciidoc sources into a single large N4JSSpec.html file which contains all the chapters. The custom parameter -a eclipse-help-mode indicates that special header and footer styles as well as a special CSS style should be used (i.e. no table of contents menu, no download links etc.). Here, we use the possibility provided by Asciidoctor to configure header/footer as well as CSS style via the parameters :docinfodir: and :stylesheet:.
- Our custom tool Chunker splits N4JSSpec.html into multiple chunked HTML files, each of which corresponds to either the index file or a chapter.
- Another custom tool EclipseHelpTOCGenerator takes the Docbook file N4JSSpec.xml and generates an XML file describing the table of contents (TOC) in the Eclipse format. This TOC file references the chunked HTML files above.
3.3. Updating Eclipse, etc.
For updating the N4JS IDE to a new version of Eclipse, EMF, Xtext, etc. follow these steps:
- Create a new branch.
- Bump the versions of all dependencies mentioned in file N4JS.setup:
  - Update all labels that refer to the version of the Oomph setup (search for "label!" to find them).
  - Choose a new Eclipse version and define this in N4JS.setup.
  - For those other dependencies that come with Eclipse (e.g. EMF, Xtext), find out which version matches the chosen Eclipse version and define that version in N4JS.setup. Tip: use the contents list of the SimRel you are targeting, e.g. https://projects.eclipse.org/releases/2019-03
  - For those other dependencies that are available via the Eclipse Orbit, find out which version is the latest version available in the Orbit and define that version in N4JS.setup. Tip: the contents of the Eclipse Orbit can be found at https://download.eclipse.org/tools/orbit/downloads/ (choose the correct link for the chosen Eclipse version!)
  - For all remaining dependencies (i.e. unrelated to Eclipse and not in Orbit), choose a version to use and define it in N4JS.setup.
- Check the Require-Bundle sections of the MANIFEST.MF files by searching for related bundle names or for ;bundle-version=":
  - There should be at most one version constraint for a specific bundle.
  - There should be no version constraints on our own bundles (i.e. org.eclipse.n4js…).
- Review the parent pom.xml files, i.e. releng/org.eclipse.n4js.parent/pom.xml:
  - Update the property xtext-version.
  - Check all other *-version properties and update them where needed.
- Update the target platform file org.eclipse.n4js.targetplatform.target using Oomph’s auto-generation:
  - Start the Eclipse Installer.
  - Update the Eclipse Installer (using the button with the turning arrows).
  - On the second page, add the N4JS.setup file from your branch to the Eclipse Installer, using a GitHub raw(!) URL: https://raw.githubusercontent.com/eclipse/n4js/BRANCH_NAME/releng/org.eclipse.n4js.targetplatform/N4JS.setup
  - Oomph a new development environment with this setup.
  - In the new Eclipse workspace created by Oomph, the target platform file should have uncommitted changes:
    - carefully review these changes to be sure they make sense, and then
    - commit & push those changes to your branch.
- Thoroughly test the new versions:
  - Run builds.
  - Oomph another N4JS development environment with the Eclipse Installer. This time, after Oomphing is completed, the target platform file should no longer have any uncommitted changes.
- All the above steps need to be performed in the n4js-n4 repository accordingly (e.g. file N4JS-N4.setup). In addition:
  - Update the file redirected_com.enfore.n4js.targetplatform.target by copying the content of the file com.enfore.n4js.targetplatform.target and then changing all repository location URLs to point to the n4ide-p2-mirror.
  - Bump the hard-coded Eclipse version in the Dockerfile n4js-n4/jenkins/create-p2-mirror/Dockerfile!
  - Run the Jenkins build job Management/Create_P2_Mirror[_From_Branch].
4. Tips & Tricks, Dos & Don’ts
In this chapter we collect some coding hints and guidelines on how to properly use the APIs of Eclipse, EMF, Xtext and other dependencies we are using, as well as our own utilities and helpers.
This chapter is only about coding; add information on things like Eclipse setup or Maven/Jenkins to one of the preceding chapters. Similarly, this chapter is intended to provide just a quick overview, check-list and reminder; add detailed information and diagrams to one of the succeeding chapters.
4.1. Naming
- The internal handling of N4JS project names is non-trivial (due to the support for npm scopes); see the API documentation of ProjectDescriptionUtils#isProjectNameWithScope(String) for a detailed overview. In short:
  - IN4JSProject#getProjectName() and IProject#getName() return different values!
  - Avoid using the Eclipse project name, i.e. the return value of IProject#getName(), as far as possible (only use it in UI code when actually dealing with what is shown in the Eclipse UI).
  - The last segment of a URI or path pointing to an N4JS project is not always the project name; use the utilities in ProjectDescriptionUtils instead, e.g. #deriveN4JSProjectNameFromURI()! (However, given a URI or path pointing to a file inside an N4JS project, you can use its last segment to obtain the file name.)
4.2. Logging
In many situations a developer needs some kind of logging. When in need, follow these rules:
- Use org.apache.log4j.Logger for logging. Other logging utilities (like the built-in Java logger) are not configured.
- Do not use System.out or System.err for logging. It is OK to use them for debugging purposes, but those calls should never be merged to master (with the exception of the headless compiler, which uses them explicitly).
- There is a central logger configuration in org.eclipse.n4js.utils.logging (and com.enfore.n4js.utils.logging) that should be used:
  - log4j.xml is used for production
  - log4j_tests.xml is used when running tests
- In Eclipse run configurations the logger has to be set properly, e.g. log4j.configuration=file:${workspace_loc:com.enfore.n4js.utils.logging/log4j_tests.xml}
- In Maven configurations the logger has to be set separately, e.g. -Dlog4j.configuration="file:${basedir}/../../plugins/com.enfore.n4js.utils.logging/log4j_tests.xml"
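For orientation, a log4j 1.x configuration file generally looks like the following sketch. This is a generic, hypothetical example of the file format only; it is not the actual content of log4j.xml or log4j_tests.xml in the repository:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- Write all log output to the console -->
  <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d{HH:mm:ss} %-5p %c{1} - %m%n"/>
    </layout>
  </appender>
  <root>
    <priority value="info"/>
    <appender-ref ref="console"/>
  </root>
</log4j:configuration>
```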
4.3. Cancellation Handling
At various occasions, Xtext provides an instance of class CancelIndicator to allow our code to handle the cancellation of long-running tasks.
Some things to keep in mind:
- Whenever a CancelIndicator is available, any code that might not return immediately should implement proper cancellation handling (as explained in the next items).
- Most importantly: reacting to a cancellation by returning early from a method is an anti-pattern that leads to problems (client code might continue to work on a canceled and thus invalid state); instead: throw an OperationCanceledException!
- Don’t use CancelIndicator#isCanceled() for cancellation handling, except in certain special cases. A valid exception case might be during logging, to show a message like "operation was canceled".
- Instead, inject the Xtext service called OperationCanceledManager and invoke its method #checkCanceled(), passing in the cancel indicator (this method is null-safe; it will throw an OperationCanceledException in case a cancellation has occurred). Don’t directly create and throw an OperationCanceledException yourself.
- Use the other methods provided by OperationCanceledManager when appropriate (see the code of that class for details).
- In try/catch blocks, when catching exceptions of a super type of OperationCanceledException, be sure not to suppress cancellation exceptions. For example:

  // Java code
  @Inject
  private OperationCanceledManager operationCanceledManager;

  /** Returns true on success, false otherwise. */
  public boolean doSomething(CancelIndicator ci) {
      try {
          // do something that might be canceled
          return true;
      } catch (Exception e) {
          operationCanceledManager.propagateIfCancelException(e); // <- IMPORTANT!
          return false;
      }
  }

  Try/finally blocks, on the other hand, do not need any special handling.
- A cancel indicator can also be stored in the rule environment (see RuleEnvironmentExtensions#addCancelIndicator()). This means:
  - If you create a rule environment completely from scratch and you have a cancel indicator at hand, add it to the rule environment via RuleEnvironmentExtensions#addCancelIndicator() (not required when using RuleEnvironmentExtensions#wrap() for deriving a rule environment from an existing one).
  - If you have a rule environment available, be sure to use its cancel indicator in long-running operations, i.e. with code like:

    // Xtend code
    import static extension org.eclipse.n4js.typesystem.utils.RuleEnvironmentExtensions.*

    class C {
        @Inject
        private OperationCanceledManager operationCanceledManager;

        def void doSomething() {
            for (a : aLotOfStuff) {
                operationCanceledManager.checkCanceled(G.cancelIndicator);
                // main work ...
            }
        }
    }
4.4. Caching
- Caching of external libraries (implemented in ExternalProjectMappings):
  - update only using EclipseExternalLibraryWorkspace#updateState()
  - always mind that the diff between the current state and the cached state is necessary information for cleaning the dependencies of removed npms
  - see EclipseExternalIndexSynchronizer#synchronizeNpms() for the implementation
  - updating also happens when external root locations change (see ExternalIndexUpdater)
- Caching of user workspace projects (implemented in MultiCleartriggerCache):
  - caches only some project information and should be refactored along with Core, Model and EclipseBasedN4JSWorkspace
4.5. Miscellaneous
- Resource load states: when an N4JS/N4JSD file is loaded, a certain sequence of processing is triggered (parsing, linking, validation, etc.) and thus an N4JSResource transitions through a sequence of "load states". For details, see N4JS Resource Load States.
5. Parser
Some of the concepts described here were presented at EclipseCon 2013 and XtextCon 2014. Note that the material presented in the linked videos may be outdated.
5.1. Overview
The parser is created from an Xtext grammar. Actually, there are several grammars used as shown in Figure CD Grammars. These grammars and the parsers generated from them are described more closely in the following sections.
5.2. N4JS Parser
One of the trickiest parts of JavaScript is the parsing, because there is a conceptual mismatch between the ANTLR runtime and the specified grammar. Another challenge is the disambiguation of regular expressions and binary operations. Both features require significant customization of the generated parser (see figure below).
5.3. Parser Generation Post-Processing
The ANTLR grammar that is generated by Xtext is post-processed to inject custom code into the grammar file before it is passed to the ANTLR tool. This is required in particular due to ASI (Automated Semicolon Insertion), but for some other reasons as well.
Actually, there are several injections:
-
Due to Xtext restrictions, the generated ANTLR grammar file (*.g) is modified. This means that some additional actions are added and some rules are rewritten.
-
Due to ANTLR restrictions, the generated ANTLR Java parser (*.java) is modified. This means that some generated rules are slightly modified to match certain requirements.
-
Due to Java restrictions, the generated Java parser needs to be post-processed in order to reduce the size of certain methods, since a single method must not exceed the JVM’s 64 KB limit. This is implemented by means of an MWE fragment, activated after the other post-processing steps are done.
The first two steps are handled by AntlrGeneratorWithCustomKeywordLogic, which is configured with additional helpers in GenerateN4JS.mwe2. The customized classes which modify the code generation are all part of the releng.utils bundle.
5.3.1. Automatic Semicolon Insertion
The EcmaScript specification mandates that valid implementations automatically insert a semicolon as a statement delimiter if it is missing and the input would otherwise become invalid. This is known as ASI. It implies that a valid parser has to mimic this behavior in order to parse real-world JavaScript code. ASI is implemented by two different means: code injected into the grammar, and a customized error recovery strategy.
The parser’s error recovery strategy is customized so that it attempts to insert a semicolon where one was expected. Both strategies have to work hand in hand in order to consume all sorts of legal JavaScript code.
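The effect the parser has to mimic can be reproduced in any JavaScript engine. The following snippet is purely illustrative (it is not taken from the N4JS code base):

```javascript
// ASI inserts a semicolon directly after `return` because of the
// line break, so the object literal below is never returned (it
// becomes an unreachable block statement):
function broken() {
  return
  { value: 42 };
}

// Keeping the opening brace on the same line avoids the insertion:
function works() {
  return {
    value: 42
  };
}

console.log(broken());      // undefined
console.log(works().value); // 42
```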
5.3.1.1. Injected code in the Antlr grammar file
Under certain circumstances, the parser has to actively promote a token to become a semicolon even though it may be a syntactically a closing brace or line break. This has to happen before that token is consumed thus the rules for return statements, continue statements and break statements are enhanced to actively promote these tokens to semicolons.
The same rule is applied to promote line breaks between an expression and a possible postfix operator ++ or --. At this location the line break is always treated as a semicolon even though the operator could be validly consumed and produce a postfix expression.
In both cases, the method promoteEOL() is used to move a token that may serve as an automatically injected semicolon from the so called hidden token channel to the semantic channel. The hidden tokens are usually not handled by the parser explicitly thus they are semantically invisible (therefore the term hidden token). Nevertheless, they can be put on the semantic channel explicitly to make them recognizable. That’s implemented in the EOL promotion. The offending tokens include the hidden line terminators and multi-line comments that include line breaks. Furthermore, closing braces (right curly brackets) are included in the set of offending tokens as well as explicit semicolons.
5.3.1.2. Customized error recovery
Since the EOL promotion does not work well with Antlr prediction mode, another customization complements that feature. As soon as an invalid token sequence is attempted to be parsed and a missing semicolon would make that sequence valid, an offending token is sought and moved to the semantic channel. This is implemented in the custom recovery strategy.
5.3.2. Async and No line terminator allowed here Handling
There is no way of directly defining "no line terminator allowed here" in the grammar. This is required not only for ASI, but also for async. It requires not only a special rule (reusing some rules from ASI), but also special error recovery, since the token ’async’ may be rejected (by the manually enriched rule), which is of course unexpected behavior for generated source code.
5.3.3. Regular Expression
The ANTLR parsing process can basically be divided into three steps. First of all, the file contents have to be read from disk. This includes the proper decoding of bytes to characters. The second step is the lexing or tokenizing of the character stream. A token is basically a typed region in the stream, that is, a triple of token type, offset and length. The last step is the parsing of these tokens. The result is a semantic model that is associated with a node tree. All information necessary to validate the model can be deduced from these two interlinked representations.
Since the default semantics and control flow of Antlr generated parsers do not really fit the requirements of a fully working JavaScript parser, some customizations are necessary. Regular expression literals in JavaScript cannot be syntactically disambiguated from division operations without contextual information. Nevertheless, the spec clearly describes where a regular expression may appear and where it is prohibited. Unfortunately, it is not possible to implement these rules in the lexer alone, since it does not have enough contextual information. Therefore, the parser has been enhanced to establish a communication channel with the lexer: it announces when it expects a regular expression rather than a binary operation.
This required a reworking of the Antlr internals. Instead of a completely pre-populated TokenStream, the parser works on a lazy implementation that only reads as many characters as possible without requiring a disambiguation between regular expression literals and divide operators. Only after the parser has read these buffered tokens and potentially announced that it expects a regular expression is another batch of characters processed by the lexer, until the next ambiguous situation occurs. This is fundamentally different from the default behavior of Antlr.
Several customized classes are involved in allowing for this lexer-parser communication.
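For illustration, the ambiguity is easy to trigger in plain JavaScript (this snippet is not from the N4JS code base):

```javascript
const a = 10, b = 5, g = 2;

// After an identifier, `/` must be lexed as the divide operator:
const quotient = a / b / g; // (10 / 5) / 2

// After `=`, a `/` can only start a regular expression literal,
// so `/b/g` is lexed as one regex token with the global flag:
const pattern = /b/g;

console.log(quotient);       // 1
console.log(pattern.source); // "b"
```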
5.3.4. Unicode
Unicode support in JavaScript includes the possibility to use unicode escape sequences in identifiers, string literals and regular expression literals. Another issue in this field is the specification of valid identifiers in JavaScript. They are described by means of unicode character classes, which have to be enumerated in the terminal rules in order to correctly accept valid and reject invalid JS identifiers.
For that purpose, a small code generator is used to define the terminal fragments for certain unicode categories. The UnicodeGrammarGenerator basically iterates all characters from Character.MIN_VALUE to Character.MAX_VALUE and adds them as alternatives to the respective terminal fragments, e.g. UNICODE_DIGIT_FRAGMENT.
The real terminal rules are defined as a composition of these generated fragments. Besides that, each character in an identifier, in a string literal or in a regular expression literal may be represented by its unicode escape value, e.g. \u0060. These escape sequences are handled and validated by the IValueConverter for the corresponding terminal rules.
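For illustration, escape sequences in identifiers behave the same way in plain JavaScript; \u0061 unescapes to the letter a:

```javascript
// The declaration and the usage below refer to the very same
// identifier `abc`, because \u0061 is the escape for 'a':
var \u0061bc = 1;
console.log(abc); // 1
```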
The second piece of the puzzle are the unicode escape sequences that may be used in keywords. This issue is covered by the UnicodeKeywordHelper, which replaces the default terminal representation in the generated Antlr grammar by more elaborate alternatives. The keyword if is not only lexed as ’if’ but as shown in the Terminal if listing.
If :
    ( 'i' | '\\' 'u' '0' '0' '6' '9' )
    ( 'f' | '\\' 'u' '0' '0' '6' '6' );
5.3.5. Literals
Template literals are also to be handled specially, see TemplateLiteralDisambiguationInjector for details.
5.4. Modifiers
On the AST side, all modifiers are included in a single enumeration N4Modifier. In the types model however, the individual modifiers are mapped to two different enumerations of access modifiers (namely TypeAccessModifier and MemberAccessModifier) and a number of boolean properties (in case of non-access modifiers such as abstract or static). This mapping is done by the types builder, mostly by calling methods in class ModifierUtils.
The grammar allows the use of certain modifiers in many places that are actually invalid. Rules where a certain modifier may appear in the AST are implemented in method isValid(EClass,N4Modifier) in class ModifierUtils and checked via several validations in N4JSSyntaxValidator. Those validations also check for a particular order of modifiers that is not enforced by the grammar.
See API documentation of enumeration N4Modifier in file N4JS.xcore and the utility class ModifierUtils for more details.
5.5. Conflict Resolutions
5.5.1. Reserved Keywords vs. Identifier Names
Keywords and identifiers are distinguished by the lexer, which has no means to decide upfront whether a certain keyword is actually used as a keyword or as an identifier name in a given context. This limitation is idiomatically overcome by a data type rule for valid identifiers. This data type rule enumerates all keywords which may be used as identifiers, plus the plain IDENTIFIER terminal rule, as seen in the Keywords as Identifier listing.
N4JSIdentifier: IDENTIFIER
| 'get'
| 'set'
...
;
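Plain JavaScript shows why such a rule is necessary: get and set are contextual keywords only and remain legal identifier names (illustrative snippet, not from the grammar):

```javascript
// `get` and `set` are keywords only inside accessor definitions;
// everywhere else they are ordinary identifiers:
var get = function (x) { return x + 1; };
var set = 10;
console.log(get(set)); // 11
```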
5.5.2. Operators and Generics
The ambiguity between shift operators and nested generics arises from the fact that the Antlr lexer works upfront, without any contextual information. When implemented naively, the grammar would be broken, since a token sequence a>>b can either be part of List<List<a>> b or part of a binary operation int c = a >> b. Therefore the shift operator may not be defined as a single token but has to be composed from individual characters (see Shift Operator listing).
ShiftOperator:
'>' '>' '>'?
| '<' '<'
;
6. Type System
6.1. Type Model and Grammar
The type model is used to define actual types and their relations (meta-model is defined by means of Xcore in file Types.xcore)
and also references to types (meta-model in TypeRefs.xcore). The type model is built via the N4JSTypesBuilder when a resource
is loaded and processed, and most type related tasks work only on the type model. Some types that are (internally) available
in N4JS are not defined in N4JS, but instead in a special, internal type language not available to N4JS developers, called N4TS
and defined in file Types.xtext.
The types are referenced by AST elements; vice versa the AST elements can be referenced from the types (see SyntaxRelatedTElement).
This backward reference is a simple reference to an EObject.
6.1.1. Type Model Overview
The following figure, Types and Type References, shows the classes of the type model and their inheritance relations, both the actual type definitions as defined in Types.xcore and the type references defined in TypeRefs.xcore. The most important type reference is the ParameterizedTypeRef; it is used for most user-defined references, for both parameterized and non-parameterized references. In the latter case, the list of type arguments is empty.
Most types are self-explanatory. TypeDefs is the container element used in N4TS. Note that not all types and properties of types are available in N4JS – some can only be used in the N4TS language or be inferred by the type system for internal purposes. The following types need further explanation:
-
TObjectPrototype: Metatype for defining built-in object types such as Object or Date; only available in N4TS. -
VirtualBaseType: This type is not available in N4JS. It is used to define common properties provided by all types of a certain metatype. E.g., it is used for defining some properties shared by all enumerations (this was the reason for introducing this type).
We distinguish four kinds of types, as summarized in Kind Of Types. Role is an internal construct distinguishing the different kinds of users who may define the given kind of type. The language column refers to the language used to specify the type, which is N4JS, N4JSD, or N4TS.
| Kind | Language | Role | Remark |
|---|---|---|---|
| user | N4JS | developer | User defined types, such as declared classes or functions. These types are to be explicitly defined or imported in the code. |
| library | N4JSD | developer | Type declarations only, comparable to C header files, without implementation. Used for defining the API of 3rd party libraries. These type definitions are to be explicitly defined or imported in the code. |
| builtin | N4TS | smith | Built-in ECMAScript objects interpreted as types. E.g., |
| primitive | N4TS | smith | Primitive ECMAScript (and N4JS) types, such as |
6.1.2. Built-in and Primitive Types
The built-in and primitive types are not defined by the user, i.e. N4JS programmer. Instead, they are defined in special
internal files using the internal N4TS language: builtin_js.n4ts, builtin_n4.n4ts, primitives_js.n4ts, primitives_n4.n4ts.
6.1.3. Type Model DSL
For defining built-in types and for tests, a special DSL called N4TS is provided by means of an Xtext grammar and generated
tools. The syntax is similar to N4JS, but of course without method or function bodies, i.e. without any statements or expressions.
The grammar file is found in Types.xtext.
The following list documents some differences to N4JS:
-
access modifiers directly support
Internal, so no annotations are needed (nor supported) here. -
besides N4 classifiers such as classes, the following classifiers can be defined:
-
object: defines classes derived from Object (predefined object types). Special feature: indexed defines what type is returned in case of index access
-
virtualBase: virtual base types
-
-
primitive: defines primitive types (number, string etc.). Special features: indexed defines what type is returned in case of index access;
autoboxedType defines to which type the primitive can be auto-boxed;
assignmnentCompatible defines to which type the primitive is assignment compatible
-
types
any, null, void, undefined – special types.
Annotations are not supported in the types DSL.
6.2. Type System Implementation
The bulk of the type system’s functionality is implemented in packages org.eclipse.n4js.typesystem[.constraints|.utils].
Client code, e.g. in validations, should only access the type system through the facade class N4JSTypeSystem.
Each of the main type system functions, called "judgments", are implemented in one of the concrete subclasses of
base class AbstractJudgment. Internally, the type system is using a constraint solver for various purposes;
entry point for this functionality is class InferenceContext. All these classes are a good entry point into
the code base, for investigating further details.
Some type information is cached (e.g., the type of an expression in the AST) and the above facade will take care to read from the cache instead of re-computing the information every time, as far as possible. This type cache is being filled in a dedicated phase during loading and processing of an N4JS resource; see Type Inference of AST and N4JS Resource Load States for details.
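The fill-once, read-many behavior of such a type cache can be sketched like this. This is an illustrative TypeScript sketch; the class and method names are invented and do not reflect the real cache used by the N4JS facade:

```typescript
// Sketch: a cache that is filled in a dedicated phase (post-processing)
// and only read from afterwards, never recomputed on demand.
class TypeCache<K, V> {
  private cache = new Map<K, V>();
  private filled = false;

  // Invoked once during the post-processing phase.
  fill(entries: Iterable<[K, V]>): void {
    for (const [k, v] of entries) this.cache.set(k, v);
    this.filled = true;
  }

  // Client code (e.g. validations) only reads; it never triggers inference.
  get(key: K): V {
    if (!this.filled) throw new Error("post-processing has not run yet");
    const v = this.cache.get(key);
    if (v === undefined) throw new Error("node was not typed");
    return v;
  }
}
```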
6.3. Type Inference of AST
Most judgments provided by the facade N4JSTypeSystem and implemented by subclasses of AbstractJudgment are used
ad-hoc whenever client code requires the information they provide. This is applied, in particular, to judgments
-
subtype -
substTypeVariables -
upperBound/lowerBound
For judgment type, however (currently this applies only to judgment type, not to expectedType, but this may be
changed in a future, further refactoring), the processing is very different: we make
sure that the entire AST, i.e. all typable nodes of the AST, will be typed up-front in a single step, which
takes place during the post-processing step of an N4JSResource (see N4JS Resource Load States), which
also has a couple other responsibilities. By triggering post-processing when client code invokes the type judgment
for the first time for some random AST node (at the latest; usually it is triggered earlier), we make sure that
this sequence is always followed.
The remainder of this section will explain this single-step typing of the entire AST in detail.
6.3.1. Background
Originally, the N4JS type system could be called with any EObject at any given time, without any knowledge of the
context. While this looked flexible in the beginning, it caused severe problems solving some inference cases, e.g.,
the rule environment had to be prepared from the outside and recursion problems could occur, and it was also
inefficient because some things had to be recalculated over and over again (although caching helped).
It is better to do type inferencing (that is, computing the type of expressions in general) in a controlled manner. That is, instead of randomly computing the type of an expression in the AST, it is better to traverse the AST in a well-defined traversal order. That way, it is guaranteed that certain other nodes have been visited and, if not, either some special handling can kick in or an error can be reported. This could even work with XSemantics and the declarative style of the rules. The difference is that by traversing the AST in a controlled manner, the rules can make certain assumptions about the content of the rule-environment, such as that it always contains information about type variable bindings and that it always contains information about expected types etc.
In that scenario, all AST nodes are visited and all types (and expected types) are calculated up-front. Validation
and other parts then do not need to actually compute types (by calling the actual, Xsemantics-generated type system);
instead, at that time all types have already been calculated and can simply be retrieved from the cache (this is
taken care of by the type system facade N4JSTypeSystem).
This also affects scoping, since all cross-references have to be resolved in this type computation step. However, even for scoping this has positive effects: E.g., the receiver type in property access expressions is always visited before visiting the selector. Thus it is not necessary to re-calculate the receiver type in order to perform scoping for the selector.
The above refactoring was done in summer 2015. After this refactoring, we are still using Xsemantics to compute the
types, i.e. the type judgement in Xsemantics was largely kept as before. However, the type judgment is invoked
in a controlled traversal order for each typable AST node in largely one step (controlled by ASTProcessor and TypeProcessor).
The upshot of this one-step type inference is that once it is completed, the type for every typable AST node is known. Instead of storing this information in a separate model, this information will be stored and persisted in the type model directly, as well as in transient fields of the AST [1]. Currently, this applies only to types, not expected types; the inference of expected types could / should be integrated into the one-step inference as part of a future, further refactoring.
| Before | After |
|---|---|
| ad-hoc type inference (when client code needs the type information) | up-front type inference (once for entire AST; later only reading from cache) |
| started anywhere | starts with root, i.e. the |
| Xsemantics rules traverse the AST at will, uncontrolled | well-defined, controlled traversal order |
| lazy, on-demand resolution of | pro-active resolution of |
6.3.2. Triggering
The up-front type inference of the entire AST is part of the post-processing of every N4JSResource and is thus triggered when post-processing is triggered. This happens when
-
someone directly calls
#performPostProcessing()on an N4JSResource -
someone directly calls
#resolveAllLazyCrossReferences()on an N4JSResource, -
EMF automatically resolves the first proxy, i.e. someone calls an EMF-generated getter for a value that is a proxy,
-
someone asks for a type for the first time, i.e. calls
N4JSTypeSystem#type(), -
…
Usually this happens after the types builder has been run with preLinking==false and before validation takes place.
For details, see classes PostProcessingAwareResource and N4JSPostProcessor.
6.3.3. Traversal Order
The traversal order during post-processing is a bit tricky, as some things need to be done in a top-down order (only few cases, for now [2]), others in a bottom-up order (e.g. the main typing of AST nodes), and there is a third case in which several AST nodes are processed together (constraint-based type inference).
Figure Order in which AST nodes are being processed during post-processing provides an example of an AST and shows in which order the nodes are processed. Green numbers represent top-down processing, red numbers represent bottom-up processing, and blue numbers represent the processing of the surrounding yellow nodes in a single step.
In the code, this is controlled by class ASTProcessor. The two main processing methods are
-
#processNode_preChildren(), which will be invoked for all AST nodes in a top-down order (so top-down processing should be put here), -
#processNode_postChildren(), which will be invoked for all AST nodes in a bottom-up order (so bottom-up processing should be put here).
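The two hooks can be sketched as a simple recursive walk. This is illustrative TypeScript; only the hook names are taken from ASTProcessor, the node structure is invented:

```typescript
interface Node { name: string; children: Node[]; }

const preOrder: string[] = [];
const postOrder: string[] = [];

function process(node: Node): void {
  // Corresponds to #processNode_preChildren(): top-down work goes here.
  preOrder.push(node.name);
  for (const child of node.children) process(child);
  // Corresponds to #processNode_postChildren(): bottom-up work goes here.
  postOrder.push(node.name);
}

const ast: Node = {
  name: "root",
  children: [
    { name: "a", children: [{ name: "a1", children: [] }] },
    { name: "b", children: [] },
  ],
};
process(ast);
console.log(preOrder);  // [ 'root', 'a', 'a1', 'b' ]
console.log(postOrder); // [ 'a1', 'a', 'b', 'root' ]
```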
The common processing of groups of adjacent yellow nodes (represented in the figure by the two yellow/brown
triangles) is achieved by PolyProcessor telling the TypeProcessor to
-
ignore certain nodes (all yellow nodes) and
-
invoke method
invoke method PolyProcessor#inferType() for the root yellow node in each group (only the root!). For details, see the two methods #isResponsibleFor() and #isEntryPoint() in PolyProcessor.
6.3.4. Cross-References
While typing the entire AST, cross-references need special care. Three cases of cross-references need to be distinguished:
| backward reference | cross-reference within the same file to an AST node that was already processed |
| forward reference | cross-reference within the same file to an AST node that was not yet processed |
| references to other files | |
Note that for references to an ancestor (upward references) or successor (downward references) within an AST, the classification as a forward or backward reference depends on whether we are in top-down or bottom-up processing. Figure Upward Downward illustrates this: the left and right side show the same AST, but on the left side we assume top-down processing whereas on the right we assume bottom-up processing. On both sides, backward references are shown in green ink (because they are unproblematic and always legal) and forward references are shown in red ink. Now, looking at the two arrows pointing from a node to its parent, we see that such a reference is classified as a backward reference on the left side (i.e. the top-down case) but as a forward reference on the right side (i.e. the bottom-up case). Conversely, an arrow from a node to its child is classified as a forward reference on the left side and as a backward reference on the right side. Arrows across subtrees, however, are classified in the same way on both sides (see the horizontal arrows at the bottom).
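The dependence of the classification on the traversal order can be illustrated with a small sketch. This is hypothetical TypeScript; the node names and the helper function are invented for illustration:

```typescript
type RefKind = "backward" | "forward";

// A reference is "backward" iff the target node was processed before the
// source node, relative to a given processing order.
function classifyRef(order: string[], from: string, to: string): RefKind {
  return order.indexOf(to) < order.indexOf(from) ? "backward" : "forward";
}

// The same AST edge is classified differently under the two traversals:
const topDown = ["parent", "childA", "childB"];  // parents processed first
const bottomUp = ["childA", "childB", "parent"]; // children processed first

// childA -> parent: backward under top-down, forward under bottom-up.
console.log(classifyRef(topDown, "childA", "parent"));  // backward
console.log(classifyRef(bottomUp, "childA", "parent")); // forward
```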
6.3.5. Function/Accessor Bodies
An important exception to the basic traversal order shown in Figure Order in which AST nodes are being processed during post-processing is that the processing of the bodies of all functions (including methods) and field accessors is postponed until the end of processing. This avoids unnecessary cycles during type inference caused by a function’s body making use of the function itself or of other declarations on the same level as the containing function. For example, the following code relies on this:
let x = f();
function f(): X {
if(x) {
// XPECT noerrors --> "any is not a subtype of X." at "x"
return x;
}
return new X();
}
Similar situation using fields and methods:
class C {
d = new D();
mc() {
// XPECT noerrors --> "any is not a subtype of D." at "this.d"
let tmp: D = this.d;
}
}
class D {
md() {
new C().mc();
}
}
For details of this special handling of function bodies, see method ASTProcessor#isPostponedNode(EObject) and field
ASTMetaInfoCache#postponedSubTrees and the code using it. For further investigation, change isPostponedNode() to always
return false and debug with the two examples above (which will then show the incorrect errors mentioned in the XPECT
comments) or run tests to find more cases that require this handling.
6.3.6. Poly Expressions
Polymorphic expressions, or poly expressions for short, are expressions for which the actual type depends on the expected type and/or the expected type depends on the actual type. They require constraint-based type inference because the dependency between the actual and expected type can introduce dependency cycles between the types of several AST nodes which are best broken up by using a constraint-based approach. This is particularly true when several poly expressions are nested. Therefore, poly expressions are inferred neither in top-down nor in bottom-up order, but all together by solving a single constraint system.
Only a few types of expressions can be polymorphic; they are called poly candidates: array literals, object literals, call expressions, and function expressions. The following rules tell whether a poly candidate is actually poly:
-
ArrayLiteral— always poly (because their type cannot be declared explicitly). -
ObjectLiteral— if one or more properties do not have a declared type. -
CallExpression— if generic & not parameterized. -
FunctionExpression— if return type or type of one or more formal parameters is undeclared.
This is a simplified overview of these rules; for details see method #isPoly(Expression) in AbstractPolyProcessor.
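The simplified rules above can be restated as a predicate. This is a hypothetical TypeScript sketch; the data shapes are invented, and the real check is method #isPoly(Expression) in AbstractPolyProcessor:

```typescript
// Invented data shapes modeling the four poly candidates.
type Expr =
  | { kind: "ArrayLiteral" }
  | { kind: "ObjectLiteral"; propsWithDeclaredType: number; props: number }
  | { kind: "CallExpression"; generic: boolean; parameterized: boolean }
  | { kind: "FunctionExpression"; returnDeclared: boolean; undeclaredParams: number };

function isPoly(e: Expr): boolean {
  switch (e.kind) {
    case "ArrayLiteral":
      return true; // type can never be declared explicitly
    case "ObjectLiteral":
      return e.propsWithDeclaredType < e.props; // some property lacks a declared type
    case "CallExpression":
      return e.generic && !e.parameterized;
    case "FunctionExpression":
      return !e.returnDeclared || e.undeclaredParams > 0;
  }
}
```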
The main logic for inferring the type of poly expressions is found in method #inferType() in class PolyProcessor.
It is important to note that this method will only be called for root poly expressions (see above). In short, the basic
approach is to create a new, empty InferenceContext, i.e. constraint system, add inference variables and constraints for
the root poly expression and all its nested poly expressions, solve the constraint system and use the types in the solution
as the types of the root and nested poly expressions. For more details see method #inferType() in class PolyProcessor.
So, this means that nested poly expressions do not introduce a new constraint system but instead simply extend their parent poly’s constraint system by adding additional inference variables and constraints. But not every nested expression that is poly is a nested poly expression in that sense! Sometimes, a new constraint system has to be introduced. For example:
-
child poly expressions that appear as argument to a call expression are nested poly expressions (i.e. inferred in same constraint system as the parent call expression),
-
child poly expressions that appear as target of a call expression are not nested poly expressions and a new constraint system has to be introduced for them.
For details see method #isRootPoly() in AbstractPolyProcessor and its clients.
6.3.7. Constraint Solver
The simple constraint solver used by the N4JS type system, mainly for the inference of poly expressions, is implemented
by class InferenceContext and the other classes in package org.eclipse.n4js.typesystem.constraints.
The constraint solving algorithm used here is largely modeled after the one defined in The Java Language Specification 8,
Chapter 18, but was adjusted in a number of ways, esp. by removing functionality not required for N4JS (e.g. primitive types,
method overloading) and adding support for specific N4JS language features (e.g. union types, structural typing).
For details see the API documentation of class InferenceContext.
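To give an intuition for inference variables and bounds, here is a toy sketch. This is a drastic simplification, not the real InferenceContext (which follows JLS 8, Chapter 18); the names and the naive resolution strategy are assumptions:

```typescript
// Toy constraint solver: collect one lower and/or upper bound per inference
// variable, then instantiate each variable to its lower bound if present,
// otherwise to its upper bound. (The real solver handles bound sets,
// dependencies, union types, structural typing, etc.)
class ToyInferenceContext {
  private lower = new Map<string, string>();
  private upper = new Map<string, string>();

  addConstraint(variable: string, bound: string, kind: "lower" | "upper"): void {
    (kind === "lower" ? this.lower : this.upper).set(variable, bound);
  }

  solve(variables: string[]): Map<string, string> {
    const solution = new Map<string, string>();
    for (const v of variables) {
      const t = this.lower.get(v) ?? this.upper.get(v);
      if (t === undefined) throw new Error(`unbound inference variable ${v}`);
      solution.set(v, t);
    }
    return solution;
  }
}
```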
6.4. Structural Typing
Structural typing as an optional subtyping mode in N4JS is implemented in StructuralTypingComputer, activated depending on
the value of property typingStrategy in ParameterizedTypeRef and its subclasses.
7. Type Index
7.1. Design Rationale
We use a separate types model to represent types, see Type Model and Grammar. Declared elements (e.g., classes)
in N4JS are parsed and a new types model instance is derived from them. All type references (of the N4JS AST)
are then bound to these type instances and not to the N4JS declaration. However, there exists a relation between a type
and its declaration. The type instances (which are EObjects) are added to the resource of the N4JS file as part of
the public interface of the resource. This public interface is represented by a TModule. While the actual source code
is the first element of a resource (index 0), the module is stored at index 1. It contains the derived type information,
the information about exported variables and functions as well as information about the project and vendor. The Xtext
serializer ignores the additional element. In addition, the complete type instances are stored in the user data section of
the IEObjectDescription of the TModule. Since the user data only allows strings to be stored, the EObjects are serialized
(within a virtual resource). When a reference is then bound to a type, the type can be directly recreated (deserialized)
from the user data. The deserialized EObject is then added to the appropriate resource. It is not necessary to load the
complete file just to refer to a type from that file.
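The core idea, serialized type information in the index so that a reference can be resolved without loading the full source file, can be sketched as follows. JSON stands in for the EMF serialization, and all names are illustrative:

```typescript
// A TModule-like public interface of a resource (invented shape).
interface TModuleLike { qualifiedName: string; exportedTypes: string[]; }

// The index maps qualified names to user data (a serialized string,
// mirroring the IEObjectDescription user-data restriction to strings).
const index = new Map<string, string>();

function publish(module: TModuleLike): void {
  index.set(module.qualifiedName, JSON.stringify(module));
}

function resolve(qualifiedName: string): TModuleLike {
  const data = index.get(qualifiedName);
  if (data === undefined) throw new Error(`not in index: ${qualifiedName}`);
  // Deserialize the stored module instead of parsing the source file.
  return JSON.parse(data);
}
```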
The design relies on two key features of Xtext:
-
Besides the parsed text (i.e., the AST), other elements can be stored in a resource, which are then ignored by the Xtext serializer, while still being properly contained in an EMF resource.
-
The
DerivedStateAwareResource allows for post-processing steps when reading a resource. This enables a custom class, here N4JSDerivedStateComputer, to create the types models (TClass, TRole and so on) from the parsed N4ClassDeclaration, N4RoleDeclaration and so on.
7.1.1. Getting the Xtext Index (IResourceDescriptions) Content
An instance of the IResourceDescriptions can be acquired from the resource description provider. Just like all services
in Xtext, this can be injected into client code as well. The resource descriptions accept a non-null resource set.
The resource set argument is mandatory to provide the index with the proper state. We differentiate between three
different states. The first one is the persisted one: the builder state is a resource description as well, and
it provides content that is based on the persisted state of the files (in our case the modules and the package.json file)
on the file system. The second one is the live scoped index, which is modification and dirty-state aware. Namely, when using
this resource descriptions instance, an object description will be searched in the resource set itself first, then in the dirty
editors' index, and finally among the persisted ones. The third index is the named builder scoped one. This index should not be
used by common client code, since it is designed and implemented for the Xtext builder itself.
A resource set and hence the index can be acquired from the N4JS core, in such cases an optional N4JS project can be specified. The N4JS project argument is used to retrieve the underlying Eclipse resource project (if present) and get the resource set from the resource set provider. This is completely ignored when the application is running in headless mode and Eclipse resource projects are not available. It is also important to note that the resource set is always configured to load only the persisted states.
When the Eclipse platform is running, the workspace is available and all N4JS projects are backed by an Eclipse resource
project. With the Eclipse resource project the resource sets can be initialized properly via the resource set initializer
implementations. This mechanism is used to get the global objects (such as console) and the built-in types (such as string,
number) into the resource set via the corresponding resource set adapters. In the headless case a special resource set
implementation is used: ResourceSetWithBuiltInScheme. This implementation is responsible for initializing the globals and the
built-in types into itself.
7.2. Design Overview
Type Model With Xtext Index shows a simplified UML class diagram with the involved classes. In the figure, a class (defined as N4ClassExpression in the AST, together with its type TClass) is used as a sample declared type; roles or enums are handled similarly.
In the Eclipse project build, the N4JSResourceDescriptionManager (resp. the logic of its super class) is called by the
N4JSGenerateImmediatelyBuilderState to get the resource description for a resource. The resource description manager loads
the resource to create / update the resource descriptions. Loading an Xtext resource means that it is parsed again.
All cross references are handled here only by the lazy linker so that the node model will contain an unresolved proxy
for all cross references.
After the resource is loaded, a derived state is installed on the resource. For this, the N4JSDerivedStateComputer will
be called. It takes the parse result (= the EObject tree in the first slot of the resource) and navigates through these objects
to create type trees for each encountered exportable object; these are stored in the exported TModule of the resource.
Create Type From AST, a snippet of Type Model with Xtext Index,
shows only the classes involved when creating the types from the resource.
Types have to be derived for the following elements, as they are exportable: N4ClassDeclaration, N4RoleDeclaration, N4InterfaceDeclaration,
N4EnumDeclaration, ExportedVariableDeclaration and FunctionDeclaration.
After loading and initializing the resources, all cross references in the resources are resolved. For this, the
ErrorAwareLinkingService is used. This class will in turn call the N4JSScopeProvider to first try to do scoping locally,
but eventually also delegate to the global scope provider to find linked elements outside the current resource. This
is done, e.g., for every import statement inside the N4JS resource.
To determine the global scope, all visible containers for this resource are calculated. For this, the project description
(= the loaded package.json file) is used to determine which folders of the current project should be included when looking for
N4JS resources. Also, all referenced projects and their resources are added to the visible containers. For these containers,
N4JSGlobalScopeProvider builds up a container scope. This container scope will be an N4JSTypesScope instance.
For the actual linked element in the resource to be resolved, its fully qualified name is used. This name is calculated
using the IQualifiedNameConverter. We bound a custom class named N4JSQualifiedNameConverter, which converts each / inside the
qualified name to a dot, so e.g. my/module/MyFileName is converted to my.module.MyFileName. Note that the initial qualified name
was derived from the node model.
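The conversion performed by N4JSQualifiedNameConverter can be illustrated with a one-liner. This standalone function is a sketch, not the real class:

```typescript
// Convert a slash-delimited module path to a dot-delimited qualified name,
// as described above for N4JSQualifiedNameConverter.
function toQualifiedName(modulePath: string): string {
  return modulePath.split("/").join(".");
}

console.log(toQualifiedName("my/module/MyFileName")); // my.module.MyFileName
```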
With this qualified name N4JSTypeScope.getSingleElement is called. This method does the actual resolving of the cross reference.
For this the URI of the cross reference is used to determine the linked resource.
There are now three cases:
-
If the resource which contains the linked EObject is already loaded the EObject description found for the URI is returned
-
If the resource is not loaded but the first slot of the resource is empty, an attempt is made to rebuild the referenced type from an existing resource description for the linked resource inside the Xtext index.
-
If the resource is not loaded and the first slot is set, the linked EObject will be resolved with the fragment of the given URI.
While calculating the resource description for an N4JSResource, the EObject descriptions of its exported objects have to be
calculated as well. For this, the N4JSResourceDescriptionStrategy is used. For computing the exported objects of a resource, only
the root TModule and its contained types and variables are taken into consideration.
The EObjectDescriptions for an N4JS resource include:
-
An exported representation of the derived
TModule. This carries these properties:-
a qualified name (e.g.
my.module.MyFileName when the resource is stored under src/my/module/MyFileName.js in the project and the project description has marked src as a source folder). The calculation of the qualified name is delegated to the N4JSNamingUtil. -
the user data, which is the serialized string of the exported
TModule itself. It includes the types determined for this resource: for every element found in this resource that is contained in an ExportStatement, an EObject has been created before in N4JSDerivedStateComputer. In most cases this is an EObject extending Type from the types model, e.g. TClass for N4ClassDeclaration. There is an exception for ExportedVariableDeclaration, where TVariable is used as representative (and this EObject is not contained in the types model, only in the N4JS model). For usability reasons (quicker quickfixes etc.), top-level types not yet exported are also stored in the TModule. -
the information on project and vendor id are part of the module descriptor.
-
-
Descriptions for all top level types that are defined in the resource. These descriptions do not have any special properties, so they just have a name.
-
All exported variables are also described in the resource description. They don’t carry any special information either.
The EObjectDescription for an EObject contained in an ExportStatement:
-
the qualified name of the module export (e.g. for a
N4ClassDeclaration the qualified name my.module.MyFileName.MyClassName would be produced, when the resource is stored under src/my/module/MyFileName.js in the project, the project description has marked src as a source folder, and the N4 class uses the name MyClassName). The calculation of the qualified name is delegated to the N4JSNamingUtil. -
the EObject represented by the EObject description; here this is not the actual EObject from N4JS but the type EObject from the type system, which has been inferred using
N4JSTypeInferencer -
the user data is only an empty map for this EObjectDescription
With this the resource description for a resource should be fully created / updated. Serialize to Index shows the classes involved creating the resource and EObjectDescriptions, along with the serialized type information.
7.3. N4JS Resource Load States
The state diagram below depicts the state transitions when loading and resolving an N4JS resource.
Additionally, the following table relates the values of the resource’s flags to the states.
| State | Parse Result | AST | TModule | ASTMetaInfoCache | loaded | fullyInitialized | fullyProcessed | reconciled |
|---|---|---|---|---|---|---|---|---|
| Created | | | | | false | false | false | false |
| Created' | | | | | false | true | false | false |
| Loaded | available | with lazy linking proxies | | | true | false | false | false |
| Pre-linked | available | with lazy linking proxies | with stubs | | true | true | false | false |
| Fully Initialized | available | with lazy linking proxies | with DeferredTypeRefs | | true | true | false | false |
| Fully Processed | available | available | available | available | true | true | true | false |
| Loaded from Description | | proxy | available | | indeterminate | true | true | false |
| Loaded from Description' | | proxy | with DeferredTypeRefs | | indeterminate | true | true | false |
| Fully Initialized (R) | available | with lazy linking proxies | available | | indeterminate | true | false | true |
| Fully Processed (R) | available | available | available | available | indeterminate | true | true | true |
Remarks:
-
oddities are shown in red ink, in the above figure.
-
in the above figure:
-
"AST (proxy)" means the AST consists of only a single node of type
Script, and that node is a proxy, -
"AST (lazy)" means the AST is completely created, but cross-references are represented by unresolved Xtext lazy-linking proxies,
-
"TModule (stubs)" means the TModule has been created with incomplete information, e.g. return types of all TMethods/TFunctions will be
null (only used internally by the incremental builder), -
"TModule (some deferred)" means the TModule has been created and does not contain stubs, but some TypeRefs are DeferredTypeRefs that are supposed to be replaced by proper TypeRefs during post-processing.
-
"AST" and "TModule" means the AST/TModule is available without any qualifications.
-
-
state Created': only required because Xtext does not clear flag
fullyInitialized upon unload; that is done lazily when #load() is invoked at a later time. Thus, we do not reach state Created when unloading from state Fully Initialized but instead get to state Created'. To reach state Created from Fully Initialized we have to explicitly invoke #discardDerivedState() before(!) unloading. -
state Loaded from Description': transition
#unloadAST()from state Fully Initialized leaks a non-post-processed TModule into state Loaded from Description, which is inconsistent with actually loading a TModule from the index, because those are always fully processed. Hence, the addition of state Loaded from Description'. -
states Fully Initialized (R) and Fully Processed (R): these states are reached via reconciliation of a pre-existing TModule with a newly loaded AST. These states differ in an unspecified way from their corresponding non-reconciled states. For example, in state Fully Initialized (R) the TModule does not contain any DeferredTypeRefs while, at the same time, the TModule isn’t fully processed, because proxy resolution, typing, etc. have not taken place, yet.
-
TODO old text (clarify this; I could not reproduce this behavior): when
unloadASTis called,fullyInitializedremains unchanged. This is why the value offullyInitializedshould be indeterminate in row Loaded from Description; it depends on the previous value if the state Loaded from Description was reached by callingunloadAST.
7.4. Types Builder
When a resource is loaded, it is parsed, linked, post-processed, validated and eventually compiled. For linking and validation type information is needed, and as described above the type information is created automatically when loading a resource using the types builder. Resource Loading shows an activity model with the different actions performed when a resource is loaded.
The blue colored steps are standard Xtext workflow. Handling the TModule and storing that in the index are N4 specific (red background).
7.4.1. Type Inference not allowed in Types Builder
A crucial point in the workflow described above is the combination of types model building and type inference. In some cases, the type of a given element is not directly stated in the AST but has to be inferred from an expression and other types. For example, when a variable declaration does not declare the variable’s type explicitly but provides an initializer expression, the actual type of the variable is inferred to be the type of the expression.
However, the types builder cannot be allowed to use type inference, mainly for two reasons:
-
type inference through Xsemantics could lead to resolution of cross-references (i.e. EMF proxies generated by the lazy linker), and because the types builder is triggered when getContents() is called on the containing
N4JSResource, this would break a basic contract of EMF resources. -
type inference could cause other resources to be loaded which would lead to problems (infinite loops or strange results) in case of circular dependencies. This is illustrated in Types Builder Problem and Types Builder Proxies.
Therefore, whenever the type of a particular element has to be inferred, the types builder will use a special type reference
called DeferredTypeRef [3],
in order to defer the actual type inference to a later stage, i.e. the post-processing stage.
7.4.2. Deferred Type References
Whenever type inference would be required to obtain the actual type of an element, the types builder will insert a stub to defer
actual type inference (see previous section). A dedicated subclass of TypeRef, called DeferredTypeRef, is used that contains neither
the actual type information nor any information necessary to perform the type inference at a later point in time. Later, this
DeferredTypeRef will be replaced during post-processing, see TypeDeferredProcessor.
All DeferredTypeRefs will be replaced by the actual types during post-processing. One important reason for resolving
all DeferredTypeRefs as early as possible is that they are not suited for serialization and therefore have to be removed
from the types model before populating the Xtext index, which includes serializing the TModule into the user data of the
root element. This is always assured by the logic that manages the triggering of the post-processing phase.
To manually trigger resolution of all DeferredTypeRefs in a given types model, simply call method performPostProcessing(CancelIndicator)
of the containing N4JSResource (this should never be required by client code such as validations).
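The deferral mechanism can be illustrated with a small, language-neutral sketch (Python is used here for brevity; all names are hypothetical stand-ins, not the actual Java classes of the types builder):

```python
class DeferredTypeRef:
    """Placeholder inserted by the types builder where inference would be needed."""
    pass

class TVariable:
    def __init__(self, name, declared_type=None, initializer=None):
        self.name = name
        self.initializer = initializer
        # Types builder: never infer here -- either take the declared type
        # or insert a placeholder to be resolved during post-processing.
        self.type_ref = declared_type if declared_type else DeferredTypeRef()

def perform_post_processing(module, infer):
    """Replace every DeferredTypeRef by the inferred type (post-processing stage)."""
    for element in module:
        if isinstance(element.type_ref, DeferredTypeRef):
            element.type_ref = infer(element.initializer)

# Example: `var x = "hello";` has no declared type, so its type is deferred;
# `var y: number;` has a declared type and needs no inference.
module = [TVariable("x", initializer='"hello"'),
          TVariable("y", declared_type="number")]
perform_post_processing(module, infer=lambda expr: "string")
assert not any(isinstance(v.type_ref, DeferredTypeRef) for v in module)
```

The key invariant mirrors the text above: the types builder itself performs no inference, and after post-processing no placeholder remains in the model.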
7.4.3. Use cases of DeferredTypeRef
Currently, DeferredTypeRefs are created by the types builder only in these cases:
-
actual type of an exported TVariable if no declared type but an initialization expression is given.
-
actual type of a TField if no declared type but an initialization expression is given.
-
actual type of properties of ObjectLiterals if not declared explicitly.
-
actual type of formal parameters and return value of function expressions if not declared explicitly.
Note that this overview might easily become outdated; see references to class DeferredTypeRef in the code.
7.5. Incremental Builder (Overview)
This section provides a brief overview of how the incremental builder works.
General remarks:
-
The N4JS incremental builder is a combination of Eclipse builder infrastructure, Xtext-specific builder functionality and several adjustments for N4JS and N4MF.
-
The
IBuilderState implementation is identical to the persisted Xtext index. No matter how many Xtext languages are supported by the application, only a single IBuilderState instance is available in the application. Since we have one single IBuilderState, we have one single persisted Xtext index throughout the application. -
For simplicity, the description below assumes we have only N4JS projects in the workspace and that no other Xtext languages are installed.
Major components:
-
XtextBuilder (inherits from Eclipse’s IncrementalProjectBuilder):-
the actual incremental builder
-
note: Eclipse will create one instance of
XtextBuilder per project at startup.
-
-
IBuilderState (Xtext specific; no Eclipse counterpart):
identical to theXtext index, i.e. the globally shared, persisted instance of
IResourceDescriptions.
Main workflow:
-
for each project that contains at least one resource that requires rebuilding, Eclipse will call the project’s
XtextBuilder. -
each
XtextBuilder will perform some preparations and will then delegate to IBuilderState, which will iterate over all resources in the builder’s project that require rebuilding.
7.5.1. XtextBuilder
Whenever a change in the workspace happens …
-
Eclipse will collect all projects that contain changed resources and compute a project-level build order (using the
build order of the workspace, see Workspace#getBuildOrder(), which is based on project dependencies) -
for the first [4] project with changed resources, Eclipse will invoke method
IncrementalProjectBuilder#build(int,Map,IProgressMonitor) of the project’s XtextBuilder
(NOTE: from this point on, we are in the context of a current project) -
in
XtextBuilder#build(int,Map,IProgressMonitor):
the builder creates an empty instance of ToBeBuilt (Xtext specific) -
in
XtextBuilder#incrementalBuild(IResourceDelta,IProgressMonitor):-
The builder will iterate over all files in the project and for each will notify a
ToBeBuiltComputer about the change (added, updated, or deleted), which can then decide how to update the ToBeBuilt instance, -
then forwards to
#doBuild(). Note: if the user changes 1..* files in a single project but later more files in other, dependent projects need to be built, the above step will happen for all projects, but will have an effect only for the first project that contains the actual file changes (i.e. in the standard case of saving a single file,
ToBeBuilt will always be non-empty for the first project, and always empty for the other, dependent projects; if a Save All is done, ToBeBuilt could be non-empty for later projects as well).
-
-
in
XtextBuilder#doBuild(ToBeBuilt,IProgressMonitor,BuildType):-
first check if
ToBeBuilt is empty AND the global build queue does not contain URIs for the current project → then abort (nothing to do here) -
creates instance of BuildData with:
-
name of current project (as string)
-
newly created, fresh
ResourceSet -
the
ToBeBuilt (containing URIs of actually changed resources within the current project, possibly filtered by ToBeBuiltComputer) -
the
QueuedBuildData (an injected singleton) -
mode flag
indexingOnly (only true during crash recovery)
-
-
invoke
IBuilderState passing the BuildData
→ updates itself (it is the global Xtext index) to reflect all changes in the current project; validates and updates markers; runs the transpiler (see below for details) -
invoke all registered
IXtextBuilderParticipants (Xtext specific) for the current project -
this is where normally we would do validation and run the transpiler; however, for performance reasons (do not load resource again) we already do this in the
IBuilderState (this is the idea of the GenerateImmediatelyBuilderState) -
in our implementation, almost nothing is done here, except trivial stuff such as deleting files during clean build
At this point, we return from all methods.
-
-
-
back in
XtextBuilder#build(int,Map,IProgressMonitor):
→ return with an array of IProjects; in our case we return all other N4JS projects referenced in the package.json of the project -
important: these are not the projects that will be processed next: we need to continue with projects that depend on the current project, not with projects the current project depends on!
-
Eclipse calls the returned projects
interesting projects and uses them as a hint for further processing; details are not discussed here.
-
-
continue with step one:
Eclipse will invoke XtextBuilder#build(int,Map,IProgressMonitor) again for all other projects that have a dependency on the current project of the previous iteration, plus all remaining projects with changed resources.
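The project-level ordering used in the first step can be pictured as a topological sort over project dependencies. The following Python sketch (with hypothetical names; the real logic lives in Eclipse's `Workspace#getBuildOrder()`) mimics how one builder invocation per project is ordered, dependencies first:

```python
def build_order(projects, deps):
    """Topologically sort projects so that each project is visited after the
    projects it depends on. deps[p] lists the projects p depends on."""
    order, visited = [], set()

    def visit(p):
        if p in visited:
            return
        visited.add(p)
        for d in deps.get(p, []):   # build dependencies first
            visit(d)
        order.append(p)

    for p in projects:
        visit(p)
    return order

# P2 depends on P1, P3 depends on P2; PX is independent (example from below).
order = build_order(["P3", "PX", "P2", "P1"], {"P2": ["P1"], "P3": ["P2"]})
assert order.index("P1") < order.index("P2") < order.index("P3")
```

This is only a sketch of the ordering principle; Eclipse additionally interleaves the "interesting projects" hint and cycle handling, which are not modeled here.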
7.5.2. IBuilderState
Invoked: once for each project containing a changed resource and once for each dependent project.
Input: one instance of BuildData, as created by XtextBuilder, containing:
-
name of current project (as string)
-
newly created, fresh
ResourceSet -
the
ToBeBuilt-
set of to-be-deleted URIs
-
set of to-be-updated URIs
-
-
the
QueuedBuildData, an injected singleton maintaining the following values [5]:-
a queue of URIs per project (below called the
global queue)
(actually stored in QueuedBuildData#projectNameToChangedResource) -
a collection of
all remaining URIs
(derived value: queued URIs of all projects + queued URIs not associated with a project (does not happen in N4JS)) -
a collection of
pending deltas (always empty in N4JS; probably only used for interaction with Java resources)
-
-
mode flag
indexingOnly (only true during crash recovery)
7.5.2.1. Copy and Update Xtext Index
-
in
IBuilderState#update(BuildData,IProgressMonitor):
creates a copy of its ResourceDescriptionsData called newData [6] -
in
AbstractBuilderState#doUpdate(…):
updates newData by writing new resource descriptions into it.-
Creates a new load operation (
LoadOperation) instance from the BuildData#getToBeUpdated() and loads all entries. While iterating and loading the resource descriptions, it updates newData by registering new resource descriptions that are being created on the fly from the most recent version of the corresponding resources. -
Adds these resources to the current project’s build queue. (
BuildData#queueURI(URI uri))
-
-
for all to-be-deleted URIs given in
ToBeBuilt in the BuildData, removes the corresponding IResourceDescription from newData -
ToBeBuilt#getAndRemoveToBeDeleted() returns all URIs that have been marked for deletion but not marked for update and will clear the set of to-be-deleted URIs in ToBeBuilt.
-
7.5.2.2. Build State Setup Phase
-
Calculates a set
allRemainingURIs [7] as follows:-
Initially contains all resource URIs from
newData. -
All URIs will be removed from it that are marked for update (
BuildData#getToBeUpdated()). -
Finally, all URIs will be removed from it that are already queued for build/rebuild. (
BuildData#getAllRemainingURIs()).
-
-
Creates an empty set
allDeltas of resource description deltas
(c.f. IResourceDescription.Delta). [8] -
Process Deleted: for all to-be-deleted URIs, creates a delta where the old state is the current state of the resource and the new state is
null and adds it to allDeltas. -
Adds all
pending deltas from QueuedBuildData to allDeltas (does not apply to N4JS). -
Enqueue affected resources, part 1: adds to the
global queue the URIs of all resources affected by the changes in allDeltas. For N4JS, allDeltas always seems to be empty at this point, so this does nothing at all. -
Creates an empty set
changedDeltas for storing deltas that were modified by the build phase and represent an actual change. Unlike allDeltas, this set contains only those URIs that were processed by the builder - the underlying user data information contains the differences between the old and the new state. -
Creates a new
current queue and adds all URIs from the global queue that belong to the current project.
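Treating the index as a plain set of URIs, the setup phase can be sketched as follows (a simplified, hypothetical model; the real implementation lives in Xtext's AbstractBuilderState):

```python
def setup_build_state(index_uris, to_be_updated, global_queue, current_project):
    """Sketch of the builder-state setup phase (names are illustrative only)."""
    # allRemainingURIs: everything in the (copied) index that is neither
    # marked for update nor already queued for a rebuild.
    queued = {uri for uris in global_queue.values() for uri in uris}
    all_remaining = set(index_uris) - set(to_be_updated) - queued
    # currentQueue: the queued URIs belonging to the current project.
    current_queue = list(global_queue.get(current_project, []))
    return all_remaining, current_queue

# Using the four-project example discussed later in this chapter:
all_remaining, current_queue = setup_build_state(
    index_uris={"P1/A.n4js", "P2/B.n4js", "P3/C.n4js", "PX/X.n4js"},
    to_be_updated={"P1/A.n4js"},          # the user changed A.n4js
    global_queue={"P1": ["P1/A.n4js"]},
    current_project="P1")
assert all_remaining == {"P2/B.n4js", "P3/C.n4js", "PX/X.n4js"}
assert current_queue == ["P1/A.n4js"]
```

The sketch omits deltas and pending deltas, which (as noted above) are empty at this point for N4JS.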
7.5.2.3. Process Queued URIs
Processes all elements from the queue until it contains no more elements.
-
Load the resource for the first/next URI on the current queue
In case of a move, the loaded resource could have a different URI! -
Once the resource has been loaded, it removes its URI from the current queue to ensure it will not be processed again.
-
If the loaded resource is already marked for deletion, stop processing this resource and continue with next URI from the current queue (go to step Load Res) [9]
-
Resolves all lazy cross references in the loaded resource. This will trigger post-processing, including all type inference (c.f.
ASTProcessor#processAST(…)). -
Creates a delta for the loaded resource, including
-
a resource description based on the new state of the resource, wrapped into the
EObject-based resource description (as with the Xtext index persistence in EMFBasedPersister#saveToResource()). -
a resource description for the same resource with the state before the build process.
-
-
Adds this new delta to
allDeltas and, if the delta represents a change (according to DefaultResourceDescriptionDelta#internalHasChanges()), also adds it to changedDeltas. -
Adds the resource description representing the new state, stored in the delta, to
newData, i.e. the copied ResourceDescriptionsData, replacing the old resource description of the loaded resource [10]. -
If the current queue is non-empty, go to step Load Res and continue with the next URI in the current queue.
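The queue-processing loop above boils down to the following sketch (again with made-up helper names; loading, proxy resolution and post-processing are collapsed into a single callback):

```python
def process_queue(current_queue, new_data, load_and_describe, has_changes):
    """Drain the current queue, producing deltas and updating the copied index."""
    all_deltas, changed_deltas = [], []
    while current_queue:
        uri = current_queue.pop(0)          # next URI; removed so it is not reprocessed
        old_desc = new_data.get(uri)        # state before the build
        new_desc = load_and_describe(uri)   # load, resolve refs, post-process
        delta = (uri, old_desc, new_desc)
        all_deltas.append(delta)
        if has_changes(old_desc, new_desc): # cf. internalHasChanges()
            changed_deltas.append(delta)
        new_data[uri] = new_desc            # replace the old description
    return all_deltas, changed_deltas

# Toy descriptions: a string listing the exported members of the module.
index = {"A.n4js": "A:{foo()}"}
all_d, changed_d = process_queue(
    ["A.n4js"], index,
    load_and_describe=lambda uri: "A:{}",   # foo() was deleted
    has_changes=lambda old, new: old != new)
assert index["A.n4js"] == "A:{}" and len(changed_d) == 1
```

Deletion handling and the move case (a loaded resource with a different URI) are left out to keep the sketch small.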
7.5.2.4. Queueing Affected Resources
When the current queue contains no more URIs (all have been processed) …
-
Enqueue affected resources, part 2: add to the global queue URIs for all resources affected by the changes in
changedDeltas [11]. -
Returns from
#doUpdate(), returning allDeltas (only used for event notification). -
back in
IBuilderState#update(BuildData,IProgressMonitor):
makes the newData the publicly visible, persistent state of the IBuilderState (i.e. the official Xtext index all other code will see).
We now provide some more details on how the global queue is being updated, i.e. steps Enqueue Affected Resources and Update Global Queue. Due to the language specific customizations for N4JS, this second resource-enqueuing phase is the trickiest part of the incremental building process and has the largest impact on how other resources will be processed and enqueued at forthcoming builder state phases.
-
If
allDeltas is empty, there is nothing to do. -
If
allDeltas contains at least one element, we have to check for other affected resources by going through the set of all resource URIs (allRemainingURIs) calculated at the beginning of the build process. -
Assume we have at least one element in the
allDeltas set, the latter case applies and we must check for each element whether it is affected or not. We simply iterate through the allRemainingURIs set and retrieve the old state of the resource description using the resource URI. -
Once the resource description with the old state is retrieved, we check if it is affected through the corresponding resource description manager. Since we currently support two languages, we have two different ways of checking whether a resource is affected: one for package.json files and one for N4JS language related resources.
-
The package.json method is the following: get all project IDs referenced from the
candidate package.json and compare them with the container-project names of the package.json files from the deltas. The referenced IDs are the following:-
tested project IDs,
-
implemented project IDs,
-
dependency project IDs,
-
provided runtime library IDs,
-
required runtime library IDs and
-
extended runtime environment ID.
-
-
The N4JS method is the following:
-
We consider only those changed deltas which represent an actual change (
IResourceDescription.Delta#haveEObjectDescriptionsChanged()) and have a valid file extension (.n4js, .n4jsd or .js). -
For each
candidate, we calculate the imported FQNs. The imported FQNs include indirectly imported names besides the directly imported ones. Indirectly imported FQNs are, for instance, the FQNs of all transitively extended super class names of a direct reference. -
We state that a
candidate is affected if there is a dependency (for example, a name imported by a candidate) on any name exported by the description from a delta. That is, it computes whether a candidate (with given importedNames) is affected by a change represented by the description from the delta. -
If a
candidate is affected, we have to do an additional dependency check due to the lack of distinct unique FQNs. If the project containing the delta equals the project containing the candidate, or if the project containing the candidate has a direct dependency on the project containing the delta, we mark the candidate as affected.
-
-
If a candidate was marked as affected, it will be removed from the
allRemainingURIs set and will be added to the build queue. -
If a candidate has been removed from the
allRemainingURIs set and queued for the build, we assume its TModule information stored in the user data is obsolete. To invalidate the obsolete information, we wrap the delta in a custom resource description delta so that whenever the TModule information is asked for, it will be missing. We then register this wrapped delta in the copied Xtext index, end the builder state for the current project and then invoke the Xtext builder with the next dependent project.
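Ignoring the package.json case, the N4JS affectedness check can be approximated like this (all names hypothetical; the real check additionally honours transitively imported FQNs collected during indexing):

```python
def is_affected(candidate_imports, delta_exports,
                candidate_project, delta_project, direct_deps):
    """A candidate is affected if it imports a name exported by the changed
    resource AND the two projects are identical or the candidate's project
    directly depends on the project containing the delta."""
    if not candidate_imports & delta_exports:
        return False
    return (candidate_project == delta_project
            or delta_project in direct_deps.get(candidate_project, set()))

# B imports A, and P2 directly depends on P1 -> affected.
assert is_affected({"A"}, {"A"}, "P2", "P1", {"P2": {"P1"}})
# C's importedNames also contain A (via its super class chain), but P3 has
# no *direct* dependency on P1 -> not affected in this phase.
assert not is_affected({"A", "B"}, {"A"}, "P3", "P1", {"P3": {"P2"}})
```

The second assertion reproduces the behaviour described in the example below: module C is only picked up in a later phase, once module B itself has changed.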
7.5.3. Example
To conclude this section, we briefly describe the state of the above five phases through a simple example. Assume a
workspace with four N4JS projects: P1, P2, P3 and PX. Each project has one single module with one single
publicly visible class. Also, let us assume project P2 depends on P1 and P3 depends on P2. Project PX has
no dependencies on other projects. Project P1 has a module A.n4js with a class A; project P2 has one single
module B.n4js. This module has a public exported class B which extends class A. Furthermore, project P3 has
one single module, C.n4js. This module contains one exported public class C which extends B. Finally, project
PX has a module X.n4js containing a class X that has no dependencies on any other classes. The figure below
depicts the dependencies between the projects, the modules and the classes as described above.
For the sake of simplification, the table below describes a symbol table for all resources:
| Resource | Exported class |
|---|---|
| P1/src/A.n4js | A |
| P2/src/B.n4js | B |
| P3/src/C.n4js | C |
| PX/src/X.n4js | X |
Let us assume auto-build is enabled and the workspace contains no errors or warnings. We make one simple modification
and expect one single validation error in class C after the incremental builder has finished its processing: we delete the
method foo() from class A.
After deleting the method in the editor and saving the editor content, a workspace modification operation will run and
that will trigger an auto-build job. The auto-build job will try to build the container project P1 of module A. Since
the project is configured with the Xtext builder command, a builder state update will be performed through the Xtext builder.
Initially, due to an Eclipse resource change event (we literally modify the resource from the editor and save it), the
ToBeBuilt instance wrapped into the BuildData will contain the URI of the module A marked for an update. When updating
the copied index content, module A will be queued for a build. While processing the queued elements for project P1,
module A will be processed and will be added to the allDeltas set. Besides that, it will be added to the changedDeltas
set as well. That is correct, because its TModule information has been changed after deleting the public foo() method.
When queuing affected resources, iterating through the set of allRemainingURIs, we recognize that module B is affected.
That is indeed true; module B imports the qualified name of class A from module A, and project P2 has a direct
dependency on P1. In this builder state phase, when building project P1, module C is not considered affected.
Although class C from module C also imports the qualified name of class A from module A, project P3 does not
have a direct dependency on project P1. When module B is enqueued for a forthcoming build phase, we assume its
TModule information is obsolete. We invalidate this TModule related user data information on the resource description
by wrapping the resource description into a custom implementation (ResourceDescriptionWithoutModuleUserData). Due to this
wrapping the resource description for module B will be marked as changed (IResourceDescription.Delta#haveEObjectDescriptionsChanged())
whenever the old and current states are being compared.
The Eclipse builder will recognize (via IProjectDescription#getDynamicReferences()) that project P2 depends on project P1
so the Xtext builder will run for project P2 as well. In the previous phase we enqueued module B for the build.
We will therefore run into a builder state update again. We do not have any resource changes this time, so ToBeBuilt will
be empty. Since ToBeBuilt is empty, we do not have to update the copied Xtext index state before the builder state setup
phase. As a result of the previous builder state phase, module B is already enqueued for a build. When processing B,
we register it in the allDeltas set. That happens for each resource being processed by the builder state. It will also be
registered in the changedDeltas set because we have previously wrapped module B in a customized resource description delta
to hide its obsolete TModule related user data information. Based on the builder state rules and logic described above,
module C will be marked as an affected resource, will be queued for build and will be wrapped into a customized resource
description delta to hide its TModule related user data information.
In the next builder state phase, when building project P3, we apply the same logic as we applied for project P2. The
builder state will process module C and will update the Xtext index state. No additional resources will be found to be
affected and nothing will be queued for a build. The build will terminate, since there were no changed IResource instances
and the build queue is empty.
The outcome of the incremental build will be a workspace that contains exactly one validation error. The error will be
associated with module C, which was exactly our expectation; however, we have to clarify that the transitive dependency C
was built for the wrong reason. Module C was built because we wrapped module B to hide its user data information, and
not because it imports and uses class A from module A, which would be the logical and correct reason.
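The whole propagation of the example can be replayed in a compact, self-contained simulation (hypothetical data model; every processed module is simply treated as "changed", which mirrors the wrapping of module B's description):

```python
# Direct project dependencies and per-module (project, imports, exports)
# of the example; C's importedNames include A via its super class chain.
deps = {"P2": {"P1"}, "P3": {"P2"}}
modules = {
    "P1/A.n4js": ("P1", set(), {"A"}),
    "P2/B.n4js": ("P2", {"A"}, {"B"}),
    "P3/C.n4js": ("P3", {"A", "B"}, {"C"}),
    "PX/X.n4js": ("PX", set(), {"X"}),
}

queue = {"P1": {"P1/A.n4js"}}               # the user edited A.n4js
built = []
for project in ["P1", "P2", "P3", "PX"]:    # project-level build order
    for changed in sorted(queue.pop(project, set())):
        built.append(changed)
        ch_proj, _, ch_exports = modules[changed]
        for uri, (proj, imports, _) in modules.items():
            # affected: imports a changed name AND same project or direct dependency
            if imports & ch_exports and (
                    proj == ch_proj or ch_proj in deps.get(proj, set())):
                queue.setdefault(proj, set()).add(uri)

assert built == ["P1/A.n4js", "P2/B.n4js", "P3/C.n4js"]  # PX is never rebuilt
```

The resulting build sequence matches the narrative above: A triggers B (direct dependency), B triggers C, and PX stays untouched.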
7.6. Dirty state handling
When two or more (N4)JS files are opened in editors and one of them is changed without persisting this change, the other open editors should be notified; if this change breaks (or heals) references in one of the other open resources, their editors should be updated so that warning and error markers are removed or added accordingly.
When there are changes in the currently open editor these changes are propagated to all other open editors. Each Xtext editor has
got its own resource set. The N4JSUpdateEditorStateJob runs for each open editor different from the editor where the changes have
been made. In those editors the affected resources are unloaded and removed from the resource set. Then the Xtext resource of
these editors is reparsed. After reparsing, scoping and linking are invoked again, but now the referenced resources are rebuilt
as EObjectDescriptions. The N4JSResource holds its own content that only contains 1..n slots when proxified.
N4JSTypeScope.getSingleElement (called when resolving cross references and the linked element should be returned) will return the
EObjectDescription created from the ModuleAwareContentsList in N4JSResource, that contains the first slot as proxy and the other
slots as types. Sequence Diagram: Dirty State, Trigger N4JSUpdateEditorStateJob shows the flow to trigger the N4JSUpdateEditorStateJob and Sequence Diagram: Dirty State, N4JSUpdateEditorStateJob in Detail
shows the sequence logic of the N4JSUpdateEditorStateJob in detail.
N4JSUpdateEditorStateJob
N4JSUpdateEditorStateJob in Detail
A concrete example should illustrate the behaviour of the dirty state handling in conjunction with full and partial loading of resources:
Let A.js be as above, and B.js as follows:
import A from "A.js"
export class B {}
-
assume B.js is opened and loaded: the resource for A.js is created with:
-
slot 0 is filled with a special proxy to resolve the AST of A only if needed.
-
slot 1 will be set to type A, loaded from the
EObjectDescription of A.js/A
-
-
AST of A.js is to be accessed, e.g., for displaying JSDoc. A.js is not opened in an editor! The resource is modified as follows:
-
slot 0 is filled with the AST, i.e. the proxy in slot 0 is resolved
-
slot 1 is updated with the parsed type: 1) proxify slot 1, 2) unload (remove from contents), 3) reload with the parsed types again
-
-
Assume now that A.js is opened and edited by the user.
-
The reconciler replaces slot 0 with the modified AST
-
The LazyLinker updates slot 1
-
slot 1 is proxified
-
B
searches for A and finds the updated type
-
Each opened Xtext editor has got its own resource set! Such a resource set contains the resource for the currently edited
file in the first place. When starting to edit the file, the resource is reparsed, reloaded, and linking (resolving the cross
references) is done. When resolving the cross references, N4JSTypeScope is used, and now the URIs of the linked elements belong
to resources not contained in the resource set of the editor, so these resources are created in the resource set and their contents
are loaded from the resource descriptions via
N4JSResource.loadFromDescription .
When the resource content is loaded from the existing resource description available from Xtext index the first slot is set to be
a proxy with name #:astProxy.
After that for all exported EObject descriptions of that resource description the user data is fetched and deserialized to types
and these types are added to the slots of the resource in the order they were exported. But the order is not that important anymore.
As the resource set for the editor is configured to use the DirtyStateManager (ExternalContentSupport.configureResourceSet(resourceSet, dirtyStateManager)),
all other open editors will be notified of changes in the current editor. This is done by N4JSDirtyStateEditorSupport, which schedules
an N4JSUpdateEditorStateJob that creates a new resource description change event.
Via isAffected and ResourceDescription.getImportedNames it is determined whether a change in another resource affects this resource.
Before loading a resource, N4JSDerivedStateComputer.installDerivedState is always executed which, as we learned earlier, is responsible
for creating the types in the resource.
On every change to an N4JS file that requires a reparse, N4JSDerivedStateComputer.discardDerivedState is executed. This method does an
unload on every root element at positions 1 to n. In the N4JSUnloader all contents of these root elements are proxified (i.e.
a proxy URI is set on each of them) and the references to the AST are set to null (to avoid notifications causing
concurrent model changes). The proxification indicates to all callers of these elements that they have to reload them. Afterwards
it discards the complete content of the resource. The content is built up again with the reparse of the N4JS file content.
As each editor has its own resource set, only the resource belonging to the current editor is fully loaded. All other referenced resources are only partially loaded, i.e. only the type slots of these resources (the types model elements) are loaded in this resource set. Linking is done only against these types model elements. Synchronization between the resource sets of multiple open editors is done via the update job as described above.
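The slot layout described above can be summarized in a tiny model (hypothetical names; the real classes are N4JSResource and ModuleAwareContentsList, and `#:astProxy` is the proxy name mentioned earlier):

```python
class ResourceSketch:
    """Slot 0 holds the AST (or an AST proxy); slots 1..n hold the types model."""

    def __init__(self, types):
        # Partially loaded from the index: AST proxy plus deserialized types.
        self.contents = ["#:astProxy"] + list(types)

    def load_ast(self, ast):
        """Resolve the AST proxy, e.g. when JSDoc needs to be displayed."""
        self.contents[0] = ast

    def unload_ast(self):
        """Proxify the AST slot again; the type slots stay available."""
        self.contents[0] = "#:astProxy"

r = ResourceSketch(["class A"])       # partially loaded: types only
assert r.contents == ["#:astProxy", "class A"]
r.load_ast("<AST of A.js>")           # fully loaded
r.unload_ast()                        # back to the partially loaded shape
assert r.contents[0] == "#:astProxy"
```

This is only the data-shape intuition; the real unload additionally proxifies the type elements themselves so stale references force a reload.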
7.6.1. Use case: Restoring types from user data
-
Use case: referencing resources in an editor: this has already been described in the context of dirty state handling
-
Use case: referencing resources from JAR: This is still to be implemented (task IDE-37).
7.6.2. Use case: Updating the Xtext index
When an N4JS file is changed in a way that requires reparsing the file, the underlying resource is completely unloaded and loaded again. Thereby, the Xtext index entries belonging to this resource are also recreated (i.e. new entries for new elements in the resource, updated index entries for changed elements, deleted index entries for deleted elements).
When Eclipse is closed the Xtext index is serialized in a file.
When starting Eclipse again, the Xtext index is restored from this file:
8. Rename Refactoring
The rename refactoring operation is implemented based on Xtext’s current rename refactoring implementation. However, a lot of customization has been done in order to make rename refactoring work for N4JS. In order to understand the N4JS customization, it is imperative to understand how Xtext implements rename refactoring. In this chapter, we will focus on Xtext’s architecture for rename refactoring. Additionally, we will point to the components that have been customized for N4JS.
8.1. Rename Refactoring UI interaction
Xtext’s implementation allows rename refactoring to be in either of two modes: (1) direct refactoring mode, (2) refactoring with dialog mode. Diagram Direct Rename Refactoring UI interaction shows the UI interaction in direct refactoring mode.
In this diagram, the classes in yellow are customized by N4JS implementation to handle N4JS-specific characteristics.
-
DefaultRenameElementHandler: Our custom N4JS implementation converts the selected element to be renamed into its corresponding TModule element. -
ILinkedPositionGroupCalculator: This class is responsible for calculating locations of names to be changed during linked edit mode. We need to provide a custom N4JS implementation to handle composed elements. -
RenameElementProcessor: We need to provide a custom N4JS implementation to add N4JS-specific validation of conditions, e.g. checking name conflicts etc.
The key class for creating updates of a declaration and its associated references is RenameElementProcessor. In the following section, we will see how this class interacts with other classes to achieve this.
8.2. RenameElementProcessor interaction
Diagram RenameElementProcessor interaction shows the interaction of RenameElementProcessor and other classes to create changes for both declaration and references during rename refactoring.
As seen in the diagram, there are two stages of creating updates:
-
Creating updates for declaration is done by
IRenameStrategyand -
Creating updates for references is done by
ReferenceUpdateDispatcher. ReferenceUpdateDispatcher in turn delegates the finding of references to IReferenceFinder.
The text edits for changing the definition and the references are accumulated by an IRefactoringUpdateAcceptor.
The classes in yellow are customized by N4JS implementation.
-
IRenameStrategy: the custom N4JS implementation creates updates for constituent members of composed elements. -
IReferenceFinder: the custom N4JS implementation used for finding references of a declaration. -
RefactoringCrossReferenceSerializer: custom N4JS implementation to retrieve the updated name for cross references. For some unknown reason, the default implementation does not work correctly.
9. Flow Graphs
9.1. Flow graphs overview
In this chapter, the control and data flow analyses are introduced.
Since not all AST elements are relevant for the control or data flow analyses, a new marker class is introduced called ControlFlowElement.
All AST elements which are part of the control flow graph implement this class.
The term control flow is abbreviated as CF and hence ControlFlowElements are abbreviated as CF elements.
The following picture shows the control flow graph of the function f.
9.1.1. Internal graph
Every internal graph refers to a single control flow container.
The graph consists of nodes and edges, where the edges are instances of ControlFlowEdge, and nodes are instances of Node.
Additionally, a so called complex node is used to group nodes that belong to the same CF element.
- Internal graph
-
Control flow graphs are created based on the AST elements. Nevertheless, a fine-grained abstraction is used that is called internal graph. The internal graph reflects all control flows and data effects that happen implicitly and are part of the language’s semantics. For instance, the for-of statement on iterable objects forks the control flow after invoking the
next() method. This is done implicitly and is not part of the written source code. Moreover, this invocation could cause side effects. These control flows and effects are reflected using the internal graph. To implement analyses that refer to AST elements, an API for flow analyses is provided which hides the internal graph and works with AST elements only. In the following, the term control flow graph refers to the internal graph. - Control flow container
-
At several places in the AST, an execution of statements or elements can happen. Obviously, statements can be executed in bodies of methods or function expressions. In addition, execution can also happen in field initializers or the
Scriptitself. Since all these AST elements can contain executable control flow elements (CF elements), they thus contain a control flow graph. In the following, these AST elements are called control flow containers or CF containers. - Nodes
-
Nodes represent complete CF elements or parts of them. For instance, simple CF elements like a
breakstatement are represented using only one node. Regarding more complex CF elements that introduce a more complex control flow, due to e.g. nested expressions, several nodes represent one CF element. All nodes of a single CF element are grouped within a complex node. - Edges
-
Edges reference a start and an end node which reflects a forward control flow direction, which is in the following called forward traverse direction. Traversing from end to start of an edge is thus called backward traverse direction. The so called next node is either the end node in context of forward, or the start node in context of backward traverse direction. Edges also reflect the reason of the control flow using a control flow type. The default control flow type is called
Successorand such edges connect two ordinary subsequent nodes. Other types likeReturnorBreakindicate control flow that happens due to return or break statements. A special control flow type isRepeatthat indicates the entry of a loop body. This edge is treated specially when traversing the control flow graph to avoid infinitive traversals of loops. - Complex node
-
The complex node always has a single entry and exit node, no matter the control flow it causes. For instance, although the for-statement can contain various control flows among its nested CF elements, its complex node still has a single entry and exit node. This simplifies concatenating subsequent complex nodes, since only their exit nodes have to be connected to the following entry node. Aside from exit and entry node, a complex node usually contains additionally nodes to represent the CF element. These nodes are connected by control flow edge so that their control flow lies within complex node. However, regarding nested CF elements, the control flow leaves and re-enters a complex node. To specify which CF element is nested, a delegating node (
DelegatingNode) is created that points to the nested CF element.
Consider that source code elements can be nested like expressions that have sub-expressions as in 1 + 2 * 3.
Also statements can contain other statements like in if (true) return;.
The design of the control flow graph deals with this nesting by mapping CF elements to several nodes.
All nodes of one CF element are aggregated into the complex node.
1+2 creates an internal graph of three complex nodes to deal with nested integer literals.
The example in the figure above shows the internal graph produced by the source code 1+2.
Additionally, a simpler version of the internal graph is shown (called User Graph View), with which client analyses deal.
The user graph view is only a view on the internal graph, but does not exist as an own instance.
In the figure, the nesting of the integer literals becomes obvious:
The control flow edges of delegating nodes target entry nodes of different CF elements.
Also, there are CF edges from the exit nodes of these nested CF elements to return the control flow.
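The node, edge, and complex node structure described above can be sketched as a minimal data model. All names here (SketchGraph, Node, Edge, ComplexNode) are illustrative simplifications, not the actual N4JS classes, which carry considerably more state:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal illustrative data model of the internal graph.
class SketchGraph {
    enum ControlFlowType { Successor, Return, Break, Continue, Throw, Repeat }

    static class Node {
        final String name;
        final List<Edge> incoming = new ArrayList<>();
        final List<Edge> outgoing = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static class Edge {
        final Node start;
        final Node end;
        final ControlFlowType cfType;
        Edge(Node start, Node end, ControlFlowType cfType) {
            this.start = start;
            this.end = end;
            this.cfType = cfType;
            start.outgoing.add(this);
            end.incoming.add(this);
        }
    }

    // A complex node groups all nodes of one CF element and always exposes
    // exactly one entry and one exit node, which simplifies concatenation.
    static class ComplexNode {
        final Node entry = new Node("entry");
        final Node exit = new Node("exit");
        final List<Node> innerNodes = new ArrayList<>();
    }
}
```

Concatenating two subsequent complex nodes then only requires a single Successor edge from the first exit node to the second entry node.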
In the above figure, the complex node of the for statement is shown. The details of the complex nodes of the nested CF elements (such as the initializer or the body statement) are omitted. The figure displays the control flow fork after the condition and also shows the Repeat edge that targets the for body. The node called Catch Node is used in situations when jumping control flows are introduced, for instance by a continue statement. The catch node will then be the target of a control flow edge that starts at the continue statement.
The graph of nodes and edges is constructed in the following order.
-
First, for every CF element a complex node and all its nodes are created. Also, the nodes within the complex node are connected according to their control flow behavior.
-
Second, all subsequent complex nodes are connected by connecting their exit and entry nodes. Moreover, nested complex nodes are connected by interpreting the delegating nodes.
-
Third, jumping control flow due to return, throw, break or continue statements is computed. This is done by first deleting the successor edge of the jumping statement and then introducing a new control flow edge that ends at the jump target.
9.1.2. Optimizations
The internal graph contains many nodes to simplify the graph construction. However, these nodes carry no relevant information when traversing the graph. Consequently, in an optimization step, they are removed from the graph for performance reasons.
A node n is removed by replacing the path a → n → b with the new edge a → b. These removals are done for delegating nodes that have only one incoming and one outgoing edge.
A second, similar kind of optimization reduces the number of helper nodes that are used as entry nodes. In case a complex node consists of exactly one entry and one exit node, both of these nodes are collapsed into one node. This remaining node then is the representing node of the AST element.
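The first optimization, removing pass-through nodes, can be sketched as follows. The class and method names are hypothetical; for simplicity, the sketch assumes the node has at most one predecessor and one successor:

```java
// Illustrative sketch of the pass-through node removal described above.
class GraphOptimizer {
    static class Node {
        final String name;
        Node pred, succ; // simplified: at most one incoming and one outgoing edge
        Node(String name) { this.name = name; }
    }

    /** Removes n from the path pred -> n -> succ, yielding pred -> succ. */
    static void removePassThrough(Node n) {
        if (n.pred != null && n.succ != null) {
            n.pred.succ = n.succ;
            n.succ.pred = n.pred;
            n.pred = null;
            n.succ = null;
        }
    }
}
```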
9.1.3. API for client analyses
To implement client analyses based on the control flow graph, the three classes GraphVisitor, GraphExplorer and BranchWalker are provided.
They provide the means to visit CF elements in a control flow graph and also to traverse single control flow paths.
The method N4JSFlowAnalyses#analyze can execute several client analyses in one run to maintain scalability.
9.1.3.1. Mapping from internal to AST elements
The API classes work with AST elements such as ControlFlowElement instead of the internally used graph classes ComplexNode, Node or ControlFlowEdge.
The mapping from internal classes to AST elements is done in the GraphVisitor class.
Note that the control flow graph has the following properties:
-
ExpressionStatements are not represented. Instead, only their expressions are represented. Nevertheless, the API can deal with calls that refer to expression statements, e.g. when requesting their successors.
-
Control statements are also not represented in the graph, but can also be used in calls to the API. The reason is that it is unclear when a control statement (e.g. a for loop) is visited exactly.
-
Since a FlowEdge which connects two ControlFlowElements can represent multiple internal edges, it can have multiple ControlFlowTypes.
9.1.3.2. Graph visitor
Graph visitors traverse the control flow graphs of every CF container of a script instance in the following two traverse directions:
-
Forward: from the container’s start to all reachable CF graph elements.
-
Backward: from the container’s end to all reachable CF graph elements.
In each traverse direction, the graph visitor visits every reachable CF element and edge. Note that neither empty statements nor control statements are part of the control flow graph. The order of visited CF elements is related to either a breadth or a depth search on the CF graph. However, no specific order assumptions are guaranteed.
"loop" and "end" are dead code and displayed in grey.
9.1.3.3. Graph explorer
Graph visitors can request a graph explorer to be activated under specific conditions related to the client analysis. A graph explorer is the starting point for analyzing control flow branches. The first control flow branch starts directly at the graph explorer’s activation site, but of course this first branch might fork eventually. The graph explorer keeps track of all forked branches that originate at its activation site. It also provides the means to join previously forked branches again.
9.1.3.4. Branch walker
With every graph explorer, a branch walker is created that traverses the control flow graph beginning from the activation site of the graph explorer.
On every such branch, the two visit methods of CF elements and edges respectively, are called in the order of the traverse direction.
Every time the branch forks, the fork method of the branch walker is invoked and creates another branch walker which will continue the traversal on the forked branch.
The fork method can be used to copy some path data or state to the newly forked branch walker.
Note that every edge is always followed by the branch walker except for repeat edges which are followed exactly twice.
The reason to follow them twice is that first, following them only once would hide those control flows that re-visit the same CF elements due to the loop.
Second, following them more than twice does not reveal more insights, but only increases the number of branches.
When control flow branches merge again, for instance at the end of an if-statement, two or more branch walkers are merged into a new succeeding one.
The graph explorer provides the means to do this.
In case a CF element has no next elements, the branch walker terminates.
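The rule that Repeat edges are followed exactly twice can be sketched with a per-walker visit counter. This is an illustrative decision function, not the actual N4JS implementation:

```java
// Sketch: decide whether a branch walker may follow an edge, given how
// often this walker has already taken that edge.
class RepeatEdgePolicy {
    static final int MAX_REPEAT_VISITS = 2;

    static boolean mayFollow(boolean isRepeatEdge, int edgeVisitCount) {
        if (!isRepeatEdge) {
            return true; // ordinary edges are always followed
        }
        // Following a Repeat edge only once would hide control flows that
        // re-visit CF elements due to the loop; following it more than twice
        // adds branches without revealing new insights.
        return edgeVisitCount < MAX_REPEAT_VISITS;
    }
}
```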
9.1.3.5. Example 1: Compute string for each path
Let’s assume that we want to compute all control flow branches of a function and use the client API for that.
The function f() in the following code snippet has four control flow branches: 1 → 2, → 3 →, → 4 → and 5.
f() has four control flow branches.
function f() {
1;
if (2)
3;
else
4;
5;
}
To compute these control flow branches, the class AllBranchPrintVisitor extends the GraphVisitor.
Already in the method initializeMode() a graph explorer is activated.
Note that the method requestActivation() can be understood as an addListener method for a listener that listens to visit events on nodes and edges.
Immediately after the activation request, the first branch walker is created in the method firstBranchWalker().
The first visited CF element of the branch walker will then be the expression 1.
It is formed into a string and added to the variable curString.
After expression 1, the flow edge from 1 to 2 is visited.
This will concatenate the string → to the path string.
Variable curString will eventually hold the branch string like 1 → 2.
Since the control flow forks after 2, the method forkPath() is called and creates two new instances of a branch walker.
These new instances succeed the first branch walker instance and each traverses one of the branches of the if-statement.
When the if-statement is passed, these two branches are merged into a new succeeding branch walker.
After all branch walkers are terminated, the graph explorer and graph visitor are also terminated.
The method getBranchStrings() collects all four computed strings from the variable curString of all branch walkers.
class AllBranchPrintVisitor extends GraphVisitor {
protected void initializeMode(Mode curDirection, ControlFlowElement curContainer) {
super.requestActivation(new AllBranchPrintExplorer());
}
class AllBranchPrintExplorer extends GraphExplorer {
class AllBranchPrintWalker extends BranchWalker {
String curString = "";
protected void visit(ControlFlowElement cfe) {
curString += cfe.toString();
}
protected void visit(FlowEdge edge) {
curString += " -> ";
}
protected AllBranchPrintWalker forkPath() {
return new AllBranchPrintWalker();
}
}
protected BranchWalker joinBranches(List<BranchWalker> branchWalkers) {
// the merged branches are succeeded by a new branch walker
return new AllBranchPrintWalker();
}
protected BranchWalkerInternal firstBranchWalker() {
return new AllBranchPrintWalker();
}
}
List<String> getBranchStrings() {
List<String> branchStrings = new LinkedList<>();
for (GraphExplorerInternal app : getActivatedExplorers()) {
for (BranchWalkerInternal ap : app.getAllBranches()) {
AllBranchPrintWalker printPath = (AllBranchPrintWalker) ap;
branchStrings.add(printPath.curString);
}
}
return branchStrings;
}
}
9.1.3.6. Path quantor
Graph explorers are typically used to reason on all branch walkers that start at a specific location.
For instance, such a reasoning might determine whether some source element is reachable or whether a variable is used or not.
To simplify this, quantors are provided.
Since branch walkers originating from a single activation point can fork, the reasoning has to include all these forked branch walkers.
Hence, graph explorers are instantiated using a quantor which can be either For All, At Least One or None, referring to all branches.
After all branch walkers of an explorer are terminated, the explorer is regarded as either passed or failed.
Branches can also be terminated manually using the methods pass() or fail().
When pass or fail are used, the graph explorer might be terminated in the following cases:
-
If the quantor of the graph explorer is For All, and fail() is called on a branch walker.
-
If the quantor of the graph explorer is At Least One, and pass() is called on a branch walker.
Additionally, a graph explorer can be aborted manually by canceling all its branches.
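The quantor semantics can be sketched as a small decision function. This is an illustration under the assumption that every terminated branch walker reports a pass or fail verdict; the enum and method names are hypothetical:

```java
import java.util.List;

// Sketch: evaluate a graph explorer once all its branch walkers terminated.
class QuantorEvaluation {
    enum Quantor { FOR_ALL, AT_LEAST_ONE, NONE }

    /** branchPassed holds the verdict of every terminated branch walker. */
    static boolean explorerPassed(Quantor quantor, List<Boolean> branchPassed) {
        switch (quantor) {
            case FOR_ALL:      return branchPassed.stream().allMatch(p -> p);
            case AT_LEAST_ONE: return branchPassed.stream().anyMatch(p -> p);
            case NONE:         return branchPassed.stream().noneMatch(p -> p);
            default:           throw new IllegalStateException();
        }
    }
}
```

The early-termination cases listed above follow directly: a For All explorer can stop as soon as one branch fails, and an At Least One explorer as soon as one branch passes.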
9.1.4. Control flow analyses
9.1.4.1. Dead code analysis
The dead code analysis uses the graph visitor in all four modes and collects all visited CF elements. The collected CF elements are saved separately for every mode. After the graph visitor is terminated, the unreachable CF elements are computed as follows:
-
CF elements that are collected during forward and catch block mode are reachable.
-
CF elements that are collected during islands mode are unreachable.
-
CF elements that are only collected during backward mode are also unreachable.
In a later step, the unreachable elements are merged into unreachable text regions that are used for error markers.
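The combination rules above can be sketched with plain sets. This is illustrative only; the actual implementation works on CF elements and then merges them into text regions:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: derive the unreachable CF elements from the per-mode visit sets.
class DeadCodeSets {
    static <T> Set<T> unreachable(Set<T> forwardAndCatch, Set<T> islands, Set<T> backward) {
        Set<T> result = new HashSet<>(islands); // islands are unreachable
        for (T element : backward) {
            // unreachable if seen only in backward mode, never in forward/catch mode
            if (!forwardAndCatch.contains(element)) {
                result.add(element);
            }
        }
        return result;
    }
}
```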
9.2. Dataflow
9.2.1. Dataflow graph
The data flow graph provides means to reason about symbols, effects, data flow, aliases and guards in the control flow graph.
The main classes of the data flow API are DataflowVisitor and Assumption.
- Symbol
-
A symbol represents a program variable in the sense that it stands for all AST elements that bind to the same variable declaration (according to scoping). The terms symbol and variable are used synonymously.
- Effect
-
Effects are reads, writes and declarations of symbols. For instance, a typical CF element with a write effect is an assignment such as a = null;. Every effect refers to a single symbol and graph node. The following effects are provided:
-
Declaration: the declaration of a variable.
-
Write: the definition of a variable’s value, which is typically done with an assignment.
-
Read: the read of a variable’s value, which could happen when passing a variable as an argument to a method call.
-
MethodCall: the call of a property method of a variable.

Note that the term value use means either a read or a method call of a variable. The term value definition means that a variable is written.
- Data flow
-
The term data flow is used for assignments of all kinds. For instance, the assignments a = b, a = 1, a = null or even for (let [a] of [[0],[undefined]]); are data flows. The data always flows from the right-hand side to the left-hand side.
-
Due to data flow, other symbols can get important for an analysis. For instance, the data flow
a = bmakesbimportant when reasoning aboutasince the value ofbis assigned toa. In the API isbtherefore called an alias ofa. - Guard
-
Guards are conditions that appear in e.g.
it-statements. For instance, a typical guard is the null-check in the following statement:if (a == null) foo();. For every CF element, guards can hold either always, never or sometimes. Note that the null-check-guard always holds at the method invocationfoo();. DataflowVisitor-
The class
DataflowVisitorprovides means to visit all code locations where either effects happen or guards are declared. For instance, when a variable is written, the callback methodDataflowVisitor#visitEffect(EffectInfo effect, ControlFlowElement cfe)gets called. In case a guard is declared, the callback methodvisitGuard(Guard guard)gets called. Assumption-
The class
Assumptionprovides means to track the data flow of a specific symbol from a specific code location. For instance, assumptions are used to detect whether the symbolsin the property accesss.propis or may be undefined. In this example, the assumption symbol issand its start location is the property access. From there, the data flow ofsis tracked in backwards traverse direction. Also, (transitive) aliases ofsare tracked. In case a data flow that happens onsor its aliases, the callback methodholdsOnDataflow(Symbol lhs, Symbol rSymbol, Expression rValue)is called. For every effect that affectssor one of its aliases, the callback methodholdsOnEffect(EffectInfo effect, ControlFlowElement container)is called. And finally, for all guards that hold always/never at the start location regarding symbols, the callback methodholdsOnGuards(Multimap<GuardType, Guard> neverHolding, Multimap<GuardType, Guard> alwaysHolding)is called.
9.2.2. Dataflow analyses
9.2.2.1. Def→Def / Def→!Use analysis
A Def→Def analysis finds all definitions of a variable that are always a predecessor of another definition. Its result is a set of all obsolete definition sites.
A Def→!Use analysis finds all definitions of a variable that are not followed by either a read or a method call. These definitions are therefore obsolete and can be removed.
Both of these analyses are performed in traverse direction Forward.
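On a single control flow branch, the Def→!Use check can be sketched as a forward scan over the branch's effects. This is a simplification that ignores forks and joins; all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: on one branch, a write is obsolete if it is never followed by a
// read or method call on the same symbol before the next write (or the end).
class DefNotUsedScan {
    enum Kind { WRITE, READ, CALL }

    static class Effect {
        final String symbol;
        final Kind kind;
        Effect(String symbol, Kind kind) { this.symbol = symbol; this.kind = kind; }
    }

    static List<Integer> obsoleteWrites(List<Effect> branch) {
        List<Integer> obsolete = new ArrayList<>();
        for (int i = 0; i < branch.size(); i++) {
            if (branch.get(i).kind != Kind.WRITE) continue;
            boolean used = false;
            for (int j = i + 1; j < branch.size(); j++) {
                Effect e = branch.get(j);
                if (!e.symbol.equals(branch.get(i).symbol)) continue;
                used = e.kind == Kind.READ || e.kind == Kind.CALL;
                break; // the first later effect on the symbol decides
            }
            if (!used) obsolete.add(i);
        }
        return obsolete;
    }
}
```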
9.2.2.2. Def|Use←Decl analysis
A Def|Use←Decl analysis finds all preceding def or use sites of a declaration of a specific variable. The paths might contain other defs or uses of the same variable. In case such paths exist, the variable is used before it is declared. This analysis is done in traverse direction Backward.
In the above figure a graph visitor would visit all CF elements.
When it visits the declaration in line 8 (let w), it will activate a graph explorer (star 1 in the figure) for variable w.
Now, the first branch walker is created and walks the control flow in backward traverse direction.
When it encounters the exit node of the if-statement, it creates two forked branch walkers.
One of them then enters the then branch of the if-statement (star 2), while the other traverses directly to the condition of the if-statement.
Next, this walker visits the def site of variable w (star 3).
This means that there exists a def site of w before w was declared and hence, an error should be shown.
Since there could exist more cases like this, neither the branch walker nor the graph explorer are terminated.
When reaching star 4, the two branch walkers are joined and a follow-up branch walker is created.
At star 5, the end of the CF container is reached and the remaining branch walker is terminated.
After all branch walkers are terminated, the graph explorer for the declaration site of variable w is evaluated:
All use or def sites that were reachable should be marked with an error saying that the declaration has to be located before the use of a variable.
Note that this analysis is currently implemented as a control flow analysis, since it does not rely on guards or aliases. Also, it only deals with local variables and hence does not need the symbols that are provided by the data flow API.
10. External Libraries
External libraries are N4JS projects that are provided by the N4JS IDE: the built-in/shipped libraries, and all 3rd-party libraries that were installed by the N4JS library manager. Each external library consists of a valid package.json file located in the project root and an arbitrary number of files supported by N4JS projects, e.g. .n4js, .njsd and .js files. The purpose of the external libraries is to share and provide core and third party functionality to N4JS developers, both at compile time and at runtime, without rebuilding them.
Built-in external libraries are external libraries that provide some basic functionality for N4JS programmers, such as the class N4Injector.
3rd-party libraries are external libraries that are not built-in/shipped with the N4JS IDE. Instead, they can be installed later by the user from third party providers. Currently, only npm packages are supported.
The N4JS index is populated when the external libraries are compiled. However, this compilation is only triggered through the library manager, but not when building workspace projects. (Self-evidently, the index is also populated when compiling workspace projects.)
Name clashes of projects can happen and they are solved in the following order:
-
User workspace projects always shadow external libraries.
-
In case of a name clash between a shipped and a 3rd-party library, the 3rd-party library shadows the shipped project.
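The shadowing order above can be sketched as a lookup chain. This is illustrative only and does not reflect the actual N4JS resolution code:

```java
import java.util.Set;

// Sketch of the shadowing order: workspace projects shadow 3rd-party
// libraries, which in turn shadow shipped (built-in) libraries.
class ProjectResolver {
    static String resolve(String name, Set<String> workspace,
            Set<String> thirdParty, Set<String> shipped) {
        if (workspace.contains(name)) return "workspace";
        if (thirdParty.contains(name)) return "3rd-party";
        if (shipped.contains(name)) return "shipped";
        return null; // unknown project name
    }
}
```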
The N4JS library manager is a tool in the N4JS IDE to view and manage external libraries. In particular, the user can (un-)install new 3rd-party libraries, or can trigger the build of all external libraries to re-populate the N4JS index. The library manager also supports other maintenance actions such as deleting all 3rd-party libraries.
10.1. Major Components
External libraries are supported based on different components all over the application.
The following are the most important ones:
-
External Resources (IExternalResource)
-
These are customized IResource implementations for external projects, folders and files.
-
With this approach the IProject, IFolder and IFile interfaces have been implemented. Each implementation is backed by a pure java.io.File based resource.
-
When accessing such external resources, for example for visiting purposes, getting the members of the resource or simply deleting the resource, internally each request will be performed directly on the wrapped java.io.File without accessing the org.eclipse.core.resources.IWorkspace instance.
-
-
External Library Workspace
-
This is a kind of dedicated workspace for external libraries and their dependencies.
-
Any query request to retrieve a particular project or any dependencies of a particular project via the IN4JSCore singleton service will be delegated to its wrapped N4JSModel singleton. Internally the N4JSModel has a reference to a workspace for all the ordinary workspace projects and another reference to the workspace for external libraries. Each query request will be forwarded to the workspace for the ordinary projects first, and then to the external library workspace. If the ordinary project workspace can provide a meaningful response for a request, then the external library workspace will not be accessed at all. Otherwise the query will be executed against the external library workspace. This fallback mechanism provides a pragmatic solution to the project shadowing feature. Project shadowing will be described in detail later in this section.
-
The External Library Workspace is only supported and available in the IDE case; in the headless case there are no external libraries available from this dedicated workspace. Since the Xtext index creation and the entire build infrastructure are different, external libraries are supported there via a target platform file. This is described in more detail in a later section (Headless External Library Support).
-
-
External Library Preference Store
-
This preference store is used to register and un-register external library root folders in its underlying ordered list. A folder is called an external library root folder if it is neither equal to the Eclipse workspace root nor nested inside the workspace root, and it contains zero or more external libraries.
-
Whenever any modifications are saved in this preference store, the External Library Workspace will be updated as well: new libraries will be registered into the workspace and removed libraries will be cleaned up from the workspace.
-
When the N4JS IDE application is started in production mode, the initial state of the preference store is pre-populated with default values. This is necessary to provide built-in libraries to end users. These default values and additional advanced configurations will be described in more detail later in this section.
-
-
Library Manager
-
This service is responsible for downloading and installing third party npm packages into the node_modules folder of the N4JS IDE. After downloading, the newly-installed and/or updated packages are registered as external libraries into the system.
-
-
External Library Builder
-
This service is responsible for updating the persistent Xtext index with the currently available external libraries.
-
Unlike in the case of ordinary projects, this builder does not trigger a build via the org.eclipse.core.internal.events.BuildManager but modifies the persisted Xtext index (IBuilderState) directly.
Considers shadowed external libraries when updating the persisted Xtext index.
-
Makes sure that the external library related Xtext index is persistent and will be available on the next application startup.
-
-
External Library Xtext Index Persister
-
This class is responsible for recovering the consistent external library Xtext index state at application startup.
-
Scheduled on the very first application startup to prepare the Xtext index for the available external libraries.
-
Recovers the Xtext index state after a force quit and/or application crash.
-
-
External Library Preference Page
-
Preference page to configure and update the state of the External Library Preference Store.
-
Provides a way to install npm dependencies as external libraries into the application.
-
Reloads the external libraries. Gets the most recent state of N4JS type definition files and updates the Xtext index content based on the current state of the external libraries.
-
Exports the current npm dependency configuration as a target platform file. This will be discussed in another section (Headless External Library Support).
-
-
Miscellaneous UI Features
-
Searching for types provided by external libraries.
-
Opening external modules in read-only editor.
-
Navigation between external types.
-
Project Explorer contribution for showing external dependencies for ordinary workspace projects.
-
Editor-navigator linking support for external modules.
-
Installing third party npm dependencies directly from package.json editor via a quick fix.
-
10.1.1. External Resources
This approach provides a very pragmatic and simple solution to support external libraries both in the IN4JSCore and in the IBuilderState. While IN4JSCore supports a completely transparent way of accessing external libraries via the IN4JSProject interface all over the application, the IBuilderState is responsible for keeping the Xtext index content up to date with the external libraries. The picture below depicts the hierarchy between the ordinary IResource and the IExternalResource instances. As described above, each external resource is backed by a java.io.File resource, and each access and operation invoked on the IResource interface will be delegated to this backing resource.
10.1.2. External Library Workspace
The external library workspace is an extension of the InternalN4JSWorkspace. This workspace is used for storing and managing external libraries all over the application. External libraries can be registered into the workspace by providing one to many external library root folder locations. The provided root folder locations will be visited in an ordered fashion and the contained external libraries (N4JS projects) will be registered into the application. If an external library from a root folder has been registered, then any later occurrence of an external library with the same artefact identifier (and same folder name) will be ignored entirely. For instance, let us assume two external library root locations ER1 and ER2 are available; ER1 contains the external libraries P1 and P2, while ER2 contains P2 and P3. After registering the two roots into the workspace, ER1 will be processed first, and P1 and P2 will be registered to the workspace. When processing the subsequent root ER2, P2 will be ignored entirely since an external library with the same name already exists. Finally, P3 will be registered to the workspace. External libraries cannot be registered directly into the workspace; this is done automatically by the External Library Preference Store and by the npm Manager.
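The first-root-wins registration described in the ER1/ER2 example can be sketched like this (class and method names are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: register external libraries from ordered root folders; a library
// name seen in an earlier root shadows later occurrences of the same name.
class ExternalLibraryRegistry {
    private final Map<String, String> libraryToRoot = new LinkedHashMap<>();

    void registerRoot(String root, List<String> libraryNames) {
        for (String lib : libraryNames) {
            // putIfAbsent keeps the library from the earlier root
            libraryToRoot.putIfAbsent(lib, root);
        }
    }

    String rootOf(String libraryName) {
        return libraryToRoot.get(libraryName);
    }
}
```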
10.1.3. External Library Preference Store
This persistent cache is used for storing an ordered enumeration of registered external library root folder locations. Whenever its internal state is persisted after a modification, all registered modification listeners will be synchronously notified about this change. All listeners will receive the store itself with the updated state. There are a couple of registered listeners all over the application listening to store update events, but the most important one is the External Library Workspace itself. After receiving an external library preference store update event, the external library workspace will calculate the changes from its own state: it creates a sort of difference by identifying added, removed and modified external libraries. It also tracks external library root location order changes. Once the workspace has calculated the changes[12], it will interact with the External Library Builder Helper, which will eventually update the persisted Xtext index directly through the IBuilderState. After the Xtext index content update, all ordinary workspace projects that directly depend on either a built or a cleaned external library will be automatically rebuilt by the external library workspace.
10.1.4. Library Manager
This service is responsible for downloading and installing third party npm dependencies into the local file system. This is done directly by npm from Node.js. Once an npm package has been downloaded and installed, it will be registered into the external library workspace. As part of the registration, the Xtext index content will be updated and all dependent ordinary workspace projects will be rebuilt automatically. An npm package cannot be installed via the Library Manager if it was already installed previously.
10.1.5. External Library Builder
This builder is responsible for updating the persisted Xtext index state with external library content directly through the IBuilderState. When a subset of external libraries is provided to either build or clean, it internally orders the provided external libraries based on the project dependencies. Also, it might skip building all those external libraries that are shadowed by a workspace counterpart. An external library is shadowed by an ordinary workspace project if the workspace project is accessible and has exactly the same project name as the external library.
10.1.6. External Library Xtext Index Persister
By default, Xtext provides a way to fix a corrupted index or to recreate it from scratch in case of its absence. Such inconsistent index states could occur due to application crashes or non-graceful application shutdowns. Although this default recovery mechanism provided by Xtext works properly, it only covers projects that are available in the Eclipse based workspace (org.eclipse.core.resources.IWorkspace). Since none of the external libraries are available from the Eclipse based workspace, inconsistent external library index content cannot be recovered by this default mechanism. The N4JS IDE therefore contributes its own logic to recover the index state of external N4JS libraries. When the default Xtext index recovery runs, it will trigger an external library reload as well. This external reload is guaranteed to always run after the default recovery mechanism.
10.1.7. External Library Preference Page
This preference page provides a way to configure the external libraries by adding and removing external library root folders; it also allows the user to reorder the configured external library root locations. Besides that, npm packages can be installed into the application as external libraries. Neither removing nor reordering built-in external libraries is supported, hence these operations are disabled for built-ins on the preference page. No modifications will take effect unless the changes are persisted with the Apply button. One can reset the configuration to the default state by clicking the Restore Defaults button and then the Apply button. The Reload button will check whether new type definition files are available for npm dependencies, then reload the persistent Xtext index content based on the available external libraries. Once the external library reloading has finished successfully, all dependent workspace projects will be rebuilt as well. From the preference page one can export the installed and used third party npm packages as a target platform. This exported target platform file can be used with the headless compiler. After setting up the headless compiler with this exported target platform file, the headless tool will collect and download all required third party npm dependencies.
10.2. Headless External Library Support
The headless compiler does not support built-in libraries. The whole build and Xtext index creation infrastructure differs between the IDE and the headless case. Also, since the headless tool is shipped as an archive (n4jsc.jar), neither the runtime nor the Mangelhaft libraries can be loaded into the headless compiler.
The headless compiler supports downloading, installing and using third-party npm packages. To enable this feature, one has to configure the target platform via the --targetPlatformFile (or simply -tp) and the --targetPlatformInstallLocation (or simply -tl) arguments.
If the target platform file argument is configured, all third-party dependencies declared in the target platform file will be downloaded, installed and made available for all N4JS projects before the compile (and run) phase. If the target platform file is given but the target platform install location is not specified (via the --targetPlatformInstallLocation argument), the compilation phase is aborted and the execution is interrupted.
For more convenient continuous integration and testing, there are a couple of additional exception cases with respect to the target platform file and location that users of the headless compiler have to keep in mind. These are the following:
-
--targetPlatformSkipInstall: usually, dependencies defined in the target platform file will be installed into the folder defined by the option --targetPlatformInstallLocation. If this flag is provided, the installation will be skipped, assuming the given folder already contains the required files and everything is up to date. Users have to use this flag with care, because no checks are performed whether the location actually contains all required dependencies.
-
If --targetPlatformSkipInstall is provided, the --targetPlatformInstallLocation parameter is completely ignored.
-
If --targetPlatformSkipInstall is provided, the --targetPlatformFile parameter is completely ignored.
-
If neither the --targetPlatformInstallLocation nor the --targetPlatformFile parameter is specified, the headless tool treats this case as an implicit --targetPlatformSkipInstall configuration.
If the target platform install location is configured and the target platform file is given as well, all third-party dependencies specified in the target platform file will be downloaded to that location. If the target platform file is given but the target platform install location is not specified, the compilation phase is aborted and the execution is interrupted.
java -jar n4jsc.jar -projectlocations /path/to/the/workspace/root -t allprojects -tp /absolute/path/to/the/file -tl /path/to/the/target/platform/install/location -rw nodejs -r moduleToRun
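The interplay of the three target platform options can be summarized in code. The following is a minimal sketch of the resolution rules described above; the class TargetPlatformOptions and its methods are hypothetical helpers for illustration, not part of the actual n4jsc implementation:

```java
// Hypothetical sketch of the option-resolution rules for the headless compiler;
// not the actual n4jsc implementation.
public class TargetPlatformOptions {

    private final String targetPlatformFile;            // --targetPlatformFile, null if absent
    private final String targetPlatformInstallLocation; // --targetPlatformInstallLocation, null if absent
    private final boolean skipInstall;                  // --targetPlatformSkipInstall

    public TargetPlatformOptions(String file, String installLocation, boolean skipInstall) {
        this.targetPlatformFile = file;
        this.targetPlatformInstallLocation = installLocation;
        this.skipInstall = skipInstall;
    }

    /** Explicit flag, or implicit skip when neither file nor install location is given. */
    public boolean effectiveSkipInstall() {
        return skipInstall
                || (targetPlatformFile == null && targetPlatformInstallLocation == null);
    }

    /** A target platform file without an install location aborts the compilation. */
    public void validate() {
        if (effectiveSkipInstall()) {
            return; // file and install location are ignored entirely in this case
        }
        if (targetPlatformFile != null && targetPlatformInstallLocation == null) {
            throw new IllegalArgumentException(
                    "--targetPlatformFile given but --targetPlatformInstallLocation missing");
        }
    }
}
```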
10.2.1. Custom npm settings
In some cases there is a need for custom npm settings, e.g. a custom npm registry. These kinds of configurations are supported via an .npmrc file (see https://docs.npmjs.com/files/npmrc).
In the N4JS IDE, the user can specify the path to a custom configuration file on the preference page.
For the command line, n4jsc.jar provides the special option -npmrcRootLocation that allows the headless compiler to use custom settings.
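For example, a custom registry could be configured in such an .npmrc file as follows (the registry URL below is a placeholder):

```
registry=https://npm.example.com/
```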
10.3. Future Work
Some aspects are not covered by the current design but are worth considering in the future.
10.3.1. Multiple Dependency Scopes
npm scope dependencies:
- DEPENDENCY_DEVELOPMENT
- DEPENDENCY_PEER
- DEPENDENCY_BUNDLE (see https://docs.npmjs.com/files/package.json#bundleddependencies)
- DEPENDENCY_OPTIONAL (see https://docs.npmjs.com/files/package.json#optionaldependencies)
- DEPENDENCY_PROVIDES
- DEPENDENCY_WEAK (see http://www.rpm.org/wiki/PackagerDocs/Dependencies#Weakdependencies)
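For reference, several of these scopes correspond to dependency sections of an npm package.json; package names and versions below are made-up examples. DEPENDENCY_PROVIDES and DEPENDENCY_WEAK have no direct npm counterpart (the weak-dependency notion comes from RPM):

```
{
  "name": "example",
  "version": "1.0.0",
  "devDependencies": { "mocha": "^5.0.0" },
  "peerDependencies": { "react": "^16.0.0" },
  "optionalDependencies": { "fsevents": "^1.2.0" },
  "bundledDependencies": [ "my-local-lib" ]
}
```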
10.3.2. Run Tests from TestLibrary
Imagine we are implementing some API and we want to run tests for that API. The tests are delivered to us as a separate package, and there is no direct association between the implementation and test projects (the tests do not depend on the implementation). Still, we want to run the provided tests to see whether our implementation complies with the API tests, e.g. an AcceptanceTest suite for an application written against the application SDK.
Appendix A: Acronyms
| Acronym | Meaning |
|---|---|
| CTD | Compile-Time Dependency |
| RTD | Run-Time Dependency |
| LTD | Load-Time Dependency |
| ITD | Initialization-Time Dependency |
| ETD | Execution-Time Dependency |
| AC | Acceptance Criteria |
| ANTLR | ANother Tool for Language Recognition |
| API | Application Programming Interface |
| AST | Abstract Syntax Tree |
| ASI | Automatic Semicolon Insertion |
| BNF | Backus-Naur Form |
| CA | Content-Assist |
| CSP | Constraint Satisfaction Problem |
| CLI | Command Line Interface |
| DOM | Document Object Model |
| DSL | Domain Specific Language |
| EBNF | Extended Backus-Naur Form |
| EMF | Eclipse Modeling Framework |
| EPL | Eclipse Public License |
| FQN | Fully Qualified Name |
| GLB | Greatest Lower Bound, also known as infimum |
| GPL | GNU General Public License |
| IDE | Integrated Development Environment |
| IDL | Interface Definition Language |
| LSP | Liskov Substitution Principle |
| LUB | Least Upper Bound, also known as supremum |
| N4JS | NumberFour JavaScript |
| UI | User Interface |
| UML | Unified Modeling Language |
| VM | Virtual Machine |
| XML | Extensible Markup Language |
| XSLT | XSL Transformations |
| XSL | Extensible Stylesheet Language |
| WYSIWYG | What You See Is What You Get |
| w.l.o.g. | without loss of generality |
Appendix B: Licence
This specification and the accompanying materials is made available under the terms of the Eclipse Public License v1.0 which accompanies this distribution, and is available at http://www.eclipse.org/legal/epl-v10.html
Eclipse Public License - v 1.0
THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS ECLIPSE
PUBLIC LICENSE (AGREEMENT). ANY USE, REPRODUCTION OR DISTRIBUTION OF
THE PROGRAM CONSTITUTES RECIPIENT’S ACCEPTANCE OF THIS AGREEMENT.
1. DEFINITIONS
"Contribution" means:
-
in the case of the initial Contributor, the initial code and documentation distributed under this Agreement, and
-
in the case of each subsequent Contributor:
-
changes to the Program, and
-
additions to the Program;
where such changes and/or additions to the Program originate from and are distributed by that particular Contributor. A Contribution ’originates’ from a Contributor if it was added to the Program by such Contributor itself or anyone acting on such Contributor’s behalf. Contributions do not include additions to the Program which:
-
are separate modules of software distributed in conjunction with the Program under their own license agreement, and
-
are not derivative works of the Program.
"Contributor" means any person or entity that distributes the Program.
"Licensed Patents" mean patent claims licensable by a Contributor which are necessarily infringed by the use or sale of its Contribution alone or when combined with the Program.
"Program" means the Contributions distributed in accordance with this Agreement.
"Recipient" means anyone who receives the Program under this Agreement, including all Contributors.
2. GRANT OF RIGHTS
-
Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, distribute and sublicense the Contribution of such Contributor, if any, and such derivative works, in source code and object code form.
-
Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free patent license under Licensed Patents to make, use, sell, offer to sell, import and otherwise transfer the Contribution of such Contributor, if any, in source code and object code form. This patent license shall apply to the combination of the Contribution and the Program if, at the time the Contribution is added by the Contributor, such addition of the Contribution causes such combination to be covered by the Licensed Patents. The patent license shall not apply to any other combinations which include the Contribution. No hardware per se is licensed hereunder.
-
Recipient understands that although each Contributor grants the licenses to its Contributions set forth herein, no assurances are provided by any Contributor that the Program does not infringe the patent or other intellectual property rights of any other entity. Each Contributor disclaims any liability to Recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise. As a condition to exercising the rights and licenses granted hereunder, each Recipient hereby assumes sole responsibility to secure any other intellectual property rights needed, if any. For example, if a third party patent license is required to allow Recipient to distribute the Program, it is Recipient’s responsibility to acquire that license before distributing the Program.
-
Each Contributor represents that to its knowledge it has sufficient copyright rights in its Contribution, if any, to grant the copyright license set forth in this Agreement.
3. REQUIREMENTS
A Contributor may choose to distribute the Program in object code form under its own license agreement, provided that:
-
it complies with the terms and conditions of this Agreement; and
-
its license agreement:
-
effectively disclaims on behalf of all Contributors all warranties and conditions, express and implied, including warranties or conditions of title and non-infringement, and implied warranties or conditions of merchantability and fitness for a particular purpose;
-
effectively excludes on behalf of all Contributors all liability for damages, including direct, indirect, special, incidental and consequential damages, such as lost profits;
-
states that any provisions which differ from this Agreement are offered by that Contributor alone and not by any other party; and
-
states that source code for the Program is available from such Contributor, and informs licensees how to obtain it in a reasonable manner on or through a medium customarily used for software exchange.
-
When the Program is made available in source code form:
-
it must be made available under this Agreement; and
-
a copy of this Agreement must be included with each copy of the Program.
Contributors may not remove or alter any copyright notices contained within the Program.
Each Contributor must identify itself as the originator of its Contribution, if any, in a manner that reasonably allows subsequent Recipients to identify the originator of the Contribution.
4. COMMERCIAL DISTRIBUTION
Commercial distributors of software may accept certain responsibilities
with respect to end users, business partners and the like. While this
license is intended to facilitate the commercial use of the Program, the
Contributor who includes the Program in a commercial product offering
should do so in a manner which does not create potential liability for
other Contributors. Therefore, if a Contributor includes the Program in
a commercial product offering, such Contributor (Commercial
Contributor) hereby agrees to defend and indemnify every other
Contributor (Indemnified Contributor) against any losses, damages
and costs (collectively Losses) arising from claims, lawsuits and
other legal actions brought by a third party against the Indemnified
Contributor to the extent caused by the acts or omissions of such
Commercial Contributor in connection with its distribution of the
Program in a commercial product offering. The obligations in this
section do not apply to any claims or Losses relating to any actual or
alleged intellectual property infringement. In order to qualify, an
Indemnified Contributor must: a) promptly notify the Commercial
Contributor in writing of such claim, and b) allow the Commercial
Contributor to control, and cooperate with the Commercial Contributor
in, the defense and any related settlement negotiations. The Indemnified
Contributor may participate in any such claim at its own expense.
For example, a Contributor might include the Program in a commercial product offering, Product X. That Contributor is then a Commercial Contributor. If that Commercial Contributor then makes performance claims, or offers warranties related to Product X, those performance claims and warranties are such Commercial Contributor’s responsibility alone. Under this section, the Commercial Contributor would have to defend claims against the other Contributors related to those performance claims and warranties, and if a court requires any other Contributor to pay any damages as a result, the Commercial Contributor must pay those damages.
5. NO WARRANTY
EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, THE PROGRAM IS PROVIDED
ON AN AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES
OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR
A PARTICULAR PURPOSE. Each Recipient is solely responsible for
determining the appropriateness of using and distributing the Program
and assumes all risks associated with its exercise of rights under this
Agreement , including but not limited to the risks and costs of program
errors, compliance with applicable laws, damage to or loss of data,
programs or equipment, and unavailability or interruption of operations.
6. DISCLAIMER OF LIABILITY
EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
7. GENERAL
If any provision of this Agreement is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this Agreement, and without further action by the parties hereto, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.
If Recipient institutes patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Program itself (excluding combinations of the Program with other software or hardware) infringes such Recipient’s patent(s), then such Recipient’s rights granted under Section 2(b) shall terminate as of the date such litigation is filed.
All Recipient’s rights under this Agreement shall terminate if it fails to comply with any of the material terms or conditions of this Agreement and does not cure such failure in a reasonable period of time after becoming aware of such noncompliance. If all Recipient’s rights under this Agreement terminate, Recipient agrees to cease use and distribution of the Program as soon as reasonably practicable. However, Recipient’s obligations under this Agreement and any licenses granted by Recipient relating to the Program shall continue and survive.
Everyone is permitted to copy and distribute copies of this Agreement, but in order to avoid inconsistency the Agreement is copyrighted and may only be modified in the following manner. The Agreement Steward reserves the right to publish new versions (including revisions) of this Agreement from time to time. No one other than the Agreement Steward has the right to modify this Agreement. The Eclipse Foundation is the initial Agreement Steward. The Eclipse Foundation may assign the responsibility to serve as the Agreement Steward to a suitable separate entity. Each new version of the Agreement will be given a distinguishing version number. The Program (including Contributions) may always be distributed subject to the version of the Agreement under which it was received. In addition, after a new version of the Agreement is published, Contributor may elect to distribute the Program (including its Contributions) under the new version. Except as expressly stated in Sections 2(a) and 2(b) above, Recipient receives no rights or licenses to the intellectual property of any Contributor under this Agreement, whether expressly, by implication, estoppel or otherwise. All rights in the Program not expressly granted under this Agreement are reserved.
This Agreement is governed by the laws of the State of New York and the intellectual property laws of the United States of America. No party to this Agreement will bring a legal action under this Agreement more than one year after the cause of action arose. Each party waives its rights to a jury trial in any resulting litigation.
Appendix C: Bibliography
N4JS Project. (2018). N4JS Language Specification. Retrieved from https://www.eclipse.org/n4js/spec/N4JSSpec.html