Execution vs. Parse-Based Language Servers

Tradeos and Opportunities for Language-Agnostic Tooling for Dynamic Languages

Stefan Marr

s. marr@kent. ac. uk

School of Computing

University of Kent

United Kingdom

Humphrey Burchell

h. burchell@kent. ac. uk

School of Computing

University of Kent

United Kingdom

Fabio Niephaus

fabio. niephaus@oracle. com

Oracle Labs

Potsdam, Germany

Abstract

With the wide adoption of the language server protocol, the

desire to have IDE-style tooling even for niche and research

languages has exploded. The True language framework

facilitates this desire by oering an almost zero-eort ap-

proach to language implementers to providing IDE features.

However, this existing approach needs to execute the code

being worked on to capture much of the information needed

for an IDE, ideally with full unit-test coverage.

To capture information more reliably and avoid the need

to execute the code being worked on, we propose a new

parse-based design for language servers. Our solution pro-

vides a language-agnostic interface for structural informa-

tion, with which we can support most common IDE features

for dynamic languages.

Comparing the two approaches, we nd that our new

parse-based approach requires only a modest development

eort for each language and has only minor tradeos for

precision, for instance for code completion, compared to

True’s execution-based approach.

Further, we show that less than 1,000 lines of code capture

enough details to provide much of the typical IDE func-

tionality, with an order of magnitude less code than ad hoc

language servers. We tested our approach for the custom

parsers of Newspeak and SOM, as well as SimpleLanguage’s

ANTLR grammar without any changes to it. Combining both

parse and execution-based approaches has the potential to

provide good and precise IDE tooling for a wide range of

languages with only small development eort. By itself, our

approach would be a good addition to the many libraries im-

plementing the language server protocol to enable low-eort

implementations of IDE features.

DLS ’22, December 07, 2022, Auckland, New Zealand

to ACM.

This is the author’s version of the work. It is posted here for your personal

use. Not for redistribution. The denitive Version of Record was published in

Proceedings of the 18th ACM SIGPLAN International Symposium on Dynamic

Languages (DLS ’22), December 07, 2022, Auckland, New Zealand, https:

//doi. org/10.1145/3563834.3567537.

CCS Concepts: • Software and its engineering

→

Inte-

grated and visual development environments; Frame-

works.

Keywords: language server protocol, comparison, parse time,

execution time

ACM Reference Format:

Stefan Marr, Humphrey Burchell, and Fabio Niephaus. 2022. Ex-

ecution vs. Parse-Based Language Servers: Tradeos and Oppor-

tunities for Language-Agnostic Tooling for Dynamic Languages.

In Proceedings of the 18th ACM SIGPLAN International Symposium

on Dynamic Languages (DLS ’22), December 07, 2022, Auckland,

New Zealand. ACM, New York, NY, USA, 14 pages. https://doi. org

/10. 1145/3563834. 3567537

1 Introduction

A bit more than half a decade ago, the language server pro-

tocol (LSP)

was proposed as a way to provide IDE features

such as code navigation, completion, and error reporting

independent of a specic IDE. In contrast to previous ap-

proaches, the LSP enables language developers to implement

such IDE features once for a language, and use the result-

ing language server in various editors and IDEs, including

Atom, Eclipse, Emacs, Visual Studio, and VIM. In the past,

one would need to implement separate plugins for each IDE.

Soon after the LSP was announced, we started wondering

how one could utilize language implementation frameworks

such as True [Würthinger et al

2012] to not just make it

simpler to have IDE support for a language in many dierent

editors and IDEs with a single implementation, but simplify

the creation of such language servers by identifying the

language-agnostic parts.

Stolpe et al

[2019] built a rst version of such a system for

the True language framework, which was adapted and re-

leased as the GraalVM Language Server.

Their key idea was

to utilize the instrumentation [Van de Vanter et al

2018] and

language interoperability [Grimmer et al

2018] features of

the True framework to enable IDE features. This approach

minimized the extra eort for language implementers while

Language Server Protocol, Microsoft, https://microsoft

github

io/language-

server-protocol/

Can we get the IDE for free, too?, S. Marr, https://www

stefan-marr

de/

2016/08/can-we-get-the-ide-for-free-too/

GraalVM Language Server Protocol, Oracle, https://www

graalvm

org/22

tools/lsp/

DLS ’22, December 07, 2022, Auckland, New Zealand Stefan Marr, Humphrey Burchell, and Fabio Niephaus

still enabling useful tooling in IDEs. However, to minimize

the implementation eort, the approach relies solely on what

is available in a language’s runtime, and it requires that the

code that is being worked on is executable and ideally fully

covered by unit tests to retrieve all available details. In some

way, this moves some of the burden for tooling to the de-

veloper using the tools, who have to care more to produce

well-formed and well-tested code to get a rich tooling experi-

ence. Another limiting factor is that language runtimes rarely

keep information irrelevant for execution. Thus, declarative

aspects such as where classes or elds are dened, code com-

ments with documentation, or the semantics of tokens for

highlighting, may not be preserved at run time and thus

cannot be shown in an IDE using this approach.

In this paper, we explore how we can trade a little extra

eort from the language implementers for more complete

tooling for dynamic languages. Instead of relying solely on

run-time information, we will demonstrate how to design a

language server framework so that only a small amount of

information needs to be extracted as part of the parsing and

compilation step of a language implementation. We argue

that we can realize most common IDE features for an order

of magnitude less eort than ad hoc language servers.

With Stolpe et al

’s approach arguably achieving support

for the most common and perhaps highest priority IDE fea-

tures with minimal eort, we chose it as a baseline for com-

parison and will focus on the same set of IDE features. Thus,

we will discuss how to support listing symbols for instance

in an outline view, enable goto denition, referencing and

highlighting elements related to a currently selected one,

communicating syntax errors, providing help for method

signatures, as well as general support for displaying hover

information, semantic highlighting, and code completion.

The paper’s contributions are

•

a design for a language-agnostic language server that

provides IDE features common for dynamic languages

based on an interface that can be used from custom

parsers, compilers, as well as parsers generated e.g. by

ANTLR, and independent of any language implemen-

tation framework;

•

a detailed analysis of tradeos between parse and

execution-based approaches and the opportunities of

combining them;

•

an implementation demonstrating our parse-based

approach for the languages SOM, Newspeak, and Sim-

pleLanguage.

Furthermore, we nd that we need less than 1,000 lines

of code to implement support for a language. Thus reaching

the goal of requiring only little more eort than the about

500 lines of code for the execution-based approach. Based

on a survey of 20 open source language servers, this is an

When referring to parse time in this paper, we consider any type of seman-

tic analysis to be part of it.

order of magnitude less code than needed for the median ad

hoc language server (see section 7.3). A brief performance

assessment conrms that parsers not designed for IDE use

are practical, even for larger les (see section 8).

Overall, we believe our design would be a useful addition

to many libraries implementing the language server protocol,

since it enables low-eort language server implementations.

2 Background

This section reviews the language server protocol, discuss

alternative approaches such as language workbenches, and

summarizes the work by Stolpe et al. [2019].

2.1 Language Server Protocol

Announced in June 2016, the language server protocol (LSP)

has gained wide adoption in editors, IDEs, and many lan-

guage communities implemented language servers. In June

2022 at the time of writing, version 3.17 of the LSP was

current. It supports document-related events, support for

windowing features such as notications, log messages, and

progress updates, as well as a wide range of language and

workspace-related features that are our focus.

The workspace features include events around le changes,

creation, renaming, deletion, and similar, but also workspace-

wide listing and resolving of language symbols.

The language-related features include classic ones: navi-

gating to a declaration or denition, nding of references,

requesting call and type hierarchies, highlighting related ele-

ments in an editor, tooltips or hover information, selection or

folding of ranges, code completion, method signature help,

diagnostic information, i.e, errors and warnings, and docu-

ment renaming and formatting. It also covers more modern

features such as code lenses, semantic highlighting, inline

values as typically displayed during debugging, and inlay

hints like parameter names of methods. Arguably, this covers

most, if not all common, features IDEs provide these days.

The protocol itself is based on JSON-RCP and consists of

requests, responses, and notications. To be as language-

agnostic as possible, the protocol tries to communicate loca-

tion information and the information to be displayed with-

out encoding specic semantics. Thus, responses typically

contain code ranges, strings to be shown, as well as meta

information to enable an IDE to dierentiate language con-

cepts, e.g., methods and elds, with dierent icons. Code

ranges are represented as pairs of line and column to indi-

cate start and end of a section, for instance to indicate where

an identier is in a le. The protocol does not generally en-

code structural details, except for basic containment, e.g.,

methods belonging to a class, for an outline view. This also

means, language-specic constructs such as abstract syn-

tax trees are not exposed to the editor. Instead, for instance

semantic highlighting is realized as list of tuples that charac-

terize tokens based on osets. Figure 1 illustrates this with

Execution vs. Parse-Based Language Servers DLS ’22, December 07, 2022, Auckland, New Zealand

Client

Language

Server

req: doc/definition

uri: example.sl

pos:

line: 2

col: 5

result:

uri: main.sl

range:

startLine: 12

startCol: 20

endLine: 12

endCol: 24

Figure 1. Example interaction in the Language Server Pro-

tocol. The client requests the denition for a symbol at a

position in a document identied by line and column. The

server responds with a result, identifying the le where the

symbol is dened, and the range at which it is found. The

example uses pseudo code for legibility and conciseness.

an example for an interaction when an IDE users would try

to navigate to the denition of a symbol.

Very similar to the LSP is Monto [Keidel et al

2016], which

also solves the issue that each language needed to implement

separate plugins for each IDE with a similar protocol.

2.2 Language Workbenches

Language Workbenches [Erdweg et al

2015] aim to simplify

the creation of programming languages. Often they provide

a set of languages or tools to dene new languages. This

can include a way to specify syntax, scoping rules, typing

systems, and even the full execution semantics. For example,

Rascal [Klint et al

2009], Spoofax [Kats and Visser 2010], and

Xtext [Eysholdt and Behrens 2010] also derive IDE features

from the languages implemented in them. The IDE feature

set is similar to that of the LSP. Xtext for example directly

supports the LSP to provide these IDE features. The Rascal-

based Bacatá [Merino et al. 2020] uses the LSP to bring IDE

features to computational notebooks.

When designing a new language, some of these work-

benches can make it unnecessary to implement a language

server. Though, for language implementers that can not use

such workbenches, perhaps when building IDE support for

an existing language, other approaches such as the one pro-

posed in this paper will remain relevant.

2.3 A Language-Agnostic Design of a

Execution-Based Language Server

Since we rely on Stolpe et al

[2019]’s work, this section

briey summarizes their approach and what a language im-

plementer needs to do to benet from it.

Their approach to provide IDE features utilizes True’s

support for instrumenting language implementations [Van de

Vanter et al

2018] and its language interoperability facili-

ties [Grimmer et al. 2018].

The basic idea is to execute the code shown in the IDE and

obtain the necessary information about symbols, denitions,

source locations, documentation, variables, or method signa-

tures from the annotated and instrumented abstract syntax

tree as well as the run-time objects of a language, taking

inspiration from Haupt et al

[2011]. For this purpose, they

execute the code either using unit tests, or if none are avail-

able, the well-known entry points, such as main methods.

To avoid side eects, everything is executed in a sandbox.

Their sandboxing approach also provides the necessary sup-

port to execute the edited les based on their in-memory

versions. Since execution can take noticeable amounts of

time, previously harvested information will be cached.

SimpleLanguage is a small language to document how

the True framework can be used. Stolpe et al

use it as a

running example. It is dynamically typed and has rst-class

functions, local variables, various control structures, and

objects with dynamically added elds. Though, it does not

have any notion of classes or types.

The language server implementation uses the True frame-

work to parse a le whenever the IDE sends le-open or

changed events. When parsing fails, the error is reported as

a diagnostic to the client.

While parsing, functions are collected by the framework

and tools get access to the abstract syntax tree (AST). “Func-

tions” here includes also methods, and things such as static

initializers depending on a language. The AST nodes belong-

ing to a function are language-specic. To have language-

agnostic information about their semantics, the nodes can be

tagged, which is then used for tooling. A node also encodes

the code range it originates from.

To identify for example variable declarations, AST nodes

are tagged with the

DeclarationTag

and the corresponding

node can implement the

getNodeObject()

method which

can provide additional details such as the kind of language

element declared, and any statically known types.

Function calls and similar are identied by the

CallTag

the corresponding AST nodes. The corresponding function

denitions are either identied by name, based on the con-

crete functions called at run time, or using

findTopScopes()

True’s API to look for the name in global scopes. Once can-

didate objects representing for instance the functions are

found, the source position where they are dened can be

requested with True’s findSourceLocation() method.

Variables are tagged with

ReadVariableTag

Write-

VariableTag

to identify all references to a specic variable.

True keeps details on nested scopes to support debug-

gers. It can be queried with

findLocalScopes()

to correctly

distinguish symbols based on the language’s scoping rules.

To answer an IDE’s request for documentation and sig-

nature help, the run-time objects need to implement the

getDocumentation()

and

getSignature()

methods. The

information is enriched with run-time type information that

can be queried using True’s findMetaObject() function.

DLS ’22, December 07, 2022, Auckland, New Zealand Stefan Marr, Humphrey Burchell, and Fabio Niephaus

For code completion, completion candidates are deter-

mined using the scoping-related functions at the point in the

source where there request originates. To distinguish dier-

ent kinds of entities in the IDE, the language interop mecha-

nism is used to identify the entity as for instance callable or

instantiable. A language can further advice an IDE on when

code completion should be triggered by implementing the

getCompletionTriggerCharacters() method.

Code completion for object property accesses needs run-

time information. Since a le typically has a parse error when

requesting the completion, for instance after typing a dot, a

previous version is used to try to obtain a run-time object

and enumerate its members via the language interop API.

Overall, a language implementer needs to add support

for four dierent tags (

DeclarationTag

CallTag

Read-

VariableTag

WriteVariableTag

) on AST nodes, and im-

plement four methods (

getNodeObject

getDocumentation

getSignature

getCompletionTriggerCharacters

) that are

specic to the IDE support. The other APIs have other uses

for tools such as True’s debugger or language interop sup-

port. While these APIs may require substantial implemen-

tation eort, we exclude their implementation eort from

further consideration, since they are not only used for IDE

support.

In their absence, one may want to consider our

parse-based approach to support IDE features more directly,

as discussed in the remainder of this paper.

Overall, we nd that the eort needed to implement IDE-

specic support is minimal when using the approach pro-

posed by Stolpe et al. [2019].

3 A Language-Agnostic Design for a

Parse-Based Language Server

Our goal is to devise a design for a language server that keeps

the eort to gain IDE support as small as possible, while

maximizing the utility of the IDE features. Thus, we aim to

keep the increase in the implementation eort compared to

Stolpe et al

’s execution-based approach as small as possible.

3.1 Architectural Overview

Figure 2 gives a high-level overview of language servers in

the context of the language server protocol. The LSP con-

nects a language server with IDEs (see section 2.1) and is

used for instance to communicate the le content of the le

currently being edit. In the simplest form, an IDE will send

the complete content of a le. It will also send requests for

instance for code completion for which it then will receive

a result. A language server may also send notications, for

instance to report the results of a lint tool.

While the execution-based approach requires a language’s

interpreter, tagged AST nodes, as well as the implementation

of True’s debugging and interop APIs, for our parse-based

We had them already implemented, because debugging and proling sup-

port were more urgent for our uses.

approach, one needs only a parser that creates a collection

of structures, which encodes the relevant information for

an IDE. This collection of structures is then queried by the

language-agnostic part of the language server, whenever the

IDE requests information. Thus, a key insight here is that

the needed information can be represented with language-

agnostic structures to enable IDE services.

Though, our main focus is on the design of the API that

enables us to populate this collection of structures with min-

imal eort during parsing.

3.2 API Overview

The language-agnostic API to record the structural infor-

mation for a le is illustrated in g. 3. It denes only two

new concepts: an element and an element id. However, it

borrows concepts such as tokens and diagnostics from the

LSP, which are indicated with the LSP prex for type names.

The LSP’s enum types identify for instance what kind of

element it is, as dened in the protocol. The others merely

group descriptive information that is used to either locate

an element in a le or display information to the user. The

LSP types do not generally encode structure or semantics

directly (see section 2.1).

With our API, most IDE features can be directly derived

from the notion of the language element, or as the LSP calls

it, a symbol. Elements can be contained in other elements,

and have an id, name, what kind of element they are, a detail

string, a signature descriptor, an overall range of code where

its denition can be found, and the range of code that should

be selected when navigating to it, typically the name.

When parsing and analyzing an input le, the

Structures

class is used to record the information. It provides methods

that record elements either in one or in two steps. This is

useful to capture the nesting of elements. Thus, at one point

the denition is started, and it is completed only later, for

instance once its closing parenthesis is parsed. The nesting

structure can then for instance be displayed in an outline

view. The single-step denition also takes a bool to indicate

whether the denition is after navigation. This enables us to

model for instance object elds.

In addition to recording their denition, we also record

when elements are referenced. They can be referenced either

indirectly with an element id, or directly, for instance when

a class is named and the parser can directly resolve it.

Element ids encode equality classes for language concepts.

By subclassing

ElementId

, we distinguish dierent types of

elements, e.g. classes, methods, and local variables, without

introducing a notion of lookup, lexical scopes, or similar

into the API, keeping the eort of implementing the record-

ing for a new language low. While this does not provide

sucient structure to implement concepts such as types, in

While this approach is somewhat naive, it works well for SimpleLanguage,

Newspeak, and SOM.

Execution vs. Parse-Based Language Servers DLS ’22, December 07, 2022, Auckland, New Zealand

Eclipse

VSCode

NetBeans

Vim

Emacs

Language

Server

Protocol

TypeScript LS

Rust LS

Run-Time LS Parse-Time LS

Interpreter

ASTs with tags

Interop

Debugger

for each language:

Parser

Collection

Structures

communicates:

- file content

- requests, e.g. completion

- notifications, e.g. lint

…

Figure 2. High-level overview of the Execution and Parse-based Language Servers. The language server protocol connects

IDEs with language servers communicating le content, requests (e.g. for code completion), and notications (e.g., for lint

information). The language servers are commonly specic to a language such as TypeScript or Rust. The language-agnostic

approaches dier in their key components.

ElementId

match(query: string): bool

getName(): string

hashCode(): int

equals(obj): bool

Structures

startElement(kind: LSP.KindEnum): Element

startElement(name: string, kind: LSP.KindEnum, id: ElementId,

idRange: LSP.Range, a!erNav: bool, outerScope: int): Element

completeElement(elem: Element)

recordElement(name: string, id: ElementId, kind: LSP.KindEnum,

range: LSP.Range, a!erNav: bool, outerScope: int)

referenceElement(elem: Element, range: LSP.Range)

referenceElement(id: ElementId, range: LSP.Range)

recordToken(range: LSP.Range, type: LSP.TokenEnum,

modiﬁers: LSP.ModiﬁersEnum)

addDiagnostic(diag: LSP.Diagnostic)

Element

setId(id: ElementId)

setName(name: string)

setKind(kind: LSP.KindEnum)

setDetail(val: string)

setSignature(val: LSP.SigInfo)

setRange(val: LSP.Range)

setSelectionRange(val: LSP.Range)

abstract class,

to be implemented

for each distinct

language element

API provided by our framework, used when parsing and processing an input file to record the structural information

Figure 3. Overview of the API of our parse-based Language Server. These concepts are sucient to support all common IDE

features with minimal development eort for a language implementer. Subclasses of

ElementId

need to be implemented for

each distinct language element.

Element

and

Structures

are provided by our framework, and are used to capture all needed

information for the language server. Types with the LSP-prex correspond to the structures in the language server protocol,

and typically consist of source locations and string labels to be shown to the user in the IDE.

practice, it results in a highly useful developer experience for

dynamic languages. Furthermore, we assume that a parser

distinguishes dierent variables in some way. When it rep-

resents them as objects or a unique identier, an element id

only needs to compare these representations to distinguish

variables correctly, too. See section 4 lst. 2 for an example.

Tokens are recorded to enable semantic highlighting. To-

kens distinguish dierent kinds, e.g., for a local access, a

eld access, or a keyword, as well as possible modier, such

as static, abstract, or documentation. The IDE will use this

token information for syntax highlighting. Since we can use

the semantic analysis from the language’s parser/compiler

to disambiguate for instance accesses to elds and local vari-

ables, the resulting highlighting can be more precise than

custom grammar approximations used in many editors, or

even the actual grammar, which may for instance conated

elds and locals to a single type of Identifier.

Finally, diagnostics are additional information, warnings,

or errors to be reported to the developer. They can be syn-

tax errors preventing parsing, or even linter warnings that

identify stylistic issues or code smells.

4 Utilizing the API for SimpleLanguage

We will illustrate how our language-agnostic API can be

used to provide IDE features for SimpleLanguage. SimpleLan-

guage is a language used to document the True framework

(see section 2.3). Its parsers is generated by ANTLR [Parr

2013] and its semantics are implemented using True.

To utilize our design with SimpleLanguage, we forego the

existing SimpleLanguage implementation and use only its

DLS ’22, December 07, 2022, Auckland, New Zealand Stefan Marr, Humphrey Burchell, and Fabio Niephaus

ANTLR grammar le. Thus, we demonstrate how to use our

approach independent of True and that it is general enough

to be used with the widely used ANTLR parser generator.

Structural Adaptation. The SimpleLanguage implemen-

tation and its grammar remain unchanged. This is possible

since ANTLR generates a

ParseTreeListener

, which pro-

vides enter and exit events for all grammar rules. This allows

us to retrieve all relevant information from the parser. To

avoid changes to any of the other generated classes and uti-

lize the grammar actions without change, we provide our

own version of the

SLNodeFactory

class. This class is nor-

mally used to construct True nodes that represent the AST.

In our case, we replace it by a factory that captures the high-

level information already produced by the grammar actions.

We subclassed the generated

SimpleLanguageParser

adapt the parser initialization and use our custom node fac-

tory, our parse tree listener, as well as a custom error lis-

tener. The error listener merely extracts the necessary po-

sition information and adds it as a diagnostic object to the

Structures data structure.

Capturing the Language’s Structures. Our node fac-

tory overrides most methods to use the information from

the parser to record the elements and token semantics as

illustrated in lst. 1. For instance, when parsing a function

denition, rst we call

startElement(name, id, kind,

range)

on line 4 with the lexical details extracted from the

parser. This creates a new

Element

to store these details,

The actions contained in SimpleLanguage’s grammar already

manage some details like adding parameters to a function

object by using the node factory. Thus, here we only need to

override its

addFormalParameter()

method to record an el-

ement for each parameter. Since these elements are implicitly

contained between the

startElement()

and

endElement()

calls for the function, we do not need to handle containment

or scope explicitly.

The other methods overridden in the node factory use

recordToken(range, type, modifiers)

to capture token

semantics for the semantic highlighting (see lines 9 and 14).

In SimpleLanguage, both function denition and assigning

to a name are parsed as variable denitions. Object elds are

created by assigning to an object property. To distinguish

between variable denitions and object properties, we de-

ne two subclasses of

ElementId

, i.e., two dierent equality

classes:

VarId

and

PropertyId

. Lst. 2 sketches

VarId

, and

PropertyId

is similar, but would omit the scope. Since Sim-

pleLanguage does not have any notion of classes or types,

and because object properties can be added arbitrarily, there

is not enough structure in the language, which would benet

from including a scope.

The node factory’s methods for creating read and assign-

ment nodes use

referenceElement(id, range)

to record

the referencing of variables. Note that we rely here solely

on the

ElementId

mechanism (see section 3.2), where our

1 class LS Fa ct or y ext en ds S LN od e Fa ct o ry {

2 @O ve rr id e

3 vo id st ar t Fu nc t io n (T oke n id , To ken s ) {

4 cu rr en tF n = st ru ct ure s . st ar tEl em en t (

5 id . t ext () , new VarI d ( id . t ext ()) ,

6 FUN CTION , id . ge tR an ge ()) ;

8 pa ra mN am es = new Ar rayList < >();

9 st ru ct ur es . r e co rd To ke n ( id , FU NCTION , NONE );

10 }

12 @O ve rr id e

13 SL Sta te m en tN o de c rea te Br ea k ( Tok en b ) {

14 st ru ct ur es . r e co rd To ke n (b , KEYWORD , m od if ie r );

15 ret urn su per . c re at eB rea k (b );

16 } }

Listing 1. Capturing token and structural information in

the node factory, which is otherwise used to construct the

nodes of the SimpleLanguage AST.

1 class VarI d ext ends El em en tI d {

2 pub lic fi nal St rin g na me ;

3 pub lic fi nal El em ent s cop e ;

5 pub lic int has hC od e () {

6 ret urn Obje ct s . hash ( name , sco pe ); }

7 pub lic bool ea n equ als ( Ob je ct o ) {

8 if ( o == n ull || get Cl as s () != o . ge tC la ss ()

9 || ! n ame . equ al s ((( Var Id ) o ). nam e )) {

10 retu rn fal se ;

11 }

12 ret urn sc ope == (( Va rId ) o ). s cop e ;

13 } }

Listing 2. Sketch of

VarId

to identify variable and function

names distinguishing them from object properties in Simple-

Language.

VarId

class encodes the scope of

Element

s, which avoids

the need to manage lexical scopes more explicitly.

The factory method for writing to a property will record a

new element, since in SimpleLanguage writing to an object

property may create it dynamically. The reading of a property

will simply record a reference to an element.

The custom parse tree listener is needed since Simple-

Language’s grammar actions do not pass all relevant lex-

ical details to the node factory. For instance, we use it to

record the tokens for various keywords such as

function

while

, and

return

for highlighting. Furthermore, when

leaving the grammar rule for a function, we add signature

details, based on the seen parameter names, and can trigger

the completion of the function’s element denition by calling

completeElement(elem).

This concludes a brief but representative description of

what a language implementer roughly needs to implement

Execution vs. Parse-Based Language Servers DLS ’22, December 07, 2022, Auckland, New Zealand

to gain IDE support based on our approach. Our implemen-

tations for Newspeak and SOM are fairly similar, the main

dierence is that they subclass custom recursive descent

parsers instead of an ANTLR grammar.

5 Supporting Existing Parsers for Use in

Language Servers

One of the challenges for a language-agnostic language

server is the reliance on existing parsers and/or compilers.

For responsiveness, IDEs may prefer incremental parsers.

They may also prefer parsers that can recover from a wide

range of errors to provide feedback while editing. A naive

heuristic in a language like Java might for instance ignore all

tokens after a parse error until a statement terminating semi-

colon or a closing brace is found, and then restart parsing

from there. For common code edits, such an approach would

mean that most of a le can be parsed and most informa-

tion can be obtained reliably, avoiding undesirable “icker”

eects, for instance when elements constantly appear and

disappear in an outline view.

By striving for a low-eort approach to tooling, we cannot

expect a language implementation to provide an incremen-

tal or even an error-recovering parser. Instead, we have to

assume that it aborts after the rst error. This means our

language-agnostic language server has to compensate for it

and still provide a predictable user experience. We use two

slightly dierent approaches for the handling of structural

data and the handling of semantic highlighting. Though, in

either case, we attempt to reparse a le on every change.

5.1 Caching Structural Information

Whenever a le is changed, we try to parse it with the parser

subclass augmented to collect structural information. Once

we have a successful parse, we will cache the result on the

le level, and use the found classes, methods, etc, for fu-

ture requests, for instance code completion or navigation.

Though, we do not use any information obtained from erro-

neous parse attempts, to minimize UI icker, i.e., elements

appearing and disappearing.

This approach is based on the assumption that most edits

will be made on existing les and that les are typically

checked into a project’s repository without parse errors.

Thus, when opening a le, the parser will initially succeed,

and we cache a valid representation of the le’s structure. The

cache will only be updated after another successful parse.

The worst-case scenario for this approach is when a new

le is created. Depending on the programmer, this le may

take a long time to parse successfully for the rst time, and

until then, no structural information will be available.

The other major weak point is that structural information

can become outdated when major changes are made to a le

without the parser succeeding. This means that position in-

formation may be outdated, for instance when large chunks

Figure 4. For the semantic highlighting, we combine the

parse result from before the error, with the ones of the last

successful parse, which is used for the lines after the error.

are inserted into the le, or deleted from it. In the future,

we may use the

didChange

notication of the LSP to correct

location information.

In our own use, neither of these two scenarios caused

usability issues, and our simple caching approach gives sur-

prisingly usable results. This is because we usually work on

existing les that we change step-by-step with small edits.

5.2 Caching Semantic Highlighting

For the semantic highlighting, the situation is a little dierent.

While the structural information is only observable either in

an outline view, where only very coarse details are given, or

when interacting with the code, the semantic highlighting is

more directly observable as token coloring.

Here we will also only cache results from a successful

parse, but if parsing fails, we combine token information

from the last successful parse with token information from

the one that failed. Specically, from the failing parse, we

take the token information up to the position of the error.

Then we add the token information from the last successful

parse for the following line and beyond. This means, when

there is an error, only the bit after the error, on the same line,

will not be highlighted, as can be seen in g. 4.

The main assumption here is that edits would not have

changed the le drastically, and reusing the older highlight-

ing for the rest of the le is likely ne. Thus, the worst-case

scenario is once again inserting or deleting large chunks of

the le. But, since we use the beginning of the failing parse,

starting a new le still gets useful highlighting information.

Despite the approach being very simplistic and not yet

adapting highlighting information based on the

didChange

notication, the result feels very usable and does not cause

“ickering” highlighting in the editor.

6 Implementation

Our language-agnostic language server is implemented in

Java 17 and uses the LSP4J

library as an implementation of

the LSP. The implementation is available as open source

and has been tested with Visual Studio Code as the client.

A Java implementation of the language server protocol, Eclipse Foundation,

https://github. com/eclipse/lsp4j

Eortless Language Servers, Stefan Marr et al., https://github

com/smarr/e

ortless-language-servers

DLS ’22, December 07, 2022, Auckland, New Zealand Stefan Marr, Humphrey Burchell, and Fabio Niephaus

The server keeps all information for parsed les directly

in memory. The data is kept separately per le and the data

format is close to the one required for the LSP.

The main classes were already illustrated in g. 3. The

Structures

class keeps the data for a specic le. It keeps

a list of scopes, diagnostics, elements for after-navigation

lookups, and maps from element id to element or references.

The nesting of elements is implicitly constructed by the

startElement

and

completeElement

methods. Each ele-

ment itself can contain a list of child elements, as well as a

list of contained references. We use this structure for lookups

based on source locations in the form of line and column

numbers, by identifying the element most specic to the

location. This enables us for instance to identify the current

element for hover information, goto denition, and code

completion requests. For code completion, we also keep a

list of elements that can only appear after navigation, e.g.,

after the dot to access elds or methods in Java.

The list of diagnostics collected during parsing and possi-

bly any compilation steps, are also kept here.

To enable lookups based on names, we keep one map from

ElementId

to a set of

Element

objects, and one to map

from an ElementId to a list of references.

The subclasses of

ElementId

need to implement

equals()

and

hashCode()

methods to dene equivalence classes, for

instance to distinguish local variables from class names, in

languages that distinguish them. The subclasses can also

override the

matches(str)

method to adapt the search se-

mantics based on language-specic requirements. For in-

stance, a language with Pascal case or camel case may want

to adapt the matching so that identiers match based on the

capitalized letters, when someone searches using them as

an abbreviation. Our current standard implementation will

only match the case-insensitive beginning of an element’s

name, which works well for a variety of dierent languages.

Future work could combine our approach with something

like SemanticDB [Burmako 2018] to avoid keeping all data

in memory. Keeping everything in memory makes our lan-

guage server likely impractical for code bases with millions

of lines of code. Though, since our goal is to lower the en-

try barrier for the implementation of language servers, the

corresponding young or researchy languages are unlikely to

have large enough code bases to require such optimizations.

Tradeos between Execution and Parsing

To assess the tradeos between parse and execution-based

language servers, we contrast the requirements for both ap-

proaches, and then compare the precision and completeness,

as well as the eort needed to realize either.

7.1 Architectural Requirements

For both parse and execution-based language servers, we

assume as previously mentioned that the standard parser

of the language implementation is used. Thus, we expect a

batch parser/compiler, which aborts on the rst error. While

incremental parsers would be desirable, they are not required,

but can be supported for both types of language servers.

For our parse-based approach, we need to be able to adapt

the parser. Though, we do not have any other requirements.

While we implemented this approach for languages in Java,

we do not see any obstacles to apply it to languages or LSP

libraries implemented in other languages.

As demonstrated with SimpleLanguage’s ANTLR gram-

mar, it is even possible to support ANTLR-based languages

without any change to their grammar. Instead, all details can

be collected based on ANTLR’s

ParseTreeListener

s, or in

the case of SimpleLanguage, in tandem with the language-

specic node factory.

For SOM we used TrueSOM, and for Newspeak we used

SOMns’ implementation. In both cases, we rely on their hand-

written recursive descent parsers. We were able to subclass

the parser and lexer classes to access the desired information.

For convenience, we changed the parsers and lexers slightly,

to avoid having to track the grammar context in the subclass

explicitly. Thus, we made

private

methods

protected

that we can override them, and in some cases, introduced

methods in the parsers to disambiguate the grammar context,

for instance to distinguish local variables from elds.

Since we cache information only from a successful parse,

as discussed in section 5.1, information for the language

server is only available after parsing indeed succeeded.

For the execution-base approach of Stolpe et al

, the major

requirement is that languages are implemented based on

the True language framework. As detailed in section 2.3,

a language implementer can augment the AST with tags to

identify declarations, function calls, or variable accesses, and

needs to implement a few methods to provide details such

as the kind of element, documentation, and signatures. By

relying on existing True APIs for debugging and language

interoperability, the investment into those also gives the

benet of these other features.

However, the main requirement to collect much of the

desired data is that the code does not only parse success-

fully, but can also be executed. This is needed for instance

to collect structural information, identify concrete elements

referenced, or run-time types, as we detail in the following

section. The biggest drawback of this approach is that one

rst needs well-dened entry points into the code and ide-

ally fully cover the code with unit tests. Furthermore, the

runtime may not have all desired information, because the

language designers may not have needed, e.g., access to class

comments at run time.

7.2 Precision and Completeness of IDE Features

For each of the IDE features, the parse and execution-based

approaches have dierent benets and drawbacks, which we

discuss in the following and summarize in table 1 on page 12.

Execution vs. Parse-Based Language Servers DLS ’22, December 07, 2022, Auckland, New Zealand

Figure 5. The outline view shows all elements dened in a

le and their nesting relation.

Navigational Features.

Symbols. To display a list of elements contained in a le

or workspace, or to display a tree-like outline view as in

g. 5, we need to report all dened symbols to the client.

For the parse-based approach, this is straightforward. The

language-specic part supplies all details via our API. This

includes what kind of element it is, documentation, any avail-

able type information, and source locations.

For the execution-based approach, the situation is slightly

more complex. Before execution, one only has the list of

functions or methods in a le, and can traverse their ASTs to

identify local variable denitions using the

DeclarationTag

(see section 3). Only after execution, information on classes

or other containing structures can be obtained. This infor-

mation may then still be only partial depending on the code

structure. For instance, in the case of Newspeak, which sup-

ports nested classes, executing the top-level le would not

provide information for nested classes. A full suite of test

cases or other well-dened entry points for execution would

be needed to be able to extract all structural information.

Thus, locals and functions can be obtained in both ap-

proaches, but for globals, including classes, as well as prop-

erties the execution-based approach may require execution.

Goto Denition. In our parse-based approach, we utilize

the equality of

ElementId

s (see section 3.2) to nd candidate

denitions when receiving a request for goto denition. For

dynamic languages, this means we may have low precision,

i.e., we may propose candidates that cannot be referenced,

but we can identify candidates in all successfully parsed les.

To distinguish local variables, our approach diers slightly

between the languages. For SOM and Newspeak, the parser

provides us with an object representing the denition, which

allows us to reject variables with the same name in other

functions as candidates. For SimpleLanguage, we include

the current function as part of the element id for variables,

which models the lexical scope accurately (see lst. 2). An

example can be seen in g. 6.

The execution-based approach has initially only the struc-

tural information on functions and variable declarations

Figure 6. The goto denition action shows all candidates

for the

inc

function, directly showing one of the three here.

identied by the

DeclarationTag

. But at run time, it can ob-

serve concrete functions/methods being used at call sites and

obtain globals and information about properties from the de-

bugging and interop APIs. With this, it can either eliminate

candidates with the same name, or rank the ones observed

at execution time higher.

Thus, the execution-based approach can use this extra

information to improve precision, but requires execution to

obtain the globals and properties.

References and Highlights. IDEs may highlight all oc-

currences of a selected symbol, e.g., a variable or function,

and oer a search for all references to a symbol.

With the parse-based approach, we detect all candidates

and can distinguish the reading and writing accesses, as sup-

ported by the protocol. Eliminating candidates, for instance

based on a language’s scoping rules is done based on the

equality semantics of ElementId.

With the execution approach, using the

ReadVariableTag

and

WriteVariableTag

, local variable identity can be deter-

mined without execution. For functions, all candidates can be

determined by name. Based on run-time information, these

can be ranked and possibly even eliminated. References to

globals, e.g., classes, require run-time information.

Informational Features.

Syntax Errors. For reporting syntax errors, or any other

errors reporting during parsing, both approaches have sim-

ilar benets. Since errors are determined already during

parsing, execution is not generally required.

Signature Help. Our parse-based approach can harvest

the details about function signatures, argument names, and

similar during parsing. It can also capture comments, though,

here a change to the parser or lexer may be necessary. In

ANTLR grammars, comments are often skipped at the lexer

level, but can also be redirected to a separate token channel.

Similarly, hand-written parsers may simply discard com-

ments. In either case, the changes to capture comments are

normally fairly localized.

For the execution-based approach, if the language keeps

this information at run time, one may be able to query objects.

For SimpleLanguage, the parser neither keeps comments nor

separates arguments from local variables. Thus, signature

DLS ’22, December 07, 2022, Auckland, New Zealand Stefan Marr, Humphrey Burchell, and Fabio Niephaus

Figure 7. Code completion for functions yields all possible

candidates, including signature information.

help and documentation will be incomplete. On the other

hand, execution can be used to augment signatures for in-

stance with run-time type information.

Hover Information. For hover information, the situation

is similar to signature help. For the parse-based approach,

we can capture for instance class comments and display

them. The execution-based approach will only get access

to such structural information via execution, and only if

it is preserved. On the other hand, it is possible to show

concrete run-time types, or even values, and extract concrete

information from those.

Code Completion. Supporting code completion brings its

own set of challenges. Stolpe et al

already identied the most

important issue: in many cases code completion is requested

at a point where the code may not currently parse correctly,

which is a challenge both approaches need to overcome.

Completing Globals and Locals. In the simplest case

for code completion, we may try to complete some kind of

basic expression. Depending on the language, many of these

requests will parse successfully. In our parse-based approach,

we either propose all symbols, i.e., elements at this point in

the scope, or when the cursor is at the end of an identier,

use this identier to select candidates with partial name

matching. Generally, we walk the nested

Element

objects,

and propose all elements found in the lexical order, inside

out, which seems to provide good results.

With the execution-based approach, results from the last

successful parse and/or run are used. Information for locals

can be used from the last parse and will be complete. Though,

globals require run-time information. For improved results,

the run-time information can be used to eliminate and rank

the completion candidates.

Completing Functions. For functions and methods, both

approaches rely on the last successful parse of the le, which

yields all lexically dened functions and methods as illus-

trated in g. 7. Though, run-time information can be used

to eliminate or rank completion candidates.

Completing Properties. A complication for oering code

completion of property names is that in many languages

the parser will reject the program at the point when the

character to access a property, e.g., the dot as in

obj.member

is typed into the editor. The program will only parse when

at least the rst letter of the property name is entered, too.

For our parse-based approach, this did not present a major

challenge. We keep the structural information from the last

successful parse, which in most cases will be after

obj

was

typed completely. When the parsing fails, we see that the

last inserted character was a dot and can use this to query for

completions that are after navigation as we call it. We include

this concept directly in the API to be able to more easily

identify completion candidates. In most languages, members

in the form of elds and methods will follow lookup rules

separate from lexical scoping. The after-navigation ag is

part of the denition of

Element

s, and considered when

searching for candidates. When there are already characters

of a property name, we will use it to narrow down the list

of candidates. Otherwise, we propose all candidates of the

current le rst, and then all the other ones.

As mentioned earlier, we do not have any type informa-

tion or other structural details that could be used to further

narrow down the candidates, which is a classic challenge for

tooling for dynamic languages.

For the execution-based approach, an earlier successful

result will be used. Here the concrete type, e.g., of the object

before the dot can be queried for its members, which are

then proposed as completion candidates. Assuming that the

type observed in the test run is representative, this gives

highly accurate completion proposals.

Convenience Features. Features such as linting and se-

mantic highlighting are popular in IDEs, and add to their

convenience.

Linting. Linters often try to identify supercial or easy-

to-spot issues in code. They can comment on code formatting,

style, and sometimes even bugs.

In our parse-based approach, we added a few linters that

address basic issues. The simplest is a linter that detects

whether a le ends with a newline, which is something of a

convention in some environments. Perhaps more useful, we

also have a linter that checks all references created during

parsing for a corresponding candidate denition. If no de-

nition is found, depending on the language, it may indicate

a bug. These linters are opt-in per language and either work

on the whole workspace, or just on a single le.

For the execution-based approach, it is conceivable to

implement some basic linting, too. For instance, since all

method and function calls are tagged with

CallTag

, one can

identify whether there are any calls for which no candidates

exist. The errors of unit tests executed in the background

may also be used to add lint-like hints.

Semantic Highlighting. Syntax highlighting is popular

in many editors and IDEs. Usually, this requires replicating

Execution vs. Parse-Based Language Servers DLS ’22, December 07, 2022, Auckland, New Zealand

a language’s parser in some framework to express the high-

lighting information. These days TextMate grammars

and

Tree-sitter

seem popular. Though, in the author’s expe-

rience, for researchy languages, these grammars are often

minimal, incomplete approximations of the actual language.

With our parse-based approach, we capture the seman-

tics of all relevant tokens using the semantic analysis of the

existing parser/compiler. Thus the highlighting can distin-

guish, e.g., elds and methods, or locals and arguments. In

our experience, capturing this information requires typically

a single line of code per token type.

Marr et al

[2017] demonstrated the use of AST tagging to

a similar eect. Unfortunately, this requires many dierent

AST tags, and will only highlight the body of functions and

methods. Thus, it would be possible also in a execution-based

approach, but likely feel rather incomplete.

Code Lenses for Unit Tests. The LSP supports the notion

of code lenses, which enables us to annotate specic code

ranges for instance with additional information and actions.

In our parse-based approach, languages can opt into such

lenses and congure them. For instance, we provide a code

lense to detect methods that start with

test

as common for

the xUnit frameworks.

While not mentioned in Stolpe et al

[2019], providing code

lenses for such scenarios should be trivial, since it already

relies on unit tests to harvest information for the IDE.

7.3 Implementation Eort

To compare implementation eorts, we have only a limited

amount of data available. Though, for the execution-based

approach, in the ideal case, one needs to merely add tags

to AST nodes and implement a few small methods. Adding

tags is as simple as implementing a

hasTag(tag)

method,

returning

true

for the four tags to indicate declarations, calls,

reads, and writes on the relevant nodes. One may argue that

the APIs for debugging and language interop are already

implemented for the other desirable functionality. Thus, the

eort is minimal.

To assess the implementation size, we use

cloc

version

1.92, and report the number of lines of code (LOC).

For SimpleLanguage, we estimate that the implementation

of the interop and debugging APIs is about 454 LOC. This

is an estimate, since there is in some cases not a clear sep-

aration, and we simply counted all lines of code of the classes

SLObject

SLFunction

SLLanguageView

, and

SLType

, which

are dominated by the debugging and interop APIs.

In comparison, the language-specic code for our parse-

based approach is 491 LOC for SimpleLanguage, 920 LOC

for SOM, and 711 LOC for Newspeak.

TextMate: Language Grammars, MacroMates Ltd., https://macromates

om/manual/en/language_grammars

Tree-sitter, GitHub Inc., https://tree-sier. github. io/

Much of it is however boilerplate. For SimpleLanguage,

considering only the parser subclass, our custom node fac-

tory, and the token listener, all information is collected with

merely 320 LOC. In the SOM implementation, the main code

is in the parser and lexer subclasses, which add up to 472

LOC. For Newspeak, we only needed to subclass the parser,

which amounts to 473 LOC. The other code is boiler plate,

equal()

and

hashCode()

methods for element ids, as well

as conversion functions from one representation of a lexical

position to another.

To provide a rough comparison, we sampled 20 language

server implementations from the LSP wiki,

for languages

we had heard of before. The goal here is to gain an intuition

of how our numbers compare to ad hoc language server

implementations. Note, these numbers are not well-dened

baseline for comparison, since these servers can spent arbi-

trary amounts of eort on better analysis and more precise

results and support dierent sets of LSP features, which we

did not access here. Comparing against the execution-based

approach provides are more well-dened baseline.

Taking the latest version of the git repositories, removing

tests and conguration les, and measuring the LOC, we

nd the smallest language server to be 1

577 LOC and the

largest one 95

878 LOC, Microsoft’s deprecated open source

Python language server.

The mean is 18

985 LOC and the

median 12

055 LOC. Some repositories contain however also

the language server protocol implementation itself, making

the numbers appear somewhat larger.

Overall, we observe that with our approach, a language

server is about an order of magnitude less code, while pro-

viding the most widely used IDE features.

8 Discussion

The focus so far was on the specic IDE features, though

there are a number of other concerns worth discussing, in-

cluding language built-ins, polyglot code bases, various limi-

tations, as well as how practical batch parsers are from the

perspective of performance.

Recognizing Built-Ins. The execution-based approach

is implicitly able to capture all functions that are built into

a language implementation, since they are announced via

True’s standard instrumentation mechanisms.

However, with our approach, we require the information

to be given via our proposed API. This means built-ins are

often not directly represented. In the case of Newspeak and

SOM, we already collected the built-ins for other tooling, and

they were readily available. Though, for SimpleLanguage we

only used the ANTLR grammar, which means, this informa-

tion is not directly available, but can be provided manually.

Implementations: Language Servers, Microsoft, https://microsoft

github

o/language-server-protocol/implementors/servers/

Microsoft’s current PyLance Python Language Server is closed source,

and not available for comparison.

DLS ’22, December 07, 2022, Auckland, New Zealand Stefan Marr, Humphrey Burchell, and Fabio Niephaus

Table 1. Tradeos between Parse and Execution-based Language Servers

LSP Feature Parse Time Run Time

Symbols: locals, functions ✓ ✓

Symbols: globals, properties ✓ ⋆

Goto denition: locals ✓ ✓

Goto denition: functions ✓ ✓ + call targets

Goto denition: globals, properties ✓ ⋆ + types and properties

References and Highlights: locals ✓ ✓

References and Highlights: functions ✓ ✓ + call targets

References and Highlights: globals, properties ✓ ⋆ + types and properties

Syntax Errors ✓ ✓

Signature Help ✓ ⋆ + documentation (if preserved), types

Hover Information ✓ ⋆ + documentation (if preserved), types

Completion: globals ✓ ⋆

Completion: locals ✓ ✓

Completion: functions ✓ ✓ + call targets

Completion: properties ✓ ⋆ + types

Linting ✓ ⋆ + unit test errors

Semantic Highlighting ✓  + run-time tags [Marr et al. 2017]

Code Lens for Unit Tests ✓ ⋆

✓

Available directly



Limited static information

⋆

Reaches similar level after collecting run-time information

+ Enhanced details as more information can be observed at run time

Support for Polyglot Code Bases. One of the notable

features of the True and GraalVM ecosystem is its support

for polyglot programming. As already noted by Stolpe et al

[2019], the language server has immediate support for it,

simply be virtue of using the language interop API [Grimmer

et al. 2018] to obtain the needed information.

In our parse-based approach, this is not considered directly.

Even with the three languages supported, it does not work

out of the box to get for instance code completion proposals

from the other languages.

To support it, we could take inspiration from True’s in-

terop approach and add abstract methods on

ElementId

such

isExecutable()

and

isMemberReadable()

for which

languages can then indicate whether an element is suitable,

for instance as candidate for code completion given the cur-

rent lexical context.

Supporting Non-Local Syntax Features. When imple-

menting our approach for Newspeak, we stumbled over

Newspeak’s primary factory method syntax, which is essen-

tially a constructor. Lst. 3 sketches the

Error

class, where

the constructor takes

msg

as an argument. Inside of what

one could consider the lexical body, we dene in line 2 the

public eld message.

For our API (see g. 3), this meant we needed to add

outerScope

parameter, in which we pass an integer to

1 class Erro r sign al : msg = (

2 | pub lic m ess ag e = ms g . |

3 s elf sig nal .

4 )()

Listing 3. Newspeak class syntax denes elds inside the

constructor.

indicate the number of levels to go up the nesting chain

when recording an element. Though, Newspeak is not the

only language that benets from this approach. Python and

Ruby create elds during the constructor execution, as part

of the initial assignment. With this feature, these denitions

can be attributed to the surrounding class.

For other discontinuous language features, for instance

extensions to the same class in dierent Ruby les, one can

rely on the equivalence of

ElementId

objects and have the

dierent extension parts be equivalent, which ensures that

e.g., references and code completion work as expected.

Limitations and Practical Issues. From the user per-

spective, one observable limitation will be that code com-

pletion will oer to complete variables that have not been

dened yet. In some scenarios, this might even be useful,

e.g., when reordering code. Only on completion, the linter

or parser will produce a warning when the code is invalid.

Execution vs. Parse-Based Language Servers DLS ’22, December 07, 2022, Auckland, New Zealand

Another aspect that can cause practical limitations is the

parser design. For instance in the case of SOM, it is dicult

to get the correct details for code completion of a superclass’s

name. The problem is that the parser tries to load the super-

class, and then triggers Smalltalk code to handle the load

error, which does not preserve the error location. This makes

it dicult to support completion at this particular point. We

can imagine that other languages have other oddities, where

changes to the parser might be desirable.

For some languages, parsers may even minimize the work

performed for an initial parse. V8 does for instance delay

the full parsing of a function body to its rst activation. For

tooling such as ours, this is of course undesirable. In these

cases, one must hope that the parser has a ag like V8’s that

allows one to disable this optimization.

Performance of Batch Parsers. Since our approach relies

on parsers that already exist, and are not designed for use in

IDEs, we wondered what their performance would be. There

are of course conceptual limitations, such as not having error

recovery, and thus, only partial information for les with

errors. Though, this aside, we wondered whether the parsers

would be fast enough for usual usage scenarios.

To investigate the question, we build code generators for

Newspeak, SimpleLanguage, and SOM, which, based on a

library of methods, would generate les of arbitrary length.

Then we built benchmarks that measure the parsing speed,

as well as the overhead of answering the requests for all

symbols and the semantic tokens, which an IDE would query

relatively often after le changes to update the outline view

and semantic highlighting information.

Running these benchmarks on a decade-old Intel Xeon E5-

2620 with 16GB RAM, Ubuntu 20.04 and Java 17, makes us be-

lieve that there is no practical issue preventing our approach

from succeeding. The results are depicted in g. 8. They

indicate that batch parsers are fast enough, even for large

les. The SimpleLanguage parser, being based on ANTLR

is the slowest, but only takes 215ms for 10,000 LOC. The

Newspeak and SOM parsers stay well below it with 86ms

and 78ms respectively.

Johnson [2014] suggests that humans expect feedback in

under one second, but ideally expect a visible eect in about

100ms. For the combination of slower parsers and large les,

one thus may need to utilize the LSP’s support for partial

results and intermediate progress updates.

Current Status of the GraalVM Language Server. In

this paper, we focused our comparison on the system de-

scribed by Stolpe et al

[2019]. Though, as mentioned in

the beginning, it was the foundation for the GraalVM lan-

guage server. One important change in the product is that

the

DeclarationTag

was removed. For static information,

the system now delegates to existing language servers.

For widely used languages, this is a strategy that allows

the users to benet from the combination of a fully-featured

100

10000

1 10 100 1000 10000 1000000

lines of code

time in ms (lower is beer)

Newspeak

SimpleLanguage

SOM

Figure 8. Performance results of SimpleLanguage’s ANTLR-

based parser, as well as the custom recursive descent parsers

for Newspeak and SOM for les with dierent numbers of

lines of code. They take at most 215ms for processing a le

with 10,000 LOC. The area around the lines indicates the 5

and 95

percentile of 1,000 measurements.

custom-built language server, and the enhanced precision

and polyglot capabilities based on the run-time information.

Though, for niche and researchy languages this means that

our parse-based approach is needed to ll in the gap.

9 Conclusion and Future Work

We propose a novel API that enables implementers of dy-

namic languages to provide information to a language server

that covers many of the common features desired in an IDE.

The API design makes it independent of any implementation

framework and enables the use of preexisting parsers.

We compare our approach to Stolpe et al

[2019]’s execution-

based approach. In the best case, they require only a few lines

of code to tag AST nodes, and a small number of methods

to extract structural details. However, the drawback is that

les need to be executable and have either standard entry

points or be executed by unit tests to provide much of the

desirable information to the language server.

Our approach captures all information at parse time, and

sacrices only some precision when selecting candidates, e.g.

for highlighting references or code completion.

From our three implemented languages, SOM required

with 920 lines the most code overall. Much of it is Java boiler-

plate and largely straightforward code turning data from one

form into another. Furthermore, it is an order of magnitude

less code than the median 12

055 LOC for the 20 language

servers we sampled.

Since we use existing parsers, which are not designed for

use in IDEs, we also conrmed that their performance for

reasonably sized les is practical for continuous use, perhaps

to reparse after a keystroke.

DLS ’22, December 07, 2022, Auckland, New Zealand Stefan Marr, Humphrey Burchell, and Fabio Niephaus

Combined with the execution-based approach, our ap-

proach has the potential to give highly precise and complete

information for an IDE with signicantly less code than nor-

mal language servers. Thus, we believe our approach is not

just practical, but even a desirable design for building tooling

for existing niche and research languages.

Furthermore, by itself, our design enables low-eort lan-

guage servers and would be an ideal complement to libraries

implementing the language server protocol to give develop-

ers the option to support many IDE features rapidly.

Future Work. In this work, we focused on dynamic lan-

guages. In a sense, they are more forgiving and user expec-

tations around the precision of code completion may also be

dierent than perhaps for statically-typed languages. Thus,

one of the next steps would be to explore how to benet

from type information in a language-agnostic way to expand

the proposed API to support such languages better.

With our support for basic language-agnostic linting, we

demonstrated that a useful correctness check can be done

with minimal eort. Since static analysis frameworks such

as Infer rst map to an abstract intermediate form, one could

investigate what a language-agnostic language server would

need to capture to enable more advanced analyses.

Acknowledgments

The authors would like to thank Owen Oclee for the initial

implementation of the SOM language server, paving the way

for this language-agnostic language server. We also thank

the DLS’22 reviewers for their detailed and constructive

suggestions.

This work was supported by the Engineering and Physi-

cal Sciences Research Council (EP/V007165/1) and a Royal

Society Industry Fellowship (INF\R1\211001).

References

Eugene Burmako. 2018. SemanticDB: A Common Data Model for Scala

Developer Tools. In Proceedings of the 3rd ACM SIGPLAN International

Workshop on Meta-Programming Techniques and Reection (Boston, MA,

USA) (META’18). ACM, 2. https://doi.org/10.1145/3281074. 3281076

Sebastian Erdweg, Tijs van der Storm, Markus Völter, Laurence Tratt, Remi

Bosman, William R. Cook, Albert Gerritsen, Angelo Hulshout, Steven

Kelly, Alex Loh, Gabriël Konat, Pedro J. Molina, Martin Palatnik, Risto Po-

hjonen, Eugen Schindler, Klemens Schindler, Riccardo Solmi, Vlad Vergu,

Eelco Visser, Kevin van der Vlist, Guido Wachsmuth, and Jimi van der

Woning. 2015. Evaluating and comparing language workbenches: Exist-

ing results and benchmarks for the future. Computer Languages, Systems

& Structures 44 (2015), 24–47. https://doi

org/10

1016/j

2015

007

Special issue on the 6th and 7th International Conference on Software

Language Engineering (SLE 2013 and SLE 2014).

Moritz Eysholdt and Heiko Behrens. 2010. Xtext: Implement your Lan-

guage Faster than the Quick and Dirty way. In Proceedings of the ACM

international conference companion on Object oriented programming sys-

tems languages and applications companion (SPLASH ’10). ACM, 307–309.

https://doi. org/10.1145/1869542.1869625

Matthias Grimmer, Roland Schatz, Chris Seaton, Thomas Würthinger,

Mikel Luján, and Hanspeter Mössenböck. 2018. Cross-Language In-

teroperability in a Multi-Language Runtime. ACM Transactions on

Programming Languages and Systems 40, 2 (June 2018), 1–43. https:

//doi. org/10.1145/3201898

Michael Haupt, Michael Perscheid, and Robert Hirschfeld. 2011. Type

Harvesting: A Practical Approach to Obtaining Typing Information

in Dynamic Programming Languages. In Proceedings of the 2011 ACM

Symposium on Applied Computing - SAC '11. ACM, 1282–1289. https:

//doi. org/10.1145/1982185.1982464

Je Johnson. 2014. Designing with the Mind in Mind, Second Edition: Simple

Guide to Understanding User Interface Design Guidelines (2nd ed.). Morgan

Kaufmann Publishers Inc., San Francisco, CA, USA.

Lennart C.L. Kats and Eelco Visser. 2010. The Spoofax Language Work-

bench: Rules for Declarative Specication of Languages and IDEs. In

Proceedings of the ACM International Conference on Object Oriented Pro-

gramming Systems Languages and Applications (OOPSLA ’10). Association

for Computing Machinery (ACM), 444–463. https://doi

org/10

1145/

1932682. 1869497

Sven Keidel, Wulf Pfeier, and Sebastian Erdweg. 2016. The IDE Portability

Problem and Its Solution in Monto. In Proceedings of the 2016 ACM

SIGPLAN International Conference on Software Language Engineering

(Amsterdam, Netherlands) (SLE ’16). ACM, 152–162. https://doi

org

/10. 1145/2997364.2997368

Paul Klint, Tijs van der Storm, and Jurgen Vinju. 2009. RASCAL: A Domain

Specic Language for Source Code Analysis and Manipulation. In 2009

Ninth IEEE International Working Conference on Source Code Analysis and

Manipulation (SCAM ’09). IEEE, 168–177. https://doi

org/10

1109/SC

AM. 2009.28

Stefan Marr, Carmen Torres Lopez, Dominik Aumayr, Elisa Gonzalez Boix,

and Hanspeter Mössenböck. 2017. Kómpos: A Platform for Debugging

Complex Concurrent Applications. , 2 pages. https://doi

org/10

1145/

3079368. 3079378

Mauricio Verano Merino, Jurgen Vinju, and Tijs van der Storm. 2020. Bacatá:

Notebooks for DSLs, Almost for Free. The Art, Science, and Engineering

of Programming 4, 3 (Feb. 2020), 11:1–11:38. https://doi

org/10

22152/pr

ogramming-journal.org/2020/4/11

Terence Parr. 2013. The Denitive ANTLR 4 Reference (2nd ed.). Pragmatic

Bookshelf. 328 pages.

Daniel Stolpe, Tim Felgentre, Christian Humer, Fabio Niephaus, and Robert

Hirschfeld. 2019. Language-Independent Development Environment

Support for Dynamic Runtimes. In Proceedings of the 15th ACM SIGPLAN

International Symposium on D ynamic Languages (DLS’19). ACM, 80–90.

https://doi. org/10.1145/3359619.3359746

Michael L. Van de Vanter, Chris Seaton, Michael Haupt, Christian Humer,

and Thomas Würthinger. 2018. Fast, Flexible, Polyglot Instrumentation

Support for Debuggers and other Tools. Programming Journal 2, 3 (March

2018), 30. https://doi

org/10

22152/programming-journal

org/2018/2/14

Thomas Würthinger, Andreas Wöß, Lukas Stadler, Gilles Duboscq, Doug Si-

mon, and Christian Wimmer. 2012. Self-Optimizing AST Interpreters. In

Proceedings of the 8th Dynamic Languages Symposium (Tucson, Arizona,

USA) (DLS’12). ACM, 73–82. https://doi.org/10.1145/2384577.2384587