
Compilers

From source code to machine code


Writing an interpreter (virtual machine) for a simple byte-code + JIT compilation

Level of difficulty Medium
Reading time 10 min
Views 1.1K

There are two articles in Russian whose author writes a virtual machine (interpreter) for executing a simple bytecode and then applies different optimizations to make it faster. There is also a compiler from a simple C-like language into this bytecode. After reading these articles and getting familiar with the compiler, I thought it would be interesting to write a virtual machine for this language that could JIT-compile the bytecode with the libjit library. This article describes that experience.

I found several articles online that describe how to use this library, but the ones I saw describe compiling specific programs with libjit, while I was interested in compiling arbitrary bytecode. For people interested in further reading, there is an official tutorial, a series of articles and a series of comparisons (in Russian).

The implementation was done in C++ because we aren't playing games here. All my code is in my repository. The "main" branch has just the interpreter of the PigletVM bytecode; "labels-with-fallbacks" has a partial JIT compilation implementation (that doesn't support JUMP instructions); "full-jit" has fully working JIT compilation; "making-jit-code-faster" makes the code generated by the JIT run faster; and the "universal-base-vm*" branches merge the interpreter and JIT compilation implementations by introducing a generalised base executor that can be used for different implementations of PigletVM (both the interpreter and libjit compilation).
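To make the idea of "an interpreter for a simple bytecode" concrete, here is a minimal sketch of a stack-based interpreter loop in C++. The opcode names, their encoding and the `run` function are hypothetical and chosen only for illustration; they are not PigletVM's actual instruction set, and the sketch covers only plain interpretation, not JIT compilation with libjit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for illustration; not PigletVM's real instruction set.
enum Op : std::int32_t { PUSH, ADD, MUL, JUMP_IF_ZERO, HALT };

// Interprets the program and returns the value popped by HALT.
// Operands (the PUSH immediate, the jump target) follow the opcode inline.
std::int32_t run(const std::vector<std::int32_t>& code) {
    std::vector<std::int32_t> stack;
    std::size_t ip = 0;  // instruction pointer: index into `code`

    auto pop = [&stack]() {
        assert(!stack.empty() && "stack underflow");
        std::int32_t value = stack.back();
        stack.pop_back();
        return value;
    };

    while (ip < code.size()) {
        switch (code[ip++]) {
        case PUSH:
            assert(ip < code.size() && "missing PUSH operand");
            stack.push_back(code[ip++]);
            break;
        case ADD: {
            std::int32_t b = pop();
            std::int32_t a = pop();
            stack.push_back(a + b);
            break;
        }
        case MUL: {
            std::int32_t b = pop();
            std::int32_t a = pop();
            stack.push_back(a * b);
            break;
        }
        case JUMP_IF_ZERO: {
            assert(ip < code.size() && "missing jump target");
            std::int32_t target = code[ip++];
            if (pop() == 0) ip = static_cast<std::size_t>(target);
            break;
        }
        case HALT:
            return pop();
        default:
            assert(false && "unknown opcode");
            return 0;
        }
    }
    // Program ran off the end without HALT: report the top of the stack, if any.
    return stack.empty() ? 0 : stack.back();
}
```

For example, `run({PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT})` evaluates (2 + 3) * 4. A JIT compiler would replace the `switch` dispatch with native code generated once per program, which is what the article explores with libjit.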

Total votes 3: ↑3 and ↓0 +3
Comments 10

DSL (domain-specific language) implementation with macros

Level of difficulty Medium
Reading time 8 min
Views 1.9K

This is a translation of my own article

A release of the NewLang language with a brand new "feature" is coming: a remodeled version of the preprocessor that lets you extend the language syntax and create different DSL dialects using macros.

What is it about?

A DSL (domain-specific language) is a programming language specialized for a specific application area. It is believed that using a DSL significantly raises the level of abstraction of the code, which makes development faster and more efficient and greatly simplifies the solution of many problems.

Broadly, we can distinguish two approaches to DSL implementation:

  • Development of independent syntax translators using lexer and parser generators to define the grammar of the target language through BNF (Backus–Naur form) and regular expressions (Lex, Yacc, ANTLR, etc.) and then compiling the resulting grammar into machine code.
  • Development or integration of the DSL dialect into a general-purpose language (metalanguage), including the use of various libraries or special parsers / preprocessors.

We will talk about the second option, namely the implementation of DSL on the basis of general-purpose languages (metalanguages) and the new implementation of macros in NewLang as the basis for DSL development.

Total votes 2: ↑2 and ↓0 +2
Comments 2

NFun — expression evaluator for .Net

Reading time 6 min
Views 2.1K

NFun is an embedded language and expression executor that supports primitive types, arrays, structures and lambda expressions.

You have most likely already run into tasks that require such a tool. In this article I want to show examples of its use, its capabilities, and why it may be useful to you.

Let's learn some nFun!
Rating 0
Comments 1

Structured Logging and Interpolated Strings in C# 10

Level of difficulty Medium
Reading time 10 min
Views 39K

Structured logging is gaining more and more popularity in the developer community. In this article I'd like to demonstrate how we can use structured logging with the Microsoft.Extensions.Logging package and show how we can extend it using the new features of C# 10.

Total votes 6: ↑6 and ↓0 +6
Comments 0

Comparing Huawei ExaGear to Apple's Rosetta 2 and Microsoft's solution

Reading time 7 min
Views 3.5K

November 10, 2020 marked in many ways a landmark event in the microprocessor industry: Apple unveiled its new Mac Mini, the main feature of which was the new M1 chip, developed in-house. It is not an exaggeration to say that this processor is a landmark achievement for the ARM ecosystem: finally there was an ARM architecture chip whose performance surpassed x86 architecture chips from competitors such as Intel, in a niche they had dominated for decades.

But the main interest for us is not the M1 processor itself, but the Rosetta 2 binary translation technology. It allows the user to run legacy x86 software that has not been migrated to the ARM architecture. Apple has a lot of experience in developing binary translation solutions and is a recognized leader in this area. The first version of the Rosetta binary translator appeared in 2006, where it aided Apple in the transition from PowerPC to the x86 architecture. Although this time the platforms were different from those of 2006, it was obvious that all the experience that Apple engineers had accumulated over the years was not lost, but was used to develop the next version, Rosetta 2.

We were keen to compare this new solution from Apple with a similar product, Huawei ExaGear (with its lineage from Eltechs ExaGear), developed by our team. At the same time, we evaluated the performance of binary translation from x86 to Arm provided by Microsoft (part of MS Windows 10 for Arm devices) on the Huawei MateBook E laptop. At present, these are the only other x86 to Arm binary translation solutions that we are aware of on the open market.

Total votes 2: ↑2 and ↓0 +2
Comments 0

Is PHP compilable?! PVS-Studio searches for errors in PeachPie

Reading time 22 min
Views 653

PHP is widely known as an interpreted programming language used mainly for website development. However, few people know that PHP also has a compiler to .NET – PeachPie. But how well is it made? Will the static analyzer be able to find actual bugs in this compiler? Let's find out!

Total votes 2: ↑2 and ↓0 +2
Comments 0

Checking Clang 11 with PVS-Studio

Reading time 10 min
Views 679
PVS-Studio: I'm still worthy

Every now and then, we have to write articles about how we've checked another fresh version of some compiler. That's not really much fun. However, as practice shows, if we stop doing that for a while, folks start doubting whether PVS-Studio is worth its title of a good catcher of bugs and vulnerabilities. What if the new compiler can do that too? Sure, compilers evolve, but so does PVS-Studio – and it proves, again and again, its ability to catch bugs even in high-quality projects such as compilers.
Total votes 3: ↑2 and ↓1 +1
Comments 0

Using Flex (Fast Lexical Analyzer Generator)

Reading time 5 min
Views 6.7K
Lexical analysis is the first stage of the compilation process. Its job is to turn source code into a sequence of tokens. It takes an input character sequence and determines which token starts at the current position: a language keyword, an identifier, a constant (also called a literal), or, maybe, an error. A lexical analyzer (also known as a tokenizer) sends the stream of tokens on to a parser, which builds an AST (abstract syntax tree).

It's possible to write a lexer from scratch, but it is much more convenient to use a lexer generator. If we define some rules that correspond to the syntax of the input language, we get a complete lexical analyzer (tokenizer), which can extract tokens from the text of an input program and pass them to a parser.

One such generator is Flex. In this article, we'll examine how it works in general, and look at some nontrivial nuances of developing a lexer with Flex.
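As a rough illustration of what a lexer does, here is a small hand-written tokenizer in C++. It is a sketch of the "from scratch" approach, not Flex output or Flex syntax; the token kinds, the keyword list and the `tokenize` function are assumptions made up for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token categories for a tiny C-like language.
enum class TokenKind { Keyword, Identifier, Number, Symbol, Error };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits `src` into tokens, skipping whitespace between them.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    auto is_alnum = [&src](std::size_t k) {
        return std::isalnum(static_cast<unsigned char>(src[k])) || src[k] == '_';
    };
    auto is_digit = [&src](std::size_t k) {
        return std::isdigit(static_cast<unsigned char>(src[k])) != 0;
    };

    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }

        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Identifier or keyword: a letter or '_' followed by letters, digits, '_'.
            while (i < src.size() && is_alnum(i)) ++i;
            std::string word = src.substr(start, i - start);
            bool keyword = (word == "if" || word == "while" || word == "return");
            out.push_back({keyword ? TokenKind::Keyword : TokenKind::Identifier, word});
        } else if (std::isdigit(c)) {
            // Integer literal: a run of decimal digits.
            while (i < src.size() && is_digit(i)) ++i;
            out.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (std::string("+-*/=<>(){};").find(src[i]) != std::string::npos) {
            // Single-character operator or punctuation.
            ++i;
            out.push_back({TokenKind::Symbol, src.substr(start, 1)});
        } else {
            // Anything else is reported as an error token, one character at a time.
            ++i;
            out.push_back({TokenKind::Error, src.substr(start, 1)});
        }
    }
    return out;
}
```

Calling `tokenize("if x1 = 42;")` yields five tokens: a keyword, an identifier, a symbol, a number and a symbol. With a generator such as Flex, each branch of this loop would instead be written as a pattern with an attached action, and the scanning code would be generated.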
Total votes 5: ↑5 and ↓0 +5
Comments 4

PVS-Studio is now in Compiler Explorer

Reading time 4 min
Views 990

Not long ago, a landmark event happened: PVS-Studio appeared in Compiler Explorer! Now you can quickly and easily analyze code for errors right on the godbolt.org site (Compiler Explorer). This feature opens up a large number of new possibilities, from satisfying curiosity about the analyzer's abilities to quickly sharing check results with a friend. This article explains how to use these features. Caution: large GIFs!
Total votes 1: ↑1 and ↓0 +1
Comments 0

Checking the GCC 10 Compiler with PVS-Studio

Reading time 9 min
Views 1.6K

PVS-Studio vs GCC 10

The GCC compiler is written with copious use of macros. Another check of the GCC code using PVS-Studio once again confirms our team's opinion that macros are evil in the flesh. Such code is hard to review not only for the static analyzer, but also for a developer. GCC developers are certainly used to the project and are well versed in it. Nonetheless, it is very difficult for an outsider to understand it. In fact, because of macros, it was not possible to check the code fully. However, the PVS-Studio analyzer, as always, showed that it can find errors even in compilers.
Total votes 4: ↑3 and ↓1 +2
Comments 1

Checking the Ark Compiler Recently Made Open-Source by Huawei

Reading time 6 min
Views 945

During the summer of 2019, Huawei gave a series of presentations announcing the Ark Compiler technology. The company claims that this open-source project will help developers make the Android system and third-party software much more fluent and responsive. By tradition, every new promising open-source project goes through PVS-Studio for us to evaluate the quality of its code.


The Ark Compiler was first announced by Huawei at the launch of the new smartphone models P30 and P30 Pro. It is claimed that the Ark Compiler will improve the fluency of the Android system by 24% and response speed by 44%. Third-party Android applications are also claimed to gain a 60% speed-up after recompilation with the Ark Compiler. The open-source version of the project is called OpenArkCompiler; its source code is available on Gitee, a Chinese counterpart of GitHub.
Total votes 24: ↑24 and ↓0 +24
Comments 0

Why LLVM may call a never called function?

Reading time 11 min
Views 6.5K
I don’t care what your dragon’s said, it’s a lie. Dragons lie. You don’t know what’s waiting for you on the other side.

Michael Swanwick, The Iron Dragon’s Daughter
This article is based on a post in Krister Walfridsson's blog, “Why undefined behavior may call a never called function?”.

That post draws a simple conclusion: in the presence of undefined behavior, a compiler can do anything, even something absolutely unexpected. In this article, I examine how this optimization works internally.
Total votes 8: ↑7 and ↓1 +6
Comments 0

Finding Bugs in LLVM 8 with PVS-Studio

Reading time 24 min
Views 2.5K
PVS-Studio and LLVM 8.0.0

It's been two years since we last checked the code of the LLVM project with PVS-Studio, so let's see if PVS-Studio is still the leader among tools for detecting bugs and security weaknesses. We'll do that by scanning the LLVM 8.0.0 release for new bugs.
Total votes 26: ↑26 and ↓0 +26
Comments 0

Checking the Roslyn Source Code

Reading time 21 min
Views 1.6K
PVS-Studio vs Roslyn

Once in a while we go back to projects that we have previously checked with PVS-Studio and described in various articles. Two things make these comebacks exciting for us. First, the opportunity to assess the progress of our analyzer. Second, the chance to see how the project's authors responded to our article and to the error report that we usually provide them with. Of course, errors can be corrected without our participation. However, it is always nice when our efforts help to make a project better. Roslyn was no exception. The previous article about checking this project dates back to December 23, 2015. That is quite a long time, given the progress our analyzer has made since then. Since the C# core of the PVS-Studio analyzer is based on Roslyn, we have an additional interest in this project. As a result, we're as keen as mustard about its code quality. Now let's test it once again and see what new and interesting issues (though hopefully nothing significant) PVS-Studio will be able to find.
Total votes 34: ↑34 and ↓0 +34
Comments 0

My Pascal compiler and Polish contemporary art

Reading time 5 min
Views 6.9K


Several years ago I wrote a Pascal compiler. The motivation was simple: as a teenager, I had learnt from my first programming textbooks that a compiler is a very sophisticated thing. This claim eventually became a challenge that had to be tested by experience.


First, a simplistic PL/0 compiler came into being, and later an almost fully-functional Pascal compiler for MS-DOS grew from it. My source of inspiration was the Compiler Construction book by Niklaus Wirth, the inventor of the Pascal language. I don't care if Wirth's views are now considered obsolete and have no direct connection to the IT mainstream, or if the compiler design fashion has changed. It is enough to know that his techniques are still simple, elegant, and, last but not least, a lot of fun, since it is more appealing to parse a program source with a handwritten recursive descent parser and generate the machine code than to call yaccs, bisons and all their descendants.

My compiler's fate was not so trivial. It has lived two lives: the first one in my own hands, and the second in the hands of computer antiquarians from Poland.
Total votes 27: ↑26 and ↓1 +25
Comments 1
