
Compilers

From source code to machine code


Tree-sitter and Preprocessing: A Syntax Showdown

Level of difficulty: Medium
Reading time: 5 min
Views: 812

According to the description,


Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited.

But how does Tree-sitter handle languages that require a preprocessing stage?


Writing an interpreter (virtual machine) for a simple byte-code + JIT compilation

Level of difficulty: Medium
Reading time: 10 min
Views: 1.8K

There are two articles in Russian whose author writes a virtual machine (interpreter) for executing a simple bytecode and then applies different optimizations to make this virtual machine faster. There is also a compiler from a simple C-like language into this bytecode. After reading these articles and getting familiar with the compiler, I thought it would be interesting to write a virtual machine for this language that could JIT-compile the bytecode with the libjit library. This article describes that experience.

I found several articles online that describe how to use this library, but the ones I saw cover compiling specific programs with libjit, while I was interested in compiling arbitrary bytecode. For people interested in further reading, there is an official tutorial, a series of articles, and a series of comparisons (in Russian).

The implementation was done in C++ because we aren't playing games here. All my code is in my repository. The "main" branch has just the interpreter of the PigletVM bytecode; "labels-with-fallbacks" has a partial JIT compilation implementation (it doesn't support JUMP instructions); "full-jit" has fully working JIT compilation; "making-jit-code-faster" makes the JIT-generated code run faster; and the "universal-base-vm*" branches merge the interpreter and JIT compilation implementations by introducing a generalised base executor that can be used for different implementations of PigletVM (both the interpreter and libjit compilation).
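To make the idea of a bytecode interpreter concrete, here is a minimal sketch of a stack-based virtual machine. It is written in Python for brevity rather than in the C++ the author used, and the opcode names (`PUSH`, `ADD`, `SUB`, `DUP`, `JUMP_IF_NONZERO`, `HALT`) are hypothetical: they are not taken from PigletVM and only illustrate the general shape of such a dispatch loop.

```python
from enum import Enum, auto


class Op(Enum):
    """Hypothetical opcodes for illustration; not the real PigletVM set."""
    PUSH = auto()             # push the immediate operand
    ADD = auto()              # pop b, pop a, push a + b
    SUB = auto()              # pop b, pop a, push a - b
    DUP = auto()              # duplicate the top of the stack
    JUMP_IF_NONZERO = auto()  # pop a value; jump to the operand if it is nonzero
    HALT = auto()             # stop and return the top of the stack


def run(program):
    """Interpret a list of (opcode, operand) pairs on a stack machine."""
    stack = []
    pc = 0  # index of the next instruction to execute
    while True:
        op, arg = program[pc]
        pc += 1
        if op is Op.PUSH:
            stack.append(arg)
        elif op is Op.ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op is Op.SUB:
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op is Op.DUP:
            stack.append(stack[-1])
        elif op is Op.JUMP_IF_NONZERO:
            if stack.pop() != 0:
                pc = arg
        elif op is Op.HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")


# Count down from 3 to 0 using a backward jump.
countdown = [
    (Op.PUSH, 3),
    (Op.PUSH, 1),             # index 1: loop start
    (Op.SUB, None),
    (Op.DUP, None),
    (Op.JUMP_IF_NONZERO, 1),
    (Op.HALT, None),
]
print(run(countdown))
```

The backward jump in `countdown` hints at why jump instructions are usually the harder part of a JIT: straight-line instructions can be translated one by one, while a jump target has to be mapped to a position in the generated code.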


DSL (domain-specific language) implementation with macros

Level of difficulty: Medium
Reading time: 8 min
Views: 2.5K

This is a translation of my own article


A release of the NewLang language with a brand new "feature" is coming: a remodeled version of the preprocessor that lets you extend the language syntax and create different DSL dialects using macros.


What is it about?


A DSL (domain-specific language) is a programming language specialized for a specific application area. It is believed that using a DSL significantly raises the level of abstraction of the code, which makes development faster and more efficient and greatly simplifies the solution of many problems.

Roughly speaking, we can distinguish two approaches to DSL implementation:


  • Development of independent syntax translators using lexer and parser generators to define the grammar of the target language through BNF (Backus–Naur form) and regular expressions (Lex, Yacc, ANTLR, etc.) and then compiling the resulting grammar into machine code.
  • Development or integration of the DSL dialect into a general-purpose language (metalanguage), including the use of various libraries or special parsers / preprocessors.

We will talk about the second option, namely implementing a DSL on top of a general-purpose language (metalanguage), and about the new implementation of macros in NewLang as the basis for DSL development.
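As a rough illustration of the second approach, the sketch below shows a tiny text preprocessor that rewrites DSL keywords into host-language code before it is parsed. It is written in Python and is not NewLang syntax or NewLang's macro mechanism; the macro table `MACROS` and the function `expand` are hypothetical names invented for this example.

```python
import re

# Hypothetical macro table: DSL keyword -> host-language replacement text.
MACROS = {
    "unless": "if not",
    "forever": "while True",
}


def expand(source, macros):
    """Replace whole-word macro names in source with their bodies."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, macros)) + r")\b")
    return pattern.sub(lambda m: macros[m.group(1)], source)


print(expand("unless done: retry()", MACROS))
```

This substitution is deliberately naive: it works on raw text, so it would also rewrite macro names inside string literals and comments. A preprocessor meant for real use would operate on tokens instead.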


Structured Logging and Interpolated Strings in C# 10

Level of difficulty: Medium
Reading time: 10 min
Views: 45K

Structured logging is becoming more and more popular in the developer community. In this article I'd like to demonstrate how to use structured logging with the Microsoft.Extensions.Logging package and show how we can extend it using the new features of C# 10.


Comparing Huawei ExaGear to Apple's Rosetta 2 and Microsoft's solution

Reading time: 7 min
Views: 3.9K

November 10, 2020 was in many ways a landmark date for the microprocessor industry: Apple unveiled its new Mac Mini, the main feature of which was the new M1 chip, developed in-house. It is not an exaggeration to say that this processor is a landmark achievement for the ARM ecosystem: finally, an ARM chip whose performance surpassed x86 chips from competitors such as Intel, in a niche that x86 had dominated for decades.

But the main interest for us is not the M1 processor itself, but the Rosetta 2 binary translation technology. It allows the user to run legacy x86 software that has not been migrated to the ARM architecture. Apple has a lot of experience in developing binary translation solutions and is a recognized leader in this area. The first version of the Rosetta binary translator appeared in 2006, where it aided Apple in the transition from PowerPC to the x86 architecture. Although the platforms involved this time differ from those of 2006, it was obvious that the experience Apple engineers had accumulated over the years was not lost but was used to develop the next version, Rosetta 2.

We were keen to compare this new solution from Apple with a similar product, Huawei ExaGear (with its lineage from Eltechs ExaGear), developed by our team. At the same time, we evaluated the performance of the x86-to-Arm binary translation provided by Microsoft (part of MS Windows 10 for Arm devices) on the Huawei MateBook E laptop. At present, these are the only other x86-to-Arm binary translation solutions on the open market that we are aware of.


Is PHP compilable?! PVS-Studio searches for errors in PeachPie

Reading time: 22 min
Views: 703

PHP is widely known as an interpreted programming language used mainly for website development. However, few people know that PHP also has a compiler to .NET – PeachPie. But how well is it made? Will the static analyzer be able to find actual bugs in this compiler? Let's find out!


Checking Clang 11 with PVS-Studio

Reading time: 10 min
Views: 741
PVS-Studio: I'm still worthy

Every now and then, we have to write articles about how we've checked another fresh version of some compiler. That's not really much fun. However, as practice shows, if we stop doing that for a while, folks start doubting whether PVS-Studio is worth its title of a good catcher of bugs and vulnerabilities. What if the new compiler can do that too? Sure, compilers evolve, but so does PVS-Studio – and it proves, again and again, its ability to catch bugs even in high-quality projects such as compilers.

Using Flex (Fast Lexical Analyzer Generator)

Reading time: 5 min
Views: 8K
Lexical analysis is the first stage of the compilation process. It turns source code into a sequence of tokens: the lexer reads the input characters and determines what token starts at the current position, whether it's a language keyword, an identifier, a constant (also called a literal), or, possibly, an error. A lexical analyzer (also known as a tokenizer) passes the stream of tokens on to a parser, which builds an AST (abstract syntax tree).

It's possible to write a lexer from scratch, but it's much more convenient to use a lexer generator. If we define rules corresponding to the input language syntax, we get a complete lexical analyzer (tokenizer), which can extract tokens from the input program text and pass them to a parser.

One such generator is Flex. In this article, we'll examine how it works in general and look at some nontrivial nuances of developing a lexer with Flex.
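The token classes mentioned above (keyword, identifier, constant, error) can be sketched with a small hand-written tokenizer. The example below uses Python's `re` module, not Flex; the rule table `TOKEN_SPEC`, the keyword set and the function `tokenize` are hypothetical names chosen for this illustration. One difference worth keeping in mind: this sketch tries the rules in order and takes the first alternative that matches, whereas Flex picks the longest match among all rules.

```python
import re

# Hypothetical keyword set for a small C-like language.
KEYWORDS = {"if", "else", "while", "return"}

# (token kind, regular expression) pairs, tried in this order.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=<>(){};]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),  # any other single character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, value) tokens for the given source text."""
    tokens = []
    for match in MASTER.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "IDENT" and value in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, value))
    return tokens


for token in tokenize("if x1 = 42 @"):
    print(token)
```

Keywords are recognized here by looking identifiers up in a set after matching, which keeps the rule table short. A generated lexer would more typically list each keyword as its own rule.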

PVS-Studio is now in Compiler Explorer

Reading time: 4 min
Views: 1K

Not long ago, a landmark event happened: PVS-Studio appeared in Compiler Explorer! Now you can quickly and easily analyze code for errors right on the godbolt.org site (Compiler Explorer). This feature opens up many new possibilities, from satisfying curiosity about the analyzer's abilities to quickly sharing check results with a friend. This article explains how to use these features. Caution: large GIFs!

Checking the GCC 10 Compiler with PVS-Studio

Reading time: 9 min
Views: 1.7K

PVS-Studio vs GCC 10

The GCC compiler is written with copious use of macros. Another check of the GCC code with PVS-Studio once again confirms our team's opinion that macros are evil in the flesh. Such code is hard to review not only for the static analyzer but also for a developer. GCC developers are certainly used to the project and know it well. Nonetheless, it is very difficult for an outsider to understand. In fact, because of the macros, it was not possible to check the code fully. However, the PVS-Studio analyzer, as always, showed that it can find errors even in compilers.

Checking the Ark Compiler Recently Made Open-Source by Huawei

Reading time: 6 min
Views: 1K

During the summer of 2019, Huawei gave a series of presentations announcing the Ark Compiler technology. The company claims that this open-source project will help developers make the Android system and third-party software much more fluent and responsive. By tradition, every new promising open-source project goes through PVS-Studio for us to evaluate the quality of its code.

Introduction


The Ark Compiler was first announced by Huawei at the launch of the new smartphone models P30 and P30 Pro. It is claimed that the Ark Compiler will improve the fluency of the Android system by 24% and response speed by 44%. Third-party Android applications will also gain a 60% speed-up after recompilation with the Ark Compiler. The open-source version of the project is called OpenArkCompiler; its source code is available on Gitee, a Chinese counterpart of GitHub.

Why LLVM may call a never called function?

Reading time: 11 min
Views: 6.9K
I don’t care what your dragon’s said, it’s a lie. Dragons lie. You don’t know what’s waiting for you on the other side.

Michael Swanwick, The Iron Dragon’s Daughter
This article is based on a post in Krister Walfridsson’s blog, “Why undefined behavior may call a never called function?”.

That post draws a simple conclusion: undefined behavior lets a compiler do anything, even something absolutely unexpected. In this article, I examine how this optimization works internally.

Checking the Roslyn Source Code

Reading time: 21 min
Views: 1.6K
PVS-Studio vs Roslyn

Once in a while we go back to projects that we have previously checked with PVS-Studio and described in various articles. Two things make these comebacks exciting for us. Firstly, they are an opportunity to assess the progress of our analyzer. Secondly, we can see how the project's authors responded to our article and to the error report we usually provide them with. Of course, errors can be corrected without our participation. However, it is always nice when our efforts help to make a project better. Roslyn was no exception. The previous article about checking this project dates back to December 23, 2015. That is quite a long time, given the progress our analyzer has made since then. Since the C# core of the PVS-Studio analyzer is based on Roslyn, we have an additional interest in this project. As a result, we're as keen as mustard about the code quality of this project. Now let's test it once again and see what new and interesting issues (but, let's hope, nothing significant) PVS-Studio will be able to find.

My Pascal compiler and Polish contemporary art

Reading time: 5 min
Views: 7.1K

Origins


Several years ago I wrote a Pascal compiler. The motivation was simple: as a teenager, I had learnt from my first programming textbooks that a compiler is a very sophisticated thing. This claim eventually became a challenge that had to be tested by experience.

(Image: ha.art.pl)

First, a simplistic PL/0 compiler came into being, and later an almost fully functional Pascal compiler for MS-DOS grew from it. My source of inspiration was the Compiler Construction book by Niklaus Wirth, the inventor of the Pascal language. I don't care if Wirth's views are now considered obsolete and have no direct connections to the IT mainstream, or if the compiler design fashion has changed. It is enough to know that his techniques are still simple, elegant, and, last but not least, a lot of fun, since it is more appealing to parse a program source with a handwritten recursive descent parser and generate the machine code than to call yaccs, bisons and all their descendants.

My compiler's fate was not so trivial. It has lived two lives: the first one in my own hands, and the second in the hands of computer antiquarians from Poland.
