Compilers *

From source code to machine code


rsashka May 10 2025 at 14:50

About the C++ static analyzer as a Clang plugin

Medium

9 min

2.7K

C++ * Compilers * Programming *

Retrospective

This article is based on the experience of developing memsafe, a library that uses a Clang plugin to add safe memory management and invalidation control for reference data types to C++ at compile time.

pinbraerts Aug 10 2024 at 13:45

Tree-sitter and Preprocessing: A Syntax Showdown

Medium

5 min

3.3K

C * C# * C++ * Compilers * Abnormal programming *

Review

Translation

According to the description,

Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited.

But how does Tree-sitter handle languages that require a preprocessing stage?

vda19999 Aug 31 2023 at 04:38

Writing an interpreter (virtual machine) for a simple byte-code + JIT compilation

Medium

10 min

3.5K

C++ * *nix * Compilers *

Translation

There are two articles in Russian whose author writes a virtual machine (interpreter) for executing a simple bytecode and then applies different optimizations to make this virtual machine faster. Besides that, there is a compiler from a simple C-like language into this bytecode. After reading these articles and getting familiar with the compiler, I thought it would be interesting to try writing a virtual machine for this language that could apply JIT compilation to this bytecode using the libjit library. This article describes the experience of doing that.

I found several articles online that describe the usage of this library, but those I saw describe the compilation of concrete programs with libjit, while I was interested in compiling arbitrary bytecode. For people interested in further reading, there is an official tutorial, a series of articles and a series of comparisons (in Russian).

The implementation was done in C++ because we aren't playing games here. All my code is in my repository. The "main" branch has just the interpreter of the PigletVM bytecode; "labels-with-fallbacks" has a partial JIT compilation implementation (one that doesn't support JUMP instructions); "full-jit" has fully working JIT compilation; "making-jit-code-faster" makes the code generated by the JIT run faster; and the "universal-base-vm*" branches merge the interpreter and JIT compilation implementations by introducing a generalised base executor, which can be used for different implementations of PigletVM (both the interpreter and libjit compilation).
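To make the idea of a bytecode interpreter concrete, here is a minimal sketch of a stack-based dispatch loop. The opcode names (`PUSH`, `ADD`, `MUL`, `JUMP`, `DONE`) and the `run` function are assumptions made up for illustration; they are not PigletVM's actual instruction set, and the sketch does not touch libjit at all.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical opcodes for illustration only; not PigletVM's real instruction set.
enum Op : std::int64_t { PUSH, ADD, MUL, JUMP, DONE };

// Interprets a flat bytecode array on an operand stack and returns
// the value on top of the stack when DONE is reached.
std::int64_t run(const std::vector<std::int64_t>& code) {
    std::vector<std::int64_t> stack;
    std::size_t ip = 0;  // instruction pointer

    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        std::int64_t value = stack.back();
        stack.pop_back();
        return value;
    };

    while (ip < code.size()) {
        switch (code[ip++]) {
        case PUSH:  // operand: the immediate value to push
            if (ip >= code.size()) throw std::runtime_error("missing PUSH operand");
            stack.push_back(code[ip++]);
            break;
        case ADD: {
            std::int64_t b = pop();
            std::int64_t a = pop();
            stack.push_back(a + b);
            break;
        }
        case MUL: {
            std::int64_t b = pop();
            std::int64_t a = pop();
            stack.push_back(a * b);
            break;
        }
        case JUMP:  // operand: absolute index of the next instruction
            if (ip >= code.size()) throw std::runtime_error("missing JUMP operand");
            ip = static_cast<std::size_t>(code[ip]);
            break;
        case DONE:
            return pop();
        default:
            throw std::runtime_error("unknown opcode");
        }
    }
    throw std::runtime_error("program ended without DONE");
}
```

For instance, `run({PUSH, 2, PUSH, 3, ADD, DONE})` evaluates 2 + 3. A JIT would replace the `switch` dispatch with generated native code for each instruction, which is where jump targets become the tricky part.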

rsashka Mar 4 2023 at 11:42

DSL (domain-specific language) implementation with macros

Medium

8 min

4.3K

Compilers * Abnormal programming * Programming * Perfect code *

Opinion

This is a translation of my own article

The release of the NewLang language with a brand new "feature" is coming: a remodeled version of the preprocessor that lets you extend the language syntax and create different DSL dialects using macros.

What is it about?

A DSL (domain-specific language) is a programming language specialized for a specific application area. It is believed that using a DSL significantly raises the level of abstraction of the code, which allows you to develop more quickly and efficiently and greatly simplifies the solution of many problems.

Broadly, we can distinguish two approaches to DSL implementation:

Development of independent syntax translators using lexer and parser generators to define the grammar of the target language through BNF (Backus–Naur form) and regular expressions (Lex, Yacc, ANTLR, etc.) and then compiling the resulting grammar into machine code.
Development or integration of the DSL dialect into a general-purpose language (metalanguage), including the use of various libraries or special parsers / preprocessors.

We will talk about the second option, namely the implementation of DSL on the basis of general-purpose languages (metalanguages) and the new implementation of macros in NewLang as the basis for DSL development.

tmteam Jan 24 2023 at 08:00

NFun — expression evaluator for .Net

6 min

Open source * .NET * GitHub * Compilers * C# *

Translation

NFun is an embedded language and expression evaluator that supports primitive types, arrays, structures and lambda expressions.

Most likely, you have already encountered tasks that require such a tool. In this article I want to show examples of its use, its capabilities, and why it may be useful to you.

Let's learn some nFun!

alextretyak Apr 30 2022 at 21:00

Lexical Analysis in 11l

6 min

5.3K

Programming * Compilers *

This article discusses the lexical analyzer, which is an integral part of any compiler.

The task of the lexical analyzer is to split the source code of the program into tokens.

For example, the code

print(1 + 2)

will be tokenized as
print, (, 1, +, 2 and )
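
The splitting described above can be sketched in a few lines of C++. This is a minimal illustration, not the 11l lexer: the `tokenize` function is a hypothetical helper that only recognizes identifiers, integer literals, and single-character symbols, and skips whitespace.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration; not the 11l lexer.
// Splits source text into identifiers, integer literals,
// and single-character symbols, skipping whitespace.
std::vector<std::string> tokenize(const std::string& source) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        unsigned char c = static_cast<unsigned char>(source[i]);
        if (std::isspace(c)) {  // whitespace separates tokens but is not one
            ++i;
            continue;
        }
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // identifier: a letter or underscore followed by letters, digits, underscores
            while (i < source.size() &&
                   (std::isalnum(static_cast<unsigned char>(source[i])) || source[i] == '_')) {
                ++i;
            }
        } else if (std::isdigit(c)) {
            // integer literal: a run of digits
            while (i < source.size() && std::isdigit(static_cast<unsigned char>(source[i]))) {
                ++i;
            }
        } else {
            ++i;  // any other character is a one-character token
        }
        tokens.push_back(source.substr(start, i - start));
    }
    return tokens;
}
```

Calling `tokenize("print(1 + 2)")` yields the six tokens listed above. A real lexer would also attach a category (keyword, identifier, literal, operator) and a source position to each token.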

PahanMenski Nov 24 2021 at 07:34

Structured Logging and Interpolated Strings in C# 10

Medium

10 min

54K

Programming * .NET * Compilers * C# *

Structured logging is gaining more and more popularity in the developer community. In this article I'd like to demonstrate how we can use structured logging with the Microsoft.Extensions.Logging package and show how we can extend it using the new features of C# 10.

Andrey2008 Oct 8 2021 at 19:09

Detecting errors in the LLVM release 13.0.0

16 min

PVS-Studio corporate blog * C++ * Open source * Information Security * Compilers *

Commercial static analyzers perform deeper and fuller code analysis compared to compilers. Let's see what PVS-Studio found in the source code of the LLVM 13.0.0 project.

Armmaster Sep 10 2021 at 09:30

Comparing Huawei ExaGear to Apple's Rosetta 2 and Microsoft's solution

7 min

5.7K

Huawei corporate blog * Compilers * CPU

Translation

November 10, 2020 was in many ways a landmark date for the microprocessor industry: Apple unveiled its new Mac Mini, whose main feature was the new M1 chip, developed in-house. It is no exaggeration to say that this processor is a landmark achievement for the ARM ecosystem: at last, an ARM architecture chip whose performance surpassed that of x86 chips from competitors such as Intel, in a niche that x86 had dominated for decades.

But the main interest for us is not the M1 processor itself, but the Rosetta 2 binary translation technology. It allows the user to run legacy x86 software that has not been migrated to the ARM architecture. Apple has a lot of experience in developing binary translation solutions and is a recognized leader in this area. The first version of the Rosetta binary translator appeared in 2006, where it aided Apple in the transition from the PowerPC to the x86 architecture. Although the platforms this time were different from those of 2006, it was obvious that the experience Apple engineers had accumulated over the years was not lost but was used to develop the next version, Rosetta 2.

We were keen to compare this new solution from Apple with a similar product, Huawei ExaGear (with its lineage from Eltechs ExaGear), developed by our team. At the same time, we evaluated the performance of the x86 to Arm binary translation provided by Microsoft (part of MS Windows 10 for Arm devices) on the Huawei MateBook E laptop. At present, these are the only other x86 to Arm binary translation solutions that we are aware of on the open market.

Firensis Aug 17 2021 at 07:43

Is PHP compilable?! PVS-Studio searches for errors in PeachPie

22 min

1.5K

PVS-Studio corporate blog * C# * Compilers * .NET * PHP *

PHP is widely known as an interpreted programming language used mainly for website development. However, few people know that PHP also has a compiler to .NET – PeachPie. But how well is it made? Will the static analyzer be able to find actual bugs in this compiler? Let's find out!

Minatych Aug 11 2021 at 08:01

Intermodular analysis of C++ projects in PVS-Studio

9 min

1.6K

PVS-Studio corporate blog * C * Compilers * C++ *

PVS-Studio recently gained a major feature: intermodular analysis of C++ projects. This article covers our implementation and those of other tools. You'll also find out how to try this feature and what we managed to detect using it.

32bit_me Apr 27 2021 at 18:45

On commutativity of addition

2 min

3.7K

Assembler * C * Compilers *

Does the generated assembly change if we write (b + a) instead of (a + b)?
Let's check.

Let's write:

__int128 add1(__int128 a, __int128 b) {
    return b + a;
}

and compile it with risc-v gcc 8.2.0:


Andrey2008 Oct 27 2020 at 11:36

Checking Clang 11 with PVS-Studio

10 min

1.4K

PVS-Studio corporate blog * Compilers * Information Security * Open source * C++ *

Every now and then, we have to write articles about how we've checked another fresh version of some compiler. That's not really much fun. However, as practice shows, if we stop doing that for a while, folks start doubting whether PVS-Studio is worth its title of a good catcher of bugs and vulnerabilities. What if the new compiler can do that too? Sure, compilers evolve, but so does PVS-Studio – and it proves, again and again, its ability to catch bugs even in high-quality projects such as compilers.

32bit_me Oct 4 2020 at 17:26

Using Flex (Fast Lexical Analyzer Generator)

5 min

11K

Programming * Compilers *

Lexical analysis is the first stage of the compilation process. It is used to get a token sequence from source code. It takes an input character sequence and determines what token begins at the current position: a language keyword, an identifier, a constant (also called a literal), or perhaps an error. A lexical analyzer (also known as a tokenizer) passes a stream of tokens on to a parser, which builds an AST (abstract syntax tree).

It's possible to write a lexer from scratch, but it is much more convenient to use a lexer generator. If we define rules that correspond to the lexical structure of the input language, we get a complete lexical analyzer (tokenizer), which can extract tokens from the input program text and pass them to a parser.

One such generator is Flex. In this article, we'll examine how it works in general and look at some nontrivial nuances of developing a lexer with Flex.

GGribkov Jul 6 2020 at 12:20

PVS-Studio is now in Compiler Explorer

4 min

1.8K

PVS-Studio corporate blog * C * C++ * Compilers * Programming *

Not long ago, a landmark event took place: PVS-Studio appeared in Compiler Explorer! Now you can quickly and easily analyze code for errors right on the godbolt.org site (Compiler Explorer). This feature opens up a large number of new possibilities – from satisfying curiosity about the analyzer's abilities to quickly sharing check results with a friend. This article covers how to use these features. Caution – large GIFs!

Andrey2008 Apr 16 2020 at 19:41

Checking the GCC 10 Compiler with PVS-Studio

9 min

2.5K

PVS-Studio corporate blog * Compilers * Open source * C++ * C *

The GCC compiler is written with copious use of macros. Another check of the GCC code using PVS-Studio once again confirms the opinion of our team that macros are evil in the flesh. Not only the static analyzer but also a developer struggles to review such code. GCC developers are certainly used to the project and are well versed in it. Nonetheless, for an outsider it is very difficult to understand what is going on. In fact, because of the macros, it was not possible to check the code fully. However, the PVS-Studio analyzer, as always, showed that it can find errors even in compilers.

SvyatoslavMC Dec 2 2019 at 06:39

Checking the Ark Compiler Recently Made Open-Source by Huawei

6 min

1.6K

PVS-Studio corporate blog * C * C++ * Open source * Compilers *

During the summer of 2019, Huawei gave a series of presentations announcing the Ark Compiler technology. The company claims that this open-source project will help developers make the Android system and third-party software much more fluent and responsive. By tradition, every new promising open-source project goes through PVS-Studio for us to evaluate the quality of its code.

Introduction

The Ark Compiler was first announced by Huawei at the launch of the new smartphone models P30 and P30 Pro. It is claimed that the Ark Compiler will improve the fluency of the Android system by 24% and response speed by 44%. Third-party Android applications will also gain a 60% speed-up after recompilation with the Ark Compiler. The open-source version of the project is called OpenArkCompiler; its source code is available on Gitee, a Chinese counterpart of GitHub.


32bit_me Jul 1 2019 at 19:52

Why LLVM may call a never called function?

11 min

8.3K

C++ * Open source * Compilers * Programming *

I don’t care what your dragon’s said, it’s a lie. Dragons lie. You don’t know what’s waiting for you on the other side.

Michael Swanwick, The Iron Dragon’s Daughter

This article is based on a post on Krister Walfridsson’s blog, “Why undefined behavior may call a never called function?”.

That post draws a simple conclusion: when a program contains undefined behavior, the compiler can do anything, even something absolutely unexpected. In this article, I examine how this optimization works internally.

Andrey2008 Apr 29 2019 at 13:43

Finding Bugs in LLVM 8 with PVS-Studio

24 min

3.4K

PVS-Studio corporate blog * Compilers * Open source * DevOps * C++ *

It's been two years since we last checked the code of the LLVM project with PVS-Studio, so let's see if PVS-Studio is still the leader among tools for detecting bugs and security weaknesses. We'll do that by scanning the LLVM 8.0.0 release for new bugs.


n0mo Apr 3 2019 at 09:44

Checking the Roslyn Source Code

21 min

2.4K

PVS-Studio corporate blog * .NET * C# * Visual Studio * Compilers *

Once in a while we return to projects that we have previously checked with PVS-Studio and described in various articles. Two things make these returns exciting for us. First, they let us assess the progress of our analyzer. Second, they let us see how the project's authors responded to our article and to the error report that we usually provide. Of course, errors can be corrected without our participation, but it is always nice when our efforts help make a project better. Roslyn was no exception. The previous article about checking this project dates back to December 23, 2015, which is quite a long time given the progress our analyzer has made since then. Because the C# core of the PVS-Studio analyzer is based on Roslyn, we have an additional interest in this project and are as keen as mustard about its code quality. Now let's test it once again and see what new and interesting issues (hopefully nothing significant) PVS-Studio is able to find.
