Development

How to speed up LZ4 decompression in ClickHouse?

Reading time23 min
Views17K
When you run queries in ClickHouse, you might notice that the profiler often shows the LZ4_decompress_fast function near the top. What is going on? This question had us wondering how to choose the best compression algorithm.

ClickHouse stores data in compressed form. When running queries, ClickHouse tries to do as little as possible, in order to conserve CPU resources. In many cases, all the potentially time-consuming computations are already well optimized, plus the user wrote a well thought-out query. Then all that's left to do is to perform decompression.



So why does LZ4 decompression become a bottleneck? LZ4 seems like an extremely light algorithm: the data decompression rate is usually from 1 to 3 GB/s per processor core, depending on the data. This is much faster than the typical disk subsystem. Moreover, we use all available CPU cores, and decompression scales linearly across all physical cores.
Read more →

How to quickly check out interesting warnings given by the PVS-Studio analyzer for C and C++ code?

Reading time5 min
Views1.1K

Once in a while, programmers who start getting acquainted with the PVS-Studio code analyzer ask me: «Is there a list of warnings that accurately indicate errors?» There is no such list, because warnings that are uninteresting (false) in one project can be very important and useful in another. However, one can definitely start digging into the analyzer from the most exciting warnings. Let's take a closer look at this topic.
Read more →

SQL Index Manager – a long story about SQL Server, grave digging and index maintenance

Reading time14 min
Views2.8K
Every now and then we create our own problems with our own hands… with our vision of the world… with our inaction… with our laziness… and with our fears. As a result, it seems to become very convenient to swim in the public flow of sewage patterns… because it is warm and fun, and the rest does not matter – we can smell round. But after a fail comes the realization of the simple truth – instead of generating an endless stream of causes, self-pity and self-justification, it is enough just to do what you consider the most important for yourself. This will be the starting point for your new reality.

For me, what is written below is just such a starting point. The road is expected to be a long one…
Let's go?

WSL 2 is now available in Windows Insiders

Reading time3 min
Views4.2K

We’re excited to announce that, starting today, you can try the Windows Subsystem for Linux 2 by installing Windows build 18917 in the Insider Fast ring! In this blog post we’ll cover how to get started, the new wsl.exe commands, and some important tips. Full documentation about WSL 2 is available on our docs page.


Read more →

Security of mobile OAuth 2.0

Reading time12 min
Views15K

The popularity of mobile applications continues to grow, and so does the use of the OAuth 2.0 protocol in mobile apps. Implementing the standard as is does not make OAuth 2.0 secure there: one needs to consider the specifics of mobile applications and apply some additional security mechanisms.

In this article, I want to share the concepts of mobile OAuth 2.0 attacks and the security mechanisms used to prevent such issues. The concepts described are not new, but there is a lack of structured information on this topic. The main aim of the article is to fill this gap.
Read more →

How Moovit improved its app to help people with disabilities ride transit with confidence

Reading time4 min
Views899

Alexandr Epaneshnikov, a 19-year-old Russian student who is legally blind, recently decided he wanted to be more independent by commuting on his own and relying less on his mom for rides to school. It meant taking a streetcar to a subway to his high school in Moscow, a 30-minute trip that Epaneshnikov assuredly navigates with a cane and Moovit, an urban mobility app optimized for screen readers.


Read more →

Bluetooth stack modifications to improve audio quality on headphones without AAC, aptX, or LDAC codecs

Reading time7 min
Views72K
Before reading this article, it is recommended to read the previous one: Audio over Bluetooth: most detailed information about profiles, codecs, and devices (also available in Russian)

Some wireless headphone users note low sound quality and a lack of high frequencies when using the standard Bluetooth SBC codec, which is supported by all headphones and other Bluetooth audio devices. A common recommendation for getting better sound quality is to buy devices and headphones that support the aptX or LDAC codecs. These codecs require licensing fees, which is why devices that support them are more expensive.

It turns out that the low quality of SBC is caused by artificial limitations of all current Bluetooth stacks and headphones' configuration, and this limitation can be circumvented on any existing device with software modification only.
Read more →

The one who resurrected Duke Nukem: interview with Randy Pitchford, magician from Gearbox

Reading time38 min
Views4.2K
RUVDS and Habr continue the series of interviews with interesting people in the IT field. Last time we talked to Richard «Levelord» Gray, level designer of the popular games Duke Nukem, American McGee’s Alice, Heavy Metal F.A.K.K.2, SiN, Serious Sam, and author of the well-known «You’re not supposed to be here» phrase.

Today we welcome Randall Steward «Randy» Pitchford II, president, CEO and co-founder of Gearbox Software video game development company.

Randy started at 3D Realms, where he contributed to Duke Nukem 3D Atomic Edition and Shadow Warrior. Then he founded Gearbox Software and made Half-Life: Opposing Force, which won D.I.C.E in 2000. Other Gearbox titles include Half-Life: Blue Shift, Half-Life: Decay, Counter-Strike: Condition Zero, James Bond 007: Nightfire, Tony Hawk's Pro Skater 3, Halo: Combat Evolved and of course Borderlands.

The interview team also includes Habr editor Nikolay Zemlyanskiy, Richard «Levelord» Gray, Randy’s wife Kristy Pitchford and Randy’s son Randy Jr.


Indexes in PostgreSQL — 10 (Bloom)

Reading time11 min
Views7.7K
In the previous articles we discussed the PostgreSQL indexing engine and the interface of access methods, as well as hash indexes, B-trees, GiST, SP-GiST, GIN, RUM, and BRIN. But we still need to look at Bloom indexes.

Bloom


General concept


A classical Bloom filter is a data structure that enables us to quickly check membership of an element in a set. The filter is highly compact, but allows false positives: it can mistakenly consider an element to be a member of a set (false positive), but it is not permitted to consider an element of a set not to be a member (false negative).

The filter is an array of $m$ bits (also called a signature) that is initially filled with zeros. $k$ different hash functions are chosen that map any element of the set to $k$ bits of the signature. To add an element to the set, we need to set each of these bits in the signature to one. Consequently, if all the bits corresponding to an element are set to one, the element can be a member of the set, but if at least one bit equals zero, the element is not in the set for sure.
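The add and check operations described above can be sketched in a few lines of Python. This is only an illustration of the classical filter, not of PostgreSQL's implementation: the class name `BloomFilter`, the method names, and the use of salted SHA-256 to obtain $k$ hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: an m-bit signature and k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.signature = 0  # a Python int used as a bit array, initially all zeros

    def _positions(self, element):
        # Derive k bit positions by salting SHA-256 with the hash index.
        # (Illustrative choice of hash functions, assumed for this sketch.)
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{element}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, element):
        # Set each of the element's bits in the signature to one.
        for pos in self._positions(element):
            self.signature |= 1 << pos

    def might_contain(self, element):
        # True means "possibly a member"; False means "definitely not a member".
        return all((self.signature >> pos) & 1 for pos in self._positions(element))


bf = BloomFilter(m=64, k=3)
bf.add("alice")
print(bf.might_contain("alice"))  # an added element is always reported as possibly present
```

A `False` answer is always reliable, while a `True` answer has to be rechecked against the real data, which is exactly the false-positive behavior described above.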

In the case of a DBMS, we actually have $N$ separate filters, one built for each index row. As a rule, several fields are included in the index, and it is the values of these fields that compose the set of elements for each row.

By choosing the length of the signature $m$, we can find a trade-off between the index size and the probability of false positives. Bloom indexes are intended for large, fairly «wide» tables that are queried using filters on each of the fields. This access method, like BRIN, can be regarded as an accelerator of sequential scan: all the matches found by the index must be rechecked with the table, but there is a chance to avoid considering most of the rows at all.
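As a rough guide to this trade-off, the standard estimate for a classical Bloom filter can be written down. The symbol $n$ (the number of elements added to one signature) is introduced here for illustration, and the estimate assumes ideal, independent hash functions, which real implementations only approximate:

$$p \approx \left(1 - e^{-kn/m}\right)^{k}$$

With illustrative numbers $m = 80$, $k = 2$ and $n = 3$, this gives $p \approx (1 - e^{-0.075})^2 \approx 0.005$, that is, about 0.5% false positives per signature. Doubling the signature to $m = 160$ lowers the estimate to about 0.14%, at the cost of a larger index.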
Read more →

Nullable Reference types in C# 8.0 and static analysis

Reading time12 min
Views3.7K


It's no secret that Microsoft has been working on the eighth version of the C# language for quite a while. The new language version (C# 8.0) is already available in the recent release of Visual Studio 2019, but it's still in beta. This new version is going to have a few features implemented in a somewhat non-obvious, or rather unexpected, way. Nullable Reference types are one of them. This feature is announced as a means to fight Null Reference Exceptions (NRE).
Read more →

Improve your mobile application using machine learning technology

Reading time4 min
Views1.1K
Today, even mobile application development companies have begun to combine ML with other cutting-edge technologies, for example, AI and predictive analytics. This is because ML enables mobile applications to learn, adapt, and improve over time.

It’s an incredible accomplishment when you consider that changes used to require an explicit command from developers for a device to execute a particular action. When this was the norm, software engineers had to anticipate and account for every conceivable scenario (and this was a formidable challenge).

With ML in mobile applications, however, we have taken the guessing game out of the equation. ML can also improve User Experience (UX) by understanding user behavior. So you can bet that ML in mobile won’t be limited to voice assistants and chatbots.
Read more →

Support of Visual Studio 2019 in PVS-Studio

Reading time19 min
Views1.1K


Support of Visual Studio 2019 in PVS-Studio affected a number of components: the plugin itself, the command-line analyzer, the cores of the C++ and C# analyzers, and a few utilities. In this article, I will briefly explain what problems we encountered when implementing support of the IDE and how we addressed them.
Read more →

Various things in MetaPost

Reading time8 min
Views15K
What is the best tool to use for drawing vector pictures? For me and probably for many others, the answer is pretty obvious: Illustrator, or, maybe, Inkscape. At least that's what I thought when I was asked to draw about eight hundred diagrams for a physics textbook. Nothing exceptional, just a bunch of black and white illustrations with spheres, springs, pulleys, lenses and so on. By that time it was already known that the book was going to be made in LaTeX and I was given a number of MS Word documents with embedded images. Some of them were scanned pictures from other books, some were pencil drawings. Picturing days and nights of inkscaping this stuff made me feel dizzy, so soon I found myself fantasizing about a more automated solution. For some reason MetaPost became the focus of these fantasies.



Read more →

Exceptional situations: part 1 of 4

Reading time11 min
Views2.2K


Introduction


It’s time to talk about exceptions or, rather, exceptional situations. Before we start, let’s look at the definition. What is an exceptional situation?


This is a situation that makes the execution of current or subsequent code incorrect, that is, different from how it was designed or intended. Such a situation compromises the integrity of an application or its part, e.g. an object. It brings the application into an extraordinary or exceptional state.


But why do we need to define this terminology? Because it will keep us within certain boundaries. If we don’t follow the terminology, we can get too far from the designed concept, which may result in many ambiguous situations. Let’s see some practical examples:


 struct Number
 {
     public static Number Parse(string source)
     {
         // ...
         if(!parsed)
         {
             throw new ParsingException();
         }
         // ...
     }

     public static bool TryParse(string source, out Number result)
     {
         // ..
         return parsed;
     }
 }

This example seems a little strange, and it is for a reason. I made this code slightly artificial to show the importance of problems appearing in it. First, let’s look at the Parse method. Why should it throw an exception?

Read more →

Indexes in PostgreSQL — 9 (BRIN)

Reading time18 min
Views9.4K
In the previous articles we discussed the PostgreSQL indexing engine, the interface of access methods, and the following methods: hash indexes, B-trees, GiST, SP-GiST, GIN, and RUM. The topic of this article is BRIN indexes.

BRIN


General concept


Unlike the indexes we've already become acquainted with, the idea of BRIN is to avoid looking through rows that definitely do not match rather than to quickly find the matching ones. This is always an inaccurate index: it does not contain TIDs of table rows at all.

Put simply, BRIN works well for columns whose values correlate with their physical location in the table. In other words, it suits the case where a query without an ORDER BY clause returns the column values in virtually increasing or decreasing order (and there are no indexes on that column).

This access method was created as part of Axle, the European project for extremely large analytical databases, with an eye on tables that are several terabytes or dozens of terabytes in size. An important feature of BRIN that enables us to create indexes on such tables is its small size and minimal maintenance overhead.

This works as follows. The table is split into ranges that are several pages large (or several blocks large, which is the same), which is where the name Block Range Index, BRIN, comes from. The index stores summary information on the data in each range. As a rule, this is the minimal and maximal values, but it can be different, as shown later. Assume that a query is performed that contains a condition on a column: if the sought values do not fall into the interval, the whole range can be skipped; but if they do, all rows in all blocks of the range will have to be looked through to choose the matching ones among them.
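The range-skipping idea can be sketched as follows. This is a simplified model, not PostgreSQL's on-disk format: ranges are counted in rows rather than pages, and the names `RangeSummary`, `build_summaries` and `search` are hypothetical helpers invented for the example.

```python
from dataclasses import dataclass


@dataclass
class RangeSummary:
    first_row: int   # index of the first row in the range
    last_row: int    # index one past the last row in the range
    min_value: int   # smallest column value within the range
    max_value: int   # largest column value within the range


def build_summaries(column, rows_per_range):
    """Split the column into consecutive ranges and store min/max for each."""
    summaries = []
    for start in range(0, len(column), rows_per_range):
        chunk = column[start:start + rows_per_range]
        summaries.append(
            RangeSummary(start, start + len(chunk), min(chunk), max(chunk))
        )
    return summaries


def search(column, summaries, target):
    """Return (matching row indexes, number of ranges actually scanned)."""
    matches, scanned = [], 0
    for s in summaries:
        if target < s.min_value or target > s.max_value:
            continue  # the sought value cannot be in this range: skip it entirely
        scanned += 1
        # The summary is inexact, so every row of the range is rechecked.
        matches.extend(
            i for i in range(s.first_row, s.last_row) if column[i] == target
        )
    return matches, scanned


ordered = list(range(100))                # values correlate with physical position
summaries = build_summaries(ordered, 10)
print(search(ordered, summaries, 42))     # ([42], 1): only one of ten ranges is scanned
```

If the same values were stored in an order unrelated to their position, most ranges would cover nearly the whole value interval and few of them could be skipped, which is why the correlation with physical location matters.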

It will not be a mistake to treat BRIN not as an index, but as an accelerator of sequential scan. We can regard BRIN as an alternative to partitioning if we consider each range as a «virtual» partition.

Now let's discuss the structure of the index in more detail.
Read more →

Should array length be stored into a local variable in C#?

Reading time6 min
Views18K
I notice that people often use a construction like this:

var length = array.Length;
for (int i = 0; i < length; i++) {
    //do smth
}

They think that calling Array.Length on each iteration will make the CLR take more time to execute the code. To avoid this, they store the length value in a local variable.
Let’s find out (once and for all!) whether this is a viable approach or whether using a temporary variable is a waste of time.
Read more →

Fancy Euclid's “Elements” in TeX

Reading time7 min
Views29K


In 2016, I came across Oliver Byrne's “The first six books of the Elements of Euclid.” The main feature of this book is that instead of ordinary letter designations such as “triangle ABC,” it embeds miniature pictures directly in the text, for example, an image of a triangle. As difficult as this probably was in the 19th century, it should be just as easy to make such a book nowadays with the right tools. And so I decided to find out for myself whether that's the case.
Read more →