Development

PVS-Studio checks the code quality in the .NET Foundation projects: LINQ to DB

Reading time: 11 min
Views: 618

The .NET Foundation is an independent organization, created by Microsoft, to support open-source projects around the .NET platform. The organization has gathered many libraries under its wing, and we have already checked some of them with PVS-Studio. The next project to check with the analyzer is LINQ to DB.

Read more

ruDALL-E: Generating Images from Text. Facing down the biggest computational challenge in Russia

Reading time: 11 min
Views: 11K

Multimodality has led the pack in machine learning in 2021. Neural networks are wolfing down images, text, speech and music all at the same time.  OpenAI is, as usual, top dog, but as if in defiance of their name, they are in no hurry to share their models openly.  At the beginning of the year, the company presented the DALL-E neural network, which generates 256x256 pixel images in answer to a written request.  Descriptions of it can be found as articles on arXiv and examples on their blog.  

As soon as DALL-E was flushed out of the bushes, Chinese researchers got on its tail. Their open-source CogView neural network does the same trick of generating images from text. But what about here in Russia? One might say that “investigate, master, and train” is our engineering motto. Well, we caught the scent, and today we can say that we have created from scratch a complete pipeline for generating images from descriptive textual input written in Russian.

In this article we present the ruDALL-E XL model, an open-source text-to-image transformer with 1.3 billion parameters, as well as the ruDALL-E XXL model, a text-to-image transformer with 12.0 billion parameters that is available in DataHub SberCloud, and several other satellite models.

Read more

On the recent vulnerability in Diebold Nixdorf ATMs

Reading time: 8 min
Views: 4.4K

Hi there! A while ago, Positive Technologies published the news that ATMs manufactured by Diebold Nixdorf (previously known as Wincor), or more specifically, the RM3 and CMDv5 cash dispensers, contained a vulnerability which allowed attackers to withdraw cash and upload modified (vulnerable) firmware. And since my former colleague Alexei Stennikov and I were directly involved in finding this vulnerability, I would like to share some details.

Read more

PVS-Studio to check the RPCS3 emulator

Reading time: 10 min
Views: 1.1K

RPCS3 is an interesting project that emulates the PS3 console. It is actively evolving. Recently we heard the news that the emulator had learned how to run all the games from the console's catalog. That's a good excuse to analyze the project. We'll see which errors remained after new fixes were added to the project.


Read more →

How to choose a static analysis tool

Reading time: 8 min
Views: 2.3K

Tools to improve and control code quality can be a key success factor in a complex software project implementation. Static analyzers belong to such tools. Nowadays, you can find various static analyzers: from free open-source to cross-functional commercial solutions. On the one hand, it's great – you can choose from many options. On the other hand – you have to perform advanced research to find the right tool for your team.

Read more

Full motion video with digital audio on the classic 8-bit game console

Reading time: 13 min
Views: 1.5K

Back in 2016, United States-based music composer and performer Sergio Elisondo released a one-man-band music album, A Winner Is You (know your meme), with multi-instrumental cover versions of tunes from numerous memorable classic NES games. A special feature of this release was its NES cartridge version, which runs on a classic unmodified console and plays digitized audio of the full album instead of the typical chiptune sound you would expect from this humble console. I was involved with the software development part of this project.

This year Sergio returns with a brand new music release. This time it is an all-original music album, You Are Error, heavily influenced by video game music aesthetics. It also comes with a special extra. This time we have raised the stakes: the new NES cartridge release includes not only the digitized audio but also full motion videos for each song, done in a silhouette cutout style similar to the famous Bad Apple video. Once again, the project is crowdfunded via Kickstarter. It reached its funding goal in a mere 7 hours, but there is still a little time to jump on the bandwagon and get yourself a copy. In the meantime, I would like to share some insight into the technical side of both projects.

Read more

Lingtrain Aligner. How to make parallel books for language learning. Part 1. Python and Colab version

Reading time: 8 min
Views: 4.1K


If you're interested in learning new languages or teaching them, then you probably know about parallel reading. It helps you immerse yourself in the context, builds your vocabulary, and lets you enjoy the learning process. When it comes to reading, you most likely want to choose your favorite author, theme, or something familiar, and this is often impossible if no one has published a parallel edition of that book. It gets even worse when you're learning some cool language like Hungarian or Japanese.


Today we are taking a big step toward changing this situation.


We will use the lingtrain_aligner tool. It's an open-source Python project that aims to help everyone eager to learn foreign languages. It's a part of the Lingtrain project; you can follow us on Telegram, Facebook and Instagram. Let's start!


Find the texts


First, we should find the two texts we want to align. Let's take two editions of "To Kill a Mockingbird" by Harper Lee: the Russian translation and the original.

Read more →

Extending and moving a ZooKeeper ensemble

Reading time: 3 min
Views: 2.6K

Once upon a time our DBA team had a task: we had to move a ZooKeeper ensemble that we had been using for a ClickHouse cluster. Everyone is used to moving an ensemble by moving its data files. It seems easy and obvious, but our ClickHouse cluster had more than 400 TB of replicated data, and all its replication information had been stored in the ZooKeeper cluster from the very beginning. We couldn't afford to lose even a single row of data. So we looked for information on the internet. Unfortunately, the good tutorial we found covered version 3.4.5 and didn't fit our version 3.6.2. So we decided to move our ensemble by extending it.

Read more

Best warnings of static analyzer

Reading time: 3 min
Views: 1K

Everyone who runs the static analyzer on a project for the first time is slightly shocked by hundreds, thousands or even tens of thousands of warnings. It may be frustrating. Is my code so terrible? Or is the analyzer lying? In any case, filtering by the severity changes the situation, not completely though. That's why we thought about how we could improve the first experience with the analyzer. Let me show you the new feature step by step...

Read more

Modula-3. The article from “Computer newspaper” N12 2000

Reading time: 9 min
Views: 770

 

One of the main tenets of the Unix philosophy is to use a good tool for a good cause. Suppose you have a task to develop a large application that should have multiple threads of execution, possibly be distributed and, of course, have a graphical interface. I would like to make such a program quickly and without unnecessary mistakes.

I think the first question to ask in a situation like this is, "Which programming language is right?" C is not a bad choice, but not for such a project. It does not scale very well, and does not have the means of working with processes at all. Then C++? But C++ is a complex language, and past experience has shown that it will take a fair amount of time to debug memory allocation problems. What else? 

There is a well-designed tool for just such a job. It is the Modula-3 language, developed and implemented by the Digital Equipment Corporation Systems Research Center (SRC). Modula-3 is a modern, modular, object-oriented language. Other features include automatic memory management (built-in garbage collector), exception handling, support for dynamic types, and multi-threaded programming.

The SRC implementation includes a compiler, a minimal recompilation system (m3build), and a wide range of libraries and sample applications. It must be said that SRC Modula-3 is a free system supplied with source code, including a compiler and a run-time kernel. In addition, SRC Modula-3 has been implemented for a dozen platforms, including Windows 95/NT.

The goal of the developers of the language, in their own words, was not innovation, but the careful selection and consolidation of ideas, time-tested and proven to be useful in practice. Modula-3 is a simple but full-featured language for building large and reliable software packages with a long life cycle.

Read more

Easy concurrency with Python Shared Object

Reading time: 23 min
Views: 8.8K

Project repository.
A year-old article about the general concepts of the project.


So you want to build a multitasking system using Python? But you hesitate because you know you'll have to use either the multiprocessing module, which is slow and/or somewhat inconvenient, or a more powerful external tool like Redis or RabbitMQ, or even a large DBMS like MongoDB or PostgreSQL, which require some glue (i.e. very far from native Python code) and impose their own restrictions on what you can do with your data. If you think «why do I need so much hassle if I just want to run a few worker threads in Python using the data structures I already have in my Python program and the functions I've already written? I just want to run this code in threads! Oh, I wish there was no GIL in Python», then welcome to the club.


Of course, many of us can build from scratch a decent tool that would make use of multiple cores. However, reusing existing working software (Pandas, TensorFlow, SciPy, etc.) is always cheaper than developing new software. But the status quo in CPython tells us one thing: you cannot remove the GIL because everything is based on the GIL. Although making shit into gold could require much work, the ability to ease the transition from slow single-threaded shit to a slow not-so-single-threaded gold-looking shit might be worth it, so you won't have to rewrite your whole system from scratch.


Read more →

How we sympathize with a question on StackOverflow but keep silent

Reading time: 3 min
Views: 724

On the stackoverflow.com website, we frequently see questions about how to look for bugs of a certain type. We know that PVS-Studio can solve the problem. Unfortunately, we have to keep silent. Otherwise, StackOverflow moderators may consider it as an obvious attempt to promote our product. This article describes a particular case of such a situation that makes us suffer deeply.

Read more →

OWASP Top Ten and Software Composition Analysis (SCA)

Reading time: 9 min
Views: 1.4K

The OWASP Top Ten 2017 category A9 (which became A6 in OWASP Top Ten 2021) is dedicated to using components with known vulnerabilities. To cover this category in PVS-Studio, developers have to turn the analyzer into a full SCA solution. How will the analyzer look for vulnerabilities in the components used? What is SCA? Let's try to find the answers in this article!

Read more

Q3 2021 DDoS attacks and BGP incidents

Reading time: 7 min
Views: 3.5K

The third quarter of 2021 brought a massive upheaval in the scale and intensity of DDoS attacks worldwide.

It all led to September, when together with Yandex we uncovered one of the most devastating botnets since Mirai and named it Meris, as it was responsible for a series of attacks with a very high RPS rate. And as those attacks were aimed all over the world, our quarterly statistics also changed.

This quarter, we've also prepared for your consideration a slice of statistics on application layer (L7) DDoS attacks. Without further ado, let us elaborate on the details of DDoS attack statistics and BGP incidents for Q3 2021.

Read more

Best Digital Communication API Platform Reviewed and Compared (2022)

Reading time: 5 min
Views: 2.2K

Digital communication APIs and SDKs: the most powerful tools in the era of digitalization. Unlike other tools, these real-time communication APIs have made an impact across all industries and have successfully grabbed the attention of proficient developers too.

Developers want to know more about these digital communication APIs and SDKs: their market availability, pricing, features and functionality. I have written this article to give you some clarity, with research on the top real-time chat API and SDK providers. So, let’s get started.

Read more

Using the Machine Learning model to detect credit card fraud

Reading time: 3 min
Views: 1.8K

As we move towards the digital world, we shouldn’t forget that cybersecurity plays a major role in our lives. Talks about digital security have been stiff. The main challenge we face is abnormality.

During online transactions, most shoppers prefer credit cards. The credit limit on a card lets us make purchases even when our bank balance is insufficient. But this is also great news for cyber attackers eyeing your money.

To tackle this problem, we should rely on a system that makes hard-pressed transactions effortless.

This is where we need a system to track transaction patterns. With AI, specifically credit card fraud detection AI, we can abort any abnormal transaction.

There are now a number of machine learning algorithms that classify unusual transactions, which is how Artificial Intelligence detects fraud. For credit card fraud detection AI, we only need past data and the right algorithm to fit the data in the right form.

How do we make this happen? Let’s look into the process of credit card fraud detection AI:

Import the needed libraries

The first step in credit card fraud detection with AI is to import the libraries. The best practice is to import all the necessary libraries in a single section so they are quick to modify. To work with the credit card data, we can use its PCA-transformed version, or apply RFECV, RFE, VIF and SelectKBest to pick the best model features.
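As a rough illustration of this step, here is a minimal sketch that keeps all imports in one section and then applies SelectKBest, one of the selectors named above. The data is synthetic, generated with scikit-learn's make_classification, and the choices of k=3, the f_classif scoring function and the 95/5 class balance are assumptions made for the example, not settings taken from the article.

```python
# All imports live in one section so they are easy to change later.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in for the card data (assumption: the real dataset
# is not shown here, so we generate an imbalanced toy problem).
X, y = make_classification(
    n_samples=500,
    n_features=10,
    n_informative=3,
    n_redundant=0,
    weights=[0.95, 0.05],  # the positive ("fraud") class is rare
    random_state=0,
)

# Keep the 3 features with the highest ANOVA F-score against the label.
selector = SelectKBest(score_func=f_classif, k=3)
X_selected = selector.fit_transform(X, y)

# Column indices of the features that were kept.
selected_idx = selector.get_support(indices=True)

print(X_selected.shape)
print(selected_idx)
```

RFE and RFECV follow the same fit/transform pattern but need an estimator to rank features, so SelectKBest is simply the shortest one to show.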

Import Dataset

Machine learning helps with fraud detection. Importing the dataset is quite simple when you use the pandas module in Python: you can load your data with a single command.
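A minimal sketch of the import step with pandas might look like this. The CSV content, the column names (Time, V1, V2, Amount, Class) and the file name in the comment are made up for illustration; the article does not specify the actual file or its layout.

```python
import io

import pandas as pd

# Tiny in-memory CSV standing in for the real file. The rows and the
# column names are invented for this example.
csv_text = """Time,V1,V2,Amount,Class
0,-1.36,0.07,149.62,0
1,1.19,0.27,2.69,0
406,-2.31,1.95,0.00,1
"""

# With a real file you would pass its path instead,
# e.g. pd.read_csv("creditcard.csv") (hypothetical file name).
df = pd.read_csv(io.StringIO(csv_text))

# Quick sanity checks: table size and number of rows labelled as fraud.
print(df.shape)
fraud_count = int(df["Class"].sum())
print(fraud_count)
```

Checking the shape and the label counts right after loading is a cheap way to catch a wrong delimiter or a missing header before any modelling starts.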

Read more