The Rules for Data Processing Pipeline Builders

Reading time: 5 min
Views: 3.8K


"Come, let us make bricks, and burn them thoroughly."
– legendary builders

You may have noticed by 2020 that data is eating the world. And whenever any reasonable amount of data needs processing, a complicated multi-stage data processing pipeline will be involved.


At Bumble — the parent company operating Badoo and Bumble apps — we apply hundreds of data transforming steps while processing our data sources: a high volume of user-generated events, production databases and external systems. This all adds up to quite a complex system! And just as with any other engineering system, unless carefully maintained, pipelines tend to turn into a house of cards — failing daily, requiring manual data fixes and constant monitoring.


For this reason, I want to share certain good engineering practices with you, ones that make it possible to build scalable data processing pipelines from composable steps. While some engineers understand such rules intuitively, I had to learn them by doing, making mistakes, fixing, sweating and fixing things again…


So behold! I bring you my favourite Rules for Data Processing Pipeline Builders.

Read more →

Modern Google-level STT Models Released

Reading time: 2 min
Views: 5.5K


We are proud to announce that we have built from the ground up and released our high-quality (i.e. on par with premium Google models) speech-to-text models for the following languages:


  • English;
  • German;
  • Spanish;

You can find all of our models in our repository, together with examples and quality and performance benchmarks. We also invested some time into making our models as accessible as possible: you can try our examples as well as the PyTorch, ONNX and TensorFlow checkpoints. You can also load our model via TorchHub.


Released models (PyTorch, ONNX and TensorFlow checkpoints, with quality benchmarks and a Colab notebook for each):

  • English (en_v1)
  • German (de_v1)
  • Spanish (es_v1)

Read more →

The magic of Virtualization: Proxmox VE introductory course

Reading time: 8 min
Views: 3.1K

Today, I am going to explain how to quickly deploy several virtual servers with different operating systems on a single physical server without much effort. This will enable any system administrator to manage the whole corporate IT infrastructure in a centralized manner and save a huge amount of resources.
Read more →

Programming as an endless educational pursuit

Reading time: 5 min
Views: 1.4K
When one embarks on the journey to master the craft of programming, they come to the realisation that it has no finish line. No matter how good you are, there are still things to learn, solutions to explore.

Today, we’ll talk about the importance of remaining a lifelong student, language adoption trends according to StackOverflow and why programming itself might not be what you end up learning to become better.

Read more →

The Silverfish Programming Language

Reading time: 9 min
Views: 2.6K

They say every professional developer must have done at least three pet projects: a sophisticated logging utility, a smart JSON parser, and an amazing programming language. With both the logger and the parser done, we finally decided to reveal our desperate success in creating one of the most innovative programming languages, named Silverfish.


Little crucian carp → Actually, a little roach

Read more →

Putting theory to practice: juggling work and study at the Department of Photonics and Optical Information Technology

Reading time: 4 min
Views: 1.5K
Master’s degrees are really useful. Postgrad education allows BA holders to put their new-found skills into practice, and secure great jobs further down the road. But students often need help assessing this choice, particularly if they majored in uncommon subjects — like photonics.

To set the record straight, we talked to the people behind our MA programs in photonics and optical computing, and to their graduates. In this article you’ll learn about part-time work available for photonics students, graduates’ job-hunting prospects, and the academic career options that open up.

Read more →

Making a demo for an old phone — AONDEMO

Reading time: 13 min
Views: 4.1K
I have wanted to make a demo ever since I first saw the classic Polish mega demo Lyra II in 1997. I had also long wanted to do something for Chaos Constructions, the largest Russian demo party, but never got around to it, being occupied with other duties. Finally, in 2018, the time came and I fulfilled both desires at once, Van Damme's Double Impact style: I made a demo called AONDEMO that entered the ZX Spectrum 640K Demo compo at Chaos Constructions.


I bet the red thing you've just seen does not look much like a Spectrum to you. Here's the story.

Read more →

Upcoming SameSite Cookie Changes in ASP.NET and ASP.NET Core

Reading time: 5 min
Views: 3.9K
SameSite is a 2016 extension to HTTP cookies intended to mitigate cross site request forgery (CSRF). The original design was an opt-in feature which could be used by adding a new SameSite property to cookies. It had two values, Lax and Strict.

Setting the value to Lax indicated the cookie should be sent on navigation within the same site, or through GET navigation to your site from other sites. A value of Strict limited the cookie to requests which only originated from the same site. Not setting the property at all placed no restrictions on how the cookie flowed in requests. OpenIdConnect authentication operations (e.g. login, logout), and other features that send POST requests from an external site to the site requesting the operation, can use cookies for correlation and/or CSRF protection. These operations would need to opt-out of SameSite, by not setting the property at all, to ensure these cookies will be sent during their specialized request flows.

Google is now updating the standard and implementing its proposed changes in an upcoming version of Chrome. The change adds a new SameSite value, "None", and changes the default behavior to "Lax". This breaks OpenIdConnect logins and potentially other features your website may rely on; these features will have to use cookies whose SameSite property is set to "None".

However, browsers that adhere to the original standard and are unaware of the new value behave differently from browsers that use the new standard: the SameSite standard states that if a browser sees a SameSite value it does not understand, it should treat that value as "Strict". This means your .NET website will now have to add user agent sniffing to decide whether to send the new None value or not send the attribute at all.
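
The decision described above can be sketched in a language-neutral way. The following Python snippet is only an illustration, not the ASP.NET implementation: the function names `rejects_samesite_none` and `set_cookie_header` are hypothetical, and the user agent rules are a deliberately small, incomplete set of assumptions about clients that mishandle the new value. A real site would need a maintained list.

```python
import re


def rejects_samesite_none(user_agent: str) -> bool:
    """Guess whether a client mishandles SameSite=None.

    The rules below are illustrative assumptions, not a complete list.
    """
    # Assumption: browsers on iOS 12 share the affected networking stack.
    if "CPU iPhone OS 12" in user_agent or "iPad; CPU OS 12" in user_agent:
        return True
    # Assumption: Safari on macOS 10.14 (Chrome on macOS has no "Version/").
    if ("Macintosh; Intel Mac OS X 10_14" in user_agent
            and "Version/" in user_agent and "Safari" in user_agent):
        return True
    # Assumption: older Chrome releases that predate the new value.
    match = re.search(r"Chrome/(\d+)", user_agent)
    if match and 51 <= int(match.group(1)) <= 66:
        return True
    return False


def set_cookie_header(name: str, value: str, user_agent: str,
                      cross_site: bool) -> str:
    """Build a Set-Cookie header value for the given client."""
    parts = [f"{name}={value}", "Path=/", "Secure", "HttpOnly"]
    if not cross_site:
        # Ordinary cookies can simply use the new default explicitly.
        parts.append("SameSite=Lax")
    elif not rejects_samesite_none(user_agent):
        # Cookies used in cross-site POST flows need the new None value.
        parts.append("SameSite=None")
    # Otherwise omit the attribute entirely, as the original standard expects.
    return "; ".join(parts)


if __name__ == "__main__":
    modern = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/80.0.3987.87 Safari/537.36"
    print(set_cookie_header("correlation", "abc123", modern, cross_site=True))
```

The key design point is the third branch: for clients that would misread None as Strict, the attribute is left out rather than set to another value.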

Read more →

Optimising server distribution across the racks

Reading time: 5 min
Views: 1.9K
Recently, a colleague asked me in a chat:

"Is there an article on how to pack servers into racks properly?"

I realised that I didn't know of one, so I decided to write my own.

Firstly, this is an article about bare metal servers in data centre (DC) facilities. Secondly, we assume that there are a lot of servers (hundreds or thousands); the article doesn't make sense for smaller quantities. Thirdly, we consider three constraints on the racks: physical space, electric power per rack, and the fact that cabinets stand in rows adjacent to each other, so we can use a single ToR switch to connect the servers in them.
The answer to the original question depends significantly...
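
To make the first two constraints concrete, here is a minimal Python sketch of a first-fit packing that respects both rack space and per-rack power. It is not the method from the article: the names `Server`, `Rack` and `pack`, the default limits of 42 units and 6000 W, and the choice of a first-fit-decreasing heuristic are all assumptions made for illustration, and the ToR switch constraint is ignored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Server:
    name: str
    units: int  # height in rack units
    watts: int  # power budgeted for this server


@dataclass
class Rack:
    free_units: int
    free_watts: int
    servers: List[Server] = field(default_factory=list)

    def fits(self, server: Server) -> bool:
        # Both constraints must hold: physical space and power.
        return (server.units <= self.free_units
                and server.watts <= self.free_watts)

    def add(self, server: Server) -> None:
        self.free_units -= server.units
        self.free_watts -= server.watts
        self.servers.append(server)


def pack(servers: List[Server], rack_units: int = 42,
         rack_watts: int = 6000) -> List[Rack]:
    """First-fit-decreasing heuristic (hypothetical limits as defaults)."""
    racks: List[Rack] = []
    # Place the most power-hungry servers first.
    for server in sorted(servers, key=lambda s: s.watts, reverse=True):
        if server.units > rack_units or server.watts > rack_watts:
            raise ValueError(f"{server.name} cannot fit in any rack")
        rack = next((r for r in racks if r.fits(server)), None)
        if rack is None:
            rack = Rack(rack_units, rack_watts)
            racks.append(rack)
        rack.add(server)
    return racks


if __name__ == "__main__":
    fleet = [Server(f"srv{i}", units=1, watts=500) for i in range(20)]
    for i, rack in enumerate(pack(fleet)):
        print(i, len(rack.servers), rack.free_units, rack.free_watts)
```

With these made-up numbers, power runs out long before space does, which is the kind of trade-off the constraints above imply.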

Announcing Support for Native Editing of Jupyter Notebooks in VS Code

Reading time: 3 min
Views: 1.9K
With the October release of the Python extension, we’re excited to announce support for native editing of Jupyter notebooks inside Visual Studio Code! You can now directly edit .ipynb files and get the interactivity of Jupyter notebooks with all of the power of VS Code.

You can manage source control, open multiple files, and leverage productivity features like IntelliSense, Git integration, and multi-file management, offering a brand-new way for data scientists and developers to experiment and work with data efficiently. You can try out this experience today by downloading the latest version of the Python extension and creating/opening a Jupyter Notebook inside VS Code.



Since the initial release of our data science experience in VS Code, one of the top features that users have requested has been a more notebook-like layout to edit their Jupyter notebooks inside VS Code. In the rest of this post we’ll take a look at the new capabilities this offers.
Read more →

How to debug and profile any EXE with Visual Studio

Reading time: 3 min
Views: 5.6K
Have you ever needed to debug or profile an executable (.exe file) that you don’t have source for or can’t build? Then the least known Visual Studio project type, the EXE project, is for you!

In Visual Studio you can open any EXE as a ‘project’. Just go to File->Open->Project/Solution and browse to the .exe file, as you would for a .sln file. Visual Studio will then open that EXE as a project. This feature has been around for a long time. It works on all currently supported Visual Studio versions, and the docs for it are at ‘Debug an app that isn’t part of a Visual Studio solution’.

 
Read more →

Introducing Cascadia Code font

Reading time: 2 min
Views: 2K
Cascadia Code is finally here! You can install it directly from the GitHub repository’s releases page or automatically receive it in the next update of Windows Terminal.



Wait, what’s Cascadia Code?


Cascadia Code was announced this past May at Microsoft’s Build event. It is the latest monospaced font shipped by Microsoft and provides a fresh experience for command-line applications and code editors. Cascadia Code was developed hand-in-hand with the new Windows Terminal application. The font is recommended for use with terminal applications and text editors such as Visual Studio and Visual Studio Code.
Read more →

Tips And Tricks For Conducting A Successful Mobile App A/B Test

Reading time: 4 min
Views: 1.2K


The latest stats reveal that there are more than 2.7 billion smartphone users globally and over 2.8 million apps on the Google Play Store.


It's a fact that the number of mobile users is increasing at outstanding speed, and so is the number of apps on the Google Play Store.


But do all apps succeed in making a difference? No, only a handful have stood out and gained popularity.


For instance, there are many games on the Play Store, so why have only Candy Crush, Subway Surfers and Angry Birds topped the charts while other games struggle to get even ten downloads?


The trick is to provide the players with what they want, and in turn, the response for such apps is tremendous.


However, at this point, when the competition is very high, it's challenging to come up with something new that can stand out from the crowd.


It's crucial to create a brand name so that people can talk about it.

Read more →

Building a Bare-Metal Application on Intel Cyclone V for Absolute Beginners

Reading time: 7 min
Views: 9.5K
Setting up Linux on a development board like the SoCKit, with its dual-core ARM Cortex-A9, is not rocket science. The board's manufacturer provides a ready-to-use image suitable for installing on an SD card or other media. But what if you are craving to touch bare metal, approaching the neck-breaking speed of code not restrained by an OS kernel? Well, it is possible, but not so easy or obvious. In this short essay, I'll give you step-by-step instructions on how to build and run your first bare-metal application on a Cyclone V SoC, one that uses the ARM Cortex-A9 core of the SoC's HPS subsystem.

You need a development board with an Intel (Altera) Cyclone V SoC. I used the SoCKit board.


Ready? Let's go!

How to Catch a Cat with TLA+

Reading time: 3 min
Views: 2K
Many programmers struggle when using formal methods to solve problems within their programs, as those methods, while effective, can be unreasonably complex. To understand why this happens, let’s use the model checking method to solve a relatively easy puzzle:

Conditions


You’re in a hallway with seven doors on one side leading to seven rooms. A cat is hiding in one of these rooms. Your task is to catch the cat. Opening a door takes one step. If you guess the correct door, you catch the cat. If you do not guess the correct door, the cat runs to the next room.
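
Before reaching for TLA+, the same state-space idea can be sketched in plain Python. This is not the article's specification. It is a small breadth-first search over "sets of rooms where the cat could still be", under one stated assumption: after every wrong guess the cat must move to a room directly adjacent to its current one (the puzzle text says only "the next room"). The names `after_miss` and `find_plan` are made up for this sketch.

```python
from collections import deque
from typing import FrozenSet, List, Optional

ROOMS = 7


def after_miss(possible: FrozenSet[int], door: int) -> FrozenSet[int]:
    """Rooms the cat may occupy after we open `door` and do not find it."""
    remaining = possible - {door}
    moved = set()
    for room in remaining:
        # Assumption: the cat must step to an adjacent room, left or right.
        for neighbour in (room - 1, room + 1):
            if 1 <= neighbour <= ROOMS:
                moved.add(neighbour)
    return frozenset(moved)


def find_plan() -> Optional[List[int]]:
    """Breadth-first search for a shortest door sequence that always wins."""
    start = frozenset(range(1, ROOMS + 1))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        possible, plan = queue.popleft()
        for door in range(1, ROOMS + 1):
            nxt = after_miss(possible, door)
            if not nxt:
                # No room is left for the cat, so this guess must catch it.
                return plan + [door]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [door]))
    return None


if __name__ == "__main__":
    print(find_plan())
```

With seven rooms there are only 128 possible sets, so the search finishes instantly; a model checker does essentially the same exploration, but from a declarative specification.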
Read more →

How to Maximize the Value of Product Backlog Grooming?

Reading time: 5 min
Views: 3.8K
The Agile methodology consists of various mandatory concepts and artifacts. A product backlog is one of them. This is actually a set of requirements received from the business and formulated in the form of development tasks.

Backlog grooming is not a magic wand; it's a comprehensive activity aimed at ensuring that all the tasks are always in a clear order. How can the grooming process be improved? And what is special about it?
