My feed

Article

StarRocks vs. ClickHouse, Apache Druid, and Trino

Level of difficulty: Easy
Reading time: 8 min
Views: 26

In the big data era, data is one of the most valuable assets for enterprises. The ultimate goal of data analytics is to power swift, agile business decision making. As database technologies have advanced at a breathtaking pace in recent years, a large number of excellent database systems have emerged. Some of them are impressive in wide-table queries but do not work well in complex queries. Others support flexible multi-table queries but are held back by slow query speed.

Each type of data has a data model that best represents it. However, in real business scenarios, there is no such thing as ultra-fast data analytics under the perfect data model. Big data engineers sometimes have to make compromises on data models. Such compromises may cause long latency in complex queries or damage real-time query performance, because engineers must take the trouble to convert complex data models into flat tables.

New business requirements put forward new challenges for database systems. A good OLAP database system must be able to deliver excellent performance in both wide-table and multi-table scenarios. This system must also reduce the workload of big data engineers and enable customers to query data of any dimension in real time without worrying about data construction.

Article

A Small Practical Guide to Calculating the Economic Value of AppSec and DevSecOps

Level of difficulty: Medium
Reading time: 5 min
Views: 60

Investing in Application Security (AppSec) and DevSecOps is no longer optional; it's a strategic imperative. However, securing budget and justifying these initiatives requires moving beyond fear and speaking the language of business: Return on Investment (ROI).

This guide provides a structured framework for calculating the costs and benefits of embedding security into your software development lifecycle (SDLC). By understanding and applying concepts like Total Cost of Ownership (TCO), Lifecycle Cost Analysis (LCCA), and Return on Security Investment (ROSI), you can build a compelling financial case, guide your security strategy, and prove tangible value to stakeholders.
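
As a rough illustration of the ROSI concept named above, one commonly cited form compares the reduction in annualized loss expectancy (ALE) with the cost of the security program. The symbols and the sample figures below are assumptions for illustration, not values taken from the guide, and the guide's own definitions may differ.

```latex
\[
\mathrm{ROSI} = \frac{\left(\mathrm{ALE}_{\text{before}} - \mathrm{ALE}_{\text{after}}\right) - C}{C}
\]
% Hypothetical figures: ALE_before = 500,000, ALE_after = 200,000, C = 100,000 (per year)
\[
\mathrm{ROSI} = \frac{(500\,000 - 200\,000) - 100\,000}{100\,000} = 2.0 \quad (200\%)
\]
```

Here C stands for the yearly cost of the AppSec or DevSecOps program; a result above zero means the avoided losses exceed what was spent.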

Article

Stream-first Gotenberg Client for Go

Level of difficulty: Medium
Reading time: 2 min
Views: 79

Go client for Gotenberg — document conversion service supporting Chromium, LibreOffice, and PDF manipulation engines.

Features

- Chromium: Convert URLs, HTML, and Markdown to PDF

- LibreOffice: Convert Office documents (Word, Excel, PowerPoint) to PDF

- PDF Engines: Merge, split, and manipulate PDFs

- Webhook support: Async conversions with callback URLs

- Stream-first: Built on httpstream for efficient multipart uploads

Article

Stream-first HTTP Client for Go

Level of difficulty: Medium
Reading time: 5 min
Views: 145

Stream-first HTTP Client for Go. Efficient, zero-buffer streaming for large HTTP payloads — built on top of net/http.

httpstream provides a minimal, streaming-oriented API for building HTTP requests without buffering entire payloads in memory. Ideal for large JSON bodies, multipart uploads, generated archives, or continuous data feeds.

- Stream data directly via io.Pipe—no intermediate buffers

- Constant memory usage (O(1)), regardless of payload size

- Natural backpressure (writes block when receiver is slow)

- Thin net/http wrapper—fully compatible

- Middleware support: func(http.RoundTripper) http.RoundTripper

- Fluent API for readability (GET, POST, Multipart, etc.)

- No goroutine leaks, no globals

httpstream connects your writer directly to the HTTP transport. Data is transmitted as it's produced, allowing the server to start processing immediately—without waiting for the full body to be buffered.

Article

The LLM's Narrative Engine: A Critique of Prompting

Level of difficulty: Easy
Reading time: 8 min
Views: 79

In a previous article, I proposed the holographic hypothesis: an LLM isn't a database of facts, but an interference field—a landscape of probabilities shaped by billions of texts. But a static landscape is just potential. How does the model actually move through it? How does it choose one specific answer from infinite possibilities?

This is where the Narrative Engine comes in. If the holographic hypothesis describes the structure of an LLM's "mind," the narrative engine hypothesis describes its dynamics. It is the mechanism that drives the model, forcing its probabilistic calculations to follow the coherent pathways of stories. This article critiques modern prompting techniques through this new lens, arguing that we are not programming a machine, but initiating a narrative.

Article

What is design thinking and how to implement it in the UX design

Level of difficulty: Easy
Reading time: 5 min
Views: 141

Design thinking is a customer‑focused, non‑linear, iterative approach to solving problems and finding creative solutions while creating a human‑centered, intuitive design for a product. It involves cross‑functional teams working together to study their users, address complex problems, and think outside the box to drive innovation. Let's discuss the stages, principles, and goals of this important process, as well as the positive impact it has on design teams.

Stages of design thinking

Design thinking is a non‑linear process, which means each team can organize it in the way most suitable for their current workflow. Nevertheless, experts define 5 stages that design thinking should include, not necessarily in the following order. 

Article

Comparison: StarRocks vs Apache Druid

Level of difficulty: Easy
Reading time: 5 min
Views: 129

Apache Druid has been a staple for real-time analytics. However, with evolving and sophisticated analytics demands, it has faced challenges in satisfying modern data performance needs. Enter StarRocks, a high-performance, open-source analytical database, designed to adeptly meet the advanced analytics needs of contemporary enterprises by offering robust capabilities and performance.

In this article, we’ll explore the functionalities, strengths, and challenges of both Apache Druid and StarRocks. Using practical examples and benchmark results, we aim to guide you in identifying which database might best meet your data needs.

Article

LLM as a Resonance-Holographic Field of Meanings

Level of difficulty: Easy
Reading time: 14 min
Views: 442

Alright. I pose the same question to an LLM in various forms. And this statistical answer generator, this archive of human knowledge, provides responses that sometimes seem surprisingly novel, and other times, derivative and banal.

On Habr, you'll find arguments that an LLM is incapable of novelty and creativity. And I'm inclined to agree.
You'll also find claims that it shows sparks of a new mind. And, paradoxically, I'm inclined to agree with that, too.

The problem is that we often try to analyze an LLM as a standalone object, without fully grasping what it is at its core. This article posits that the crucial question isn't what an LLM knows or can do, but what it fundamentally is.

Article

How we boosted SQL query accuracy by 33% with LLMs

Level of difficulty: Medium
Reading time: 8 min
Views: 412

Traditional approaches to SQL query generation often rely on instruction-tuned language models, but these can be inefficient and inaccurate. In this article, we’ll explore a new method based on reinforcement learning for model fine-tuning, which can improve both the accuracy and efficiency of SQL generation.

Article

OAuth 2.0 authorization in PostgreSQL using Keycloak as an example

Level of difficulty: Easy
Reading time: 27 min
Views: 610

Hello, Habr! We continue the series of articles on the innovations of the Tantor Postgres 17.5.0 DBMS, and today we will talk about authorization support via the OAuth 2.0 Device Authorization Flow. It is a modern and secure access method that allows applications to request access to PostgreSQL on behalf of the user through an external identity and access management provider, such as Keycloak, which is especially convenient for cloud environments and microservice architectures (the feature will also be available in PostgreSQL 18). In this article, we'll take a step-by-step look at configuring OAuth authorization in PostgreSQL using Keycloak: configure Keycloak, prepare PostgreSQL, write an OAuth token validator in PostgreSQL, and verify successful authorization via psql using Device Flow.

Article

Exposed: Custom column types

Level of difficulty: Easy
Reading time: 8 min
Views: 350

Exposed is an SQL library for Kotlin with DSL and DAO APIs for database interactions. While it comes with support for standard SQL data types, you can extend its functionality by creating custom column types.

Custom column types are useful when Exposed lacks support for specific database types (like PostgreSQL's enum, inet or ltree) or when you want to map columns to domain-specific types that better align with your business logic. By implementing custom columns, you gain control over data storage and retrieval while maintaining type safety.

In this article, we'll explore how to create custom column types in Exposed by creating a simple column type for PostgreSQL's enum.

Article

4 best tips to building high-quality data products from SYNQ

Level of difficulty: Easy
Reading time: 6 min
Views: 326

The “test everything” principle doesn’t improve data quality — it destroys it. Hundreds of useless alerts create noise that drowns out truly important signals, and the team stops responding to them. Google and Monzo have already moved away from this approach.

Here’s how to shift from blanket testing to targeted checks at nodes with the greatest impact radius — and why one well-placed test at the source is worth more than a hundred checks downstream.

Article

Privacy on Mobile: a practitioner’s checklist

Level of difficulty: Medium
Reading time: 13 min
Views: 5.7K

People have always valued privacy. Developments of the past decades (the internet, social networks, targeted advertising) turned data into an asset. The AI wave multiplies what can be inferred from crumbs. Phones and apps are integral to people's lives. Some users keep everything on their phones; others are more restrictive. Privacy shouldn't rely only on user awareness: developers should provide the first line of defence and the tools that protect a user's right to privacy. Even if you already deal with most of these pieces daily, I want to share my mental model: how I frame decisions with checklists and a few concrete examples from practice.

Article

Emotions and Qualia: A New Approach

Level of difficulty: Easy
Reading time: 6 min
Views: 361

At last, we arrive at qualia and emotions. Many of you will immediately think of Chalmers, the bat, redness, and zombies. Excellent. We can consider that ground covered.

Today, I will discuss a topic that seems distant from IT but, with each new breakthrough in AI, becomes ever more immediate: consciousness. It seems I speak of little else. So, to be precise, I will discuss its "hard problem": why do we experience at all? Why does the color red (and there’s the redness) feel red, and pain feel like pain?

This subjective, ineffable aspect of experience — the "what it is like" — is what philosophy calls qualia. For decades, it has been a dead end for scientists. But what if we're looking in the wrong direction? What if qualia are not an additional layer to computation, but an inherent property of the very architecture of computation?

Article

The Hidden Economics of Your Vacation: Why a 2-Hour Transfer in the Alps Can Cost More Than a Flight

Level of difficulty: Easy
Reading time: 10 min
Views: 1.1K

We think of pricing as a simple logic of distance and quality. But after diving into a rare data-driven analysis of the €2 billion Alpine transfer market, I realized the real cost drivers are invisible forces: structural inefficiencies, information asymmetry, and the surprisingly high price of consumer trust.

I've always been fascinated by markets that defy simple logic. Why does a cup of artisanal coffee cost $7? Why is some enterprise software priced per seat, while another is priced per API call? These aren't just arbitrary numbers; they are the surface-level results of deep, often hidden, economic forces. Recently, I stumbled upon a perfect example of such a market in an unexpected place: the private ski transfer industry in the Alps.

Article

Shardman. A quick guide for the architect

Reading time: 22 min
Views: 719

The myth of the magical fast=true parameter is still alive and well, but in distributed databases, another contender appears: distributed=true. Neither one will save you if you don’t rethink your schema, sharding keys, sequences, queries, and migration process. We walk through every corner with a clear-eyed approach — from choosing sharding keys and colocated tables to CDC, topologies, and foreign key constraints — showing where performance really improves, where it gets more expensive, and how to deal with it.

Article

AI slop coding, or How to build ridiculously long attack chains with AI

Level of difficulty: Easy
Reading time: 7 min
Views: 759

While researching malware used by attacker groups, we came across a series of unusual attacks that used GitHub repositories to store malicious files and victim data. These campaigns appear targeted rather than large-scale, and it seems the attackers relied heavily on AI during development. The earliest activity we traced was in September 2024, and the most recent in April 2025.

Our Threat Intelligence team investigates complex attacks featuring novel persistence and data collection methods and unique infrastructures. Sometimes we find simple two-line scripts, and other times we run into "bombs" that trigger dozens of different payloads at once. But it's pretty rare for us to come across such long chains of really simple AI-written scripts that still work, tied together in a way that clearly wasn't random. Think of this as an APT-style attack implemented at the "script kiddie" level (a derogatory term in hacker culture for those who rely on scripts or programs written by others).
