
Data visualization

We enclose the data in a beautiful shell


Release of Chipmunk v.3

Level of difficulty Easy
Reading time 3 min
Views 615

We have released a new version of Chipmunk, a tool for viewing and analyzing log files. Version 3 has been completely reworked with an emphasis on performance and rethought with usability in mind. Below is a short list of the most important changes and a general description.

Total votes 1: ↑1 and ↓0 +1
Comments 2

Rich text editors from backend perspective

Reading time 7 min
Views 4.2K
Welcome, everyone. In this article I am going to review the most popular types of rich text editors and the tradeoffs of using them from a backend perspective. By that I mean:

  • Streaming content from the rich text editor to other infrastructure tools such as full-text search, warehouses, etc.
  • Retrieving content for clients: mobile, web, desktop.
  • Storing content in some kind of storage (an SQL database in my case).
  • Analyzing content, which includes point 1, but also analyzing it from the perspective of our application.
Total votes 1: ↑0 and ↓1 -1
Comments 2

How to find an English teacher. Part 2

Reading time 4 min
Views 872

This is the continuation of the story about using data science to find an English teacher. If you have not read the first part yet, this is a good opportunity to become familiar with it.

Briefly: we had information about language teachers and tried to apply some basic ideas using pandas and our own expectations. Unfortunately, we got stuck on the third step because there was not enough information to satisfy our last requirement: we need no more than 3 candidates at the end.

This approach is based on my own experience and may not suit your point of view, ideas, or principles.
Rating 0
Comments 0

How to find an English teacher. Part 1

Reading time 5 min
Views 1.5K

In the modern world, ideas about using data science for extra benefit keep appearing. For instance, Google can use the history of watched videos to recommend new ones. Online shops use recommendation systems to increase your order total. However… if companies use data for their own benefit, could we do the same for our own needs, such as looking for an online English teacher?


This approach is based on my own experience and may not suit your point of view, ideas, or principles.

Total votes 2: ↑1 and ↓1 0
Comments 0

Kibana Tips & Tricks: How to view events in Discover mode

Reading time 3 min
Views 5.8K

Hi Habrausers!

As you may know, Kibana is a visualization tool that is part of the ELK (Elasticsearch, Logstash, Kibana) stack. With Kibana you can analyze and visualize your data, build various charts, and combine them on a dashboard to present the data in the most appealing way.
People who use Kibana in our company have different backgrounds: some are technical specialists who process data, and some are managers who simply want to monitor a few KPIs. All of them have various questions. Although Kibana is rather popular in IT companies, there are not many articles or courses about it. To fill the gap, I created Kibana Tips & Tricks, a weekly letter covering frequently asked questions and topics. These letters help our users become more familiar with Kibana. There are no secrets, just a detailed description of how you can work with your data.
I would like to share the first part of 'Kibana Tips & Tricks' with you. It is a series of simple how-to articles for people who would like to know more about data analysis and visualization in Kibana. Today we will see how to view events in Kibana.
Total votes 7: ↑7 and ↓0 +7
Comments 0

COVID YAAA! or Yet Another Analyze Attempt

Reading time 11 min
Views 1.2K


Hello, Habr!

About a month ago, I had a feeling of constant anxiety. I began to eat poorly, sleep even worse, and constantly read a ton of news about the pandemic. According to it, the coronavirus had either captured or liberated our planet, was either a conspiracy of world governments or the vengeance of the pangolin, and threatened either everyone at once or me and my sleeping cat personally…

Hundreds of articles, social media posts, and YouTube-Telegram-Instagram-TikTok (yes, I sin) content of varying quality led me to nothing but an even greater sense of anxiety.

But one day I bought buckwheat and decided to end it all. As soon as possible!

Total votes 1: ↑0 and ↓1 -1
Comments 0

Habr — best articles, authors and statistics 2019

Reading time 6 min
Views 2.8K
2019 is coming to an end, and Christmas is near. It is also time to gather all the data and compile statistics and a rating of the most interesting Habr articles of the year.

In this post I will present the best articles and the best Habr authors of 2019, and I will also show some statistical graphs that I find interesting or unusual.

Let's get started.
Total votes 23: ↑22 and ↓1 +21
Comments 11

Machine Learning for your flat hunt. Part 2

Reading time 9 min
Views 1.6K

Have you ever thought about how the nearest metro station influences the price of your flat?
What about several kindergartens around your apartment? Are you ready to plunge into the world of geospatial data?

The world provides so much information…

Total votes 4: ↑4 and ↓0 +4
Comments 0

Keyword Tree: graph analysis for semantic extraction

Reading time 3 min
Views 1.6K


This post is a short summary of full-scale research focused on keyword recognition. The semantics extraction technique was initially applied in social media research on depressive patterns. Here I focus on the NLP and math aspects without psychological interpretation. It is clear that analyzing single-word frequencies is not enough. Randomly shuffling the collection any number of times does not affect the relative frequencies but destroys the information completely: the bag-of-words effect. We need a more accurate approach to mining semantic attractors.
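
The bag-of-words effect mentioned above can be illustrated with a minimal Python sketch. It is not the method from the research: the toy sentence and the helper names `relative_frequencies` and `bigrams` are assumptions made up for this illustration. The sketch shuffles the words of a text and compares single-word relative frequencies, which ignore order, with counts of adjacent word pairs, which depend on it.

```python
import random
from collections import Counter


def relative_frequencies(tokens):
    """Map each token to its share of the total token count."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


def bigrams(tokens):
    """Return the adjacent word pairs, which depend on word order."""
    return list(zip(tokens, tokens[1:]))


# Hypothetical toy text, not taken from the research.
text = "i feel tired today and i feel that nothing helps"
tokens = text.split()

# Shuffle a copy with a fixed seed so the run is repeatable.
shuffled = tokens[:]
random.Random(0).shuffle(shuffled)

# Single-word relative frequencies do not depend on word order.
same_frequencies = relative_frequencies(tokens) == relative_frequencies(shuffled)

# Adjacent-pair counts do depend on word order.
same_bigram_counts = Counter(bigrams(tokens)) == Counter(bigrams(shuffled))

print("frequencies preserved:", same_frequencies)
print("bigram counts preserved:", same_bigram_counts)
```

The frequency comparison holds for any shuffle, because a shuffle only reorders the same tokens. The pair comparison will typically fail, which is the order information that single-word statistics cannot see.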

Total votes 8: ↑7 and ↓1 +6
Comments 0

Automatic respiratory organ segmentation

Reading time 8 min
Views 2K

Manual lung segmentation takes about 10 minutes and requires a certain skill to achieve the same high-quality result as automatic segmentation, which takes about 15 seconds.

I assumed that without a neural network it would be impossible to get an accuracy above 70%. I also assumed that morphological operations only prepare an image for more complex algorithms. However, on the 40 samples of tomographic data I had on hand, few as they are, the algorithm segmented the lungs without errors. Moreover, after testing on the first five cases, the algorithm did not change significantly and worked correctly on the other 35 studies without any changes to the settings.

Neural networks also have a disadvantage: training them requires hundreds of lung samples that have to be labeled manually.

Total votes 11: ↑10 and ↓1 +9
Comments 1

Google News and Leo Tolstoy: visualizing Word2Vec word embeddings using t-SNE

Reading time 7 min
Views 13K

Everyone perceives texts in a unique way, whether they read news on the Internet or world-famous classic novels. The same applies to various algorithms and machine learning techniques, which understand texts in a more mathematical way, namely, using a high-dimensional vector space.

This article is devoted to visualizing high-dimensional Word2Vec word embeddings using t-SNE. The visualization can be useful to understand how Word2Vec works and how to interpret relations between vectors captured from your texts before using them in neural networks or other machine learning algorithms. As training data, we will use articles from Google News and classical literary works by Leo Tolstoy, the Russian writer who is regarded as one of the greatest authors of all time.

We will start with a brief overview of the t-SNE algorithm, then move on to computing word embeddings with Word2Vec, and finally visualize the word vectors with t-SNE in 2D and 3D space. We will write our scripts in Python using Jupyter Notebook.

Total votes 28: ↑28 and ↓0 +28
Comments 0
