The new issue of Data Science Digest is here! Hop to learn about the latest news, articles, tutorials, research papers, and event materials on DataScience, AI, ML, and BigData. All sections are prioritized for your convenience. Enjoy!
Algorithms *
Everything about algorithms
Memoization
Dynamic programming is applied to solve optimization problems. In optimization, we try to find out the maximum or minimum solution of something. It will find out the optimal solution to any problem if that solution exists. If the solution does not exist, dynamic programming is not able to get the optimal solution.
Optimization problems are the ones that require either lowest or highest possible results. We attempt to discover all the possible solutions in dynamic programming and then choose the best optimal solution. Dynamic programming problems are solved by utilizing the recursive formulas though we will not use a recursion of programming the procedures are recursive. Dynamic programming pursues the rule of optimality.
A dynamic programming working involves around following significant steps:
Big O Notation
Asymptotic notations are used to represent the complexity or running time of an algorithm. It is a technique of defining the upper and lower limits of the runtime performance of an algorithm. We can analyze the runtime performance of an algorithm with the help of asymptotic notations. Asymptotic notations are also used to describe the approximate running time of an algorithm.
Types of Asymptotic Notations
Following are the different types of asymptotic notations:
Context category
The mathematical model of signed sequences with repetitions (texts) is a multiset. The multiset was defined by D. Knuth in 1969 and later studied in detail by A. B. Petrovsky [1]. The universal property of a multiset is the existence of identical elements. The limiting case of a multiset with unit multiplicities of elements is a set. A set with unit multiplicities corresponding to a multiset is called its generating set or domain. A set with zero multiplicity is an empty set.
Data Science Digest — 21.04.21
Hi All,
I’m pleased to invite you all to enroll in the Lviv Data Science Summer School, to delve into advanced methods and tools of Data Science and Machine Learning, including such domains as CV, NLP, Healthcare, Social Network Analysis, and Urban Data Science. The courses are practiceoriented and are geared towards undergraduates, Ph.D. students, and young professionals (intermediate level). The studies begin July 19–30 and will be hosted online. Make sure to apply — Spots are running fast!
If you’re more used to getting updates every day, follow us on social media:
Telegram
Twitter
LinkedIn
Facebook
Regards,
Dmitry Spodarets.
Data Science Digest — We Are Back
Hi All,
I have some good news for you…
Data Science Digest is back! We’ve been “offline” for a while, but no worries — You’ll receive regular digest updates with top news and resources on AI/ML/DS every Wednesday, starting today.
If you’re more used to getting updates every day, follow us on social media:
Telegram  https://t.me/DataScienceDigest
Twitter  https://twitter.com/Data_Digest
LinkedIn  https://www.linkedin.com/company/datasciencedigest/
Facebook  https://www.facebook.com/DataScienceDigest/
And finally, your feedback is very much appreciated. Feel free to share any ideas with me and the team, and we’ll do our best to make Data Science Digest a better place for all.
Regards,
Dmitry Spodarets.
Algebra of text. Examples
The previous work from ref [1] describes the method of transforming a sign sequence into algebra through an example of a linguistic text. Two other examples of algebraic structuring of texts of a different nature are given to illustrate the method.
Algorithms in Go: Bit Manipulation
This article is a part of Algorithms in Go series where we discuss common algorithmic problems and their solution patterns.
In this edition, we take a closer look at bit manipulations. Bit operations can be extremely powerful and useful in an entire class of algorithmic problems, including problems that at first glance does not have to do anything with bits.
Let's consider the following problem: six friends meet in the bar and decide who pays for the next round. They would like to select a random person among them for that. How can they do a random selection using only a single coin?
The solution to this problem is not particularly obvious (for me:), so let's simplify a problem for a moment to develop our understanding. How would we do the selection if there were only three friends? In other words, how would we "mimic" a threesided coin with a twosided coin?
Converting text into algebra
Algebra and language (writing) are two different learning tools. When they are combined, we can expect new methods of machine understanding to emerge. To determine the meaning (to understand) is to calculate how the part relates to the whole. Modern search algorithms already perform the task of meaning recognition, and Google’s tensor processors perform matrix multiplications (convolutions) necessary in an algebraic approach. At the same time, semantic analysis mainly uses statistical methods. Using statistics in algebra, for instance, when looking for signs of numbers divisibility, would simply be strange. Algebraic apparatus is also useful for interpreting the calculations results when recognizing the meaning of a text.
Compilation of math functions into Linq.Expression
Here I am going to cover my own approach to compilation of mathematical functions into Linq.Expression. What we are going to have implemented at the end:
1. Arithmetical operations, trigonometry, and other numerical functions
2. Boolean algebra (logic), less/greater and other operators
3. Arbitrary types as the function's input, output, and those intermediate
Hope it's going to be interesting!
Algorithms in Go
Most solutions to algorithmic problems can be grouped into a rather small number of patterns. When we start to solve some problem, we need to think about how we would classify them. For example, can we apply fast and slow аlgorithmic pattern or do we need to use cyclic sortpattern? Some of the problems have several solutions based on different patterns. In this series, we discuss the most popular algorithmic patterns that cover more than 90% of the usual problems.
It is different from HighSchool Algorithms 101 Course, as it is not intended to cover things like Karatsuba algorithm (fast multiplication algorithm) or prove different methods of sorting. Instead, Algorithmic Patterns focused on practical skills needed for the solution of common problems. For example, when we set up a Prometheus alert for high request latency we are dealing with Sliding Window Pattern. Or let say, we organize a team event and need to find an available time slot for every participant. At the first glance, it is not obvious that in this case, we are actually solving an algorithmic problem. Actually, during our day we usually solve a bunch of algorithmic problems without realizing that we dealing with algorithms.
The knowledge about Algorithmic Patterns helps one to classify a problem and then apply the appropriate method.
But probably most importantly learning algorithmic patterns boost general programming skills. It is especially helpful when you are debugging some production code, as it trains you to understand the execution flow.
Patterns covered so far:
Stay tuned :)
<Promo> If you interested to work as a backend engineer, there is an open position in my squad. Prior knowledge of Golang is not required. I am NOT an HR and DO NOT represent the company in any capacity. However, I can share my personal experience as a backend engineer working in the company. </Promo>
Algorithms in Go: Iterative Postorder Traversal
In this article, we discuss the postorder traversal of a binary tree. What does postorder traversal mean? It means that at first, we process the left subtree of the node, then the right subtree of the node, and only after that we process the node itself.
Why would we need to do it in this order? This approach solves an entire class of algorithmic problems related to the binary trees. For example, to find the longest path between two nodes we need to traverse the tree in a postorder manner. In general, postorder traversal is needed when we cannot process the node without processing its children first. In this manner, for example, we can calculate the height of the tree. To know the height of a node, we need to calculate the height of its children and increment it by one.
Let's start with a recursive approach. We need to process the left child, then the right child and finally we can process the node itself. For simplicity, let's just save the values into slice out.
What's new in AngouriMath 1.2?
After 210 days, 600 commits, tens of debugging nights, and thousands of messages in the project chat, I finally released AngouriMath 1.2.
This is an opensource symbolic algebra library for C# and F#, maybe it is interesting for someone?
Algorithms in Go: Matrix Spiral
Most solutions to algorithmic problems can be grouped into a rather small number of patterns. When we start to solve some problem, we need to think about how we would classify them. For example, can we apply fast and slowalgorithmic pattern or do we need to use cyclic sortpattern? Some of the problems have several solutions with different patterns. In this article of series Algorithms in Go we consider an algorithmic pattern that solves an entire class of the problems related to a matrix. Let's take one of such problems and see how we can handle it.
How can we traverse a matrix in a spiral order?
Why PVSStudio Uses Data Flow Analysis: Based on Gripping Error in Open Asset Import Library
An essential part of any modern static code analyzer is data flow analysis. However, from an outside perspective, the use of data flow analysis and its benefit is unclear. Some people still consider static analysis a tool searching for something in code according to a certain pattern. Thus, we occasionally write blog posts to show how this or that technology, used in the PVSStudio analyzer, helps to identify another interesting error. Today, we have such an article about the bug found in the Base64, one of the encoding standard implementations of binary data.
Algorithms in Go: Dutch National Flag
The flag of the Netherlands consists of three colors: red, white and blue. Given balls of these three colors arranged randomly in a line (it does not matter how many balls there are), the task is to arrange them such that all balls of the same color are together and their collective color groups are in the correct order.
For simplicity instead of colors red, white, and blue we will be dealing with ones, twos and zeroes.
Let's start with our intuition. We have an array of zeroth, ones, and twos. How would we sort it? Well, we could put aside all zeroes into some bucket, all ones into another bucket, and all twos into the third. Then we can fetch all items from the first bucket, then from the second, and from the last bucket, and restore all the items. This approach is perfectly fine and has a great performance. We touch all the elements when we iterate through the array, and then we iterate through all the elements once more when we "reassamble" the array. So, the overall time complexity is O(n) + O(n) ~= O(n). The space complexity is also O(n) as we need to store all items in the buckets.
Can we do better than that? There is no way to improve our time complexity. However, we can think of a more efficient algorithm in regard to space complexity. How would we solve the problem without the additional buckets?
Let's make a leap of faith and pretend that somehow we were able to process a part of the array. We iterate through part of the array and put encountered zeroes and ones at the beginning of the array, and twos at the end of the array. Now, we switched to the next index i with some unprocessed value x. What should we do there?
Algorithms in Go: Merge Intervals
This is the third part of a series covering the implementation of algorithms in Go. In this article, we discuss the Merge Interval algorithm. Usually, when you start learning algorithms you have to deal with some problems like finding the least common denominator or finding the next Fibonacci number. While these are indeed important problems, it is not something that we solve every day. What I like about the Merge Interval algorithm is that we apply it in our everyday life, usually without even noticing that we are solving an algorithmic problem.
Let's say that we need to organize a meeting for our team. We have three colleagues Jay, May, and Ray and their time schedule look as follows (a colored line represents an occupied timeslot):
Doing «Data Science» even if you have never heard the words before
There’s a lot of talk about machine learning nowadays. A big topic – but, for a lot of people, covered by this terrible layer of mystery. Like black magic – the chosen ones’ art, above the mere mortal for sure. One keeps hearing the words “numpy”, “pandas”, “scikitlearn”  and looking each up produces an equivalent of a threetome work in documentation.
I’d like to shatter some of this mystery today. Let’s do some machine learning, find some patterns in our data – perhaps even make some predictions. With good old Python only – no 2gigabyte library, and no arcane knowledge needed beforehand.
Interested? Come join us.
Algorithms in Go: Sliding Window Pattern (Part II)
This is the second part of the article covering the Sliding Window Pattern and its implementation in Go, the first part can be found here.
Let's have a look at the following problem: we have an array of words, and we want to check whether a concatenation of these words is present in the given string. The length of all words is the same, and the concatenation must include all the words without any overlapping. Would it be possible to solve the problem with linear time complexity?
Let's start with string catdogcat
and target words cat
and dog
.
How can we handle this problem?
Algorithms in Go: Sliding Window Pattern
Let's consider the following problem: we have an array of integers and we need to find out the length of the smallest subarray the sum of which is no less than the target number. If we don't have such a subarray we shall return 1.
We can start with a naive approach and consider every possible subarray in the input:
Authors' contribution

alizar 2844.5 
ZlodeiBaal 1441.0 
Fil 1419.0 
agorkov 1345.0 
Leono 1086.0 
YUVladimir 1037.0 
valemak 1014.0 
mephistopheies 996.0 
haqreu 958.0 
Zalina 922.0