Scala

General-purpose programming language providing support for functional programming and a strong static type system


Ninil Apr 1 2024 at 19:10

User-defined aggregation functions in Spark

Medium

6 min

Data Engineering, Big Data, Scala

Below, we will discuss user-defined aggregation functions (UDAFs) built on org.apache.spark.sql.expressions.Aggregator, which aggregates groups of elements in a Dataset into a single value in any user-defined way.

Let’s start by examining an example from the official documentation that implements a simple aggregation.
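To make the idea concrete, here is a minimal sketch of the contract such an aggregator follows, written in plain Scala so that it runs without a Spark dependency. The trait `SimpleAggregator`, the case classes, and the `run` helper are hypothetical names introduced for illustration only. The real `Aggregator[IN, BUF, OUT]` class defines the same four methods (`zero`, `reduce`, `merge`, `finish`) and additionally requires `bufferEncoder` and `outputEncoder`.

```scala
// Hypothetical stand-in for Spark's Aggregator[IN, BUF, OUT]; not the real class.
trait SimpleAggregator[IN, BUF, OUT] {
  def zero: BUF                            // empty buffer to start from
  def reduce(buffer: BUF, input: IN): BUF  // fold one input element into a buffer
  def merge(left: BUF, right: BUF): BUF    // combine two partial buffers
  def finish(buffer: BUF): OUT             // turn the final buffer into the result
}

// Illustrative input and buffer types (assumed, not taken from the article).
final case class Employee(name: String, salary: Long)
final case class Average(sum: Long, count: Long)

object AverageSalary extends SimpleAggregator[Employee, Average, Double] {
  def zero: Average = Average(0L, 0L)

  def reduce(buffer: Average, e: Employee): Average =
    Average(buffer.sum + e.salary, buffer.count + 1)

  def merge(left: Average, right: Average): Average =
    Average(left.sum + right.sum, left.count + right.count)

  // Returns 0.0 for an empty group; this is a choice made for the sketch.
  def finish(buffer: Average): Double =
    if (buffer.count == 0) 0.0 else buffer.sum.toDouble / buffer.count
}

object AggregatorSketch {
  // Imitates distributed execution: reduce each partition on its own,
  // then merge the partial buffers and finish.
  def run[IN, BUF, OUT](agg: SimpleAggregator[IN, BUF, OUT],
                        partitions: Seq[Seq[IN]]): OUT = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }
}
```

Splitting the work into `reduce` and `merge` is what lets partial results be computed independently on each partition and combined afterwards. With Spark on the classpath, an object extending the real `Aggregator` is typically applied through `toColumn`, for example inside `groupByKey(...).agg(...)`.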

olegchir Dec 16 2020 at 14:17

Big Data Tools EAP 12 Is Out: Experimental Python Support and Search Function in Zeppelin Notebooks

3 min

1.2K

JetBrains corporate blog, Scala, Python, Big Data

Update 12 of the Big Data Tools plugin for IntelliJ IDEA Ultimate, PyCharm Professional Edition, and DataGrip has been released. You can install it from the JetBrains Plugin Repository or from inside your IDE. The plugin allows you to edit Zeppelin notebooks, upload files to cloud filesystems, and monitor Hadoop and Spark clusters.

In this release, we've added experimental Python support and global search inside Zeppelin notebooks. We’ve also addressed a variety of bugs. Let's talk about the details.

olegchir Oct 9 2020 at 13:55

Big Data Tools Update 11 Is Out

3 min

1.8K

JetBrains corporate blog, Scala, Java, Big Data

EAP 11 of the Big Data Tools plugin for IntelliJ IDEA Ultimate, PyCharm, and DataGrip is available starting today. You can install it from the JetBrains Plugin Repository or inside your IDE.

Big Data Tools is a new JetBrains plugin that allows you to connect to Hadoop and Spark clusters and monitor nodes, applications, and jobs. It also brings support for editing and running Zeppelin notebooks inside IntelliJ IDEA and DataGrip, so you can create, edit, and run Zeppelin notebooks without ever having to leave your favorite IDE. The plugin offers smart navigation, code completion, inspections, quick-fixes, and refactoring inside notebooks.

olegchir Oct 6 2020 at 11:42

ZTools for Apache Zeppelin

8 min

1.4K

JetBrains corporate blog, Big Data, Java, Scala

Zeppelin is a web-based notebook for data engineers that enables data-driven, interactive data analytics with Spark, Scala, and more.

The project recently reached version 0.9.0-preview2 and is being actively developed, but there are still many things to be implemented.

One such thing is an API for getting comprehensive information about what's going on inside the notebook. There is already an API that completely solves the problems of high-level notebook management, but it doesn’t help if you want to do anything more complex.

sainnr Oct 6 2019 at 18:39

Automate SOAP client auto-generation routines with WSDL import for SBT and Scala

5 min

3.9K

DevOps, Java, Scala

Working with SOAP often gets tricky, and dealing with WSDL can add considerably to the complexity. It may be the last thing you expect to face when working in a modern language such as Scala, which is well known for its reactive, asynchronous approach to handling requests. In fact, many developers who joined the industry fairly recently may not even know about SOAP and WSDL, and quickly become annoyed or even enraged when first trying to connect to such a legacy service. So should we deprecate it altogether in favour of a modern technology stack, or is there a less painful solution?

ThatAnnoyingCatAt4am Aug 29 2019 at 05:15

Escaping the Thicket of Tests: Building a Shortcut from a Fixture to an Assertion

15 min

1.2K

2ГИС corporate blog, Scala, Programming, IT systems testing, Functional Programming

In this article, I would like to propose an alternative to the traditional test design style using functional programming concepts in Scala. This approach was inspired by many months of pain from maintaining dozens of failing tests and a burning desire to make them more straightforward and more comprehensible.

Even though the code is in Scala, the proposed ideas apply to developers and QA engineers working in any language that supports functional programming. You can find a GitHub link to the full solution and an example at the end of the article.

primetalk Apr 11 2019 at 09:09

Compilable configuration of a distributed system

17 min

1.5K

Праймтолк corporate blog, DevOps, Scala, Abnormal programming, Programming

In this post we'd like to share an interesting way of dealing with the configuration of a distributed system.
The configuration is represented directly in the Scala language in a type-safe manner. An example implementation is described in detail. Various aspects of the proposal are discussed, including its influence on the overall development process.

Overall configuration management process

(in Russian)