Sparkube: when OLAP becomes child’s play

Our report on the Tools-in-Action session "Sparkube: turning Apache Spark into an OLAP cube"

By Yassine Bargach Software Engineer @Yagach


Margo Consultants took part in Devoxx France 2018, the conference for passionate developers, held in Paris from April 18 to 20, 2018. Below is our report on the Tools-in-Action session dedicated to the Sparkube project, "Turning Apache Spark into an OLAP cube", presented by Antoine Chambille on Wednesday, April 18.


You finally have your favourite notebook, your Spark cluster is powerful and well configured, and your HDFS holds all the data you need. But did you know that OLAP can help you get even more value out of that data? At Devoxx France 2018, Antoine Chambille, R&D Director at Activeviam, presented the newly launched Sparkube project.


Let’s start with a quick reminder about OLAP.

A three-dimensional cube

OLAP (OnLine Analytical Processing) systems let users navigate data intuitively. The technology relies on a particular structure: the hypercube, or multidimensional cube.

The cube lets you analyse data along different axes, called hierarchies: for example the transaction date, the product category, the geographical area or the name of the seller. It also contains measures, which are aggregated along those axes, such as the number of records (count), the amount or the margin.
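To make the idea of hierarchies and measures concrete, here is a minimal sketch in plain Python. The rows, column names and values are invented for illustration; a real OLAP engine would of course not aggregate this way, but the principle is the same: pick some axes, then aggregate the measures for every combination of members.

```python
from collections import defaultdict

# Hypothetical fact rows: hierarchy members plus one raw value (amount).
SALES = [
    {"year": 2017, "category": "SUV", "region": "North", "amount": 20000.0},
    {"year": 2017, "category": "Sedan", "region": "North", "amount": 15000.0},
    {"year": 2018, "category": "SUV", "region": "South", "amount": 22000.0},
    {"year": 2018, "category": "SUV", "region": "North", "amount": 18000.0},
]


def aggregate(rows, axes):
    """Aggregate the 'count' and 'amount' measures along the chosen axes."""
    cells = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for row in rows:
        # One cube cell per combination of members on the selected axes.
        key = tuple(row[axis] for axis in axes)
        cells[key]["count"] += 1
        cells[key]["amount"] += row["amount"]
    return dict(cells)


by_category = aggregate(SALES, ["category"])
by_year_region = aggregate(SALES, ["year", "region"])
```

Changing the list of axes is all it takes to look at the same measures from another angle, which is what makes navigating a cube feel intuitive.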


But how do you use this cube?

MDX (Multidimensional Expressions) is the de facto query language for OLAP cubes. Introduced by Microsoft and later adopted by many OLAP solutions, this powerful language is particularly well suited to the cube structure, although its learning curve can be steep.

Fortunately, there is a protocol, XMLA (XML for Analysis), that supports both discovery (listing the axes, measures and so on) and the execution of MDX queries on an OLAP cube. Coupled with a suitable graphical interface, it spares users from writing MDX themselves: the interface generates every kind of query for them. The icing on the cake: Excel implements the XMLA protocol and can therefore connect, without any additional plugin, to any server that implements it.
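For readers who have never seen XMLA on the wire, the sketch below builds the SOAP envelope of an XMLA Execute request carrying an MDX statement. It only constructs the XML and sends nothing. The cube name `Cars`, the `Brand` hierarchy and the `AveragePrice` measure are hypothetical names chosen for the example, not names taken from the talk.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_execute_request(mdx, catalog):
    """Return an XMLA Execute request (SOAP envelope) as a string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # The MDX query travels inside Command/Statement.
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx

    # Properties tell the server which catalog to use and how to shape the result.
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Format").text = "Multidimensional"

    return ET.tostring(envelope, encoding="unicode")


# Hypothetical MDX query: one measure on columns, one hierarchy on rows.
MDX = (
    "SELECT {[Measures].[AveragePrice]} ON COLUMNS, "
    "[Brand].[Brand].Members ON ROWS "
    "FROM [Cars]"
)
request_xml = build_execute_request(MDX, catalog="Cars")
```

A client such as Excel produces requests of this kind behind the scenes, which is why the end user never has to type MDX.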


So where does Sparkube fit in?

Sparkube is a library developed by the software vendor Activeviam. It introspects a Spark DataFrame and creates a suitable OLAP cube structure, then exposes the cube through an XMLA interface. It answers client discovery queries, translates each MDX query into Spark queries, and returns the result once execution completes.
Besides working "as is" with Excel, the cube can also be explored with Active UI, Activeviam's own client.
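The translation step described above can be pictured with a toy example. The function below is not Sparkube's implementation and uses none of its API; it is a hypothetical helper that turns a very simple cube request (a list of axes and one aggregated measure) into the kind of Spark SQL `GROUP BY` query such a layer might generate. The table and column names are assumptions.

```python
ALLOWED_AGGREGATIONS = {"AVG", "SUM", "COUNT", "MIN", "MAX"}


def to_spark_sql(table, axes, measure, aggregation="AVG"):
    """Build a Spark SQL query aggregating `measure` along `axes` (illustrative only)."""
    if aggregation not in ALLOWED_AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {aggregation}")

    # Backticks quote identifiers in Spark SQL.
    aggregated = f"{aggregation}(`{measure}`) AS `{measure}_{aggregation.lower()}`"
    if not axes:
        # No axis selected: a single grand total.
        return f"SELECT {aggregated} FROM `{table}`"

    columns = ", ".join(f"`{axis}`" for axis in axes)
    return f"SELECT {columns}, {aggregated} FROM `{table}` GROUP BY {columns}"


# Hypothetical request: average price by brand and mileage.
query = to_spark_sql("cars", ["brand", "kilometer"], "price")
```

Real MDX covers far more than this (calculated members, slicing, several axes in the result), so an actual translation layer is considerably more involved.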

Interactions with Sparkube


And in practice?

Nothing could be simpler. After retrieving a dataset of more than 370,000 rows for the test, we follow the instructions on the Sparkube page: we start by creating the Dataset, then ask Sparkube to create and expose the cube.

Sparkube responds that the cube is exposed at http://localhost:9090/xmla.

And there we are: we can start exploring.

For example, we can retrieve the average price of a car by brand, by mileage, or by the car's accident history:

Sparkube - Data table with three axes and one measure

We can also insert charts:

Sparkube - Percentage by car type

Sparkube - Median price comparison between two car types, by year of registration

The tests were run on a dataset of more than 370,000 rows and 20 columns. Building the cube takes about 3 minutes, and query response times range from 2 to 5 seconds.



Sparkube certainly does not offer all the flexibility and performance of solutions designed specifically for OLAP. Such solutions can, in particular, pre-aggregate and index every point of the cube. They take advantage of specific hardware architectures (NUMA, for example) and rely mainly on horizontal scalability to limit network exchanges (and to allow very powerful nonlinear aggregations). Still, the project is very promising: being able to prove a concept this easily, with very good performance on reasonably sized datasets, shows users what OLAP has to offer.

For the moment, the Sparkube page does not contain much information, but the project will be interesting to follow as it evolves!




Watch the video of the talk given by Antoine Chambille: "Sparkube: turning Apache Spark into an OLAP cube".
