On Kafka and Hadoop use cases in Europe – 35th meetup

For Big Data Belgium’s first meetup of 2016, we had two interesting topics scheduled: Apache Kafka performance and Hadoop use cases in Europe. A big thanks goes to both our speakers, and to Co.Station BXL for hosting us!

Kafka, the Big Data message broker

Wannes De Smet – Sizing Servers

Often described as the heart of any scalable Big Data cluster, Apache Kafka has quickly become the message broker for your environment. As a message broker’s task is to reliably move messages from component A to B (and C), doing so in a scalable and reliable way with millions of messages is no small feat.

Wannes presented a short intro to Kafka, followed by a deep dive through the entire process of reliably producing and consuming messages. Oh, and doing all that in a distributed, highly-available, fault-tolerant manner, of course. He walked us through some of the architectural requirements and operational intrinsics (configuration, monitoring, …) of using and operating a Kafka cluster, based on lessons learned from moving a complex stack to Kafka in production.

So if you are still shifting CSV files around, take some time to get to know the ultimate upgrade.

Wannes cannot share his slide deck with us, but I’m pretty sure that he’ll be keen to share his slides with you personally if you provide the Sizing Servers Research Lab with your input on their Big Data performance research.

Hadoop in the real world: stories from across Europe

Tim Marston, Director, Regional Alliances, EMEA – Hortonworks

In the second presentation of the evening, Tim Marston introduced us to HDP, Hortonworks’ flavour of Hadoop. He highlighted its strengths as a fully open-source system, before going into more detail on various use cases that they implemented across Europe. The slides give all the details.

Hortonworks – Hadoop Stories

Thank you all for being there!

May the data be with you!

Big Data and Data Science – 27th meetup

Our 27th meetup, a joint venture with DataScience.be, was a huge success! The goal was to give a thorough introduction to Big Data to the data scientists and business people of both organizations.

In total, 221 participants registered across both communities! Unfortunately, quite a lot of people did not make it last night. That is probably due to the EU summit that was happening yesterday. (Note that the meetup was held at the VUB in Elsene.) But it was still a huge crowd.

Presentations on Big Data for Data Science

Philippe Van Impe, co-organizer of DataScience.be, gave an overview of last year’s activities of the DataScience.be community. He focussed specifically on their data for good and hackathon initiatives. In the presentation, he hid a product placement for BigBoards: on one of the pictures from a hackathon, Kris Peeter’s Hex was visible in the foreground. The Hex was used to do social network analysis!

Next, the DataScience.be team that has been working on the Médecins Sans Frontières (MSF) project presented an overview of their work and results. The team was led by Edward Vanden Berghe. They received a dataset from MSF on the organisation’s donations. The team screened the dataset for donor segments and looked for actionable insights to help MSF improve its revenues.

As the third speaker, I gave an introduction to Big Data and what it can mean for organisations, large and small. Finally, I touched on the importance of data science in giving meaning to the data.

Daan Gerits took over and got into the details of how to set up a scalable and resilient Big Data architecture.

After the break, Ferdinand Casier and Mathias Verbeke presented their EluciDATA project, which starts in 2015. The goal is to help Belgian companies with data innovation. Any questions or requests for participation can be sent to info@elucidata.be!

And last but not least, Karim Douïeb explained how they are using Spark for call detail record analysis for mobile operators. Really interesting!

The meetup ended at about 21h30 with a Q&A session with all presenters together. Very thoughtful questions were raised by a sharp audience!

Thank you all for participating!!!

Images from the 27th meetup

Strata 2014 – Claim your discount!

This year, the Strata conference takes place from 19 to 21 November 2014 in Barcelona. Besides Barcelona being a gorgeous city, the conference is another reason to visit for anyone with an interest in data! To give you an idea of what Strata is, I pulled a summary from the StrataConf website.

Moreover, we got a discount code! Grab the link and code from the sponsors list on our meetup page!

About the O’Reilly Strata Conference

The best minds in data will gather in Barcelona this November for the O’Reilly Strata Conference to learn, connect, and explore the complex issues and exciting opportunities brought to business by big data, data science, and pervasive computing.

The future belongs to those who understand how to collect and use their data successfully. And that future happens at Strata.

Why You Should Attend

Strata Conference is where big data’s most influential business decision makers, strategists, architects, developers, and analysts gather to shape the future of their businesses and technologies. If you want to tap into the opportunity that big data presents, you want to be at Strata.

In a crowded market place of “Big Data” conferences, Strata has firmly established itself as the place where you go to meet people who think and do data science.

At Strata, you’ll:

  • Be among the first to understand how you can leverage the promise of this huge change, and survive the resulting disruption
  • Find new ways to leverage your data assets across industries and disciplines
  • Learn how to take big data from science project to real business application
  • Discover training, hiring, and career opportunities for data professionals
  • Meet face-to-face with other innovators and thought leaders

Experience Strata

Strata Conference delivers the nuts-and-bolts foundation for building a data-driven business—the latest on the skills, tools, and technologies you need to make data work—alongside the forward-looking insights and ahead-of-the-curve thinking O’Reilly is known for.

There was a palpable sense of excitement in the air. Obviously most of the attendees were already ‘data’ aficionados, but it’s clear that ‘data’ in various forms is on the radar for governments, large corporations, and the developer communities.

At Strata, you’ll find:

  • Three days of inspiring keynotes and intensely practical, information-rich sessions exploring the latest advances, case studies, and best practices
  • A sponsor pavilion with key players and latest technologies
  • A vibrant “hallway track” for attendees, speakers, journalists, and vendors to debate and discuss important issues
  • Plenty of events and opportunities to meet other business leaders, data professionals, designers, and developers

About O’Reilly

O’Reilly is followed by venture capitalists, business analysts, news pundits, tech journalists, and thought leaders because we have a knack for knowing what’s important now and what will be important next—and the ability to articulate the seminal narratives about emerging and game-changing technologies.

We don’t say this to brag. We say it to make a point: we’re not easily hypnotized by hype. We’ve seen the bubbles build and burst. For over three decades, we’ve been tapping into a deep network of alpha geeks and thought leaders to recognize the truly disruptive technologies amidst the fluff. So when we invest in a conference, we’re not just following the hype, we’re committed to creating a community around an issue we believe is transformative.

At O’Reilly, we think big data is not just important. We think it’s a game changer. That’s why we created Strata.

O’Reilly’s conferences forge new ties between industry leaders, raise awareness of technology issues we think are interesting and important, and crystallize the critical issues around emerging technologies. Understanding these emerging technologies—and how they will transform the way we do business—has never been more crucial. If you want to understand the challenges and opportunities wrought by big data, you’ll want to attend Strata.


More than 80 people showed up at our last meetup, which focused on Spark. Because there are more and more signs that Spark will become the successor to Hadoop MapReduce, we invited some people who are already using Spark in production.

Andy gave an introduction to functional programming and Scala in just 45 minutes, which is definitely not enough to cover all the details. His slides can be found here.

Excellent meetup. The Scala introduction was so quick that it blew my mind but gave me enough information to follow the rest

(Eric Darchis)

We had Toni Verbeiren who gave an introduction to Spark and demonstrated Spark from the command line. Follow the links to his slides and visualization code.

Very interesting mix of Scala, Spark and Use Case

(Peter Vandenabeele)

Gerard Maas showed us how Spark is used in production at Virdata.com, ending with a cool demo of their platform. His slides are available here: Spark-at-Virdata

It was Sparkling! (Radek O)

I am always amazed by the quality of the BigData.be and ScalaBe presentations. Big up to all of you ! (Frederic)

The presentations were recorded by Parleys.com and will be published in a “bigdata.be” channel. We’ll let you know when they become available over there.

Thanks to Ordina for the location and for providing food and drinks.

See you next time! We are always looking for venues and presenters.


Meetup 8: Call for participation

Hello all,

The next meetup is already approaching and we are still missing some interesting topics to discuss.

So if you have read something lately that is worth mentioning, or if you’re in the middle of a breakthrough on an interesting brain teaser, or if you are implementing a wonderful project or just doing anything else relevant to our domain, please take a moment to prep some slides and get a discussion going at our 8th meetup!

Looking forward to hearing from you all!


The 7th meetup or Waiting for CSI Ixelles

Three weeks ago, our little big data community had its 7th meetup in Brussels. We think it is a good idea to hold our meetups in different cities, since we are the Belgian big data community. (If you can host a meetup in your city, please contact us!) On top of the typical evening traffic chaos and a meeting of all European prime ministers, there was a crime scene (some sort of knife fight) next to our meeting place, which caused some of our participants to arrive a bit later than planned.

Nevertheless, we had a good schedule, which consisted of two talks with lots of good interaction between the speakers and the audience.

The first talk was about Storm, a distributed real-time processing framework that came out of Twitter. Daan Gerits gave an introduction to Storm and walked us through an example application he had created for this meetup.

The second talk (by me) was about Apache Giraph, a graph processing framework on top of Apache Hadoop.

If you have been to one of our meetings and you liked it, please spread the word, leave comments here, and consider the “call for papers” for our 8th meetup in July open!