Data+AI Summit 2021 is Coming

Adi Polak

April 11, 2021

It’s been almost half a year since the last summit.

Data+AI Summit 2021 runs from Monday, May 24, through Friday, May 28. The training will be held on May 24-25 and will cater to a large set of practitioners, definitely broader than in previous editions: Data Analyst, Data Engineer, Data Scientist, ML Engineer, Partner Data Engineer, Platform Engineer, Technical. The wide range of roles makes me curious about the various technical personas in the Data and AI space. It’s not only Data Engineers and Data Scientists; a wide range of people can benefit from attending the summit.

Interestingly, Monday training starts at 6 am PT, which lets folks in European time zones join in their afternoon.

The pass for the summit is free upon registration.

The agenda is not easy to find from the main page, but after trying out multiple URLs I found the whole agenda on the site.

There are 3 main focus areas:

Productionizing Machine Learning
Databricks Experience
Technical Deep Dives

As you know by now, I think all the sessions are great! Sessions were carefully picked out of hundreds to provide the maximum value to the summit participants.

But let’s assume we can attend only three. Which three would you choose?

Here are my top picks:

1st: From the machine learning space - Strategies for Debugging Machine Learning Systems

This session is interesting because there are many ways to build machine learning models, from centralized distributed training to decentralized approaches like federated learning. And of course, there are many places where things can go wrong, whether you train on one machine or on many. The space of security and adversaries in machine learning training is a magical one. Like a rare diamond, you don’t know its worth until you examine it carefully. For example, in federated learning, each device trains a model locally on its own data and shares a summary of that model. If one of the devices is an adversary, an attacker who wants to change the model’s overall result, it can share a false summary with the rest of the group and skew the model’s overall behavior. In some cases that means lower revenue, but when we use machine learning to save lives, as in healthcare use cases, the consequences are more serious.
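To make the false-summary scenario concrete, here is a minimal sketch. It assumes the group combines summaries with a plain element-wise average, which is only one possible aggregation rule and not something the session description specifies. The function name `federated_average` and all the numbers are hypothetical, made up for illustration.

```python
from typing import List


def federated_average(updates: List[List[float]]) -> List[float]:
    """Element-wise mean of the model summaries shared by the devices.

    Assumption: plain averaging stands in for whatever aggregation
    a real federated learning system would use.
    """
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


# Hypothetical two-parameter model summaries from four honest devices.
honest_updates = [
    [1.0, 2.0],
    [1.2, 1.8],
    [0.8, 2.2],
    [1.0, 2.0],
]

# A fifth device is the adversary and shares a false summary (made-up values).
poisoned_update = [-20.0, 50.0]

clean_model = federated_average(honest_updates)
poisoned_model = federated_average(honest_updates + [poisoned_update])

print("without adversary:", clean_model)
print("with adversary:   ", poisoned_model)
```

With these toy numbers, a single dishonest device out of five is enough to flip the sign of the first parameter. That is the kind of failure that is hard to spot unless you know where to look, which is why debugging strategies for these systems matter.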

2nd: From the Databricks Experience space - Video Analytics At Scale: DL, CV, ML On Databricks Platform

The future of data is complex data. And complex data is sound and video. Let me bottom-line this for you: video is here to stay, be it social media, Netflix, or autonomous drones/cars/… The ability to process complex data and enable machine learning at scale is the future.

And lastly, from the Technical Deep Dives:

3rd: Becoming a Data-Driven Organization with Modern Lakehouse

Understanding the modern lakehouse is one thing, but knowing how to rally all the stakeholders and build a robust one is a different challenge.

I’m going to let you in on a secret: all sessions are going to be recorded and shared after the summit. But there is some magic in attending live, asking questions, and joining the conversation on Twitter and in chat with like-minded people who came to learn, exchange ideas, and network.

Links:

conference

apache spark