Big Data with Max Welling: The challenges

Max Welling, Research Chair in Machine Learning at the University of Amsterdam and VP Technologies at Qualcomm Netherlands, talks about GDPR, augmented reality, wearables, interactive chatbots, and the best skills for a student to learn.

Ruben Gansen

Although big data is very good at detecting correlations, especially subtle correlations that an analysis of smaller data sets might miss, it never tells us which correlations are meaningful.

Shervin Khorasani

Data analytics is a complex field, and it gets even more complicated when you factor in machine learning, deep learning, and other components of AI that are often used to analyze data. As such, there’s a huge demand for data scientists who are talented in various fields, purely because the job is heavily multidisciplinary.

Adnan Jazvic

In response to Jamyang Khachaturyan

Plus, there’s always the issue of data security when AI systems are handling massive amounts of data across networked, distributed databases. In many automated industries such as the telecoms industry, stolen data, for instance, can be used to launch automated spam calls like robocalls, a popular nuisance in many countries globally.

Jamyang Khachaturyan

In response to Elfriede Rothenberg

Take, for instance, cookies, the small pieces of data that websites store in the browser and use to collect user data for advanced analytics. While many countries now require websites to inform users about the use of cookies to collect data from browsers, there’s no way to know how much data, or which specific types of data, are collected via such websites.

Elfriede Rothenberg

AI systems, even the most basic forms, are usually very complex, with tons of algorithms obscuring what the system is actually doing under the hood. As such, any data used for such processing is usually hidden from view, which raises questions about transparency and privacy of such data.

Olena Matey

In response to Kaan Buğra Kundakçı

There are numerous examples of data science applications. Assume you are working for a credit company. Your boss gives you the task to find out whether a customer is creditworthy or not. You collect transaction data, maybe shipping records and customer ratings and so forth. Next, you'll probably use a machine learning algorithm to learn a predictive model. For example, let's assume you chose to grow a decision tree, and you concluded that this particular customer is not creditworthy. Finally, you prepare a nice presentation visualizing the decision tree to answer your boss' next question: Why is this customer not creditworthy?
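To make the credit example concrete, here is a minimal sketch in plain Python of growing a small decision tree and then reading back the rules that led to a decision. Everything in it is an assumption for illustration: the toy data, the feature names `late_payments` and `customer_rating`, and the helper functions `gini`, `best_split`, `grow`, and `explain`. A real project would more likely use an established library and far more data.

```python
from collections import Counter

# Hypothetical toy data: (late_payments, customer_rating) -> creditworthy?
ROWS = [
    ((0, 4.5), True),
    ((1, 4.0), True),
    ((0, 3.8), True),
    ((4, 4.2), False),
    ((5, 2.0), False),
    ((3, 2.5), False),
]
FEATURES = ["late_payments", "customer_rating"]


def gini(labels):
    """Gini impurity of a list of boolean labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Return (weighted_gini, feature_index, threshold) of the best split, or None."""
    best = None
    for f in range(len(FEATURES)):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def grow(rows, depth=0, max_depth=2):
    """Recursively grow a tree; leaves hold the majority label."""
    labels = [y for _, y in rows]
    split = best_split(rows)
    if depth == max_depth or gini(labels) == 0.0 or split is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return (f, t, grow(left, depth + 1, max_depth), grow(right, depth + 1, max_depth))


def explain(tree, x):
    """Return (prediction, list of human-readable rules followed)."""
    path = []
    while isinstance(tree, tuple):
        f, t, left, right = tree
        if x[f] <= t:
            path.append(f"{FEATURES[f]} <= {t}")
            tree = left
        else:
            path.append(f"{FEATURES[f]} > {t}")
            tree = right
    return tree, path


tree = grow(ROWS)
prediction, reasons = explain(tree, (4, 3.0))
print(prediction, reasons)
```

The list of rules returned by `explain` is what you would put on the slide for your boss: it shows which thresholds the customer crossed on the way to the final decision.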

Kaan Buğra Kundakçı

Machine learning is often a big part of a "data science" project, e.g., it is often heavily used for exploratory analysis and discovery (clustering algorithms) and building predictive models (supervised learning algorithms). However, in data science, you often also worry about the collection, wrangling, and cleaning of your data (i.e., data engineering), and eventually, you want to draw conclusions from your data that help you solve a particular problem.

Denny Daskalov

While AI tools present a range of new functionality for businesses, artificial intelligence also raises some ethical questions. Deep learning algorithms, which underpin many of the most advanced AI tools, only know what's in the data used during training. Most available data sets for training likely contain traces of human bias. This in turn can make the AI tools biased in their function.

Jalen Sepi Ozols

Artificial Intelligence will do wonders to help automate processes that, today, take time and manual labor but don’t contribute much to the bottom line or moving forward as a company. Automation will allow additional time and resources to be dedicated to what companies need to focus their energy on: customer experience.

Benjamin

In response to Tatum Okorie

Well, I think every second programmer can be trained to work with AI. Of course it will take a while, but I believe an experienced professional programmer can become an AI developer in a few months.

Professor Dodds

A lot of organizations claim that they face trouble with data security. This happens to be a bigger challenge for them than many other data-related problems. The data that comes into enterprises is made available from a wide range of sources, some of which cannot be trusted to be secure and compliant with organizational standards. They need to use a variety of data collection strategies to keep up with data needs. This in turn leads to inconsistencies in the data and, as a result, in the outcomes of the analysis.

George Waters

Netflix is a content streaming platform that uses Node.js. With the increased load of content and the complex formats available on the platform, they needed a stack that could handle the storage and retrieval of the data. They reportedly used the MEAN stack, whose database, MongoDB, follows a document model rather than a relational one, and with it they could in fact manage the data.

Tatum Okorie

While Big Data offers a ton of benefits, it comes with its own set of issues. This is a new set of complex technologies that is still in the nascent stages of development and evolution. Some of the commonly faced issues include inadequate knowledge about the technologies involved, data privacy, and inadequate analytical capabilities of organizations. A lot of enterprises also face the issue of a lack of skills for dealing with Big Data technologies. Not many people are actually trained to work with Big Data, which then becomes an even bigger problem.

Sanjeev Jehoram Moriarty

Data volumes are continuing to grow and so are the possibilities of what can be done with so much raw data available. However, organizations need to know just what they can do with that data and how much of it they can leverage to build insights for their consumers, products, and services. Of the 85% of companies using Big Data, only 37% have been successful at generating data-driven insights. A 10% increase in the accessibility of the data can lead to an increase of $65 million in the net income of a company.

Alex Tetradze

80% of the data generated today is unstructured and cannot be handled by our traditional technologies. Earlier, the amount of data generated was not that high, and we kept archiving it because only historical analysis was needed. Today, data is generated in petabytes, so it is no longer practical to archive it and retrieve it again whenever it is needed: data scientists need to work with the data continuously for predictive analysis, rather than the historical analysis done with traditional tools.

Ruslan Grześkiewicz

Because data science is a broad term for multiple disciplines, machine learning fits within data science. Machine learning uses various techniques like regression and clustering. On the other hand, the ‘data’ in data science may or may not originate from a machine or a mechanical process. So, the main difference between the two is that data science, as the broader term, not only focuses on algorithms and statistics but also takes care of the entire data processing methodology.

Lovro Dzvezdan Lam

Data science, analytics, and machine learning are growing at an astronomical rate and companies are now looking for professionals who can sift through the goldmine of data and help them drive swift business decisions efficiently.

Baldur Helgason

The need for big data velocity imposes unique demands on the underlying compute infrastructure. The computing power required to quickly process huge volumes and varieties of data can overwhelm a single server or server cluster. Organizations must apply adequate compute power to big data tasks to achieve the desired velocity. This can potentially demand hundreds or thousands of servers that can distribute the work and operate collaboratively.
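The idea of distributing work and combining the results can be sketched on a single machine. The example below is only an illustration under stated assumptions: threads stand in for servers, the log lines are made up, and the function names `count_words` and `distributed_word_count` are hypothetical. It splits the input into chunks, lets each worker count words in its own chunk, and then merges the partial counts.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """'Map' step: each worker counts words in its own chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, workers=3):
    """Split the input, count chunks in parallel, then merge the partial counts."""
    # Round-robin split so every worker gets a similar share of the lines.
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(count_words, chunks)
        total = Counter()
        for partial in partials:  # 'reduce' step: merge partial results
            total.update(partial)
    return total


# Hypothetical input standing in for a large data set.
LOG_LINES = [
    "error disk full",
    "warning disk slow",
    "error network down",
    "info disk ok",
]
totals = distributed_word_count(LOG_LINES)
print(totals["disk"], totals["error"])
```

On a real cluster the chunks would live on different machines and the merge would happen over the network, but the split, process, and combine structure is the same.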

Oberto

Python is the most common coding language I typically see required in data science roles, along with Java, Perl, or C/C++. Python is a great programming language for data scientists.  Are there any other technical skills a data scientist needs to have?

Timotej Vlašič

In response to Shila Vasuda Gupta

Shila,

I am glad you brought that up. Artificial intelligence is implemented in automated online assistants that can be seen as avatars on web pages. It can help enterprises reduce their operating and training costs. A major technology underlying such systems is natural language processing.

Shila Vasuda Gupta

Currently, major companies are investing in AI to handle difficult customers in the future. Google's most recent development analyzes language and converts speech into text. The platform can identify angry customers through their language and respond appropriately. :)

Gertruda Filipowski

The utilization of Big Data also faces challenges in the Education industry. From a technical point of view, a major challenge in the education industry is to incorporate big data from different sources and vendors and to utilize it on platforms that were not designed for the varying data. From a practical point of view, staff and institutions have to learn the new data management and analysis tools.

Waclaw Piatek

Large and small companies have cyber-threats within and outside of their control such as data breaches, theft of company secrets, spying, attacks on computer networks, and damage to critical systems. Many companies are considering the challenges of cybersecurity and looking to new business applications such as cloud computing to secure data. However, cloud computing has enormous security and privacy risks relating to dependence on untrustworthy or unevaluated third parties.

Sherman Wolff

In response to Baldur Helgason

Indeed, data collection creates issues we need to be aware of. Tracking might follow you in real time, and a variety of internet companies and services can also collect your browsing data and share your computer or router MAC address with third-party advertisers and companies. With this data, companies you have no direct interaction with can build up a pretty good profile of your internet habits and web browsing.

Mihail Antoniou

Thanks to Big Data, every Web ad you will soon see on Facebook and across the Web will have been bid upon in real time by advertisers who will pay based upon your perceived value as a potential customer.

Christopher Bradley

Absolutely agreed that we should learn more coding languages, but is everyone capable of doing that? I mean, you need certain skills to be able to improve in that field...

Baldur Helgason

Quite often, big data adoption projects put security off till later stages. And, frankly speaking, this is not too much of a smart move. Big data technologies do evolve, but their security features are still neglected, since it’s hoped that security will be granted on the application level. And what do we get? Both times (with technology advancement and project implementation) big data security just gets cast aside.

future hacker

Here is one challenge of Big Data I find difficult to wrap my head around.

Nobody is hiding the fact that big data isn’t 100% accurate. And all in all, it’s not that critical. But it doesn’t mean that you shouldn’t at all control how reliable your data is. Not only can it contain wrong information, but also duplicate itself, as well as contain contradictions. And it’s unlikely that data of extremely inferior quality can bring any useful insights or shiny opportunities to your precision-demanding business tasks.
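A basic reliability check for the duplicates and contradictions mentioned above might look like the following sketch. The records, the field layout (customer id, email, country), and the function name `quality_report` are all made up for illustration; real checks depend on what counts as a contradiction in your own data.

```python
from collections import defaultdict

# Hypothetical customer records: (customer_id, email, country)
records = [
    (1, "ana@example.com", "NL"),
    (2, "bo@example.com", "DE"),
    (1, "ana@example.com", "NL"),  # exact duplicate of the first row
    (2, "bo@example.com", "FR"),   # contradicts the earlier row for id 2
    (3, "cy@example.com", "PL"),
]


def quality_report(rows):
    """Flag exact duplicate rows and ids that appear with conflicting values."""
    seen = set()
    duplicates = []
    values_by_id = defaultdict(set)
    for row in rows:
        if row in seen:
            duplicates.append(row)
        seen.add(row)
        customer_id, email, country = row
        values_by_id[customer_id].add((email, country))
    # An id with more than one distinct (email, country) pair is contradictory.
    contradictions = sorted(cid for cid, vals in values_by_id.items() if len(vals) > 1)
    return {"duplicates": duplicates, "contradictions": contradictions}


report = quality_report(records)
print(report)
```

A check like this does not tell you which of the conflicting values is right, only where to look before the data reaches a precision-demanding task.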

Alex Tetradze

In response to Nioh1992

Nioh,

Augmented reality (AR) is the technology that expands our physical world by adding layers of digital information onto it. Unlike virtual reality (VR), AR does not create a whole artificial environment to replace the real one. AR appears in direct view of an existing environment and adds sounds, videos, and graphics to it. In short, AR is a view of the physical, real-world environment with superimposed computer-generated images, which changes the perception of reality.

Thomas Pfeiffer

Many algorithms are developed by a whole team. Sometimes that team is made up of three people; sometimes it is a whole company division, as with Google's search engine. What if, whenever I develop a feature, I write in-depth documentation that you as a customer can access? That could be a solution.

Nioh1992

Lucas,  Here are a couple of in-depth guides to augmented reality: source 1, source 2.  

Lucas Vermeulen

I would have loved to see a more in-depth discussion of augmented reality.  It seems to be an important and interesting application area of AI.

Martin D. Hoffmann

We spend a huge amount of time doing all kinds of things on our cell phones.  Some people say that is why we should know exactly how they work. I happen to disagree.  It is a topic that will likely be discussed for years to come.

Wioleta Brzezinski

As far as I know GDPR is already in effect in Europe.  The one thing I was interested in but could not quite conclude from the presentation is whether it covers algorithms too.

Olena Matey

It is my belief that one day we will have robots we can talk to the way we talk to other human beings.

