Big Data with Dino Pedreschi: The age of data

Dino Pedreschi, Professor of Computer Science at the University of Pisa and co-lead of KDD Lab - Knowledge Discovery and Data Mining Laboratory, talks about big data, machine learning and intelligent systems.

Marie Bourgeois

The new world of mass personalization requires relationships built upon mutual trust. And more than at any time in history, companies – and soon, big government, big education, big everything – are going to have to work hard to earn that trust from each and every one of us.

Gaetano Albertini

Data mining delivers vast quantities of data, often unstructured. Marketers are more familiar with interacting with data via dashboards that structure data to deliver analysis of commonalities, such as averages, ratios and percentages. The goal is to aggregate data in order to report a result, search for a pattern and find relationships between variables. Assumptions are made by humans, and data is queried to attest to that relationship. If valid, testing may continue on additional data.

Sárika Zsuzsi Görög

In response to Volodya Kuznetsov

The artificial intelligence (AI) industry has been leading the headlines consistently, and for good reason. It has already transformed industries across the globe, and companies are racing to understand how to integrate this emerging technology.

Advances in AI now mean product developers can create innovative and leading-edge products and services that, until recently, would not have been within reach of the average marketing budget.  These new products and services entering the market make AI adoption lower risk with a focus on delivering practical and immediately impactful results. Many past attempts resulted in expensive and custom-developed marketing technology projects that left their scars.

Volodya Kuznetsov

The artificial intelligence (AI) industry has been leading the headlines consistently, and for good reason. It has already transformed industries across the globe, and companies are racing to understand how to integrate this emerging technology.

Sigmund Gerhard

Machine learning is the practice of using suitable algorithms to learn from data and predict future trends in a particular area. Machine learning software uses statistical and predictive analysis to recognize patterns and find hidden insights in observed data. The best-known examples of machine learning applications are virtual assistants such as Amazon’s Alexa, Google Assistant, Apple’s Siri and Microsoft’s Cortana. Social platforms like Facebook also work on machine learning principles, predicting or responding based on users’ past behavior to suggest the most suitable things.

Teresa Guerrero

Data science is a term for working with big data; it covers data collection, cleansing, preparation and analysis for various purposes. A data scientist collects data from multiple sources and, after analysis, applies predictive analytics, machine learning and sentiment analysis to extract the critical information from the data sets. Data scientists analyze and understand the data from a business perspective and give useful insights and accurate predictions that can be used when making critical business decisions.

Kaan Buğra Kundakçı

In response to Nikoleta Stavros

Data science is a pretty ambiguous, ill-defined term and interdisciplinary field; and people mean (expect) different things in different contexts. In my opinion, in practice, data science is pretty much the same as what we've known as data mining or KDD (Knowledge Discovery in Databases).

The typical skills of a data scientist are:

  • Computer science: programming, hardware understanding, etc.
  • Math: Linear algebra, calculus, statistics
  • Communication: visualization and presentation
  • Domain knowledge

Nikoleta Stavros

Data science is a pretty ambiguous, ill-defined term and interdisciplinary field; and people mean (expect) different things in different contexts. In my opinion, in practice, data science is pretty much the same as what we've known as data mining or KDD (Knowledge Discovery in Databases).

Doriane Mateu Phạm

In generative models, the AI is more complex and doesn't rely on a previously collected database of answers. They respond to queries with newly generated code or phrases. These models can be used to simulate wide areas of conversation in chat bots or deal with new situations in general much more capably. These models simulate conversation with humans on broader topics better than retrieval-based systems but may make grammatical errors and also can be taught poor responses. 

Dorothea Petrescu

John McCarthy, an American computer scientist, coined the term "artificial intelligence" in 1956 at the Dartmouth Conference where the discipline was born. Today, it is an umbrella term that encompasses everything from robotic process automation to actual robotics. It has gained prominence recently due, in part, to big data, or the increase in speed, size and variety of data businesses now collect. AI can perform tasks such as identifying patterns in data more efficiently than humans, enabling businesses to gain more insight from their data.

Denny Daskalov

In response to Gunnr Østergård

The application of AI in the realm of self-driving cars also raises ethical concerns. When an autonomous vehicle is involved in an accident, liability is unclear. Autonomous vehicles may also be put in a position where an accident is unavoidable, forcing them to make ethical decisions about how to minimize damage.

Gunnr,

Another major concern is the potential for abuse of AI tools. Hackers are starting to use sophisticated machine learning tools to gain access to sensitive systems, complicating the issue of security beyond its current state. Deep learning-based video and audio generation tools also present bad actors with the tools necessary to create so-called deepfakes, convincingly fabricated videos of public figures saying or doing things that never took place.

Gunnr Østergård

The application of AI in the realm of self-driving cars also raises ethical concerns. When an autonomous vehicle is involved in an accident, liability is unclear. Autonomous vehicles may also be put in a position where an accident is unavoidable, forcing them to make ethical decisions about how to minimize damage.

Jalen Sepi Ozols

Artificial intelligence-based home automation is the future. If everyone in the United States installed Nest or a similar smart thermostat, they would collectively save hundreds of millions of dollars annually in wasted energy, since Nest is able to “learn” when people are or are not home. Nest and others automatically adjust the temperature, saving on energy use and costs.

Svetlana Barbieri

I believe it will be more like the science fiction movies, where we will maintain and work with the machines that do the work. However, these “jobs” will come with a level of prestige, as most people will probably live off a government sponsored socialism system. With AI and automation replacing so many jobs in the next 20 years, we will have to change social systems in order to adapt.

Prof. Dr.-Ing. Helga Breitner

With each wave of technological advancement, the quality of life for the world overall has increased. With AI, we will have better personalized healthcare, more efficient energy use, enhanced food production capabilities, improved jobs with less mundane work, and more. People will lead longer, higher-quality lives.

Janko Kyllikki

One of the top benefits will be the emergence of personalized medicine. Rather than a one-size-fits-all approach, doctors will be able to tailor treatment on an individual basis and prescribe the right treatments and procedures based on your medical history. As far as living up to hype, yes — definitely. Though as with many new technologies it’s more of a question of “when” rather than “if.”

Chares Valentinianus Kavanaugh

In response to Jovanka Pokorny

Many high school and college students are familiar with services like Turnitin, a popular tool used by instructors to analyze students’ writing for plagiarism. While Turnitin doesn’t reveal precisely how it detects plagiarism, research demonstrates how ML can be used to develop a plagiarism detector.

The biggest change that’s coming is the move from humans using software as a tool, to humans working with software as team members. Software will monitor things, alert humans, and execute basic tasks without human intervention. This will free human time for the really creative or interesting tasks and greatly improve business. A.I. is going to have a much larger impact than the hype.

Jovanka Pokorny

Many high school and college students are familiar with services like Turnitin, a popular tool used by instructors to analyze students’ writing for plagiarism. While Turnitin doesn’t reveal precisely how it detects plagiarism, research demonstrates how ML can be used to develop a plagiarism detector.

Juniper Womack

IoT provides new opportunities for companies to solve customer issues instantly and pre-empt problems before they escalate. Continuous monitoring enables companies to anticipate -- and fix -- problems before the customer is aware of them. "Companies can remotely monitor mission-critical machinery and pre-emptively intervene, which prevents or reduces problems and lowers costs," Leggett said. For example, New England Biomedical Services Inc. uses IoT to monitor science lab usage of their recombinant and native enzymes for genomic research so they can restock supplies immediately.

Waclaw Piatek

The term big data was first used to refer to increasing data volumes in the mid-1990s. In 2001, Doug Laney, then an analyst at consultancy Meta Group Inc., expanded the notion of big data to also include increases in the variety of data being generated by organizations and the velocity at which that data was being created and updated. Those three factors -- volume, velocity and variety -- became known as the 3Vs of big data, a concept Gartner popularized after acquiring Meta Group and hiring Laney in 2005.

Emīlija Bonomo

In response to Lizaveta Hersch

Big data analytics is the often complex process of examining large and varied data sets -- or big data -- to uncover information including hidden patterns, unknown correlations, market trends and customer preferences that can help organizations make informed business decisions.

On a broad scale, data analytics technologies and techniques provide a means to analyze data sets and draw conclusions about them to help organizations make informed business decisions. BI queries answer basic questions about business operations and performance.  Big data analytics is a form of advanced analytics, which involves complex applications with elements such as predictive models, statistical algorithms and what-if analysis powered by high-performance analytics systems.

Lizaveta Hersch

Big data analytics is the often complex process of examining large and varied data sets -- or big data -- to uncover information including hidden patterns, unknown correlations, market trends and customer preferences that can help organizations make informed business decisions.

George Waters

Along with the rise in unstructured data, there has also been a rise in the number of data formats: video, audio, social media and smart device data, to name just a few.

Careen Levi

Data science is an interdisciplinary field that includes statistics, predictive analytics, machine and deep learning and aims to get extra insights from data. The idea of data science is to run data experiments in order to reveal hidden patterns and dependencies.

Luned Birutė Mag Raith

In response to Dardan Dragić

Supervised learning is a method used to enable machines to classify objects, problems or situations based on related data fed into the machines. Machines are fed with data such as characteristics, patterns, dimensions, color and height of objects, people or situations repetitively until the machines are able to perform accurate classifications.

Supervised learning is a popular technology or concept that is applied to real-life scenarios. Supervised learning is used to provide product recommendations, segment customers based on customer data, diagnose disease based on previous symptoms and perform many other tasks.
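To make the idea of classifying from labeled examples concrete, here is a minimal sketch using a nearest-neighbour rule, one of the simplest supervised methods. The training data, the feature choice (height and weight) and the function name `classify` are all made up for illustration; real systems use far more data and more capable models.

```python
import math

# Hypothetical labeled training data: (height_cm, weight_kg) -> label.
TRAINING = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]


def classify(features, training=TRAINING):
    """Return the label of the training example closest to `features`."""
    # Euclidean distance between the new point and each labeled example.
    nearest = min(training, key=lambda example: math.dist(example[0], features))
    return nearest[1]


print(classify((22.0, 4.5)))   # a small, light animal
print(classify((58.0, 28.0)))  # a larger, heavier animal
```

The "supervision" is the label attached to every training example: the machine never gets a rule for telling cats from dogs, only examples with known answers.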

Dardan Dragić

Supervised learning is a method used to enable machines to classify objects, problems or situations based on related data fed into the machines. Machines are fed with data such as characteristics, patterns, dimensions, color and height of objects, people or situations repetitively until the machines are able to perform accurate classifications.

Neelam Szczepański

In response to Esperanta Tomàs

Data science is an umbrella term that encompasses data analytics, data mining, machine learning, and several other related disciplines. While a data scientist is expected to forecast the future based on past patterns, data analysts extract meaningful insights from various data sources. A data scientist creates questions while a data analyst finds answers to the existing set of questions.

Esperanta,

Data science includes retrieval, collection, ingestion, and transformation of large amounts of data, collectively known as Big Data. Data science is responsible for bringing structure to big data, searching for compelling patterns, and finally advising decision makers on how to make changes that suit the business needs. Data analytics and machine learning are two of the many tools and processes that data science uses.

Esperanta Tomàs

Data science is an umbrella term that encompasses data analytics, data mining, machine learning, and several other related disciplines. While a data scientist is expected to forecast the future based on past patterns, data analysts extract meaningful insights from various data sources. A data scientist creates questions while a data analyst finds answers to the existing set of questions.

Hrœrekr Franzese

A data analyst is usually the person who can do basic descriptive statistics, visualize data and communicate data points for conclusions. They must have a basic understanding of statistics, a very good sense of databases, the ability to create new views, and the perception to visualize the data. Data analytics can be referred to as the basic level of data science.
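As a rough illustration of what "basic descriptive statistics" means in practice, the sketch below summarizes a small, made-up series of daily order counts using Python's standard `statistics` module. The data and the variable names are assumptions for the example only.

```python
import statistics

# Hypothetical data: orders received on each of seven days.
daily_orders = [12, 15, 11, 15, 30, 14, 15]

summary = {
    "count": len(daily_orders),
    "mean": statistics.mean(daily_orders),      # arithmetic average
    "median": statistics.median(daily_orders),  # middle value when sorted
    "mode": statistics.mode(daily_orders),      # most frequent value
    "stdev": statistics.stdev(daily_orders),    # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note how the single busy day (30 orders) pulls the mean above the median; spotting and explaining that kind of gap is typical analyst work.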

Aisha Kamila Kuhn

AI systems are either weak AI or strong AI. Weak AI, also known as narrow AI, is an AI system that is designed and trained for a particular task. Virtual personal assistants, such as Apple's Siri, are a form of weak AI.  Strong AI, also known as artificial general intelligence, is an AI system with generalized human cognitive abilities so that when presented with an unfamiliar task, it has enough intelligence to find a solution.

Valerija Vroomen

Ultimately, the value and effectiveness of big data depend on the human operators tasked with understanding the data and formulating the proper queries to direct big data projects. Some big data tools meet specialized niches and allow less technical users to make various predictions from everyday business data.

Oberto

The popularity of the term "data science" has exploded in business environments and academia, as indicated by a jump in job openings.  However, many critical academics and journalists see no distinction between data science and statistics.  Is there a difference between the two?

Mariana Lichtenberg

An interesting application of AI is the so-called robo-reader. Essay grading is very labor intensive, which has encouraged researchers and companies to build essay-grading AIs. While their adoption varies among classes and educational institutions, it’s likely that you (or a student you know) have interacted with these “robo-readers” in some way.

Lucas Jessen

In response to elvira eva becket

Using big data, telecom companies can now better predict customer churn, Wal-Mart can predict what products will sell, and car insurance companies can understand how well their customers actually drive. Even election campaigns can be optimized using big data analytics. Some believe Obama’s win in the 2012 presidential election was due to his team’s superior ability to use big data analytics.

I think this is a bit scary: knowing that a non-government company can predict my next purchase makes me think of all the other use cases of big data analytics. Have you heard of the 24th frame? You can read more about it here.

elvira eva becket

Using big data, telecom companies can now better predict customer churn, Wal-Mart can predict what products will sell, and car insurance companies can understand how well their customers actually drive. Even election campaigns can be optimized using big data analytics. Some believe Obama’s win in the 2012 presidential election was due to his team’s superior ability to use big data analytics.

Bernardo Amadeo

In response to Nick

I think data may lose its value over the years, so we need to update it every day. But maybe some kinds of data can become even more valuable with age.

Nick,

In the general case, I think data should lose its value, because it becomes obsolete and is no longer valid, and as we can see, everything is improving rapidly. There is certainly some information that will become more valuable, for example when looking back to see what the patterns were, but it won't be as sought after or particularly useful.

Susan Boil

In response to Vardeep Edwards

It's scary to see just how much data we as individuals leave every day. It will be interesting to see what sort of control measures will be put in place in the future. Will the consumer get more control over their data and what is done with it?

Is it just me, or does GDPR seem to be more of a warning sign telling us to "accept and be aware of the problem" than a way of actually exerting control over companies and protecting us from cyber attacks?

Alex Tetradze

In response to Mihail Antoniou

Big Data has the potential to utterly transform the relationship that individuals have with institutions, customers with companies, patients with the healthcare system, students with universities, and voters with government.

Mihail,

Since you mentioned universities, here is one article that discusses the kinds of big data degrees available, how much money people can make with such an education, and the things one can do with such a degree.

Mihail Antoniou

Big Data has the potential to utterly transform the relationship that individuals have with institutions, customers with companies, patients with the healthcare system, students with universities, and voters with government.

Baldur Helgason

In response to Aleksey Tyomkin

I am trying to get started with Big Data science. Does anyone know whether they teach it in academia, and if so, what they call it: Data Science, Big Data Analytics, or something else?

Aleksey,

Here is a comparison between some of the best Big Data Analytics Master's programs.

Fabricio Ruiz

Big Data has the potential to utterly transform the relationship that individuals have with institutions, customers with companies, patients with the healthcare system, students with universities, and voters with government. And that means once it has fully penetrated society and industry, the Big Data revolution may very well prove a turning point in our economic – and ultimately, cultural – history as great as the electronics revolution. . . perhaps even as great as the first and second Industrial Revolutions.

Alex Tetradze

In response to YogaFan

@George, I think at one point in the video the presenter mentions that currently Big Data is measured in Petabytes.

When we discuss “Big Data” these days, it can be difficult to translate an exact understanding of what size the quantity of Data represents to stakeholders outside of the IT or Data Management circle. The Terabyte (TB) and Petabyte (PB) have now become the common currency of Data Managers’ lives, where just a few years ago, Gigabytes (GB) were as large as it got. Everything in the Data Management world is scaling massively, exponentially, and most of all, relentlessly. As long as daily business is carried on online, Data will continue to soar in volume and size.
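For anyone who wants to get a feel for how these units relate, here is a small sketch that converts raw byte counts into readable sizes. It assumes decimal (SI) units, where each step up is a factor of 1,000; some tools use binary steps of 1,024 instead. The function name `human_size` and the sample values are only for illustration.

```python
# Decimal (SI) units: each unit is 1,000 times the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def human_size(num_bytes):
    """Format a byte count using the largest unit that keeps the value under 1,000."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


# How many gigabytes fit in one petabyte under this convention.
GB_PER_PB = 10**15 // 10**9

print(human_size(1_500_000_000_000))  # a hypothetical 1.5 trillion-byte archive
print(human_size(5 * 10**15))         # a hypothetical 5 quadrillion-byte data lake
print(GB_PER_PB)
```

So a petabyte is a million gigabytes, which is why the jump from "gigabytes were as large as it got" to petabyte-scale systems is such a big one.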

Baldur Helgason

Vardeep,

I believe control over our personal data is precisely the reason the European Union implemented GDPR.  If you surf the web a lot, the warnings could get a bit obtrusive but I still appreciate knowing that something is being done to preserve my personal information.

Nick

I think data may lose its value over the years, so we need to update it every day. But maybe some kinds of data can become even more valuable with age.

YogaFan

@George, I think at one point in the video the presenter mentions that currently Big Data is measured in Petabytes.

Vardeep Edwards

It's scary to see just how much data we as individuals leave every day. It will be interesting to see what sort of control measures will be put in place in the future. Will the consumer get more control over their data and what is done with it?

PSJunkie

It seems many services may be disrupted by the fast evolution of Big Data and AI.  The debate whether that is normal and something to be expected or whether it is a bad thing that we should be wary of is probably going to go on for some time to come.

Slobodan Pavlicic

Dr. Pedreschi's definition of Big Data was the easiest one to understand that I have come across.

Aleksey Tyomkin

I am trying to get started with Big Data science. Does anyone know whether they teach it in academia, and if so, what they call it: Data Science, Big Data Analytics, or something else?

George Waters

I have always wondered, when we talk about Big Data what data size are we talking about—do we measure them in Terabytes, Petabytes, or some other unit?

