machine learning on big data: opportunities and challenges- future research direction for phd...

MACHINE LEARNING ON BIG DATA: OPPORTUNITIES AND CHALLENGES - FUTURE RESEARCH DIRECTION FOR PHD SCHOLARS An Academic presentation by Dr. Nancy Agnes, Head, Technical Operations, Phdassistance Group www.phdassistance.com Email: [email protected]

Upload: lopezphdassistance

Post on 21-May-2021



TRANSCRIPT

Page 1: Machine Learning On Big Data: Opportunities And Challenges- Future Research Direction For Phd Scholars  - Phdassistance

MACHINE LEARNING ON BIG DATA: OPPORTUNITIES AND CHALLENGES - FUTURE RESEARCH DIRECTION FOR PHD SCHOLARS

An Academic presentation by Dr. Nancy Agnes, Head, Technical Operations, Phdassistance Group www.phdassistance.com Email: [email protected]


TODAY'S DISCUSSION

Outline

In-Brief

Introduction

Machine learning

Big data

Data preprocessing opportunities and challenges

Evaluation opportunities and challenges

Future research

Conclusion


In-Brief

Machine learning (ML) is increasingly used in a wide variety of applications. It has risen to prominence in recent years, owing in part to the emergence of big data. When it comes to big data, ML algorithms have never been more promising. Big data allows ML algorithms to discover finer-grained patterns and make more timely and precise predictions than ever before; however, it also poses significant challenges to ML, such as model scalability and distributed computing.


INTRODUCTION

ML techniques have had enormous societal impact in fields such as computer vision, speech recognition, natural language understanding, neuroscience, fitness, and the Internet of Things.

The emergence of the era of big data has stirred up interest in machine learning. ML algorithms have never been so promising, or so challenged, as they are by big data in gaining new insights into a variety of business applications and human behaviours.


On the one hand, big data provides ML algorithms with unparalleled amounts of data from which to derive underlying patterns and create predictive models; on the other hand, conventional ML algorithms must overcome critical challenges such as scalability before they can fully unlock the value of big data.

With the ever-expanding world of big data, ML must develop and grow in order to turn big data into actionable intelligence.


ML aims to answer the question of how to build a computer system that improves itself over time.

The problem of learning from experience with respect to certain tasks and performance metrics is referred to as an ML problem.

Users may use ML techniques to deduce underlying structure and make predictions from large datasets.


ML thrives on strong computational environments, efficient learning techniques (algorithms), and rich and/or large data.

As a result, ML has a lot of potential and is an essential part of big data analytics.


Fig. 1. A framework of machine learning on big data (MLBiD)


MACHINE LEARNING

Data pre-processing, learning, and evaluation are the common stages of machine learning.

Data pre-processing transforms raw data into the "right form" for the subsequent learning steps.

Through data cleaning, extraction, transformation, and fusion, the pre-processing phase turns raw data into a form that can be used as input to learning.


Using the pre-processed input data, the learning step selects learning algorithms and tunes model parameters to produce the desired outputs.

Some learning methods, especially representation learning, can also perform data pre-processing.

After that, the trained models are evaluated to see how well they perform.

Machine learning can be characterised by the nature of the learning feedback, the goal of the learning task, and the timing of data availability.
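The three stages described above (pre-processing, learning, evaluation) can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenters' method: the toy dataset `RAW`, the function names `preprocess`, `train_threshold`, and `evaluate`, and the one-feature threshold classifier are all assumptions chosen to keep the example self-contained.

```python
from statistics import mean

# Hypothetical raw records: (feature, label); None marks a missing value.
RAW = [(1.0, 0), (2.0, 0), (None, 0), (8.0, 1), (9.0, 1), (2.0, 0), (10.0, 1)]


def preprocess(records):
    """Cleaning step: drop records with missing features and exact duplicates."""
    seen, clean = set(), []
    for x, y in records:
        if x is None or (x, y) in seen:
            continue
        seen.add((x, y))
        clean.append((x, y))
    return clean


def train_threshold(records):
    """Learning step: place a decision threshold midway between class means."""
    mean0 = mean(x for x, y in records if y == 0)
    mean1 = mean(x for x, y in records if y == 1)
    return (mean0 + mean1) / 2


def evaluate(threshold, records):
    """Evaluation step: accuracy of the rule 'predict 1 if x > threshold'."""
    hits = sum((x > threshold) == bool(y) for x, y in records)
    return hits / len(records)


clean = preprocess(RAW)
threshold = train_threshold(clean)
# Evaluated on the training data for brevity; a real evaluation
# would use held-out data.
accuracy = evaluate(threshold, clean)
```

Each stage is a separate function so that any one of them could be swapped out, for example replacing the threshold rule with a different learner, without touching the other two.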


ML can be divided into three major categories based on the nature of the feedback available to a learning system: supervised learning, unsupervised learning, and reinforcement learning.

ML can also be divided into two types, representation learning and task learning, depending on whether the goal is to learn the features themselves or to learn particular tasks using input features.

Each machine learning algorithm can therefore be classified along more than one of these dimensions.


Fig. 2. A multi-dimensional taxonomy of machine learning


BIG DATA

Volume, velocity, variety, veracity, and value are the five dimensions of big data.

We organise the five dimensions into a stack of big, data, and value layers, starting from the bottom.

The data layer is central to big data, and the value layer characterises the impact of big data on real-world applications.


The lower layer is more reliant on technical advancements, while the higher layer is more focused on applications that leverage big data's strategic strength.

Established machine learning paradigms and algorithms must be adapted to realise the potential of big data analytics and to process big data efficiently.

This section identifies the key opportunities and challenges.

We go through them individually for each of the three phases of machine learning: pre-processing, learning, and evaluation.


Fig. 3. Big data stack


DATA PREPROCESSING OPPORTUNITIES AND CHALLENGES

DATA REDUNDANCY

Duplication occurs when two or more data samples represent the same object.

Data duplication or inconsistency can have a significant impact on machine learning.

Although a variety of techniques for detecting duplicates have been developed over the last 20 years, traditional methods such as pairwise similarity comparison are no longer feasible for big data.
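One common way to avoid comparing every pair of records is blocking: records are first grouped by a cheap key, and only records in the same group are compared. The sketch below is an illustration under assumptions, not a technique taken from the slides; the `blocking_key` function (first three alphanumeric characters of the name) and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def normalise(name):
    """Canonical form used for the (deliberately simple) duplicate test."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def blocking_key(name):
    """Hypothetical cheap key: first three characters of the normalised name."""
    return normalise(name)[:3]


def candidate_pairs(names):
    """Yield index pairs that share a block, instead of all n*(n-1)/2 pairs."""
    blocks = defaultdict(list)
    for i, name in enumerate(names):
        blocks[blocking_key(name)].append(i)
    for members in blocks.values():
        yield from combinations(members, 2)


def find_duplicates(names):
    """Return index pairs whose normalised names are identical."""
    return [(i, j) for i, j in candidate_pairs(names)
            if normalise(names[i]) == normalise(names[j])]


names = ["Acme Corp", "ACME Corp.", "Beta Ltd", "beta ltd", "Gamma Inc"]
compared = list(candidate_pairs(names))
duplicates = find_duplicates(names)
```

With these five records only 2 of the 10 possible pairs are compared. Blocking trades recall for speed: duplicates that fall into different blocks are never compared, so the choice of key matters.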


Furthermore, the conventional presumption that duplicated pairs are rarer than non-duplicated pairs is no longer true.

In this regard, Dynamic Time Warping can be much faster than current Euclidean distance algorithms.
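For readers unfamiliar with Dynamic Time Warping (DTW), the following is a textbook dynamic-programming sketch of the distance itself, with plain Euclidean distance alongside for comparison. It is not the optimised algorithm behind the speed claim above, which the slides do not describe: this naive version takes time proportional to the product of the two sequence lengths, so any speed advantage would have to come from optimisations such as pruning, which are not shown. The function names and sample sequences are assumptions.

```python
import math


def euclidean(a, b):
    """Point-by-point distance; requires sequences of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # only a advances
                                    cost[i][j - 1],      # only b advances
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


x = [0, 1, 2, 1, 0]
y = [0, 0, 1, 2, 1]  # same shape as x, shifted by one step
d_euclid = euclidean(x, y)
d_dtw = dtw(x, y)
```

Because DTW may stretch one sequence against the other, the shifted copy `y` ends up closer to `x` under DTW than under the rigid point-by-point Euclidean comparison.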

DATA HETEROGENEITY

Big data promises to include multi-view data from a variety of repositories, in a variety of formats, and from a variety of population samples, and thus is highly heterogeneous.


These multi-view heterogeneous data may differ in their importance for a learning task. As a result, combining all of the features and treating them as equally relevant is unlikely to produce optimal learning outcomes.

Big data offers the possibility of learning from different views simultaneously and then combining the multiple results by learning the relevance of each feature view to the task.

The approach is expected to be robust to data outliers and to cope with optimization and convergence problems.
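One simple way to learn the relevance of each view is to train a separate model per view and weight its vote by how well it performs on held-out data. The sketch below is only an illustration under assumptions, not the method the slides have in mind: the two views, the nearest-class-mean classifier, and the accuracy-based weights are all hypothetical choices.

```python
from statistics import mean


def fit_view(xs, ys):
    """Per-view model: store the mean feature value of each class."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}


def predict_view(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda c: abs(x - model[c]))


def view_weight(model, xs, ys):
    """Relevance of a view, taken here as its accuracy on held-out data."""
    hits = sum(predict_view(model, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


def predict_multi(models, weights, sample):
    """Weighted vote across views; `sample` holds one feature per view."""
    votes = {}
    for model, w, x in zip(models, weights, sample):
        c = predict_view(model, x)
        votes[c] = votes.get(c, 0.0) + w
    return max(votes, key=votes.get)


# Hypothetical data: view A separates the classes well, view B barely at all.
train_y = [0, 0, 1, 1]
train_views = [[1.0, 2.0, 8.0, 9.0],   # view A
               [4.0, 6.0, 7.0, 5.0]]   # view B
val_y = [0, 1, 0, 1]
val_views = [[1.5, 8.5, 2.5, 7.5],
             [6.5, 4.5, 5.2, 5.8]]

models = [fit_view(xs, train_y) for xs in train_views]
weights = [view_weight(m, xs, val_y) for m, xs in zip(models, val_views)]
prediction = predict_multi(models, weights, sample=(2.0, 6.4))
```

In this toy case the two views disagree on the sample, and the more reliable view A outvotes view B because its learned weight is higher.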


DATA DISCRETIZATION

Most current discretization methods would be ineffective when dealing with large amounts of data.

To address this, traditional discretization approaches have been parallelized on big data platforms; a distributed variant of the entropy minimization discretizer, based on the Minimum Description Length Principle, improves both efficiency and accuracy.
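The core of an entropy minimization discretizer is the search for the cut point that minimises the weighted class entropy of the two resulting intervals. The single-machine sketch below shows only that step; it omits the recursive splitting, the Minimum Description Length stopping criterion, and any distribution across a cluster. The function names and the toy data are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut_point, weighted_entropy) minimising class entropy."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(pairs)
        if score < best[1]:
            # Candidate boundary: midpoint between neighbouring values.
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, score)
    return best


values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["low", "low", "low", "high", "high", "high"]
cut, score = best_cut(values, labels)
```

A full discretizer would apply `best_cut` recursively to each interval and stop when a further split is no longer justified; a distributed version would additionally spread the candidate evaluation across workers.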


DATA LABELLING

Active learning can be used as an optimization technique for labelling tasks in crowd-sourced databases, reducing the number of questions posed to the crowd and enabling crowd-sourced applications to scale.

Designing active learning algorithms for a crowd-sourced dataset, on the other hand, presents a number of practical challenges, including generality, scalability, and usability.

Another problem is that such a dataset cannot cover all user-specific contexts, resulting in performance that is often inferior to that of user-centric training.
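The idea of asking the crowd fewer questions can be illustrated with uncertainty sampling, one common active learning strategy: only the items the current model is least sure about are sent out for labelling. This is a sketch under assumptions, not the specific algorithm the slides refer to; `predicted_prob` stands in for any model's probability estimates, `budget` for the number of crowd questions allowed, and the item names are hypothetical.

```python
def uncertainty(prob):
    """Closeness to an undecided prediction: 0.5 at prob = 0.5, 0 at 0 or 1."""
    return 0.5 - abs(prob - 0.5)


def select_queries(predicted_prob, budget):
    """Pick the `budget` unlabelled items with the most uncertain predictions."""
    ranked = sorted(predicted_prob,
                    key=lambda item: uncertainty(predicted_prob[item]),
                    reverse=True)
    return ranked[:budget]


# Hypothetical model outputs: P(label = positive) for each unlabelled item.
predicted_prob = {"img_1": 0.98, "img_2": 0.52, "img_3": 0.10,
                  "img_4": 0.45, "img_5": 0.75}

# Only the most ambiguous items are sent to the crowd ...
to_crowd = select_queries(predicted_prob, budget=2)
# ... while the rest keep the model's own prediction.
auto_labelled = {item: p >= 0.5 for item, p in predicted_prob.items()
                 if item not in to_crowd}
```

In a full loop the crowd's answers would be added to the training set, the model retrained, and the selection repeated until the budget is spent.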


IMBALANCED DATA

Traditional stratified random sampling approaches have been used to tackle the problem of imbalanced data.

However, if iterations of sub-sample generation and error metric measurement are needed, the process can take a long time.

Furthermore, conventional sampling methods cannot efficiently support sampling over a user-specified subset of the data, including value-based sampling.

Big data therefore calls for parallel data sampling.
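Stratified sampling lends itself to parallel execution because each stratum (class) can be sampled independently. The sketch below shows the idea with Python's standard `concurrent.futures` module; it is an illustration under assumptions, not a big data implementation, and the function names, the per-class sample size `k`, and the toy dataset are hypothetical.

```python
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def sample_stratum(job):
    """Draw up to `k` records from one stratum with its own seeded RNG."""
    label, records, k, seed = job
    rng = random.Random(seed)
    return label, rng.sample(records, min(k, len(records)))


def stratified_sample(data, k, seed=0):
    """Sample k records per class; strata are processed independently."""
    strata = defaultdict(list)
    for record, label in data:
        strata[label].append(record)
    # One job per stratum, each with a distinct seed for reproducibility.
    jobs = [(label, records, k, seed + i)
            for i, (label, records) in enumerate(sorted(strata.items()))]
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(sample_stratum, jobs))


# Imbalanced toy data: 95 "normal" records, 5 "fraud" records.
data = ([(i, "normal") for i in range(95)]
        + [(i, "fraud") for i in range(95, 100)])
balanced = stratified_sample(data, k=5)
```

Threads are used here only to keep the example portable; on a real platform each stratum would more likely be handled by a separate worker process or data partition.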


FUTURE RESEARCH

This presentation summarises the opportunities and challenges of machine learning on big data.

Big data opens new possibilities for inspiring revolutionary and novel ML technologies that solve many associated technological problems and generate real-world impact, while also posing multiple challenges for conventional ML in terms of scalability, adaptability, and usability.


These opportunities and challenges can be used to evaluate current research in this field.

Following the components of the MLBiD framework, we also highlight some open research issues in ML on big data, as shown in Table.


CONCLUSION

In conclusion, machine learning is needed to address the challenges posed by big data and to discover hidden patterns, information, and insights in big data, so that its potential can be turned into real value for business decision-making and scientific exploration.

The combination of machine learning and big data points to a promising future at a new frontier.


Contact Us

UNITED KINGDOM: +44-1143520021

INDIA: +91-4448137070

[email protected]