Data-Driven Analytics in the Industrial Internet or How To Destroy My Job

Data

The Industrial Internet is the third disruptive wave, after the Industrial and the Internet revolutions. It is transforming our industries, just as the Internet revolution transformed our commerce. In this new context, we face a combination of hyper-connected intelligent machines, interacting with other machines and people, and generating large amounts of data that need to be analyzed by descriptive, predictive, and prescriptive models. As a result, we see the resurgence of analytics as a key differentiator for creating new services, the emergence of cloud computing as an enabling technology for service delivery, and the growth of crowdsourcing as a new phenomenon in which people play critical roles in creating information and shaping decisions in a variety of problems. We explore the intersection of these three concepts from the perspective of a machine-learning researcher and show how his job and roles have evolved over time.

In the past, analytic model creation was an artisanal process, as models were handcrafted by experienced, knowledgeable model-builders. More recently, the use of meta-heuristics, such as evolutionary algorithms, has provided us with limited levels of automation in model building and maintenance. In the future, we expect data-driven analytic models to become a commodity. We envision having access to a large number of data-driven models, obtained by a combination of crowdsourcing, cloud-based evolutionary algorithms, outsourcing, in-house development, and legacy models. In this context, the critical issue will be model ensemble selection and fusion, rather than model generation.

We address this issue by proposing customized model ensembles on demand, inspired by Lazy Learning. In our approach, referred to as Lazy Meta-Learning, for a given query we find the most relevant models from a database of models, using their meta-information. After retrieving the relevant models, we select a subset of models with highly uncorrelated errors (unless diversity was injected in their design process). With these models we create an ensemble and use their meta-information for dynamic bias compensation and relevance weighting. The output is a weighted interpolation or extrapolation of the outputs of the model ensemble. The confidence interval around the output is reduced as we increase the number of uncorrelated models in the ensemble. This approach is agnostic with respect to the genesis of the models, making it scalable and suitable for a variety of applications.
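
The steps described above (retrieve relevant models, keep those with weakly correlated errors, compensate bias, weight by relevance) can be illustrated with a minimal sketch. This is not the speaker's implementation. It assumes that every model was evaluated on one shared validation set, that relevance is measured by the mean absolute error on the validation points nearest to the query, and that diversity is enforced with a simple correlation threshold. The function name `lazy_meta_predict` and the parameters `k`, `max_corr`, and `eps` are hypothetical.

```python
import numpy as np

def lazy_meta_predict(query, predictors, X_val, errors, k=5, max_corr=0.5, eps=1e-9):
    """Illustrative query-time ensemble (assumed design, not the talk's code).

    predictors: list of callables, each mapping a query vector to a scalar.
    X_val:      (n_val, d) validation inputs shared by all models (assumption).
    errors:     (n_models, n_val) residuals, prediction minus target.
    """
    query = np.asarray(query, dtype=float)
    X_val = np.asarray(X_val, dtype=float)
    errors = np.asarray(errors, dtype=float)

    # 1. Meta-information local to the query: the k nearest validation points.
    dist = np.linalg.norm(X_val - query, axis=1)
    nn = np.argsort(dist)[:k]
    local_bias = errors[:, nn].mean(axis=1)         # signed local error per model
    local_mae = np.abs(errors[:, nn]).mean(axis=1)  # local relevance proxy

    # 2. Greedy selection: visit models from most to least locally accurate and
    #    keep one only if its errors are weakly correlated with those already kept.
    corr = np.corrcoef(errors)
    selected = []
    for i in np.argsort(local_mae):
        if all(abs(corr[i, j]) < max_corr for j in selected):
            selected.append(int(i))

    # 3. Bias compensation and relevance weighting (inverse local error).
    weights = 1.0 / (local_mae[selected] + eps)
    weights /= weights.sum()
    outputs = np.array([predictors[i](query) - local_bias[i] for i in selected])
    return float(weights @ outputs), selected
```

Because selection and weighting happen only when a query arrives, the sketch never needs to know how each model was built, which mirrors the claim that the approach is agnostic to the genesis of the models.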

Speaker: Piero Bonissone, GE Global Research

Wednesday, 09/11/13

12:00 PM - 01:00 PM

Cost: Free

CITRIS at UC Berkeley

Sutardja Dai Hall
Banatao Auditorium
Berkeley, CA 94720