Kamanja: A New Open Source Real-Time System for Scoring Data Mining Models

This talk will start with a number of complex real-time data use cases, such as a) complex event processing, b) supporting the modeling work of a data mining department and c) developing enterprise applications on Apache big data systems. While Hadoop and big data have been around for a while, banks and healthcare companies tend not to be early IT adopters. What are some of the security concerns and other roadblocks in Apache big data systems for industries with such stringent requirements?
Data mining models can be trained in dozens of packages, but what can simplify the deployment of models regardless of where they were trained or with what algorithm? Predictive Model Markup Language (PMML) is an XML-based format with specific support for 15 families of data mining algorithms. Data mining software such as R, KNIME, Knowledge Studio and SAS Enterprise Miner are PMML producers. The new open-source product, Kamanja, is the first open-source, real-time PMML consumer (scoring system). One advantage of PMML systems is that they can reduce the time to deploy production models from 1-2 months to 1-2 days - a pain point that may be less obvious if your data mining exposure comes from competitions or MOOCs. Kamanja is free on GitHub and supports Kafka, MQ, Spark, HBase and Cassandra, among other things. As a new open-source product, Kamanja initially supports rules, trees and regression.
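To make the producer/consumer split concrete, here is a minimal sketch of what a PMML consumer does for the simplest case, a linear regression model. It is not Kamanja's implementation or API. The PMML fragment is hand-written for illustration, and the field names (`balance`, `age`, `risk`), the coefficients and the `score_regression` helper are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML fragment (not exported from any real tool).
PMML_DOC = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <DataDictionary numberOfFields="3">
    <DataField name="balance" optype="continuous" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="balance"/>
      <MiningField name="age"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="balance" coefficient="0.25"/>
      <NumericPredictor name="age" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# Namespace of the PMML 4.2 schema, needed for element lookups.
NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def score_regression(pmml_text, record):
    """Score one record against the first RegressionTable in the document.

    Illustrative helper: handles only NumericPredictor terms, which is a
    small subset of what the PMML regression element allows.
    """
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionModel/RegressionTable found")

    # Start from the intercept, then add coefficient * value^exponent per term.
    result = float(table.get("intercept", "0"))
    for predictor in table.findall("pmml:NumericPredictor", NS):
        name = predictor.get("name")
        coefficient = float(predictor.get("coefficient"))
        exponent = int(predictor.get("exponent", "1"))
        result += coefficient * float(record[name]) ** exponent
    return result


if __name__ == "__main__":
    print(score_regression(PMML_DOC, {"balance": 10, "age": 2}))
```

The scoring code never needs to know which package trained the model. It reads only the XML, which is why the same consumer can serve models exported from any PMML producer.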
I will cover the architecture of a sample application that uses multiple streams of open source data, for example to follow social network campaigns and track sentiment for a bank client and its competitors. Other real-time architectures cover credit card fraud detection. I will also give a brief demo of the social network analysis application, with text mining.
An overview of products in the space will include popular Apache big data systems, real-time systems and PMML systems.
LigaData will be sponsoring pizza and salads from The Garret restaurant in Campbell.
Speaker: Greg Makowski, LigaDATA
Voices Conference Room
Sunday, 07/26/15
Cost: Free
