Research into real-time analytics for big data opens opportunities to create new businesses and transform existing ones, because it combines elements that form the basis of a step change in computational performance.

We are currently developing data stream methods that, like large deep neural networks, scale to big data, but that also work well across all domains.

MOA is the most popular open source framework for data stream mining, with a very active and growing community. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation. All of these are suited to data streams, i.e. cases where one doesn’t have the opportunity to re-process the data multiple times.
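The single-pass constraint can be illustrated with a small test-then-train (prequential) loop, in which each example is used first to test the model and then to update it, and is never revisited. The sketch below is plain Python and does not use the MOA API; the names `MajorityClassLearner`, `prequential_accuracy` and `toy_stream` are hypothetical and exist only for this illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, Iterator, Optional, Tuple

Example = Tuple[tuple, Hashable]


class MajorityClassLearner:
    """Toy incremental learner (hypothetical, not a MOA class).

    It predicts the most frequent label seen so far and keeps only
    label counts, so memory use does not grow with the stream length.
    """

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def predict(self, x: tuple) -> Optional[Hashable]:
        if not self.counts:
            return None  # nothing learned yet
        # max() returns the first maximal key in insertion order on ties.
        return max(self.counts, key=self.counts.get)

    def learn_one(self, x: tuple, y: Hashable) -> None:
        self.counts[y] += 1


def prequential_accuracy(stream: Iterable[Example],
                         learner: MajorityClassLearner) -> float:
    """Test-then-train evaluation: each example is seen exactly once."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:  # test first ...
            correct += 1
        learner.learn_one(x, y)      # ... then train on the same example
        total += 1
    return correct / total if total else 0.0


def toy_stream(n: int) -> Iterator[Example]:
    """Synthetic stream (an assumption): label 'b' on every fourth item."""
    for i in range(n):
        yield (i,), ("b" if i % 4 == 0 else "a")


if __name__ == "__main__":
    print(prequential_accuracy(toy_stream(8), MajorityClassLearner()))
```

Because the stream is a generator, it cannot be replayed, which mirrors the setting described above: the learner has to update its state from each example as it arrives.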

People

  • Albert Bifet
  • Heitor Murilo Gomes
  • Bernhard Pfahringer
  • Geoff Holmes

Publications

Heitor Murilo Gomes, Albert Bifet, Jesse Read, Jean Paul Barddal, Fabrício Enembreck, Bernhard Pfahringer, Geoff Holmes, Talel Abdessalem: Adaptive random forests for evolving data stream classification. Machine Learning 106(9-10): 1469-1495 (2017)

Albert Bifet, Jiajin Zhang, Wei Fan, Cheng He, Jianfeng Zhang, Jianfeng Qian, Geoff Holmes, Bernhard Pfahringer: Extremely Fast Decision Tree Mining for Evolving Data Streams. KDD 2017: 1733-1742

Albert Bifet, Gianmarco De Francisci Morales, Jesse Read, Geoff Holmes, Bernhard Pfahringer: Efficient Online Evaluation of Big Data Stream Classifiers. KDD 2015: 59-68

Albert Bifet, Geoff Holmes, Bernhard Pfahringer, Ricard Gavaldà: Mining frequent closed graphs on evolving data streams. KDD 2011: 591-599

Hardy Kremer, Philipp Kranen, Timm Jansen, Thomas Seidl, Albert Bifet, Geoff Holmes, Bernhard Pfahringer: An effective evaluation measure for clustering on evolving data streams. KDD 2011: 868-876