Requirements for Statistical Analytics and Data Mining
OpenBudgets.eu: Fighting Corruption with Fiscal Transparency
Project Number: 645833
Start Date of Project: 01.05.2015
Duration: 30 months
Deliverable 2.3 Requirements for Statistical Analytics and Data Mining
Dissemination Level
Public
Due Date of Deliverable
Month 12, 30.04.2016
Actual Submission Date
01.06.2016
Work Package
WP 2, Data Collection and Mining
Task
T 2.3
Type
Report
Approval Status
Final
Version
1.0
Number of Pages
32
Filename
D2.3 - Requirements for Statistical Analytics and Data Mining.docx
Abstract: In this deliverable we present requirements for statistical analytics and data mining in the OpenBudgets.eu (OBEU) platform. Based on user needs assessed and reported in previous OBEU deliverables we formulate data mining and analytics tasks, discuss related tools and algorithms, and finally define corresponding requirements.
The information in this document reflects only the author's views and the European Community is not liable for any use that may be made of the information contained therein. The information in this document is provided "as is" without guarantee or warranty of any kind, express or implied, including but not limited to the fitness of the information for a particular purpose. The user thereof uses the information at his/her sole risk and liability.
Project funded by the European Union's Horizon 2020 Research and Innovation Programme (2014-2020)
D2.3 - v.1.0
History
Version | Date | Reason | Revised by
0.1 | 11.05.2016 | Version for internal review | Kleanthis Koupidis
1.0 | 31.05.2016 | Final version for submission | Christiane Engels
Author List
Organisation | Name | Contact Information
Fraunhofer | Christiane Engels | christiane.engels@iais.fraunhofer.de
OKFGR | Charalampos Bratsas | char.brat@
OKFGR | Kleanthis Koupidis | koupidis.okfgr@
UBONN | Fathoni Musyaffa | musyaffa@iai.uni-bonn.de
UBONN | Fabrizio Orlandi | orlandi@iai.uni-bonn.de
UEP | David Chudán | david.chudan@vse.cz
UEP | Jaroslav Kuchař | jaroslav.kuchar@vse.cz
UEP | Jindřich Mynarz | mynarzjindrich@
UEP | Václav Zeman | vaclav.zeman@vse.cz
Executive Summary
In this deliverable we present the requirements for statistical analytics and data mining in the OpenBudgets.eu (OBEU) project. We first describe the methodology used to collect these requirements. We then identify the previous OBEU deliverables that report data mining and analytics needs and summarize those needs. Next, we map the needs onto corresponding data mining and analytics tasks and discuss appropriate algorithms for the identified tasks. Based on the collected tasks, we describe related tools. Finally, we formulate the list of requirements for data mining and statistical analytics, assigning a priority to each requirement.
Abbreviations and Acronyms
CSV | Comma-Separated Values
DCV | Data Cube Vocabulary
RDF | Resource Description Framework
OBEU | OpenBudgets.eu
Table of Contents
1 INTRODUCTION
2 PRELIMINARIES
2.1 SEMANTIC MODEL
2.2 METHODOLOGY
3 DATA MINING AND ANALYTICS NEEDS AND TASKS
3.1 SOURCES OF COLLECTED DATA MINING AND ANALYTICS NEEDS
3.2 COLLECTED DATA MINING AND ANALYTICS NEEDS
3.2.1 Analysis of the required functionality of OpenBudgets.eu (D4.2)
3.2.2 User Requirements Reports - First Cycle (D5.1)
3.2.3 Needs Analysis Report (D6.2)
3.2.4 Assessment Report (D7.1)
3.2.5 Stakeholder identification and outreach plan (D8.3)
3.2.6 Additional Needs
3.3 DATA MINING AND ANALYTICS TASKS
3.4 SUMMARY OF DATA MINING AND ANALYTICS TASKS
3.5 DISCUSSION OF IDENTIFIED DATA MINING AND ANALYTICS TASKS
3.5.1 Similarity Learning
3.5.2 Rule/Pattern Mining
3.5.3 Outlier/Anomaly Detection
3.5.4 Clustering
3.5.5 Graph/Network Analysis
3.5.6 Pattern Matching
3.5.7 Descriptive Statistics
3.5.8 Comparative Analysis
3.5.9 Time Series Analysis
4 TOOLS
4.1 RAPIDMINER
4.2 WEKA
4.3 R
4.4 PYTHON
4.5 SPARQL
4.6 OPENSPENDING
4.7 EASYMINER
5 REQUIREMENTS FOR STATISTICAL ANALYTICS AND DATA MINING
5.1 GENERAL FUNCTIONAL REQUIREMENTS