Big Data: Technologies, Methods, Concepts

Lecturer:

Prof. Dr. Andreas Harth

Details:

Lectures with exercises
4 SWS (weekly contact hours), ECTS course, 5 ECTS credits
Language: English
Time and location: see campo.

Prerequisites

No specific prerequisites are required. Some basic knowledge of databases and web technologies is helpful.

Contents

Big Data refers to datasets that are too large or too complex to handle with traditional data management and processing systems. The course presents an overview of methods and technologies for storing and processing Big Data, including:

  • Introduction

  • Distributed Systems and Cloud Computing

  • Big Data Processing Systems, including Map/Reduce

  • Theory and Practice of NoSQL Systems

  • Knowledge Graphs

  • Data Mapping and Integration

The course concludes with an outlook on further topics, including data mining and machine learning.

Course Objectives:

The course teaches the fundamentals of Big Data, including real-world use cases, as well as current technical challenges and opportunities. Students will learn about the foundational algorithms used in large-scale distributed systems. Further, students will learn how to use available technologies to store, process and integrate Big Data on cloud infrastructures and to perform data analytics tasks. The hands-on sessions include setting up a cloud environment and querying and visualizing a large dataset.

Learning Targets/Skills:

  • Explain the V’s of Big Data

  • Outline the distributed architectures used in Map/Reduce systems

  • Explain Brewer’s CAP theorem

  • Write algorithms with map and reduce functions

  • Outline the use of similarity metrics for data mapping

  • Explain steps involved in large-scale data integration and data analytics
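To illustrate the target "Write algorithms with map and reduce functions", here is a minimal single-machine sketch of word counting in Python. It is not taken from the course material: the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical, and the grouping step only imitates what a real Map/Reduce framework would do across many machines.

```python
from collections import defaultdict

def map_words(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group values by key, imitating the framework's shuffle phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(word, ones):
    """Reduce step: sum all partial counts emitted for one word."""
    return word, sum(ones)

def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = [pair for doc in documents for pair in map_words(doc)]
    return dict(reduce_counts(w, ones) for w, ones in shuffle(pairs).items())

counts = word_count(["big data big systems", "data integration"])
print(counts)
```

In a distributed setting, the map and reduce functions would stay the same in spirit, while the framework would take over partitioning the input, shuffling the intermediate pairs and handling failures.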

Literature:

Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, http://mmds.org/.

AnHai Doan, Alon Halevy, Zachary Ives, Principles of Data Integration, Morgan Kaufmann, 2012.

Join the campo instance: https://www.campo.fau.de/qisserver/pages/startFlow.xhtml?_flowId=searchCourseNonStaff-flow&_flowExecutionKey=e2s6