A transdisciplinary perspective on Big Data (PhD Course)

Mission and Goals

Big data is everywhere, and researchers from all disciplines are addressing the topic from their own perspective, creating excellent vertical experiments but often losing the wider picture. This course aims to reconstruct that picture by critically analysing how each discipline contributes to practice and to the academic debate.

Calendar

Lecture | Lecturer | Date | Room
Part 1: Grand challenges of Big Data
Introduction [slides] | Della Valle | 21/01/2019 14:00-15:00 | Sala Seminari – Ed. 20 piano terra
Opportunities for Spatial research. Remapping and reconceptualising the urban [slides] | Fedeli | 21/01/2019 15:00-18:00 | Sala Seminari – Ed. 20 piano terra
Opportunities for Economics and Finance [slides part 1, part 2, part 3] | Pammolli | 22/01/2019 10:00-13:00 | Sala Conferenze Emilio Gatti – Ed. 20 – Piano Terra
The Big Data and Analytics Cycle for Decision Makers [slides] | Arnaboldi | 24/01/2019 9:30-10:30 | Sala Conferenze Emilio Gatti – Ed. 20 – Piano Terra
Presentation of the group work (Students define a transdisciplinary research objective, highlighting their contribution to practice and academic debate) | Arnaboldi | 24/01/2019 10:30-12:00 | Sala Conferenze Emilio Gatti – Ed. 20 – Piano Terra
Part 2: Making sense of Big Data
Introduction to data analytics with the R language [slides1, slides2] | Vantini | 29/01/2019 14:00-17:00 | Sala Seminari – Ed. 20 piano terra
The role of visualization | Ciuccarelli | 11/02/2019 14:00-17:00 | BIO1 Ed. 21 – I Piano – room 024
Knowledge discovery and Data Mining [slides] | Brambilla | 5/2/2019 10:00-13:00 | Sala Seminari – Ed. 20 piano terra
Part 3: Big Data technologies
Models: from relational to non-relational models (NoSQL, …) [slides] | Brambilla | 7/2/2019 10:00-13:00 | Aula PT1, Ed. 20 – Piano Terra
Volume: Map Reduce basics from Hadoop to Apache Spark and Flink [slides] | Ardagna | 7/2/2019 14:00-17:00 | Room Alessandra Alario
Velocity: Information flow processing principle, approaches and tools [slides] | Della Valle | 8/2/2019 9:30-12:30 | Sala Seminari – Ed. 20 piano terra
Discussion of domain applications and students’ transdisciplinary assignments | Ardagna/Brambilla/Della Valle | ../3/2019 |
Part 4: students’ reporting
To be detailed | | 16/5/2019 10:00-13:00 | Sala Conferenze Emilio Gatti – Ed. 20 – Piano Terra

Team

Proposer and Coordinator:

  • DEIB: Emanuele Della Valle

Lecturers:

  • DEIB: Danilo Ardagna, Marco Brambilla, and Emanuele Della Valle
  • DIG: Michela Arnaboldi and Fabio Pammolli
  • DMAT: Piercesare Secchi and Simone Vantini
  • DESIGN: Paolo Ciuccarelli
  • DASTU: Valeria Fedeli

Teaching organization

The course is divided into 3 teaching parts, followed by a final part devoted to students’ reporting. The 1st provides a transversal view of the grand challenges to which big data can contribute and clarifies what big data is. The 2nd presents the main paradigms and techniques for data analytics. The 3rd teaches how to tame volume, variety, velocity, and veracity in practice.

Part 1: Grand challenges of Big Data

  • Opportunities for social, environmental and economic problems.
  • Problem of current research: lack of transversal view.
  • Students define a transdisciplinary research objective, highlighting their contribution to practice and academic debate. This initial work is the starting point for the assignment and the common thread (fil rouge) running through the course.

Part 2: Making sense of Big Data

  • Introduction to data analytics with the R language
  • Knowledge discovery and Data Mining
  • The role of visualization
  • Discussion of domain applications and students’ transdisciplinary assignments

Part 3: Taming volume and velocity, without forgetting variety and veracity, with Big Data technologies

  • Scaling computation and storage horizontally
  • Map Reduce basics from Hadoop to Apache Spark and Flink
  • Information flow processing principle, approaches and tools
  • Hands-on Apache Spark to tame volume and velocity in data analytics
  • Discussion of domain applications and students’ transdisciplinary assignments
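
As an illustration of the Map Reduce topic listed above, the following is a minimal sketch of the map, shuffle, and reduce steps applied to a word count. It runs on a single machine in plain Python and is not taken from the course material; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for this example. Frameworks such as Hadoop or Apache Spark distribute the same three steps across many machines.

```python
from collections import defaultdict


def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each group of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Chain the three steps; a real framework would run them in parallel."""
    return reduce_phase(shuffle(map_phase(documents)))


if __name__ == "__main__":
    print(word_count(["big data is everywhere", "Big Data technologies"]))
```

The map and reduce functions never share state, which is what allows a framework to run them in parallel on separate partitions of the data.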

Part 4: students’ reporting (16/5/2019)

  • Group: Bridges Maintenance Allocation Tool And Road Traffic Management System [pdf]
    • Alireza Entezami
    • Bobrovskiy Vadim
    • Castaldo Anna Giulia
    • Tognoli Marco
    • Milan Dragoljevic
  • Group: Exploring Big Data potentialities in the field of Asset Management for maintenance policies improvement [pdf]
    • Antonello Federico
    • Casanova Luca
    • Rota Francesco
    • Scrivano Salvatore
    • Adalberto Polenghi
  • Group: I Heart Original [pdf]
    • Cicci Ludovica
    • Piersanti Roberto
    • Salvador Matteo
  • Group: I Heart Reloaded
    • Di Gregorio Simone
    • Fresca Stefania
    • Pozzi Silvia
    • Stella Simone
  • Group: Analysis On The Operating Conditions Of An Energy Storage System: Laying The Basis For An Efficiency Prediction Model [pdf]
    • Foschi Jacopo
    • Meraldi Lorenzo
    • Polinelli Francesco Niccolò
  • Group: Clustering of hydrogenases [pdf]
    • Alessio Domenico Leto
    • Manuel Ruiz

Evaluation

Students will be required to build a research case: identifying business value, data, and methods; using the tools to analyze and visualize data; critically analyzing pitfalls; and highlighting their contributions.

As a general guideline your presentation should cover the following topics:

  • who you are (name, surname, skills, time spent)
  • the problem (why is it a Big Data problem: is it about volume? velocity? variety? veracity? a mix of them?)
  • the (partial) solution (which Big Data tools/methods illustrated during the course did you use? Why?)
  • lessons learned (what went as expected? what did not? why?)

Target a duration of 25 minutes for your presentation and 5 minutes for the question-and-answer session.

Please remember that the mark is based both on your presentation and on the questions you raise about the other teams’ presentations.