2nd International Workshop on Data Platform Design, Management, and Optimization
Co-located with EDBT/ICDT 2023
Special Issue invitation to Information Systems Frontiers (IF 5.261) for best papers

What DataPlat is about

Information systems have evolved into complex data platforms supporting end-to-end data-intensive needs, such as storage, computation, and analysis of data with heterogeneous structures. However, data scientists and architects still need smart and comprehensive support to govern data throughout its whole life cycle.

Supporting data management and governance requires the collection of metadata capturing the distinguishing features of the data; this enables advanced functionalities spanning from data research and profiling to provenance control, orchestration of data pipelines, incremental data integration, efficient querying, automated analytics, and homogeneous data access. The challenges begin with metadata management in terms of the modeling effort, storage, complexity of retrieval activities, and effective exploitation. While coping with big-data issues, the enabled functionalities must: (i) handle the heterogeneity of storage and computation engines (including DBMSs supporting multiple data models and cloud storage systems with limited control and predictability), (ii) meet suitability requirements for less-skilled users, and (iii) limit the costs of pay-as-you-go resources.

This workshop calls for innovative solutions, from both researchers and practitioners, that address the aforementioned challenges. We welcome papers that contribute to the advancement of data platforms in engineering, optimizing, and simplifying the different aspects of data and metadata management and fruition.

Topics

The scope of the workshop includes, but is not limited to, the following topics.

  • Metadata modeling for data platforms
  • Techniques for metadata discovery and management
  • Advanced search, exploration, and profiling of data and metadata
  • Semantic enrichment of metadata
  • Data governance
  • Data wrangling
  • Provenance and data versioning control
  • Orchestration and optimization of data transformation pipelines
  • Data integration and querying in multimodel databases, multistores, polystores
  • Query processing, optimization, and performance
  • Entity resolution and data fusion
  • Big data management and querying
  • Artificial Intelligence solutions for data platforms
  • AutoML techniques
  • Cloud computing and architectures
  • Advanced architectures for data lakes and data platforms
  • Analysis, design, implementation, and testing of data platforms
  • Case studies and project experiences

Submission

[NEW 28.03.23] Papers are available here.

[NEW 08.01.23] Submission deadline has been extended to January 22, 2023.

[NEW 22.11.22] Authors of the best papers will be invited to submit an extended version to a Special Issue with Springer's Information Systems Frontiers journal (IF: 5.261).

Submissions should present original results and substantial new work not currently under review or published elsewhere. DataPlat 2023 will follow a single-blind review process to evaluate submissions on the basis of originality, relevance, quality, and technical contribution. The following submissions are accepted:

  • Regular and short research papers (up to 10 and 5 pages, respectively)
  • Vision papers (up to 5 pages)
  • Application papers (up to 5 pages)

Papers must be submitted via Microsoft CMT in PDF.

DataPlat 2023 Submission Site on Microsoft CMT

Accepted papers will be published online at CEUR. Papers should be in two-column style (including all material) and must be formatted with the same rules as all EDBT Workshop papers using the CEUR-ART style (templates for LaTeX and DOCX are available here and on Overleaf). Please make sure to enable the two-column style in the template. All accepted workshop papers will be published in the CEUR-WS series, in a joint volume with all EDBT 2023 workshops.

All accepted papers are expected to be presented at the workshop, and at least one author is required to register.

Important dates

Paper submission: January 22, 2023 (extended from January 8, 2023)

Author notification: February 10, 2023 (previously February 5, 2023)

Camera ready: February 19, 2023

Workshop date: March 28, 2023

Committees

Program Chairs & Organizers

Matteo Francia

DISI - University of Bologna

Enrico Gallinucci

DISI - University of Bologna

Patrick Marcel

LIFAT - Université de Tours

Veronika Peralta

LIFAT - Université de Tours

Program Committee

  • Duncan Ruiz - Escola Politécnica - PUCRS, Brazil
  • Franck Ravat - Université Paul Sabatier, France
  • Jérôme Darmont - University of Lyon, France
  • Sana Sellami - Aix Marseille University, France
  • Sandra Sampaio - University of Manchester, UK
  • Sandro Bimonte - INRAE Clermont Ferrand, France
  • Sergi Nadal - Universitat Politècnica de Catalunya, Spain
  • Shaleen Deep - Microsoft
  • Theodoros Toliopoulos - Aristotle University of Thessaloniki, Greece

Keynote speaker

Angela Bonifati

Lyon 1 University & CNRS Liris, France

The Quest for Schemas in Graph Databases

Property graphs are a widespread data model for representing interconnected multi-labeled data enhanced with properties as key/value pairs. These highly expressive graphs are used in a wide range of domains, such as social and transportation networks, biological networks, finance, cybersecurity, logistics and planning, to name a few. Property graphs are the building blocks of future graph ecosystems, in which OLTP and OLAP processes are intertwined with complex advanced processes, such as learning, scientific computing and business intelligence. While property graphs are currently used in a variety of graph databases, a rather fragmented landscape emerges in terms of the supported query and schema languages. In particular, the coverage of schema and constraints is limited if not completely lacking in these systems. In this talk, I will present recent advances in terms of schemas and constraints for property graphs, as part of our work within the LDBC community groups. I will also focus on graph schema discovery and constraint satisfaction following these proposals for property graph schema and constraints. Finally, I will pinpoint future directions of research in this new exciting area of data management.

Workshop program

The schedule is in EEST (UTC+3) - Athens Time

time | title | speaker / authors
08:30 - 09:00 | Conference registration
09:00 - 10:30 | [Keynote] Angela Bonifati - Shared with DOLAP
10:30 - 11:00 | Coffee Break
11:00 - 11:05 | Opening of the DataPlat - Comonos Workshop
11:05 - 11:20 | MongoDB Data Versioning Performance: local versus Atlas | Ela Pustulka and Lucia de Espona Pernas
11:20 - 11:45 | Easy-to-use interfaces for supporting the user in the semantic annotation of web tables | Sara Bonfitto, Paolo Perlasca, and Marco Mesiti
11:45 - 12:10 | Toulouse: Learning Join Order Optimization Policies for Rule-based Data Engines | Antonios Karvelas, Alkis Simitsis, Yannis E Foufoulas, and Yannis Ioannidis
12:10 - 12:35 | Mining Data Wrangling Workflows for Patterns, Reuse and Optimisation Opportunities | Abdullah Kh Almasaud, Sandra Sampaio, and Pedro Sampaio
12:35 - 12:50 | Prediction of user-brand associations based on sentiment analysis | Mariella Bonomo, Simona Ester Rombo, and Filippo Rotolo
12:50 - 14:30 | Lunch Break
14:30 - 14:45 | Towards a Multi-Model Approach to Support User-Driven Extensibility in Data Warehouses: Agro-ecology Case Study | Sandro Bimonte, Fagnine Alassane Coulibaly, Stefano Rizzi, Sylvie Malembic-Maher, and Frederic Fabre
14:45 - 15:10 | HEALER: A Data Lake Architecture for Healthcare | Carlo Manco, Tommaso Dolci, Fabio Azzalini, Enrico Barbierato, Marco Gribaudo, and Letizia Tanca
15:10 - 15:25 | Data migration in column family database evolution using MDE | Pablo Suárez-Otero, Michael Mior, María José Suárez-Cabal, and Javier Tuya
15:25 - 15:50 | Propagating schema changes to code: An approach based on a unified data model | Alberto Hernández Chillón, Jesus Garcia-Molina, José Ramón Hoyos, and María José Ortín Ibáñez
15:50 - 16:15 | Effective queries for mega-analysis in cognitive neuroscience | Mateusz Pawlik, Anna Ravenschlag, Monique Denissen, Bianca Löhnert, Nicole Himmelstoß, and Florian Hutzler
16:15 - 16:20 | Farewell

Presentations given remotely can be found on our YouTube channel.