PROFILES & DATA:SEARCH

News

March 5. Tentative workshop program updated.
February 14. Accepted papers announced.
- Emilia Kacprzak, Laura Koesten, Jeni Tennison and Elena Simperl Characterising Dataset Search Queries (Short paper)
- Mohamed Ben Ellefi, Odile Papini, Djamal Merad, Jean-Marc Boi, Jean-Philip Royer, Jérôme Pasquet, Jean-Christophe Sourisseau, Filipe Castro, Mohamad Motasem Nawaf and Pierre Drap Cultural Heritage Resources Profiling: Ontology-based Approach
- Semih Yumuşak, Andreas Kamilaris, Erdogan Dogdu, Halife Kodaz, Elif Uysal and Riza Emre Aras A Discovery and Analysis Engine for Semantic Web
- Sean Soderman, Anusha Kola, Maxim Podkorytov, Michel Geyer and Michael Gubanov Hybrid.AI: A Learning Search Engine for Large-scale Structured Data
- Zhiyu Chen, Haiyan Jia, Jeff Heflin and Brian Davison Generating Schema Labels through Dataset Content Analysis
- Sebastian Neumaier, Lőrinc Thurnay, Thomas J. Lampoltshammer and Tomáš Knap Search, Filter, Fork, and Link Open Data - The ADEQUATe platform: data- and community-driven quality improvements (Short paper)
January 24. The submission deadline is extended to February 6.
January 10. Keynote speakers announced:

Maarten de Rijke,
University of Amsterdam

Objectives and Goals

The web of data has seen tremendous growth recently. New forms of structured data have emerged in the form of web markup, such as schema.org, and a large amount of data in web tables. Considering these rich, heterogeneous and evolving data sources which cover a wide variety of domains, the exploitation of web data becomes increasingly important in the context of various applications, including (federated) search, question answering and fact verification.

The objective of this workshop is to bring together researchers and practitioners interested in the development of data search techniques, data profiling, and dataset retrieval on the web. This includes looking at the specifics of data-centric information seeking behaviour, understanding interaction challenges in data search on the web, and analysing the cognitive processes involved in the consumption of structured data by users. At the same time, we aim to discuss technologies that address data search, including semantics and information retrieval for web data (ranking algorithms and indexing), in particular in the context of decentralised and distributed systems such as the web. We are interested in approaches to analyse, characterise and discover data sources. We want to facilitate a discussion around data search across formats and domain-specific applications.

We envision the workshop as a forum for researchers and practitioners to come together, discuss common challenges and identify synergies for joint initiatives. We welcome contributions describing technical approaches, as well as those related to Human-Computer Interaction research in data discovery, profiling and retrieval.

Topics and Themes

PROFILES & DATA:SEARCH ’18 seeks application-oriented papers, as well as more theoretical papers and position papers. The workshop proposes a multidisciplinary discussion on the following themes, with a focus on RDF, CSV, JSON and other structured and semi-structured datasets:

Data Search

Dataset retrieval
Search results presentation for datasets
Semantic dataset search
Evaluation of dataset search tools and algorithms
Decentralised and distributed architectures and algorithms in data search
Fusing, cleaning, ranking and refining search results
Approaches to personalisation in dataset search
Scalability & performance of distributed data queries
Query routing taking into account relevance, quality and profiles of distributed datasets

Data Profiling

Dataset profile representation (vocabularies, schemas)
Profiling and assessment of novel forms of entity-centric Web data
Data summarisation
Data quality analysis for query routing
Novel applications using dataset profiles
Topic profiling of datasets
Dataset indexing and profiling approaches

Human Data Interaction

Information seeking behaviour for data
User modelling for data search
Analysing behavioural traces during data search
Usability of data portals and data discovery tools
Data search result presentation to support sense making

We are interested in contributions using a variety of methods. These can include, for example, user studies, lab experiments and system-based evaluation, as well as experiments using gamification and crowdsourcing.

Submission Guidelines

We welcome the following types of contributions:

We encourage full papers (8 pages), short papers (4 pages) as well as position papers (2 pages). All submissions must be written in English and must be formatted according to the ACM format. The proceedings of the workshop will be included in the companion proceedings of The WebConf2018. Each submission will be reviewed by at least 2 members of the PC. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop. Please submit your contributions electronically in PDF format via the Easychair system: https://easychair.org/conferences/?conf=profiles-datasearch2018

We follow a single-blind review process.

Important Dates

Workshop paper submissions due: 6 February 2018 (extended from 24 January)

Workshop paper notifications sent: 14 February 2018

Camera-ready copies due: 01 March 2018

PROFILES & DATA:SEARCH Workshop: 24 April 2018

Tentative Schedule

09:00 – 09:10	Introduction & welcome
09:10 – 09:20	Opening
09:20 – 10:20	Keynote talk Maarten de Rijke Learning to Search for Datasets
Abstract Over the years, search engines have developed to return a broad range of retrievable items, from documents to people, locations, and products. Research datasets are being turned into retrievable items too. This raises a number of interesting challenges, ranging from the user end (What do users want from datasets?) to increasing the retrievability of datasets (What kind of contextual information is available to enrich datasets so as to make them more easily retrievable?) to optimizing rankers for datasets in the absence of large volumes of interaction data (How can we train learning-to-rank algorithms for datasets in weakly supervised ways?). In the talk I will survey recent progress in these three areas and identify important open problems.
10:20 – 11:00	Break
11:00 – 12:20	Paper presentations
- Zhiyu Chen, Haiyan Jia, Jeff Heflin and Brian Davison Generating Schema Labels through Dataset Content Analysis (11:00 - 11:20)
- Semih Yumuşak, Andreas Kamilaris, Erdogan Dogdu, Halife Kodaz, Elif Uysal and Riza Emre Aras A Discovery and Analysis Engine for Semantic Web (11:20 - 11:40)
- Sean Soderman, Anusha Kola, Maxim Podkorytov, Michel Geyer and Michael Gubanov Hybrid.AI: A Learning Search Engine for Large-scale Structured Data (11:40 - 12:00)
- Emilia Kacprzak, Laura Koesten, Jeni Tennison and Elena Simperl Characterising Dataset Search Queries (12:00 - 12:15) (Short paper)
12:20 – 13:40	Lunch break
13:40 – 14:40	Keynote talk Aidan Hogan Profiling Graphs: Order from Chaos
Abstract Graphs are being increasingly adopted as a flexible data model in scenarios (e.g., Google’s Knowledge Graph, Facebook’s Graph API, Wikidata, etc.) where multiple editors are involved in content creation, where the schema is ever changing, where data are incomplete, where the connectivity of resources plays a key role—scenarios where relational models traditionally struggle. But with this flexibility comes a conceptual cost: it can be difficult to summarise and understand, at a high level, the content that a given graph contains. Hence profiling graphs becomes of increasing importance to extract order, a posteriori, from the chaotic processes by which such graphs are often generated. This talk will motivate the use of graphs as a data model, abstract recent trends in graph data management, and then turn to the issue of profiling graphs: what are the goals of such profiling, the principles by which graphs can be summarised, the main techniques by which this can/could be achieved? The talk will emphasise the importance of profiling graphs while highlighting a variety of open research questions yet to be tackled.
14:40 – 15:00	Paper presentation Mohamed Ben Ellefi, Odile Papini, Djamal Merad, Jean-Marc Boi, Jean-Philip Royer, Jérôme Pasquet, Jean-Christophe Sourisseau, Filipe Castro, Mohamad Motasem Nawaf and Pierre Drap Cultural Heritage Resources Profiling: Ontology-based Approach (14:40-15:00)
15:00 – 15:40	Coffee break
15:40 – 15:55	Paper presentation Sebastian Neumaier, Lőrinc Thurnay, Thomas J. Lampoltshammer and Tomáš Knap Search, Filter, Fork, and Link Open Data - The ADEQUATe platform: data- and community-driven quality improvements (15:40-15:55) (Short paper)
15:55 – 16:50	Panel discussion with Paul Groth, Aidan Hogan, Jeni Tennison, Stefan Dietze and Natasha Noy
16:50 – 17:00	Summary of discussions, wrap up

Chairs and Organizers

Program Committee

Charlie Abela (University of Malta)
Alessandro Adamou (The Insight Centre, Ireland)
Marco Antonio Casanova (Pontifical Catholic University of Rio de Janeiro, Brazil)
Philipp Cimiano (Bielefeld University, Germany)
Enrico Daga (The Open University, UK)
Ruslan Fayzrakhmanov (University of Oxford, UK)
Max Froumentin (Government Digital Service, UK)
Simon Gottschalk (L3S Research Center, Germany)
Michael Gubanov (University of Texas at San Antonio, USA)
Peter Haase (metaphacts, Germany)
Tom Heath (Arup, UK)
Luis-Daniel Ibáñez (University of Southampton, UK)
Emilia Kacprzak (The Open Data Institute, UK)
Eva Méndez (University Carlos III of Madrid, Spain)
Stefano Modafferi (University of Southampton, UK)
Dmitry Mouromtsev (ITMO University, Russia)
Axel-Cyrille Ngonga Ngomo (University of Paderborn, Germany)
Natalya Noy (Google, USA)
Andreas Nuernberger (Otto-von-Guericke University of Magdeburg, Germany)
Liudmila Ostroumova Prokhorenkova (Yandex, Russia)
Bernardo Pereira Nunes (Pontifical Catholic University of Rio de Janeiro, Brazil)
Axel Polleres (Vienna University of Economics and Business - WU, Austria)
Muhammad Saleem (University of Leipzig, Germany)
Emanuel Sallinger (University of Oxford, UK)
Arno Scharl (Modul University, Austria)
Nicolas Tempelmeier (L3S Research Center, Germany)
Thanassis Tiropanis (University of Southampton, UK)
Konstantin Todorov (LIRMM / University of Montpellier, France)
Nicolas Torzec (Yahoo, USA)
Raquel Trillo-Lado (Universidad de Zaragoza, Spain)
Jürgen Umbrich (Vienna University of Economics and Business - WU, Austria)
Ran Yu (L3S Research Center, Germany)

Organization Committee

Laura Koesten, Open Data Institute and University of Southampton.

Dr. Elena Demidova, L3S Research Center (Hannover, Germany).

Dr. Vadim Savenkov, Vienna University of Economics and Business.

Dr. John Breslin, National University of Ireland Galway.

Prof. Oscar Corcho, Universidad Politécnica de Madrid.

Dr. Stefan Dietze, L3S Research Center (Hannover, Germany).

Prof. Elena Simperl, University of Southampton.