International Workshop on Profiling and Searching Data on the Web

April 24, 2018, Lyon, France. Co-located with The Web Conference 2018


Maarten de Rijke, University of Amsterdam
Aidan Hogan, Universidad de Chile

Objectives and Goals

The web of data has seen tremendous growth recently. New forms of structured data have emerged in the form of web markup, such as schema.org, and a large amount of data in web tables. Considering these rich, heterogeneous and evolving data sources which cover a wide variety of domains, the exploitation of web data becomes increasingly important in the context of various applications, including (federated) search, question answering and fact verification.

The objective of this workshop is to bring together researchers and practitioners interested in the development of data search techniques, data profiling, and dataset retrieval on the web. This includes looking at the specifics of data-centric information seeking behaviour, understanding interaction challenges in data search on the web, and analysing the cognitive processes involved in the consumption of structured data by users. At the same time, we aim to discuss technologies that address data search, including semantics and information retrieval for web data (ranking algorithms and indexing), particularly in the context of decentralised and distributed systems such as the web. We are interested in approaches to analyse, characterise and discover data sources. We want to facilitate a discussion around data search across formats and domain-specific applications.

We envision the workshop as a forum for researchers and practitioners to come together, discuss common challenges and identify synergies for joint initiatives. We welcome contributions describing technical approaches, as well as those related to Human-Computer Interaction research in data discovery, profiling and retrieval.

Topics and Themes

PROFILES & DATA:SEARCH ’18 seeks application-oriented papers, as well as more theoretical papers and position papers. The workshop proposes a multidisciplinary discussion on the following themes, with a focus on RDF, CSV, JSON and other structured and semi-structured datasets:

Data Profiling

Human Data Interaction

We are interested in contributions using a variety of methods. These can include, for example, user studies, lab experiments and system-based evaluation, as well as experiments using gamification and crowdsourcing.

Submission Guidelines

We welcome the following types of contributions:

We encourage full papers (8 pages), short papers (4 pages) and position papers (2 pages). All submissions must be written in English and formatted according to the ACM format. The proceedings of the workshop will be included in the companion proceedings of The Web Conference 2018. Each submission will be reviewed by at least two members of the programme committee. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop. Please submit your contributions electronically in PDF format via the EasyChair system: https://easychair.org/conferences/?conf=profiles-datasearch2018

We follow a single-blind review process with at least two reviewers per paper.

Important Dates

Workshop paper submissions due: 6 February 2018 (extended from 24 January 2018)

Workshop paper notifications sent: 14 February 2018

Camera-ready copies due: 01 March 2018

PROFILES & DATA:SEARCH Workshop: 24 April 2018

Tentative Schedule

09:00 – 09:15 Introduction by organisers
09:15 – 10:00 Keynote: Maarten de Rijke
10:00 – 10:45 Keynote: Aidan Hogan
10:45 – 11:00 Coffee break
11:00 – 12:30 Presentation of papers, questions, discussions
12:30 – 14:00 Lunch break
14:00 – 15:30 Round tables
We expect 2 to 4 round tables around topics based on accepted submissions and statements of interest by the workshop participants.
15:30 – 16:00 Coffee break
16:00 – 17:00 Summary of discussions from the round tables, call to action for future activities, wrap up of the workshop
19:00 – 22:00 Social event

Chairs and Organizers

Program Committee


Laura Koesten, laura.koesten@theodi.org, is a Marie Skłodowska-Curie fellow, doing her PhD at the Open Data Institute and at the University of Southampton in the UK. She is part of WDAqua, a European Union Horizon 2020 initiative to advance the state of the art in Question Answering. Her research interests are Human Computer Interaction, Interactive Information Retrieval with a focus on dataset retrieval, Open Data and Semantic Interfaces. In her PhD she is looking at ways to improve Human Data Interaction in IIR systems. She publishes at CHI and has a background in Human Factors, with an MSc from Loughborough University.

Dr. Elena Demidova, demidova@L3S.de, is a Senior Researcher at the L3S Research Center (Hannover, Germany). Her main research interests are in Web, Semantic Web, cross-lingual data analytics, Web Data and Database Usability. Elena coordinates the Data4UrbanMobility project and has held leading roles in EU projects such as WDAqua ITN, ARCOMEM IP and KEYSTONE COST Action. Her work has been published in major conferences and journals, and she has served as a reviewer and committee member for scientific events and publications. Elena has been the main organizer of the PROFILES workshop series since 2014 and has extensive experience in organizing international scientific and project events.

Dr. Vadim Savenkov, vadim.savenkov@wu.ac.at, is a post-doc researcher at the Vienna University of Economics and Business (WU). He completed his PhD on the foundations of information integration at the Vienna University of Technology (2012). His research spans various topics of data integration on the Web, including querying and updates in the Semantic Web context. He runs the Austrian research project CommuniData.at, which aims at increasing the accessibility of Open Data for end users.

Dr. John Breslin, john.breslin@insight-centre.org, is a Senior Lecturer in Electronic Engineering at the National University of Ireland Galway. He is Director of TechInnovate at NUI Galway. He is a Funded Investigator at the FutureMilk Centre and the Confirm Centre for Smart Manufacturing, and is Co-Principal Investigator at the Insight Centre for Data Analytics. He co-created the SIOC framework, implemented in hundreds of applications (by Yahoo, Boeing, Vodafone, etc.) on over 25,000 websites. He has been budget holder for over €4 million of funding, as PI/co-PI on projects totalling nearly €30 million received for NUI Galway. He has written over 180 peer-reviewed academic publications (h-index of 36, >5000 citations, with best paper awards from SEMANTICS, ICEGOV, ESWC, IEEE PELS), and co-authored the books “The Social Semantic Web” and “Social Semantic Web Mining”. John is co-founder of boards.ie (Ireland’s largest social media website), adverts.ie (classified ads website), and StreamGlider (real-time streaming newsreader app). He has won two IIA Net Visionary awards. He is an advisor to AYLIEN, BuilderEngine, CrowdGather, and Pocket Anatomy. He is co-founder of Startup Galway and the Galway City Innovation District. He also serves on the boards of WestBIC and the American Council on Exercise.

Prof. Oscar Corcho, ocorcho@fi.upm.es, is a professor at the Departamento de Inteligencia Artificial (Facultad de Informática, Universidad Politécnica de Madrid), where he co-leads the Ontology Engineering Group. His research activities are focused on Open Science, Open Data, Semantic Web and Ontological Engineering. In these areas, he has participated in a number of national and European research projects. Previously, he worked as a Marie Curie research fellow at the University of Manchester, and was a research manager at iSOCO. He holds a degree in Computer Science, an MSc in Software Engineering and a PhD in Computational Science and Artificial Intelligence from UPM. He was awarded the Third National Award by the Spanish Ministry of Education in 2001, and the Juan López de Peñalver Award by the Spanish Royal Society of Engineering in 2016. He has published several books, among which “Ontological Engineering” stands out as a reference book used in several university courses worldwide, and more than 100 papers in journals, conferences and workshops. He regularly participates in the organisation or the programme committees of relevant international conferences and workshops.

Dr. Stefan Dietze, dietze@L3S.de, is a research group leader at the L3S Research Center (Hannover, Germany). His main research interests are in Artificial Intelligence, Web Science and Information Retrieval, and the application of these areas to the problem of retrieving, linking and fusing entity-centric Web data. Stefan has coordinated a large number of international R&D projects and is involved in a variety of community activities aimed at increasing the take-up of Semantic Web technologies, data and vocabularies. He has published more than 150 papers in major conferences and journals and serves as an organizer, committee member and editorial board member for numerous scientific events and publications.

Prof. Elena Simperl, e.simperl@soton.ac.uk, is a professor of Computer Science at the University of Southampton, UK. Her research interests include knowledge engineering, Social Web technologies, and crowdsourcing. She has contributed to and led over 20 national and European research projects and authored more than 100 scientific publications. She chaired the European Semantic Web Conference (ESWC) in 2011 and 2012 and the International Semantic Web Conference (ISWC) in 2016. She was vice-president of STI International until 2016 and the director of the ESWC summer school series. She has co-chaired more than 15 workshops, including the series on Theory and Practice of Social Machines (SOCM) at WWW, Crowdsourcing the Semantic Web at ISWC and Ontology Engineering in a Data Driven World at EKAW.