The 2nd Workshop on Practical LLM-assisted Data-to-Text Generation

News

09/09/2024: The workshop programme and accepted papers list are now available!
06/09/2024: Craig Thomson (ADAPT/DCU) and Marco Valentino (Neuro-Symbolic AI Group / Idiap Research Institute) are confirmed as keynote speakers!
24/07/2024: In response to multiple requests, we have updated the deadline. The submission page is now open until Friday, July 26, 23:59 AoE. Good luck with your submissions!
01/07/2024: We have created a public Google group (https://groups.google.com/g/public-d2t2024/) where you can follow the news and get technical feedback. For private matters, see the /contacts page.
28/06/2024: We now have a date! Practical D2T 2024 will be on 23 September @INLG 2024, Tokyo, Japan!
20/06/2024: The call for papers is online!
19/06/2024: The official Practical D2T 2024 website is online!

Background

Natural Language Generation (NLG) has been an active area of research for decades, in both academia and industry. Data-to-text (D2T) generation (Reiter and Dale, 1997; Gatt and Krahmer, 2018) is the NLG task in which a system describes structured data in natural language. Traditionally, commercial D2T systems have been based on symbolic approaches (Dale, 2020; Leppänen et al., 2017), i.e. handcrafted rules or templates. More experimental approaches to D2T, such as end-to-end (E2E) and Transformer-based systems (Dušek et al., 2020; Sharma et al., 2022), have been limited to research because of well-known issues like knowledge gaps, lack of factuality, and hallucination (Ji et al., 2023; Wang et al., 2023).

The recently introduced instruction-tuned, multi-task Large Language Models (LLMs) promise to become a viable alternative to rule-based D2T systems. They exhibit the ability to capture knowledge, follow instructions, and produce coherent text in various domains (Sanh et al., 2021; Ouyang et al., 2022). However, even the best LLMs still suffer from well-known issues of neural models, such as lack of controllability and the risk of producing harmful text. Recent research has thus proposed various approaches to improve the semantic accuracy of LLM-based D2T, including prompt tuning (Su et al., 2022; Ye and Durrett, 2022), targeted fine-tuning (Zhang et al., 2024), Retrieval-Augmented Generation (RAG) (Jiang et al., 2023; Chen et al., 2024), external tool integration (Wang et al., 2024), and neuro-symbolic approaches (Sarker et al., 2021; Hitzler et al., 2022).

Practical D2T 2024 aims to build a space for researchers to discuss and present innovative work on D2T systems using LLMs. Building upon the 2023 edition’s hackathon, Practical D2T 2024 opens up a broader range of activities, including a special track for neuro-symbolic D2T approaches and a hackathon focused on the evaluation and semantic accuracy of D2T using LLMs.