Agentic Table Talk


Abstract

This paper presents a heuristic-driven approach called Agentic Table Talk, which uses general-purpose Large Language Models (LLMs) for Table Question Answering (Tabular Q/A). The goal of this task is to answer a user query from a provided table. Earlier approaches prompt LLMs with the entire table; because tables can grow arbitrarily large, placing a whole table in the prompt becomes infeasible given the finite context window of LLMs, which can only handle a limited number of input tokens. In contrast, other approaches rely entirely on Retrieval-Augmented Generation (RAG), grounding the LLM with K relevant records, where K is predefined and hard-coded. The main drawback of such approaches is that they do not yield satisfactory results for Tabular Q/A, where data is structured and cellular. Additionally, the meaning of an individual cell is often context-dependent and can vary substantially with the values of its neighboring cells. To address these challenges, this paper combines a lightweight heuristic, called RAGular SubGraph Retrieval, with the ReAct agentic framework powered by general-purpose LLMs. In this combination, the heuristic adopts the core principles of traditional RAG while placing greater emphasis on preserving the neighborhood relationships of the selected cells. The output of the heuristic supplies the subsequent ReAct agent with a focused input context consisting only of the cells relevant to the given query and their neighborhood. In other words, the ReAct agent receives the user query together with the filtered tabular data as a subgraph assumed to be part of a larger, unknown graph. The primary objective of the ReAct agent is to use explicit reasoning to identify the facts necessary to answer the user query. This may involve reasoning over the known subgraph or taking actions (such as re-triggering the heuristic with a slightly modified query or finding the closest K-adjacent neighbors of a relevant cell) to explore new, previously unexplored cells based on the structure of the already known regions. The two components work together iteratively to discover the facts needed to generate a faithful answer from the provided reference table, stopping once all the necessary information has been located or sufficient confidence in the query's unanswerability is reached. The proposed approach is evaluated with general-purpose LLMs of varying sizes on the benchmark AIT-QA 1 and HiTab 2 datasets, along with a synthetic dataset 3 (SD) featuring large tables. The evaluation examines how heuristic-driven input contexts affect the reasoning ability of LLMs of varying sizes on complex tables and compares the results with existing methods. The codebase is publicly available in a Git 4 repository, so that all experiments can be re-executed to reproduce the results.
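
The following toy sketch is only meant to illustrate the retrieve-then-reason loop described in the abstract; it is not the authors' implementation (which is available in their repository). All names and signatures here (retrieve_subgraph, expand_neighbors, llm_step, Cell, Table) are hypothetical, and the lexical-overlap retrieval is merely a placeholder for the RAGular SubGraph Retrieval heuristic.

```python
# Hypothetical sketch of the heuristic + ReAct loop described in the abstract.
# All function and type names are illustrative assumptions, not the authors' API.
from typing import Callable, Dict, Optional, Set, Tuple

Cell = Tuple[int, int]      # (row, column) coordinate of a table cell
Table = Dict[Cell, str]     # table as a sparse map from coordinates to cell text


def retrieve_subgraph(query: str, table: Table) -> Set[Cell]:
    """Toy stand-in for the retrieval heuristic: keep cells that lexically
    overlap with the query, together with their immediate row/column neighbors."""
    terms = set(query.lower().split())
    hits = {c for c, text in table.items() if terms & set(text.lower().split())}
    neighborhood = {(r + dr, c + dc) for (r, c) in hits
                    for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0))}
    return hits | (neighborhood & table.keys())


def expand_neighbors(cell: Cell, table: Table, k: int = 2) -> Set[Cell]:
    """Toy K-adjacent expansion: all cells within Manhattan distance k of `cell`."""
    r0, c0 = cell
    return {(r, c) for (r, c) in table if abs(r - r0) + abs(c - c0) <= k}


def answer_question(query: str, table: Table,
                    llm_step: Callable[[str, Dict[Cell, str]], Tuple[str, object]],
                    max_iterations: int = 5) -> Optional[str]:
    """ReAct-style loop: reason over the known subgraph, act to grow it, repeat."""
    known: Set[Cell] = retrieve_subgraph(query, table)
    for _ in range(max_iterations):
        context = {c: table[c] for c in known}       # focused input context for the LLM
        action, argument = llm_step(query, context)  # one explicit thought/action per step
        if action == "answer":
            return argument                          # necessary facts located
        if action == "unanswerable":
            return None                              # confident the table cannot answer
        if action == "retrieve":                     # re-trigger heuristic, modified query
            known |= retrieve_subgraph(argument, table)
        elif action == "expand":                     # explore K-adjacent neighbors of a cell
            known |= expand_neighbors(argument, table)
    return None                                      # iteration budget exhausted
```

In this sketch, `llm_step` is an injected callable that wraps whatever LLM is used; the loop itself only grows the known subgraph and stops when the agent answers, declares the query unanswerable, or exhausts its iteration budget.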