Enhancing the LLM-based Chemistry Assistant ChemCrow with Interactivity: A Human-Centered Design Approach

Problem Description

Large Language Models (LLMs) have rapidly reshaped the landscape of Conversational Agents (CAs). To address challenges such as hallucinations, research has extended LLM functionality with Retrieval-Augmented Generation (RAG), which integrates the models with external knowledge sources and enables them to answer domain-specific questions. Additionally, explicit chain-of-thought prompting has emerged as a strategy to guide LLMs in solving complex tasks. These developments have facilitated the implementation of LLM-powered agents with extensive capabilities, consisting of a reasoning engine with access to external tools and knowledge bases. Several LLM-based assistants have been introduced in the literature; ChemCrow, for example, was designed for solving chemistry-related tasks. However, little is known about how users interact with LLM-based agents and how these agents should be designed so that users can understand, contribute to, and validate the agent's reasoning process.
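To make the agent structure described above more concrete, the following is a minimal sketch of a reasoning loop with tool access and a user approval hook. It is not ChemCrow's implementation or API: the names `scripted_reasoner`, `run_agent`, `approve`, and the `molar_mass` tool are hypothetical, and a scripted function stands in for the LLM so that the example runs without any model or network access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


def molar_mass_lookup(formula: str) -> str:
    """Hypothetical tool; a real assistant would query chemistry software."""
    table = {"H2O": "18.015 g/mol", "CO2": "44.009 g/mol"}
    return table.get(formula, "unknown")


# Registry of tools the agent may call (assumed for illustration only).
TOOLS: Dict[str, Callable[[str], str]] = {"molar_mass": molar_mass_lookup}


@dataclass
class Step:
    thought: str          # reasoning text shown to the user
    tool: Optional[str]   # None means the agent is ready to answer
    tool_input: str       # tool argument, or the final answer if tool is None


def scripted_reasoner(question: str, observations: List[str]) -> Step:
    """Stand-in for the LLM: chooses the next step from past observations."""
    if not observations:
        return Step("I need the molar mass of H2O.", "molar_mass", "H2O")
    return Step(f"The tool returned {observations[-1]}.", None, observations[-1])


def run_agent(
    question: str,
    reasoner: Callable[[str, List[str]], Step],
    approve: Callable[[Step], bool],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Alternate between reasoning and tool calls; the user may veto each call."""
    observations: List[str] = []
    trace: List[str] = []
    for _ in range(max_steps):
        step = reasoner(question, observations)
        trace.append(step.thought)
        if step.tool is None:
            return step.tool_input, trace
        if not approve(step):
            trace.append("User rejected this step.")
            return None, trace
        observations.append(TOOLS[step.tool](step.tool_input))
    return None, trace


answer, trace = run_agent(
    "What is the molar mass of water?", scripted_reasoner, lambda step: True
)
rejected_answer, rejected_trace = run_agent(
    "What is the molar mass of water?", scripted_reasoner, lambda step: False
)
```

The `approve` callback marks the point where a user could inspect and validate a reasoning step before the tool is called; how such interaction should actually be designed is the open question this topic addresses.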

Goal of the thesis

The primary objective is to design and implement an enhanced LLM-based chemistry assistant based on the existing ChemCrow framework, with a specific focus on advanced interactivity with users. The thesis should follow a human-centered design approach that involves chemistry students as potential users. Overall, a series of experimental studies should be performed. In a first step, the goal is to understand how users interact with the existing ChemCrow assistant. Insights gained from this analysis will inform the design of the enhanced LLM-based chemistry assistant.

Work Packages

  • Implement an interactive LLM-based chemistry assistant using the ChemCrow framework
  • Design experiments for evaluating the assistant with chemistry students
  • Conduct experimental studies and collect data on user interactions
  • Analyze user feedback to derive insights on improving the assistant's design and functionality


Requirements

  • Good programming skills in Python
  • General interest in generative AI, large language models, and/or human-computer interaction
  • Good time management and organizational skills
  • English skills


If you are interested and want to apply for this topic, please contact Till Carlo Schelhorn (till.schelhorn∂kit.edu) with a short motivation statement, your CV, and a current transcript of records. Feel free to reach out beforehand if you have any questions.


Bran, Andres M., Sam Cox, Oliver Schilter, Carlo Baldassari, Andrew D. White, and Philippe Schwaller. “ChemCrow: Augmenting Large-Language Models with Chemistry Tools,” October 2, 2023. https://doi.org/10.48550/arXiv.2304.05376.

Wei, Jason, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc V. Le, and Denny Zhou. “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.” Advances in Neural Information Processing Systems 35 (December 6, 2022): 24824–37.

Yao, Shunyu, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. “ReAct: Synergizing Reasoning and Acting in Language Models,” March 9, 2023. https://doi.org/10.48550/arXiv.2210.03629.