[MA] Semantic AI Tool for Data Retrieval: Enhancing LLM Accuracy in Engineering Contexts

Description

Large-scale engineering projects—such as high-voltage direct current (HVDC) stations, offshore wind connections, and gas-fired power plants—generate vast amounts of multi-modal data about the engineering design and project execution processes. Managing this data efficiently is critical for successful project execution and long-term asset management activities.

Recent advances in AI, particularly large language models (LLMs), offer new opportunities to automate engineering data management, extraction, and analytics. AI-powered tools can summarize technical content across documents, reducing time spent on administrative tasks and enabling engineers to focus on complex problem-solving. However, a key challenge lies in accurately interpreting domain-specific terminology and contextual nuances, such as distinguishing between the “rated current” and the “short circuit current” of a piece of equipment. General-purpose AI models often misinterpret such terms, leading to unreliable outputs and reduced trust in AI-assisted workflows.

This thesis explores semantic techniques, such as semantic similarity models, to improve the precision of AI-driven information retrieval in engineering contexts. The goal is to evaluate and compare semantic similarity models for distinguishing between closely related engineering terms and to develop a prototype AI tool that integrates semantic understanding into natural language queries. This tool will support engineers by automating data extraction from technical documents and ensuring consistent terminology across teams.
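
To make the comparison concrete, the following minimal sketch shows what a non-semantic baseline might look like. It is an illustration only, not the method prescribed for the thesis: the function names (tokenize, lexical_cosine, rank) and the third example term ("rated voltage") are assumptions introduced here. The baseline scores terms purely by word overlap, so it cannot tell that terms sharing a word may describe different quantities. A semantic similarity model would be passed in through the score parameter in place of lexical_cosine.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a term and split it into words (hyphens count as spaces)."""
    return text.lower().replace("-", " ").split()


def lexical_cosine(a, b):
    """Cosine similarity of bag-of-words counts: a purely lexical baseline."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank(query, candidates, score=lexical_cosine):
    """Sort candidate terms by similarity to the query, best match first.

    `score` is pluggable: any function taking two strings and returning a
    number can be used, e.g. one backed by a semantic similarity model.
    """
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)


if __name__ == "__main__":
    # Illustrative terms only; "rated voltage" is a hypothetical addition.
    terms = ["rated current", "short circuit current", "rated voltage"]
    for term in terms:
        print(f"{term!r}: {lexical_cosine('rated current', term):.3f}")
```

With this baseline, "rated current" scores higher against "rated voltage" than against "short circuit current" simply because of which words overlap, which says nothing about the engineering meaning. This is the kind of behaviour the evaluation would measure against semantic models.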

Expected Outcomes

Evaluation of semantic similarity models for engineering terminology
Prototype AI tool with semantic matching capabilities
Performance analysis comparing semantic vs. non-semantic approaches

This thesis is conducted in cooperation with Siemens Energy. If you are interested in this topic, please contact Prakhar Mehta (prakhar.mehta@siemens-energy.com).