<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Aptos;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        font-size:12.0pt;
        font-family:"Calibri",sans-serif;
        mso-ligatures:standardcontextual;}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="WordSection1">
<div align="center">
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0" width="600" style="width:6.25in">
<tbody>
<tr>
<td style="padding:0in 0in 0in 0in">
<p class="MsoNormal"><span style="font-family:"Aptos",sans-serif"><img width="600" height="171" style="width:6.25in;height:1.7812in" id="Picture_x0020_2" src="cid:image001.png@01DBA938.6D3FEBD0" alt="Thesis Defense Announcement at the Cullen College of Engineering"></span><span style="font-family:"Aptos",sans-serif;mso-ligatures:none"><o:p></o:p></span></p>
<div align="center">
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0" style="background:white">
<tbody>
<tr>
<td style="padding:30.0pt 15.0pt 7.5pt 15.0pt">
<p class="MsoNormal" align="center" style="text-align:center;mso-line-height-alt:15.0pt">
<b><span style="font-size:18.0pt;font-family:"Times New Roman",serif;color:#C8102E">Agentic Framework for Domain-specific RAG Evaluation<br>
</span></b><b><span style="font-size:13.5pt;font-family:"Times New Roman",serif;color:black;mso-ligatures:none">Pham, Quoc Huy</span></b><span style="font-size:11.0pt;font-family:"Times New Roman",serif;mso-ligatures:none"><o:p></o:p></span></p>
<p class="MsoNormal" align="center" style="text-align:center;line-height:16.5pt">
<span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">April 14, 2025, 12:30 p.m. to 2:00 p.m. (CDT)<br>
Location: AERB #222 and </span><span style="font-size:11.0pt;font-family:"Aptos",sans-serif;color:black"><a href="https://urldefense.com/v3/__https://teams.microsoft.com/l/meetup-join/19*3ameeting_YjA1MDRhOWUtYjM5ZC00MjA2LWFkMGEtOTgyMjc1OGI1ZjI0*40thread.v2/0?context=*7b*22Tid*22*3a*22170bbabd-a2f0-4c90-ad4b-0e8f0f0c4259*22*2c*22Oid*22*3a*225dd5bb4d-564d-4df0-bdc3-e572e5ebc98a*22*7d__;JSUlJSUlJSUlJSUlJSUl!!LkSTlj0I!FUshXraW4rXH8r7P92a3pU17kwkj2ODrqGFrUHmTPD-wkbUJo3jEk33Qb2GuMXM7MzchAbwAKb1yX9IhDBrn8KdHHvU$"><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:#467886;mso-ligatures:none">Teams
link</span></a></span><span style="font-size:11.0pt;font-family:"Aptos",sans-serif;mso-ligatures:none"><o:p></o:p></span></p>
<p class="MsoNormal" align="center" style="text-align:center;line-height:16.5pt">
<span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">Meeting ID: 217 618 049 368<o:p></o:p></span></p>
<p class="MsoNormal" align="center" style="text-align:center;line-height:16.5pt">
<span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">Passcode: B7BH3j5s<o:p></o:p></span></p>
<p class="MsoNormal" align="center" style="margin-bottom:3.75pt;text-align:center;line-height:16.5pt">
<b><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none"><o:p> </o:p></span></b></p>
<p class="MsoNormal" align="center" style="margin-bottom:3.75pt;text-align:center;line-height:16.5pt">
<b><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">Committee Chair:</span></b><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none"><br>
Vedhus Hoskere, Ph.D. </span><span style="font-size:11.0pt;font-family:"Aptos",sans-serif;mso-ligatures:none"><o:p></o:p></span></p>
<p class="MsoNormal" align="center" style="margin-bottom:15.0pt;text-align:center;line-height:15.0pt">
<b><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">Committee Members:</span></b><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none"><br>
Craig Glennie, Ph.D. | Nima Ekhtari, Ph.D. | Todd Bradford, LTC</span><span style="font-size:10.5pt;font-family:"Aptos",sans-serif;mso-ligatures:none"><o:p></o:p></span></p>
</td>
</tr>
<tr>
<td style="padding:0in 15.0pt 15.0pt 15.0pt">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span style="font-family:"Arial",sans-serif;color:#C8102E;mso-ligatures:none">Abstract</span></b><span style="font-family:"Arial",sans-serif;color:#C8102E;mso-ligatures:none"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">Large Language Models (LLMs) have revolutionized the generation and understanding of textual
information and have wide-ranging applications. While the capabilities of LLMs are immediately impressive to any user, extended use quickly reveals their tendency to generate inaccurate text. Retrieval-Augmented Generation (RAG)
approaches enhance LLM reliability by grounding responses in knowledge bases, significantly reducing hallucinations.
<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">However, RAG performance is highly domain-sensitive, necessitating careful tuning of
components before deployment in specialized applications. This challenge underscores the critical need for robust evaluation frameworks tailored to domain-specific RAG systems. Existing methods often rely on heuristic-based metrics such as exact match or BLEU
scores, which fail to capture deeper semantic reasoning and nuanced understanding. Additionally, these methods typically depend on manually curated Question-Answer (QA) datasets, which are often unavailable or insufficient in specialized domains.
<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.5pt;font-family:"Arial",sans-serif;color:black;mso-ligatures:none">To address these limitations, we propose an Agentic Framework for Domain-Specific RAG
Evaluation. Our approach introduces a synthetic data generation pipeline that simplifies the adaptation of RAG systems to new domains. We incorporate LLM-as-a-judge metrics to enable a more holistic and versatile evaluation of both synthetic datasets and RAG
performance. We present MiliQA, a synthetic dataset derived from military documents, and compare its quality against public QA datasets such as Aurelio Mixtral, HuggingFace QA, and WikiEval. To validate our metrics, we benchmark them against human-annotated
datasets, including STS-B and SQuAD 2.0. Finally, we demonstrate the applicability of our framework by using MiliQA to evaluate key components of RAG systems, including embedding models, LLMs, and multiple RAG methodologies. The results show the practical
value of the proposed framework in guiding the design and optimization of RAG systems for domain-specific applications.<o:p></o:p></span></p>
</td>
</tr>
<tr>
<td style="padding:0in 15.0pt 15.0pt 15.0pt">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Aptos",sans-serif;color:black"><img border="0" width="600" height="82" style="width:6.25in;height:.8541in" id="Picture_x0020_1" src="cid:image002.png@01DBA938.6D3FEBD0" alt="Engineered For What's Next"></span><b><span style="font-family:"Arial",sans-serif;color:#C8102E;mso-ligatures:none"><o:p></o:p></span></b></p>
</td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
<tr>
<td style="padding:0in 0in 0in 0in"></td>
</tr>
</tbody>
</table>
</div>
</div>
</body>
</html>