Feat : Implementations of RAG pipeline for merchant policy#4

Open
Rd4dev wants to merge 5 commits into main from ai-rag-feature
@Rd4dev Rd4dev commented Jun 13, 2026

Owner

Resolves #3

Included in this PR

  1. New AI-RAG-SERVICE microservice to handle all assistant support for mert.
  2. Spring AI integration that runs model inference via Ollama.
  3. Currently built on a local model: qwen2.5-coder:7b

[WIP] - To clear debug logs

Summary by CodeRabbit

  • New Features
    • Added a new microservice with POST /ask that answers questions using extracted PDF content, on-demand embeddings, and relevance ranking.
  • Chores
    • Added project tooling: Maven wrapper, IDE project/encoding settings, and repository ignore/attributes updates.
  • Tests
    • Added a basic application context startup test.

Rd4dev added 3 commits June 13, 2026 13:39
This is the first iteration of exploring how to query local documents for better responses. For now it runs on a local Ollama model, which is slow; optimizing it or moving to the cloud is a consideration for later. The current focus is on implementing the feature and extending it to be accurate for various needs.
This is a quick save point taken right after introducing the return policy (a 22-page document generated via Gemini for merchants on Indian terms). The results are very slow: the 22-page document was provided in full, without any chunking or embedding, and queries showed a 5 to 8x delay in answering, given that everything runs locally on Ollama with qwen2.5-coder:7b. This will need to be compared against future versions once chunking is in place.
coderabbitai Bot commented Jun 13, 2026

📝 Walkthrough

Adds a new Maven Spring Boot module ai-rag-service with IDE/project entries and wrapper scripts, a Spring application and test, an HTTP POST /ask controller, PDF-parsing knowledge service, embedding and cosine utilities, and a RAG service that builds context from top-ranked policy chunks.

Changes

AI RAG Service Module Implementation

| Layer / File(s) | Summary |
|---|---|
| IDE and repository config: .idea/compiler.xml, .idea/encodings.xml, .idea/misc.xml, .idea/modules.xml, ai-rag-service/.gitattributes, ai-rag-service/.gitignore | IntelliJ module registration and compiler annotation-processing profile for ai-rag-service, UTF-8 encoding mapping, and git attributes/ignore rules for wrapper and IDE files. |
| Maven wrapper, scripts, and POM: ai-rag-service/.mvn/wrapper/maven-wrapper.properties, ai-rag-service/mvnw, ai-rag-service/mvnw.cmd, ai-rag-service/pom.xml | Maven wrapper configured for Maven 3.9.16 with Unix and Windows bootstrap scripts; pom.xml declares Spring Boot parent, Spring AI (Ollama), PDFBox, Lombok annotation processing, and build plugin settings. |
| Spring Boot bootstrap & config: ai-rag-service/src/main/java/com/mert/airagservice/AiRagServiceApplication.java, ai-rag-service/src/main/resources/application.properties, ai-rag-service/src/test/java/com/mert/airagservice/AiRagServiceApplicationTests.java | Application entrypoint annotated with @SpringBootApplication, properties configuring Ollama endpoints and server port 4003, and a context-load smoke test. |
| RAG API and services: ai-rag-service/src/main/java/com/mert/airagservice/dto/AskRequest.java, .../service/PdfKnowledgeService.java, .../service/EmbeddingService.java, .../service/RagService.java, .../rag/CosineSimilarity.java, .../controller/AssistantController.java | AskRequest DTO; PdfKnowledgeService extracts and chunks a bundled PDF; EmbeddingService wraps an EmbeddingModel; RagService ranks chunks by cosine similarity and builds top-3 context; AssistantController exposes POST /ask which builds a strict context prompt and calls the ChatClient. |
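
The retrieval step summarized above (rank chunks by cosine similarity, keep the top 3) can be illustrated with a minimal, dependency-free sketch. The class and method names here (TopKRanker, cosine, topK) are hypothetical and are not taken from the PR's RagService or CosineSimilarity classes; the embeddings are hard-coded stand-ins for what an embedding model would return.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical sketch: names are illustrative, not the PR's actual classes.
final class TopKRanker {

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 when either vector is all zeros. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k chunks whose embeddings are most similar to the query, best match first. */
    static List<String> topK(float[] query, List<String> chunks, List<float[]> embeddings, int k) {
        return IntStream.range(0, chunks.size())
                .boxed()
                // Similarity is recomputed per comparison; acceptable for a small sketch.
                .sorted(Comparator.comparingDouble(
                        (Integer i) -> cosine(query, embeddings.get(i))).reversed())
                .limit(k)
                .map(chunks::get)
                .toList();
    }

    public static void main(String[] args) {
        // Made-up 2-dimensional embeddings, for illustration only.
        List<String> chunks = List.of("refund window", "shipping fees", "refund exceptions", "contact us");
        List<float[]> embeddings = List.of(
                new float[]{1f, 0f}, new float[]{0f, 1f}, new float[]{0.9f, 0.1f}, new float[]{-1f, 0f});
        System.out.println(topK(new float[]{1f, 0f}, chunks, embeddings, 3));
    }
}
```

A real service would likely precompute and cache the chunk embeddings rather than embed on every request; that choice is outside what this sketch shows.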

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A rabbit nudges a new module awake,
PDFs are chunked, embeddings take shape,
Maven boots, IntelliJ aligns,
Ollama listens, the controller signs—
RAG hops in to answer what you make.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementation of a RAG pipeline for merchant policy support. It directly reflects the core objective of the PR. |
| Linked Issues check | ✅ Passed | The PR implements all core coding requirements from #3: RAG pipeline for merchant support with token/latency tracking via stdout, embeddings and similarity-based retrieval, and extensible architecture supporting policy chunks and future expansion. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to building the ai-rag-service microservice required by #3. IDE configuration files and Maven wrapper setup are standard project setup activities necessary for the new module. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.idea/compiler.xml:
- Around line 13-21: Update the IntelliJ annotation-processor configuration so
it does not hard-code a specific Lombok JAR: edit the <processorPath> entry that
currently points to
"$MAVEN_REPOSITORY$/org/projectlombok/lombok/1.18.46/lombok-1.18.46.jar" and
either remove that <entry> entirely or replace it with a Maven-managed reference
(e.g., rely on classpath-based processorPath or the IDE's Maven-managed artifact
resolution) so the <processorPath> and <entry> do not pin lombok to 1.18.46 and
will honor the version declared/managed by the project.

In `@ai-rag-service/.mvn/wrapper/maven-wrapper.properties`:
- Around line 1-3: The maven-wrapper properties file is missing checksum pins,
so add distributionSha256Sum (for the apache-maven-3.9.16-bin.zip referenced by
distributionUrl) and wrapperSha256Sum (for the maven-wrapper binary) to enable
checksum validation: verify the official apache-maven-3.9.16-bin.zip (using
Apache’s PGP/SHA-512 artifacts), compute its SHA-256 and set
distributionSha256Sum=<computed-sha256>, then compute the SHA-256 for the
maven-wrapper jar used by this project and set
wrapperSha256Sum=<computed-sha256>; keep existing keys (wrapperVersion,
distributionType, distributionUrl) unchanged.

In `@ai-rag-service/mvnw`:
- Around line 194-223: The script can fall through without downloading if
neither wget/curl are found and set_java_home fails; add a clear fail-fast error
path by inserting an else branch after the existing 'elif set_java_home; then'
block that calls die with a descriptive message (e.g., "No download mechanism
available: install wget or curl, or ensure Java is available") so failures
surface immediately; reference the existing helpers set_java_home and die and
include the distributionUrl in the message for context.

In
`@ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java`:
- Around line 55-57: Remove the raw System.out.println calls that print model
output and latency in AssistantController (delete the prints of
response.chatResponse() and the "LATENCY ms" line); if latency must be recorded,
replace with a structured logger call that logs only the numeric latency metric
(compute end-start into a long millis variable) using the class logger (e.g.,
logger.info or metrics recording) and do not include the model response text or
any request/context data in logs.
- Around line 53-54: AssistantController currently does a blocking synchronous
call via chatClient.prompt(prompt).call() with no timeouts/retries; update the
client invocation to enforce explicit connection/read timeouts and a retry
policy and/or offload the call to a separate thread so servlet threads aren't
exhausted: change the ChatClient usage in AssistantController to use a
configured non-blocking client or a timed call (e.g., use a WebClient-backed
Spring AI ChatClient or wrap the .call() in a CompletableFuture with a timeout
and proper exception handling), configure retry (exponential backoff, max
attempts) for transient failures, and ensure timeouts are set via configuration
(application.properties) or a dedicated bean that builds the Ollama
client/WebClient with connect/read timeouts and retry filter; alternatively
annotate the controller method to delegate to an `@Async` service method that
performs the timed/retried chatClient.prompt(prompt).call() so the servlet
thread is not blocked.

In `@ai-rag-service/src/main/java/com/mert/airagservice/dto/AskRequest.java`:
- Line 3: Add Bean Validation to the AskRequest DTO and enforce it on the
controller: annotate the record component in AskRequest (the String question)
with validation annotations such as `@NotBlank` and `@Size`(max = 2000) (e.g.,
public record AskRequest(`@NotBlank` `@Size`(max = 2000) String question) { }) and
update the AssistantController.ask method signature to accept a validated body
by adding `@Valid` and `@RequestBody` to the parameter (e.g., public String
ask(`@Valid` `@RequestBody` AskRequest request) throws IOException { ... }), so
empty or oversized prompts are rejected at the DTO boundary.

In
`@ai-rag-service/src/main/java/com/mert/airagservice/service/PdfKnowledgeService.java`:
- Around line 13-18: getPolicyText() in PdfKnowledgeService leaks resources and
re-parses the PDF on every call; fix by loading and extracting the PDF once
(e.g., in the PdfKnowledgeService constructor or a `@PostConstruct` init method)
into a final cached field like policyText, and ensure all IO resources are
closed with try-with-resources (close the InputStream and PDDocument after
extraction); then have getPolicyText() simply return the cached policyText.
Ensure you reference the existing method getPolicyText(), the PDDocument usage
(Loader.loadPDF) and PDFTextStripper extraction when moving logic into init and
add proper exception handling for the startup load.

In `@ai-rag-service/src/main/resources/application.properties`:
- Around line 2-5: The properties file currently hardcodes
spring.ai.ollama.base-url, spring.ai.ollama.chat.model and server.port; change
these to use externalized configuration placeholders (e.g., ${OLLAMA_BASE_URL},
${OLLAMA_CHAT_MODEL}, ${SERVER_PORT}) so values come from environment variables
or a config server at runtime, and provide sensible local defaults only for
development; update any code reading these properties (references to
spring.ai.ollama.base-url and spring.ai.ollama.chat.model) to expect
externalized values rather than hardcoded constants.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d094aec1-48cc-4395-8bf1-bcface122baf

📥 Commits

Reviewing files that changed from the base of the PR and between 70ba33e and 14fec71.

⛔ Files ignored due to path filters (1)
  • ai-rag-service/src/main/resources/knowledge/merchant-refund-policy.pdf is excluded by !**/*.pdf
📒 Files selected for processing (16)
  • .idea/compiler.xml
  • .idea/encodings.xml
  • .idea/misc.xml
  • .idea/modules.xml
  • ai-rag-service/.gitattributes
  • ai-rag-service/.gitignore
  • ai-rag-service/.mvn/wrapper/maven-wrapper.properties
  • ai-rag-service/mvnw
  • ai-rag-service/mvnw.cmd
  • ai-rag-service/pom.xml
  • ai-rag-service/src/main/java/com/mert/airagservice/AiRagServiceApplication.java
  • ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java
  • ai-rag-service/src/main/java/com/mert/airagservice/dto/AskRequest.java
  • ai-rag-service/src/main/java/com/mert/airagservice/service/PdfKnowledgeService.java
  • ai-rag-service/src/main/resources/application.properties
  • ai-rag-service/src/test/java/com/mert/airagservice/AiRagServiceApplicationTests.java

Comment thread .idea/compiler.xml
Comment on lines +13 to +21
<profile name="Annotation profile for ai-rag-service" enabled="true">
  <sourceOutputDir name="target/generated-sources/annotations" />
  <sourceTestOutputDir name="target/generated-test-sources/test-annotations" />
  <outputRelativeToContentRoot value="true" />
  <processorPath useClasspath="false">
    <entry name="$MAVEN_REPOSITORY$/org/projectlombok/lombok/1.18.46/lombok-1.18.46.jar" />
  </processorPath>
  <module name="ai-rag-service" />
</profile>


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate ai-rag-service pom.xml and any other pom that mentions lombok
echo "== POM files containing 'lombok' =="
rg -n --hidden --glob '!*node_modules/*' --glob '*.xml' 'lombok' . || true

# Print the exact lombok dependency/version from ai-rag-service/pom.xml (if present)
if [ -f "ai-rag-service/pom.xml" ]; then
  echo "== ai-rag-service/pom.xml (lombok sections) =="
  rg -n --hidden --glob '*.xml' 'lombok' ai-rag-service/pom.xml || true
  echo
  # Attempt to extract the lombok version value
  rg -n --hidden 'lombok-([0-9]+\.[0-9]+\.[0-9]+)|<artifactId>lombok</artifactId>|<version>' ai-rag-service/pom.xml || true
else
  echo "== ai-rag-service/pom.xml not found at path ai-rag-service/pom.xml =="
fi

# Read the referenced .idea/compiler.xml snippet
echo "== .idea/compiler.xml lines 1-80 =="
if [ -f ".idea/compiler.xml" ]; then
  sed -n '1,80p' .idea/compiler.xml
else
  echo ".idea/compiler.xml not found"
fi

Repository: Rd4dev/Mert

Length of output: 3475


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== ai-rag-service/pom.xml: show Lombok-related blocks with context =="
# Show around each lombok groupId/artifactId occurrence
rg -n --hidden '<groupId>org\.projectlombok</groupId>|<artifactId>lombok</artifactId>' ai-rag-service/pom.xml
echo

# Dump a wider slice around the first Lombok occurrence to capture the version tag
# (adjusting line ranges based on matches)
# Use grep to get line numbers first
first=$(rg -n --hidden '<groupId>org\.projectlombok</groupId>' ai-rag-service/pom.xml | head -n1 | cut -d: -f1 || true)
second=$(rg -n --hidden '<groupId>org\.projectlombok</groupId>' ai-rag-service/pom.xml | sed -n '2p' | cut -d: -f1 || true)

if [ -n "${first:-}" ]; then
  start=$((first-20))
  end=$((first+60))
  echo "== Around first Lombok occurrence (lines $start-$end) =="
  sed -n "${start},${end}p" ai-rag-service/pom.xml
fi

if [ -n "${second:-}" ]; then
  start=$((second-20))
  end=$((second+80))
  echo "== Around second Lombok occurrence (lines $start-$end) =="
  sed -n "${start},${end}p" ai-rag-service/pom.xml
fi

echo
echo "== Full pom.xml lines 1-160 (small enough?) =="
wc -l ai-rag-service/pom.xml
sed -n '1,160p' ai-rag-service/pom.xml

Repository: Rd4dev/Mert

Length of output: 10910


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n --hidden --glob '*.xml' 'lombok.*version|<version>.*lombok|1\.18\.[0-9]+' . || true
rg -n --hidden --glob '*.properties' 'lombok' . || true
rg -n --hidden --glob '*.pom' 'lombok' . || true

Repository: Rd4dev/Mert

Length of output: 177


🌐 Web query:

Spring Boot 4.1.0 dependency management Lombok version 1.18.x

💡 Result:

Spring Boot 4.1.0, released on June 10, 2026, manages the Project Lombok dependency automatically through its Bill of Materials (BOM) [1][2]. In this version of Spring Boot, the managed version for Lombok is 1.18.34 [3]. Because Spring Boot provides a curated dependency management system, you generally do not need to specify a version for Lombok in your Maven or Gradle build files when using the Spring Boot starters [4][5]. If you wish to override this managed version, you can do so by setting the lombok.version property in your project's configuration [6][5]. Top results: [1][2][6][3]


Avoid pinning Lombok annotation processor to lombok-1.18.46.jar in IntelliJ

.idea/compiler.xml hard-codes "$MAVEN_REPOSITORY$/org/projectlombok/lombok/1.18.46/lombok-1.18.46.jar", but ai-rag-service/pom.xml declares Lombok without a version (so it’s managed by the Spring Boot parent 4.1.0, which manages Lombok as 1.18.34). Point IntelliJ to the Maven-managed processor artifact instead (or remove the fixed processorPath entry) to avoid annotation processing breaking when the resolved Lombok version changes or isn’t present.


Comment on lines +1 to +3
wrapperVersion=3.3.4
distributionType=only-script
distributionUrl=https://repo.maven.apache.org/maven2/org/apache/maven/apache-maven/3.9.16/apache-maven-3.9.16-bin.zip


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="ai-rag-service/.mvn/wrapper/maven-wrapper.properties"
echo "== $FILE =="
if [ -f "$FILE" ]; then
  nl -ba "$FILE" | sed -n '1,120p'
else
  echo "File not found: $FILE"
  exit 1
fi

Repository: Rd4dev/Mert

Length of output: 175


🌐 Web query:

Apache Maven 3.9.16 apache-maven-3.9.16-bin.zip SHA-256

💡 Result:

The official Apache Maven distribution directory for version 3.9.16 provides SHA-512 checksum files, but does not provide SHA-256 checksum files [1]. To verify the integrity of the apache-maven-3.9.16-bin.zip file, the Apache Maven project officially recommends verifying the PGP signature [2]. You can download the detached signature file (apache-maven-3.9.16-bin.zip.asc) from the official distribution index [1] and verify it against the public keys used by the Apache Maven developers [2].

If you specifically require the SHA-256 checksum for your own local setup (such as for use in a Maven wrapper configuration), you must calculate it yourself after downloading the file [3]. You can generate the SHA-256 checksum of the file on a system with standard tools using the following command:

    shasum -a 256 apache-maven-3.9.16-bin.zip

Alternatively, if you are using Windows PowerShell, you can use:

    Get-FileHash apache-maven-3.9.16-bin.zip -Algorithm SHA256


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="ai-rag-service/.mvn/wrapper/maven-wrapper.properties"
echo "== $FILE =="
if [ -f "$FILE" ]; then
  sed -n '1,200p' "$FILE"
else
  echo "File not found: $FILE"
  exit 1
fi

Repository: Rd4dev/Mert

Length of output: 282


🌐 Web query:

maven wrapper 3.3.4 distributionSha256Sum distributionSha512Sum properties

💡 Result:

In Maven Wrapper (including version 3.3.4), checksum verification is supported through specific properties in the .mvn/wrapper/maven-wrapper.properties file to ensure the integrity of downloaded files and prevent supply-chain attacks [1][2]. The supported properties for checksum verification are:

  1. distributionSha256Sum: Specifies the expected SHA-256 checksum (in hex notation, small caps) of the Maven distribution being downloaded [1][2].
  2. wrapperSha256Sum: Specifies the expected SHA-256 checksum (in hex notation, small caps) of the maven-wrapper.jar file [3][1][2].

Regarding SHA-512 support: While there have been development discussions and pull requests (such as MWRAPPER-117) to introduce distributionSha512Sum and wrapperSha512Sum properties [4], these were not part of the standard, stable feature set for the 3.3.4 release. As of Maven Wrapper 3.3.4, the official documentation and API only explicitly recognize and implement SHA-256 checksum validation [3][1][5][6].

Key points for configuration:

  • Properties must be added to .mvn/wrapper/maven-wrapper.properties [1][2].
  • When using the maven-wrapper-plugin to update or reinstall the wrapper, existing manual properties like distributionSha256Sum may be overwritten or removed unless handled carefully, as the plugin often regenerates the properties file [7].
  • Checksums should be provided in hexadecimal format using lowercase letters [1][2].


Pin Maven Wrapper checksums to prevent unchecked downloads
ai-rag-service/.mvn/wrapper/maven-wrapper.properties doesn’t set distributionSha256Sum (and also lacks wrapperSha256Sum), so the wrapper can’t validate downloaded artifacts via checksums beyond TLS.
Apache Maven doesn’t publish SHA-256 for Maven 3.9.16 (it provides SHA-512 and PGP signatures); verify the official apache-maven-3.9.16-bin.zip using those, compute its SHA-256, then pin distributionSha256Sum (and wrapperSha256Sum) to enable checksum validation.
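
For the "compute its SHA-256" step, a small Java sketch is one option alongside the shasum and Get-FileHash commands quoted earlier. It assumes the archive has already been downloaded and verified against Apache's PGP/SHA-512 artifacts; the class name Sha256Pin and the default file name are illustrative assumptions, and HexFormat requires Java 17 or newer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper: prints a line that could be pasted into maven-wrapper.properties.
final class Sha256Pin {

    /** Streams the file through SHA-256 and returns the digest as lowercase hex. */
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        // HexFormat.of() formats with lowercase digits, matching the wrapper's expected notation.
        return HexFormat.of().formatHex(digest.digest());
    }

    public static void main(String[] args) throws Exception {
        // Assumed default path; pass the real location of the verified archive as the first argument.
        Path zip = Path.of(args.length > 0 ? args[0] : "apache-maven-3.9.16-bin.zip");
        System.out.println("distributionSha256Sum=" + sha256Hex(zip));
    }
}
```

The printed value is only as trustworthy as the file it was computed from, so the PGP or SHA-512 verification still has to happen first.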


Comment thread ai-rag-service/mvnw
Comment on lines +194 to +223
if [ -z "${MVNW_USERNAME-}" ] && command -v wget >/dev/null; then
verbose "Found wget ... using wget"
wget ${__MVNW_QUIET_WGET:+"$__MVNW_QUIET_WGET"} "$distributionUrl" -O "$TMP_DOWNLOAD_DIR/$distributionUrlName" || die "wget: Failed to fetch $distributionUrl"
elif [ -z "${MVNW_USERNAME-}" ] && command -v curl >/dev/null; then
verbose "Found curl ... using curl"
curl ${__MVNW_QUIET_CURL:+"$__MVNW_QUIET_CURL"} -f -L -o "$TMP_DOWNLOAD_DIR/$distributionUrlName" "$distributionUrl" || die "curl: Failed to fetch $distributionUrl"
elif set_java_home; then
verbose "Falling back to use Java to download"
javaSource="$TMP_DOWNLOAD_DIR/Downloader.java"
targetZip="$TMP_DOWNLOAD_DIR/$distributionUrlName"
cat >"$javaSource" <<-END
public class Downloader extends java.net.Authenticator
{
protected java.net.PasswordAuthentication getPasswordAuthentication()
{
return new java.net.PasswordAuthentication( System.getenv( "MVNW_USERNAME" ), System.getenv( "MVNW_PASSWORD" ).toCharArray() );
}
public static void main( String[] args ) throws Exception
{
setDefault( new Downloader() );
java.nio.file.Files.copy( java.net.URI.create( args[0] ).toURL().openStream(), java.nio.file.Paths.get( args[1] ).toAbsolutePath().normalize() );
}
}
END
# For Cygwin/MinGW, switch paths to Windows format before running javac and java
verbose " - Compiling Downloader.java ..."
"$(native_path "$JAVACCMD")" "$(native_path "$javaSource")" || die "Failed to compile Downloader.java"
verbose " - Running Downloader.java ..."
"$(native_path "$JAVACMD")" -cp "$(native_path "$TMP_DOWNLOAD_DIR")" Downloader "$distributionUrl" "$(native_path "$targetZip")"
fi


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when no download mechanism is available.

If wget/curl are missing and set_java_home returns non-zero, the script currently falls through to checksum/extraction with no downloaded archive. That produces a misleading unzip/tar failure instead of a clear bootstrap error.

🔧 Proposed fix
 if [ -z "${MVNW_USERNAME-}" ] && command -v wget >/dev/null; then
   verbose "Found wget ... using wget"
   wget ${__MVNW_QUIET_WGET:+"$__MVNW_QUIET_WGET"} "$distributionUrl" -O "$TMP_DOWNLOAD_DIR/$distributionUrlName" || die "wget: Failed to fetch $distributionUrl"
 elif [ -z "${MVNW_USERNAME-}" ] && command -v curl >/dev/null; then
   verbose "Found curl ... using curl"
   curl ${__MVNW_QUIET_CURL:+"$__MVNW_QUIET_CURL"} -f -L -o "$TMP_DOWNLOAD_DIR/$distributionUrlName" "$distributionUrl" || die "curl: Failed to fetch $distributionUrl"
 elif set_java_home; then
   verbose "Falling back to use Java to download"
   ...
+else
+  die "Need wget, curl, or a working JDK to download Maven"
 fi

Comment on lines +53 to +54
ChatClient.CallResponseSpec response = chatClient.prompt(prompt)
.call();


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Locate app config files:"
fd -i 'application*.properties|application*.yml|application*.yaml'

echo
echo "Search for Spring AI/Ollama timeout & retry settings:"
rg -n --iglob 'application*.{properties,yml,yaml}' 'spring\.ai\.ollama|timeout|connect-timeout|read-timeout|retry'

echo
echo "Search Java config for client timeout/retry customization:"
rg -n --type=java -C2 'ChatClient\.Builder|WebClient|RestClient|HttpClient|timeout|Retry'

Repository: Rd4dev/Mert

Length of output: 1310


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "All application*.properties/yml/yaml files:"
fd -i 'application*.properties|application*.yml|application*.yaml' .

echo
echo "Search for Ollama / Spring AI timeout & retry properties across config: "
rg -n --iglob '**/application*.{properties,yml,yaml}' '(spring\.ai\.ollama\..*|ollama\..*|timeout|connect-timeout|read-timeout|retry)' .

echo
echo
echo "Search Java for timeout/retry configuration in ChatClient / HTTP clients:"
rg -n --type=java -S '(ChatClient\.Builder|WebClient|RestClient|HttpClient|timeout|Timeout|retry|Retry)' ai-rag-service/src/main/java

Repository: Rd4dev/Mert

Length of output: 899


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Search for ChatClient.Builder creation/config in ai-rag-service:"
rg -n --type=java '`@Bean`|ChatClient\.Builder|ChatClient\.builder|spring\.ai\.ollama|Ollama' ai-rag-service/src/main/java

echo
echo "Search for any custom HttpClient/WebClient/timeout usage in ai-rag-service:"
rg -n --type=java '(WebClient|RestClient|HttpClient|timeout|Timeout|connect-timeout|read-timeout|retry|Retry)' ai-rag-service/src/main/java

Repository: Rd4dev/Mert

Length of output: 389


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Search whole repo for ChatClient usage:"
rg -n --type=java 'ChatClient(\.Builder|\.builder)?|spring\.ai\.ollama|Ollama' .

echo
echo
echo "Search for Spring `@Configuration` / `@Bean` creating ChatClient or HTTP client infrastructure:"
rg -n --type=java '(`@Configuration`|`@Bean`).*(ChatClient|ollama|Ollama|WebClient|RestClient|HttpClient|timeout|Retry|retry)' ai-rag-service/src/main/java

echo
echo "Search for any application-* property files:"
fd -i 'application*.{properties,yml,yaml}' ai-rag-service/src/main/resources

Repository: Rd4dev/Mert

Length of output: 1043


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java"
nl -ba "$FILE" | sed -n '1,120p'

Repository: Rd4dev/Mert

Length of output: 98


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java"
echo "First 140 lines of $FILE:"
cat -n "$FILE" | sed -n '1,140p'

Repository: Rd4dev/Mert

Length of output: 2683


Add explicit timeout/retry for the synchronous Ollama/Spring AI call

  • AssistantController makes a blocking call on the request thread: chatClient.prompt(prompt).call() (lines 53-54).
  • ai-rag-service/src/main/resources/application.properties only sets spring.ai.ollama.base-url and spring.ai.ollama.chat.model; no explicit timeout/retry (connection/read) configuration exists in config or Java.

Configure explicit client timeouts/retry policy for the Ollama/Spring AI call (and consider async/offloading) to avoid servlet-thread exhaustion on slow/hung model responses.
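
One way to bound the wait on a blocking call, sketched here without Spring so it stays self-contained: run the call on a separate executor and cap the wait with CompletableFuture.orTimeout. The Supplier stands in for chatClient.prompt(prompt).call(); the class name TimedCall, the fallback string, and the timeout values are assumptions for illustration, not settings from the PR.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical sketch: the Supplier is a stand-in for the blocking model call.
final class TimedCall {

    /**
     * Runs the blocking call on the given executor and waits at most `timeout`.
     * Returns `fallback` if the deadline passes; other failures are rethrown.
     * Note: the underlying task is not interrupted on timeout, it is only abandoned.
     */
    static String callWithTimeout(Supplier<String> blockingCall, Duration timeout,
                                  String fallback, ExecutorService executor) {
        CompletableFuture<String> future = CompletableFuture
                .supplyAsync(blockingCall, executor)
                .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS);
        try {
            return future.join();
        } catch (CompletionException e) {
            if (e.getCause() instanceof TimeoutException) {
                return fallback;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            String answer = callWithTimeout(() -> "model answer", Duration.ofSeconds(30),
                    "The assistant is taking too long, please retry.", executor);
            System.out.println(answer);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

This only limits how long the caller waits; the request thread is still occupied until the deadline, and the abandoned model call keeps running. Transport-level connect/read timeouts on the HTTP client would still be needed to actually release resources.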


@@ -0,0 +1,3 @@
package com.mert.airagservice.dto;

public record AskRequest(String question) { }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate question at the DTO boundary.

Line 3 currently accepts null/blank/unbounded input, so /ask can forward empty or oversized prompts to the model path. Add bean validation here and enforce it with @Valid on the controller method argument.

Proposed fix
 package com.mert.airagservice.dto;
 
+import jakarta.validation.constraints.NotBlank;
+import jakarta.validation.constraints.Size;
+
-public record AskRequest(String question) { }
+public record AskRequest(
+        @NotBlank(message = "question is required")
+        @Size(max = 1000, message = "question is too long")
+        String question
+) { }
// In ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java
public String ask(@Valid @RequestBody AskRequest request) throws IOException { ... }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ai-rag-service/src/main/java/com/mert/airagservice/dto/AskRequest.java` at
line 3, Add Bean Validation to the AskRequest DTO and enforce it on the
controller: annotate the record component in AskRequest (the String question)
with validation annotations such as `@NotBlank` and `@Size`(max = 1000) (e.g.,
public record AskRequest(`@NotBlank` `@Size`(max = 1000) String question) { }) and
update the AssistantController.ask method signature to accept a validated body
by adding `@Valid` and `@RequestBody` to the parameter (e.g., public String
ask(`@Valid` `@RequestBody` AskRequest request) throws IOException { ... }), so
empty or oversized prompts are rejected at the DTO boundary.

Comment on lines +2 to +5
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.model=qwen2.5-coder:7b

server.port=4003

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Externalize runtime configuration instead of hardcoding local endpoints.

Hardcoding Ollama host/model and server port makes this service fragile outside local dev. In non-local environments, requests will fail unless the exact localhost setup is replicated.

Suggested fix
 spring.application.name=ai-rag-service
-spring.ai.ollama.base-url=http://localhost:11434
-spring.ai.ollama.chat.model=qwen2.5-coder:7b
+spring.ai.ollama.base-url=${OLLAMA_BASE_URL:http://localhost:11434}
+spring.ai.ollama.chat.model=${OLLAMA_CHAT_MODEL:qwen2.5-coder:7b}
 
-server.port=4003
+server.port=${SERVER_PORT:4003}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ai-rag-service/src/main/resources/application.properties` around lines 2 - 5,
The properties file currently hardcodes spring.ai.ollama.base-url,
spring.ai.ollama.chat.model and server.port; change these to use externalized
configuration placeholders (e.g., ${OLLAMA_BASE_URL}, ${OLLAMA_CHAT_MODEL},
${SERVER_PORT}) so values come from environment variables or a config server at
runtime, and provide sensible local defaults only for development; update any
code reading these properties (references to spring.ai.ollama.base-url and
spring.ai.ollama.chat.model) to expect externalized values rather than hardcoded
constants.

@Rd4dev

Rd4dev commented Jun 13, 2026

Owner Author

Additional Context on the runs:

When the prompt context is small (3 lines), the response time was around 4.7 seconds:

promptTokens=1
completionTokens=14
totalTokens=15
Latency ms = 4694

Once the PDF content was added to the prompt context, latency rose to about 338 seconds, roughly 72x the small-context run:

promptTokens=4091
completionTokens=321
totalTokens=4412
LATENCY ms = 337681

@Rd4dev

Rd4dev commented Jun 13, 2026

Owner Author

correct but inefficient

… retrieval with LLM

Initially every call was very slow, over 5 to 6 minutes, because both the document chunks and the question were embedded on each request. Caching the chunk embeddings brings this down to 24 to 90 seconds; the question still has to be embedded on every call, so that cost remains. I strongly suspect that most of the remaining time goes into the local model generating the full answer, which is returned all at once, rather than into the time before it starts producing output: with shorter outputs, an uncached question returned in 8 seconds and a cached question in 3 seconds. That is still on the slower end, but all of this is running on the CPU.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java`:
- Around line 34-37: In AssistantController.ask validate request.question() for
null or blank before calling ragService.buildContext: check that
AskRequest.request.question() is non-null and not empty after trim, and if
invalid throw a client error (e.g. throw new
ResponseStatusException(HttpStatus.BAD_REQUEST, "question must be provided") or
return a 400 response) so embedding is never called with empty input; update the
validation at the start of the ask method to perform this guard and include a
clear error message.

In
`@ai-rag-service/src/main/java/com/mert/airagservice/rag/CosineSimilarity.java`:
- Around line 4-17: The CosineSimilarity.score method must validate inputs:
check that a and b are non-null and have the same length (throw
IllegalArgumentException with a clear message if not), compute dot, normA and
normB as before, and if either norm is zero return 0.0 (instead of performing
the division) to avoid NaN/Infinity; otherwise return dot / (Math.sqrt(normA) *
Math.sqrt(normB)). Ensure these guards are applied at the start of score(float[]
a, float[] b) and keep the core loop as-is once validation passes.

In `@ai-rag-service/src/main/java/com/mert/airagservice/service/RagService.java`:
- Around line 16-17: The chunkEmbeddingCache is declared as a non-thread-safe
HashMap in RagService and is mutated during request handling (symbol:
chunkEmbeddingCache in class RagService), which can race under concurrent
requests; replace the HashMap with a thread-safe map (e.g., new
ConcurrentHashMap<>()) and, where the cache is updated (the mutation sites that
currently call put/contains/putIfAbsent around chunkEmbeddingCache), use atomic
map operations such as computeIfAbsent or putIfAbsent to avoid races; update
imports and ensure no other code assumes HashMap-specific behavior.
- Around line 29-39: The comparator in RagService (when building topChunks)
repeatedly recomputes embeddings and CosineSimilarity.score during sorting;
precompute each chunk's embedding and its cosine score against questionVector
once (e.g., build a Map<String, Float[]> or List of pairs and a Map<String,
Double> scores using chunkEmbeddingCache.computeIfAbsent and
CosineSimilarity.score), then sort chunks using those cached scores (sorting by
the precomputed scoreB/scoreA) and collect the top results; update the code that
references chunkEmbeddingCache, embeddingService.embed, CosineSimilarity.score,
and the topChunks construction to use the precomputed score lookup rather than
computing inside the comparator.
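
The two RagService points above (a thread-safe cache and scoring each chunk once before sorting) can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the PR's implementation: `ChunkRanker`, `topChunks`, and the `Function<String, float[]>` standing in for `embeddingService.embed` are hypothetical names, and `k` generalizes the "top 3" used in the PR.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: score every chunk once, then sort on the precomputed scores. */
final class ChunkRanker {
    // Thread-safe cache, so concurrent requests can populate it without racing.
    private final Map<String, float[]> chunkEmbeddingCache = new ConcurrentHashMap<>();
    private final Function<String, float[]> embedder; // stands in for embeddingService::embed

    ChunkRanker(Function<String, float[]> embedder) {
        this.embedder = embedder;
    }

    /** Returns the k chunks most similar to the question vector, best first. */
    List<String> topChunks(List<String> chunks, float[] questionVector, int k) {
        return chunks.stream()
                // One embedding lookup and one similarity computation per chunk.
                .map(chunk -> Map.entry(
                        chunk,
                        cosine(questionVector, chunkEmbeddingCache.computeIfAbsent(chunk, embedder))))
                // The comparator only reads the stored scores; nothing is recomputed here.
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }

    /** Cosine similarity with the guards suggested in the CosineSimilarity comment. */
    static double cosine(float[] a, float[] b) {
        if (a == null || b == null || a.length != b.length) {
            throw new IllegalArgumentException("Vectors must be non-null and of equal dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        return denominator == 0d ? 0d : dot / denominator;
    }
}
```

One design caveat: `ConcurrentHashMap.computeIfAbsent` runs the mapping function while holding a lock on part of the map, so a slow embedding call can briefly block other writers. Precomputing all chunk embeddings at startup would avoid that on the request path.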
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 1cac9d3b-c095-4b3b-bd43-d6214d036067

📥 Commits

Reviewing files that changed from the base of the PR and between 14fec71 and 4b1353d.

📒 Files selected for processing (6)
  • ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java
  • ai-rag-service/src/main/java/com/mert/airagservice/rag/CosineSimilarity.java
  • ai-rag-service/src/main/java/com/mert/airagservice/service/EmbeddingService.java
  • ai-rag-service/src/main/java/com/mert/airagservice/service/PdfKnowledgeService.java
  • ai-rag-service/src/main/java/com/mert/airagservice/service/RagService.java
  • ai-rag-service/src/main/resources/application.properties
🚧 Files skipped from review as they are similar to previous changes (1)
  • ai-rag-service/src/main/resources/application.properties

Comment on lines +34 to +37
public String ask(@RequestBody AskRequest request) throws IOException {
long start = System.currentTimeMillis();
String context = ragService.buildContext(request.question());

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate question before calling RAG.

Line [36] dereferences request.question() without guards; null/blank input can cascade into embedding failures and return 500 instead of a client error.

Suggested minimal fix
 import org.springframework.web.bind.annotation.PostMapping;
 import org.springframework.web.bind.annotation.RequestBody;
 import org.springframework.web.bind.annotation.RestController;
+import org.springframework.web.server.ResponseStatusException;
+import org.springframework.http.HttpStatus;
@@
     @PostMapping("/ask")
     public String ask(@RequestBody AskRequest request) throws IOException {
+        if (request == null || request.question() == null || request.question().isBlank()) {
+            throw new ResponseStatusException(HttpStatus.BAD_REQUEST, "question must not be blank");
+        }
         long start = System.currentTimeMillis();
         String context = ragService.buildContext(request.question());
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java`
around lines 34 - 37, In AssistantController.ask validate request.question() for
null or blank before calling ragService.buildContext: check that
AskRequest.request.question() is non-null and not empty after trim, and if
invalid throw a client error (e.g. throw new
ResponseStatusException(HttpStatus.BAD_REQUEST, "question must be provided") or
return a 400 response) so embedding is never called with empty input; update the
validation at the start of the ask method to perform this guard and include a
clear error message.

Comment on lines +4 to +17
public static double score(float[] a, float[] b) {

double dot = 0;
double normA = 0;
double normB = 0;

for (int i = 0; i < a.length; i++) {
dot += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}

return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard vector dimensions and zero-norm denominator.

Line [10]-[16] assumes equal lengths and non-zero norms. Mismatched arrays can crash; zero vectors produce NaN/Infinity and corrupt ranking behavior.

Suggested fix
 public class CosineSimilarity {
     public static double score(float[] a, float[] b) {
+        if (a == null || b == null) {
+            throw new IllegalArgumentException("Vectors must not be null");
+        }
+        if (a.length != b.length) {
+            throw new IllegalArgumentException("Vectors must have the same dimension");
+        }
 
         double dot = 0;
         double normA = 0;
         double normB = 0;
@@
-        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
+        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
+        if (denominator == 0d) {
+            return 0d;
+        }
+        return dot / denominator;
     }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ai-rag-service/src/main/java/com/mert/airagservice/rag/CosineSimilarity.java`
around lines 4 - 17, The CosineSimilarity.score method must validate inputs:
check that a and b are non-null and have the same length (throw
IllegalArgumentException with a clear message if not), compute dot, normA and
normB as before, and if either norm is zero return 0.0 (instead of performing
the division) to avoid NaN/Infinity; otherwise return dot / (Math.sqrt(normA) *
Math.sqrt(normB)). Ensure these guards are applied at the start of score(float[]
a, float[] b) and keep the core loop as-is once validation passes.
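
To make the "corrupt ranking" concern concrete, here is a small sketch of what a zero vector does to a descending sort. `ZeroNormDemo` and its method names are hypothetical, and the sketch assumes the ranking orders scores with `Double`'s natural ordering (for example `Comparator.reverseOrder()` or `Double.compare`), which treats NaN as greater than every other value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical demo of how an unguarded zero-norm score affects ranking. */
final class ZeroNormDemo {
    /** Cosine similarity without guards, mirroring the reviewed loop. */
    static double unguarded(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB)); // 0.0 / 0.0 is NaN for a zero vector
    }

    /** Same computation with the suggested null, dimension, and zero-denominator guards. */
    static double guarded(float[] a, float[] b) {
        if (a == null || b == null) {
            throw new IllegalArgumentException("Vectors must not be null");
        }
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        return denominator == 0d ? 0d : dot / denominator;
    }

    /** Sorts scores best-first using Double's natural ordering, reversed. */
    static List<Double> rankDescending(double... scores) {
        List<Double> sorted = new ArrayList<>();
        for (double s : scores) {
            sorted.add(s);
        }
        sorted.sort(Comparator.reverseOrder());
        return sorted;
    }
}
```

With a zero vector, `unguarded` returns NaN, and `rankDescending(0.9, NaN, 0.2)` places the NaN entry first, so a degenerate chunk would outrank every real match. The guarded version returns 0.0 for the same input, which sorts last among non-negative scores.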

Comment thread ai-rag-service/src/main/java/com/mert/airagservice/service/RagService.java Outdated
@Rd4dev

Rd4dev commented Jun 13, 2026

Owner Author

After chunking with top 3 scores

Initial Run

promptTokens=599, completionTokens=11, totalTokens=610
LATENCY ms = 80501

Subsequent run

promptTokens=1, completionTokens=11, totalTokens=12
LATENCY ms = 3474

Loading everything on the fly is expensive, especially on the first query, so that work is moving to a `@PostConstruct` step. Since everything still runs on the CPU, each request remains slow and embedding the question takes time, but most of the latency goes into generating the output.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ai-rag-service/src/main/java/com/mert/airagservice/service/RagService.java`:
- Around line 42-43: buildContext currently calls
embeddingService.embed(question) without validating the input; add a guard at
the start of the buildContext(String question) method to check for null or blank
(e.g., question == null || question.trim().isEmpty()) and throw a clear
IllegalArgumentException (or a service-level BadRequestException) with a message
like "question must not be null or blank" so embeddingService.embed(...) is
never invoked for invalid input; this prevents runtime/model errors when
AskRequest.question() is missing or empty.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6d9fe252-9c73-49cb-bd1f-8b98aaa11033

📥 Commits

Reviewing files that changed from the base of the PR and between 4b1353d and b03413c.

📒 Files selected for processing (3)
  • ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java
  • ai-rag-service/src/main/java/com/mert/airagservice/service/PdfKnowledgeService.java
  • ai-rag-service/src/main/java/com/mert/airagservice/service/RagService.java
💤 Files with no reviewable changes (2)
  • ai-rag-service/src/main/java/com/mert/airagservice/controller/AssistantController.java
  • ai-rag-service/src/main/java/com/mert/airagservice/service/PdfKnowledgeService.java

Comment on lines +42 to +43
public String buildContext(String question) {
float[] questionVector = embeddingService.embed(question);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate question before embedding to prevent avoidable request-time failures.

Line [43] sends question directly to the embedding model. Since AskRequest.question() is not validated in the provided path, null/blank payloads can trigger model/runtime errors and fail /ask with a server error.

Suggested patch
 public String buildContext(String question) {
+    if (question == null || question.isBlank()) {
+        throw new IllegalArgumentException("question must not be null or blank");
+    }
+
     float[] questionVector = embeddingService.embed(question);
     List<String> topChunks = chunks.stream()
         .map(chunk -> Map.entry(
                 chunk,
                 CosineSimilarity.score(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ai-rag-service/src/main/java/com/mert/airagservice/service/RagService.java`
around lines 42 - 43, buildContext currently calls
embeddingService.embed(question) without validating the input; add a guard at
the start of the buildContext(String question) method to check for null or blank
(e.g., question == null || question.trim().isEmpty()) and throw a clear
IllegalArgumentException (or a service-level BadRequestException) with a message
like "question must not be null or blank" so embeddingService.embed(...) is
never invoked for invalid input; this prevents runtime/model errors when
AskRequest.question() is missing or empty.

@Rd4dev

Rd4dev commented Jun 15, 2026

Owner Author

Todo:

  1. Trace down retrieval
  2. Fix chunking
    • sentence based or
    • section based or
    • paragraph based
  3. Increase top 3 to a larger number (similarity scores are already computed for every chunk, so ranking costs nothing extra)
    • though it increases the size of the prompt sent to the LLM
  4. Re-rank
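
As a possible starting point for the paragraph-based option in item 2, a minimal sketch follows. `ParagraphChunker` and `maxChars` are hypothetical names, and the sketch assumes the extracted PDF text separates paragraphs with blank lines, which may not hold for every extractor.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: split on blank lines, then pack paragraphs into size-bounded chunks. */
final class ParagraphChunker {
    /**
     * Packs consecutive paragraphs into chunks of at most maxChars characters.
     * A single paragraph longer than maxChars is kept whole as its own oversized chunk.
     */
    static List<String> chunk(String text, int maxChars) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        // A paragraph break is a line break, optional whitespace, then another line break.
        for (String paragraph : text.split("\\R\\s*\\R")) {
            String p = paragraph.strip();
            if (p.isEmpty()) {
                continue;
            }
            // Flush the current chunk if adding this paragraph (plus separator) would overflow.
            if (current.length() > 0 && current.length() + 2 + p.length() > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append("\n\n");
            }
            current.append(p);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }
}
```

Keeping whole paragraphs together avoids cutting a policy clause mid-sentence, at the cost of uneven chunk sizes. Sentence-based or section-based splitting would change only the `split` step.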

Development

Successfully merging this pull request may close these issues.

RAG implementation to support LLM based support system
