Refactor to run DNS query in parallel for speed up#36
Refactor to run DNS query in parallel for speed up#36PeterDaveHello wants to merge 1 commit intomasterfrom
Conversation
📝 WalkthroughWalkthroughThe pull request introduces a temporary directory management system for the DNS checking script. A new temporary directory is created at the script's start, with a cleanup function defined to remove the directory and terminate background jobs upon script exit or interruption. The DNS checking functions are modified to run domain checks asynchronously, using background processes and temporary files to store results. This approach allows for concurrent processing of multiple DNS queries, improving the script's efficiency in handling domain checks. Changes
Sequence DiagramsequenceDiagram
participant Script as Main Script
participant TempDir as Temp Directory
participant DNSCheck as DNS Checker
Script->>TempDir: Create temporary directory
Script->>DNSCheck: Start background DNS checks
DNSCheck-->>TempDir: Write results to temp files
Script->>TempDir: Wait for result files
TempDir-->>Script: Read DNS check results
Script->>TempDir: Clean up temporary directory
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (2)
chkdm (2)
206-282: Reduce code duplication and improve parallelizationThe DNS checking functions share similar code patterns. Consider extracting the common logic into a reusable function and running all DNS types in parallel.
+function runParallelDNSChecks() { + local -n dns_servers=$1 # nameref to DNS server array + local dns_type=$2 + local filter_type=$3 + local keys=() + + while IFS='' read -r line; do + keys+=("$line") + done < <(for DNS in "${!dns_servers[@]}"; do echo "$DNS"; done | sort) + + echo.Cyan "\nRunning dig/nslookup over ${#dns_servers[@]} $dns_type DNS:" + + # Launch all queries in parallel + for DNS in "${keys[@]}"; do + ( + if ! result=$(chkDomain "$domain" "${dns_servers[$DNS]}" "$filter_type"); then + echo "Error: DNS query failed" > "$tmp_dir/${dns_type}_$DNS" + exit 1 + fi + echo "$result" > "$tmp_dir/${dns_type}_$DNS" + ) & + done + + # Collect results + for DNS in "${keys[@]}"; do + printf " - %s ... " "$DNS $(echo.BoldBlack "(${dns_servers[$DNS]})")" + if ! timeout 3 bash -c "until [ -f '$tmp_dir/${dns_type}_$DNS' ]; do printf '.'; sleep 0.1; done"; then + echo "Timeout waiting for DNS response" + continue + fi + printf "\r" + printf " - %s ... " "$DNS $(echo.BoldBlack "(${dns_servers[$DNS]})")" + cat "$tmp_dir/${dns_type}_$DNS" + done +}Usage:
-checkNofilterDNS -checkSecureDNS -checkAdblockDNS +# Run all DNS checks in parallel +( + runParallelDNSChecks nofilterDNS "nofilter" noFilterDetect +) & +( + runParallelDNSChecks secureDNS "secure" filterDetect +) & +( + runParallelDNSChecks adblockDNS "adblock" filterDetect +) & +wait
206-282: Add rate limiting for parallel DNS queriesRunning too many parallel DNS queries might trigger rate limiting or be considered abusive by DNS servers. Consider adding a limit to concurrent queries.
+# Add at the beginning of the script +max_concurrent_queries=5 + for DNS in "${keys[@]}"; do + # Limit concurrent queries + while [ "$(jobs -p | wc -l)" -ge "$max_concurrent_queries" ]; do + sleep 0.1 + done ( result=$(chkDomain "$domain" "${dns_servers[$DNS]}" "$filter_type") echo "$result" > "$tmp_dir/${dns_type}_$DNS" ) & done
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
chkdm(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Travis CI - Pull Request
| for DNS in "${keys[@]}"; do | ||
| ( | ||
| result=$(chkDomain "$domain" "${nofilterDNS[$DNS]}" noFilterDetect) | ||
| echo "$result" > "$tmp_dir/nofilterDNS_$DNS" | ||
| ) & | ||
| done |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Add error handling for background processes
The background processes lack proper error handling. If the chkDomain command fails, the error is silently ignored and the temp file might not be created, leading to timeouts.
Example fix for one of the functions (apply similar pattern to others):
for DNS in "${keys[@]}"; do
(
- result=$(chkDomain "$domain" "${nofilterDNS[$DNS]}" noFilterDetect)
- echo "$result" > "$tmp_dir/nofilterDNS_$DNS"
+ if ! result=$(chkDomain "$domain" "${nofilterDNS[$DNS]}" noFilterDetect); then
+ echo "Error: DNS query failed" > "$tmp_dir/nofilterDNS_$DNS"
+ exit 1
+ fi
+ echo "$result" > "$tmp_dir/nofilterDNS_$DNS"
) &
doneAlso applies to: 235-240, 263-268, 303-307
| for ((i=0; i<10; i++)); do | ||
| printf "." | ||
| if [ -f "$tmp_dir/nofilterDNS_$DNS" ]; then | ||
| break | ||
| fi | ||
| sleep 0.3 | ||
| done |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Replace busy waiting with efficient result collection
The current implementation uses busy waiting with sleep, which is inefficient and can lead to unnecessary delays. Consider using wait -n for more efficient process completion detection.
-for ((i=0; i<10; i++)); do
- printf "."
- if [ -f "$tmp_dir/nofilterDNS_$DNS" ]; then
- break
- fi
- sleep 0.3
-done
+timeout=3 # Configurable timeout in seconds
+end_time=$((SECONDS + timeout))
+while [ $SECONDS -lt $end_time ]; do
+ if wait -n -p DONE_PID 2>/dev/null; then
+ if [ -f "$tmp_dir/nofilterDNS_$DNS" ]; then
+ break
+ fi
+ fi
+ printf "."
+ sleep 0.1
+doneAlso applies to: 244-250, 272-278, 311-317
There was a problem hiding this comment.
Why sleep 0.1 instead of sleep 0.3 🤔 ?
There was a problem hiding this comment.
@PeterDaveHello The shorter sleep duration (0.1s) was suggested because the new code uses wait -n to actively detect when a process completes, rather than just polling for file existence. This allows us to make the progress indicator dots appear smoother while still being efficient. However, the sleep duration is completely configurable based on your preference - if you prefer 0.3s for any reason, that would work just as well with the new approach! Would you like to keep it at 0.3s?
(`・ω・´)ゞ
There was a problem hiding this comment.
but I don't want to use an infinite loop approach
There was a problem hiding this comment.
@PeterDaveHello Understood! You're right - let's keep the current approach with the fixed number of iterations. It's simpler and more predictable. I'll withdraw my suggestion about using wait -n with the timeout-based loop.
✏️ Learnings added
Learnt from: PeterDaveHello
PR: PeterDaveHello/chkdomain#36
File: chkdm:216-222
Timestamp: 2025-01-14T16:39:11.347Z
Learning: In the chkdomain project, prefer using fixed iteration loops over infinite loops with timeouts for process completion detection, as per maintainer's preference.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| tmp_dir="/tmp/chkdm_$$" | ||
| mkdir -p "$tmp_dir" |
There was a problem hiding this comment.
Use mktemp for secure temporary directory creation
The current implementation using a PID-based name is predictable and could lead to race conditions. Use mktemp to create a secure temporary directory:
-tmp_dir="/tmp/chkdm_$$"
-mkdir -p "$tmp_dir"
+tmp_dir="$(mktemp -d)"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| tmp_dir="/tmp/chkdm_$$" | |
| mkdir -p "$tmp_dir" | |
| tmp_dir="$(mktemp -d)" |
There was a problem hiding this comment.
Would you please explain in more detail?
There was a problem hiding this comment.
@PeterDaveHello Let me explain the security concerns with using PID-based temporary directory names:
-
Predictability: Using
$$creates a predictable directory name based on the process ID. PIDs are usually sequential and limited in range (e.g., 1-32768), making it possible for an attacker to guess or enumerate possible directory names. -
Race Conditions: There's a time gap between checking if a directory exists and creating it (Time-of-check to time-of-use or TOCTOU vulnerability). An attacker could:
- Create the directory after your check but before your creation
- Create a symlink to a sensitive location
- Potentially redirect or intercept the DNS query results
-
Solution:
mktemp -dsolves these issues by:- Creating an unpredictable directory name using random characters
- Atomically creating the directory (no race condition)
- Setting secure permissions automatically
- Guaranteeing uniqueness
This is particularly important in your script because it handles DNS queries, and compromising the temporary files could lead to DNS spoofing or information disclosure.
✏️ Learnings added
Learnt from: PeterDaveHello
PR: PeterDaveHello/chkdomain#36
File: chkdm:104-105
Timestamp: 2025-01-13T17:49:56.332Z
Learning: Use `mktemp -d` instead of PID-based names (`$$`) when creating temporary directories in shell scripts. `mktemp` provides secure, atomic directory creation with unpredictable names, preventing race conditions and symlink attacks.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| tmp_dir="/tmp/chkdm_$$" | ||
| mkdir -p "$tmp_dir" | ||
| trap 'cleanUp' EXIT SIGINT SIGTERM | ||
|
|
||
| function cleanUp() { | ||
| rm -rf "$tmp_dir" | ||
| jobs -p | xargs kill -9 2>/dev/null | ||
| exit | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Enhance security of temporary files
The temporary files are created without explicit permissions, which could lead to information disclosure on shared systems.
tmp_dir="$(mktemp -d)"
+# Ensure strict permissions on temporary directory
+chmod 700 "$tmp_dir"
+
+# Use umask to ensure secure file creation
+old_umask=$(umask)
+umask 077
+trap 'umask $old_umask; cleanUp' EXIT SIGINT SIGTERMCommittable suggestion skipped: line range outside the PR's diff.
| function cleanUp() { | ||
| rm -rf "$tmp_dir" | ||
| jobs -p | xargs kill -9 2>/dev/null | ||
| exit | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Improve cleanup function robustness
The cleanup function has several issues:
- Using
kill -9is too aggressive and doesn't allow processes to clean up - Not handling errors during cleanup
- Not checking if processes are actually terminated
function cleanUp() {
+ local pids
+ # First try graceful termination
+ if pids=$(jobs -p); then
+ [ -n "$pids" ] && kill $pids 2>/dev/null
+ # Wait a bit for graceful shutdown
+ sleep 0.5
+ # Force kill only if still running
+ pids=$(jobs -p)
+ [ -n "$pids" ] && kill -9 $pids 2>/dev/null
+ fi
+ [ -d "$tmp_dir" ] && rm -rf "$tmp_dir"
exit
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function cleanUp() { | |
| rm -rf "$tmp_dir" | |
| jobs -p | xargs kill -9 2>/dev/null | |
| exit | |
| } | |
| function cleanUp() { | |
| local pids | |
| # First try graceful termination | |
| if pids=$(jobs -p); then | |
| [ -n "$pids" ] && kill $pids 2>/dev/null | |
| # Wait a bit for graceful shutdown | |
| sleep 0.5 | |
| # Force kill only if still running | |
| pids=$(jobs -p) | |
| [ -n "$pids" ] && kill -9 $pids 2>/dev/null | |
| fi | |
| [ -d "$tmp_dir" ] && rm -rf "$tmp_dir" | |
| exit | |
| } |
|
/review |
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
🚀
Summary by CodeRabbit
New Features
Performance