Merged
30 commits
* `81f4ede` feat: add `soaxreport` tool for ECH testing via SOAX proxies (jyyi1, Jan 20, 2026)
* `5b5acd9` Refactor SOAX client into `internal/soax` (jyyi1, Jan 21, 2026)
* `72f8229` Implement a reusable internal/curl package (jyyi1, Jan 21, 2026)
* `e162e9e` implement performance stats collection via `curl -w` (jyyi1, Jan 27, 2026)
* `413f1d8` implement ECH testing with SOAX proxy (jyyi1, Jan 27, 2026)
* `1787668` Implement concurrent SOAX ECH testing (jyyi1, Jan 29, 2026)
* `fa02c6f` Refactor to use CSV country list and include country names (jyyi1, Jan 29, 2026)
* `ce7f872` Ensure same exit node for one ISP thru sticky session. (jyyi1, Jan 29, 2026)
* `0888d55` Implement atomic progress tracking and simplify test func (jyyi1, Feb 2, 2026)
* `79dd375` Separate input ISP and exit node ISP header into different columns. (jyyi1, Feb 2, 2026)
* `f76ed9c` Update parallelism default value in README for soaxreport (jyyi1, Feb 4, 2026)
* `7306193` Add initial analysis report for www.google.com (jyyi1, Feb 4, 2026)
* `98df4fe` Add initial analysis report for mail.google.com (jyyi1, Feb 4, 2026)
* `668f0bc` Add initial analysis report for www.youtube.com (jyyi1, Feb 4, 2026)
* `eb8dc13` Update analysis report for www.google.com (jyyi1, Feb 4, 2026)
* `0a9f9dd` Update divergence chart to ISP based (jyyi1, Feb 4, 2026)
* `5f35f45` Force use main's refactored soaxreport and internal/soax files (jyyi1, Mar 31, 2026)
* `0a5fb04` Restore soaxreport/report and untrack GEMINI.private.md (jyyi1, Mar 31, 2026)
* `4908a8d` Remove redundant internal/curl package (jyyi1, Mar 31, 2026)
* `4ea6723` feat: implement real exit IP discovery for soaxreport (jyyi1, Mar 31, 2026)
* `07c07bc` feat: add geodb asn validation (jyyi1, Mar 31, 2026)
* `7aa573c` docs: update soaxreport README with IP discovery and GeoDB ASN valida… (jyyi1, Mar 31, 2026)
* `a2ee878` refactor(soaxreport): replace analyze.py with Jupyter notebook and up… (jyyi1, Apr 1, 2026)
* `57383e7` docs(soaxreport): add instructions for generating the final report (jyyi1, Apr 1, 2026)
* `0b1ad76` chore: rename soaxreport to ispreport (jyyi1, Apr 3, 2026)
* `9ab2b5e` refactor(ispreport): move output to workspace/ispreport and fix docs (jyyi1, Apr 3, 2026)
* `dd01cd3` feat: use curl json output to capture detailed proxy metrics (jyyi1, Apr 3, 2026)
* `0cf1def` privacy(ispreport): remove raw IP addresses from CSV and struct (jyyi1, Apr 3, 2026)
* `c315f39` docs, data: update ispreport README, AGENTS.md, and results CSVs to r… (jyyi1, Apr 6, 2026)
* `3314c2a` Update report.ipynb with Cloudflare dataset results and enhanced visu… (jyyi1, Apr 6, 2026)
3 changes: 2 additions & 1 deletion AGENTS.md
@@ -14,10 +14,11 @@ You are an expert in data analysis and networking protocols, with a deep underst

This project provides a suite of tools for analyzing the deployment and impact of DNS HTTPS resource records (RRs) and Encrypted ClientHello (ECH). The primary goal is to gather data on DNS latency, service support for ECH and related standards, and potential network interference.

The project is composed of two main Go-based command-line tools:
The project is composed of three main Go-based command-line tools:

1. **`dnsreport`**: Performs large-scale DNS analysis by querying a list of top domains (from the Tranco list) for A, AAAA, and HTTPS records. See `dnsreport/README.md` for more details.
2. **`greasereport`**: Tests ECH GREASE compatibility by issuing HEAD requests to top domains with and without ECH GREASE enabled, using a custom ECH-enabled `curl` binary. It also generates a report summarizing the findings. See `greasereport/README.md` for more details.
3. **`ispreport`**: Tests ISP-level ECH GREASE interference via proxies. It issues HEAD requests to target domains through various ISPs across different countries.

## Workspace

1 change: 1 addition & 0 deletions go.mod
@@ -4,6 +4,7 @@ go 1.24.8

require (
github.com/miekg/dns v1.1.70
github.com/oschwald/maxminddb-golang v1.13.1
golang.getoutline.org/sdk/x v0.1.0
golang.org/x/sync v0.19.0
)
2 changes: 2 additions & 0 deletions go.sum
@@ -23,6 +23,8 @@ github.com/onsi/ginkgo/v2 v2.12.0 h1:UIVDowFPwpg6yMUpPjGkYvf06K3RAiJXUhCxEwQVHRI
github.com/onsi/ginkgo/v2 v2.12.0/go.mod h1:ZNEzXISYlqpb8S36iN71ifqLi3vVD1rVJGvWRCJOUpQ=
github.com/onsi/gomega v1.27.10 h1:naR28SdDFlqrG6kScpT8VWpu1xWY5nJRCF3XaYyBjhI=
github.com/onsi/gomega v1.27.10/go.mod h1:RsS8tutOdbdgzbPtzzATp12yT7kM5I5aElG3evPbQ0M=
github.com/oschwald/maxminddb-golang v1.13.1 h1:G3wwjdN9JmIK2o/ermkHM+98oX5fS+k5MbwsmL4MRQE=
github.com/oschwald/maxminddb-golang v1.13.1/go.mod h1:K4pgV9N/GcK694KSTmVSDTODk4IsCNThNdTmnaBZ/F8=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/quic-go/qpack v0.5.1 h1:giqksBPnT/HDtZ6VhtFKgoLOWmlyo9Ei6u9PqzIMbhI=
93 changes: 54 additions & 39 deletions internal/echtest/run.go
@@ -16,6 +16,7 @@ package echtest

import (
"bytes"
"encoding/json"
"fmt"
"os"
"os/exec"
@@ -26,18 +27,24 @@ import (
)

type TestResult struct {
Domain string
ECHGrease bool
Error string
CurlExitCode int
CurlErrorName string
Domain string
ECHGrease bool

GoError string
CurlExitCode int
CurlErrorName string
CurlErrorMessage string

DNSLookup time.Duration
TCPConnection time.Duration
TLSHandshake time.Duration
ServerTime time.Duration
TotalTime time.Duration
HTTPStatus int
Stderr string

HTTPStatus int
HTTPConnectStatus int

Stderr string
}

// curlExitCodeNames maps curl exit codes to their CURL_* string representations.
@@ -145,8 +152,7 @@ func Run(
targetURL := "https://" + domain

args := []string{
"-w",
"dnslookup:%{time_namelookup},tcpconnect:%{time_connect},tlsconnect:%{time_appconnect},servertime:%{time_starttransfer},total:%{time_total},httpstatus:%{http_code}",
"-w", ":::BEGIN_JSON:::%{json}:::END_JSON:::",
"--head",
"--max-time",
strconv.FormatFloat(maxTime.Seconds(), 'f', -1, 64),
@@ -198,43 +204,52 @@
result.CurlExitCode = exitError.ExitCode()
result.CurlErrorName = curlExitCodeNames[result.CurlExitCode]
} else {
result.Error = fmt.Sprintf("failed to execute curl: %v", err)
result.GoError = fmt.Sprintf("failed to execute curl: %v", err)
return result
}
} else {
// Even if err is nil, there might be curl-level errors recorded in stderr
// that the caller might be interested in, though standard execution succeeded.
}

// parse the stdout stats
parts := strings.SplitSeq(stdout.String(), ",")
for part := range parts {
kv := strings.Split(part, ":")
if len(kv) != 2 {
continue
}
key := kv[0]
value := kv[1]

switch key {
case "dnslookup":
f, _ := strconv.ParseFloat(value, 64)
result.DNSLookup = time.Duration(f * float64(time.Second))
case "tcpconnect":
f, _ := strconv.ParseFloat(value, 64)
result.TCPConnection = time.Duration(f * float64(time.Second))
case "tlsconnect":
f, _ := strconv.ParseFloat(value, 64)
result.TLSHandshake = time.Duration(f * float64(time.Second))
case "servertime":
f, _ := strconv.ParseFloat(value, 64)
result.ServerTime = time.Duration(f * float64(time.Second))
case "total":
f, _ := strconv.ParseFloat(value, 64)
result.TotalTime = time.Duration(f * float64(time.Second))
case "httpstatus":
i, _ := strconv.Atoi(value)
result.HTTPStatus = i
// Parse JSON output
if stdout.Len() > 0 {
outStr := stdout.String()
const startMarker = ":::BEGIN_JSON:::"
const endMarker = ":::END_JSON:::"
startIndex := strings.Index(outStr, startMarker)
endIndex := strings.Index(outStr, endMarker)

if startIndex != -1 && endIndex != -1 && endIndex > startIndex {
jsonStr := outStr[startIndex+len(startMarker) : endIndex]
var curlOut struct {
TimeNamelookup float64 `json:"time_namelookup"`
TimeConnect float64 `json:"time_connect"`
TimeAppconnect float64 `json:"time_appconnect"`
TimeStarttransfer float64 `json:"time_starttransfer"`
TimeTotal float64 `json:"time_total"`
HTTPCode int `json:"http_code"`
HTTPConnect int `json:"http_connect"`
Errormsg string `json:"errormsg"`
}
if err := json.Unmarshal([]byte(jsonStr), &curlOut); err == nil {
result.DNSLookup = time.Duration(curlOut.TimeNamelookup * float64(time.Second))
result.TCPConnection = time.Duration(curlOut.TimeConnect * float64(time.Second))
result.TLSHandshake = time.Duration(curlOut.TimeAppconnect * float64(time.Second))
result.ServerTime = time.Duration(curlOut.TimeStarttransfer * float64(time.Second))
result.TotalTime = time.Duration(curlOut.TimeTotal * float64(time.Second))
result.HTTPStatus = curlOut.HTTPCode
result.HTTPConnectStatus = curlOut.HTTPConnect
result.CurlErrorMessage = curlOut.Errormsg
} else {
if result.GoError == "" {
result.GoError = fmt.Sprintf("failed to parse curl json: %v", err)
}
}
} else {
if result.GoError == "" {
result.GoError = "could not find JSON boundaries in curl output"
}
}
}

163 changes: 163 additions & 0 deletions ispreport/README.md
@@ -0,0 +1,163 @@
# SOAX ECH GREASE Report Generation

This tool tests ECH GREASE compatibility by issuing requests via SOAX proxies.
It iterates through a list of countries and ISPs, running tests with and without
ECH GREASE to simulate diverse network vantage points.

## Requirements

You need to build the ECH-enabled `curl` and place it in the workspace directory. See [instructions](../curl/README.md).

You also need to set the SOAX credentials as environment variables and provide a list of ISO country codes.

**(Optional) ASN Validation:** To independently verify the ASN of the proxy exit nodes, you can provide a local IP-to-ASN database in `.mmdb` format (such as [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data) or [DB-IP ASN Lite](https://db-ip.com/db/download/ip-to-asn-lite)).

### Configuration

**SOAX Credentials (Environment Variables)**

Set the following environment variables with your SOAX API details:

```bash
export SOAX_API_KEY="YOUR_API_KEY"
export SOAX_PACKAGE_KEY="YOUR_PACKAGE_KEY"
export SOAX_PACKAGE_ID="YOUR_PACKAGE_ID"
# Optional overrides:
# export SOAX_PROXY_HOST="proxy.soax.com"
# export SOAX_PROXY_PORT="5000"
```

**Country List (`countries.csv`)**

The countries file should be a CSV file containing country names and their 2-letter ISO codes. Lines starting with `#` are ignored.

```csv
"United States",US
"United Kingdom",GB
"Germany",DE
# Add more countries as needed
"Virgin Islands, U.S.",VI
```

You can download a complete list of country codes from [here](https://raw.githubusercontent.com/datasets/country-list/master/data.csv).

## Running

To run the tool, ensure your environment variables are set, then use the `go run` command from the project root directory.

**Basic Run:**

```sh
go run ./ispreport --targetDomain www.google.com
```

**With Independent ASN Validation (Recommended):**
First, download a free IP-to-ASN `.mmdb` database (e.g., from DB-IP) to your workspace.
```sh
go run ./ispreport --targetDomain www.google.com --asnDB workspace/dbip-asn-lite.mmdb
```

**With Custom IP Check URL and Verbose Logging:**
```sh
go run ./ispreport --targetDomain www.google.com --ipCheckURL https://ifconfig.me/ip --verbose
```

This will:

1. Load the SOAX credentials from the environment and the country list (`./workspace/countries.csv` by default).
2. For each country, fetch the list of available ISPs.
3. For each ISP, discover the real proxy exit IP via the `ipCheckURL`.
4. Issue requests to the target domain via the SOAX proxy, once with ECH GREASE and once without.
5. Save the results to `./workspace/ispreport/results-<domain>-countries<N>.csv`.

### Parameters

* `-workspace <path>`: Directory to store intermediate files. Defaults to `./workspace`.
* `-countries <path>`: Path to CSV file containing country names and ISO codes. Defaults to `./workspace/countries.csv`.
* `-targetDomain <domain>`: Target domain to test. Defaults to `www.google.com`.
* `-parallelism <number>`: Maximum number of parallel requests. Defaults to `16`.
* `-verbose`: Enable verbose logging.
* `-maxTime <duration>`: Maximum time per curl request. Defaults to `30s`.
* `-curl <path>`: Path to the ECH-enabled curl binary. Defaults to `./workspace/output/bin/curl`.
* `-ipCheckURL <url>`: URL used to discover the real external IP of the proxy. Defaults to `https://ipv4.icanhazip.com/`.
* `-asnDB <path>`: Optional path to a MaxMind or DB-IP `.mmdb` database file for independent ASN verification.

### Output Format

The tool generates two output files in the workspace directory:

1. **Results CSV** (`workspace/ispreport/results-<domain>-countries<N>.csv`): Contains the detailed test results for each request.
2. **ISP Audit Log** (`workspace/ispreport/isps-audit.json`): A JSON file mapping each country code to the list of ISPs discovered and used during the test. This is useful for auditing coverage.

The CSV file contains the following columns:

* `domain`: The domain that was tested.
* `country_code`: The 2-letter ISO country code.
* `country_name`: The full name of the country.
* `isp`: The ISP name of the proxy used.
* `asn`: The ASN of the proxy exit node as reported by the SOAX proxy headers.
* `exit_node_isp`: The ISP name reported by the SOAX proxy headers.
* `geodb_asn`: The ASN corresponding to the `discovered_ip`, looked up in the `-asnDB` (if provided).
* `geodb_as_name`: The AS organization name corresponding to the `discovered_ip`, looked up in the `-asnDB` (if provided).
* `asn_match`: `true` if the SOAX-reported `asn` matches the `geodb_asn`, `false` otherwise.
* `ech_grease`: `true` if ECH GREASE was enabled for the request, `false` otherwise.
* `go_error`: Any Go-level error that occurred while executing the request (for example, a failure to launch `curl`), as opposed to errors reported by `curl` itself.
* `curl_exit_code`: The exit code returned by the `curl` command.
* `curl_error_name`: The human-readable name corresponding to the `curl` exit code.
* `curl_error_message`: The detailed error message from curl (if available).
* `dns_lookup_ms`: The duration of the DNS lookup.
* `tcp_connection_ms`: The duration of the TCP connection.
* `tls_handshake_ms`: The duration of the TLS handshake.
* `server_time_ms`: The time from the start of the request to the first byte of the response (curl's `time_starttransfer`).
* `total_time_ms`: The total duration of the request.
* `http_status`: The HTTP status code of the response.
* `http_connect_status`: The HTTP status code from the proxy connection.

## Generating the Final Report

After running the data collection tool, you can generate a visual report using the provided Jupyter notebook.

### 1. Organize the Data

The notebook expects data to be organized in subdirectories within `ispreport/report/` named after the tested domain.

1. Create a subdirectory for your results (e.g., for `www.google.com`):
```bash
mkdir -p ispreport/report/www_google_com
```

2. Copy and rename the generated results from the `workspace` directory:
```bash
# Use a wildcard to match the generated file with the number of countries
cp workspace/ispreport/results-www_google_com-countries*.csv ispreport/report/www_google_com/results.csv
cp workspace/ispreport/isps-audit.json ispreport/report/www_google_com/isps-audit.json
```

### 2. Setup the Environment

Running the notebook requires Python 3 and several data analysis libraries.

```bash
# From the project root:
# 1. Create the virtual environment if it doesn't exist
python3 -m venv workspace/.venv

# 2. Activate the virtual environment
source workspace/.venv/bin/activate

# 3. Install required dependencies
pip install pandas numpy matplotlib seaborn ipywidgets jupyter
```

### 3. Run the Notebook

1. Navigate to the report directory and start Jupyter:
```bash
cd ispreport/report
# If you didn't activate the venv yet, run: source ../../workspace/.venv/bin/activate
jupyter notebook report.ipynb
```

2. In the first code cell of the notebook, update the `DOMAIN` variable to match the name of the subdirectory you created (e.g., `DOMAIN = "www_google_com"`).

3. Run all cells in the notebook to generate the analysis and visualizations.