This repository contains the replication package for our empirical study analyzing performance evolution patterns in Java projects at the method level. Our research provides comprehensive insights into how code changes impact performance across 15 mature open-source Java projects.
Performance is a critical quality attribute in software development, yet the impact of method-level code changes on performance evolution remains poorly understood. We conducted a large-scale empirical study analyzing performance evolution in 15 mature open-source Java projects hosted on GitHub. Our analysis encompassed 739 commits containing 1,499 method-level code changes, using Java Microbenchmark Harness (JMH) for precise performance measurement and rigorous statistical analysis to quantify both the significance and magnitude of performance variations.
Our study addresses four key research questions:
- RQ1: What are the patterns of performance changes in Java projects over time?
- RQ2: How do different types of code changes correlate with performance impacts?
- RQ3: How do developer experience and code change complexity relate to performance impact magnitude?
- RQ4: Are there significant differences in performance evolution patterns across different domains or project sizes?
- 32.7% of method-level changes result in measurable performance impacts
- Performance regressions occur 1.3 times more frequently than improvements (18.5% vs 14.2%)
- No significant differences in performance impact distributions across code change categories
- Algorithmic changes demonstrate the highest improvement potential (25.6%) but carry substantial regression risk (33.9%)
- Senior developers produce more stable changes with fewer extreme variations
- Domain-size interactions reveal significant patterns, with web server + small projects exhibiting the highest performance instability (42.2%)
├── jperfevo/ # Main analysis package
│ ├── core/ # Core analysis components
│ │ ├── agreement_analyzer.py # Inter-rater agreement analysis (Cohen's κ)
│ │ ├── code_diff_generator.py # Code difference visualization
│ │ ├── code_pair_generator.py # Method pair extraction from Git history
│ │ ├── code_pair_inserter.py # Database insertion utilities
│ │ ├── github_author_experience.py # Developer experience quantification
│ │ ├── method_complexity_analyzer.py # Code change complexity scoring
│ │ ├── method_mapper.py # Method mapping between commit versions
│ │ └── performance_diff_significance.py # Statistical significance testing
│ ├── models/ # Data models
│ │ └── code_pair.py # Code pair data structure
│ ├── rq/ # Research question analysis modules
│ │ ├── rq1.py # RQ1: Temporal performance patterns
│ │ ├── rq2.py # RQ2: Code change type analysis
│ │ ├── rq3.py # RQ3: Developer experience & complexity
│ │ └── rq4.py # RQ4: Domain and size analysis
│ └── services/ # Utility services
│ ├── db_service.py # MongoDB database operations
│ └── similarity_service.py # Code similarity analysis
├── jphb-performance-data/ # Raw performance measurement data
│ ├── chronicle-core/ # Performance data per project
│ ├── client-java/
│ ├── objenesis/
│ └── protostuff/
│ └── performance_data.json # JMH benchmark execution results
├── results/ # Processed analysis results
│ ├── [project-name]/ # Per-project analysis results
│ │ ├── author_experiences.json # Developer experience scores
│ │ ├── method_complexities.json # Code change complexity metrics
│ │ ├── method_mappings.json # Method version mappings
│ │ └── labelings.json # Code change type classifications
│ ├── apm-agent-java/ # 15 projects total with results
│ ├── chronicle-core/
│ ├── client-java/
│ ├── feign/
│ ├── hdrhistogram/
│ ├── jctools/
│ ├── jdbi/
│ ├── jersey/
│ ├── jetty/
│ ├── netty/
│ ├── objenesis/
│ ├── protostuff/
│ ├── simpleflatmapper/
│ └── zipkin/
├── projects.json # Study dataset configuration
├── statistics.json # Project statistics and metadata
├── requirements.txt # Python dependencies
└── README.md # This file
Our study analyzes 15 mature open-source Java projects across diverse domains:
| Project | Domain | KLOC | Method Changes | Benchmarks | Results Available |
|---|---|---|---|---|---|
| jetty | Web Server | 339.06 | 2,472 | 12,720 | ✅ |
| netty | Networking | 216.98 | 4,241 | 7,669 | ✅ |
| fastjson2 | Data Processing | 178.5 | 1,726 | 3,725 | ✅ |
| apm-agent-java | Monitoring | 80.22 | 891 | 2,984 | ✅ |
| jersey | Web Server | 158.91 | 310 | 781 | ✅ |
| simpleflatmapper | Data Processing | 51.79 | 911 | 1,969 | ✅ |
| protostuff | Data Processing | 42.29 | 448 | 1,354 | ✅ |
| jctools | System Programming | 31.48 | 339 | 1,042 | ✅ |
| jdbi | Data Processing | 28.49 | 1,266 | 1,919 | ✅ |
| client-java | Monitoring | 27.38 | 155 | 667 | ✅ |
| zipkin | Monitoring | 23.51 | 656 | 2,726 | ✅ |
| feign | Web Server | 17.42 | 351 | 1,384 | ✅ |
| chronicle-core | System Programming | 13.25 | 780 | 3,170 | ✅ |
| hdrhistogram | Monitoring | 8.89 | 158 | 317 | ✅ |
| objenesis | Testing | 2.69 | 107 | 784 | ✅ |
Total: 1,499 method-level changes across 739 commits
This replication package includes complete processed datasets, making it immediately usable for:
- ✅ Verification of statistical analyses without recomputation
- ✅ Extension studies using our methodology on new projects
- ✅ Comparative analyses across different project characteristics
- ✅ Educational use for performance analysis techniques
All results in the results/ directory are derived from ~80 machine days of computation, providing reviewers with immediate access to the complete study dataset.
- Python 3.8+
- Java 8+ (for benchmark execution)
- MongoDB (for data storage)
- Git (for repository cloning)
- Clone the repository:
git clone https://github.com/mooselab/empirical-java-performance-evolution
cd empirical-java-performance-evolution
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables by creating a .env file in the root directory:
DB_NAME=your_db_name
DB_URL=your_mongodb_url
CLOUD_DB_URL=your_cloud_mongodb_url # Optional
GITHUB_TOKEN=your_github_token # For API access

The repository includes processed results for immediate analysis. You can directly run research question analyses on the provided datasets:
# Analyze temporal performance patterns using processed data
python -m jperfevo.rq.rq1
# Examine code change type impacts
python -m jperfevo.rq.rq2
# Explore developer experience and complexity relationships
python -m jperfevo.rq.rq3
# Investigate domain and size patterns
python -m jperfevo.rq.rq4

- Microbenchmarking: Java Microbenchmark Harness (JMH) with 3 forks and 5 iterations (15 iterations in total)
- Instrumentation: Custom Java bytecode instrumentation for method-level metrics (see JIB)
- Statistical Analysis: Mann-Whitney U-test (p < 0.05) and Cliff's Delta effect size (|δ| ≥ 0.147)
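
The statistical step above can be sketched in plain Python. This is a minimal illustration, not the code in `performance_diff_significance.py`: the function names are hypothetical, the p-value uses the normal approximation without tie or continuity correction, and it assumes measurements are execution times, so a positive Cliff's delta (the newer version is slower) is read as a regression. Requiring both thresholds at once is also an assumption about how the two criteria are combined.

```python
import math
from itertools import product


def cliffs_delta(before, after):
    """Cliff's delta: P(after > before) - P(after < before), in [-1, 1]."""
    greater = sum(1 for b, a in product(before, after) if a > b)
    less = sum(1 for b, a in product(before, after) if a < b)
    return (greater - less) / (len(before) * len(after))


def mann_whitney_p(before, after):
    """Two-sided Mann-Whitney U p-value (normal approximation, no tie correction)."""
    n1, n2 = len(before), len(after)
    # U counts the pairs in which `after` exceeds `before`; ties count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for b, a in product(before, after))
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def classify_change(before, after, alpha=0.05, min_delta=0.147):
    """Label a change using the thresholds stated in the text (assumed to apply jointly)."""
    delta = cliffs_delta(before, after)
    if mann_whitney_p(before, after) >= alpha or abs(delta) < min_delta:
        return "unchanged"
    # Assumption: larger values mean slower execution.
    return "regression" if delta > 0 else "improvement"
```

With 3 forks and 5 iterations, each side of the comparison would hold 15 measurements, for example `classify_change(times_before, times_after)`. For small samples or heavily tied data, an exact or tie-corrected test from a statistics library would be the safer choice.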
We categorize method-level changes into seven types:
- ALG: Algorithmic Change
- CF: Control Flow
- DS: Data Structure & Variable
- REF: Refactoring & Code Cleanup
- ER: Exception & Return Handling
- CON: Concurrency
- API: API/Library Call
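
For scripts that consume these labels, the seven categories could be modeled as an enumeration. This is only an illustrative sketch: `ChangeType` and `parse_change_type` are hypothetical names, and how `labelings.json` actually encodes a label is not described here, so check the file before relying on the short codes.

```python
from enum import Enum


class ChangeType(Enum):
    """The seven method-level change categories listed above."""
    ALG = "Algorithmic Change"
    CF = "Control Flow"
    DS = "Data Structure & Variable"
    REF = "Refactoring & Code Cleanup"
    ER = "Exception & Return Handling"
    CON = "Concurrency"
    API = "API/Library Call"


def parse_change_type(code):
    """Return the ChangeType for a short code such as 'alg' or ' CF '.

    Raises KeyError if the code is not one of the seven categories.
    """
    return ChangeType[code.strip().upper()]
```

Looking up by member name keeps unknown codes from passing silently, which is useful when aggregating results per category.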
Multi-dimensional experience quantification based on:
- GitHub account age (20% weight)
- Project-specific contributions (30% weight)
- Total contributions across projects (25% weight)
- Code review participation (25% weight)
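
The weighting above amounts to a weighted sum. The sketch below assumes each component has already been normalized to the range [0, 1]; that normalization, the dictionary keys, and the function name are assumptions made for illustration and may differ from what `github_author_experience.py` does.

```python
# Weights taken from the list above; they sum to 1.0.
WEIGHTS = {
    "account_age": 0.20,
    "project_contributions": 0.30,
    "total_contributions": 0.25,
    "code_review": 0.25,
}


def experience_score(components):
    """Combine normalized components (each in [0, 1]) into a single score in [0, 1]."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1]")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


example = {
    "account_age": 0.5,
    "project_contributions": 1.0,
    "total_contributions": 0.0,
    "code_review": 0.2,
}
score = experience_score(example)  # 0.1 + 0.3 + 0.0 + 0.05
```

Because project-specific contributions carry the largest weight, a long-time contributor to one project can outscore a developer with an older account but little project history.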
The analysis generates comprehensive visualizations and statistics:
- Temporal patterns across project lifecycles
- Code change impact distributions by category
- Developer experience correlations with performance outcomes
- Domain-specific and size-based patterns
- Statistical significance tests for all major findings
Results are saved in the plots/ directory organized by research question.
This replication package provides complete processed datasets including:
- Study analysis dataset: All study information aggregated into a single file, dataset/dataset.csv
- Performance measurements: Raw JMH benchmark execution results in jphb-performance-data/
- Method mappings: Version tracking across 1,499 method changes in results/*/method_mappings.json
- Developer experience scores: Quantified contributor expertise in results/*/author_experiences.json
- Code change complexity metrics: Weighted complexity scores in results/*/method_complexities.json
- Code change classifications: Manual labels with inter-rater agreement (κ = 0.96) in results/*/labelings.json
- Statistical test results: All significance tests and effect sizes embedded in analysis scripts
Each project in results/ contains:
project-name/
├── author_experiences.json # Developer experience quantification
├── method_complexities.json # Code change complexity analysis
├── method_mappings.json # Method version mappings with performance data
└── labelings.json # Code change type classifications (when available)
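
Since `labelings.json` is only present for some projects, it can help to check which files each project directory contains before loading anything. The sketch below relies only on the layout shown above; the function names are hypothetical, and it makes no assumption about the JSON schema inside each file.

```python
import json
from pathlib import Path

# File names taken from the per-project layout above.
EXPECTED_FILES = (
    "author_experiences.json",
    "method_complexities.json",
    "method_mappings.json",
    "labelings.json",
)


def inventory_results(results_dir="results"):
    """Map each project directory name to the expected files it actually contains."""
    inventory = {}
    for project_dir in sorted(Path(results_dir).iterdir()):
        if project_dir.is_dir():
            inventory[project_dir.name] = [
                name for name in EXPECTED_FILES if (project_dir / name).is_file()
            ]
    return inventory


def load_result(results_dir, project, filename):
    """Load one result file as parsed JSON, whatever its structure."""
    with open(Path(results_dir) / project / filename, encoding="utf-8") as handle:
        return json.load(handle)
```

Run from the repository root, `inventory_results()` gives a quick view of which projects have labels, and `load_result("results", "jetty", "method_mappings.json")` returns the parsed content for further inspection.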
We welcome contributions to improve the analysis tools and extend the study:
- Fork the repository
- Create a feature branch (git checkout -b feature/improvement)
- Commit your changes (git commit -am 'Add improvement')
- Push to the branch (git push origin feature/improvement)
- Create a Pull Request
- Kaveh Shahedi - kaveh.shahedi@polymtl.ca
- Heng Li (Corresponding Author) - heng.li@polymtl.ca
Department of Computer Engineering and Software Engineering
Polytechnique Montréal, Canada
This project is licensed under the MIT License - see the LICENSE file for details.
- Natural Sciences and Engineering Research Council of Canada (NSERC) - Grant #RGPIN-2021-03900
- Open-source projects and their maintainers for providing high-quality benchmarks
- Research community for foundational work in performance analysis