API documentation for all infrastructure modules
Quick Reference: Modules Guide | Infrastructure Docs | Getting Started
This document provides an API reference for all public functions and classes in the infrastructure/ modules. All modules follow the thin orchestrator pattern and maintain test coverage.
Note: These modules are part of the infrastructure layer. For project-specific code, see projects/{name}/src/.
This API reference covers modules from both the infrastructure layer (reusable, generic tools) and the project layer (project-specific scientific code).
These modules are located in infrastructure/ and provide generic tools applicable to any research project:
- `infrastructure/documentation/glossary_gen.py` - API documentation generation from source code
- `infrastructure/validation/content/pdf_validator.py` - PDF rendering validation
- `infrastructure/validation/integrity/checks.py` - Output integrity verification
- `infrastructure/scientific/` - Scientific computing best practices (modular package)
- `infrastructure/publishing/` - Academic publishing workflows (DOI, citations, metadata)
- `infrastructure/llm/` - Local LLM integration for research assistance
- `infrastructure/rendering/` - Multi-format output generation (PDF, slides, web, posters)
- `infrastructure/reporting/` - Pipeline reporting and error aggregation
Project modules live in projects/{name}/src/ and contain project-specific scientific code.
Module names and contents vary by project; the list below is illustrative, not a required template.
- `projects/{name}/src/example.py` - Basic mathematical operations
- `projects/{name}/src/data_generator.py` - Synthetic data generation with configurable distributions
- `projects/{name}/src/data_processing.py` - Data cleaning, preprocessing, normalization, outlier detection
- `projects/{name}/src/statistics.py` - Descriptive statistics, hypothesis testing, correlation analysis
- `projects/{name}/src/metrics.py` - Performance metrics, convergence metrics, quality metrics
- `projects/{name}/src/validation.py` - Result validation, reproducibility verification, anomaly detection
- `projects/{name}/src/visualization.py` - Publication-quality figure generation with consistent styling
- `projects/{name}/src/plots.py` - Plot type implementations (line, scatter, bar, heatmap, contour)
- Example modules (if your project needs them): `simulation.py`, `parameters.py`, `performance_analysis.py`, `reporting.py`
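To give a flavor of what a project-layer module might contain, here is a minimal, hypothetical sketch of z-score outlier flagging of the kind a `data_processing.py` could provide. The function name and threshold are illustrative, not part of any real project API:

```python
from statistics import mean, stdev

def flag_outliers(values, z_threshold=2.0):
    """Return indices of values whose z-score exceeds the threshold."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # constant data has no outliers
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]

# The value 55.0 is far from the cluster around 10 and gets flagged.
print(flag_outliers([10.1, 9.8, 10.3, 9.9, 55.0, 10.0]))  # → [4]
```

Note that with small samples the sample z-score is bounded, so thresholds near 3 can miss obvious outliers; robust alternatives (median absolute deviation) are common in practice.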
Module inventory for `infrastructure/core/`:
| Module | Description |
|---|---|
| `checkpoint` | Pipeline checkpoint system for resume capability |
| `cli` | CLI interface for core infrastructure modules |
| `config_cli` | Load manuscript configuration script (THIN ORCHESTRATOR) |
| `config_loader` | Configuration loader for manuscript metadata |
| `credentials` | Secure credential management for testing and operations |
| `environment` | Environment setup and validation utilities |
| `errors` | Typed error constants for consistent error messaging |
| `exceptions` | Custom exception hierarchy for the Research Project Template |
| `file_inventory` | File inventory and collection utilities |
| `file_operations` | File and directory operation utilities |
| `health_check` | System health monitoring and status checks |
| `logging_formatters` | Logging formatters for structured and template-based output |
| `logging_helpers` | Helper functions for logging utilities |
| `logging_progress` | Progress bars, spinners, and ETA calculations for logging |
| `logging_utils` | Unified Python logging module for the Research Project Template |
| `menu` | Interactive menu utilities (pure helpers) |
| `multi_project` | Multi-project orchestration system |
| `performance` | Performance monitoring and resource tracking utilities |
| `performance_monitor` | Performance monitoring and profiling utilities for research workflows |
| `pipeline` | Pipeline execution system for research projects |
| `pipeline_summary` | Pipeline summary generation and reporting |
| `progress` | Progress reporting utilities for pipeline operations |
| `retry` | Retry utilities for handling transient failures |
| `script_discovery` | Script discovery and execution utilities |
| `security` | Security utilities and input validation for the research template system |
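The `retry` module listed above handles transient failures. A minimal stdlib sketch of the general exponential-backoff pattern (a generic illustration, not the actual `retry` API):

```python
import time

def retry(func, attempts=3, base_delay=0.1):
    """Call func, retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}

def flaky():
    """Fails twice, then succeeds — simulates a transient network error."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry(flaky))  # → ok (succeeds on the third attempt)
```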
For detailed class and function signatures for each core module, see Infrastructure Documentation.
The core module provides foundational utilities used across all infrastructure modules. See Core Module Guide for detailed usage.
| Symbol | Type | Import Path |
|---|---|---|
| `get_logger` | Function | `infrastructure.core` |
| `log_operation` | Function | `infrastructure.core` |
| `log_stage` | Function | `infrastructure.core` |
| `log_success` | Function | `infrastructure.core` |
| `format_duration` | Function | `infrastructure.core` |
| `TemplateError` | Exception | `infrastructure.core` |
| `CheckpointManager` | Class | `infrastructure.core` |
| `SystemHealthChecker` | Class | `infrastructure.core` |
| `monitor_performance` | Decorator | `infrastructure.core` |
| `ProgressBar` | Class | `infrastructure.core` |
| Subpackage | Purpose | Key Symbols |
|---|---|---|
| `core.config` | YAML config loading | `load_config`, `get_config_as_dict` |
| `core.exceptions` | Exception hierarchy | `TemplateError`, `BuildError`, `ValidationError` |
| `core.logging` | Structured logging | `get_logger`, `ProjectLogger` |
| `core.pipeline` | Pipeline execution | `PipelineConfig`, `PipelineExecutor` |
| `core.runtime` | Checkpoints, profiling | `CheckpointManager`, `CodeProfiler` |
| `core.telemetry` | Stage resource metrics | `TelemetryCollector`, `TelemetryConfig` |
| `core.files` | File operations | `clean_coverage_files`, `copy_final_deliverables` |
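As an illustration of the logging helpers listed above, here is a minimal stdlib sketch of what a `get_logger`-style helper typically does. This is the generic pattern only, not the actual `infrastructure.core` implementation:

```python
import logging

def get_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Return a configured logger; idempotent, so handlers are added only once."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger

log = get_logger("pipeline.demo")
log.info("stage started")
```

The idempotence guard matters in pipelines: modules that each call `get_logger` at import time would otherwise attach duplicate handlers and emit every message multiple times.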
Local LLM integration via Ollama for manuscript review and literature search. See LLM Module Guide.
| Symbol | Type | Import Path |
|---|---|---|
| `LLMClient` | Class | `infrastructure.llm` |
| `OllamaClientConfig` | Dataclass | `infrastructure.llm` |
| `GenerationOptions` | Dataclass | `infrastructure.llm` |
| `generate_review` | Function | `infrastructure.llm` |
| `ReviewResult` | Dataclass | `infrastructure.llm` |
| `ReviewConfig` | Dataclass | `infrastructure.llm` |
| `ReviewMode` | Enum | `infrastructure.llm` |
Multi-format output generation. See Rendering Module Guide.
| Symbol | Type | Import Path |
|---|---|---|
| `RenderManager` | Class | `infrastructure.rendering` |
| `RenderingConfig` | Dataclass | `infrastructure.rendering` |
| `execute_render_pipeline` | Function | `infrastructure.rendering.pipeline` |
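To illustrate the kind of orchestration a function like `execute_render_pipeline` performs, here is a generic, hypothetical sketch of running named rendering steps and collecting per-step status. The step names, return values, and error handling are illustrative only:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class RenderStep:
    name: str
    run: Callable[[], str]  # returns the path of the produced artifact

def execute_pipeline(steps: List[RenderStep]) -> Dict[str, str]:
    """Run each step in order, isolating failures so one bad format
    does not abort the remaining outputs."""
    results: Dict[str, str] = {}
    for step in steps:
        try:
            results[step.name] = step.run()
        except Exception as exc:
            results[step.name] = f"failed: {exc}"
    return results

steps = [
    RenderStep("pdf", lambda: "output/pdf/main.pdf"),
    RenderStep("slides", lambda: "output/slides/deck.html"),
]
print(execute_pipeline(steps))
```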
Pipeline reporting and executive summaries. See Reporting Module Guide.
| Symbol | Type | Import Path |
|---|---|---|
| `generate_pipeline_report` | Function | `infrastructure.reporting` |
| `ErrorAggregator` | Class | `infrastructure.reporting` |
| `OutputOrganizer` | Class | `infrastructure.reporting` |
| `execute_test_pipeline` | Function | `infrastructure.reporting` |
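As a sketch of what error aggregation can look like, here is a minimal, hypothetical `ErrorAggregator`-style class; the real class's API may differ:

```python
from collections import defaultdict

class ErrorAggregator:
    """Collect error messages per pipeline stage and summarize counts (illustrative sketch)."""

    def __init__(self):
        self._errors = defaultdict(list)

    def record(self, stage: str, message: str) -> None:
        self._errors[stage].append(message)

    def summary(self) -> dict:
        """Map each stage to its error count, e.g. for an executive summary table."""
        return {stage: len(msgs) for stage, msgs in self._errors.items()}

agg = ErrorAggregator()
agg.record("render", "missing figure: fig_01.png")
agg.record("render", "unresolved reference")
agg.record("validate", "?? pattern found")
print(agg.summary())  # → {'render': 2, 'validate': 1}
```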
Multi-project discovery and management. See Project Module Guide.
| Symbol | Type | Import Path |
|---|---|---|
| `ProjectInfo` | Dataclass | `infrastructure.project` |
| `discover_projects` | Function | `infrastructure.project` |
| `get_project_metadata` | Function | `infrastructure.project` |
| `validate_project_structure` | Function | `infrastructure.project` |
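A minimal illustrative sketch of the discovery idea behind `discover_projects`, assuming a "project" is any subdirectory containing `src/` (the real heuristics may differ):

```python
import tempfile
from pathlib import Path

def discover_projects(root: Path) -> list:
    """Return sorted names of subdirectories that contain a src/ directory."""
    return sorted(p.name for p in root.iterdir() if (p / "src").is_dir())

# Demonstrate against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "alpha" / "src").mkdir(parents=True)
    (root / "beta" / "src").mkdir(parents=True)
    (root / "notes").mkdir()  # no src/, so not a project
    print(discover_projects(root))  # → ['alpha', 'beta']
```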
Agent skill discovery and manifest generation. See Skills Module Guide.
| Symbol | Type | Import Path |
|---|---|---|
| `SkillDescriptor` | Dataclass | `infrastructure.skills` |
| `discover_skills` | Function | `infrastructure.skills` |
| `write_skill_manifest` | Function | `infrastructure.skills` |
| `manifest_matches_discovery` | Function | `infrastructure.skills` |
Cryptographic watermarking and provenance embedding. See Steganography Module Guide.
| Symbol | Type | Import Path |
|---|---|---|
| `SteganographyConfig` | Dataclass | `infrastructure.steganography` |
| `SteganographyProcessor` | Class | `infrastructure.steganography` |
| `embed_steganography` | Function | `infrastructure.steganography` |
| `process_pdf` | Function | `infrastructure.steganography` |
Container for API entry information.
Attributes:
- `module` (str): Module name
- `name` (str): Function or class name
- `kind` (str): Type ("function" or "class")
- `summary` (str): First sentence of docstring
Scan src_dir and collect public functions/classes with summaries.
Parameters:
- `src_dir` (str): Source directory to scan
Returns:
List[ApiEntry]: List of API entries sorted by module and name
Example:
```python
from glossary_gen import build_api_index

entries = build_api_index("src")
for entry in entries:
    print(f"{entry.module}.{entry.name} ({entry.kind})")
```

Generate a Markdown table from API entries.
Parameters:
- `entries` (List[ApiEntry]): List of API entries to format
Returns:
str: Markdown table string with headers and data rows
Example:
```python
from glossary_gen import build_api_index, generate_markdown_table

entries = build_api_index("src")
table = generate_markdown_table(entries)
print(table)
```

Replace content between `begin_marker` and `end_marker` (the markers themselves are preserved).
Parameters:
- `text` (str): Original text
- `begin_marker` (str): Start marker
- `end_marker` (str): End marker
- `content` (str): Content to inject
Returns:
str: Text with content replaced between markers
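A minimal sketch of the marker-replacement behavior described above; this is illustrative, and the real implementation may differ in whitespace and error handling:

```python
def replace_between_markers(text: str, begin_marker: str, end_marker: str, content: str) -> str:
    """Replace everything between the markers, keeping the markers themselves."""
    start = text.index(begin_marker) + len(begin_marker)  # position just after the begin marker
    end = text.index(end_marker, start)                   # position of the end marker
    return text[:start] + "\n" + content + "\n" + text[end:]

doc = "intro\n<!-- BEGIN API -->\nold table\n<!-- END API -->\noutro"
print(replace_between_markers(doc, "<!-- BEGIN API -->", "<!-- END API -->", "new table"))
```

This pattern lets a generator rewrite an auto-generated region of a markdown file repeatedly while leaving hand-written text outside the markers untouched.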
Exception raised for PDF validation errors.
Extract text content from PDF file.
Parameters:
- `pdf_path` (Path): Path to PDF file
Returns:
str: Extracted text content
Raises:
PDFValidationError: If PDF cannot be read
Example:
```python
from pathlib import Path
from infrastructure.validation import extract_text_from_pdf

text = extract_text_from_pdf(Path("output/code_project/pdf/code_project_combined.pdf"))
```

Scan extracted text for common rendering issues.
Parameters:
- `text` (str): Extracted text from PDF
Returns:
`Dict[str, int]`: Dictionary of issue counts:
- `unresolved_references`: Count of `??` patterns
- `warnings`: Count of warning patterns
- `errors`: Count of error patterns
- `missing_citations`: Count of `[?]` patterns
- `total_issues`: Sum of all issues
Example:
```python
from infrastructure.validation import extract_text_from_pdf, scan_for_issues

text = extract_text_from_pdf(pdf_path)
issues = scan_for_issues(text)
print(f"Found {issues['total_issues']} issues")
```

Extract the first N words from text, preserving punctuation.
Parameters:
- `text` (str): Input text
- `n` (int): Number of words to extract (default: 200)
Returns:
str: String containing first N words
Example:
```python
from infrastructure.validation import extract_text_from_pdf
from infrastructure.validation.content.pdf_validator import extract_first_n_words

text = extract_text_from_pdf(pdf_path)
preview = extract_first_n_words(text, n=100)
print(preview)
```

Perform validation of PDF rendering.
Parameters:
- `pdf_path` (Path): Path to PDF file to validate
- `n_words` (int): Number of words to extract for preview (default: 200)
Returns:
`Dict[str, Any]`: Validation report dictionary with:
- `pdf_path`: Path to PDF
- `issues`: Dictionary of issue counts
- `first_words`: First N words of document
- `summary`: Summary dictionary with `has_issues` and `word_count`
Raises:
PDFValidationError: If PDF cannot be read or validated
Example:
```python
from pathlib import Path
from infrastructure.validation import validate_pdf_rendering

report = validate_pdf_rendering(Path("output/code_project/pdf/code_project_combined.pdf"))
if report['summary']['has_issues']:
    print("PDF has issues:", report['issues'])
```

Container for integrity verification results.
Attributes:
- `file_integrity` (Dict[str, bool]): File integrity status
- `cross_reference_integrity` (Dict[str, bool]): Cross-reference status
- `data_consistency` (Dict[str, bool]): Data consistency status
- `academic_standards` (Dict[str, bool]): Academic standards status
- `overall_integrity` (bool): Overall integrity status
- `issues` (List[str]): Detected issues
- `warnings` (List[str]): Warnings
- `recommendations` (List[str]): Recommendations
Perform integrity verification.
Parameters:
- `output_dir` (Path): Output directory to verify
Returns:
IntegrityReport: Integrity report object
Example:
```python
from pathlib import Path
from infrastructure.validation import verify_output_integrity

report = verify_output_integrity(Path("output"))
if report.overall_integrity:
    print("✅ All integrity checks passed")
```

Verify cross-reference integrity in markdown files.
Parameters:
- `markdown_files` (List[Path]): List of markdown files to check
Returns:
Dict[str, bool]: Dictionary mapping reference types to validity status
Example:
```python
from pathlib import Path
from infrastructure.validation import verify_cross_references

files = list(Path("manuscript").glob("*.md"))
integrity = verify_cross_references(files)
```

Container for publication metadata.
Attributes:
- `title` (str): Publication title
- `authors` (List[str]): List of authors
- `abstract` (str): Abstract text
- `keywords` (List[str]): Keywords
- `doi` (Optional[str]): Digital Object Identifier
- `journal` (Optional[str]): Journal name
- `conference` (Optional[str]): Conference name
- `publication_date` (Optional[str]): Publication date
- `publisher` (Optional[str]): Publisher name
- `license` (str): License type
- `repository_url` (Optional[str]): Repository URL
- `citation_count` (int): Citation count
- `download_count` (int): Download count
Extract publication metadata from markdown files.
Parameters:
- `markdown_files` (List[Path]): List of markdown files to analyze
Returns:
PublicationMetadata: Publication metadata object
Example:
```python
from pathlib import Path
from infrastructure.publishing import extract_publication_metadata

files = list(Path("manuscript").glob("*.md"))
metadata = extract_publication_metadata(files)
print(f"Title: {metadata.title}")
```

Generate a BibTeX citation.
Parameters:
- `metadata` (PublicationMetadata): Publication metadata
Returns:
str: BibTeX-formatted citation
Example:
```python
from infrastructure.publishing import generate_citation_bibtex

bibtex = generate_citation_bibtex(metadata)
print(bibtex)
```

Validate DOI format and checksum.
Parameters:
- `doi` (str): DOI string to validate
Returns:
bool: True if DOI is valid, False otherwise
Example:
```python
from infrastructure.publishing import validate_doi

if validate_doi("10.5281/zenodo.12345678"):
    print("✅ Valid DOI")
```

Location: `infrastructure/scientific/` (modular package)
Container for benchmark results.
Attributes:
- `function_name` (str): Function name
- `execution_time` (float): Execution time in seconds
- `memory_usage` (Optional[float]): Memory usage (if available)
- `iterations` (int): Number of iterations
- `parameters` (Dict[str, Any]): Benchmark parameters
- `result_summary` (str): Summary of results
- `timestamp` (str): Benchmark timestamp
Container for numerical stability test results.
Attributes:
- `function_name` (str): Function name
- `test_name` (str): Test name
- `input_range` (Tuple[float, float]): Input range tested
- `expected_behavior` (str): Expected behavior description
- `actual_behavior` (str): Actual behavior description
- `stability_score` (float): Stability score (0-1)
- `recommendations` (List[str]): Recommendations
`check_numerical_stability(func: Callable, test_inputs: List[Any], tolerance: float = 1e-12) -> StabilityTest`
Check numerical stability of a function across a range of inputs.
Parameters:
- `func` (Callable): Function to test
- `test_inputs` (List[Any]): List of input values to test
- `tolerance` (float): Numerical tolerance (default: 1e-12)
Returns:
StabilityTest: Stability test results
Example:
```python
import numpy as np
from infrastructure.scientific import check_numerical_stability

stability = check_numerical_stability(my_func, np.linspace(-10, 10, 100))
print(f"Stability Score: {stability.stability_score:.2f}")
```

`benchmark_function(func: Callable, test_inputs: List[Any], iterations: int = 100) -> BenchmarkResult`
Benchmark function performance across multiple inputs.
Parameters:
- `func` (Callable): Function to benchmark
- `test_inputs` (List[Any]): List of input values
- `iterations` (int): Number of iterations per input (default: 100)
Returns:
BenchmarkResult: Benchmark results
Example:
```python
from infrastructure.scientific import benchmark_function

result = benchmark_function(my_func, test_inputs, iterations=50)
print(f"Execution Time: {result.execution_time:.4f}s")
```

Automatic figure numbering, caption generation, and cross-referencing.
Manages figures with automatic numbering and cross-referencing.
Methods:
- `register_figure(filename, caption, label, section, ...)` - Register a new figure
- `get_figure(label)` - Get figure metadata by label
- `generate_latex_figure_block(label)` - Generate a LaTeX figure block
- `generate_reference(label)` - Generate a LaTeX reference
Example:
```python
from infrastructure.documentation import FigureManager

manager = FigureManager()
fig_meta = manager.register_figure("convergence.png", "Convergence analysis", "fig:convergence")
latex_block = manager.generate_latex_figure_block("fig:convergence")
```

For API documentation of all modules, see:
- Infrastructure Documentation - infrastructure module descriptions
- Project Source Documentation - Project-specific module descriptions
- Scientific Simulation Guide - Simulation and analysis modules
- Visualization Guide - Visualization and figure management
- Image Management Guide - Image insertion and cross-referencing
Key Project Modules (illustrative; projects/{name}/src/ names vary by project):
- `data_processing.py` - Data cleaning, normalization, outlier detection
- `metrics.py` - Performance metrics, convergence metrics, quality metrics
- `validation.py` - Result validation framework
- `simulation.py` - Core simulation framework
- `parameters.py` - Parameter management and sweeps
- `performance_analysis.py` - Convergence and scalability analysis (example module name)
- `reporting.py` - Automated report generation
- `plots.py` - Plot type implementations
Key Infrastructure Modules (infrastructure/):
- `documentation/image_manager.py` - Image insertion into markdown
- `documentation/markdown_integration.py` - Markdown integration utilities
- `documentation/figure_manager.py` - Automatic figure numbering and cross-referencing
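To show the figure-numbering idea behind `figure_manager.py`, here is a minimal, hypothetical registry sketch. The class and method names are illustrative, not the real `FigureManager` API:

```python
class FigureRegistry:
    """Minimal sketch: auto-number figures and emit LaTeX figure blocks."""

    def __init__(self):
        self._figures = {}

    def register(self, filename: str, caption: str, label: str) -> int:
        """Assign the next figure number and store the metadata."""
        number = len(self._figures) + 1
        self._figures[label] = {"filename": filename, "caption": caption, "number": number}
        return number

    def latex_block(self, label: str) -> str:
        """Render a LaTeX figure environment for a registered label."""
        fig = self._figures[label]
        return ("\\begin{figure}[htbp]\n"
                f"  \\includegraphics{{{fig['filename']}}}\n"
                f"  \\caption{{{fig['caption']}}}\n"
                f"  \\label{{{label}}}\n"
                "\\end{figure}")

reg = FigureRegistry()
reg.register("convergence.png", "Convergence analysis", "fig:convergence")
print(reg.latex_block("fig:convergence"))
```

Keying figures by label rather than by number means manuscript text can reference `fig:convergence` while the registry renumbers freely as figures are added or removed.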
This API reference covers all public functions and classes in the infrastructure/ and projects/{name}/src/ directories. All modules:
- Follow the thin orchestrator pattern
- Maintain required test coverage (90% for project code, 60% for infrastructure)
- Include type hints
- Provide detailed docstrings
For usage examples, see Modules Guide.
For implementation details, see Infrastructure Documentation and Project Source Documentation.
Related Documentation:
- Modules Guide - Usage examples
- Infrastructure Docs - Infrastructure module implementation
- Project Source Docs - Project module implementation
- Best Practices - Usage recommendations