Closed Beta Research Demonstrator for Gaia DR3 Candidate Prioritization, Astrometric Dynamics, Proper-Motion Evolution, Stellar Reconstruction and Exploratory Space-Data Intelligence
The Codex Alpha Computational Framework is an open-source research framework designed to support the exploratory analysis, prioritization and validation of candidate sources in large astronomical datasets, starting from ESA Gaia DR3.
The current closed-beta demonstrator combines:
- Gaia DR3 candidate data ingestion
- anomaly ranking
- graph-based structural analysis
- astrometric dynamics
- projected proper-motion evolution
- possible binary / comoving-pair prioritization
- candidate investigation workflows
- synthetic stellar reconstruction
- full candidate dossier generation
- external validation workflows
- local-first interactive scientific visualization
The framework is not designed to claim direct astrophysical discoveries by itself.
Its purpose is to help researchers identify which sources deserve further investigation.
The technical whitepaper for the closed-beta release of the Codex Alpha Computational Framework is available on Zenodo:
Codex Alpha Computational Framework: Technical Whitepaper
Version: v0.1.0 Closed Beta
Zenodo DOI: https://doi.org/10.5281/zenodo.20335018
Closed Beta Research Demonstrator
The framework is currently transitioning from an advanced alpha prototype to a closed beta demonstrator.
The core workflow is already operational on a Gaia DR3 demo dataset of approximately 1000 sources. Current development focuses on documentation, hardening, reproducibility, larger dataset ingestion, AI-assisted scoring, improved candidate validation, validation-oriented interoperability with the CDS ecosystem and future cloud scalability.
The current demo dataset is intentionally limited and is used to demonstrate the framework workflow, not to claim complete catalogue coverage.
Large astronomical catalogues such as Gaia DR3 contain an enormous number of sources and multidimensional parameters. The central problem is not only accessing the data, but identifying which sources are most worthy of deeper analysis.
The Codex Alpha Computational Framework addresses this by combining multiple exploratory layers:
Gaia DR3 data
→ anomaly ranking
→ graph structure
→ astrometric dynamics
→ projected proper-motion evolution
→ candidate investigation
→ stellar reconstruction
→ candidate dossier
→ external validation workflow
The framework helps prioritize candidates by combining:
- statistical anomaly indicators
- structural graph importance
- astrometric quantities
- kinematic proxies
- projected proper-motion visualization
- radial-velocity usage when available
- hidden companion suspicion indicators
- possible binary / comoving-pair involvement
- crossmatch availability
- validation-ready query generation
- synthetic candidate-level visualization
- exportable scientific dossier generation
The current closed-beta version also includes a fifth layer, stellar reconstruction, which converts the selected candidate into a full proxy-based stellar dossier and synthetic visual model.
The current demonstrator provides five connected analysis interfaces.
The first interface provides the global operational view of the dataset.
It includes:
- dataset summary cards
- total analyzed sources
- anomaly counts
- graph node and edge counts
- interactive 3D relational graph viewer
- top structural source ranking
- selected source inspection panel
- interactive source table
- automatic pipeline report viewer
This page allows the user to quickly inspect the dataset, identify structurally important sources and synchronize source selection across the framework.
The second interface provides a broader interpretative layer.
It includes:
- Gaia physical map
- relational knowledge graph
- coherence-gradient-inspired proxy module
- candidate registry
- optional crossmatch result integration
- access to the astrometric dynamics layer
The goal of this interface is to move beyond isolated numerical anomalies and analyze candidate sources in their physical, relational and structural context.
The framework uses the notation ∇𝒦 only as internal Codex Alpha theoretical context. It is not a direct physical measurement.
The third interface focuses on astrometric and kinematic candidate behavior.
It includes:
- 3D stellar field visualization
- hybrid Gaia / dynamics view
- physical Gaia field mode
- kinematic velocity-space mode
- distance estimates
- total proper motion
- tangential velocity
- approximate space velocity
- dynamics index
- hidden companion suspicion index
- possible binary / comoving-pair candidates
- dynamic candidate table
- synchronized source selection
This layer supports the identification of sources that may deserve further inspection because of unusual astrometric or kinematic behavior.
All outputs are candidate-level indicators and require external astrophysical validation.
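As an illustration of the quantities listed above, the following is a minimal sketch of how a distance estimate, total proper motion, tangential velocity and approximate space velocity can be derived from Gaia DR3 columns. It assumes a simple inverse-parallax distance and the standard conversion factor 4.74047 km/s. The function name `kinematic_proxies` and its return keys are hypothetical and are not taken from the framework's code.

```python
import math

# 1 AU/yr expressed in km/s, so that
# v_t [km/s] = K * mu [mas/yr] / parallax [mas].
K_KM_S = 4.74047


def kinematic_proxies(parallax_mas, pmra_mas_yr, pmdec_mas_yr,
                      radial_velocity_km_s=None):
    """Return simple candidate-level kinematic quantities for one source."""
    if parallax_mas is None or parallax_mas <= 0:
        raise ValueError("inverse-parallax distance needs a positive parallax")

    # Naive distance estimate in parsec (assumption: no prior, no bias correction).
    distance_pc = 1000.0 / parallax_mas

    # Gaia's pmra already includes the cos(dec) factor, so the two
    # components can be combined directly.
    total_pm = math.hypot(pmra_mas_yr, pmdec_mas_yr)

    v_tan = K_KM_S * total_pm / parallax_mas

    # Without a radial velocity only the tangential component is known.
    v_space = None
    if radial_velocity_km_s is not None:
        v_space = math.hypot(v_tan, radial_velocity_km_s)

    return {
        "distance_pc": distance_pc,
        "total_pm_mas_yr": total_pm,
        "v_tan_km_s": v_tan,
        "v_space_km_s": v_space,
    }
```

Inverse-parallax distances become unreliable when the fractional parallax error is large, so values computed this way are prioritization proxies only.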
The fourth interface is the investigation layer for the top-priority candidate pool and for the local Gaia DR3 demo-source motion field.
It includes:
- Candidate Signal Map 3D
- projected proper-motion evolution of up to approximately 1000 Gaia DR3 demo sources
- local tangent-plane source projection
- individual source motion based on proper motion
- radial-velocity contribution when available
- projected motion traces
- adjustable time speed
- adjustable motion scale
- adjustable trace intensity and trace width
- selected-source trajectory highlighting
- top-anomaly trajectory highlighting
- candidate cloud mode
- kinematic field mode
- signal-space mode
- active candidate profile
- hexagonal proxy chart
- anomaly score
- dynamics index
- hidden companion suspicion index
- structural importance
- distance estimate
- tangential velocity
- approximate space velocity
- Gaia color proxy
- possible pair involvement
- crossmatch status
- mission briefing
- evidence vector
- validation queries
- Gaia Archive / SIMBAD / VizieR / ESASky links
The projected motion visualization is not an orbital simulation and is not an N-body gravitational simulation.
It is a candidate-level kinematic visualization based on Gaia-derived observables and internal framework proxies. It is designed to show how selected sources and candidate groups evolve visually under proper-motion projection, not to confirm physical orbits, binarity, encounters or future dynamical states.
This interface is designed to help a researcher move from candidate selection to validation planning.
The fifth interface provides a candidate-level stellar reconstruction and reporting layer for the globally selected Gaia source.
It includes:
- selected source summary
- proxy-based stellar model
- synthetic 3D stellar twin visualization
- estimated visual temperature proxy
- radius and luminosity proxies
- activity and surface-contrast rendering proxies
- corona and flare visualization controls
- Top-50 anomaly queue
- full stellar dossier generation
- copy-ready scientific interpretation
- downloadable TXT / Markdown / LaTeX / JSON reports
- synchronized source selection from the anomaly queue
This interface is designed to help researchers move from candidate prioritization to communication, documentation and validation planning.
The synthetic 3D stellar twin is not a direct observational image of the selected star. It is a physically informed procedural visualization constrained by available Gaia-derived observables and internal framework proxies.
The rendered surface, corona, flare layers, background field and visual morphology do not confirm stellar type, activity, companions, binarity, planets or exotic physical mechanisms.
The fourth interface includes a candidate-level projected motion layer for the Gaia DR3 demo dataset.
This layer visualizes the approximate evolution of source positions by using:
- initial source position
- right ascension
- declination
- parallax or distance proxy where available
- pmra
- pmdec
- radial_velocity when available
- local tangent-plane projection
- visual motion scaling
- temperature or BP-RP based color proxy
- visual size scaling
The purpose of this module is to make Gaia-derived motion information understandable inside an interactive 3D environment.
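The inputs above can be combined roughly as in the following sketch, which projects a source onto a local tangent plane (standard gnomonic projection) and then moves it linearly with its proper motion. This is an assumption about one reasonable way to do it, not the dashboard's actual code. The names `tangent_plane`, `projected_trace` and `motion_scale` are hypothetical.

```python
import math

# One milliarcsecond in radians (1 deg = 3.6e6 mas).
MAS_TO_RAD = math.radians(1.0 / 3.6e6)


def tangent_plane(ra_deg, dec_deg, ra0_deg, dec0_deg):
    """Gnomonic projection of (ra, dec) about the tangent point (ra0, dec0).

    Returns (xi, eta) in radians; xi points east, eta points north.
    """
    ra, dec, ra0, dec0 = map(math.radians, (ra_deg, dec_deg, ra0_deg, dec0_deg))
    d_ra = ra - ra0
    denom = (math.sin(dec0) * math.sin(dec)
             + math.cos(dec0) * math.cos(dec) * math.cos(d_ra))
    if denom <= 0:
        raise ValueError("source is 90 degrees or more from the tangent point")
    xi = math.cos(dec) * math.sin(d_ra) / denom
    eta = (math.cos(dec0) * math.sin(dec)
           - math.sin(dec0) * math.cos(dec) * math.cos(d_ra)) / denom
    return xi, eta


def projected_trace(xi0, eta0, pmra_mas_yr, pmdec_mas_yr, years, motion_scale=1.0):
    """Linear proper-motion trace in the tangent plane.

    No gravity, acceleration or perspective effects are modelled.
    motion_scale is a purely visual exaggeration factor.
    """
    step_xi = pmra_mas_yr * MAS_TO_RAD * motion_scale
    step_eta = pmdec_mas_yr * MAS_TO_RAD * motion_scale
    return [(xi0 + step_xi * t, eta0 + step_eta * t) for t in years]
```

The linear step is a small-angle approximation that only holds close to the tangent point, which is consistent with treating the result as a projected motion trace.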
The motion traces are:
projected motion traces
not:
confirmed orbital paths
The visualization does not include:
- N-body gravitational integration
- stellar mutual attraction
- acceleration modelling
- galactic potential modelling
- confirmed future close encounters
- confirmed binary dynamics
- confirmed orbital evolution
The module should be interpreted as:
candidate-level kinematic projection based on available Gaia observables
not as:
a complete dynamical simulation of the local stellar environment
The visual motion scale is intentionally adjustable because real Gaia proper motions are often too small to be visually meaningful at short timescales inside a compact dashboard visualization.
For a selected source, the framework can support structured dossier generation including:
- SOURCE_ID
- Gaia coordinates
- parallax
- distance estimate
- PMRA
- PMDEC
- radial velocity
- anomaly score
- anomaly rank
- structural rank
- structural importance
- coherence proxy if available
- SIMBAD / VizieR / NSS crossmatch if available
- dynamics index
- hidden companion suspicion index
- possible binary / comoving-pair involvement
- projected proper-motion context
- synthetic stellar reconstruction parameters
- proxy-based temperature estimate
- proxy-based radius estimate
- proxy-based luminosity estimate
- visual stellar type proxy
- Top-50 anomaly queue context
- cautious scientific interpretation
- next validation steps
- copy-ready report text
- downloadable TXT / Markdown / LaTeX / JSON dossier exports
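A dossier of this kind can be represented as a plain dictionary and serialized to several formats. The sketch below shows one possible shape for the JSON and Markdown exports. The helper names, the field keys and the caveat text are hypothetical and do not describe the framework's actual export schema.

```python
import json

CAVEAT = "Candidate-level indicators only; requires external validation."


def build_dossier(source_id, **fields):
    """Collect candidate fields; missing values stay None instead of being guessed."""
    # Gaia source IDs are 64-bit integers, so they are stored as strings to
    # survive JSON consumers that only handle 53-bit integers safely.
    dossier = {"source_id": str(source_id), "status": "candidate, not confirmed"}
    dossier.update(fields)
    dossier["caveat"] = CAVEAT
    return dossier


def dossier_to_json(dossier):
    return json.dumps(dossier, indent=2, sort_keys=True)


def dossier_to_markdown(dossier):
    lines = [f"# Candidate dossier: Gaia DR3 {dossier['source_id']}", ""]
    for key, value in dossier.items():
        if key == "source_id":
            continue
        shown = "not available" if value is None else value
        lines.append(f"- {key}: {shown}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The source ID and values below are made-up placeholders, not real data.
    demo = build_dossier(1234567890123456789, parallax_mas=10.0,
                         radial_velocity_km_s=None)
    print(dossier_to_markdown(demo))
```

Keeping unavailable quantities explicit ("not available") avoids a reader mistaking a missing measurement for a zero.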
The dossier is intended for:
- internal research notes
- candidate tracking
- GitHub issues
- collaborator communication
- LaTeX-ready scientific documentation
- validation planning
- ESA BIC demonstration material
- reproducible candidate-level review
- structured handoff toward external catalogue validation services
The framework does not confirm:
- planets
- binary systems
- hidden companions
- black holes
- exotic objects
- close encounters
- orbital dynamics
- future stellar configurations
- new physical mechanisms
- direct physical measurements of ∇𝒦
- direct images of stellar surfaces
- confirmed stellar activity or flare events
Instead, the framework produces:
candidate-level prioritization indicators
Terms such as:
candidate
proxy
not confirmed
requires external validation
synthetic visualization
projected motion trace
candidate-level dossier
are used intentionally.
The correct interpretation is:
This source is interesting enough to deserve further validation.
not:
This source is confirmed to be astrophysically unusual.
The projected proper-motion evolution layer is intended for research communication, triage and visualization. It is not a substitute for orbital modelling, N-body simulation, catalogue validation, spectroscopic analysis, photometric modelling, expert review or independent astrophysical confirmation.
The synthetic stellar reconstruction layer is intended for research communication, triage and documentation. It is not a substitute for catalogue validation, spectroscopic analysis, photometric modelling, expert review or independent astrophysical confirmation.
The framework supports validation through links and query generation for:
- Gaia Archive
- Gaia NSS where applicable
- SIMBAD
- VizieR
- ESASky
- local neighbourhood queries
- possible pair consistency checks
- candidate-level dossier exports
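To make the idea of query generation concrete, here is a minimal sketch that builds ADQL strings for the Gaia Archive table `gaiadr3.gaia_source`. The function names and default values are hypothetical. The framework's own generated queries may differ.

```python
def source_query(source_id):
    """ADQL for a single Gaia DR3 source, including common quality indicators."""
    return (
        "SELECT source_id, ra, dec, parallax, pmra, pmdec, ruwe, "
        "astrometric_excess_noise, radial_velocity, non_single_star "
        "FROM gaiadr3.gaia_source "
        # int() rejects anything that is not a plain integer ID.
        f"WHERE source_id = {int(source_id)}"
    )


def neighbourhood_query(ra_deg, dec_deg, radius_deg=0.1, limit=200):
    """ADQL cone search around a position (radius and limit are illustrative)."""
    return (
        f"SELECT TOP {int(limit)} source_id, ra, dec, parallax, pmra, pmdec "
        "FROM gaiadr3.gaia_source "
        "WHERE 1 = CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {float(ra_deg)!r}, {float(dec_deg)!r}, {float(radius_deg)!r}))"
    )
```

The strings can be pasted into the Gaia Archive ADQL form or sent through any TAP client. Nothing here contacts an external service.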
Typical validation steps include:
- Verify the Gaia DR3 source directly in Gaia Archive.
- Check SIMBAD and VizieR object context.
- Check Gaia NSS, RUWE, astrometric excess noise and radial velocity where available.
- Compare parallax and proper motion with nearby sources before any comoving-pair interpretation.
- Treat all internal scores as prioritization proxies, not final classifications.
- Treat projected motion traces as visualization aids, not confirmed orbital paths.
- Use the exported dossier as a structured research note, not as a discovery claim.
- Validate any suspected binary, companion or unusual object interpretation through independent astrophysical methods.
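The parallax and proper-motion comparison mentioned in the steps above could be sketched as a simple error-scaled difference test. The thresholds and the function name `comoving_consistency` are assumptions made for illustration. The sketch ignores correlations between astrometric parameters and the real proper-motion differences that orbital motion can produce in wide pairs.

```python
import math


def comoving_consistency(a, b, max_parallax_sigma=3.0, max_pm_sigma=3.0):
    """Compare two sources given as dicts with Gaia-style keys:
    parallax, parallax_error, pmra, pmra_error, pmdec, pmdec_error.

    Returns error-scaled differences and a boolean flag. A True flag only
    means "not obviously inconsistent"; it does not confirm a physical pair.
    """
    def sigma_diff(key):
        err = math.hypot(a[key + "_error"], b[key + "_error"])
        if err == 0:
            raise ValueError(f"zero combined uncertainty for {key}")
        return abs(a[key] - b[key]) / err

    parallax_sigma = sigma_diff("parallax")
    pm_sigma = max(sigma_diff("pmra"), sigma_diff("pmdec"))
    return {
        "parallax_sigma": parallax_sigma,
        "pm_sigma": pm_sigma,
        "consistent": parallax_sigma <= max_parallax_sigma
        and pm_sigma <= max_pm_sigma,
    }
```

A pair that passes this check is still only a candidate and should go through the remaining validation steps.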
The Codex Alpha Computational Framework is designed to act as a pre-validation intelligence layer before external astronomical validation.
It does not replace CDS Portal, SIMBAD, VizieR, Aladin or X-Match.
The long-term objective is to support structured candidate handoff toward established astronomical validation services, helping researchers move from Gaia candidate prioritization to catalogue inspection, object identification, sky visualization and cross-match workflows.
The framework aims to become interoperable with the CDS validation ecosystem, including CDS Portal, SIMBAD, VizieR, Aladin and X-Match, while maintaining a clear distinction between internal candidate prioritization and external catalogue-based validation.
No official CDS integration or partnership is claimed by this repository.
The current demonstrator uses a local Gaia DR3 demo dataset exported into the dashboard data package.
The dashboard reads local files from:
dashboard/public/data/
The current architecture is local-first and can run without external APIs for the core dashboard.
External links are provided for validation and catalogue inspection.
The current projected proper-motion evolution module is intentionally demonstrated on the same approximately 1000-source Gaia DR3 package used by the local demo. It is not intended to increase the dataset size during the current closed-beta demonstration.
Clone the repository:
git clone https://github.com/Miriadenera/codex-alpha-computational-framework.git
Enter the project directory:
cd codex-alpha-computational-framework
Install Python dependencies:
pip install -r requirements.txt
From the repository root:
python -m pipeline.run_full_pipeline
Important: run this command from the repository root, not from inside the dashboard/ directory.
Correct:
codex-alpha-computational-framework> python -m pipeline.run_full_pipeline
Incorrect:
codex-alpha-computational-framework/dashboard> python -m pipeline.run_full_pipeline
Enter the dashboard directory:
cd dashboard
Install dashboard dependencies:
npm install
Start the local Vite dashboard:
npm run dev
Open:
http://localhost:5173/
To verify the production build:
cd dashboard
npm run build
The Python pipeline can generate outputs such as:
results/gaia_dr3_anomaly_results.csv
results/gaia_dr3_feature_contributions.csv
results/gaia_dr3_anomaly_clusters.csv
results/gaia_dr3_emergent_structures.csv
results/gaia_dr3_graph_nodes.csv
results/gaia_dr3_graph_edges.csv
results/gaia_dr3_graph_centrality.csv
results/gaia_dr3_anomaly_sky_plot.png
results/gaia_dr3_relational_graph.png
results/gaia_dr3_pipeline_report.md
The dashboard export layer uses:
dashboard/public/data/summary.json
dashboard/public/data/anomalies.json
dashboard/public/data/feature_contributions.json
dashboard/public/data/clusters.json
dashboard/public/data/emergent_structures.json
dashboard/public/data/graph_nodes.json
dashboard/public/data/graph_edges.json
dashboard/public/data/graph_centrality.json
dashboard/public/data/report.md
dashboard/public/data/candidate_crossmatch_results.json
dashboard/public/data/possible_binary_pairs.json
Some optional files may be absent in the local demo package.
The dashboard is designed to degrade gracefully when optional validation files are not available.
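Before starting the dashboard, the local data package can be checked with a short script such as the sketch below. Which files count as optional is an assumption here (the crossmatch and pair files), and the helper names are hypothetical. The dashboard itself is a React/Vite application and does not use this code.

```python
import json
from pathlib import Path

# Assumption: these are treated as core files for the demo package.
CORE_FILES = ["summary.json", "anomalies.json", "graph_nodes.json", "graph_edges.json"]
# Assumption: these are the optional validation files.
OPTIONAL_FILES = ["candidate_crossmatch_results.json", "possible_binary_pairs.json"]


def package_status(data_dir):
    """Map each known file name to True/False depending on whether it exists."""
    data_dir = Path(data_dir)
    return {name: (data_dir / name).is_file() for name in CORE_FILES + OPTIONAL_FILES}


def load_optional_json(data_dir, name, default=None):
    """Load a JSON file, returning `default` instead of failing when it is absent."""
    path = Path(data_dir) / name
    if not path.is_file():
        return default
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    for name, present in package_status("dashboard/public/data").items():
        print(f"{'ok     ' if present else 'missing'}  {name}")
```

Returning a default for a missing optional file mirrors the graceful-degradation behaviour described above: the absence of crossmatch data is reported, not treated as an error.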
ai/ -> AI-assisted exploratory analysis modules
analysis/ -> Statistical, clustering, centrality and structure analysis modules
crossmatch/ -> Candidate crossmatch utilities
dashboard/ -> Local interactive React/Vite dashboard
datasets/ -> Gaia DR3 datasets and loaders
docs/ -> Technical documentation and architecture notes
examples/ -> End-to-end execution examples
pipeline/ -> Automated analysis pipelines
prototypes/ -> Experimental prototype modules
reports/ -> Automatic report generation modules
results/ -> Generated outputs, reports and visualizations
simulations/ -> Future computational simulation modules
structures/ -> Relational graph construction modules
visualization/ -> Exploratory 2D and 3D visualization modules
The framework currently combines:
- Python data-processing pipeline
- Gaia DR3 data ingestion
- anomaly detection and ranking
- feature contribution analysis
- graph construction
- graph centrality analysis
- local dashboard export
- React/Vite frontend
- Three.js / WebGL visualizations
- interactive candidate inspection
- astrometric dynamics analysis
- projected proper-motion evolution
- local tangent-plane source projection
- proper-motion trace visualization
- radial-velocity visual projection when available
- possible binary / comoving-pair prioritization
- optional crossmatch integration
- validation-oriented external links
- synthetic stellar reconstruction viewer
- procedural Three.js stellar rendering
- candidate-level dossier export system
- future-oriented CDS ecosystem interoperability design
The current framework already supports AI-assisted exploratory analysis through anomaly detection and candidate prioritization.
Future AI development may include:
- explainable candidate scoring
- probabilistic binary / comoving-pair prioritization
- graph neural network experiments
- local open-source LLM assistance
- automated candidate dossier interpretation
- multi-catalogue anomaly detection
- AI-assisted validation planning
- human-in-the-loop scientific review
- AI-assisted candidate triage across larger Gaia-derived datasets
- AI-assisted report generation with explicit uncertainty tracking
- AI-assisted preparation of CDS-oriented validation workflows
The AI layer is intended to assist researchers, not replace scientific validation.
Focus:
- stabilize the closed beta demonstrator
- improve documentation and reproducibility
- reduce critical static-analysis findings
- add minimal automated testing
- strengthen Gaia DR3 candidate validation workflow
- improve binary / comoving-pair probability scoring
- harden the fourth projected proper-motion evolution interface
- harden the fifth stellar reconstruction and dossier interface
- document CDS-oriented validation handoff workflows
- prepare a technical whitepaper or preprint
- document the framework for ESA BIC and research partners
Target outcome:
stable closed-beta research demonstrator
Focus:
- integrate explainable AI scoring
- expand candidate dossier generation
- support larger Gaia-derived datasets
- prepare for Gaia DR4-compatible ingestion when applicable
- add multi-catalogue validation modules
- improve human-in-the-loop candidate review
- create researcher-oriented export formats
- improve automated validation planning
- introduce uncertainty-aware ranking and reporting
- develop interoperability with the CDS validation ecosystem, including CDS Portal, SIMBAD, VizieR, Aladin and X-Match
- support structured candidate handoff from Codex Alpha candidate dossiers to external catalogue validation workflows
- prepare Gaia-derived candidates for SIMBAD, VizieR, Aladin and X-Match inspection without claiming official CDS integration
Target outcome:
AI-assisted candidate intelligence platform
Focus:
- move from local-first demo to scalable architecture
- add cloud-ready deployment
- introduce API-based data ingestion
- support larger survey datasets
- implement collaborative candidate review
- integrate advanced visualization and reporting
- expand beyond Gaia-only workflows
- explore institutional partnerships
- support multi-catalogue validation workflows connected to established astronomical services
- provide structured interoperability patterns for external catalogue inspection and cross-match systems
Target outcome:
scalable space-data analytics platform
Focus:
- support institutional users
- develop SaaS or licensed deployment options
- expand beyond Gaia-only workflows
- integrate additional astronomical surveys
- provide premium analytics and support services
- build a sustainable open-core research/business model
- support research teams with explainable candidate-intelligence tools
- provide institutional-grade candidate triage before external catalogue validation
- offer validation-oriented workflows compatible with established astronomical data infrastructures
Target outcome:
commercially and institutionally deployable space-data intelligence system
The framework is relevant to space-data innovation because it addresses a concrete problem:
how to transform large astronomical catalogues into explainable, prioritized and validation-ready candidate investigations
Potential applications include:
- space-data analytics
- Gaia candidate prioritization
- astrometric anomaly investigation
- projected proper-motion visualization
- binary / comoving-pair candidate screening
- research workflow automation
- AI-assisted catalogue exploration
- scientific visualization
- institutional data-intelligence tools
- candidate dossier generation
- validation-oriented astronomical triage
- structured preparation for CDS Portal, SIMBAD, VizieR, Aladin and X-Match inspection
The current closed-beta demonstrator is designed to show technical feasibility, scientific caution and future scalability.
Possible development paths include:
- open-source research framework
- institutional licensing
- hosted dashboard services
- premium validation modules
- AI-assisted candidate scoring services
- custom data-analysis pipelines
- research collaboration tools
- multi-catalogue space-data analytics
- candidate intelligence dashboards
- white-label scientific data-analysis tools
- validation-oriented candidate triage services
The long-term objective is to evolve the framework into a scalable platform for explainable astronomical candidate intelligence.
The intended position is not to replace established scientific infrastructures such as CDS Portal, SIMBAD, VizieR, Aladin or X-Match. The business value is in preparing, ranking, documenting and structuring Gaia-derived candidates before external catalogue validation.
Codex Alpha Research:
https://www.codexalpha.org
Computational Framework page:
https://www.codexalpha.org/computational-framework
The framework originates from the broader Codex Alpha Research project.
Codex Alpha explores theoretical and computational approaches to information-based physical modeling, emergent structure and scientific data interpretation.
This repository focuses on the operational, computational and data-oriented side of the project.
This repository represents an active research and development project.
The current framework is a closed-beta demonstrator and should be interpreted as an exploratory scientific tool.
Outputs such as anomaly rankings, structural graph scores, hidden-companion indicators, possible pair involvement, projected proper-motion traces, synthetic stellar reconstruction and coherence-inspired proxies are not final astrophysical classifications.
All candidate interpretations require independent validation through external catalogues, Gaia Archive, SIMBAD, VizieR, Gaia NSS where applicable, and expert scientific review.
References to CDS Portal, SIMBAD, VizieR, Aladin and X-Match describe intended validation-oriented interoperability and external workflow direction. They do not imply official CDS endorsement, partnership or integration.
No claim of new physical discovery is made by the framework alone.