This repository contains BioFSharp.ML, an F#/.NET library with CNTK-backed machine learning helpers, plus a DPPOP command-line wrapper.

- `BioFSharp.ML.sln` is the root solution.
- `src/BioFSharp.ML` contains the library and embedded DPPOP CNTK model resources.
- `src/DPPOP.CLI` contains the `dppop` executable/tool project.
- `tests/BioFSharp.ML.Tests` and `tests/DPPOP.Tests` contain xUnit tests.
- `build` contains the FAKE build project.
- `plans/rescue_modernize.md` records the CNTK rescue and DPPOP modernization plan.
- `scripts/Export-ImlpLegacyRuntime.ps1` extracts the legacy CNTK/OpenMPI runtime from the local Docker image.

The repo pins .NET SDK 10.0.100 in global.json with latestMinor roll-forward.
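
To make the pin concrete: under `latestMinor`, an installed SDK is acceptable when it has the same major version as the pin and is greater than or equal to the pinned version. The following is a minimal sketch of that rule, not repository tooling. `sdk_satisfies_pin` is a hypothetical helper name, the comparison assumes GNU `sort -V`, and prerelease suffixes are not handled.

```shell
# Sketch: does an installed SDK version satisfy a global.json pin under latestMinor?
# Hypothetical helper; compares release versions only (no prerelease suffixes).
sdk_satisfies_pin() {
  local installed="$1" pinned="$2"
  # latestMinor keeps the major version fixed.
  [ "${installed%%.*}" = "${pinned%%.*}" ] || return 1
  # The installed version must be greater than or equal to the pinned one:
  # after a version sort, the pinned version must come first.
  [ "$(printf '%s\n%s\n' "$pinned" "$installed" | sort -V | head -n 1)" = "$pinned" ]
}
```

A pre-flight check could call it as `sdk_satisfies_pin "$(dotnet --version)" 10.0.100` and stop early on a non-zero status.
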
Use the FAKE entry points from the repository root:

    .\build.cmd
    .\build.cmd RunTests

On Unix-like shells:

    ./build.sh
    ./build.sh RunTests

The default build target builds the solution. `RunTests` cleans, builds, and runs both test projects with coverage collection enabled.

CNTK is a preserved legacy dependency. Treat it as runtime infrastructure to rescue and stabilize, not as a dependency to modernize casually.
The rescued runtime is now archived in the Zenodo record:
https://doi.org/10.5281/zenodo.20026836
The local extraction script is retained as provenance/recovery tooling:

    .\scripts\Export-ImlpLegacyRuntime.ps1

The script reads `csbdocker/imlp:1.0.0` by default and writes:

- `legacy-runtime.tar.gz`
- `runtime-manifest.json`
- `SHA256SUMS`

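
The checksum file allows a quick integrity check after an export or download. The sketch below assumes that `SHA256SUMS` uses the GNU `sha256sum` line format and that the listed files sit in the same directory; neither detail is confirmed here. `verify_runtime_export` is a hypothetical helper name.

```shell
# Sketch: verify exported runtime files against SHA256SUMS.
# $1: directory holding the export (assumed layout, not confirmed by the script).
verify_runtime_export() {
  local dir="$1"
  # Run in a subshell so the caller's working directory is unchanged.
  # sha256sum -c exits non-zero if any listed file is missing or altered.
  ( cd "$dir" && sha256sum -c SHA256SUMS )
}
```

Calling `verify_runtime_export <output-dir>` returns success only when every listed file matches its recorded hash. On systems without GNU coreutils, `shasum -a 256 -c` is a common substitute.
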
The rescued archive contains:

- `/usr/local/cntk/cntk/lib`
- `/usr/local/cntk/cntk/dependencies/lib`
- `/usr/local/mpi/lib`

The container runtime path assumptions are:

    PATH=/usr/local/cntk/cntk/lib:/usr/local/mpi/bin:$PATH
    LD_LIBRARY_PATH=/usr/local/cntk/cntk/dependencies/lib:/usr/local/cntk/cntk/lib:/usr/local/mpi/lib:$LD_LIBRARY_PATH

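
The same path assumptions can be applied under a different root, for example when the archive has been unpacked outside the container. This is a sketch under that assumption only: `use_legacy_runtime` and its root-prefix argument are hypothetical, and whether the rescued binaries actually run outside the container is not established here.

```shell
# Sketch: prepend the rescued runtime directories to the search paths.
# $1: optional root prefix (hypothetical); empty reproduces the container layout.
use_legacy_runtime() {
  local root="${1:-}"
  root="${root%/}"   # drop one trailing slash so "/opt/rt/" and "/opt/rt" behave alike
  export PATH="$root/usr/local/cntk/cntk/lib:$root/usr/local/mpi/bin:$PATH"
  # Append the previous LD_LIBRARY_PATH only if it was set, to avoid a trailing
  # colon (an empty entry means "current directory" to the dynamic loader).
  export LD_LIBRARY_PATH="$root/usr/local/cntk/cntk/dependencies/lib:$root/usr/local/cntk/cntk/lib:$root/usr/local/mpi/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

The order of entries mirrors the container settings above, with the CNTK dependency libraries searched first.
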
artifacts/ and zenodo-record/ are gitignored, so do not assume large binary runtime payloads are present in a fresh checkout.
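
Because the payload may be absent, a pre-flight check can fail early with a clear message instead of a loader error later. The sketch below looks for the three directories the rescued archive contains, under a root prefix. `check_runtime_dirs` and the prefix argument are hypothetical, not existing repository tooling.

```shell
# Sketch: report which expected runtime directories are missing under a root.
# $1: optional root prefix (hypothetical); empty checks the container paths.
check_runtime_dirs() {
  local root="${1:-}" missing=0 dir
  root="${root%/}"
  for dir in usr/local/cntk/cntk/lib \
             usr/local/cntk/cntk/dependencies/lib \
             usr/local/mpi/lib; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $root/$dir" >&2
      missing=1
    fi
  done
  # Exit status 0 when all directories exist, 1 otherwise.
  return "$missing"
}
```

A script that depends on the runtime could begin with `check_runtime_dirs "$prefix" || exit 1`.
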
After publishing/pulling the base image csbdocker/cntk-dotnet:1.0.1-cntk2.7-dotnet10, build the DPPOP container from the repository root:

    docker build -t csbdocker/dppop .

Run it with input files mounted under `/data`:

    docker run --rm --mount "type=bind,source=C:/my-data,target=/data" csbdocker/dppop --proteome /data/proteome.fasta --proteins-of-interest /data/targets.fasta --model nonplant --output /data/results.tsv

- Keep the library target conservative unless a task explicitly requires a target change; `src/BioFSharp.ML` currently targets `netstandard2.0`. `src/DPPOP.CLI` targets `net10.0` and references the library project.
- Keep DPPOP CLI behavior script-compatible where practical, but prefer the compiled CLI for deployment and tests.
- Be careful around project and solution registration when adding, renaming, or moving F# files. F# compile order is explicit in `.fsproj` files.
- Do not replace the CNTK runtime rescue path with upstream downloads. The modernization plan treats the local image as the recovery anchor.