diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..c34ca88
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,41 @@
+# Contributing
+
+## Contributing Guidelines
+
+Thank you for considering contributing to the ARX project! We welcome any and all contributions, no matter how big or small. Whether you're fixing a typo or refactoring the entire backend, your contributions are valuable to us.
+
+To ensure a positive and inclusive environment, we kindly ask all contributors to adhere to the following guidelines:
+
+### Code of Conduct
+
+Please review and abide by our [Code of Conduct](./CODE_OF_CONDUCT.md) in all discussions and interactions related to ARX, both within and outside of GitHub. We strive to maintain a safe and respectful space for everyone involved.
+
+## Getting Started
+
+ARX is written in the [Rust](https://rust-lang.org/) programming language; familiarity with Rust is a prerequisite for contributing code.
+
+Before submitting any changes, please ensure they meet the following requirements:
+
+* **It builds:** the repository can be built with a single `cargo build` in the root folder
+* **It's formatted:** all code is formatted according to the default Rust style via `cargo fmt`
+* **It's linted:** `cargo clippy` reports no warnings
+
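+The checks above can be run locally before submitting. A minimal sketch, assuming a standard Rust toolchain with the `rustfmt` and `clippy` components installed:
+
+```sh
+# Build the whole workspace from the repository root
+cargo build
+# Verify formatting without modifying files
+cargo fmt --all -- --check
+# Lint all targets; fix anything clippy reports
+cargo clippy --all-targets
+```
+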
+Patches can be submitted either directly via GitHub, or via email to [git@cebbinghaus.com](mailto:git@cebbinghaus.com).
+
+## Documentation
+
+If you would like to contribute to the ARX documentation, please ensure that your changes follow the guidelines outlined in the [docs/README.md](docs/README.md).
+
+## Project Layout
+
+### /common
+This is the core implementation of ARX. All of the primitives are defined here, as well as many helpers. It will likely be published as a library eventually for programmatic use by other applications.
+
+### /client
+The client is the CLI used to generate archives and manage committing / restoring indexes to stores. It utilizes the [clap](https://github.com/clap-rs/clap) command-line parser library, and deals primarily with the filesystem.
+
+### /server
+This is the HTTP server implementation built on the [axum](https://github.com/tokio-rs/axum) framework, utilizing the Tokio async runtime for high concurrent throughput. It is a lightweight wrapper around the common Store implementation, providing the ability to read and write archives.
+
+### /docs
+This directory contains all of the ARX documentation, as well as RFCs defining the ARX system's behavior and data layouts.
\ No newline at end of file
diff --git a/README.md b/README.md
index 9aa142c..1ba3452 100644
--- a/README.md
+++ b/README.md
@@ -1,76 +1,39 @@
-# ArtifactRepository
+# ARX
+
-This project aims to create a service, commandline, package/archive format allowing for the creation and distribution of "Artifact". An Artifact is any kind of file structure with some metadata attached to it. It could represent anything, as long as it can be defined through a file hirachy.
+The home of the ARX archiving system. It includes the CLI, server, and documentation / designs.
-## Design
+---
-The ArtifactRepository design is based on a Sha-512 Merkel Tree which serves as the foundation of the Artifact. Each file within an Artifact is content addressed by its sha512 hash and directories are stored in an identical format to git. In general the object structure closely matches that of git.
+## Why ARX?
-The very top level of an Artifact is called an **Index** (git calls it a commit). This defines the artifact and contains any and all relevant metadata, such as the timestamp of creation. But since it simply contains any and all metadata in a simple key value format similar to HTTP headers, it allows for the same level of flexibility and metadata to be attached to an index.
-Some possibilities could include:
-* Git has of source that produced output
-* Version number of produced artifact
-* Deployment configuration (Debug vs Release)
-* GPG Signature
-* Artifact type
+* **Deduplicating:** ARX deduplicates all files being archived before they are compressed. This results in smaller archive files and no wasted bandwidth.
-One required key however is the `tree` key which defines the hash of the top level tree which forms the root of the artifact. This tree can be iterated over to discover more trees and blobs which together make up the entirety of the artifact.
+* **Reliable/Resilient:** Built on the same architecture as Git, ARX provides data integrity guarantees ensuring all extracted data is identical to the original.
-Trees & Blobs are directly inherited from Git's design and would be interoperable if it weren't for the differing hash sizes.
+* **Fast:** High-performance Rust with a simple core design ensures generating archives is always fast and efficient.
-## Benefits
+## Quick Start
-One of the key benefits of going with a merkel tree approach very similar to git is the automatic deduplication which occurs. Since Artifacts are stored as individual files indexed by their content on the server, One file contained within multiple artifacts (multiple copies of a library) is deduplicated amongst them all and only 1 copy is stored. This also works perfectly for horizontally scaling multiple servers which can operate on the same exact data store without running into conflicts (deletion can still cause problems but that is not the primary focus).
+// TODO: Link to Installation documentation / setup instructions
-This deduplication extends all the way into the Artifact archive format which thanks to the hashes will also only store a single copy of each file. This makes for an incredibly efficient archive format when many duplicate files are to be expected.
+## Getting Help
-## Client
+// TODO: Create Community channel
-The AR Client is a small commandline utility allowing for local creation of Artifacts, Either in the .ar archive format or as a local artifact store similar to the one in the server (useful when creating multiple artifacts on the same machine and deduplication is desired)
+## Contributing
-It supports Uploading and Downloading Indexes to/from the ArtifactRepository Server which allows for distribution of artifacts amongst clients via the index hash.
+See the [Contributing](CONTRIBUTING.md) guidelines.
-## Server
+For a detailed explanation of the ARX design and the archive format, see the [design document](docs/designs/design.md).
-The ArtifactRepository Server is a simple HTTP server which with basic REST calls allows uploading & downloading artifacts. Internally it stores these in its content addressed store which allows for multiple servers to back onto the same data source, enabling horizontal scaling.
+## License
-One of the key functionalities of a server is that it allows for exactly one upstream to defined which makes the server act in a sort of relay mode. Any artifacts uploaded to it will be mirrored to the upstream, and any artifacts requested will be queried against the upstream if they are not present locally. This allows for multi-tiered caching through the use of machine local, region local and global instances which mirror data between them depending on where the data is required.
-
-Another key consideration is the ability to back a sever onto local file systems as well as S3 compatible object storage API's for global replication and high availability.
-## Artifact File Format
-
-The artifact file format `.ar` is an Archive format which is purpose built for artifacts.
-
-It is structured as follows:
-
-| Data | Description |
-| ------- | ----------------------------------- |
-| [u8; 4] | header / magic number |
-| [u8; 2] | compression method |
-| [u8; N] | data (compressed with above method) |
-
-With the data layout being as follows
-
-| Section | Description |
-| ------------- | ------------- |
-| [HEADER] | Archive header |
-| [INDEX] | Index file |
-| [BLOBS/TREES] | Collection of Blobs & Trees |
-
-The HEADER must be laid out as follows:
-| Data | Description |
-| ---- | ----------- |
-| [entry; N] | Entries |
-| [u8; 1] | Null Terminator |
-
-with each entry as follows:
-
-| Data | Description |
-| ------- | ----------- |
-| [u8;64] | Hash |
-| [u64] | Offset |
-| [u64] | Length |
-
-All data within the data should be stored in its uncompressed form and taken directly from the binary object records.
-
-A supplementary artifact format `.sar` is entirely identical but without the requirement for every blob/tree to be present. Only those within the HEADER are guaranteed to exist within the archive and as such can aid in cutting down on data transmitted when a server/client is only missing a small number of files.
+ARX is distributed under the terms of the [GPL-2.0 License](https://github.com/ArtifactRepository/arx/blob/master/LICENSE) and derives from [Git](https://github.com/git/git)'s designs, also licensed under GPL-2.0.
\ No newline at end of file
diff --git a/client/src/main.rs b/client/src/main.rs
index 0487beb..6117f6b 100644
--- a/client/src/main.rs
+++ b/client/src/main.rs
@@ -2,1434 +2,1434 @@
use chrono::{DateTime, Utc};
use clap::{Parser, Subcommand};
use common::{
- archive::{
- Archive, ArchiveBody, ArchiveEntryData, ArchiveHeaderEntry, CompressionAlgorithm,
- CompressionLevel, FileEntryData, RawEntryData, SourceFileEntryData, HEADER,
- },
- object_body::Object as OtherObject,
- read_header_and_body, read_header_from_file, read_header_from_slice,
- read_object_into_headers_sync, Hash, Header, Mode, ObjectType, BLOB_KEY, INDEX_KEY, TREE_KEY,
+ archive::{
+ Archive, ArchiveBody, ArchiveEntryData, ArchiveHeaderEntry, CompressionAlgorithm,
+ CompressionLevel, FileEntryData, RawEntryData, SourceFileEntryData, HEADER,
+ },
+ object_body::Object as OtherObject,
+ read_header_and_body, read_header_from_file, read_header_from_slice,
+ read_object_into_headers_sync, Hash, Header, Mode, ObjectType, BLOB_KEY, INDEX_KEY, TREE_KEY,
};
use rayon::prelude::*;
use sha2::{Digest, Sha512};
use std::{
- collections::HashMap,
- fs::{create_dir, create_dir_all, read_dir, File},
- io::{BufRead, BufReader, BufWriter, Read, Write},
- ops::Deref,
- path::{Path, PathBuf},
- str::from_utf8,
+ collections::HashMap,
+ fs::{create_dir, create_dir_all, read_dir, File},
+ io::{BufRead, BufReader, BufWriter, Read, Write},
+ ops::Deref,
+ path::{Path, PathBuf},
+ str::from_utf8,
};
use ureq::SendBody;
#[derive(Debug)]
struct Hashed<T> {
- inner: T,
- hash: Hash,
+ inner: T,
+ hash: Hash,
}
impl<T> Deref for Hashed<T> {
- type Target = T;
+ type Target = T;
- fn deref(&self) -> &Self::Target {
- &self.inner
- }
+ fn deref(&self) -> &Self::Target {
+ &self.inner
+ }
}
impl<T: Object> Hashed<T> {
- // fn from_hash(cache: &PathBuf, hash: Hash) -> Self {
- // let (dir, file) = hash.get_parts();
+ // fn from_hash(cache: &PathBuf, hash: Hash) -> Self {
+ // let (dir, file) = hash.get_parts();
- // let file_path = cache.join(dir).join(file);
+ // let file_path = cache.join(dir).join(file);
- // assert!(file_path.exists());
+ // assert!(file_path.exists());
- // let reader = T::read_file_and_verify_type(&file_path);
+ // let reader = T::read_file_and_verify_type(&file_path);
- // drop(reader);
+ // drop(reader);
- // Self {
- // inner: T::from_file(cache, &file_path),
- // hash,
- // }
- // }
+ // Self {
+ // inner: T::from_file(cache, &file_path),
+ // hash,
+ // }
+ // }
- fn from_object(value: T) -> Self {
- Self {
- hash: value.get_hash(),
- inner: value,
- }
- }
+ fn from_object(value: T) -> Self {
+ Self {
+ hash: value.get_hash(),
+ inner: value,
+ }
+ }
}
trait Object {
- fn get_object_type(&self) -> ObjectType;
- fn get_hash(&self) -> Hash;
- fn get_prefix(&self) -> String;
- fn write_to(&self, path: &Path);
+ fn get_object_type(&self) -> ObjectType;
+ fn get_hash(&self) -> Hash;
+ fn get_prefix(&self) -> String;
+ fn write_to(&self, path: &Path);
- // fn from_file(cache: &PathBuf, file: &PathBuf) -> Self;
+ // fn from_file(cache: &PathBuf, file: &PathBuf) -> Self;
- // fn read_file_and_verify_type(path: &PathBuf) -> BufReader {
- // let f = File::open(file_path).unwrap();
- // let mut reader = BufReader::new(f);
+ // fn read_file_and_verify_type(path: &PathBuf) -> BufReader {
+ // let f = File::open(file_path).unwrap();
+ // let mut reader = BufReader::new(f);
- // let mut data = Vec::new();
- // reader.read_until(0, &mut data);
+ // let mut data = Vec::new();
+ // reader.read_until(0, &mut data);
- // if data.last() == Some(&0) {
- // data.pop();
- // }
+ // if data.last() == Some(&0) {
+ // data.pop();
+ // }
- // let name = String::from_utf8(data).unwrap();
+ // let name = String::from_utf8(data).unwrap();
- // let (typ, size) = name.split_once(' ').unwrap();
+ // let (typ, size) = name.split_once(' ').unwrap();
- // let object_type = ObjectType::from_str(typ);
+ // let object_type = ObjectType::from_str(typ);
- // assert!(object_type == T::get_object_type());
+ // assert!(object_type == T::get_object_type());
- // reader
- // }
+ // reader
+ // }
}
struct CacheObject<'a> {
- cache: &'a PathBuf,
- object_type: ObjectType,
- hash: Hash,
- size: u64,
- file: PathBuf,
+ cache: &'a PathBuf,
+ object_type: ObjectType,
+ hash: Hash,
+ size: u64,
+ file: PathBuf,
}
impl<'a> CacheObject<'a> {
- fn from_file(cache: &'a PathBuf, file_path: &PathBuf) -> Self {
- let file = File::open(file_path).unwrap();
- let mut file = BufReader::new(file);
+ fn from_file(cache: &'a PathBuf, file_path: &PathBuf) -> Self {
+ let file = File::open(file_path).unwrap();
+ let mut file = BufReader::new(file);
- let mut data = Vec::new();
- file.read_until(b'\0', &mut data).unwrap();
+ let mut data = Vec::new();
+ file.read_until(b'\0', &mut data).unwrap();
- if data.last() == Some(&0) {
- data.pop();
- }
+ if data.last() == Some(&0) {
+ data.pop();
+ }
- let data = String::from_utf8(data).expect("data to be a valid u8");
+ let data = String::from_utf8(data).expect("data to be a valid u8");
- let (object_type, size) = data.split_once(' ').unwrap();
+ let (object_type, size) = data.split_once(' ').unwrap();
- let object_type = ObjectType::from_str(object_type).unwrap();
+ let object_type = ObjectType::from_str(object_type).unwrap();
- let hash = Hash::from_path(file_path).unwrap();
+ let hash = Hash::from_path(file_path).unwrap();
- Self {
- cache,
- file: file_path.clone(),
- size: size.parse().unwrap(),
- object_type,
- hash,
- }
- }
+ Self {
+ cache,
+ file: file_path.clone(),
+ size: size.parse().unwrap(),
+ object_type,
+ hash,
+ }
+ }
- fn to_index(&self) -> Hashed<Index> {
- assert!(self.object_type == ObjectType::Index);
+ fn to_index(&self) -> Hashed<Index> {
+ assert!(self.object_type == ObjectType::Index);
- let file = File::open(&self.file).unwrap();
- let mut file: BufReader<File> = BufReader::new(file);
+ let file = File::open(&self.file).unwrap();
+ let mut file: BufReader<File> = BufReader::new(file);
- let mut data = Vec::new();
- file.read_until(b'\0', &mut data).unwrap();
+ let mut data = Vec::new();
+ file.read_until(b'\0', &mut data).unwrap();
- let mut metadata = HashMap::new();
+ let mut metadata = HashMap::new();
- let mut string_data = String::new();
+ let mut string_data = String::new();
- file.read_to_string(&mut string_data)
- .expect("Index to only contain string");
+ file.read_to_string(&mut string_data)
+ .expect("Index to only contain string");
- let string_data = string_data.trim_end();
+ let string_data = string_data.trim_end();
- let lines = string_data.split('\n').collect::<Vec<_>>();
+ let lines = string_data.split('\n').collect::<Vec<_>>();
- for line in lines {
- let (key, value) = line.split_once(':').unwrap();
+ for line in lines {
+ let (key, value) = line.split_once(':').unwrap();
- _ = metadata.insert(key, value.trim());
- }
+ _ = metadata.insert(key, value.trim());
+ }
- let timestamp = DateTime::parse_from_rfc3339(metadata["timestamp"]).unwrap();
+ let timestamp = DateTime::parse_from_rfc3339(metadata["timestamp"]).unwrap();
- let tree_hash = Hash::try_from(metadata["tree"]).expect("tree hash to be valid");
+ let tree_hash = Hash::try_from(metadata["tree"]).expect("tree hash to be valid");
- let tree_object = CacheObject::from_file(self.cache, &tree_hash.get_path(self.cache));
+ let tree_object = CacheObject::from_file(self.cache, &tree_hash.get_path(self.cache));
- assert!(tree_object.get_object_type() == ObjectType::Tree);
+ assert!(tree_object.get_object_type() == ObjectType::Tree);
- Hashed {
- hash: self.hash.clone(),
- inner: Index {
- timestamp: timestamp.into(),
- tree: tree_object.to_tree(Mode::Tree, ""),
- },
- }
- }
+ Hashed {
+ hash: self.hash.clone(),
+ inner: Index {
+ timestamp: timestamp.into(),
+ tree: tree_object.to_tree(Mode::Tree, ""),
+ },
+ }
+ }
- fn to_tree(&self, mode: Mode, path: &str) -> Hashed<Tree> {
- assert!(self.object_type == ObjectType::Tree);
+ fn to_tree(&self, mode: Mode, path: &str) -> Hashed<Tree> {
+ assert!(self.object_type == ObjectType::Tree);
- // println!("Reading tree {}", self.hash);
+ // println!("Reading tree {}", self.hash);
- let file = File::open(&self.file).unwrap();
- let mut file: BufReader<File> = BufReader::new(file);
+ let file = File::open(&self.file).unwrap();
+ let mut file: BufReader<File> = BufReader::new(file);
- // Read out the file header
- let _ = read_header_from_file(&mut file).expect("File header to be correct");
+ // Read out the file header
+ let _ = read_header_from_file(&mut file).expect("File header to be correct");
- let mut vec = Vec::new();
+ let mut vec = Vec::new();
- loop {
- let mut buffer = Vec::new();
- let bytes = file
- .read_until(0, &mut buffer)
- .expect("To have a file header");
+ loop {
+ let mut buffer = Vec::new();
+ let bytes = file
+ .read_until(0, &mut buffer)
+ .expect("To have a file header");
- if bytes == 0 {
- break;
- }
+ if bytes == 0 {
+ break;
+ }
- let string = from_utf8(&buffer[..buffer.len() - 1]).expect("valid utf8");
+ let string = from_utf8(&buffer[..buffer.len() - 1]).expect("valid utf8");
- let (mode, name) = string.split_once(' ').expect("space");
+ let (mode, name) = string.split_once(' ').expect("space");
- let mode = Mode::from_str(mode).expect("valid mode");
+ let mode = Mode::from_str(mode).expect("valid mode");
- let mut hash: [u8; 64] = [0; 64];
- file.read_exact(&mut hash).expect("file to contain hash");
+ let mut hash: [u8; 64] = [0; 64];
+ file.read_exact(&mut hash).expect("file to contain hash");
- let hash = Hash::from(hash);
+ let hash = Hash::from(hash);
- let object_file = hash.get_path(self.cache);
+ let object_file = hash.get_path(self.cache);
- let cache_object = CacheObject::from_file(self.cache, &object_file);
+ let cache_object = CacheObject::from_file(self.cache, &object_file);
- vec.push(match cache_object.object_type {
- ObjectType::Blob => TreeObject::Blob(cache_object.to_blob(mode, name)),
- ObjectType::Tree => TreeObject::Tree(cache_object.to_tree(mode, name)),
- ObjectType::Index => panic!("Invalid ObjectType in tree"),
- })
- }
+ vec.push(match cache_object.object_type {
+ ObjectType::Blob => TreeObject::Blob(cache_object.to_blob(mode, name)),
+ ObjectType::Tree => TreeObject::Tree(cache_object.to_tree(mode, name)),
+ ObjectType::Index => panic!("Invalid ObjectType in tree"),
+ })
+ }
- Hashed {
- hash: self.hash.clone(),
- inner: Tree {
- mode,
- path: path.to_owned(),
- contents: vec,
- },
- }
- }
+ Hashed {
+ hash: self.hash.clone(),
+ inner: Tree {
+ mode,
+ path: path.to_owned(),
+ contents: vec,
+ },
+ }
+ }
- fn to_blob(&self, mode: Mode, path: &str) -> Hashed<Blob> {
- assert!(self.object_type == ObjectType::Blob);
+ fn to_blob(&self, mode: Mode, path: &str) -> Hashed<Blob> {
+ assert!(self.object_type == ObjectType::Blob);
- let file = File::open(&self.file).unwrap();
- let mut file: BufReader<File> = BufReader::new(file);
+ let file = File::open(&self.file).unwrap();
+ let mut file: BufReader<File> = BufReader::new(file);
- // Read out the file header
- let Header { size, .. } =
- read_header_from_file(&mut file).expect("File header to be correct");
+ // Read out the file header
+ let Header { size, .. } =
+ read_header_from_file(&mut file).expect("File header to be correct");
- Hashed {
- hash: self.hash.clone(),
+ Hashed {
+ hash: self.hash.clone(),
- inner: Blob {
- mode,
- path: path.to_string(),
- file: self.file.clone(),
- size,
- },
- }
- }
+ inner: Blob {
+ mode,
+ path: path.to_string(),
+ file: self.file.clone(),
+ size,
+ },
+ }
+ }
}
impl<'a> Object for CacheObject<'a> {
- fn get_object_type(&self) -> ObjectType {
- self.object_type
- }
+ fn get_object_type(&self) -> ObjectType {
+ self.object_type
+ }
- fn get_hash(&self) -> Hash {
- self.hash.clone()
- }
+ fn get_hash(&self) -> Hash {
+ self.hash.clone()
+ }
- fn get_prefix(&self) -> String {
- format!("{} {}\0", self.object_type.to_str(), self.size)
- }
+ fn get_prefix(&self) -> String {
+ format!("{} {}\0", self.object_type.to_str(), self.size)
+ }
- fn write_to(&self, _: &Path) {
- unimplemented!("Should probably fix this")
- }
+ fn write_to(&self, _: &Path) {
+ unimplemented!("Should probably fix this")
+ }
}
#[derive(Debug)]
struct Index {
- timestamp: DateTime<Utc>,
- tree: Hashed<Tree>,
+ timestamp: DateTime<Utc>,
+ tree: Hashed<Tree>,
}
impl Index {
- fn get_body(&self) -> String {
- format!(
- "tree: {}\ntimestamp: {}\n\n",
- self.tree.hash,
- self.timestamp.to_rfc3339()
- )
- }
-
- fn from_path(path: &Path, cache: Option<&Path>) -> Hashed<Index> {
- assert!(path.is_dir());
- let tree = Tree::from_dir(path, cache);
- let index = Index {
- timestamp: Utc::now(),
- tree,
- };
- let hashed = Hashed::from_object(index);
- if let Some(cache) = cache {
- hashed.write_if_not_exists(cache);
- }
- hashed
- }
+ fn get_body(&self) -> String {
+ format!(
+ "tree: {}\ntimestamp: {}\n\n",
+ self.tree.hash,
+ self.timestamp.to_rfc3339()
+ )
+ }
+
+ fn from_path(path: &Path, cache: Option<&Path>) -> Hashed<Index> {
+ assert!(path.is_dir());
+ let tree = Tree::from_dir(path, cache);
+ let index = Index {
+ timestamp: Utc::now(),
+ tree,
+ };
+ let hashed = Hashed::from_object(index);
+ if let Some(cache) = cache {
+ hashed.write_if_not_exists(cache);
+ }
+ hashed
+ }
}
impl Object for Index {
- fn get_object_type(&self) -> ObjectType {
- ObjectType::Index
- }
+ fn get_object_type(&self) -> ObjectType {
+ ObjectType::Index
+ }
- fn get_hash(&self) -> Hash {
- let body = self.get_body();
- let mut hasher = Sha512::new();
- write!(hasher, "{}{}", self.get_prefix(), body).unwrap();
- Hash::from(hasher)
- }
+ fn get_hash(&self) -> Hash {
+ let body = self.get_body();
+ let mut hasher = Sha512::new();
+ write!(hasher, "{}{}", self.get_prefix(), body).unwrap();
+ Hash::from(hasher)
+ }
- fn get_prefix(&self) -> String {
- format!("{} {}\0", INDEX_KEY, self.get_body().len())
- }
+ fn get_prefix(&self) -> String {
+ format!("{} {}\0", INDEX_KEY, self.get_body().len())
+ }
- fn write_to(&self, path: &Path) {
- let mut file = File::create(path).unwrap();
+ fn write_to(&self, path: &Path) {
+ let mut file = File::create(path).unwrap();
- file.write_all(self.get_prefix().as_bytes()).unwrap();
- file.write_all(self.get_body().as_bytes()).unwrap();
- }
+ file.write_all(self.get_prefix().as_bytes()).unwrap();
+ file.write_all(self.get_body().as_bytes()).unwrap();
+ }
- // fn from_file(cache: &PathBuf, index: &PathBuf) -> Index {
- // let mut reader = Index::read_file_and_verify_type(index);
+ // fn from_file(cache: &PathBuf, index: &PathBuf) -> Index {
+ // let mut reader = Index::read_file_and_verify_type(index);
- // let mut line = String::new();
+ // let mut line = String::new();
- // let kv: HashMap<&str, &str> = HashMap::new();
+ // let kv: HashMap<&str, &str> = HashMap::new();
- // while reader.read_line(&mut line).is_ok() {
- // let (key, value) = line.split_once(':').unwrap();
+ // while reader.read_line(&mut line).is_ok() {
+ // let (key, value) = line.split_once(':').unwrap();
- // kv.insert(key, value.trim())
- // }
+ // kv.insert(key, value.trim())
+ // }
- // let timestamp = DateTime::from_utf8(kv["timestamp"]);
+ // let timestamp = DateTime::from_utf8(kv["timestamp"]);
- // Index {
- // timestamp: ,
- // tree: Hashed::from_hash(cache, kv["tree"].into())
- // }
- // }
+ // Index {
+ // timestamp: ,
+ // tree: Hashed::from_hash(cache, kv["tree"].into())
+ // }
+ // }
}
trait WithPath {
- fn get_path_component(&self) -> &String;
- fn get_mode(&self) -> &Mode;
+ fn get_path_component(&self) -> &String;
+ fn get_mode(&self) -> &Mode;
}
#[derive(Debug)]
struct Tree {
- mode: Mode,
- path: String,
- contents: Vec<TreeObject>,
+ mode: Mode,
+ path: String,
+ contents: Vec<TreeObject>,
}
impl Tree {
- fn get_body(&self) -> Vec<u8> {
- let mut value = Vec::new();
-
- for object in self.contents.iter() {
- value.extend_from_slice(&object.to_tree_bytes());
- }
-
- value
- }
-
- fn from_dir(path: &Path, cache: Option<&Path>) -> Hashed<Tree> {
- assert!(path.is_dir());
-
- let entries: Vec<PathBuf> = std::fs::read_dir(path)
- .expect("Failed to read directory")
- .map(|entry| entry.expect("Failed to read directory entry").path())
- .collect();
-
- let mut contents: Vec<TreeObject> = entries
- .par_iter()
- .map(|path| {
- if path.is_dir() {
- TreeObject::Tree(Tree::from_dir(path, cache))
- } else {
- TreeObject::Blob(Blob::hash_and_write(path, cache))
- }
- })
- .collect();
-
- // read_dir returns entries in filesystem order, which is not
- // guaranteed to be stable. Sort by name so the resulting tree hash
- // is deterministic across platforms and repeated runs.
- contents.sort_by(|a, b| a.path_component().cmp(b.path_component()));
-
- let tree = Self {
- mode: Mode::Tree,
- contents,
- path: match path.file_name() {
- Some(v) => v.to_string_lossy().to_string(),
- None => panic!("{path:?} did not have a filename"),
- },
- };
-
- let hashed = Hashed::from_object(tree);
- if let Some(cache) = cache {
- hashed.write_if_not_exists(cache);
- }
- hashed
- }
+ fn get_body(&self) -> Vec<u8> {
+ let mut value = Vec::new();
+
+ for object in self.contents.iter() {
+ value.extend_from_slice(&object.to_tree_bytes());
+ }
+
+ value
+ }
+
+ fn from_dir(path: &Path, cache: Option<&Path>) -> Hashed<Tree> {
+ assert!(path.is_dir());
+
+ let entries: Vec<PathBuf> = std::fs::read_dir(path)
+ .expect("Failed to read directory")
+ .map(|entry| entry.expect("Failed to read directory entry").path())
+ .collect();
+
+ let mut contents: Vec<TreeObject> = entries
+ .par_iter()
+ .map(|path| {
+ if path.is_dir() {
+ TreeObject::Tree(Tree::from_dir(path, cache))
+ } else {
+ TreeObject::Blob(Blob::hash_and_write(path, cache))
+ }
+ })
+ .collect();
+
+ // read_dir returns entries in filesystem order, which is not
+ // guaranteed to be stable. Sort by name so the resulting tree hash
+ // is deterministic across platforms and repeated runs.
+ contents.sort_by(|a, b| a.path_component().cmp(b.path_component()));
+
+ let tree = Self {
+ mode: Mode::Tree,
+ contents,
+ path: match path.file_name() {
+ Some(v) => v.to_string_lossy().to_string(),
+ None => panic!("{path:?} did not have a filename"),
+ },
+ };
+
+ let hashed = Hashed::from_object(tree);
+ if let Some(cache) = cache {
+ hashed.write_if_not_exists(cache);
+ }
+ hashed
+ }
}
impl WithPath for Tree {
- fn get_path_component(&self) -> &String {
- &self.path
- }
+ fn get_path_component(&self) -> &String {
+ &self.path
+ }
- fn get_mode(&self) -> &Mode {
- &self.mode
- }
+ fn get_mode(&self) -> &Mode {
+ &self.mode
+ }
}
impl Object for Tree {
- fn get_object_type(&self) -> ObjectType {
- ObjectType::Tree
- }
+ fn get_object_type(&self) -> ObjectType {
+ ObjectType::Tree
+ }
- fn get_hash(&self) -> Hash {
- let body = self.get_body();
- let mut hasher = Sha512::new();
- write!(hasher, "{}", self.get_prefix()).unwrap();
- hasher
- .write_all(&body)
- .expect("Body to be added to the hasher");
- Hash::from(hasher)
- }
+ fn get_hash(&self) -> Hash {
+ let body = self.get_body();
+ let mut hasher = Sha512::new();
+ write!(hasher, "{}", self.get_prefix()).unwrap();
+ hasher
+ .write_all(&body)
+ .expect("Body to be added to the hasher");
+ Hash::from(hasher)
+ }
- fn write_to(&self, path: &Path) {
- let mut file = File::create(path).unwrap();
+ fn write_to(&self, path: &Path) {
+ let mut file = File::create(path).unwrap();
- file.write_all(self.get_prefix().as_bytes()).unwrap();
- file.write_all(&self.get_body()).unwrap();
- }
+ file.write_all(self.get_prefix().as_bytes()).unwrap();
+ file.write_all(&self.get_body()).unwrap();
+ }
- fn get_prefix(&self) -> String {
- format!("{} {}\0", TREE_KEY, self.get_body().len())
- }
+ fn get_prefix(&self) -> String {
+ format!("{} {}\0", TREE_KEY, self.get_body().len())
+ }
- // fn from_file(cache: &PathBuf, file: &PathBuf) -> Self {
- // let mut reader = Object::read_file_and_verify_type(file);
+ // fn from_file(cache: &PathBuf, file: &PathBuf) -> Self {
+ // let mut reader = Object::read_file_and_verify_type(file);
- // let mut line = String::new();
+ // let mut line = String::new();
- // while reader.read_line(&mut line).is_ok() {
- // let (detail, hash) = line.split_once('\0').unwrap();
+ // while reader.read_line(&mut line).is_ok() {
+ // let (detail, hash) = line.split_once('\0').unwrap();
- // }
- // }
+ // }
+ // }
}
#[derive(Debug)]
struct Blob {
- mode: Mode,
- path: String,
- file: PathBuf,
- size: u64,
+ mode: Mode,
+ path: String,
+ file: PathBuf,
+ size: u64,
}
impl Blob {
- fn from_path(path: &Path) -> Self {
- assert!(path.is_file());
-
- Self {
- // TODO: Support other types
- mode: Mode::Normal,
- path: path.file_name().unwrap().to_string_lossy().to_string(),
- size: path.metadata().unwrap().len(),
- file: path.to_path_buf(),
- }
- }
-
- /// Hash the file and write the blob object to `cache` in a single I/O pass.
- /// The file content is buffered in memory while being hashed, then written
- /// to the content-addressed path. May need to reconsider this approach if it turns out
- /// people try to archive a 10 GB file on 2 GB of RAM.
- fn hash_and_write(src: &Path, cache: Option<&Path>) -> Hashed<Blob> {
- assert!(src.is_file());
-
- let size = src.metadata().unwrap().len();
- let prefix = format!("{} {}\0", BLOB_KEY, size);
-
- let mut hasher = Sha512::new();
- hasher.write_all(prefix.as_bytes()).unwrap();
-
- let f = File::open(src).unwrap();
- let mut reader = BufReader::new(f);
- let mut content = Vec::with_capacity(size as usize);
- let mut buf: [u8; 8192] = [0; 8192];
- loop {
- let n = reader.read(&mut buf).unwrap();
- if n == 0 {
- break;
- }
- hasher.write_all(&buf[..n]).unwrap();
- content.extend_from_slice(&buf[..n]);
- }
-
- let hash = Hash::from(hasher);
-
- if let Some(cache) = cache {
- let dest = hash.get_path(cache);
- if !dest.exists() {
- create_dir_all(dest.parent().unwrap()).unwrap();
- let mut out = File::create(&dest).unwrap();
- out.write_all(prefix.as_bytes()).unwrap();
- out.write_all(&content).unwrap();
- }
- }
-
- Hashed {
- hash,
- inner: Self {
- mode: Mode::Normal,
- path: src.file_name().unwrap().to_string_lossy().to_string(),
- file: src.to_path_buf(),
- size,
- },
- }
- }
+ fn from_path(path: &Path) -> Self {
+ assert!(path.is_file());
+
+ Self {
+ // TODO: Support other types
+ mode: Mode::Normal,
+ path: path.file_name().unwrap().to_string_lossy().to_string(),
+ size: path.metadata().unwrap().len(),
+ file: path.to_path_buf(),
+ }
+ }
+
+ /// Hash the file and write the blob object to `cache` in a single I/O pass.
+ /// The file content is buffered in memory while being hashed, then written
+ /// to the content-addressed path. May need to reconsider this approach if it turns out
+ /// people try to archive a 10 GB file on 2 GB of RAM.
+ fn hash_and_write(src: &Path, cache: Option<&Path>) -> Hashed<Blob> {
+ assert!(src.is_file());
+
+ let size = src.metadata().unwrap().len();
+ let prefix = format!("{} {}\0", BLOB_KEY, size);
+
+ let mut hasher = Sha512::new();
+ hasher.write_all(prefix.as_bytes()).unwrap();
+
+ let f = File::open(src).unwrap();
+ let mut reader = BufReader::new(f);
+ let mut content = Vec::with_capacity(size as usize);
+ let mut buf: [u8; 8192] = [0; 8192];
+ loop {
+ let n = reader.read(&mut buf).unwrap();
+ if n == 0 {
+ break;
+ }
+ hasher.write_all(&buf[..n]).unwrap();
+ content.extend_from_slice(&buf[..n]);
+ }
+
+ let hash = Hash::from(hasher);
+
+ if let Some(cache) = cache {
+ let dest = hash.get_path(cache);
+ if !dest.exists() {
+ create_dir_all(dest.parent().unwrap()).unwrap();
+ let mut out = File::create(&dest).unwrap();
+ out.write_all(prefix.as_bytes()).unwrap();
+ out.write_all(&content).unwrap();
+ }
+ }
+
+ Hashed {
+ hash,
+ inner: Self {
+ mode: Mode::Normal,
+ path: src.file_name().unwrap().to_string_lossy().to_string(),
+ file: src.to_path_buf(),
+ size,
+ },
+ }
+ }
}
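The `prefix` built above follows a git-style object header, `<key> <size>` terminated by a NUL byte, hashed together with the raw file content. A minimal std-only sketch of that framing (the real `BLOB_KEY` constant is assumed to be a short ASCII tag such as `blob`):

```rust
// Sketch of the assumed object header format used above: "<key> <size>\0".
fn object_prefix(key: &str, size: u64) -> String {
    format!("{} {}\0", key, size)
}

fn main() {
    let prefix = object_prefix("blob", 11);
    // "blob", a space, "11", and the NUL terminator: 8 bytes total.
    assert_eq!(prefix.as_bytes(), b"blob 11\0");

    // The hash input is the prefix bytes followed by the raw content,
    // so identical content of identical size always hashes identically.
    let mut input = prefix.into_bytes();
    input.extend_from_slice(b"hello world");
    assert_eq!(input.len(), 8 + 11);
}
```

Because the size is part of the hashed prefix, two files whose contents collide only after truncation still produce distinct hash inputs.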
impl WithPath for Blob {
- fn get_path_component(&self) -> &String {
- &self.path
- }
+ fn get_path_component(&self) -> &String {
+ &self.path
+ }
- fn get_mode(&self) -> &Mode {
- &self.mode
- }
+ fn get_mode(&self) -> &Mode {
+ &self.mode
+ }
}
impl Object for Blob {
- fn get_object_type(&self) -> ObjectType {
- ObjectType::Blob
- }
+ fn get_object_type(&self) -> ObjectType {
+ ObjectType::Blob
+ }
- fn get_hash(&self) -> Hash {
- let mut hasher = Sha512::new();
- hasher.write_all(self.get_prefix().as_bytes()).unwrap();
+ fn get_hash(&self) -> Hash {
+ let mut hasher = Sha512::new();
+ hasher.write_all(self.get_prefix().as_bytes()).unwrap();
- let f = File::open(&self.file).unwrap();
- let mut reader = BufReader::new(f);
+ let f = File::open(&self.file).unwrap();
+ let mut reader = BufReader::new(f);
- let mut buf: [u8; 1024] = [0; 1024];
+ let mut buf: [u8; 1024] = [0; 1024];
- while let Ok(bytes_read) = reader.read(&mut buf) {
- if bytes_read == 0 {
- break;
- }
+ while let Ok(bytes_read) = reader.read(&mut buf) {
+ if bytes_read == 0 {
+ break;
+ }
- hasher.write_all(&buf[..bytes_read]).unwrap();
- }
+ hasher.write_all(&buf[..bytes_read]).unwrap();
+ }
- Hash::from(hasher)
- }
+ Hash::from(hasher)
+ }
- fn write_to(&self, path: &Path) {
- let mut file = File::create(path).unwrap();
+ fn write_to(&self, path: &Path) {
+ let mut file = File::create(path).unwrap();
- file.write_all(self.get_prefix().as_bytes()).unwrap();
+ file.write_all(self.get_prefix().as_bytes()).unwrap();
- let mut src = File::open(&self.file).unwrap();
- std::io::copy(&mut src, &mut file).unwrap();
- }
+ let mut src = File::open(&self.file).unwrap();
+ std::io::copy(&mut src, &mut file).unwrap();
+ }
- fn get_prefix(&self) -> String {
- format!("{} {}\0", BLOB_KEY, self.size)
- }
+ fn get_prefix(&self) -> String {
+ format!("{} {}\0", BLOB_KEY, self.size)
+ }
}
#[derive(Debug)]
enum TreeObject {
- Tree(Hashed<Tree>),
- Blob(Hashed<Blob>),
+ Tree(Hashed<Tree>),
+ Blob(Hashed<Blob>),
}
trait ObjectWithPath: WithPath + Object {}
impl TreeObject {
- fn to_tree_bytes(&self) -> Vec<u8> {
- match self {
- Self::Tree(tree) => get_bytes_from_thing(tree.deref(), &tree.hash),
- Self::Blob(blob) => get_bytes_from_thing(blob.deref(), &blob.hash),
- }
- }
-
- fn path_component(&self) -> &str {
- match self {
- Self::Tree(tree) => tree.get_path_component(),
- Self::Blob(blob) => blob.get_path_component(),
- }
- }
+ fn to_tree_bytes(&self) -> Vec<u8> {
+ match self {
+ Self::Tree(tree) => get_bytes_from_thing(tree.deref(), &tree.hash),
+ Self::Blob(blob) => get_bytes_from_thing(blob.deref(), &blob.hash),
+ }
+ }
+
+ fn path_component(&self) -> &str {
+ match self {
+ Self::Tree(tree) => tree.get_path_component(),
+ Self::Blob(blob) => blob.get_path_component(),
+ }
+ }
}
fn get_bytes_from_thing<T: WithPath>(object: &T, hash: &Hash) -> Vec<u8> {
- let mut path = Vec::new();
+ let mut path = Vec::new();
- path.extend_from_slice(object.get_mode().to_string().as_bytes());
- path.push(b' ');
- path.extend_from_slice(object.get_path_component().as_bytes());
- path.push(0);
- path.extend_from_slice(&hash.hash);
+ path.extend_from_slice(object.get_mode().to_string().as_bytes());
+ path.push(b' ');
+ path.extend_from_slice(object.get_path_component().as_bytes());
+ path.push(0);
+ path.extend_from_slice(&hash.hash);
- path
+ path
}
impl<T: Object> Hashed<T> {
- fn write_if_not_exists(&self, dir: &Path) {
- let path = self.hash.get_path(dir);
- let dir = path.parent().unwrap();
+ fn write_if_not_exists(&self, dir: &Path) {
+ let path = self.hash.get_path(dir);
+ let dir = path.parent().unwrap();
- let _ = create_dir_all(dir);
+ let _ = create_dir_all(dir);
- if !path.exists() {
- // println!("writing {:?} {:?}", T::get_object_type(), path);
- self.write_to(&path);
- }
- }
+ if !path.exists() {
+ // println!("writing {:?} {:?}", T::get_object_type(), path);
+ self.write_to(&path);
+ }
+ }
}
fn get_total_size(index: &Hashed<Tree>) -> u128 {
- let mut total = 0;
+ let mut total = 0;
- for element in &index.contents {
- total += match element {
- TreeObject::Tree(tree) => get_total_size(tree),
- TreeObject::Blob(blob) => blob.size as u128,
- }
- }
+ for element in &index.contents {
+ total += match element {
+ TreeObject::Tree(tree) => get_total_size(tree),
+ TreeObject::Blob(blob) => blob.size as u128,
+ }
+ }
- total
+ total
}
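The recursive walk in `get_total_size` can be modelled with a self-contained enum; `Node` here is a stand-in for the project's `TreeObject`/`Hashed` pair, not the actual types:

```rust
// Simplified model of the recursive size accumulation above.
enum Node {
    Tree(Vec<Node>),
    Blob(u64),
}

// Sum blob sizes over the whole tree, recursing into subtrees.
// u128 matches the original's choice, avoiding overflow on huge trees.
fn total_size(nodes: &[Node]) -> u128 {
    nodes
        .iter()
        .map(|n| match n {
            Node::Tree(children) => total_size(children),
            Node::Blob(size) => *size as u128,
        })
        .sum()
}

fn main() {
    let tree = vec![
        Node::Blob(10),
        Node::Tree(vec![Node::Blob(5), Node::Tree(vec![Node::Blob(1)])]),
    ];
    assert_eq!(total_size(&tree), 16);
}
```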
fn commit_directory(cache: &PathBuf, path: &PathBuf) {
- assert!(path.exists());
- assert!(path.is_dir());
+ assert!(path.exists());
+ assert!(path.is_dir());
- if cache.exists() {
- assert!(cache.is_dir());
- } else {
- create_dir_all(cache).unwrap();
- }
+ if cache.exists() {
+ assert!(cache.is_dir());
+ } else {
+ create_dir_all(cache).unwrap();
+ }
- let Ok(path) = path.canonicalize() else {
- panic!("unable to canonicalize {path:?}");
- };
+ let Ok(path) = path.canonicalize() else {
+ panic!("unable to canonicalize {path:?}");
+ };
- let index = Index::from_path(&path, Some(cache));
+ let index = Index::from_path(&path, Some(cache));
- println!(
- "Finished generating Index for {} bytes of data",
- get_total_size(&index.tree)
- );
+ println!(
+ "Finished generating Index for {} bytes of data",
+ get_total_size(&index.tree)
+ );
- println!("{}", index.hash);
+ println!("{}", index.hash);
}
fn restore_directory(cache: &PathBuf, path: &PathBuf, index: Hash, validate: bool) {
- if !path.exists() {
- create_dir_all(path).expect("Directory Creation to work");
- }
+ if !path.exists() {
+ create_dir_all(path).expect("Directory Creation to work");
+ }
- if !path.is_dir() {
- panic!("Path provided must be a valid directory");
- }
+ if !path.is_dir() {
+ panic!("Path provided must be a valid directory");
+ }
- if read_dir(path).unwrap().any(|_| true) {
- panic!("Path provided must be an empty directory");
- }
+ if read_dir(path).unwrap().any(|_| true) {
+ panic!("Path provided must be an empty directory");
+ }
- let index_path = index.get_path(cache);
- let index_cache = Hashed::from_object(CacheObject::from_file(cache, &index_path));
+ let index_path = index.get_path(cache);
+ let index_cache = Hashed::from_object(CacheObject::from_file(cache, &index_path));
- let index = index_cache.to_index();
+ let index = index_cache.to_index();
- // println!("{index:?}");
+ // println!("{index:?}");
- write_tree(&index.tree, path);
+ write_tree(&index.tree, path);
- if validate {
- validate_tree(&index.tree, path);
- }
+ if validate {
+ validate_tree(&index.tree, path);
+ }
}
fn validate_tree(tree: &Tree, path: &Path) {
- for item in tree.contents.iter() {
- if let TreeObject::Tree(tree) = item {
- let tree_path = path.join(&tree.path);
- validate_tree(tree, &tree_path);
- continue;
- }
+ for item in tree.contents.iter() {
+ if let TreeObject::Tree(tree) = item {
+ let tree_path = path.join(&tree.path);
+ validate_tree(tree, &tree_path);
+ continue;
+ }
- let TreeObject::Blob(blob) = item else {
- unreachable!();
- };
+ let TreeObject::Blob(blob) = item else {
+ unreachable!();
+ };
- let blob_path = path.join(&blob.path);
+ let blob_path = path.join(&blob.path);
- let blob_hash = Hashed::from_object(Blob::from_path(&blob_path));
+ let blob_hash = Hashed::from_object(Blob::from_path(&blob_path));
- assert!(blob_hash.hash == blob.hash);
- }
+ assert!(blob_hash.hash == blob.hash);
+ }
}
fn write_tree(tree: &Tree, path: &Path) {
- for item in tree.contents.iter() {
- if let TreeObject::Tree(tree) = item {
- let tree_path = path.join(&tree.path);
+ for item in tree.contents.iter() {
+ if let TreeObject::Tree(tree) = item {
+ let tree_path = path.join(&tree.path);
- create_dir(&tree_path).expect("Directory creation to work");
+ create_dir(&tree_path).expect("Directory creation to work");
- write_tree(tree, &tree_path);
- continue;
- }
+ write_tree(tree, &tree_path);
+ continue;
+ }
- let TreeObject::Blob(blob) = item else {
- unreachable!();
- };
+ let TreeObject::Blob(blob) = item else {
+ unreachable!();
+ };
- let blob_path = path.join(&blob.path);
+ let blob_path = path.join(&blob.path);
- let file = File::create(blob_path).expect("File to be created");
- let mut writer = BufWriter::new(file);
+ let file = File::create(blob_path).expect("File to be created");
+ let mut writer = BufWriter::new(file);
- let cache_file = File::open(&blob.file).unwrap();
- let mut reader = BufReader::new(cache_file);
+ let cache_file = File::open(&blob.file).unwrap();
+ let mut reader = BufReader::new(cache_file);
- let _ = read_header_from_file(&mut reader);
+ let _ = read_header_from_file(&mut reader);
- let mut data: [u8; 1024] = [0; 1024];
- while let Ok(num) = reader.read(&mut data) {
- if num == 0 {
- break;
- }
- writer.write_all(&data[..num]).unwrap();
- }
- }
+ let mut data: [u8; 1024] = [0; 1024];
+ while let Ok(num) = reader.read(&mut data) {
+ if num == 0 {
+ break;
+ }
+ writer.write_all(&data[..num]).unwrap();
+ }
+ }
}
fn cat_object(cache: &Path, hash: &Hash) {
- let object_path = hash.get_path(cache);
-
- let file = File::open(&object_path).unwrap();
- let mut reader: BufReader<File> = BufReader::new(file);
-
- read_header_from_file(&mut reader).expect("file to contain a valid header");
-
- let mut stdout = std::io::stdout();
- let mut data: [u8; 1024] = [0; 1024];
- while let Ok(num) = reader.read(&mut data) {
- if num == 0 {
- break;
- }
- stdout.write_all(&data[..num]).unwrap();
- }
- println!();
+ let object_path = hash.get_path(cache);
+
+ let file = File::open(&object_path).unwrap();
+ let mut reader: BufReader<File> = BufReader::new(file);
+
+ read_header_from_file(&mut reader).expect("file to contain a valid header");
+
+ let mut stdout = std::io::stdout();
+ let mut data: [u8; 1024] = [0; 1024];
+ while let Ok(num) = reader.read(&mut data) {
+ if num == 0 {
+ break;
+ }
+ stdout.write_all(&data[..num]).unwrap();
+ }
+ println!();
}
fn push_cache(cache: &PathBuf, url: &String, hash: Option<Hash>) {
- if let Some(hash) = hash {
- let file = hash.get_path(cache);
- upload_object(&hash, &file, url);
- return;
- }
-
- for entry in read_dir(cache).unwrap().filter_map(|x| x.ok()) {
- let Ok(metadata) = entry.metadata() else {
- continue;
- };
-
- if metadata.is_file() {
- continue;
- }
-
- let prefix = entry.file_name();
-
- for entry in read_dir(entry.path()).unwrap().filter_map(|x| x.ok()) {
- let Ok(metadata) = entry.metadata() else {
- continue;
- };
-
- if !metadata.is_file() {
- continue;
- }
-
- let name = format!(
- "{}{}",
- prefix.to_string_lossy(),
- entry.file_name().to_string_lossy()
- );
- let hash = Hash::try_from(name).expect("Hash to be valid");
-
- upload_object(&hash, &entry.path(), url);
- }
- }
+ if let Some(hash) = hash {
+ let file = hash.get_path(cache);
+ upload_object(&hash, &file, url);
+ return;
+ }
+
+ for entry in read_dir(cache).unwrap().filter_map(|x| x.ok()) {
+ let Ok(metadata) = entry.metadata() else {
+ continue;
+ };
+
+ if metadata.is_file() {
+ continue;
+ }
+
+ let prefix = entry.file_name();
+
+ for entry in read_dir(entry.path()).unwrap().filter_map(|x| x.ok()) {
+ let Ok(metadata) = entry.metadata() else {
+ continue;
+ };
+
+ if !metadata.is_file() {
+ continue;
+ }
+
+ let name = format!(
+ "{}{}",
+ prefix.to_string_lossy(),
+ entry.file_name().to_string_lossy()
+ );
+ let hash = Hash::try_from(name).expect("Hash to be valid");
+
+ upload_object(&hash, &entry.path(), url);
+ }
+ }
}
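`push_cache` reconstructs each hash by concatenating a directory name with a file name, which implies `Hash::get_path` uses a git-style two-level fan-out: the first two hex characters name a subdirectory and the remainder names the file. A sketch of that assumed scheme (`hash_to_path` is hypothetical, not the project's API):

```rust
use std::path::PathBuf;

// Assumed fan-out layout behind Hash::get_path: "<cache>/<first 2 hex>/<rest>".
fn hash_to_path(cache: &str, hex: &str) -> PathBuf {
    PathBuf::from(cache).join(&hex[..2]).join(&hex[2..])
}

fn main() {
    let p = hash_to_path("/tmp/arx", "abcdef");
    assert_eq!(p, PathBuf::from("/tmp/arx/ab/cdef"));

    // push_cache reverses this: directory name + file name = full hash.
    let recovered = format!("{}{}", "ab", "cdef");
    assert_eq!(recovered, "abcdef");
}
```

The fan-out keeps any single directory from accumulating every object, which matters on filesystems that slow down with very large directories.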
fn pull_tree(cache: &PathBuf, url: &String, tree_hash: &Hash) {
- let tree_path = tree_hash.get_path(cache);
-
- // tree already exists locally so we can skip downloading it
- if !tree_path.exists() {
- let Some(Header { object_type, .. }) = download_object(tree_hash, &tree_path, url) else {
- eprintln!("Unable to download object with hash {tree_hash}");
- return;
- };
- assert!(object_type == ObjectType::Tree);
- }
-
- let mut file = File::open(tree_path).expect("Index file to exist");
- let mut index_data = Vec::new();
- let _ = file
- .read_to_end(&mut index_data)
- .expect("file to be readable");
-
- let (_, data) = read_header_and_body(&index_data).expect("Index to be in the correct format");
-
- let index_body = common::object_body::Tree::from_data(data);
-
- for entry in index_body.contents {
- let obj_path = entry.hash.get_path(cache);
- let Some(Header { object_type, .. }) = download_object(&entry.hash, &obj_path, url) else {
- eprintln!("Unable to download object with hash {}", entry.hash);
- return;
- };
-
- assert!(object_type != ObjectType::Index);
-
- if object_type == ObjectType::Tree {
- pull_tree(cache, url, &entry.hash);
- }
- }
+ let tree_path = tree_hash.get_path(cache);
+
+ // tree already exists locally so we can skip downloading it
+ if !tree_path.exists() {
+ let Some(Header { object_type, .. }) = download_object(tree_hash, &tree_path, url) else {
+ eprintln!("Unable to download object with hash {tree_hash}");
+ return;
+ };
+ assert!(object_type == ObjectType::Tree);
+ }
+
+ let mut file = File::open(tree_path).expect("Index file to exist");
+ let mut index_data = Vec::new();
+ let _ = file
+ .read_to_end(&mut index_data)
+ .expect("file to be readable");
+
+ let (_, data) = read_header_and_body(&index_data).expect("Index to be in the correct format");
+
+ let index_body = common::object_body::Tree::from_data(data);
+
+ for entry in index_body.contents {
+ let obj_path = entry.hash.get_path(cache);
+ let Some(Header { object_type, .. }) = download_object(&entry.hash, &obj_path, url) else {
+ eprintln!("Unable to download object with hash {}", entry.hash);
+ return;
+ };
+
+ assert!(object_type != ObjectType::Index);
+
+ if object_type == ObjectType::Tree {
+ pull_tree(cache, url, &entry.hash);
+ }
+ }
}
fn pull_cache(cache: &PathBuf, url: &String, hash: Hash) {
- let index_path = hash.get_path(cache);
- let Some(Header { object_type, .. }) = download_object(&hash, &index_path, url) else {
- eprintln!("Unable to download object with hash {hash}");
- return;
- };
+ let index_path = hash.get_path(cache);
+ let Some(Header { object_type, .. }) = download_object(&hash, &index_path, url) else {
+ eprintln!("Unable to download object with hash {hash}");
+ return;
+ };
- assert!(object_type == ObjectType::Index);
+ assert!(object_type == ObjectType::Index);
- let mut file = File::open(index_path).expect("Index file to exist");
- let mut index_data = Vec::new();
- let _ = file
- .read_to_end(&mut index_data)
- .expect("file to be readable");
+ let mut file = File::open(index_path).expect("Index file to exist");
+ let mut index_data = Vec::new();
+ let _ = file
+ .read_to_end(&mut index_data)
+ .expect("file to be readable");
- let (_, data) = read_header_and_body(&index_data).expect("Index to be in the correct format");
+ let (_, data) = read_header_and_body(&index_data).expect("Index to be in the correct format");
- let index_body = common::object_body::Index::from_data(data);
+ let index_body = common::object_body::Index::from_data(data);
- pull_tree(cache, url, &index_body.tree);
+ pull_tree(cache, url, &index_body.tree);
}
fn upload_object(hash: &Hash, file: &PathBuf, url: &String) {
- let file = File::open(file).expect("File to exist");
- let mut reader = BufReader::new(file);
+ let file = File::open(file).expect("File to exist");
+ let mut reader = BufReader::new(file);
- let Header { object_type, size } =
- read_header_from_file(&mut reader).expect("file to be a valid object");
+ let Header { object_type, size } =
+ read_header_from_file(&mut reader).expect("file to be a valid object");
- let url = format!("{url}/object/{hash}");
+ let url = format!("{url}/object/{hash}");
- println!("Sending put request to {url}");
+ println!("Sending put request to {url}");
- let response = ureq::put(url)
- .header("Object-Type", object_type.to_str())
- .header("Object-Size", size.to_string())
- .send(SendBody::from_reader(&mut reader));
+ let response = ureq::put(url)
+ .header("Object-Type", object_type.to_str())
+ .header("Object-Size", size.to_string())
+ .send(SendBody::from_reader(&mut reader));
- if let Err(err) = response {
- eprintln!("There was an error sending request {err:?}")
- }
+ if let Err(err) = response {
+ eprintln!("There was an error sending request {err:?}")
+ }
}
fn download_object(hash: &Hash, file: &PathBuf, url: &String) -> Option<Header> {
- let url = format!("{url}/object/{hash}");
-
- let dir = file.parent().expect("Path to not be at root");
- create_dir_all(dir).expect("Directory to be created");
-
- if file.exists() {
- let file = File::open(file).expect("File to exist");
- let mut reader = BufReader::new(file);
-
- let mut buffer = Vec::new();
- reader
- .read_until(0, &mut buffer)
- .expect("Header to exist within file");
-
- // subtract one to get rid of the null byte
- return read_header_from_slice(&buffer[..buffer.len() - 1]);
- }
-
- println!("Sending get request to {url}");
-
- let response = ureq::get(url).call();
-
- let mut response = match response {
- Ok(v) => v,
- Err(err) => {
- eprintln!("There was an error sending request {err:?}");
- return None;
- }
- };
-
- let file = File::create(file).expect("File to exist");
- let mut writer = BufWriter::new(file);
-
- let response_headers = response.headers();
- let object_type: ObjectType = ObjectType::from_str(
- response_headers
- .get("Object-Type")
- .expect("Object-Type Header to be present in the response")
- .to_str()
- .expect("Header to be valid ascii"),
- )
- .expect("Header to be a valid ObjectType");
- let object_size: u64 = response_headers
- .get("Object-Size")
- .expect("Object-Size header to be present in the response")
- .to_str()
- .expect("Header to be valid ascii")
- .parse()
- .expect("Header to be a valid number");
-
- let header = Header::new(object_type, object_size);
-
- writer.write_all(header.to_string().as_bytes()).unwrap();
-
- let mut data: [u8; 1024] = [0; 1024];
- let mut reader = response.body_mut().as_reader();
- while let Ok(num) = reader.read(&mut data) {
- if num == 0 {
- break;
- }
-
- writer.write_all(&data[..num]).unwrap();
- }
-
- Some(header)
+ let url = format!("{url}/object/{hash}");
+
+ let dir = file.parent().expect("Path to not be at root");
+ create_dir_all(dir).expect("Directory to be created");
+
+ if file.exists() {
+ let file = File::open(file).expect("File to exist");
+ let mut reader = BufReader::new(file);
+
+ let mut buffer = Vec::new();
+ reader
+ .read_until(0, &mut buffer)
+ .expect("Header to exist within file");
+
+ // subtract one to get rid of the null byte
+ return read_header_from_slice(&buffer[..buffer.len() - 1]);
+ }
+
+ println!("Sending get request to {url}");
+
+ let response = ureq::get(url).call();
+
+ let mut response = match response {
+ Ok(v) => v,
+ Err(err) => {
+ eprintln!("There was an error sending request {err:?}");
+ return None;
+ }
+ };
+
+ let file = File::create(file).expect("File to exist");
+ let mut writer = BufWriter::new(file);
+
+ let response_headers = response.headers();
+ let object_type: ObjectType = ObjectType::from_str(
+ response_headers
+ .get("Object-Type")
+ .expect("Object-Type Header to be present in the response")
+ .to_str()
+ .expect("Header to be valid ascii"),
+ )
+ .expect("Header to be a valid ObjectType");
+ let object_size: u64 = response_headers
+ .get("Object-Size")
+ .expect("Object-Size header to be present in the response")
+ .to_str()
+ .expect("Header to be valid ascii")
+ .parse()
+ .expect("Header to be a valid number");
+
+ let header = Header::new(object_type, object_size);
+
+ writer.write_all(header.to_string().as_bytes()).unwrap();
+
+ let mut data: [u8; 1024] = [0; 1024];
+ let mut reader = response.body_mut().as_reader();
+ while let Ok(num) = reader.read(&mut data) {
+ if num == 0 {
+ break;
+ }
+
+ writer.write_all(&data[..num]).unwrap();
+ }
+
+ Some(header)
}
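`read_header_from_slice` is assumed to parse a `<type> <size>` header from the bytes before the NUL terminator; here is a hypothetical std-only stand-in that uses plain strings in place of the real `Header`/`ObjectType` types:

```rust
// Hypothetical stand-in for read_header_from_slice: parse "<type> <size>"
// out of the bytes preceding the NUL terminator.
fn parse_header(bytes: &[u8]) -> Option<(String, u64)> {
    let text = std::str::from_utf8(bytes).ok()?;
    let (object_type, size) = text.split_once(' ')?;
    Some((object_type.to_string(), size.parse().ok()?))
}

fn main() {
    // read_until(0, ..) keeps the delimiter, hence the `len - 1` slice above.
    let raw = b"blob 42\0rest of object body";
    let nul = raw.iter().position(|&b| b == 0).unwrap();
    let (kind, size) = parse_header(&raw[..nul]).unwrap();
    assert_eq!(kind, "blob");
    assert_eq!(size, 42);
}
```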
fn pack_archive(
- cache: &Path,
- path: &Path,
- index_hash: &Hash,
- compression: CompressionAlgorithm,
- compression_level: CompressionLevel,
+ cache: &Path,
+ path: &Path,
+ index_hash: &Hash,
+ compression: CompressionAlgorithm,
+ compression_level: CompressionLevel,
) -> anyhow::Result<()> {
- assert!(!path.exists());
- assert!(path.parent().map(|p| p.exists() && p.is_dir()) == Some(true));
+ assert!(!path.exists());
+ assert!(path.parent().map(|p| p.exists() && p.is_dir()) == Some(true));
- let index_path = index_hash.get_path(cache);
- assert!(index_path.exists());
+ let index_path = index_hash.get_path(cache);
+ assert!(index_path.exists());
- let index = {
- let file = File::open(index_path).expect("file to exist");
- let mut reader = BufReader::new(file);
- let mut data = Vec::new();
- let _ = reader.read_to_end(&mut data).expect("File to be readable");
+ let index = {
+ let file = File::open(index_path).expect("file to exist");
+ let mut reader = BufReader::new(file);
+ let mut data = Vec::new();
+ let _ = reader.read_to_end(&mut data).expect("File to be readable");
- let (header, body) = read_header_and_body(&data).expect("File to be correctly formatted");
- assert!(header.object_type == ObjectType::Index);
+ let (header, body) = read_header_and_body(&data).expect("File to be correctly formatted");
+ assert!(header.object_type == ObjectType::Index);
- common::object_body::Index::from_data(body)
- };
+ common::object_body::Index::from_data(body)
+ };
- let mut headers: HashMap<Hash, Header> = HashMap::new();
+ let mut headers: HashMap<Hash, Header> = HashMap::new();
- read_object_into_headers_sync(cache, &mut headers, &index.tree)?;
+ read_object_into_headers_sync(cache, &mut headers, &index.tree)?;
- // TODO: Surely there is an algorithm to lay out this data more efficiently
- let mut i = 0;
- let mut header_entries: Vec<ArchiveHeaderEntry> = Vec::new();
+ // TODO: Surely there is an algorithm to lay out this data more efficiently
+ let mut i = 0;
+ let mut header_entries: Vec<ArchiveHeaderEntry> = Vec::new();
- for (hash, header) in &headers {
- let prefix_length = header.to_string().len() as u64;
- let total_length = header.size + prefix_length;
+ for (hash, header) in &headers {
+ let prefix_length = header.to_string().len() as u64;
+ let total_length = header.size + prefix_length;
- header_entries.push(ArchiveHeaderEntry {
- hash: hash.clone(),
- index: i,
- length: total_length,
- });
+ header_entries.push(ArchiveHeaderEntry {
+ hash: hash.clone(),
+ index: i,
+ length: total_length,
+ });
- i += total_length;
- }
+ i += total_length;
+ }
- let archive = Archive {
- header: HEADER,
- compression,
- hash: index_hash.clone(),
- index,
- body: ArchiveBody {
- header: header_entries,
- entries: headers
- .into_keys()
- .map(|hash| FileEntryData(hash.get_path(cache)))
- .collect(),
- },
- };
+ let archive = Archive {
+ header: HEADER,
+ compression,
+ hash: index_hash.clone(),
+ index,
+ body: ArchiveBody {
+ header: header_entries,
+ entries: headers
+ .into_keys()
+ .map(|hash| FileEntryData(hash.get_path(cache)))
+ .collect(),
+ },
+ };
- let arx_file = File::create(path)?;
- let mut writer = BufWriter::new(arx_file);
+ let arx_file = File::create(path)?;
+ let mut writer = BufWriter::new(arx_file);
- archive.to_data(compression_level, &mut writer)?;
+ archive.to_data(compression_level, &mut writer)?;
- Ok(())
+ Ok(())
}
fn unpack_archive(cache: &Path, path: &Path) -> anyhow::Result<()> {
- assert!(path.exists() && path.is_file());
+ assert!(path.exists() && path.is_file());
- let file = File::open(path)?;
- let mut file = BufReader::new(file);
+ let file = File::open(path)?;
+ let mut file = BufReader::new(file);
- let archive = Archive::<RawEntryData>::from_data(&mut file)?;
+ let archive = Archive::<RawEntryData>::from_data(&mut file)?;
- assert!(archive.body.entries.len() == archive.body.header.len());
+ assert!(archive.body.entries.len() == archive.body.header.len());
- println!("Successfully read archive, Index {}", archive.hash);
+ println!("Successfully read archive, Index {}", archive.hash);
- let index_data = archive.index.to_data();
- let index_header = Header::new(ObjectType::Index, index_data.len() as u64);
+ let index_data = archive.index.to_data();
+ let index_header = Header::new(ObjectType::Index, index_data.len() as u64);
- let mut hasher = Sha512::new();
- hasher.write_all(index_header.to_string().as_bytes())?;
- hasher.write_all(&index_data)?;
- assert!(Hash::from(hasher) == archive.hash);
+ let mut hasher = Sha512::new();
+ hasher.write_all(index_header.to_string().as_bytes())?;
+ hasher.write_all(&index_data)?;
+ assert!(Hash::from(hasher) == archive.hash);
- let path = archive.hash.get_path(cache);
- let _ = create_dir_all(path.parent().unwrap());
+ let path = archive.hash.get_path(cache);
+ let _ = create_dir_all(path.parent().unwrap());
- {
- let index_file = File::create(path)?;
- let mut writer = BufWriter::new(index_file);
- writer.write_all(index_header.to_string().as_bytes())?;
- writer.write_all(&index_data)?;
- }
+ {
+ let index_file = File::create(path)?;
+ let mut writer = BufWriter::new(index_file);
+ writer.write_all(index_header.to_string().as_bytes())?;
+ writer.write_all(&index_data)?;
+ }
- for (header, entry) in archive.body.header.into_iter().zip(archive.body.entries) {
- let path = header.hash.get_path(cache);
- let _ = create_dir_all(path.parent().unwrap());
+ for (header, entry) in archive.body.header.into_iter().zip(archive.body.entries) {
+ let path = header.hash.get_path(cache);
+ let _ = create_dir_all(path.parent().unwrap());
- let file = File::create(path)?;
- let mut writer = BufWriter::new(file);
+ let file = File::create(path)?;
+ let mut writer = BufWriter::new(file);
- writer.write_all(&entry.turn_into_vec())?;
- }
+ writer.write_all(&entry.turn_into_vec())?;
+ }
- Ok(())
+ Ok(())
}
/// Archive body entry: either an already-serialised byte buffer (for tree and
/// index objects, which are small and built up in memory during the walk) or
/// a lazy source-file read (for blobs, which can be arbitrarily large).
enum ArchiveEntry {
- Raw(RawEntryData, u64),
- Source(SourceFileEntryData, u64),
+ Raw(RawEntryData, u64),
+ Source(SourceFileEntryData, u64),
}
impl ArchiveEntry {
- fn length(&self) -> u64 {
- match self {
- ArchiveEntry::Raw(_, len) => *len,
- ArchiveEntry::Source(_, len) => *len,
- }
- }
+ fn length(&self) -> u64 {
+ match self {
+ ArchiveEntry::Raw(_, len) => *len,
+ ArchiveEntry::Source(_, len) => *len,
+ }
+ }
}
impl ArchiveEntryData for ArchiveEntry {
- fn turn_into_vec(self) -> Vec<u8> {
- match self {
- ArchiveEntry::Raw(data, _) => data.turn_into_vec(),
- ArchiveEntry::Source(data, _) => data.turn_into_vec(),
- }
- }
+ fn turn_into_vec(self) -> Vec<u8> {
+ match self {
+ ArchiveEntry::Raw(data, _) => data.turn_into_vec(),
+ ArchiveEntry::Source(data, _) => data.turn_into_vec(),
+ }
+ }
}
fn collect_archive_entries(tree: &Hashed<Tree>, entries: &mut HashMap<Hash, ArchiveEntry>) {
- if !entries.contains_key(&tree.hash) {
- let prefix = tree.get_prefix();
- let body = tree.get_body();
- let mut bytes = Vec::with_capacity(prefix.len() + body.len());
- bytes.extend_from_slice(prefix.as_bytes());
- bytes.extend_from_slice(&body);
- let length = bytes.len() as u64;
- entries.insert(
- tree.hash.clone(),
- ArchiveEntry::Raw(RawEntryData::new(bytes), length),
- );
- }
-
- for content in &tree.contents {
- match content {
- TreeObject::Tree(subtree) => collect_archive_entries(subtree, entries),
- TreeObject::Blob(blob) => {
- if entries.contains_key(&blob.hash) {
- continue;
- }
- let header = Header::new(ObjectType::Blob, blob.size);
- let length = header.to_string().len() as u64 + blob.size;
- entries.insert(
- blob.hash.clone(),
- ArchiveEntry::Source(
- SourceFileEntryData {
- source_path: blob.file.clone(),
- header,
- },
- length,
- ),
- );
- }
- }
- }
+ if !entries.contains_key(&tree.hash) {
+ let prefix = tree.get_prefix();
+ let body = tree.get_body();
+ let mut bytes = Vec::with_capacity(prefix.len() + body.len());
+ bytes.extend_from_slice(prefix.as_bytes());
+ bytes.extend_from_slice(&body);
+ let length = bytes.len() as u64;
+ entries.insert(
+ tree.hash.clone(),
+ ArchiveEntry::Raw(RawEntryData::new(bytes), length),
+ );
+ }
+
+ for content in &tree.contents {
+ match content {
+ TreeObject::Tree(subtree) => collect_archive_entries(subtree, entries),
+ TreeObject::Blob(blob) => {
+ if entries.contains_key(&blob.hash) {
+ continue;
+ }
+ let header = Header::new(ObjectType::Blob, blob.size);
+ let length = header.to_string().len() as u64 + blob.size;
+ entries.insert(
+ blob.hash.clone(),
+ ArchiveEntry::Source(
+ SourceFileEntryData {
+ source_path: blob.file.clone(),
+ header,
+ },
+ length,
+ ),
+ );
+ }
+ }
+ }
}
fn archive_directory(
- directory: &Path,
- out_file: &Path,
- algorithm: CompressionAlgorithm,
- level: CompressionLevel,
+ directory: &Path,
+ out_file: &Path,
+ algorithm: CompressionAlgorithm,
+ level: CompressionLevel,
) -> anyhow::Result<()> {
- assert!(!out_file.exists(), "output file must not already exist");
- assert!(
- out_file.parent().map(|p| p.exists() && p.is_dir()) == Some(true),
- "parent of output file must exist and be a directory"
- );
- assert!(directory.is_dir(), "source must be a directory");
-
- let directory = directory
- .canonicalize()
- .unwrap_or_else(|_| panic!("unable to canonicalize {directory:?}"));
-
- // start timer
- let start = std::time::Instant::now();
-
- let hashed_index = Index::from_path(&directory, None);
-
- println!(
- "Finished generating Index for {} bytes of data in {} seconds",
- get_total_size(&hashed_index.tree),
- start.elapsed().as_secs_f64()
- );
-
- // Collect trees + blobs, deduping by hash. Index lives in the archive
- // header, not in body entries.
- let mut entries: HashMap<Hash, ArchiveEntry> = HashMap::new();
- collect_archive_entries(&hashed_index.tree, &mut entries);
-
- let mut offset: u64 = 0;
- let mut header_entries: Vec<ArchiveHeaderEntry> = Vec::with_capacity(entries.len());
- let mut body_entries: Vec<ArchiveEntry> = Vec::with_capacity(entries.len());
- for (hash, entry) in entries {
- let length = entry.length();
- header_entries.push(ArchiveHeaderEntry {
- hash: hash.clone(),
- index: offset,
- length,
- });
- body_entries.push(entry);
- offset += length;
- }
-
- let archive_index = common::object_body::Index {
- tree: hashed_index.tree.hash.clone(),
- timestamp: hashed_index.timestamp,
- metadata: HashMap::new(),
- };
-
- let archive = Archive {
- header: HEADER,
- compression: algorithm,
- hash: hashed_index.hash.clone(),
- index: archive_index,
- body: ArchiveBody {
- header: header_entries,
- entries: body_entries,
- },
- };
-
- let out = File::create(out_file)?;
- let mut writer = BufWriter::new(out);
- archive.to_data(level, &mut writer)?;
-
- println!(
- "Finished writing archive in {} seconds",
- start.elapsed().as_secs_f64()
- );
-
- println!("{}", hashed_index.hash);
- Ok(())
+ assert!(!out_file.exists(), "output file must not already exist");
+ assert!(
+ out_file.parent().map(|p| p.exists() && p.is_dir()) == Some(true),
+ "parent of output file must exist and be a directory"
+ );
+ assert!(directory.is_dir(), "source must be a directory");
+
+ let directory = directory
+ .canonicalize()
+ .unwrap_or_else(|_| panic!("unable to canonicalize {directory:?}"));
+
+ // start timer
+ let start = std::time::Instant::now();
+
+ let hashed_index = Index::from_path(&directory, None);
+
+ println!(
+ "Finished generating Index for {} bytes of data in {} seconds",
+ get_total_size(&hashed_index.tree),
+ start.elapsed().as_secs_f64()
+ );
+
+ // Collect trees + blobs, deduping by hash. Index lives in the archive
+ // header, not in body entries.
+ let mut entries: HashMap<Hash, ArchiveEntry> = HashMap::new();
+ collect_archive_entries(&hashed_index.tree, &mut entries);
+
+ let mut offset: u64 = 0;
+ let mut header_entries: Vec<ArchiveHeaderEntry> = Vec::with_capacity(entries.len());
+ let mut body_entries: Vec<ArchiveEntry> = Vec::with_capacity(entries.len());
+ for (hash, entry) in entries {
+ let length = entry.length();
+ header_entries.push(ArchiveHeaderEntry {
+ hash: hash.clone(),
+ index: offset,
+ length,
+ });
+ body_entries.push(entry);
+ offset += length;
+ }
+
+ let archive_index = common::object_body::Index {
+ tree: hashed_index.tree.hash.clone(),
+ timestamp: hashed_index.timestamp,
+ metadata: HashMap::new(),
+ };
+
+ let archive = Archive {
+ header: HEADER,
+ compression: algorithm,
+ hash: hashed_index.hash.clone(),
+ index: archive_index,
+ body: ArchiveBody {
+ header: header_entries,
+ entries: body_entries,
+ },
+ };
+
+ let out = File::create(out_file)?;
+ let mut writer = BufWriter::new(out);
+ archive.to_data(level, &mut writer)?;
+
+ println!(
+ "Finished writing archive in {} seconds",
+ start.elapsed().as_secs_f64()
+ );
+
+ println!("{}", hashed_index.hash);
+ Ok(())
}
#[derive(Parser)]
#[command(version, about, long_about = None)]
struct Cli {
- #[arg(short, long, value_name = "Store", default_value = "~/.cache/arx")]
- store: PathBuf,
-
- //TODO: Implement tracing/logging in client
- /// Verbose mode (-v, -vv, etc). Can be used to increase logging verbosity.
- #[arg(short, long, action = clap::ArgAction::Count)]
- verbose: u8,
- /// Quiet mode (-q, -qq, etc). Can be used to decrease logging verbosity.
- #[arg(short, long, action = clap::ArgAction::Count)]
- quiet: u8,
-
- #[command(subcommand)]
- command: Commands,
+ #[arg(short, long, value_name = "Store", default_value = "~/.cache/arx")]
+ store: PathBuf,
+
+ //TODO: Implement tracing/logging in client
+ /// Verbose mode (-v, -vv, etc). Can be used to increase logging verbosity.
+ #[arg(short, long, action = clap::ArgAction::Count)]
+ verbose: u8,
+ /// Quiet mode (-q, -qq, etc). Can be used to decrease logging verbosity.
+ #[arg(short, long, action = clap::ArgAction::Count)]
+ quiet: u8,
+
+ #[command(subcommand)]
+ command: Commands,
}
#[derive(Subcommand)]
enum Commands {
- Commit {
- directory: PathBuf,
- },
-
- Restore {
- #[arg(short, long)]
- directory: PathBuf,
- #[arg(short, long)]
- index: Hash,
- #[arg(long)]
- validate: bool,
- },
-
- Cat {
- #[arg(long)]
- hash: Hash,
- },
-
- Push {
- #[arg(long)]
- url: String,
-
- #[arg(long)]
- index: Option<Hash>,
- },
-
- Pull {
- #[arg(long)]
- url: String,
-
- #[arg(long)]
- index: Hash,
- },
-
- Pack {
- #[arg(long)]
- index: Hash,
-
- #[arg(long)]
- file: PathBuf,
-
- #[arg(long, default_value_t, alias = "compression", alias = "alg")]
- algorithm: CompressionAlgorithm,
-
- #[arg(long, default_value_t, allow_hyphen_values = true)]
- level: CompressionLevel,
- },
-
- Unpack {
- file: PathBuf,
- },
-
- /// Build an archive directly from a source directory, bypassing the local store.
- Archive {
- directory: PathBuf,
-
- #[arg(short, long, default_value = "archive.arx")]
- output: PathBuf,
-
- #[arg(long, default_value_t, alias = "compression", alias = "alg")]
- algorithm: CompressionAlgorithm,
-
- #[arg(long, default_value_t, allow_hyphen_values = true)]
- level: CompressionLevel,
- },
+ Commit {
+ directory: PathBuf,
+ },
+
+ Restore {
+ #[arg(short, long)]
+ directory: PathBuf,
+ #[arg(short, long)]
+ index: Hash,
+ #[arg(long)]
+ validate: bool,
+ },
+
+ Cat {
+ #[arg(long)]
+ hash: Hash,
+ },
+
+ Push {
+ #[arg(long)]
+ url: String,
+
+ #[arg(long)]
+ index: Option<Hash>,
+ },
+
+ Pull {
+ #[arg(long)]
+ url: String,
+
+ #[arg(long)]
+ index: Hash,
+ },
+
+ Pack {
+ #[arg(long)]
+ index: Hash,
+
+ #[arg(long)]
+ file: PathBuf,
+
+ #[arg(long, default_value_t, alias = "compression", alias = "alg")]
+ algorithm: CompressionAlgorithm,
+
+ #[arg(long, default_value_t, allow_hyphen_values = true)]
+ level: CompressionLevel,
+ },
+
+ Unpack {
+ file: PathBuf,
+ },
+
+ /// Build an archive directly from a source directory, bypassing the local store.
+ Archive {
+ directory: PathBuf,
+
+ #[arg(short, long, default_value = "archive.arx")]
+ output: PathBuf,
+
+ #[arg(long, default_value_t, alias = "compression", alias = "alg")]
+ algorithm: CompressionAlgorithm,
+
+ #[arg(long, default_value_t, allow_hyphen_values = true)]
+ level: CompressionLevel,
+ },
}
fn main() {
- let mut cli = Cli::parse();
-
- cli.store = shellexpand::tilde(cli.store.to_str().unwrap())
- .into_owned()
- .into();
-
- match cli.command {
- Commands::Commit { directory } => commit_directory(&cli.store, &directory),
- Commands::Restore {
- directory,
- index,
- validate,
- } => restore_directory(&cli.store, &directory, index, validate),
- Commands::Cat { hash } => cat_object(&cli.store, &hash),
- Commands::Push { url, index } => push_cache(&cli.store, &url, index),
- Commands::Pull { url, index } => pull_cache(&cli.store, &url, index),
- Commands::Pack {
- index,
- file,
- algorithm,
- level,
- } => pack_archive(&cli.store, &file, &index, algorithm, level).expect("Packing to work"),
- Commands::Unpack { file } => unpack_archive(&cli.store, &file).expect("Unpacking to work"),
- Commands::Archive {
- directory,
- output,
- algorithm,
- level,
- } => archive_directory(&directory, &output, algorithm, level).expect("Archiving to work"),
- }
+ let mut cli = Cli::parse();
+
+ cli.store = shellexpand::tilde(cli.store.to_str().unwrap())
+ .into_owned()
+ .into();
+
+ match cli.command {
+ Commands::Commit { directory } => commit_directory(&cli.store, &directory),
+ Commands::Restore {
+ directory,
+ index,
+ validate,
+ } => restore_directory(&cli.store, &directory, index, validate),
+ Commands::Cat { hash } => cat_object(&cli.store, &hash),
+ Commands::Push { url, index } => push_cache(&cli.store, &url, index),
+ Commands::Pull { url, index } => pull_cache(&cli.store, &url, index),
+ Commands::Pack {
+ index,
+ file,
+ algorithm,
+ level,
+ } => pack_archive(&cli.store, &file, &index, algorithm, level).expect("Packing to work"),
+ Commands::Unpack { file } => unpack_archive(&cli.store, &file).expect("Unpacking to work"),
+ Commands::Archive {
+ directory,
+ output,
+ algorithm,
+ level,
+ } => archive_directory(&directory, &output, algorithm, level).expect("Archiving to work"),
+ }
}
#[cfg(test)]
mod tests {
- use super::*;
- use tempfile::TempDir;
-
- fn make_dir_with_files(files: &[&str]) -> TempDir {
- let dir = TempDir::new().unwrap();
- for name in files {
- std::fs::write(dir.path().join(name), name.as_bytes()).unwrap();
- }
- dir
- }
-
- #[test]
- fn tree_entries_are_sorted_by_name() {
- // Created in deliberately non-alphabetical order.
- let dir = make_dir_with_files(&["zebra.txt", "alpha.txt", "middle.txt"]);
-
- let tree = Tree::from_dir(dir.path(), None);
-
- let names: Vec<&str> = tree.contents.iter().map(|o| o.path_component()).collect();
- assert_eq!(names, vec!["alpha.txt", "middle.txt", "zebra.txt"]);
- }
-
- #[test]
- fn tree_hash_is_deterministic() {
- let dir = make_dir_with_files(&["c.txt", "a.txt", "b.txt"]);
-
- let path = dir.path().to_path_buf();
- let first = Tree::from_dir(&path, None).hash;
- let second = Tree::from_dir(&path, None).hash;
-
- assert_eq!(first, second, "tree hash must be stable across runs");
- }
-
- #[test]
- fn archive_produces_parseable_output_with_expected_shape() {
- let src = make_dir_with_files(&["alpha.txt", "beta.txt", "gamma.txt"]);
- let out_dir = TempDir::new().unwrap();
- let out = out_dir.path().join("out.arx");
-
- archive_directory(
- src.path(),
- &out,
- CompressionAlgorithm::None,
- CompressionLevel::Default,
- )
- .expect("archive to succeed");
-
- // Reopen and parse.
- let f = File::open(&out).unwrap();
- let mut reader = BufReader::new(f);
- let archive = Archive::<RawEntryData>::from_data(&mut reader).expect("archive to parse");
-
- // Archive hash must match the SHA-512 of the index header + body bytes —
- // the same integrity invariant `unpack_archive` enforces when restoring
- // into a store. This catches mismatches between archive.hash and
- // archive.index without needing a separate reference build.
- let index_data = archive.index.to_data();
- let index_header = Header::new(ObjectType::Index, index_data.len() as u64);
- let mut hasher = Sha512::new();
- hasher
- .write_all(index_header.to_string().as_bytes())
- .unwrap();
- hasher.write_all(&index_data).unwrap();
- assert_eq!(
- archive.hash,
- Hash::from(hasher),
- "archive hash must equal sha512(index header + body)"
- );
-
- // Body has 1 tree + 3 blobs = 4 entries (index is in the archive header, not body).
- assert_eq!(
- archive.body.entries.len(),
- 4,
- "expected 1 tree + 3 blob entries"
- );
- assert_eq!(archive.body.header.len(), 4);
- }
-
- #[test]
- fn archive_dedups_identical_file_contents() {
- // Two files with the same content → one blob entry in the archive.
- let src = TempDir::new().unwrap();
- std::fs::write(src.path().join("first.txt"), b"shared content").unwrap();
- std::fs::write(src.path().join("second.txt"), b"shared content").unwrap();
-
- let out_dir = TempDir::new().unwrap();
- let out = out_dir.path().join("out.arx");
-
- archive_directory(
- src.path(),
- &out,
- CompressionAlgorithm::None,
- CompressionLevel::Default,
- )
- .expect("archive to succeed");
-
- let f = File::open(&out).unwrap();
- let mut reader = BufReader::new(f);
- let archive = Archive::<RawEntryData>::from_data(&mut reader).expect("archive to parse");
-
- // 1 tree + 1 deduped blob = 2 body entries.
- assert_eq!(
- archive.body.entries.len(),
- 2,
- "duplicate-content files must share a single blob entry"
- );
- }
+ use super::*;
+ use tempfile::TempDir;
+
+ fn make_dir_with_files(files: &[&str]) -> TempDir {
+ let dir = TempDir::new().unwrap();
+ for name in files {
+ std::fs::write(dir.path().join(name), name.as_bytes()).unwrap();
+ }
+ dir
+ }
+
+ #[test]
+ fn tree_entries_are_sorted_by_name() {
+ // Created in deliberately non-alphabetical order.
+ let dir = make_dir_with_files(&["zebra.txt", "alpha.txt", "middle.txt"]);
+
+ let tree = Tree::from_dir(dir.path(), None);
+
+ let names: Vec<&str> = tree.contents.iter().map(|o| o.path_component()).collect();
+ assert_eq!(names, vec!["alpha.txt", "middle.txt", "zebra.txt"]);
+ }
+
+ #[test]
+ fn tree_hash_is_deterministic() {
+ let dir = make_dir_with_files(&["c.txt", "a.txt", "b.txt"]);
+
+ let path = dir.path().to_path_buf();
+ let first = Tree::from_dir(&path, None).hash;
+ let second = Tree::from_dir(&path, None).hash;
+
+ assert_eq!(first, second, "tree hash must be stable across runs");
+ }
+
+ #[test]
+ fn archive_produces_parseable_output_with_expected_shape() {
+ let src = make_dir_with_files(&["alpha.txt", "beta.txt", "gamma.txt"]);
+ let out_dir = TempDir::new().unwrap();
+ let out = out_dir.path().join("out.arx");
+
+ archive_directory(
+ src.path(),
+ &out,
+ CompressionAlgorithm::None,
+ CompressionLevel::Default,
+ )
+ .expect("archive to succeed");
+
+ // Reopen and parse.
+ let f = File::open(&out).unwrap();
+ let mut reader = BufReader::new(f);
+ let archive = Archive::<RawEntryData>::from_data(&mut reader).expect("archive to parse");
+
+ // Archive hash must match the SHA-512 of the index header + body bytes —
+ // the same integrity invariant `unpack_archive` enforces when restoring
+ // into a store. This catches mismatches between archive.hash and
+ // archive.index without needing a separate reference build.
+ let index_data = archive.index.to_data();
+ let index_header = Header::new(ObjectType::Index, index_data.len() as u64);
+ let mut hasher = Sha512::new();
+ hasher
+ .write_all(index_header.to_string().as_bytes())
+ .unwrap();
+ hasher.write_all(&index_data).unwrap();
+ assert_eq!(
+ archive.hash,
+ Hash::from(hasher),
+ "archive hash must equal sha512(index header + body)"
+ );
+
+ // Body has 1 tree + 3 blobs = 4 entries (index is in the archive header, not body).
+ assert_eq!(
+ archive.body.entries.len(),
+ 4,
+ "expected 1 tree + 3 blob entries"
+ );
+ assert_eq!(archive.body.header.len(), 4);
+ }
+
+ #[test]
+ fn archive_dedups_identical_file_contents() {
+ // Two files with the same content → one blob entry in the archive.
+ let src = TempDir::new().unwrap();
+ std::fs::write(src.path().join("first.txt"), b"shared content").unwrap();
+ std::fs::write(src.path().join("second.txt"), b"shared content").unwrap();
+
+ let out_dir = TempDir::new().unwrap();
+ let out = out_dir.path().join("out.arx");
+
+ archive_directory(
+ src.path(),
+ &out,
+ CompressionAlgorithm::None,
+ CompressionLevel::Default,
+ )
+ .expect("archive to succeed");
+
+ let f = File::open(&out).unwrap();
+ let mut reader = BufReader::new(f);
+ let archive = Archive::<RawEntryData>::from_data(&mut reader).expect("archive to parse");
+
+ // 1 tree + 1 deduped blob = 2 body entries.
+ assert_eq!(
+ archive.body.entries.len(),
+ 2,
+ "duplicate-content files must share a single blob entry"
+ );
+ }
}
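The archive prologue that `Archive::to_data` writes in the `common/src/archive.rs` diff below (the 4-byte `arxa` magic, a big-endian `u16` compression tag, a raw 64-byte SHA-512 digest, the serialized index, and a single NUL terminator before the body) can be sketched as a standalone reader/writer pair. Only the byte layout is taken from the diff; the helper names here are hypothetical and not part of the ARX codebase:

```rust
use std::io::{Read, Write};

const MAGIC: [u8; 4] = *b"arxa";

// Write the prologue: magic, compression tag (big-endian), digest, index, NUL.
// Note the index bytes must not themselves contain a NUL, since the real
// parser scans for the first zero byte with `read_until(0, ..)`.
fn write_prologue(
    w: &mut impl Write,
    compression: u16,
    hash: &[u8; 64],
    index: &[u8],
) -> std::io::Result<()> {
    w.write_all(&MAGIC)?;
    w.write_all(&compression.to_be_bytes())?;
    w.write_all(hash)?;
    w.write_all(index)?;
    w.write_all(&[0])
}

// Read the prologue back, stopping at the NUL terminator.
fn read_prologue(r: &mut impl Read) -> std::io::Result<(u16, [u8; 64], Vec<u8>)> {
    let mut magic = [0u8; 4];
    r.read_exact(&mut magic)?;
    assert_eq!(magic, MAGIC, "bad magic");
    let mut tag = [0u8; 2];
    r.read_exact(&mut tag)?;
    let compression = u16::from_be_bytes(tag);
    let mut hash = [0u8; 64];
    r.read_exact(&mut hash)?;
    let mut index = Vec::new();
    let mut byte = [0u8; 1];
    loop {
        r.read_exact(&mut byte)?;
        if byte[0] == 0 {
            break;
        }
        index.push(byte[0]);
    }
    Ok((compression, hash, index))
}

fn main() {
    let mut buf = Vec::new();
    write_prologue(&mut buf, 2, &[0xab; 64], b"{\"tree\":\"...\"}").unwrap();
    let (comp, hash, index) = read_prologue(&mut buf.as_slice()).unwrap();
    println!("{comp} {} {}", hash[0], index.len()); // prints: 2 171 14
}
```

A round trip like this is also why the body offsets in `ArchiveHeaderEntry` are relative to the end of the prologue: everything before the NUL has variable length (the index), so absolute file offsets would not survive an index edit.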
diff --git a/common/src/archive.rs b/common/src/archive.rs
index bc72317..363fc0f 100644
--- a/common/src/archive.rs
+++ b/common/src/archive.rs
@@ -1,10 +1,10 @@
use std::{
- fmt::{self, Display},
- fs::File,
- io::{BufRead, BufReader, Read, Write},
- num::NonZero,
- path::PathBuf,
- str::FromStr,
+ fmt::{self, Display},
+ fs::File,
+ io::{BufRead, BufReader, Read, Write},
+ num::NonZero,
+ path::PathBuf,
+ str::FromStr,
};
use anyhow::anyhow;
@@ -14,10 +14,10 @@ use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha512};
use crate::{
- object_body::{Index, Object},
- pipe,
- store::Store,
- Hash,
+ object_body::{Index, Object},
+ pipe,
+ store::Store,
+ Hash,
};
pub const HEADER: [u8; 4] = [b'a', b'r', b'x', b'a'];
@@ -25,245 +25,245 @@ pub const HEADER: [u8; 4] = [b'a', b'r', b'x', b'a'];
#[repr(u16)]
#[derive(Clone, Copy, Debug, Serialize, Deserialize, PartialEq, Eq, Default)]
pub enum CompressionAlgorithm {
- None = 0,
- #[default]
- Zstd = 2,
- Deflate = 4,
- LZMA2 = 8,
+ None = 0,
+ #[default]
+ Zstd = 2,
+ Deflate = 4,
+ LZMA2 = 8,
}
impl FromStr for CompressionAlgorithm {
- type Err = anyhow::Error;
-
- fn from_str(s: &str) -> Result<Self, Self::Err> {
- match s {
- "none" => Ok(CompressionAlgorithm::None),
- "deflate" => Ok(CompressionAlgorithm::Deflate),
- "lzma2" => Ok(CompressionAlgorithm::LZMA2),
- "zstd" => Ok(CompressionAlgorithm::Zstd),
- _ => Err(anyhow!("Invalid Compression Type")),
- }
- }
+ type Err = anyhow::Error;
+
+ fn from_str(s: &str) -> Result<Self, Self::Err> {
+ match s {
+ "none" => Ok(CompressionAlgorithm::None),
+ "deflate" => Ok(CompressionAlgorithm::Deflate),
+ "lzma2" => Ok(CompressionAlgorithm::LZMA2),
+ "zstd" => Ok(CompressionAlgorithm::Zstd),
+ _ => Err(anyhow!("Invalid Compression Type")),
+ }
+ }
}
impl Display for CompressionAlgorithm {
- fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
- match self {
- CompressionAlgorithm::None => write!(f, "none"),
- CompressionAlgorithm::Deflate => write!(f, "deflate"),
- CompressionAlgorithm::LZMA2 => write!(f, "lzma2"),
- CompressionAlgorithm::Zstd => write!(f, "zstd"),
- }
- }
+ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+ match self {
+ CompressionAlgorithm::None => write!(f, "none"),
+ CompressionAlgorithm::Deflate => write!(f, "deflate"),
+ CompressionAlgorithm::LZMA2 => write!(f, "lzma2"),
+ CompressionAlgorithm::Zstd => write!(f, "zstd"),
+ }
+ }
}
impl TryFrom<u16> for CompressionAlgorithm {
- type Error = ();
-
- fn try_from(v: u16) -> Result<Self, Self::Error> {
- match v {
- x if x == CompressionAlgorithm::None as u16 => Ok(CompressionAlgorithm::None),
- x if x == CompressionAlgorithm::Zstd as u16 => Ok(CompressionAlgorithm::Zstd),
- x if x == CompressionAlgorithm::Deflate as u16 => Ok(CompressionAlgorithm::Deflate),
- x if x == CompressionAlgorithm::LZMA2 as u16 => Ok(CompressionAlgorithm::LZMA2),
- _ => Err(()),
- }
- }
+ type Error = ();
+
+ fn try_from(v: u16) -> Result<Self, Self::Error> {
+ match v {
+ x if x == CompressionAlgorithm::None as u16 => Ok(CompressionAlgorithm::None),
+ x if x == CompressionAlgorithm::Zstd as u16 => Ok(CompressionAlgorithm::Zstd),
+ x if x == CompressionAlgorithm::Deflate as u16 => Ok(CompressionAlgorithm::Deflate),
+ x if x == CompressionAlgorithm::LZMA2 as u16 => Ok(CompressionAlgorithm::LZMA2),
+ _ => Err(()),
+ }
+ }
}
#[derive(Clone, Copy, Debug, Serialize, Deserialize, PartialEq, Eq, Default)]
pub enum CompressionLevel {
- #[default]
- Default,
- Fast,
- Best,
- #[serde(untagged)]
- Exact(i32),
+ #[default]
+ Default,
+ Fast,
+ Best,
+ #[serde(untagged)]
+ Exact(i32),
}
impl CompressionLevel {
- pub fn get_compression_level(
- &self,
- algorithm: CompressionAlgorithm,
- ) -> anyhow::Result<i32> {
- // matrix of compression levels for each algorithm. The first dimension is the algorithm, the second dimension is the named level (0-2)
- const LEVELS: [[i32; 3]; 4] = [
- [0, 0, 0], // None
- [3, 6, 15], // Zstd
- [6, 1, 9], // Deflate
- [5, 1, 9], // LZMA2
- ];
-
- let algorithm_index = match algorithm {
- CompressionAlgorithm::None => 0,
- CompressionAlgorithm::Zstd => 1,
- CompressionAlgorithm::Deflate => 2,
- CompressionAlgorithm::LZMA2 => 3,
- };
-
- let level = match self {
- CompressionLevel::Default => LEVELS[algorithm_index][0],
- CompressionLevel::Fast => LEVELS[algorithm_index][1],
- CompressionLevel::Best => LEVELS[algorithm_index][2],
- CompressionLevel::Exact(i) => *i,
- };
-
- if !Self::is_valid_for_algorithm(level, algorithm) {
- return Err(anyhow!(
- "Invalid compression level {level} for algorithm {algorithm}"
- ));
- }
-
- Ok(level)
- }
-
- fn is_valid_for_algorithm(level: i32, algorithm: CompressionAlgorithm) -> bool {
- match algorithm {
- CompressionAlgorithm::None => true,
- CompressionAlgorithm::Zstd => (-22..=22).contains(&level),
- CompressionAlgorithm::Deflate => (0..=9).contains(&level),
- CompressionAlgorithm::LZMA2 => (0..=9).contains(&level),
- }
- }
+ pub fn get_compression_level(
+ &self,
+ algorithm: CompressionAlgorithm,
+ ) -> anyhow::Result<i32> {
+ // matrix of compression levels for each algorithm. The first dimension is the algorithm, the second dimension is the named level (0-2)
+ const LEVELS: [[i32; 3]; 4] = [
+ [0, 0, 0], // None
+ [3, 6, 15], // Zstd
+ [6, 1, 9], // Deflate
+ [5, 1, 9], // LZMA2
+ ];
+
+ let algorithm_index = match algorithm {
+ CompressionAlgorithm::None => 0,
+ CompressionAlgorithm::Zstd => 1,
+ CompressionAlgorithm::Deflate => 2,
+ CompressionAlgorithm::LZMA2 => 3,
+ };
+
+ let level = match self {
+ CompressionLevel::Default => LEVELS[algorithm_index][0],
+ CompressionLevel::Fast => LEVELS[algorithm_index][1],
+ CompressionLevel::Best => LEVELS[algorithm_index][2],
+ CompressionLevel::Exact(i) => *i,
+ };
+
+ if !Self::is_valid_for_algorithm(level, algorithm) {
+ return Err(anyhow!(
+ "Invalid compression level {level} for algorithm {algorithm}"
+ ));
+ }
+
+ Ok(level)
+ }
+
+ fn is_valid_for_algorithm(level: i32, algorithm: CompressionAlgorithm) -> bool {
+ match algorithm {
+ CompressionAlgorithm::None => true,
+ CompressionAlgorithm::Zstd => (-22..=22).contains(&level),
+ CompressionAlgorithm::Deflate => (0..=9).contains(&level),
+ CompressionAlgorithm::LZMA2 => (0..=9).contains(&level),
+ }
+ }
}
impl Display for CompressionLevel {
- fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
- match self {
- CompressionLevel::Default => write!(f, "default"),
- CompressionLevel::Fast => write!(f, "fast"),
- CompressionLevel::Best => write!(f, "best"),
- CompressionLevel::Exact(i) => write!(f, "exact({i})"),
- }
- }
+ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+ match self {
+ CompressionLevel::Default => write!(f, "default"),
+ CompressionLevel::Fast => write!(f, "fast"),
+ CompressionLevel::Best => write!(f, "best"),
+ CompressionLevel::Exact(i) => write!(f, "exact({i})"),
+ }
+ }
}
impl FromStr for CompressionLevel {
- type Err = anyhow::Error;
-
- fn from_str(s: &str) -> Result<Self, Self::Err> {
- match s {
- "default" => Ok(CompressionLevel::Default),
- "fast" => Ok(CompressionLevel::Fast),
- "best" => Ok(CompressionLevel::Best),
- s if let Ok(val) = s.parse::<i64>() => Ok(CompressionLevel::Exact(val as i32)),
- _ => Err(anyhow!("Invalid compression level: {s}")),
- }
- }
+ type Err = anyhow::Error;
+
+ fn from_str(s: &str) -> Result<Self, Self::Err> {
+ match s {
+ "default" => Ok(CompressionLevel::Default),
+ "fast" => Ok(CompressionLevel::Fast),
+ "best" => Ok(CompressionLevel::Best),
+ s if let Ok(val) = s.parse::<i64>() => Ok(CompressionLevel::Exact(val as i32)),
+ _ => Err(anyhow!("Invalid compression level: {s}")),
+ }
+ }
}
pub struct Archive<T>
where
- T: ArchiveEntryData,
+ T: ArchiveEntryData,
{
- pub header: [u8; 4],
- pub compression: CompressionAlgorithm,
- pub hash: Hash,
- pub index: Index,
- pub body: ArchiveBody<T>,
+ pub header: [u8; 4],
+ pub compression: CompressionAlgorithm,
+ pub hash: Hash,
+ pub index: Index,
+ pub body: ArchiveBody<T>,
}
impl<T> Archive<T>
where
- T: ArchiveEntryData,
+ T: ArchiveEntryData,
{
- pub fn to_data(
- self,
- compression_level: CompressionLevel,
- writer: &mut impl Write,
- ) -> anyhow::Result<()> {
- writer.write_all(&HEADER)?;
- writer.write_all(&(self.compression as u16).to_be_bytes())?;
- writer.write_all(&self.hash.hash)?;
- writer.write_all(&self.index.to_data())?;
- writer.write_all(&[0])?;
-
- let numerical_level = compression_level.get_compression_level(self.compression)?;
-
- match self.compression {
- CompressionAlgorithm::None => self.body.to_data(writer)?,
- CompressionAlgorithm::Deflate => {
- let mut gz_encoder = flate2::write::DeflateEncoder::new(
- writer,
- flate2::Compression::new(numerical_level as u32),
- );
- self.body.to_data(&mut gz_encoder)?;
- gz_encoder.finish()?.flush()?;
- }
- CompressionAlgorithm::LZMA2 => self.body.to_data(
- &mut lzma_rust2::Lzma2WriterMt::new(
- writer,
- lzma_rust2::Lzma2Options {
- lzma_options: LzmaOptions::with_preset(numerical_level as u32),
- chunk_size: NonZero::new(1024 * 64),
- },
- std::thread::available_parallelism().unwrap().get() as u32,
- )?
- .auto_finish(),
- )?,
- CompressionAlgorithm::Zstd => {
- let mut encoder = zstd::stream::write::Encoder::new(writer, numerical_level)?;
- encoder.multithread(
- std::thread::available_parallelism()
- .map(|n| n.get() as u32)
- .unwrap_or(1),
- )?;
- self.body.to_data(&mut encoder)?;
- encoder.finish()?.flush()?;
- }
- }
-
- Ok(())
- }
-
- pub fn from_data(reader: &mut impl Read) -> anyhow::Result<Archive<RawEntryData>> {
- let mut reader = BufReader::new(reader);
-
- let mut header: [u8; 4] = [0; 4];
- reader.read_exact(&mut header)?;
- assert!(header == HEADER);
-
- let mut compression: [u8; 2] = [0; 2];
- reader.read_exact(&mut compression)?;
-
- let compression: CompressionAlgorithm = u16::from_be_bytes(compression)
- .try_into()
- .map_err(|_| anyhow!("Invalid Compression"))?;
-
- let mut hash: [u8; 64] = [0; 64];
- reader.read_exact(&mut hash)?;
- let hash: Hash = hash.into();
-
- let mut index_bytes = Vec::new();
- let index_bytes_read = reader.read_until(0, &mut index_bytes)?;
-
- let index = Index::from_data(&index_bytes[..index_bytes_read - 1]);
-
- let body = match compression {
- CompressionAlgorithm::None => ArchiveBody::<RawEntryData>::from_data(&mut reader)?,
- CompressionAlgorithm::Deflate => ArchiveBody::<RawEntryData>::from_data(
- &mut flate2::read::DeflateDecoder::new(&mut reader),
- )?,
- CompressionAlgorithm::LZMA2 => ArchiveBody::<RawEntryData>::from_data({
- &mut lzma_rust2::Lzma2ReaderMt::new(
- &mut reader,
- lzma_rust2::LzmaOptions::DICT_SIZE_DEFAULT,
- None,
- std::thread::available_parallelism().unwrap().get() as u32,
- )
- })?,
- CompressionAlgorithm::Zstd => ArchiveBody::<RawEntryData>::from_data(
- &mut zstd::stream::read::Decoder::new(&mut reader)?,
- )?,
- };
-
- Ok(Archive {
- header: HEADER,
- compression,
- hash,
- index,
- body,
- })
- }
+ pub fn to_data(
+ self,
+ compression_level: CompressionLevel,
+ writer: &mut impl Write,
+ ) -> anyhow::Result<()> {
+ writer.write_all(&HEADER)?;
+ writer.write_all(&(self.compression as u16).to_be_bytes())?;
+ writer.write_all(&self.hash.hash)?;
+ writer.write_all(&self.index.to_data())?;
+ writer.write_all(&[0])?;
+
+ let numerical_level = compression_level.get_compression_level(self.compression)?;
+
+ match self.compression {
+ CompressionAlgorithm::None => self.body.to_data(writer)?,
+ CompressionAlgorithm::Deflate => {
+ let mut gz_encoder = flate2::write::DeflateEncoder::new(
+ writer,
+ flate2::Compression::new(numerical_level as u32),
+ );
+ self.body.to_data(&mut gz_encoder)?;
+ gz_encoder.finish()?.flush()?;
+ }
+ CompressionAlgorithm::LZMA2 => self.body.to_data(
+ &mut lzma_rust2::Lzma2WriterMt::new(
+ writer,
+ lzma_rust2::Lzma2Options {
+ lzma_options: LzmaOptions::with_preset(numerical_level as u32),
+ chunk_size: NonZero::new(1024 * 64),
+ },
+ std::thread::available_parallelism().unwrap().get() as u32,
+ )?
+ .auto_finish(),
+ )?,
+ CompressionAlgorithm::Zstd => {
+ let mut encoder = zstd::stream::write::Encoder::new(writer, numerical_level)?;
+ encoder.multithread(
+ std::thread::available_parallelism()
+ .map(|n| n.get() as u32)
+ .unwrap_or(1),
+ )?;
+ self.body.to_data(&mut encoder)?;
+ encoder.finish()?.flush()?;
+ }
+ }
+
+ Ok(())
+ }
+
+ pub fn from_data(reader: &mut impl Read) -> anyhow::Result<Archive<RawEntryData>> {
+ let mut reader = BufReader::new(reader);
+
+ let mut header: [u8; 4] = [0; 4];
+ reader.read_exact(&mut header)?;
+ assert!(header == HEADER);
+
+ let mut compression: [u8; 2] = [0; 2];
+ reader.read_exact(&mut compression)?;
+
+ let compression: CompressionAlgorithm = u16::from_be_bytes(compression)
+ .try_into()
+ .map_err(|_| anyhow!("Invalid Compression"))?;
+
+ let mut hash: [u8; 64] = [0; 64];
+ reader.read_exact(&mut hash)?;
+ let hash: Hash = hash.into();
+
+ let mut index_bytes = Vec::new();
+ let index_bytes_read = reader.read_until(0, &mut index_bytes)?;
+
+ let index = Index::from_data(&index_bytes[..index_bytes_read - 1]);
+
+ let body = match compression {
+ CompressionAlgorithm::None => ArchiveBody::<RawEntryData>::from_data(&mut reader)?,
+ CompressionAlgorithm::Deflate => ArchiveBody::<RawEntryData>::from_data(
+ &mut flate2::read::DeflateDecoder::new(&mut reader),
+ )?,
+ CompressionAlgorithm::LZMA2 => ArchiveBody::<RawEntryData>::from_data({
+ &mut lzma_rust2::Lzma2ReaderMt::new(
+ &mut reader,
+ lzma_rust2::LzmaOptions::DICT_SIZE_DEFAULT,
+ None,
+ std::thread::available_parallelism().unwrap().get() as u32,
+ )
+ })?,
+ CompressionAlgorithm::Zstd => ArchiveBody::<RawEntryData>::from_data(
+ &mut zstd::stream::read::Decoder::new(&mut reader)?,
+ )?,
+ };
+
+ Ok(Archive {
+ header: HEADER,
+ compression,
+ hash,
+ index,
+ body,
+ })
+ }
}
// /// Create a new `Body` from a [`Stream`].
@@ -295,63 +295,63 @@ where
// }
pub struct ArchiveHeaderEntry {
- pub hash: Hash,
- pub index: u64,
- pub length: u64,
+ pub hash: Hash,
+ pub index: u64,
+ pub length: u64,
}
pub trait ArchiveEntryData {
- fn turn_into_vec(self) -> Vec<u8>;
+ fn turn_into_vec(self) -> Vec<u8>;
}
pub struct RawEntryData(Vec<u8>);
impl RawEntryData {
- pub fn new(data: Vec<u8>) -> Self {
- RawEntryData(data)
- }
+ pub fn new(data: Vec<u8>) -> Self {
+ RawEntryData(data)
+ }
}
impl ArchiveEntryData for RawEntryData {
- fn turn_into_vec(self) -> Vec<u8> {
- self.0
- }
+ fn turn_into_vec(self) -> Vec<u8> {
+ self.0
+ }
}
pub struct ReaderEntryData<T>(T)
where
- T: Read;
+ T: Read;
impl<T> ReaderEntryData<T>
where
- T: Read,
+ T: Read,
{
- pub fn new(reader: T) -> Self {
- ReaderEntryData(reader)
- }
+ pub fn new(reader: T) -> Self {
+ ReaderEntryData(reader)
+ }
}
impl<T> ArchiveEntryData for ReaderEntryData<T>
where
- T: Read,
+ T: Read,
{
- fn turn_into_vec(mut self) -> Vec<u8> {
- let mut data: Vec<u8> = Vec::new();
- self.0.read_to_end(&mut data).expect("Reading to work");
+ fn turn_into_vec(mut self) -> Vec<u8> {
+ let mut data: Vec<u8> = Vec::new();
+ self.0.read_to_end(&mut data).expect("Reading to work");
- data
- }
+ data
+ }
}
pub struct FileEntryData(pub PathBuf);
impl ArchiveEntryData for FileEntryData {
- fn turn_into_vec(self) -> Vec<u8> {
- let file = File::open(self.0).expect("File to be available for read");
- let mut reader = BufReader::new(file);
- let mut data = Vec::new();
- pipe(&mut reader, &mut data).expect("reading to work");
- data
- }
+ fn turn_into_vec(self) -> Vec<u8> {
+ let file = File::open(self.0).expect("File to be available for read");
+ let mut reader = BufReader::new(file);
+ let mut data = Vec::new();
+ pipe(&mut reader, &mut data).expect("reading to work");
+ data
+ }
}
/// An archive entry for a blob whose raw bytes live at `source_path` on disk —
@@ -360,243 +360,243 @@ impl ArchiveEntryData for FileEntryData {
/// prepends the object header so the resulting bytes are identical to what a
/// pack operation would have pulled from the store.
pub struct SourceFileEntryData {
- pub source_path: PathBuf,
- pub header: crate::Header,
+ pub source_path: PathBuf,
+ pub header: crate::Header,
}
impl ArchiveEntryData for SourceFileEntryData {
- fn turn_into_vec(self) -> Vec<u8> {
- let file = File::open(&self.source_path).expect("source file to be readable");
- let mut reader = BufReader::new(file);
- let prefix = self.header.to_string();
- let mut data = Vec::with_capacity(prefix.len() + self.header.size as usize);
- data.extend_from_slice(prefix.as_bytes());
- pipe(&mut reader, &mut data).expect("reading to work");
- data
- }
+ fn turn_into_vec(self) -> Vec<u8> {
+ let file = File::open(&self.source_path).expect("source file to be readable");
+ let mut reader = BufReader::new(file);
+ let prefix = self.header.to_string();
+ let mut data = Vec::with_capacity(prefix.len() + self.header.size as usize);
+ data.extend_from_slice(prefix.as_bytes());
+ pipe(&mut reader, &mut data).expect("reading to work");
+ data
+ }
}
pub struct StoreEntryData {
- pub store: Store,
- pub hash: Hash,
+ pub store: Store,
+ pub hash: Hash,
}
impl ArchiveEntryData for StoreEntryData {
- fn turn_into_vec(self) -> Vec<u8> {
- let mut object = futures::executor::block_on(self.store.get_object(&self.hash))
- .expect("Object to be available in store");
+ fn turn_into_vec(self) -> Vec<u8> {
+ let mut object = futures::executor::block_on(self.store.get_object(&self.hash))
+ .expect("Object to be available in store");
- let mut data: Vec<u8> = Vec::new();
- futures::executor::block_on(object.read_to_end(&mut data)).expect("Reading to work");
+ let mut data: Vec<u8> = Vec::new();
+ futures::executor::block_on(object.read_to_end(&mut data)).expect("Reading to work");
- data
- }
+ data
+ }
}
pub struct ArchiveBody<T>
where
- T: ArchiveEntryData,
+ T: ArchiveEntryData,
{
- pub header: Vec<ArchiveHeaderEntry>,
- pub entries: Vec<T>,
+ pub header: Vec<ArchiveHeaderEntry>,
+ pub entries: Vec<T>,
}
impl<T> ArchiveBody<T>
where
- T: ArchiveEntryData,
+ T: ArchiveEntryData,
{
- #[allow(clippy::wrong_self_convention)]
- fn to_data(self, writer: &mut impl Write) -> anyhow::Result<()> {
- writer.write_all(&(self.header.len() as u64).to_be_bytes())?;
- for entry in &self.header {
- writer.write_all(&entry.hash.hash)?;
- writer.write_all(&entry.index.to_be_bytes())?;
- writer.write_all(&entry.length.to_be_bytes())?;
- }
-
- for entry in self.entries {
- writer.write_all(&entry.turn_into_vec())?;
- }
-
- writer.flush()?;
-
- Ok(())
- }
-
- fn from_data(reader: &mut impl Read) -> anyhow::Result<ArchiveBody<RawEntryData>> {
- let mut long: [u8; 8] = [0; 8];
- reader.read_exact(&mut long)?;
- let count = u64::from_be_bytes(long);
-
- println!("Loading {count} entries");
-
- if count == 0 {
- return Ok(ArchiveBody {
- header: Vec::new(),
- entries: Vec::new(),
- });
- }
-
- let mut header_entries: Vec<ArchiveHeaderEntry> = Vec::with_capacity(count as usize);
- let mut counter = 0;
- loop {
- if counter >= count {
- break;
- }
-
- let mut hash: [u8; 64] = [0; 64];
- reader.read_exact(&mut hash)?;
- let hash: Hash = hash.into();
-
- reader.read_exact(&mut long)?;
- let index = u64::from_be_bytes(long);
-
- reader.read_exact(&mut long)?;
- let length = u64::from_be_bytes(long);
-
- println!("Read object {hash}");
- header_entries.push(ArchiveHeaderEntry {
- hash,
- index,
- length,
- });
- counter += 1;
- }
-
- let mut counter: u64 = 0;
-
- header_entries.sort_by_key(|a| a.index);
- assert!(header_entries[0].index == 0);
-
- let mut entries: Vec<RawEntryData> = Vec::with_capacity(header_entries.len());
- for entry in &header_entries {
- assert!(entry.index == counter);
-
- let amount = entry.length;
- let mut data: Vec<u8> = vec![0; amount as usize];
- reader.read_exact(&mut data[..])?;
-
- let mut hasher = Sha512::new();
- hasher.write_all(&data)?;
- assert!(Hash::from(hasher) == entry.hash);
-
- entries.push(RawEntryData(data.to_vec()));
-
- counter += amount;
- }
-
- Ok(ArchiveBody {
- header: header_entries,
- entries,
- })
- }
+ #[allow(clippy::wrong_self_convention)]
+ fn to_data(self, writer: &mut impl Write) -> anyhow::Result<()> {
+ writer.write_all(&(self.header.len() as u64).to_be_bytes())?;
+ for entry in &self.header {
+ writer.write_all(&entry.hash.hash)?;
+ writer.write_all(&entry.index.to_be_bytes())?;
+ writer.write_all(&entry.length.to_be_bytes())?;
+ }
+
+ for entry in self.entries {
+ writer.write_all(&entry.turn_into_vec())?;
+ }
+
+ writer.flush()?;
+
+ Ok(())
+ }
+
+ fn from_data(reader: &mut impl Read) -> anyhow::Result<Self> {
+ let mut long: [u8; 8] = [0; 8];
+ reader.read_exact(&mut long)?;
+ let count = u64::from_be_bytes(long);
+
+ println!("Loading {count} entries");
+
+ if count == 0 {
+ return Ok(ArchiveBody {
+ header: Vec::new(),
+ entries: Vec::new(),
+ });
+ }
+
+ let mut header_entries: Vec<ArchiveHeaderEntry> = Vec::with_capacity(count as usize);
+ let mut counter = 0;
+ loop {
+ if counter >= count {
+ break;
+ }
+
+ let mut hash: [u8; 64] = [0; 64];
+ reader.read_exact(&mut hash)?;
+ let hash: Hash = hash.into();
+
+ reader.read_exact(&mut long)?;
+ let index = u64::from_be_bytes(long);
+
+ reader.read_exact(&mut long)?;
+ let length = u64::from_be_bytes(long);
+
+ println!("Read object {hash}");
+ header_entries.push(ArchiveHeaderEntry {
+ hash,
+ index,
+ length,
+ });
+ counter += 1;
+ }
+
+ let mut counter: u64 = 0;
+
+ header_entries.sort_by_key(|a| a.index);
+ assert!(header_entries[0].index == 0);
+
+ let mut entries: Vec<RawEntryData> = Vec::with_capacity(header_entries.len());
+ for entry in &header_entries {
+ assert!(entry.index == counter);
+
+ let amount = entry.length;
+ let mut data: Vec<u8> = vec![0; amount as usize];
+ reader.read_exact(&mut data[..])?;
+
+ let mut hasher = Sha512::new();
+ hasher.write_all(&data)?;
+ assert!(Hash::from(hasher) == entry.hash);
+
+ entries.push(RawEntryData(data.to_vec()));
+
+ counter += amount;
+ }
+
+ Ok(ArchiveBody {
+ header: header_entries,
+ entries,
+ })
+ }
}
#[cfg(test)]
mod tests {
- use super::*;
- use chrono::{TimeZone, Utc};
- use std::collections::HashMap;
-
- fn empty_archive(compression: CompressionAlgorithm) -> Archive {
- let zero = Hash::from([0u8; 64]);
- Archive {
- header: HEADER,
- compression,
- hash: zero.clone(),
- index: Index {
- tree: zero,
- timestamp: Utc.timestamp_opt(0, 0).unwrap(),
- metadata: HashMap::new(),
- },
- body: ArchiveBody {
- header: Vec::new(),
- entries: Vec::new(),
- },
- }
- }
-
- #[test]
- fn all_named_compression_levels_are_valid_for_all_algorithms() {
- let algorithms = [
- CompressionAlgorithm::None,
- CompressionAlgorithm::Zstd,
- CompressionAlgorithm::Deflate,
- CompressionAlgorithm::LZMA2,
- ];
- let levels = [
- CompressionLevel::Default,
- CompressionLevel::Fast,
- CompressionLevel::Best,
- ];
-
- for algorithm in &algorithms {
- for level in &levels {
- level.get_compression_level(*algorithm).unwrap_or_else(|e| {
- panic!("{level} should be valid for {algorithm}: {e}");
- });
- }
- }
- }
-
- #[test]
- fn exact_levels_at_algorithm_bounds_are_valid() {
- // (algorithm, valid min, valid max)
- let bounds: [(CompressionAlgorithm, i32, i32); 3] = [
- (CompressionAlgorithm::Zstd, -22, 22),
- (CompressionAlgorithm::Deflate, 0, 9),
- (CompressionAlgorithm::LZMA2, 0, 9),
- ];
-
- for (algorithm, min, max) in &bounds {
- // min and max should succeed
- CompressionLevel::Exact(*min)
- .get_compression_level(*algorithm)
- .unwrap_or_else(|e| panic!("Exact({min}) should be valid for {algorithm}: {e}"));
- CompressionLevel::Exact(*max)
- .get_compression_level(*algorithm)
- .unwrap_or_else(|e| panic!("Exact({max}) should be valid for {algorithm}: {e}"));
-
- // one past each bound should fail
- assert!(
- CompressionLevel::Exact(min - 1)
- .get_compression_level(*algorithm)
- .is_err(),
- "Exact({}) should be invalid for {algorithm}",
- min - 1
- );
- assert!(
- CompressionLevel::Exact(max + 1)
- .get_compression_level(*algorithm)
- .is_err(),
- "Exact({}) should be invalid for {algorithm}",
- max + 1
- );
- }
- }
-
- #[test]
- fn none_algorithm_accepts_any_exact_level() {
- for level in [-100, -1, 0, 1, 100] {
- CompressionLevel::Exact(level)
- .get_compression_level(CompressionAlgorithm::None)
- .unwrap_or_else(|e| {
- panic!("None algorithm should accept Exact({level}): {e}");
- });
- }
- }
-
- #[test]
- fn zstd_archive_round_trip() {
- let mut bytes = Vec::new();
- empty_archive(CompressionAlgorithm::Zstd)
- .to_data(CompressionLevel::Default, &mut bytes)
- .expect("encode");
-
- let decoded = Archive::::from_data(&mut bytes.as_slice()).expect("decode");
-
- assert!(matches!(decoded.compression, CompressionAlgorithm::Zstd));
- assert!(decoded.body.header.is_empty());
- assert!(decoded.body.entries.is_empty());
- }
+ use super::*;
+ use chrono::{TimeZone, Utc};
+ use std::collections::HashMap;
+
+ fn empty_archive(compression: CompressionAlgorithm) -> Archive {
+ let zero = Hash::from([0u8; 64]);
+ Archive {
+ header: HEADER,
+ compression,
+ hash: zero.clone(),
+ index: Index {
+ tree: zero,
+ timestamp: Utc.timestamp_opt(0, 0).unwrap(),
+ metadata: HashMap::new(),
+ },
+ body: ArchiveBody {
+ header: Vec::new(),
+ entries: Vec::new(),
+ },
+ }
+ }
+
+ #[test]
+ fn all_named_compression_levels_are_valid_for_all_algorithms() {
+ let algorithms = [
+ CompressionAlgorithm::None,
+ CompressionAlgorithm::Zstd,
+ CompressionAlgorithm::Deflate,
+ CompressionAlgorithm::LZMA2,
+ ];
+ let levels = [
+ CompressionLevel::Default,
+ CompressionLevel::Fast,
+ CompressionLevel::Best,
+ ];
+
+ for algorithm in &algorithms {
+ for level in &levels {
+ level.get_compression_level(*algorithm).unwrap_or_else(|e| {
+ panic!("{level} should be valid for {algorithm}: {e}");
+ });
+ }
+ }
+ }
+
+ #[test]
+ fn exact_levels_at_algorithm_bounds_are_valid() {
+ // (algorithm, valid min, valid max)
+ let bounds: [(CompressionAlgorithm, i32, i32); 3] = [
+ (CompressionAlgorithm::Zstd, -22, 22),
+ (CompressionAlgorithm::Deflate, 0, 9),
+ (CompressionAlgorithm::LZMA2, 0, 9),
+ ];
+
+ for (algorithm, min, max) in &bounds {
+ // min and max should succeed
+ CompressionLevel::Exact(*min)
+ .get_compression_level(*algorithm)
+ .unwrap_or_else(|e| panic!("Exact({min}) should be valid for {algorithm}: {e}"));
+ CompressionLevel::Exact(*max)
+ .get_compression_level(*algorithm)
+ .unwrap_or_else(|e| panic!("Exact({max}) should be valid for {algorithm}: {e}"));
+
+ // one past each bound should fail
+ assert!(
+ CompressionLevel::Exact(min - 1)
+ .get_compression_level(*algorithm)
+ .is_err(),
+ "Exact({}) should be invalid for {algorithm}",
+ min - 1
+ );
+ assert!(
+ CompressionLevel::Exact(max + 1)
+ .get_compression_level(*algorithm)
+ .is_err(),
+ "Exact({}) should be invalid for {algorithm}",
+ max + 1
+ );
+ }
+ }
+
+ #[test]
+ fn none_algorithm_accepts_any_exact_level() {
+ for level in [-100, -1, 0, 1, 100] {
+ CompressionLevel::Exact(level)
+ .get_compression_level(CompressionAlgorithm::None)
+ .unwrap_or_else(|e| {
+ panic!("None algorithm should accept Exact({level}): {e}");
+ });
+ }
+ }
+
+ #[test]
+ fn zstd_archive_round_trip() {
+ let mut bytes = Vec::new();
+ empty_archive(CompressionAlgorithm::Zstd)
+ .to_data(CompressionLevel::Default, &mut bytes)
+ .expect("encode");
+
+ let decoded = Archive::::from_data(&mut bytes.as_slice()).expect("decode");
+
+ assert!(matches!(decoded.compression, CompressionAlgorithm::Zstd));
+ assert!(decoded.body.header.is_empty());
+ assert!(decoded.body.entries.is_empty());
+ }
}
diff --git a/common/src/hash.rs b/common/src/hash.rs
index 9b8ab64..7a5cb6e 100644
--- a/common/src/hash.rs
+++ b/common/src/hash.rs
@@ -11,207 +11,207 @@ use std::str::FromStr;
#[derive(Clone, Serialize)]
pub struct Hash {
- // Sha512 Hash value
- #[serde(skip)]
- pub hash: [u8; 64],
- hash_string: String,
+ // Sha512 Hash value
+ #[serde(skip)]
+ pub hash: [u8; 64],
+ hash_string: String,
}
impl Hash {
- pub fn get_parts(&self) -> (&str, &str) {
- (&self.hash_string[..2], &self.hash_string[2..])
- }
-
- pub fn as_str(&self) -> &str {
- &self.hash_string
- }
-
- pub fn from_string(value: &str) -> Option<Self> {
- if value.len() != 128 {
- return None;
- }
-
- let hash = hex::decode(value).ok()?;
-
- if hash.len() != 64 {
- return None;
- }
-
- Some(Self {
- hash: hash.try_into().unwrap(),
- hash_string: value.to_owned(),
- })
- }
-
- pub fn get_path(&self, cache_dir: &Path) -> PathBuf {
- let (dir, file) = self.get_parts();
- cache_dir.join(dir).join(file)
- }
-
- pub fn from_path(file: &Path) -> Option<Self> {
- let filename = file.file_name()?;
- let directory = file.parent()?.file_name()?;
-
- if directory.len() != 2 {
- return None;
- }
-
- if filename.len() != 126 {
- return None;
- }
-
- Self::try_from(directory.to_str()?.to_owned() + filename.to_str()?).ok()
- }
+ pub fn get_parts(&self) -> (&str, &str) {
+ (&self.hash_string[..2], &self.hash_string[2..])
+ }
+
+ pub fn as_str(&self) -> &str {
+ &self.hash_string
+ }
+
+ pub fn from_string(value: &str) -> Option<Self> {
+ if value.len() != 128 {
+ return None;
+ }
+
+ let hash = hex::decode(value).ok()?;
+
+ if hash.len() != 64 {
+ return None;
+ }
+
+ Some(Self {
+ hash: hash.try_into().unwrap(),
+ hash_string: value.to_owned(),
+ })
+ }
+
+ pub fn get_path(&self, cache_dir: &Path) -> PathBuf {
+ let (dir, file) = self.get_parts();
+ cache_dir.join(dir).join(file)
+ }
+
+ pub fn from_path(file: &Path) -> Option<Self> {
+ let filename = file.file_name()?;
+ let directory = file.parent()?.file_name()?;
+
+ if directory.len() != 2 {
+ return None;
+ }
+
+ if filename.len() != 126 {
+ return None;
+ }
+
+ Self::try_from(directory.to_str()?.to_owned() + filename.to_str()?).ok()
+ }
}
impl std::hash::Hash for Hash {
- fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
- self.hash.hash(state);
- }
+ fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
+ self.hash.hash(state);
+ }
}
impl PartialEq for Hash {
- fn eq(&self, other: &Self) -> bool {
- self.hash == other.hash
- }
+ fn eq(&self, other: &Self) -> bool {
+ self.hash == other.hash
+ }
}
impl Eq for Hash {}
impl FromStr for Hash {
- type Err = anyhow::Error;
+ type Err = anyhow::Error;
- fn from_str(s: &str) -> Result<Self, Self::Err> {
- s.try_into()
- }
+ fn from_str(s: &str) -> Result<Self, Self::Err> {
+ s.try_into()
+ }
}
impl TryFrom<String> for Hash {
- type Error = anyhow::Error;
-
- fn try_from(value: String) -> Result<Self, Self::Error> {
- if value.len() != 128 {
- return Err(anyhow!(
- "Invalid length. Hash has to be 128 characters long"
- ));
- }
-
- let mut hash = [0u8; 64];
- hex::decode_to_slice(&value, &mut hash)?;
-
- Ok(Self {
- hash,
- hash_string: value,
- })
- }
+ type Error = anyhow::Error;
+
+ fn try_from(value: String) -> Result<Self, Self::Error> {
+ if value.len() != 128 {
+ return Err(anyhow!(
+ "Invalid length. Hash has to be 128 characters long"
+ ));
+ }
+
+ let mut hash = [0u8; 64];
+ hex::decode_to_slice(&value, &mut hash)?;
+
+ Ok(Self {
+ hash,
+ hash_string: value,
+ })
+ }
}
impl TryFrom<&str> for Hash {
- type Error = anyhow::Error;
-
- fn try_from(value: &str) -> Result<Self, Self::Error> {
- if value.len() != 128 {
- return Err(anyhow!(
- "Invalid length. Hash has to be 128 characters long"
- ));
- }
-
- let mut hash = [0u8; 64];
- hex::decode_to_slice(value, &mut hash)?;
-
- Ok(Self {
- hash,
- hash_string: value.to_owned(),
- })
- }
+ type Error = anyhow::Error;
+
+ fn try_from(value: &str) -> Result<Self, Self::Error> {
+ if value.len() != 128 {
+ return Err(anyhow!(
+ "Invalid length. Hash has to be 128 characters long"
+ ));
+ }
+
+ let mut hash = [0u8; 64];
+ hex::decode_to_slice(value, &mut hash)?;
+
+ Ok(Self {
+ hash,
+ hash_string: value.to_owned(),
+ })
+ }
}
impl TryFrom<&[u8]> for Hash {
- type Error = anyhow::Error;
-
- fn try_from(value: &[u8]) -> Result<Self, Self::Error> {
- if value.len() != 64 {
- return Err(anyhow!("Invalid length. Slice must be 64 bytes long"));
- }
-
- let data: [u8; 64] = value.try_into()?;
- Ok(Self {
- hash_string: hex::encode(value),
- hash: data,
- })
- }
+ type Error = anyhow::Error;
+
+ fn try_from(value: &[u8]) -> Result<Self, Self::Error> {
+ if value.len() != 64 {
+ return Err(anyhow!("Invalid length. Slice must be 64 bytes long"));
+ }
+
+ let data: [u8; 64] = value.try_into()?;
+ Ok(Self {
+ hash_string: hex::encode(value),
+ hash: data,
+ })
+ }
}
impl From<[u8; 64]> for Hash {
- fn from(value: [u8; 64]) -> Self {
- Self {
- hash_string: hex::encode(value),
- hash: value,
- }
- }
+ fn from(value: [u8; 64]) -> Self {
+ Self {
+ hash_string: hex::encode(value),
+ hash: value,
+ }
+ }
}
impl From<Sha512> for Hash {
- fn from(value: Sha512) -> Self {
- Self::from(Into::<[u8; 64]>::into(value.finalize_fixed()))
- }
+ fn from(value: Sha512) -> Self {
+ Self::from(Into::<[u8; 64]>::into(value.finalize_fixed()))
+ }
}
impl Debug for Hash {
- fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
- f.debug_tuple("Hash").field(&self.hash_string).finish()
- }
+ fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+ f.debug_tuple("Hash").field(&self.hash_string).finish()
+ }
}
impl Display for Hash {
- fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
- write!(f, "{}", self.hash_string)
- }
+ fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+ write!(f, "{}", self.hash_string)
+ }
}
impl<'de> Deserialize<'de> for Hash {
- fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
- where
- D: Deserializer<'de>,
- {
- struct HashVisitor;
-
- impl<'de> Visitor<'de> for HashVisitor {
- type Value = Hash;
-
- fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
- formatter.write_str("Sha512 hex string Hash")
- }
-
- fn visit_str<E>(self, value: &str) -> Result<Self::Value, E>
- where
- E: serde::de::Error,
- {
- if value.len() != 128 {
- return Err(de::Error::invalid_length(
- value.len(),
- &"A hex string with 128 characters",
- ));
- }
-
- value.try_into().map_err(de::Error::custom)
- }
-
- fn visit_string<E>(self, value: String) -> Result<Self::Value, E>
- where
- E: serde::de::Error,
- {
- if value.len() != 128 {
- return Err(de::Error::invalid_length(
- value.len(),
- &"A hex string with 128 characters",
- ));
- }
-
- value.try_into().map_err(de::Error::custom)
- }
- }
-
- deserializer.deserialize_string(HashVisitor)
- }
+ fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+ where
+ D: Deserializer<'de>,
+ {
+ struct HashVisitor;
+
+ impl<'de> Visitor<'de> for HashVisitor {
+ type Value = Hash;
+
+ fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
+ formatter.write_str("Sha512 hex string Hash")
+ }
+
+ fn visit_str<E>(self, value: &str) -> Result<Self::Value, E>
+ where
+ E: serde::de::Error,
+ {
+ if value.len() != 128 {
+ return Err(de::Error::invalid_length(
+ value.len(),
+ &"A hex string with 128 characters",
+ ));
+ }
+
+ value.try_into().map_err(de::Error::custom)
+ }
+
+ fn visit_string<E>(self, value: String) -> Result<Self::Value, E>
+ where
+ E: serde::de::Error,
+ {
+ if value.len() != 128 {
+ return Err(de::Error::invalid_length(
+ value.len(),
+ &"A hex string with 128 characters",
+ ));
+ }
+
+ value.try_into().map_err(de::Error::custom)
+ }
+ }
+
+ deserializer.deserialize_string(HashVisitor)
+ }
}
diff --git a/common/src/header.rs b/common/src/header.rs
index 954cc10..2979d5b 100644
--- a/common/src/header.rs
+++ b/common/src/header.rs
@@ -1,6 +1,6 @@
use std::{
- io::{Read, Write},
- str::from_utf8,
+ io::{Read, Write},
+ str::from_utf8,
};
use anyhow::{anyhow, Result};
@@ -10,102 +10,102 @@ use crate::ObjectType;
#[derive(Debug, Copy, Clone, Eq, PartialEq, PartialOrd, Ord, Hash)]
pub struct Header {
- pub object_type: ObjectType,
- pub size: u64,
+ pub object_type: ObjectType,
+ pub size: u64,
}
impl Header {
- pub fn new(object_type: ObjectType, size: u64) -> Self {
- Header { object_type, size }
- }
-
- #[allow(clippy::inherent_to_string)]
- pub fn to_string(&self) -> String {
- format!("{} {}\0", self.object_type.to_str(), self.size)
- }
-
- pub async fn write_to_async(
- &self,
- writer: &mut (impl AsyncWrite + std::marker::Unpin),
- ) -> Result<(), std::io::Error> {
- writer.write_all(self.to_string().as_bytes()).await
- }
-
- pub fn write_to(&self, writer: &mut impl Write) -> Result<(), std::io::Error> {
- writer.write_all(self.to_string().as_bytes())
- }
-
- pub fn from_data(data: &[u8]) -> Result<Self> {
- if data.is_empty() {
- return Err(anyhow!("Invalid Header: No Data"));
- }
-
- let data = if data[data.len() - 1] == 0 {
- &data[..data.len() - 1]
- } else {
- data
- };
-
- Self::from_str(from_utf8(data)?)
- }
-
- #[allow(clippy::should_implement_trait)]
- pub fn from_str(string: &str) -> Result<Self> {
- let (object_type, size) = string
- .split_once(' ')
- .ok_or(anyhow!("Invalid Header: missing space ' ' character"))?;
-
- Ok(Header::new(
- ObjectType::from_str(object_type).ok_or(anyhow!(
- "Invalid Header: Invalid ObjectType \"{object_type}\""
- ))?,
- size.parse()?,
- ))
- }
-
- pub fn from_buf(buffer: &[u8]) -> Result<Self> {
- if buffer.is_empty() {
- return Err(anyhow!("Invalid Header: No Data"));
- }
- // Find the null marker of the header. If its not available then we just gotta assume the whole buffer is a valid utf8 header
- let null_position = buffer.iter().position(|x| *x == 0).unwrap_or(buffer.len());
- let buffer = &buffer[..null_position];
-
- Self::from_data(buffer)
- }
-
- pub async fn read_from_async(
- reader: &mut (impl AsyncRead + AsyncSeek + std::marker::Unpin),
- ) -> Result<Self> {
- let mut buffer = [0u8; 32];
- let bytes_read = reader.read(&mut buffer).await?;
-
- if bytes_read == 0 {
- return Err(anyhow!("Invalid Header: No Data"));
- }
-
- let buffer = &buffer[..bytes_read];
-
- // Find the null marker of the header. If its not available then we just gotta assume the whole buffer is a valid utf8 header
- let null_position = buffer.iter().position(|x| *x == 0).unwrap_or(buffer.len());
- let buffer = &buffer[..null_position];
-
- reader
- .seek(std::io::SeekFrom::Start(null_position as u64))
- .await?;
-
- Self::from_data(buffer)
- }
-
- pub fn read_from(reader: &mut impl Read) -> Result<Self> {
- let mut buffer = [0u8; 32];
- let bytes_read = reader.read(&mut buffer)?;
-
- if bytes_read == 0 {
- return Err(anyhow!("Invalid Header: No Data"));
- }
-
- let buffer = &buffer[..bytes_read];
- Self::from_buf(buffer)
- }
+ pub fn new(object_type: ObjectType, size: u64) -> Self {
+ Header { object_type, size }
+ }
+
+ #[allow(clippy::inherent_to_string)]
+ pub fn to_string(&self) -> String {
+ format!("{} {}\0", self.object_type.to_str(), self.size)
+ }
+
+ pub async fn write_to_async(
+ &self,
+ writer: &mut (impl AsyncWrite + std::marker::Unpin),
+ ) -> Result<(), std::io::Error> {
+ writer.write_all(self.to_string().as_bytes()).await
+ }
+
+ pub fn write_to(&self, writer: &mut impl Write) -> Result<(), std::io::Error> {
+ writer.write_all(self.to_string().as_bytes())
+ }
+
+ pub fn from_data(data: &[u8]) -> Result<Self> {
+ if data.is_empty() {
+ return Err(anyhow!("Invalid Header: No Data"));
+ }
+
+ let data = if data[data.len() - 1] == 0 {
+ &data[..data.len() - 1]
+ } else {
+ data
+ };
+
+ Self::from_str(from_utf8(data)?)
+ }
+
+ #[allow(clippy::should_implement_trait)]
+ pub fn from_str(string: &str) -> Result<Self> {
+ let (object_type, size) = string
+ .split_once(' ')
+ .ok_or(anyhow!("Invalid Header: missing space ' ' character"))?;
+
+ Ok(Header::new(
+ ObjectType::from_str(object_type).ok_or(anyhow!(
+ "Invalid Header: Invalid ObjectType \"{object_type}\""
+ ))?,
+ size.parse()?,
+ ))
+ }
+
+ pub fn from_buf(buffer: &[u8]) -> Result<Self> {
+ if buffer.is_empty() {
+ return Err(anyhow!("Invalid Header: No Data"));
+ }
+ // Find the null marker of the header. If its not available then we just gotta assume the whole buffer is a valid utf8 header
+ let null_position = buffer.iter().position(|x| *x == 0).unwrap_or(buffer.len());
+ let buffer = &buffer[..null_position];
+
+ Self::from_data(buffer)
+ }
+
+ pub async fn read_from_async(
+ reader: &mut (impl AsyncRead + AsyncSeek + std::marker::Unpin),
+ ) -> Result<Self> {
+ let mut buffer = [0u8; 32];
+ let bytes_read = reader.read(&mut buffer).await?;
+
+ if bytes_read == 0 {
+ return Err(anyhow!("Invalid Header: No Data"));
+ }
+
+ let buffer = &buffer[..bytes_read];
+
+ // Find the null marker of the header. If its not available then we just gotta assume the whole buffer is a valid utf8 header
+ let null_position = buffer.iter().position(|x| *x == 0).unwrap_or(buffer.len());
+ let buffer = &buffer[..null_position];
+
+ reader
+ .seek(std::io::SeekFrom::Start(null_position as u64))
+ .await?;
+
+ Self::from_data(buffer)
+ }
+
+ pub fn read_from(reader: &mut impl Read) -> Result<Self> {
+ let mut buffer = [0u8; 32];
+ let bytes_read = reader.read(&mut buffer)?;
+
+ if bytes_read == 0 {
+ return Err(anyhow!("Invalid Header: No Data"));
+ }
+
+ let buffer = &buffer[..bytes_read];
+ Self::from_buf(buffer)
+ }
}
diff --git a/common/src/lib.rs b/common/src/lib.rs
index 4edce43..5723f43 100644
--- a/common/src/lib.rs
+++ b/common/src/lib.rs
@@ -1,9 +1,9 @@
use std::{
- collections::HashMap,
- fs::File,
- io::{BufRead, BufReader, Read, Write},
- path::Path,
- str::from_utf8,
+ collections::HashMap,
+ fs::File,
+ io::{BufRead, BufReader, Read, Write},
+ path::Path,
+ str::from_utf8,
};
use futures::AsyncReadExt;
@@ -25,139 +25,139 @@ pub mod primitives;
pub mod store;
pub fn read_slice_until_byte(data: &[u8], byte: u8) -> Option<&[u8]> {
- let position = data.iter().position(|v| *v == byte)?;
+ let position = data.iter().position(|v| *v == byte)?;
- Some(&data[..position])
+ Some(&data[..position])
}
pub fn read_header_and_body(data: &[u8]) -> Option<(Header, &[u8])> {
- let header = read_slice_until_byte(data, 0)?;
+ let header = read_slice_until_byte(data, 0)?;
- let body_index = header.len() + 1; // one extra for the 0 byte
+ let body_index = header.len() + 1; // one extra for the 0 byte
- let header = read_header_from_slice(header)?;
+ let header = read_header_from_slice(header)?;
- Some((header, &data[body_index..]))
+ Some((header, &data[body_index..]))
}
pub fn read_header_from_slice(slice: &[u8]) -> Option<Header> {
- assert!(slice[slice.len() - 1] != 0);
- let string = from_utf8(slice).ok()?;
+ assert!(slice[slice.len() - 1] != 0);
+ let string = from_utf8(slice).ok()?;
- let (object_type, size) = string.split_once(' ')?;
+ let (object_type, size) = string.split_once(' ')?;
- Some(Header::new(
- ObjectType::from_str(object_type)?,
- size.parse().ok()?,
- ))
+ Some(Header::new(
+ ObjectType::from_str(object_type)?,
+ size.parse().ok()?,
+ ))
}
pub fn read_header_from_file(reader: &mut BufReader<File>) -> Option<Header> {
- let mut vec = Vec::new();
- reader.read_until(b'\0', &mut vec).ok()?;
+ let mut vec = Vec::new();
+ reader.read_until(b'\0', &mut vec).ok()?;
- read_header_from_slice(&vec[..vec.len() - 1])
+ read_header_from_slice(&vec[..vec.len() - 1])
}
pub async fn read_object_into_headers(
- store: &Store,
- headers: &mut HashMap<Hash, Header>,
- object_hash: &Hash,
+ store: &Store,
+ headers: &mut HashMap<Hash, Header>,
+ object_hash: &Hash,
) -> anyhow::Result<()> {
- let mut stack = vec![object_hash.clone()];
+ let mut stack = vec![object_hash.clone()];
- while let Some(current_hash) = stack.pop() {
- if headers.contains_key(&current_hash) {
- continue;
- }
+ while let Some(current_hash) = stack.pop() {
+ if headers.contains_key(&current_hash) {
+ continue;
+ }
- let mut object = store.get_object(&current_hash).await?;
+ let mut object = store.get_object(&current_hash).await?;
- if object.header.object_type == ObjectType::Index {
- return Err(anyhow::anyhow!(
- "Indexes cannot exist within a tree. Likely a hash collision 😳"
- ));
- }
+ if object.header.object_type == ObjectType::Index {
+ return Err(anyhow::anyhow!(
+ "Indexes cannot exist within a tree. Likely a hash collision 😳"
+ ));
+ }
- headers.insert(current_hash.clone(), object.header);
+ headers.insert(current_hash.clone(), object.header);
- if object.header.object_type == ObjectType::Blob {
- continue;
- }
+ if object.header.object_type == ObjectType::Blob {
+ continue;
+ }
- let mut data = Vec::new();
- let bytes_read = object.read_to_end(&mut data).await?;
+ let mut data = Vec::new();
+ let bytes_read = object.read_to_end(&mut data).await?;
- assert!(
- bytes_read as u64 == object.header.size,
- "Read size must match header size"
- );
+ assert!(
+ bytes_read as u64 == object.header.size,
+ "Read size must match header size"
+ );
- let tree = crate::object_body::Tree::from_data(&data);
+ let tree = crate::object_body::Tree::from_data(&data);
- for entry in &tree.contents {
- stack.push(entry.hash.clone());
- }
- }
+ for entry in &tree.contents {
+ stack.push(entry.hash.clone());
+ }
+ }
- Ok(())
+ Ok(())
}
pub fn read_object_into_headers_sync(
- cache: &Path,
- headers: &mut HashMap<Hash, Header>,
- object_hash: &Hash,
+ cache: &Path,
+ headers: &mut HashMap<Hash, Header>,
+ object_hash: &Hash,
) -> anyhow::Result<()> {
- let mut stack = vec![object_hash.clone()];
+ let mut stack = vec![object_hash.clone()];
- while let Some(current_hash) = stack.pop() {
- if headers.contains_key(&current_hash) {
- continue;
- }
+ while let Some(current_hash) = stack.pop() {
+ if headers.contains_key(&current_hash) {
+ continue;
+ }
- let object_path = current_hash.get_path(cache);
- let file = File::open(object_path)?;
- let mut reader = BufReader::new(file);
- let mut data = Vec::new();
- let bytes_read = reader.read_until(0, &mut data)?;
+ let object_path = current_hash.get_path(cache);
+ let file = File::open(object_path)?;
+ let mut reader = BufReader::new(file);
+ let mut data = Vec::new();
+ let bytes_read = reader.read_until(0, &mut data)?;
- let header = read_header_from_slice(&data[..bytes_read - 1])
- .ok_or_else(|| anyhow::anyhow!("Invalid header"))?;
+ let header = read_header_from_slice(&data[..bytes_read - 1])
+ .ok_or_else(|| anyhow::anyhow!("Invalid header"))?;
- if header.object_type == ObjectType::Index {
- return Err(anyhow::anyhow!("Indexes cannot exist within a tree"));
- }
+ if header.object_type == ObjectType::Index {
+ return Err(anyhow::anyhow!("Indexes cannot exist within a tree"));
+ }
- headers.insert(current_hash.clone(), header);
+ headers.insert(current_hash.clone(), header);
- if header.object_type == ObjectType::Blob {
- continue;
- }
+ if header.object_type == ObjectType::Blob {
+ continue;
+ }
- data.clear();
- reader.read_to_end(&mut data)?;
+ data.clear();
+ reader.read_to_end(&mut data)?;
- let tree = crate::object_body::Tree::from_data(&data);
+ let tree = crate::object_body::Tree::from_data(&data);
- for entry in &tree.contents {
- stack.push(entry.hash.clone());
- }
- }
+ for entry in &tree.contents {
+ stack.push(entry.hash.clone());
+ }
+ }
- Ok(())
+ Ok(())
}
pub fn pipe(reader: &mut dyn Read, writer: &mut dyn Write) -> anyhow::Result<()> {
- let mut buffer: [u8; 1024] = [0; 1024];
- loop {
- let read = reader.read(&mut buffer)?;
+ let mut buffer: [u8; 1024] = [0; 1024];
+ loop {
+ let read = reader.read(&mut buffer)?;
- if read == 0 {
- break;
- }
+ if read == 0 {
+ break;
+ }
- writer.write_all(&buffer[..read])?;
- }
+ writer.write_all(&buffer[..read])?;
+ }
- Ok(())
+ Ok(())
}
diff --git a/common/src/object.rs b/common/src/object.rs
index 8c84e12..ca330a7 100644
--- a/common/src/object.rs
+++ b/common/src/object.rs
@@ -4,58 +4,58 @@ use sha2::{Digest, Sha512};
use std::io::{BufReader, Read, Write};
pub struct Object {
- header: Header,
- data: Vec<u8>,
+ header: Header,
+ data: Vec<u8>,
}
impl Object {
- pub fn to_hash(&self) -> Hash {
- let mut hasher = Sha512::new();
- hasher
- .write_all(self.header.to_string().as_bytes())
- .expect("Out of Memory");
- hasher.write_all(&self.data).expect("Out of Memory");
- hasher.into()
- }
-
- pub fn from_data(data: &[u8]) -> Option {
- let mut reader = BufReader::new(data);
- Self::read_from(&mut reader).ok()
- }
-
- pub fn read_from(reader: &mut impl Read) -> Result<Self> {
- let mut buffer = [0u8; 32];
- let bytes_read = reader.read(&mut buffer)?;
- let data = &buffer[..bytes_read];
-
- let Some(header_end) = data.iter().position(|x| *x == 0) else {
- return Err(anyhow!(
- "Invalid header. No null byte in the first 32 bytes"
- ));
- };
-
- let header = Header::from_data(&data[..header_end])?;
-
- let mut buffer = Vec::new();
-
- buffer.write_all(&data[header_end + 1..])?;
- reader.read_to_end(&mut buffer)?;
-
- Ok(Object {
- header,
- data: buffer,
- })
- }
-
- pub fn to_data(&self) -> Vec<u8> {
- let mut data = Vec::new();
- self.write_to(&mut data).expect("Out of Memory");
- data
- }
-
- pub fn write_to(&self, writer: &mut impl Write) -> Result<()> {
- writer.write_all(self.header.to_string().as_bytes())?;
- writer.write_all(&self.data)?;
- Ok(())
- }
+ pub fn to_hash(&self) -> Hash {
+ let mut hasher = Sha512::new();
+ hasher
+ .write_all(self.header.to_string().as_bytes())
+ .expect("Out of Memory");
+ hasher.write_all(&self.data).expect("Out of Memory");
+ hasher.into()
+ }
+
+ pub fn from_data(data: &[u8]) -> Option {
+ let mut reader = BufReader::new(data);
+ Self::read_from(&mut reader).ok()
+ }
+
+ pub fn read_from(reader: &mut impl Read) -> Result<Self> {
+ let mut buffer = [0u8; 32];
+ let bytes_read = reader.read(&mut buffer)?;
+ let data = &buffer[..bytes_read];
+
+ let Some(header_end) = data.iter().position(|x| *x == 0) else {
+ return Err(anyhow!(
+ "Invalid header. No null byte in the first 32 bytes"
+ ));
+ };
+
+ let header = Header::from_data(&data[..header_end])?;
+
+ let mut buffer = Vec::new();
+
+ buffer.write_all(&data[header_end + 1..])?;
+ reader.read_to_end(&mut buffer)?;
+
+ Ok(Object {
+ header,
+ data: buffer,
+ })
+ }
+
+ pub fn to_data(&self) -> Vec<u8> {
+ let mut data = Vec::new();
+ self.write_to(&mut data).expect("Out of Memory");
+ data
+ }
+
+ pub fn write_to(&self, writer: &mut impl Write) -> Result<()> {
+ writer.write_all(self.header.to_string().as_bytes())?;
+ writer.write_all(&self.data)?;
+ Ok(())
+ }
}
diff --git a/common/src/object_body.rs b/common/src/object_body.rs
index 9f32e70..17d6c6e 100644
--- a/common/src/object_body.rs
+++ b/common/src/object_body.rs
@@ -5,170 +5,170 @@ use chrono::{DateTime, Utc};
use crate::{Hash, Mode};
pub trait Object {
- fn from_data(data: &[u8]) -> Self;
- fn to_data(&self) -> Vec<u8>;
+ fn from_data(data: &[u8]) -> Self;
+ fn to_data(&self) -> Vec<u8>;
}
const TREE_KEY: &str = "tree";
const TIMESTAMP_KEY: &str = "timestamp";
#[derive(Debug)]
pub struct Index {
- pub tree: Hash,
- pub timestamp: DateTime<Utc>,
- pub metadata: HashMap<String, String>,
+ pub tree: Hash,
+ pub timestamp: DateTime<Utc>,
+ pub metadata: HashMap<String, String>,
}
impl Object for Index {
- //TODO: This HAS to return a result. We need to fix that
- fn from_data(data: &[u8]) -> Self {
- let string_data = from_utf8(data).expect("Data to be in valid utf8 format");
-
- assert!(
- &string_data[string_data.len() - 2..] == "\n\n",
- "Index MUST end in a double newline"
- );
- let string_data = &string_data[..string_data.len() - 2];
-
- let mut tree_hash: Option<Hash> = None;
- let mut timestamp: Option<DateTime<Utc>> = None;
- let mut metadata = HashMap::new();
-
- let lines: Vec<&str> = string_data.split('\n').collect();
- for line in lines {
- if line.trim() == "" {
- panic!("Index CANNOT contain a blank line")
- }
-
- let (key, value) = line
- .split_once(':')
- .expect("Each line to be properly formatted");
- let key = key.trim();
- let value = value.trim();
-
- match key.trim() {
- TREE_KEY => tree_hash = Some(Hash::try_from(value).expect("Hash to be valid")),
- TIMESTAMP_KEY => {
- timestamp = Some(
- DateTime::parse_from_rfc3339(value)
- .expect("Timestamp to be in the rfc3339 format")
- .into(),
- )
- }
- _ => {
- if tree_hash.is_none() {
- panic!("Tree MUST come first");
- }
- if timestamp.is_none() {
- panic!("Timestamp MUST come second");
- }
- if metadata
- .insert(key.to_string(), value.trim().to_string())
- .is_some()
- {
- panic!("No duplicate keys allowed within Index Metadata");
- }
- }
- }
- }
-
- Index {
- tree: tree_hash.expect("tree to exist within artifact metadata"),
- timestamp: timestamp.expect("timestamp to exist within artifact metadata"),
- metadata,
- }
- }
-
- fn to_data(&self) -> Vec<u8> {
- let mut data: Vec<u8> = Vec::new();
-
- fn write_kv(data: &mut Vec<u8>, key: &str, value: &str) -> anyhow::Result<()> {
- data.write_all(key.as_bytes())?;
- data.push(b':');
- data.push(b' ');
- data.write_all(value.as_bytes())?;
- data.push(b'\n');
-
- Ok(())
- }
-
- write_kv(&mut data, TREE_KEY, self.tree.as_str()).expect("Write to work");
- write_kv(&mut data, TIMESTAMP_KEY, &self.timestamp.to_rfc3339()).expect("Write to work");
-
- for (key, value) in &self.metadata {
- write_kv(&mut data, key, value).expect("Write to work");
- }
- data.push(b'\n');
-
- data
- }
+ //TODO: This HAS to return a result. We need to fix that
+ fn from_data(data: &[u8]) -> Self {
+ let string_data = from_utf8(data).expect("Data to be in valid utf8 format");
+
+ assert!(
+ &string_data[string_data.len() - 2..] == "\n\n",
+ "Index MUST end in a double newline"
+ );
+ let string_data = &string_data[..string_data.len() - 2];
+
+ let mut tree_hash: Option<Hash> = None;
+ let mut timestamp: Option<DateTime<Utc>> = None;
+ let mut metadata = HashMap::new();
+
+ let lines: Vec<&str> = string_data.split('\n').collect();
+ for line in lines {
+ if line.trim() == "" {
+ panic!("Index CANNOT contain a blank line")
+ }
+
+ let (key, value) = line
+ .split_once(':')
+ .expect("Each line to be properly formatted");
+ let key = key.trim();
+ let value = value.trim();
+
+ match key.trim() {
+ TREE_KEY => tree_hash = Some(Hash::try_from(value).expect("Hash to be valid")),
+ TIMESTAMP_KEY => {
+ timestamp = Some(
+ DateTime::parse_from_rfc3339(value)
+ .expect("Timestamp to be in the rfc3339 format")
+ .into(),
+ )
+ }
+ _ => {
+ if tree_hash.is_none() {
+ panic!("Tree MUST come first");
+ }
+ if timestamp.is_none() {
+ panic!("Timestamp MUST come second");
+ }
+ if metadata
+ .insert(key.to_string(), value.trim().to_string())
+ .is_some()
+ {
+ panic!("No duplicate keys allowed within Index Metadata");
+ }
+ }
+ }
+ }
+
+ Index {
+ tree: tree_hash.expect("tree to exist within artifact metadata"),
+ timestamp: timestamp.expect("timestamp to exist within artifact metadata"),
+ metadata,
+ }
+ }
+
+ fn to_data(&self) -> Vec<u8> {
+ let mut data: Vec<u8> = Vec::new();
+
+ fn write_kv(data: &mut Vec<u8>, key: &str, value: &str) -> anyhow::Result<()> {
+ data.write_all(key.as_bytes())?;
+ data.push(b':');
+ data.push(b' ');
+ data.write_all(value.as_bytes())?;
+ data.push(b'\n');
+
+ Ok(())
+ }
+
+ write_kv(&mut data, TREE_KEY, self.tree.as_str()).expect("Write to work");
+ write_kv(&mut data, TIMESTAMP_KEY, &self.timestamp.to_rfc3339()).expect("Write to work");
+
+ for (key, value) in &self.metadata {
+ write_kv(&mut data, key, value).expect("Write to work");
+ }
+ data.push(b'\n');
+
+ data
+ }
}
#[derive(Debug)]
pub struct TreeEntry {
- pub mode: Mode,
- pub path: String,
- pub hash: Hash,
+ pub mode: Mode,
+ pub path: String,
+ pub hash: Hash,
}
#[derive(Debug)]
pub struct Tree {
- pub contents: Vec<TreeEntry>,
+ pub contents: Vec<TreeEntry>,
}
impl Object for Tree {
- fn from_data(data: &[u8]) -> Self {
- let mut contents = Vec::new();
-
- let mut index: usize = 0;
- loop {
- if index == data.len() {
- break;
- }
-
- let remaining = &data[index..];
-
- let Some(position) = remaining.iter().position(|v| *v == 0) else {
- panic!("Entry must contain null char");
- };
-
- let string = from_utf8(&remaining[..position]).expect("Entry must be valid utf8");
- let position = position + 1;
-
- let (mode, name) = string
- .split_once(' ')
- .expect("mode and filename to be separated by space");
- let mode = Mode::from_str(mode).expect("valid mode");
-
- let hash =
- Hash::try_from(&remaining[position..position + 64]).expect("Hash to be valid");
- contents.push(TreeEntry {
- hash,
- mode,
- path: name.to_string(),
- });
-
- index += position + 64;
- }
-
- Tree { contents }
- }
-
- fn to_data(&self) -> Vec<u8> {
- let mut data: Vec<u8> = Vec::new();
-
- fn write_entry(data: &mut Vec<u8>, entry: &TreeEntry) -> anyhow::Result<()> {
- data.write_all(entry.mode.as_str().as_bytes())?;
- data.push(b' ');
- data.write_all(entry.path.as_bytes())?;
- data.push(0);
- data.write_all(&entry.hash.hash)?;
-
- Ok(())
- }
-
- for entry in &self.contents {
- write_entry(&mut data, entry).expect("Writing to works");
- }
-
- data
- }
+ fn from_data(data: &[u8]) -> Self {
+ let mut contents = Vec::new();
+
+ let mut index: usize = 0;
+ loop {
+ if index == data.len() {
+ break;
+ }
+
+ let remaining = &data[index..];
+
+ let Some(position) = remaining.iter().position(|v| *v == 0) else {
+ panic!("Entry must contain null char");
+ };
+
+ let string = from_utf8(&remaining[..position]).expect("Entry must be valid utf8");
+ let position = position + 1;
+
+ let (mode, name) = string
+ .split_once(' ')
+ .expect("mode and filename to be separated by space");
+ let mode = Mode::from_str(mode).expect("valid mode");
+
+ let hash =
+ Hash::try_from(&remaining[position..position + 64]).expect("Hash to be valid");
+ contents.push(TreeEntry {
+ hash,
+ mode,
+ path: name.to_string(),
+ });
+
+ index += position + 64;
+ }
+
+ Tree { contents }
+ }
+
+ fn to_data(&self) -> Vec<u8> {
+ let mut data: Vec<u8> = Vec::new();
+
+ fn write_entry(data: &mut Vec<u8>, entry: &TreeEntry) -> anyhow::Result<()> {
+ data.write_all(entry.mode.as_str().as_bytes())?;
+ data.push(b' ');
+ data.write_all(entry.path.as_bytes())?;
+ data.push(0);
+ data.write_all(&entry.hash.hash)?;
+
+ Ok(())
+ }
+
+ for entry in &self.contents {
+ write_entry(&mut data, entry).expect("Writing to works");
+ }
+
+ data
+ }
}
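As an aside for readers skimming the hunks above: the `Index` parser expects a strict layout of `tree` first, `timestamp` second, arbitrary metadata after, with the whole index terminated by a double newline. A std-only sketch of that round trip, using plain strings in place of the crate's `Hash` and timestamp types (the function names here are illustrative, not the crate's API):

```rust
use std::collections::HashMap;

// Serialize in the order from_data requires: tree first, timestamp second,
// then metadata, terminated by the double newline the parser asserts on.
fn index_to_data(tree: &str, timestamp: &str, metadata: &HashMap<String, String>) -> String {
    let mut out = format!("tree: {tree}\ntimestamp: {timestamp}\n");
    for (key, value) in metadata {
        out.push_str(&format!("{key}: {value}\n"));
    }
    out.push('\n');
    out
}

// Parse the same line-oriented `key: value` format back into its parts.
fn index_from_data(data: &str) -> (String, String, HashMap<String, String>) {
    let body = data
        .strip_suffix("\n\n")
        .expect("index must end in a double newline");
    let (mut tree, mut timestamp) = (None, None);
    let mut metadata = HashMap::new();
    for line in body.split('\n') {
        // split_once takes the first ':' only, so RFC 3339 timestamps survive.
        let (key, value) = line.split_once(':').expect("each line must be `key: value`");
        match key.trim() {
            "tree" => tree = Some(value.trim().to_string()),
            "timestamp" => timestamp = Some(value.trim().to_string()),
            other => {
                metadata.insert(other.to_string(), value.trim().to_string());
            }
        }
    }
    (
        tree.expect("tree present"),
        timestamp.expect("timestamp present"),
        metadata,
    )
}

fn main() {
    let mut meta = HashMap::new();
    meta.insert("author".to_string(), "alice".to_string());
    let encoded = index_to_data("abc123", "2024-01-01T00:00:00Z", &meta);
    let decoded = index_from_data(&encoded);
    println!("{decoded:?}");
}
```

The real implementation additionally rejects blank interior lines and duplicate metadata keys, which this sketch omits.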
diff --git a/common/src/primitives.rs b/common/src/primitives.rs
index f7eded7..6b2feaf 100644
--- a/common/src/primitives.rs
+++ b/common/src/primitives.rs
@@ -4,10 +4,10 @@ use std::fmt::Display;
#[allow(clippy::zero_prefixed_literal)]
#[derive(Debug)]
pub enum Mode {
- Tree = 040000,
- Normal = 100644,
- Executable = 100755,
- SymbolicLink = 120000,
+ Tree = 040000,
+ Normal = 100644,
+ Executable = 100755,
+ SymbolicLink = 120000,
}
const TREE_MODE: &str = "040000";
@@ -16,56 +16,56 @@ const EXECUTABLE_MODE: &str = "100755";
const SYMBOLIC_LINK_MODE: &str = "120000";
impl Mode {
- #[allow(clippy::should_implement_trait)]
- pub fn from_str(value: &str) -> Option<Mode> {
- match value {
- TREE_MODE => Some(Mode::Tree),
- NORMAL_MODE => Some(Mode::Normal),
- EXECUTABLE_MODE => Some(Mode::Executable),
- SYMBOLIC_LINK_MODE => Some(Mode::SymbolicLink),
- _ => None,
- }
- }
+ #[allow(clippy::should_implement_trait)]
+ pub fn from_str(value: &str) -> Option<Mode> {
+ match value {
+ TREE_MODE => Some(Mode::Tree),
+ NORMAL_MODE => Some(Mode::Normal),
+ EXECUTABLE_MODE => Some(Mode::Executable),
+ SYMBOLIC_LINK_MODE => Some(Mode::SymbolicLink),
+ _ => None,
+ }
+ }
- pub fn as_str(&self) -> &'static str {
- match self {
- Self::Tree => TREE_MODE,
- Self::Normal => NORMAL_MODE,
- Self::Executable => EXECUTABLE_MODE,
- Self::SymbolicLink => SYMBOLIC_LINK_MODE,
- }
- }
+ pub fn as_str(&self) -> &'static str {
+ match self {
+ Self::Tree => TREE_MODE,
+ Self::Normal => NORMAL_MODE,
+ Self::Executable => EXECUTABLE_MODE,
+ Self::SymbolicLink => SYMBOLIC_LINK_MODE,
+ }
+ }
}
impl Display for Mode {
- fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
- write!(f, "{}", self.as_str())
- }
+ fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+ write!(f, "{}", self.as_str())
+ }
}
#[derive(Copy, Clone, Debug, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub enum ObjectType {
- Blob,
- Tree,
- Index,
+ Blob,
+ Tree,
+ Index,
}
impl ObjectType {
- #[allow(clippy::should_implement_trait)]
- pub fn from_str(value: &str) -> Option<Self> {
- match value {
- BLOB_KEY => Some(Self::Blob),
- TREE_KEY => Some(Self::Tree),
- INDEX_KEY => Some(Self::Index),
- _ => None,
- }
- }
+ #[allow(clippy::should_implement_trait)]
+ pub fn from_str(value: &str) -> Option<Self> {
+ match value {
+ BLOB_KEY => Some(Self::Blob),
+ TREE_KEY => Some(Self::Tree),
+ INDEX_KEY => Some(Self::Index),
+ _ => None,
+ }
+ }
- pub fn to_str(&self) -> &'static str {
- match self {
- Self::Blob => BLOB_KEY,
- Self::Index => INDEX_KEY,
- Self::Tree => TREE_KEY,
- }
- }
+ pub fn to_str(&self) -> &'static str {
+ match self {
+ Self::Blob => BLOB_KEY,
+ Self::Index => INDEX_KEY,
+ Self::Tree => TREE_KEY,
+ }
+ }
}
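The hunks above only reindent `primitives.rs`, but the contract they preserve is worth stating: the four mode constants mirror git's octal tree-entry modes, and `from_str`/`as_str` are inverses over exactly those four strings, with anything else mapping to `None`. A small self-contained sketch of that round trip (free functions instead of the crate's methods, purely for illustration):

```rust
// Git-style mode strings, as in primitives.rs.
const TREE_MODE: &str = "040000";
const NORMAL_MODE: &str = "100644";
const EXECUTABLE_MODE: &str = "100755";
const SYMBOLIC_LINK_MODE: &str = "120000";

#[derive(Debug, PartialEq)]
enum Mode {
    Tree,
    Normal,
    Executable,
    SymbolicLink,
}

// Unknown strings produce None rather than panicking.
fn mode_from_str(value: &str) -> Option<Mode> {
    match value {
        TREE_MODE => Some(Mode::Tree),
        NORMAL_MODE => Some(Mode::Normal),
        EXECUTABLE_MODE => Some(Mode::Executable),
        SYMBOLIC_LINK_MODE => Some(Mode::SymbolicLink),
        _ => None,
    }
}

fn mode_as_str(mode: &Mode) -> &'static str {
    match mode {
        Mode::Tree => TREE_MODE,
        Mode::Normal => NORMAL_MODE,
        Mode::Executable => EXECUTABLE_MODE,
        Mode::SymbolicLink => SYMBOLIC_LINK_MODE,
    }
}

fn main() {
    for s in [TREE_MODE, NORMAL_MODE, EXECUTABLE_MODE, SYMBOLIC_LINK_MODE] {
        let mode = mode_from_str(s).expect("known mode");
        // The mapping is a lossless round trip over the four known modes.
        assert_eq!(mode_as_str(&mode), s);
    }
    println!("round trip ok");
}
```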
diff --git a/common/src/store.rs b/common/src/store.rs
index 4ef9435..cd8ac5e 100644
--- a/common/src/store.rs
+++ b/common/src/store.rs
@@ -7,136 +7,136 @@ use opendal::{Builder, FuturesAsyncReader, Operator};
pub struct StoreObject<T>
where
- T: AsyncBufRead + AsyncRead + Unpin,
+ T: AsyncBufRead + AsyncRead + Unpin,
{
- pub header: Header,
- body: T,
+ pub header: Header,
+ body: T,
}
impl<T> StoreObject<T>
where
- T: AsyncBufRead + AsyncRead + Unpin,
+ T: AsyncBufRead + AsyncRead + Unpin,
{
- pub fn header(&self) -> &Header {
- &self.header
- }
-
- pub fn to_data(&mut self) -> Vec<u8> {
- // read body into Vec<u8>
- let mut body_data: Vec<u8> = Vec::new();
- futures::executor::block_on(self.body.read_to_end(&mut body_data))
- .expect("Reading body to work");
-
- body_data
- }
- // pub async fn new(mut reader: T) -> Result<Self>
- // {
- // let mut buffer = [0u8; 32];
- // let bytes_read = reader.read(&mut buffer).await?;
- // let data = &buffer[..bytes_read];
-
- // let Some(header_end) = data.iter().position(|x| *x == 0) else {
- // return Err(anyhow!(
- // "Invalid header. No null byte in the first 32 bytes"
- // ));
- // };
- // let header = Header::from_data(&data[..header_end])?;
- // reader
- // .seek(std::io::SeekFrom::Start(header_end as u64))
- // .await?;
-
- // Ok(Self {
- // header,
- // body: reader,
- // })
- // }
-
- pub fn new_with_header(header: Header, reader: T) -> Self {
- Self {
- header,
- body: reader,
- }
- }
+ pub fn header(&self) -> &Header {
+ &self.header
+ }
+
+ pub fn to_data(&mut self) -> Vec<u8> {
+ // read body into Vec<u8>
+ let mut body_data: Vec<u8> = Vec::new();
+ futures::executor::block_on(self.body.read_to_end(&mut body_data))
+ .expect("Reading body to work");
+
+ body_data
+ }
+ // pub async fn new(mut reader: T) -> Result<Self>
+ // {
+ // let mut buffer = [0u8; 32];
+ // let bytes_read = reader.read(&mut buffer).await?;
+ // let data = &buffer[..bytes_read];
+
+ // let Some(header_end) = data.iter().position(|x| *x == 0) else {
+ // return Err(anyhow!(
+ // "Invalid header. No null byte in the first 32 bytes"
+ // ));
+ // };
+ // let header = Header::from_data(&data[..header_end])?;
+ // reader
+ // .seek(std::io::SeekFrom::Start(header_end as u64))
+ // .await?;
+
+ // Ok(Self {
+ // header,
+ // body: reader,
+ // })
+ // }
+
+ pub fn new_with_header(header: Header, reader: T) -> Self {
+ Self {
+ header,
+ body: reader,
+ }
+ }
}
impl<T> AsyncRead for StoreObject<T>
where
- T: AsyncBufRead + AsyncRead + Unpin,
+ T: AsyncBufRead + AsyncRead + Unpin,
{
- fn poll_read(
- self: std::pin::Pin<&mut Self>,
- cx: &mut std::task::Context<'_>,
- buf: &mut [u8],
- ) -> std::task::Poll<std::io::Result<usize>> {
- let this = self.get_mut();
- std::pin::Pin::new(&mut this.body).poll_read(cx, buf)
- }
+ fn poll_read(
+ self: std::pin::Pin<&mut Self>,
+ cx: &mut std::task::Context<'_>,
+ buf: &mut [u8],
+ ) -> std::task::Poll<std::io::Result<usize>> {
+ let this = self.get_mut();
+ std::pin::Pin::new(&mut this.body).poll_read(cx, buf)
+ }
}
impl<T> AsyncBufRead for StoreObject<T>
where
- T: AsyncBufRead + AsyncRead + Unpin,
+ T: AsyncBufRead + AsyncRead + Unpin,
{
- fn poll_fill_buf(
- self: std::pin::Pin<&mut Self>,
- cx: &mut std::task::Context<'_>,
- ) -> std::task::Poll<std::io::Result<&[u8]>> {
- let this = self.get_mut();
- std::pin::Pin::new(&mut this.body).poll_fill_buf(cx)
- }
-
- fn consume(self: std::pin::Pin<&mut Self>, amt: usize) {
- let this = self.get_mut();
- std::pin::Pin::new(&mut this.body).consume(amt);
- }
+ fn poll_fill_buf(
+ self: std::pin::Pin<&mut Self>,
+ cx: &mut std::task::Context<'_>,
+ ) -> std::task::Poll<std::io::Result<&[u8]>> {
+ let this = self.get_mut();
+ std::pin::Pin::new(&mut this.body).poll_fill_buf(cx)
+ }
+
+ fn consume(self: std::pin::Pin<&mut Self>, amt: usize) {
+ let this = self.get_mut();
+ std::pin::Pin::new(&mut this.body).consume(amt);
+ }
}
#[derive(Clone)]
pub struct Store {
- operator: Operator,
+ operator: Operator,
}
impl Store {
- pub fn new(operator: Operator) -> Self {
- Self { operator }
- }
-
- pub fn from_builder(builder: impl Builder) -> Result<Self> {
- Ok(Self::new(Operator::new(builder)?.finish()))
- }
-
- pub async fn exists(&self, hash: &Hash) -> Result<bool> {
- Ok(self.operator.exists(hash.as_str()).await?)
- }
-
- pub async fn get_object(&self, hash: &Hash) -> Result<StoreObject<FuturesAsyncReader>> {
- let mut reader = self
- .operator
- .reader(hash.as_str())
- .await?
- .into_futures_async_read(..)
- .await?;
-
- let header = Header::read_from_async(&mut reader).await?;
-
- Ok(StoreObject::new_with_header(header, reader))
- }
-
- pub async fn put_object<T>(&self, hash: &Hash, mut object: StoreObject<T>) -> Result<()>
- where
- T: AsyncBufRead + AsyncRead + Unpin,
- {
- let mut writer = self
- .operator
- .writer(hash.as_str())
- .await?
- .into_futures_async_write();
-
- object.header.write_to_async(&mut writer).await?;
- copy(&mut object.body, &mut writer).await?;
-
- writer.close().await?;
-
- Ok(())
- }
+ pub fn new(operator: Operator) -> Self {
+ Self { operator }
+ }
+
+ pub fn from_builder(builder: impl Builder) -> Result<Self> {
+ Ok(Self::new(Operator::new(builder)?.finish()))
+ }
+
+ pub async fn exists(&self, hash: &Hash) -> Result<bool> {
+ Ok(self.operator.exists(hash.as_str()).await?)
+ }
+
+ pub async fn get_object(&self, hash: &Hash) -> Result<StoreObject<FuturesAsyncReader>> {
+ let mut reader = self
+ .operator
+ .reader(hash.as_str())
+ .await?
+ .into_futures_async_read(..)
+ .await?;
+
+ let header = Header::read_from_async(&mut reader).await?;
+
+ Ok(StoreObject::new_with_header(header, reader))
+ }
+
+ pub async fn put_object<T>(&self, hash: &Hash, mut object: StoreObject<T>) -> Result<()>
+ where
+ T: AsyncBufRead + AsyncRead + Unpin,
+ {
+ let mut writer = self
+ .operator
+ .writer(hash.as_str())
+ .await?
+ .into_futures_async_write();
+
+ object.header.write_to_async(&mut writer).await?;
+ copy(&mut object.body, &mut writer).await?;
+
+ writer.close().await?;
+
+ Ok(())
+ }
}
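Stepping back from the reindented hunks: `Store` is content-addressed, with each object written under its hash as a header followed by a body (the commented-out reader code suggests the header is terminated by a null byte). A dependency-free sketch of that shape, with a `HashMap` standing in for the opendal backend; all names here are illustrative, not the crate's API:

```rust
use std::collections::HashMap;

// Hypothetical in-memory stand-in for the opendal-backed Store.
struct MemStore {
    objects: HashMap<String, Vec<u8>>, // keyed by the object's hash string
}

impl MemStore {
    fn new() -> Self {
        Self { objects: HashMap::new() }
    }

    fn exists(&self, hash: &str) -> bool {
        self.objects.contains_key(hash)
    }

    // Mirrors put_object: header bytes, a null separator, then the body,
    // stored under the hash key.
    fn put_object(&mut self, hash: &str, header: &[u8], body: &[u8]) {
        let mut data = Vec::with_capacity(header.len() + 1 + body.len());
        data.extend_from_slice(header);
        data.push(0);
        data.extend_from_slice(body);
        self.objects.insert(hash.to_string(), data);
    }

    // Mirrors get_object: scan for the null byte and split the stored
    // bytes back into header and body.
    fn get_object(&self, hash: &str) -> Option<(&[u8], &[u8])> {
        let data = self.objects.get(hash)?;
        let sep = data.iter().position(|b| *b == 0)?;
        Some((&data[..sep], &data[sep + 1..]))
    }
}

fn main() {
    let mut store = MemStore::new();
    store.put_object("deadbeef", b"blob 5", b"hello");
    let (header, body) = store.get_object("deadbeef").unwrap();
    println!(
        "{} / {}",
        String::from_utf8_lossy(header),
        String::from_utf8_lossy(body)
    );
}
```

The real `Store` streams the body through async readers and writers instead of buffering it, which is why `StoreObject` implements `AsyncRead` and `AsyncBufRead`.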
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..0ba1fec
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,21 @@
+# ARX Documentation
+
+Congratulations, you have found the source for the ARX documentation.
+
+## Contributing
+
+Thank you for considering contributing to the ARX documentation! We welcome all contributions, no matter how small. Whether it's fixing a typo, improving grammar, or creating a whole new RFC, every contribution is valuable.
+
+
+## Markdown?
+
+ARX's documentation is written entirely in [Markdown](https://www.markdownguide.org/getting-started/), a plain-text markup language.
+
+If you think you haven't used Markdown, there is a good chance you actually have: Discord, Reddit, WhatsApp, Discourse, and most Fediverse apps all support it. If you know how to make text bold in Discord, you likely know how to make text bold in Markdown.
+
+If you have never used any of those apps, have only ever written plain unformatted text, or simply want a reference or quick refresher, the following cheat sheet will get you up to speed quickly: https://www.markdownguide.org/cheat-sheet/.
+
+
+## Disclaimer
+
+As ARX is still in its early stages of development, large parts of the codebase may change and the documentation may not always match it perfectly. If you find any discrepancies, feel free to raise an issue, or fix the problem yourself and submit it as a PR.
\ No newline at end of file
diff --git a/docs/assets/banner-dark.svg b/docs/assets/banner-dark.svg
new file mode 100644
index 0000000..6c97afd
--- /dev/null
+++ b/docs/assets/banner-dark.svg
@@ -0,0 +1,9 @@
+