Fix binary data handling in byte-to-string conversions #27
Merged
Add lenient mode for handling non-UTF8 binary data (e.g., certificates, images):
- When no encoding is specified, use Latin-1 fallback for invalid UTF-8 bytes
- When encoding is explicitly provided, use strict mode (throws on invalid)
- Replace null bytes with Unicode replacement character (U+FFFD) for C string safety
Rust changes:
- Add bytes_to_string_lenient, base64_to_string_lenient, decompress_string_lenient FFI functions
- Update convert_bytes_to_string_with_fallback to handle null byte replacement
PowerShell changes:
- Update ConvertFrom-ByteArrayToString, ConvertFrom-Base64ToString, ConvertFrom-CompressedByteArrayToString
- Add DllImport declarations for lenient functions in RustInterop.ps1
This allows commands like 'Get-SecureBootUEFI db | ConvertFrom-ByteArrayToString' to work without throwing 'Invalid UTF-8 bytes' errors.
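The lenient and strict behaviors described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the function names carry a `_sketch` suffix because they are hypothetical, and the sketch assumes the Latin-1 fallback applies to the whole buffer once UTF-8 validation fails, which the commit message does not spell out.

```rust
/// Lenient decode (hypothetical sketch): try UTF-8 first, and if the buffer is
/// not valid UTF-8, reinterpret every byte as Latin-1 (ISO-8859-1).
/// Assumption: the fallback is applied to the whole buffer, not per byte.
fn bytes_to_string_lenient_sketch(bytes: &[u8]) -> String {
    let decoded: String = match std::str::from_utf8(bytes) {
        Ok(s) => s.to_owned(),
        // Latin-1 maps each byte to the Unicode code point with the same value,
        // so this branch can never fail.
        Err(_) => bytes.iter().map(|&b| b as char).collect(),
    };
    // A C string ends at the first NUL, so swap NULs for U+FFFD before the
    // text crosses the FFI boundary.
    decoded.replace('\0', "\u{FFFD}")
}

/// Strict decode (hypothetical sketch): invalid UTF-8 is reported as an error
/// instead of being reinterpreted.
fn bytes_to_string_strict_sketch(bytes: &[u8]) -> Result<String, std::str::Utf8Error> {
    std::str::from_utf8(bytes).map(|s| s.to_owned())
}
```

With this shape, a caller that received no explicit encoding would use the lenient function, while a caller that was given an encoding would use the strict one and surface the error.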
Move encoding defaults from parameter declarations to internal handling:
- Allows detection of whether user explicitly specified an encoding
- Enables future best-effort improvements when encoding is not provided
- All functions now use an internal check: if ([string]::IsNullOrEmpty($Encoding)) { $Encoding = 'UTF8' }
Updated functions:
- ConvertFrom-Base64, ConvertFrom-MemoryStream, ConvertFrom-MemoryStreamToString
- ConvertFrom-MemoryStreamToSecureString, ConvertFrom-StringToBase64
- ConvertFrom-StringToByteArray, ConvertFrom-StringToCompressedByteArray
- ConvertFrom-StringToMemoryStream, ConvertTo-Base64, ConvertTo-Hash
- ConvertTo-HmacHash, ConvertTo-MemoryStream
…ctions
- Update ConvertFrom-ByteArrayToString help to explain UTF-8 with Latin-1 fallback when -Encoding not specified
- Update ConvertFrom-Base64ToString help with same lenient/strict behavior documentation
- Fix ConvertFrom-CompressedByteArrayToString incorrect synopsis (was backwards) and add encoding docs
- Regenerate markdown documentation
- Add INCONSISTENCIES.md tracking document for standardization work
url_encode and url_decode now call set_error() when input is null, matching the error handling pattern used by all other Rust functions.
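A minimal sketch of that null-check pattern is shown below. It is an assumption-laden illustration rather than the library's code: `set_error` here is a stand-in with a guessed signature backed by a thread-local slot, `take_error` and `url_encode_sketch` are hypothetical names, and the percent-encoding rule (RFC 3986 unreserved characters pass through) is chosen only to make the example complete.

```rust
use std::cell::RefCell;
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

thread_local! {
    // Hypothetical storage for the most recent error message.
    static LAST_ERROR: RefCell<Option<String>> = RefCell::new(None);
}

/// Stand-in for the library's set_error(); the real signature is not shown in this PR.
fn set_error(message: &str) {
    LAST_ERROR.with(|slot| *slot.borrow_mut() = Some(message.to_owned()));
}

/// Hypothetical helper that reads and clears the stored error.
fn take_error() -> Option<String> {
    LAST_ERROR.with(|slot| slot.borrow_mut().take())
}

/// Hypothetical FFI entry point: returns null and records an error on bad input.
/// The caller owns the returned pointer and must release it via CString::from_raw.
pub unsafe extern "C" fn url_encode_sketch(input: *const c_char) -> *mut c_char {
    if input.is_null() {
        set_error("input pointer is null");
        return ptr::null_mut();
    }
    // SAFETY: the pointer is non-null and the caller promises a NUL-terminated string.
    let text = match unsafe { CStr::from_ptr(input) }.to_str() {
        Ok(s) => s,
        Err(_) => {
            set_error("input is not valid UTF-8");
            return ptr::null_mut();
        }
    };
    let mut encoded = String::new();
    for b in text.bytes() {
        match b {
            // Unreserved characters are copied through unchanged.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                encoded.push(b as char)
            }
            // Everything else becomes %XX.
            _ => encoded.push_str(&format!("%{:02X}", b)),
        }
    }
    match CString::new(encoded) {
        Ok(c) => c.into_raw(),
        Err(_) => {
            set_error("encoded output contained a NUL byte");
            ptr::null_mut()
        }
    }
}
```

Recording the error before returning null lets the caller tell a failed call apart from one that produced no output.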
hash.rs now uses crate::base64::convert_string_to_bytes() instead of maintaining its own copy of the encoding conversion logic.
- ConvertTo-Celsius and ConvertTo-Fahrenheit now accept -Precision (0-15)
- Default remains 2 decimal places for practical use
- Higher precision available for scientific calculations (e.g., absolute zero)
- Rust returns full f64 precision, PowerShell handles rounding
- Updated INCONSISTENCIES.md to mark item #5 as resolved
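The split between full-precision conversion and caller-side rounding might look like the sketch below. Both function names are hypothetical, and `round_to` only stands in for the rounding that the PowerShell layer performs. Note that Rust's `f64::round` rounds halves away from zero, which may differ from the rounding mode used on the PowerShell side.

```rust
/// Full-precision conversion (hypothetical name); no rounding happens here.
fn fahrenheit_to_celsius_sketch(fahrenheit: f64) -> f64 {
    (fahrenheit - 32.0) * 5.0 / 9.0
}

/// Illustrative stand-in for the caller-side rounding step.
/// `precision` is the number of decimal places, expected to be in 0..=15.
fn round_to(value: f64, precision: u32) -> f64 {
    let factor = 10f64.powi(precision as i32);
    (value * factor).round() / factor
}
```

For example, absolute zero (-459.67 °F) converts to a value very close to -273.15 °C. Applying `round_to` with a precision of 2 trims any floating-point noise, while a higher precision keeps more digits for scientific use.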
- Add compute_hmac_with_encoding for string input with encoding parameter
- Add compute_hmac_bytes for direct byte array input (no encoding needed)
- Remove unused compute_hmac function
- Update ConvertTo-HmacHash to use new Rust functions
- Update ConvertFrom-StringToMemoryStream to use Rust string_to_bytes
- Fix bug: non-compress path now respects -Encoding parameter
- Fix bug: GzipStream now properly closed
- Update tests for correct behavior
- Mark INCONSISTENCIES.md issue #4 as resolved
- Replace Unicode arrow character with ASCII in test name
- Use [char]::ConvertFromUtf32(0x1F30D) instead of literal emoji to avoid file encoding corruption issues
- Update build.ps1 and RustInterop.ps1
- Fix compression.rs in Rust library
- Update Base64 conversion functions
- Fix ConvertTo-MemoryStream and RustInterop tests
Summary
Fixes #24 - Binary data handling issues in byte-to-string conversions.
Changes
Binary Data Handling
- Added a lenient mode to ConvertFrom-ByteArrayToString that falls back to Latin-1 (ISO-8859-1) when no encoding is specified
- When -Encoding is explicitly specified, strict mode is used and invalid byte sequences throw errors
Encoding Standardization
- Removed default Encoding parameter values for future extensibility
- Updated ConvertFrom-Base64, ConvertFrom-Base64ToString, ConvertFrom-ByteArrayToBase64, and ConvertFrom-StringToBase64 to use Rust implementations
Temperature Functions
- Added -Precision parameter to ConvertTo-Celsius and ConvertTo-Fahrenheit for controlling decimal places
Code Quality
- Removed duplicate convert_string_to_bytes from hash module
Documentation