One API, every format. Extract typed data from Excel, CSV, PDF, JSON, and XML in C#.

| Package | Format | Dependencies |
|---|---|---|
| Unio.Core | CSV | None |
| Unio.Excel | XLSX, XLS | DocumentFormat.OpenXml |
| Unio.Pdf | PDF tables | PdfPig |
| Unio.Json | JSON | None (System.Text.Json) |
| Unio.Xml | XML | None (System.Xml.Linq) |
| Unio.Validation | -- | None (DataAnnotations) |

Reading an Excel file requires one library. CSV needs another. PDF tables need a third. Each has its own API, its own patterns, its own quirks.
Unio gives you one API for all of them:

```csharp
var unio = new Unio();

var fromExcel = unio.Extract<Invoice>("invoices.xlsx");
var fromCsv   = unio.Extract<Invoice>("invoices.csv");
var fromPdf   = unio.Extract<Invoice>("invoices.pdf");
var fromJson  = unio.Extract<Invoice>("invoices.json");
```

Same call. Same result. Format is auto-detected.

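Auto-detection of this kind usually means inspecting a file's leading bytes and falling back to its extension. The sketch below is a minimal illustration of that idea, not Unio's implementation: `DetectedFormat` and `FormatSniffer` are hypothetical names, and only the well-known signatures for PDF, ZIP (the container XLSX uses), and OLE2 (legacy XLS) are checked.

```csharp
using System;
using System.IO;

// Hypothetical types for illustration only; these are not part of Unio.
public enum DetectedFormat { Unknown, Csv, Xlsx, Xls, Pdf, Json, Xml }

public static class FormatSniffer
{
    // Well-known file signatures ("magic bytes").
    private static readonly byte[] PdfSig  = { 0x25, 0x50, 0x44, 0x46 };                         // "%PDF"
    private static readonly byte[] ZipSig  = { 0x50, 0x4B, 0x03, 0x04 };                         // "PK\x03\x04"
    private static readonly byte[] Ole2Sig = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 }; // compound file

    public static DetectedFormat Detect(byte[] header, string fileName = "")
    {
        // Binary formats: trust the signature over the extension.
        if (StartsWith(header, PdfSig)) return DetectedFormat.Pdf;
        if (StartsWith(header, Ole2Sig)) return DetectedFormat.Xls;
        // Any ZIP archive matches here; a stricter detector would also
        // inspect the archive contents before calling it XLSX.
        if (StartsWith(header, ZipSig)) return DetectedFormat.Xlsx;

        // Text formats carry no signature, so fall back to the extension.
        return Path.GetExtension(fileName).ToLowerInvariant() switch
        {
            ".csv"  => DetectedFormat.Csv,
            ".json" => DetectedFormat.Json,
            ".xml"  => DetectedFormat.Xml,
            ".xlsx" => DetectedFormat.Xlsx,
            ".xls"  => DetectedFormat.Xls,
            ".pdf"  => DetectedFormat.Pdf,
            _       => DetectedFormat.Unknown,
        };
    }

    private static bool StartsWith(byte[] data, byte[] signature)
    {
        if (data.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
            if (data[i] != signature[i]) return false;
        return true;
    }
}
```

A caller would read the first eight bytes of the stream and pass them in together with the file name. Checking the signature first means a PDF saved with the wrong extension is still recognised as a PDF.
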
Install the core package and add only the formats you need:

```bash
dotnet add package Unio.Core        # Core + CSV
dotnet add package Unio.Excel       # Add Excel support
dotnet add package Unio.Pdf         # Add PDF support
dotnet add package Unio.Json        # Add JSON support
dotnet add package Unio.Xml         # Add XML support
dotnet add package Unio.Validation  # Add validation support
```

The core package has zero external dependencies.

```csharp
public class Invoice
{
    [Column("Invoice Number")]
    public string InvoiceNo { get; set; }

    [Column("Total Amount")]
    [Required]
    public decimal Amount { get; set; }

    [DateFormat("dd/MM/yyyy")]
    public DateTime DueDate { get; set; }

    [Ignore]
    public string InternalNotes { get; set; }
}
```

```csharp
// From any supported format -- auto-detected
var unio = new Unio();
var invoices = unio.Extract<Invoice>("invoices.xlsx");
```

Process millions of rows without loading everything into memory:

```csharp
var unio = new Unio();
await foreach (var invoice in unio.ExtractAsync<Invoice>("huge-file.xlsx"))
{
    await ProcessAsync(invoice);
}
```

No model? No problem:

```csharp
var unio = new Unio();
var rows = unio.Extract("data.csv"); // IEnumerable<dynamic>
```

Pass options to control how a file is read:

```csharp
var unio = new Unio();
var records = unio.Extract<Invoice>("data.xlsx", opt =>
{
    opt.SheetName = "Sheet2";
    opt.StartRow = 3;
    opt.HasHeaderRow = true;
    opt.Validate = true;
});
```

Register extractors and defaults with dependency injection:

```csharp
services.AddUnio(config =>
{
    config.RegisterExtractor<XlsxExtractor>();
    config.RegisterExtractor<PdfTableExtractor>();
    config.DefaultCulture = new CultureInfo("en-US");
    config.OnError = ErrorHandling.CollectAndContinue;
});
```

Then inject `IUnioExtractor` wherever you need it:

```csharp
public class ReportService(IUnioExtractor extractor)
{
    public async Task<List<Invoice>> LoadAsync(Stream file)
        => await extractor.ExtractAsync<Invoice>(file).ToListAsync();
}
```

- Unified API -- `Extract<T>()` works the same across every format
- Strongly typed -- Map directly to your POCOs with attributes
- Streaming first -- `IAsyncEnumerable<T>` across all formats, never loads entire files into memory
- Zero-dep core -- The core package has no external dependencies
- Modular -- Install only the format readers you need
- Validation -- Built-in support for `DataAnnotations` and fluent validation rules
- DI-first -- Implements `IUnioExtractor` for clean dependency injection
- Cross-platform -- No `System.Drawing`, no COM, no `libgdiplus`. Works on Windows, Linux, macOS, and containers
- Auto-detection -- File format detected by magic bytes and extension

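To make the "streaming first" point concrete, here is a minimal sketch of how an `IAsyncEnumerable<T>` reader can yield one row at a time instead of buffering a whole file. It is an illustration under stated assumptions, not Unio's reader: `StreamingCsv` and `ReadRowsAsync` are hypothetical names, and the comma split deliberately ignores quoted fields.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of Unio.
public static class StreamingCsv
{
    // Yields each row as soon as its line has been read, so memory use
    // depends on the longest line rather than on the file size.
    public static async IAsyncEnumerable<string[]> ReadRowsAsync(
        TextReader reader,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // ReadLineAsync returns null at end of input; "is { } line" stops there.
        while (await reader.ReadLineAsync() is { } line)
        {
            cancellationToken.ThrowIfCancellationRequested();
            if (line.Length == 0) continue; // skip blank lines

            // Naive split: quoted fields containing commas are not handled.
            yield return line.Split(',');
        }
    }
}
```

Consuming it with `await foreach (var row in StreamingCsv.ReadRowsAsync(reader))` pulls rows on demand, which is the same consumption pattern as the `ExtractAsync<Invoice>` example earlier in this README.
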
Add standard [Required], [Range], [StringLength] attributes to your model, then:

```csharp
using Unio.Validation;

var result = unio.ExtractWithErrors<Invoice>("invoices.csv", opt =>
{
    opt.UseDataAnnotationValidation();
});

Console.WriteLine($"Valid: {result.SuccessCount}, Invalid: {result.ErrorCount}");
foreach (var error in result.Errors)
    Console.WriteLine($"  Row {error.RowNumber}: {error.Message}");
```

Alternatively, define the rules fluently:

```csharp
using Unio.Validation;

var validator = new FluentRecordValidator<Invoice>()
    .RuleFor(x => x.Amount, v => v.GreaterThan(0m).LessThan(1_000_000m))
    .RuleFor(x => x.InvoiceNo, v => v.NotEmpty().MaxLength(20))
    .RuleFor(x => x.DueDate, v => v.NotDefault());

var result = unio.ExtractWithErrors<Invoice>("invoices.csv", opt =>
{
    opt.UseFluentValidation(validator);
});
// result.Records -- valid records only
// result.Errors  -- validation failures with row numbers
```

To validate records that have already been extracted:

```csharp
var records = unio.Extract<Invoice>("invoices.csv");
var batch = new DataAnnotationValidator().ValidateAll(records);
// batch.Valid   -- records that passed
// batch.Invalid -- records that failed
// batch.Errors  -- all validation errors with details
```

Collect all errors and get valid records:

```csharp
var result = unio.ExtractWithErrors<Invoice>("data.csv");
// result.Records   -- valid records
// result.Errors    -- all errors with row numbers
// result.HasErrors -- quick check

// Configure error handling mode (pick one; a later assignment overrides an earlier one)
var data = unio.Extract<Invoice>("data.csv", opt =>
{
    opt.OnError = ErrorHandling.ThrowOnFirst;          // Throw immediately
    // opt.OnError = ErrorHandling.SkipAndContinue;    // Skip bad rows
    // opt.OnError = ErrorHandling.CollectAndContinue; // Collect errors
});
```

```
File / Stream
     |
     v
FormatDetector       Detects format via magic bytes + extension
     |
     v
IDataExtractor<T>    Format-specific reader (CSV, XLSX, PDF...)
     |
     v
TypeMapper           Maps columns to properties via attributes
     |
     v
ValidationEngine     Validates using DataAnnotations
     |
     v
IEnumerable<T>       Your strongly-typed data, ready to use
```

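The mapping and validation stages of the pipeline above can be sketched with reflection and the standard `System.ComponentModel.DataAnnotations.Validator`. This is a rough illustration of the idea, not Unio's internals: `ColumnNameAttribute`, `MiniPipeline`, and `SampleInvoice` are hypothetical names, rows are assumed to arrive as string dictionaries keyed by header, and conversion is limited to what `Convert.ChangeType` supports (no nullable or enum properties).

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Globalization;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for a column-mapping attribute; not Unio's [Column].
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnNameAttribute : Attribute
{
    public string Name { get; }
    public ColumnNameAttribute(string name) => Name = name;
}

// Hypothetical sample model used only by this sketch.
public class SampleInvoice
{
    [ColumnName("Invoice Number"), Required]
    public string InvoiceNo { get; set; } = "";

    [ColumnName("Total Amount"), Range(0.01, 1_000_000)]
    public decimal Amount { get; set; }
}

public static class MiniPipeline
{
    // Maps each row to T, validates it, and collects errors with
    // 1-based row numbers instead of stopping at the first failure.
    public static (List<T> Records, List<(int RowNumber, string Message)> Errors) Run<T>(
        IEnumerable<IReadOnlyDictionary<string, string>> rows) where T : class, new()
    {
        var records = new List<T>();
        var errors = new List<(int RowNumber, string Message)>();
        var props = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance)
                             .Where(p => p.CanWrite)
                             .ToArray();
        int rowNumber = 0;

        foreach (var row in rows)
        {
            rowNumber++;
            var item = new T();
            try
            {
                // Mapping stage: the attribute name wins, else the property name.
                foreach (var p in props)
                {
                    var column = p.GetCustomAttribute<ColumnNameAttribute>()?.Name ?? p.Name;
                    if (row.TryGetValue(column, out var raw) && !string.IsNullOrEmpty(raw))
                        p.SetValue(item, Convert.ChangeType(raw, p.PropertyType, CultureInfo.InvariantCulture));
                }
            }
            catch (Exception ex) when (ex is FormatException || ex is InvalidCastException || ex is OverflowException)
            {
                errors.Add((rowNumber, ex.Message)); // conversion failed: record and move on
                continue;
            }

            // Validation stage: standard DataAnnotations checks.
            var results = new List<ValidationResult>();
            if (Validator.TryValidateObject(item, new ValidationContext(item), results, validateAllProperties: true))
                records.Add(item);
            else
                errors.AddRange(results.Select(r => (rowNumber, r.ErrorMessage ?? "Validation failed")));
        }
        return (records, errors);
    }
}
```

Calling `MiniPipeline.Run<SampleInvoice>(rows)` returns the valid records and the per-row errors side by side, which mirrors the collect-and-continue behaviour described in the error handling examples.
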
Full documentation and guides are available in the Wiki.
Contributions are welcome. See CONTRIBUTING.md for guidelines.

```bash
git clone https://github.com/Clifftech123/unio.git
cd unio
dotnet build
dotnet test
```

MIT -- Isaiah Clifford Opoku