Lavoisier Documentation

Welcome to the comprehensive documentation for the Lavoisier mass spectrometry analysis framework. Lavoisier is a high-performance computing framework that combines numerical and visual processing methods with integrated artificial intelligence modules for automated compound identification and structural elucidation.

🎯 NEW: Precursor Framework

The Precursor module introduces a new approach to mass spectrometry analysis built on S-Entropy Coordinates and Virtual Instruments.

Key Innovations

  • S-Entropy Coordinates: 3D categorical coordinate system (S_knowledge, S_time, S_entropy) for platform-independent spectral representation
  • Ion-to-Droplet Computer Vision: Bijective transformation of mass spectra into thermodynamic droplet images
  • Virtual Instrument Ensemble: Hardware-grounded virtual mass spectrometers with phase-lock networks
  • Molecular Maxwell Demon: Information-theoretic fragmentation analysis using categorical states
  • Categorical Completion: Gap-filling and annotation through S-entropy trajectory analysis

Precursor Pipeline

Spectral Acquisition → S-Entropy Transform → Computer Vision
                                              ↓
Virtual Instruments ← Categorical Completion ← BMD Grounding

Validated on UC Davis Metabolomics Dataset

The Precursor framework has been validated on the UC Davis metabolomics dataset (10 mzML files, ~16,000 spectra total), demonstrating:

  • S-Entropy transformation: 800+ spectra/second
  • Physics-validated droplet conversion: 50-100 spectra/second
  • Cross-platform categorical consistency: Coherence > 0.85
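
As a rough sanity check on these figures, the sketch below estimates wall-clock time for the whole dataset from the quoted throughputs. It assumes throughput stays constant across files and ignores file I/O; the helper name `estimated_seconds` is illustrative and not part of Lavoisier.

```python
def estimated_seconds(n_spectra: int, spectra_per_second: float) -> float:
    """Time needed to process n_spectra at a constant throughput."""
    if spectra_per_second <= 0:
        raise ValueError("throughput must be positive")
    return n_spectra / spectra_per_second


N_SPECTRA = 16_000  # approximate dataset total quoted above

# "800+" is treated as a lower bound on throughput, so this is an upper bound on time.
sentropy_s = estimated_seconds(N_SPECTRA, 800)
droplet_slow_s = estimated_seconds(N_SPECTRA, 50)
droplet_fast_s = estimated_seconds(N_SPECTRA, 100)

print(f"S-Entropy transform: at most ~{sentropy_s:.0f} s")
print(f"Droplet conversion: ~{droplet_fast_s:.0f} to {droplet_slow_s:.0f} s")
```

Under these assumptions the S-Entropy stage needs about 20 seconds and droplet conversion roughly 160 to 320 seconds, so the computer-vision stage dominates the runtime.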

🎯 Buhera Scripting Language

Lavoisier includes Buhera, a domain-specific scripting language that transforms mass spectrometry analysis by encoding the scientific method as executable scripts.

Buhera Documentation

Key Buhera Features

  • 🎯 Objective-First Analysis: Scripts declare explicit scientific goals before execution
  • ✅ Pre-flight Validation: Catch experimental flaws before wasting time and resources
  • 🧠 Goal-Directed AI: Bayesian evidence networks optimized for specific objectives
  • 🔬 Scientific Rigor: Enforced statistical requirements and biological coherence

Core Framework

System Architecture & Installation

AI Modules & Intelligence

Analysis Pipelines

Development & Integration

Quick Start Guide

1. Precursor Analysis (NEW!)

from precursor.src.core.SpectraReader import extract_mzml
from precursor.src.core.EntropyTransformation import SEntropyTransformer
from precursor.src.core.IonToDropletConverter import IonToDropletConverter

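# Load data
scan_info, spectra, xic = extract_mzml("your_data.mzML")

# Transform to S-Entropy coordinates.
# mz and intensity are the m/z and intensity arrays of a single spectrum
# selected from `spectra`; they are not defined by the lines above.
transformer = SEntropyTransformer()
coords, matrix = transformer.transform_spectrum(mz, intensity)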

# Generate droplet images with physics validation
converter = IonToDropletConverter(resolution=(512, 512))
image, droplets = converter.convert_spectrum_to_image(mz, intensity)

# Access categorical coordinates
for droplet in droplets:
    s_k = droplet.s_entropy_coords.s_knowledge  # Structural knowledge
    s_t = droplet.s_entropy_coords.s_time       # Temporal position
    s_e = droplet.s_entropy_coords.s_entropy    # Thermodynamic entropy
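
To work with the coordinates in bulk, the per-droplet attributes can be gathered into plain tuples. The sketch below relies only on the attribute names used in the loop above (`s_entropy_coords.s_knowledge`, `.s_time`, `.s_entropy`). The helper `coords_to_rows` and the stand-in objects built with `SimpleNamespace` are hypothetical; they only mimic real droplets so that the example runs without Lavoisier installed.

```python
from types import SimpleNamespace


def coords_to_rows(droplets):
    """Return one (s_knowledge, s_time, s_entropy) tuple per droplet."""
    rows = []
    for droplet in droplets:
        c = droplet.s_entropy_coords
        rows.append((c.s_knowledge, c.s_time, c.s_entropy))
    return rows


# Stand-in droplets with made-up values, for illustration only.
fake_droplets = [
    SimpleNamespace(
        s_entropy_coords=SimpleNamespace(s_knowledge=0.2, s_time=0.5, s_entropy=0.9)
    ),
    SimpleNamespace(
        s_entropy_coords=SimpleNamespace(s_knowledge=0.7, s_time=0.1, s_entropy=0.3)
    ),
]

rows = coords_to_rows(fake_droplets)
print(rows)
```

With real data, the `droplets` list returned by `convert_spectrum_to_image` would take the place of `fake_droplets`, assuming its elements expose the same attributes as in the loop above.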

2. Virtual Instrument Ensemble

from precursor.src.virtual import VirtualMassSpecEnsemble

# Create ensemble with all instruments
ensemble = VirtualMassSpecEnsemble(
    enable_all_instruments=True,
    enable_hardware_grounding=True
)

# Measure with cross-platform consensus.
# mz and intensity are the arrays for one spectrum; rt is its retention time.
result = ensemble.measure_spectrum(mz, intensity, rt)
print(f"Phase-locks: {result.total_phase_locks}")
print(f"Convergence: {result.convergence_nodes_count}")

3. Complete Pipeline

cd precursor

# Run complete analysis on UC Davis dataset
python run_ucdavis_complete_analysis.py

# Or resume from Stage 2B (faster)
python run_ucdavis_resume.py

4. Buhera Script Analysis

# Build Buhera language
cd lavoisier-buhera && cargo build --release

# Create and execute a script
buhera validate biomarker_discovery.bh
buhera execute biomarker_discovery.bh

Pipeline Results Structure

After running Precursor analysis:

results/
├── ucdavis_complete_analysis/
│   ├── {file_name}/
│   │   ├── stage_01_preprocessing/
│   │   │   ├── scan_info.csv
│   │   │   └── spectra/
│   │   ├── stage_02_sentropy/
│   │   │   ├── sentropy_features.csv
│   │   │   └── matrices/
│   │   ├── stage_02_cv/
│   │   │   ├── images/        # Droplet images
│   │   │   └── droplets/      # Physics-validated data
│   │   ├── stage_02_5_fragmentation/
│   │   ├── stage_03_bmd/
│   │   ├── stage_04_completion/
│   │   └── stage_05_virtual/
│   └── analysis_summary.csv
└── visualizations/
    ├── entropy_space/
    ├── molecular_language/
    └── phase_lock/
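
To check which stages were written for each input file, the layout above can be walked with the standard library. This is a minimal sketch that assumes the directory names shown in the tree; `list_stage_dirs` is an illustrative helper, not a function shipped with Lavoisier.

```python
from pathlib import Path


def list_stage_dirs(analysis_root):
    """Map each per-file directory to its sorted stage_* subdirectory names."""
    root = Path(analysis_root)
    stages = {}
    for file_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        stages[file_dir.name] = sorted(
            d.name
            for d in file_dir.iterdir()
            if d.is_dir() and d.name.startswith("stage_")
        )
    return stages


if __name__ == "__main__":
    root = Path("results/ucdavis_complete_analysis")
    if root.is_dir():
        for name, stage_names in list_stage_dirs(root).items():
            print(name, stage_names)
    else:
        print(f"{root} not found; run the pipeline first")
```

Stage names are sorted lexicographically, which need not match execution order: `stage_02_5_fragmentation` sorts before `stage_02_cv`, for example. Files such as `analysis_summary.csv` are skipped because only directories are inspected.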

Use Cases

🔬 Scientific Research

  • Metabolomics: S-Entropy coordinate analysis for metabolite identification
  • Proteomics: Fragmentation pattern analysis with categorical completion
  • Biomarker Discovery: Virtual instrument consensus for robust markers
  • Cross-Platform Studies: Platform-independent categorical representation

🤖 Computer Vision

  • Ion-to-Droplet Conversion: Thermodynamic image generation
  • Physics Validation: Navier-Stokes constrained droplet parameters
  • Multi-Modal Analysis: Spectral + visual feature fusion

🔗 Virtual Instruments

  • Ensemble Consensus: Multi-instrument agreement scoring
  • Hardware Grounding: Reality validation through oscillation harvesting
  • Phase-Lock Networks: Molecular ensemble detection

Publications

The framework is documented in several publications under precursor/publication/:

  • S-Entropy Coordinates: Categorical coordinate system for mass spectrometry
  • Ion-to-Droplet Computer Vision: Bijective thermodynamic image generation
  • Virtual Instruments: Hardware-grounded virtual mass spectrometers
  • Molecular Language: Categorical amino acid alphabet and fragmentation grammar

Contributing

We welcome contributions to:

  1. Precursor Framework: S-Entropy, virtual instruments, computer vision
  2. Buhera Language: Rust-based language implementation
  3. Documentation: Tutorials, examples, and best practices
  4. Validation: Test cases and benchmarking datasets

See our implementation roadmap for current development priorities.

Community

  • GitHub: lavoisier
  • Issues: Report bugs and request features
  • Discussions: Share use cases and get help

License

Lavoisier is released under the MIT License. See LICENSE file for details.


“Only the extraordinary can beget the extraordinary” - Antoine Lavoisier

Transform your mass spectrometry analysis with S-Entropy coordinates and virtual instruments.


Copyright © 2024 Lavoisier Project. Distributed under the MIT License.
