Heihachi Neural Audio Analysis Framework

A revolutionary audio analysis framework featuring fire-based emotional querying, Autobahn biological intelligence integration, and Rust-powered performance, combining neurological models with consciousness-aware audio generation.

"What makes a tiger so strong is that it lacks humanity"
Heihachi Framework Capabilities

🔥 Revolutionary Fire-Based Emotion Interface

Groundbreaking fire-based emotional querying system that taps into humanity's deepest cognitive patterns. Users create and manipulate digital fire through an intuitive WebGL interface, which the system "understands" using advanced AI reconstruction techniques.

The Science Behind Fire-Emotion Mapping

The interface draws on research into human consciousness and fire recognition: fire represents humanity's first and most fundamental abstraction, deeply embedded in our neural architecture, and fire recognition activates the same neural networks as consciousness itself.

  • 🔥 Digital Fire Creation: Intuitive WebGL interface for creating and manipulating fire with real-time physics simulation
  • 🧠 Pakati Understanding Engine: AI system that "learns" fire patterns by reconstructing them from partial information
  • 🎵 Direct Audio Generation: Converting understood fire patterns into music that matches the emotional content

How It Works

  1. Fire Creation: Users interact with the WebGL interface to create, maintain, and modify digital fire
  2. Pattern Capture: The system captures fire characteristics (intensity, color, movement, structure)
  3. AI Understanding: Pakati reconstructs the fire from partial information to prove comprehension
  4. Audio Generation: Understood patterns drive Heihachi's audio synthesis engines
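
As a sketch of this data flow, the captured fire characteristics can be treated as a small feature record that is later mapped to synthesis parameters. The FirePattern fields and mapping weights below are illustrative placeholders, not the framework's actual schema:

from dataclasses import dataclass

@dataclass
class FirePattern:
    """Illustrative container for captured fire characteristics (step 2)."""
    intensity: float    # 0.0-1.0 overall flame energy
    color_temp: float   # 0.0 (deep red) to 1.0 (blue-white)
    movement: float     # 0.0 (still) to 1.0 (chaotic flicker)
    structure: float    # 0.0 (diffuse) to 1.0 (tightly shaped)

def fire_to_audio_params(p: FirePattern) -> dict:
    """Map fire characteristics to example synthesis parameters (step 4)."""
    return {
        "energy": p.intensity,
        "brightness": 0.5 * p.color_temp + 0.5 * p.intensity,
        "rhythmic_density": p.movement,
        "reverb_amount": 1.0 - p.structure,
    }

params = fire_to_audio_params(FirePattern(intensity=0.8, color_temp=0.6, movement=0.7, structure=0.4))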

🧠 Autobahn Integration: Delegated Probabilistic Reasoning

A revolutionary delegation architecture in which all probabilistic tasks, Bayesian inference, biological intelligence, and consciousness modeling are handed off to the Autobahn oscillatory bio-metabolic RAG system.

Autobahn System Overview

  • 🧬 12 Theoretical Frameworks: Including fire-evolved consciousness substrate and biological intelligence architectures
  • ⚡ Oscillatory Bio-Metabolic Processing: 3-layer architecture with ATP-driven metabolic computation
  • 🧘 Consciousness Emergence Modeling: Real-time IIT Φ calculation for consciousness quantification
  • 📊 Advanced Uncertainty Quantification: Sophisticated Bayesian inference and fuzzy logic processing

Integration Benefits

Performance Optimization

  • Fire Pattern Analysis: <10ms (Autobahn oscillatory processing)
  • Audio Optimization: <20ms (Autobahn Bayesian inference)
  • Consciousness Calculation: <15ms (Autobahn IIT Φ)
  • End-to-End Latency: <50ms (Rust + Autobahn delegation)

Scientific Foundation

  • Biological Intelligence: Membrane processing with ion channel coherence
  • Consciousness Modeling: IIT-based Φ calculation for awareness quantification
  • Metabolic Computation: ATP-driven processing with multiple metabolic modes
  • Uncertainty Handling: Explicit modeling in all probabilistic operations

Delegation Architecture

Heihachi → Autobahn → Audio Output

  • Heihachi: Fire Interface, Pakati Engine, Rust Audio Core
  • Autobahn: Fire Pattern Analysis, Consciousness Modeling (IIT Φ), Bayesian Optimization
  • Audio Output: Optimized Audio Generation, Consciousness-Informed Synthesis, Real-time Performance
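
In code, the delegation boundary can be as thin as an HTTP call from the Heihachi front end to a running Autobahn service. The endpoint path, payload fields, and response keys below are hypothetical placeholders for whatever interface a given Autobahn deployment actually exposes:

import requests

AUTOBAHN_URL = "http://localhost:8080"  # hypothetical address of the Autobahn service

def delegate_fire_analysis(fire_pattern: dict, timeout: float = 0.05) -> dict:
    """Send a captured fire pattern to Autobahn and return its probabilistic analysis.

    The /analyze route and the response fields are assumptions for illustration;
    the timeout is chosen to respect the <50ms end-to-end latency target.
    """
    response = requests.post(
        f"{AUTOBAHN_URL}/analyze",
        json={"fire_pattern": fire_pattern},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()  # e.g. {"phi": ..., "audio_params": ...}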

FeltBeats: Music Discovery by Feeling

Transforming Heihachi into a revolutionary music listening application where users discover music by describing emotions and feelings - powered by academic research and continuous learning.

Discover Music by Feeling

Instead of searching by genre or artist, describe how you want to feel: "I want something dark and atmospheric with building tension" or "Find me energetic tracks with complex rhythms and heavy bass."

"I want to feel mysterious and anticipatory"
→ Atmospheric intro sections with filtered breaks and sparse percussion
"Find energetic sections with aggressive basslines"
→ Peak moments with layered percussion and bass stacking
"Something technical but spacious"
→ Complex drum patterns with reverb-heavy atmospheric elements

Dual LLM Architecture

Academic Knowledge LLM

Trained on ~100 scientific publications covering music perception, emotion, and drum & bass production. Provides deep theoretical understanding of how music affects emotions and neural processing.

Scientific Foundation, Music Perception, Emotional Response

Continuous Learning LLM

Builds domain expertise by continuously analyzing new mixes. Each analysis becomes training data, creating an ever-growing understanding of electronic music patterns and emotional characteristics.

Adaptive Learning, Mix Analysis, Pattern Recognition

REST API

Comprehensive REST API for integrating Heihachi's audio analysis capabilities into web applications, mobile apps, and other systems. Supports both synchronous and asynchronous processing.

  • 🚀 Fast & Scalable: Asynchronous job processing with configurable concurrency limits and rate limiting
  • 🔧 Easy Integration: RESTful endpoints with comprehensive documentation and client examples
  • 🎯 Specialized Analysis: Dedicated endpoints for beats, drums, stems, emotions, and semantic search

Quick Start

# Install API dependencies
pip install flask flask-cors flask-limiter

# Start the API server
python api_server.py --host 0.0.0.0 --port 5000

# Or with production settings
python api_server.py --production --config-path configs/production.yaml

# Analyze audio file with emotion mapping
curl -X POST http://localhost:5000/api/v1/semantic/analyze \
  -F "file=@track.wav" \
  -F "include_emotions=true" \
  -F "index_for_search=true"

# Extract beats from audio
curl -X POST http://localhost:5000/api/v1/beats \
  -F "file=@track.mp3"

# Search indexed tracks semantically
curl -X POST http://localhost:5000/api/v1/semantic/search \
  -H "Content-Type: application/json" \
  -d '{"query": "dark aggressive neurofunk", "top_k": 5}'

import requests

# Extract emotional features
def extract_emotions(file_path):
    url = "http://localhost:5000/api/v1/semantic/emotions"
    with open(file_path, 'rb') as f:
        files = {'file': f}
        response = requests.post(url, files=files)
        return response.json()

# Example usage
emotions = extract_emotions("track.wav")
print(f"Dominant emotion: {emotions['summary']['dominant_emotion']}")
print(f"Energy level: {emotions['emotions']['energy']:.1f}/10")

// Analyze audio file
async function analyzeAudio(file) {
    const formData = new FormData();
    formData.append('file', file);
    formData.append('include_emotions', 'true');
    
    const response = await fetch('/api/v1/semantic/analyze', {
        method: 'POST',
        body: formData
    });
    
    return await response.json();
}

// Usage with file input
const fileInput = document.getElementById('audio-file');
fileInput.addEventListener('change', async (e) => {
    const result = await analyzeAudio(e.target.files[0]);
    console.log('Analysis result:', result);
});

Available Endpoints

Audio Analysis

  • POST /api/v1/analyze: Full audio analysis pipeline
  • POST /api/v1/features: Extract audio features
  • POST /api/v1/beats: Beat and tempo detection
  • POST /api/v1/drums: Drum pattern analysis
  • POST /api/v1/stems: Audio stem separation

Semantic Analysis

  • POST /api/v1/semantic/analyze: Semantic analysis with emotions
  • POST /api/v1/semantic/emotions: Extract emotional features
  • POST /api/v1/semantic/search: Semantic track search
  • POST /api/v1/semantic/text-analysis: Text sentiment analysis

Job Management

  • POST /api/v1/batch-analyze: Batch process multiple files
  • GET /api/v1/jobs/{id}: Get job status and results
  • GET /api/v1/jobs: List all jobs (paginated)
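
For long-running work, batch jobs can be submitted and then polled via the job endpoints above. The snippet below is a minimal sketch; the multipart field name, job_id key, and status values are assumptions about the response payload rather than documented fields:

import time
import requests

BASE = "http://localhost:5000/api/v1"

def submit_batch(paths):
    """Submit several audio files for batch analysis and return the job id."""
    files = [("files", open(p, "rb")) for p in paths]  # assumed field name
    try:
        resp = requests.post(f"{BASE}/batch-analyze", files=files)
        resp.raise_for_status()
        return resp.json()["job_id"]  # assumed response key
    finally:
        for _, fh in files:
            fh.close()

def wait_for_job(job_id, poll_interval=2.0):
    """Poll GET /jobs/{id} until the job finishes, then return the payload."""
    while True:
        job = requests.get(f"{BASE}/jobs/{job_id}").json()
        if job.get("status") in ("completed", "failed"):  # assumed status values
            return job
        time.sleep(poll_interval)

result = wait_for_job(submit_batch(["track1.wav", "track2.wav"]))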

Analysis Results Visualization

The API returns detailed analysis data that can be visualized to understand track characteristics and emotional profiles. These visualizations help developers integrate meaningful insights into their applications.

Drum Element Distribution

Shows the proportion of different drum elements, contributing to groove and rhythm characteristics.

Velocity-Confidence Analysis

Correlates drum hit confidence with velocity, indicating playing dynamics and energy levels.

Semantic Analysis

Transform raw audio features into meaningful emotional dimensions and enable intelligent music discovery through semantic search and natural language queries.

Emotional Feature Mapping

Heihachi maps technical audio features to 9 distinct emotional dimensions using scientifically grounded algorithms that correlate spectral, rhythmic, and temporal characteristics with human emotional perception.

  • Energy (8.5): Loudness, tempo, and drum intensity
  • Brightness (6.5): Spectral centroid and high-frequency content
  • Tension (7.5): Dissonance and rhythmic complexity
  • Warmth (4.5): Low-mid energy and harmonic richness
  • Groove (9.0): Microtiming and syncopation quality
  • Aggression (8.0): Transient sharpness and distortion
  • Atmosphere (7.0): Reverb amount and stereo width
  • Melancholy (3.5): Minor key and sparse arrangement
  • Euphoria (5.5): Major key and uplifting progressions
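
As an illustration of how such a mapping can work, here is a toy scoring function for the Energy dimension. The weights and normalisation ranges are placeholders rather than Heihachi's calibrated coefficients:

import numpy as np

def energy_score(rms_loudness, tempo_bpm, drum_onset_rate):
    """Toy 0-10 energy score combining loudness, tempo, and drum intensity."""
    loudness = np.clip(rms_loudness, 0.0, 1.0)              # normalised RMS level
    tempo = np.clip((tempo_bpm - 60.0) / 120.0, 0.0, 1.0)   # map 60-180 BPM to 0-1
    drums = np.clip(drum_onset_rate / 10.0, 0.0, 1.0)       # drum onsets per second
    return 10.0 * (0.4 * loudness + 0.3 * tempo + 0.3 * drums)

print(round(energy_score(0.8, 174, 9.0), 1))  # dense neurofunk section -> high score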

Technical Foundation

Semantic analysis builds upon detailed drum pattern analysis and feature extraction to create comprehensive emotional profiles. The system processes complex rhythmic patterns and translates them into meaningful emotional dimensions.

Drum Pattern Analysis Heatmap

Drum pattern analysis visualization showing the temporal distribution of different drum elements, which feeds into the emotional mapping algorithms to determine groove, energy, and tension characteristics.

Drum Density Over Time

Drum density analysis over time - high-density regions contribute to energy and aggression scores, while sparse sections indicate atmospheric or melancholic characteristics.

Command Line Integration

Semantic analysis capabilities are fully integrated into the Heihachi CLI for streamlined workflows.

Extract emotions: python -m src.main semantic emotions track.wav
Index for search: python -m src.main semantic index audio_folder/ --artist "Artist" --title "Track"
Search tracks: python -m src.main semantic search "atmospheric intro with tension building"
View statistics: python -m src.main semantic stats

Emotional Analysis Output

Transform raw audio analysis into structured, emotion-focused data that powers feeling-based music discovery and LLM training.

Analysis Structure

mix_analysis/
  mix_001/
    metadata.json              # Basic mix info
    summary.txt                # Human-readable summary
    segments.json              # Track segments with timestamps
    emotional_profile.json     # Emotional characteristics
    technical_features.jsonl   # LLM-friendly features

emotional_profile.json

{
  "overall_mood": ["dark", "energetic", "technical"],
  "intensity_curve": [0.4, 0.5, 0.7, 0.8, 0.75, 0.9, 0.85, 0.7],
  "emotional_segments": [
    {
      "start_time": 0,
      "end_time": 390.0,
      "primary_emotion": "atmospheric",
      "tension_level": 0.4,
      "descriptors": ["spacious", "anticipatory", "mysterious"]
    }
  ],
  "peak_moments": [
    {
      "time": 870.5,
      "intensity": 0.92,
      "description": "Maximum energy with layered percussion and aggressive bassline",
      "key_elements": ["double_drops", "bass_stacking", "drum_fills"]
    }
  ]
}
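
Since the profile is plain JSON, downstream tools can consume it directly. A minimal example that loads the file and prints the peak moments (assuming the directory layout shown above):

import json
from pathlib import Path

profile = json.loads(Path("mix_analysis/mix_001/emotional_profile.json").read_text())

for peak in profile["peak_moments"]:
    minutes, seconds = divmod(int(peak["time"]), 60)
    print(f"{minutes}:{seconds:02d}  intensity {peak['intensity']:.2f}  {peak['description']}")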

segments.json

[
  {
    "segment_id": "s001",
    "start_time": 0,
    "end_time": 198.5,
    "type": "intro",
    "energy_level": 0.45,
    "key_elements": ["atmospheric_pads", "filtered_breaks", "sparse_percussion"],
    "description": "Atmospheric intro with filtered breaks and sparse percussion"
  },
  {
    "segment_id": "s002", 
    "start_time": 198.5,
    "end_time": 390.0,
    "type": "build",
    "energy_level": 0.68,
    "key_elements": ["rolling_bassline", "amen_break", "rising_synths"],
    "description": "Energy building section with rolling bassline and classic amen breaks"
  }
]

technical_features.jsonl

{"time": 0, "feature_type": "bass", "description": "Sub-heavy reese bass with moderate distortion and 120Hz fundamental", "characteristics": {"distortion": 0.35, "width": 0.7, "sub_weight": 0.8}}
{"time": 0, "feature_type": "drums", "description": "Broken beat pattern with ghost notes and 16th hi-hats", "characteristics": {"complexity": 0.65, "velocity_variation": 0.4, "swing": 0.2}}
{"time": 0, "feature_type": "atmosphere", "description": "Reverb-heavy pads with 6-8kHz air frequencies", "characteristics": {"reverb_size": 0.85, "density": 0.3, "brightness": 0.5}}
{"time": 198.5, "feature_type": "transition", "description": "Filter sweep transition with drum roll buildup", "characteristics": {"length_bars": 8, "smoothness": 0.7, "energy_change": 0.25}}

summary.txt

This 60-minute neurofunk mix features 24 tracks with consistent energy throughout. 
The mix begins with atmospheric elements at 174 BPM before transitioning to 
heavier sections at 6:30. Notable sections include an extended bass sequence 
from 18:20-22:45 featuring time-stretched Amen breaks and layered Reese basses. 
The final third introduces more percussive elements with complex drum patterns 
and syncopated rhythms. Energy peaks occur at 14:30, 28:15, and 52:40.

Overview

  • 🧠 Neural Foundation: Built upon established neuroscientific research on rhythm processing and motor-auditory coupling
  • 🎵 Genre Specialization: Optimized for electronic music analysis with a focus on neurofunk and drum & bass
  • ⚡ High Performance: Memory-optimized processing, parallel execution, and GPU acceleration
  • 🤖 AI Integration: HuggingFace models for advanced feature extraction and neural processing

Theoretical Foundation

Neural Basis of Rhythm Processing

The framework is built upon established neuroscientific research demonstrating that humans possess an inherent ability to synchronize motor responses with external rhythmic stimuli. This phenomenon, known as beat-based timing, involves complex interactions between auditory and motor systems in the brain.

Key Neural Mechanisms

  • Beat-based Timing Networks: Basal ganglia-thalamocortical circuits, supplementary motor area (SMA), premotor cortex (PMC)
  • Temporal Processing Systems: Duration-based timing mechanisms, beat-based timing mechanisms, motor-auditory feedback loops

Motor-Auditory Coupling

Research has shown that low-frequency neural oscillations from motor planning areas guide auditory sampling, expressed through coherence measures:

$$C_{xy}(f) = \frac{|S_{xy}(f)|^2}{S_{xx}(f)S_{yy}(f)}$$

Where:

  • $C_{xy}(f)$ represents coherence at frequency $f$
  • $S_{xy}(f)$ is the cross-spectral density
  • $S_{xx}(f)$ and $S_{yy}(f)$ are auto-spectral densities
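
For reference, the same coherence measure can be estimated from two recorded signals with Welch's method. The snippet below uses scipy on synthetic data and is purely illustrative; it is not part of the Heihachi API:

import numpy as np
from scipy.signal import coherence

fs = 1000.0                          # sample rate in Hz
t = np.arange(0, 10, 1 / fs)
motor = np.sin(2 * np.pi * 2.0 * t) + 0.5 * np.random.randn(t.size)
audio = np.sin(2 * np.pi * 2.0 * t + 0.3) + 0.5 * np.random.randn(t.size)

# C_xy(f) = |S_xy(f)|^2 / (S_xx(f) S_yy(f)), estimated with Welch periodograms
f, Cxy = coherence(motor, audio, fs=fs, nperseg=1024)
print(f"Peak coherence {Cxy.max():.2f} at {f[Cxy.argmax()]:.2f} Hz")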

Mathematical Framework

Spectral Decomposition

$$X(k) = \sum_{n=0}^{N-1} x(n)e^{-j2\pi kn/N}$$

Groove Pattern Analysis

$$MT(n) = \frac{1}{K}\sum_{k=1}^{K} |t_k(n) - t_{ref}(n)|$$

Amen Break Detection

$$S_{amen}(t) = \sum_{f} w(f)|X(f,t) - A(f)|^2$$

Reese Bass Analysis

$$R(t,f) = \left|\sum_{k=1}^{K} A_k(t)e^{j\phi_k(t)}\right|^2$$
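
As a worked example of the groove formula above, the microtiming deviation MT(n) is simply the mean absolute offset between detected onsets and their reference grid positions. A toy computation (not Heihachi's internal implementation):

import numpy as np

def microtiming_deviation(onsets, grid):
    """MT = (1/K) * sum_k |t_k - t_ref|, the mean absolute deviation from the grid."""
    return float(np.mean(np.abs(np.asarray(onsets) - np.asarray(grid))))

# Hi-hat onsets (seconds) versus a 16th-note grid at 174 BPM
sixteenth = 60.0 / 174.0 / 4.0
grid = np.arange(8) * sixteenth
played = grid + np.array([0.0, 4e-3, -2e-3, 6e-3, 1e-3, 5e-3, -3e-3, 2e-3])
print(f"Average microtiming deviation: {microtiming_deviation(played, grid) * 1000:.1f} ms")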

Core Features

Rhythmic Analysis

  • Automated drum pattern recognition
  • Groove quantification
  • Microtiming analysis
  • Syncopation detection

Spectral Analysis

  • Multi-band decomposition
  • Harmonic tracking
  • Timbral feature extraction
  • Sub-bass characterization

Component Analysis

  • Sound source separation
  • Transformation detection
  • Energy distribution analysis
  • Component relationship mapping

Detailed Capabilities Breakdown

Comprehensive breakdown of Heihachi's multi-dimensional analysis capabilities

Amen Break Analysis

  • Pattern matching and variation detection
  • Transformation identification
  • Groove characteristic extraction
  • VIP/Dubplate classification
  • Robust onset envelope extraction

Prior Subspace Analysis

  • Neurofunk-specific component separation
  • Bass sound design analysis
  • Effect chain detection
  • Temporal structure analysis

Composite Similarity

  • Multi-band similarity computation
  • Transformation-aware comparison
  • Groove-based alignment
  • Confidence scoring

Peak Detection

  • Multi-band onset detection
  • Adaptive thresholding
  • Feature-based peak classification
  • Confidence scoring

Segment Clustering

  • Pattern-based segmentation
  • Hierarchical clustering
  • Relationship analysis
  • Transition detection

Transition Detection

  • Mix point identification
  • Blend type classification
  • Energy flow analysis
  • Structure boundary detection

Memory Management

  • Streaming processing for large files
  • Efficient cache utilization
  • GPU memory optimization
  • Automatic garbage collection

Parallel Processing

  • Multi-threaded feature extraction
  • Batch processing capabilities
  • Distributed analysis support
  • Adaptive resource allocation

Storage Efficiency

  • Compressed result storage
  • Metadata indexing
  • Version control for analysis results
  • Scalable parallel execution
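
The streaming approach mentioned under Memory Management can be illustrated with a block-wise pass over a long mix. This sketch uses the soundfile library as an assumption for I/O; it is not the framework's actual streaming pipeline:

import numpy as np
import soundfile as sf

def stream_rms(path, blocksize=262144):
    """Yield per-block RMS levels without loading the whole file into memory."""
    for block in sf.blocks(path, blocksize=blocksize, dtype="float32"):
        mono = block.mean(axis=1) if block.ndim > 1 else block
        yield float(np.sqrt(np.mean(mono ** 2)))

levels = list(stream_rms("long_mix.wav"))  # coarse energy envelope of the mix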

AI Model Integration

Heihachi integrates specialized HuggingFace models for advanced neural audio processing, each selected for electronic music analysis tasks.

Core Feature Extraction

  • Microsoft BEATs (High Priority): Bidirectional ViT-style encoder trained with acoustic tokenizers, providing 768-d latent embeddings at ~20 ms hop length (Spectral Analysis, Temporal Analysis)
  • OpenAI Whisper (High Priority): Trained on >5M hours of audio; the encoder provides 1280-d features tracking energy, voicing, and language (Robust Features, Energy Tracking)

Rhythm & Beat Analysis

  • Beat-Transformer (High Priority): Dilated self-attention encoder with F-measure ~0.86 for beat and downbeat detection (Beat Detection, Downbeat Detection)
  • BEAST (Medium Priority): 50 ms latency with causal attention; ideal for real-time DJ analysis (Real-time, Low Latency)

Audio Separation & Component Analysis

  • Demucs v4 (High Priority): Returns 4-stem or 6-stem tensors (drums, bass, vocals, other) for component-level analysis (Stem Separation, Component Analysis)

Multimodal & Similarity

  • LAION CLAP (Medium Priority): Query with free text and compute cosine similarity on 512-d embeddings (Multimodal, Text-Audio)
  • UniMus OpenJMLA (Medium Priority): Score arbitrary tag strings for effect-chain heuristics (Zero-shot Tagging)

Usage Example

from heihachi.huggingface import FeatureExtractor, StemSeparator, BeatDetector

# Extract features
extractor = FeatureExtractor(model="microsoft/BEATs-base")
features = extractor.extract(audio_path="track.mp3")

# Separate stems
separator = StemSeparator()
stems = separator.separate(audio_path="track.mp3")
drums = stems["drums"]
bass = stems["bass"]

# Detect beats
detector = BeatDetector()
beats = detector.detect(audio_path="track.mp3", visualize=True, output_path="beats.png")
print(f"Tempo: {beats['tempo']} BPM")

Academic Knowledge Pipeline

Extract, process, and structure knowledge from ~100 scientific publications on music perception, emotion, and drum & bass production to build a comprehensive academic knowledge base.

Processing Pipeline

  1. 📄 PDF Extraction: Extract structured text from academic PDFs with layout preservation and section detection
  2. 🧠 Knowledge Extraction: Use LLMs to extract concepts, findings, and relationships from research papers
  3. 🔗 Knowledge Graph: Build an interconnected knowledge base linking concepts across papers
  4. ⚡ LLM Training: Generate training examples and fine-tune models for music expertise

Extracted Knowledge Types

Concepts

Key concepts related to music perception, emotion, and production techniques

Example: "Beat-based timing networks involve basal ganglia-thalamocortical circuits that enable synchronization with rhythmic stimuli"

Findings

Research conclusions and evidence about music and emotional responses

Example: "Low-frequency neural oscillations from motor planning areas guide auditory sampling (Chen et al., 2008)"

Relationships

Connections between concepts across different research domains

Example: "Motor-auditory coupling → enables → rhythm perception"

Implementation Overview

from pathlib import Path

class AcademicKnowledgeProcessor:
    def process_papers(self, papers_directory):
        """Extract and structure knowledge from academic PDFs"""
        processed_papers = []
        
        for pdf_file in sorted(Path(papers_directory).glob("*.pdf")):
            # Extract structured text with section preservation
            sections = self.extract_structured_text(pdf_file)
            metadata = self.extract_paper_metadata(pdf_file)
            
            # Use LLM to extract knowledge
            concepts = self.extract_concepts(sections)
            findings = self.extract_findings(sections)
            relationships = self.extract_relationships(concepts)
            
            processed_papers.append({
                "metadata": metadata,
                "concepts": concepts,
                "findings": findings,
                "relationships": relationships
            })
        
        return self.create_knowledge_base(processed_papers)
    
    def generate_training_examples(self, knowledge_base):
        """Create LLM training examples from extracted knowledge"""
        examples = []
        
        # Concept explanation examples
        for concept in knowledge_base['concepts']:
            examples.append({
                "input": f"What is {concept['name']} in music perception?",
                "output": concept['explanation']
            })
        
        # Application examples
        for concept in knowledge_base['concepts']:
            examples.append({
                "input": f"How can I apply {concept['name']} in drum and bass production?",
                "output": self.generate_application_example(concept)
            })
        
        return examples

Experimental Results

Demonstration of Heihachi's capabilities through comprehensive analysis of a 33-minute electronic music mix, showcasing advanced drum pattern recognition and temporal structure analysis.

  • Drum hits detected: 91,179
  • Analysis duration: 33 min
  • Drum categories: 5
  • Average confidence score: 0.385

Drum Hit Analysis

Advanced multi-stage analysis employing onset detection, neural network classification, confidence scoring, and temporal pattern recognition identified 91,179 percussion events across five primary categories.

  • Drum Hit Types Distribution: Distribution of the 91,179 detected drum hits by type
  • Drum Hit Types Bar Chart: Comparative analysis of drum type frequencies
  • Drum Hits Timeline: Temporal distribution of drum events throughout the mix
  • Confidence vs Velocity Analysis: Relationship between detection confidence and velocity
  • Drum Density Analysis: Rhythmic density patterns across the entire mix
  • Drum Pattern Heatmap: Heatmap visualization of drum pattern intensity

Classification Performance

  • Toms: 0.385 confidence
  • Snares: 0.381 confidence
  • Kicks: 0.370 confidence
  • Cymbals: 0.284 confidence
  • Hi-hats: 0.223 confidence

Key Findings

Microtiming Variations

Subtle deviations from the quantized grid, detected particularly in hi-hats and snares, contribute to human feel

Structural Markers

Clear delineation of musical sections through changes in drum event density and type distribution

Layering Techniques

Overlapping drum hits at key points create impact moments through stacked percussion events

Rhythmic Motifs

Recurring patterns serve as stylistic identifiers throughout the mix structure

Documentation

Installation

Quick Install (Recommended)

# Clone the repository
git clone https://github.com/fullscreen-triangle/heihachi.git
cd heihachi

# Run the setup script
python scripts/setup.py

Manual Installation

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .

Installation Options

  • --dev: Install development dependencies
  • --no-gpu: Skip GPU acceleration dependencies
  • --no-interactive: Skip interactive mode dependencies
  • --shell-completion: Install shell completion scripts

Quick Usage

# Process a single audio file
heihachi process audio.wav --output results/

# Extract emotional features from audio
python -m src.main semantic emotions track.wav

# Start the REST API server
python api_server.py --host 0.0.0.0 --port 5000

# Index tracks for semantic search
python -m src.main semantic index audio_dir/ --artist "Artist"

# Search indexed tracks semantically
python -m src.main semantic search "dark atmospheric neurofunk"

# Extract features using AI models
heihachi hf extract audio.wav --model microsoft/BEATs-base