An audio analysis framework featuring fire-based emotional querying, Autobahn biological intelligence integration, and Rust-powered performance, combining neurological models with consciousness-aware audio generation.
A fire-based emotional querying system that taps into humanity's deepest cognitive patterns: users create and manipulate digital fire through an intuitive WebGL interface, which the system "understands" using AI reconstruction techniques.
Extensive research into human consciousness and fire recognition indicates that fire is humanity's first and most fundamental abstraction, deeply embedded in our neural architecture: fire recognition activates the same neural networks as human consciousness itself.
Intuitive WebGL interface for creating and manipulating fire with real-time physics simulation
AI system that "learns" fire patterns by reconstructing them from partial information
Converting understood fire patterns into music that matches the emotional content
Users interact with WebGL interface to create, maintain, and modify digital fire
System captures fire characteristics (intensity, color, movement, structure)
Pakati reconstructs fire from partial information to prove comprehension
Understood fire patterns drive Heihachi's audio synthesis engines (see the sketch below)
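To make the pipeline concrete, here is a minimal Python sketch of the mapping step from captured fire characteristics to synthesis parameters. FireState, fire_to_synthesis_params, and the weightings are illustrative placeholders, not the actual Pakati or Heihachi interfaces.

from dataclasses import dataclass

@dataclass
class FireState:
    intensity: float    # 0-1, overall flame energy
    color_temp: float   # 0-1, cool reds -> hot blue/white
    movement: float     # 0-1, flicker and turbulence
    structure: float    # 0-1, diffuse embers -> tall coherent column

def fire_to_synthesis_params(fire: FireState) -> dict:
    """Map captured fire characteristics onto audio synthesis parameters."""
    return {
        "tempo_bpm": 160 + 20 * fire.intensity,         # hotter fire -> faster tempo
        "brightness": fire.color_temp,                  # drives the spectral-centroid target
        "rhythmic_density": 0.3 + 0.7 * fire.movement,  # flicker -> busier percussion
        "bass_weight": 1.0 - 0.5 * fire.structure,      # diffuse fire -> heavier sub bass
    }

params = fire_to_synthesis_params(FireState(intensity=0.8, color_temp=0.6, movement=0.7, structure=0.4))
print(params)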
A delegated probabilistic reasoning architecture in which all probabilistic tasks, Bayesian inference, biological intelligence, and consciousness modeling are delegated to the Autobahn oscillatory bio-metabolic RAG system (a minimal delegation sketch follows the list below).
Including fire-evolved consciousness substrate and biological intelligence architectures
3-layer architecture with ATP-driven metabolic computation
Real-time IIT Φ calculation for consciousness quantification
Sophisticated Bayesian inference and fuzzy logic processing
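As a rough sketch of the delegation pattern only: the endpoint URL, payload fields, and return_phi flag below are assumptions made for illustration; the real wire protocol is defined by the Autobahn project.

import requests

AUTOBAHN_URL = "http://localhost:8080/api/v1/inference"  # hypothetical endpoint

def delegate_inference(observations: dict, query: str) -> dict:
    """Send raw evidence to an Autobahn instance and receive posterior estimates plus a Phi score."""
    payload = {
        "evidence": observations,   # e.g. audio features or fire-pattern descriptors
        "query": query,             # the probabilistic question Heihachi needs answered
        "return_phi": True,         # also request the IIT Phi consciousness estimate
    }
    response = requests.post(AUTOBAHN_URL, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()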
This transforms Heihachi into a music listening application where users discover music by describing emotions and feelings, powered by academic research and continuous learning.
Instead of searching by genre or artist, describe how you want to feel: "I want something dark and atmospheric with building tension" or "Find me energetic tracks with complex rhythms and heavy bass."
Trained on ~100 scientific publications covering music perception, emotion, and drum & bass production. Provides deep theoretical understanding of how music affects emotions and neural processing.
Builds domain expertise by continuously analyzing new mixes. Each analysis becomes training data, creating an ever-growing understanding of electronic music patterns and emotional characteristics.
Comprehensive REST API for integrating Heihachi's audio analysis capabilities into web applications, mobile apps, and other systems. Supports both synchronous and asynchronous processing.
Asynchronous job processing with configurable concurrency limits and rate limiting (see the rate-limiting sketch after this list)
RESTful endpoints with comprehensive documentation and client examples
Dedicated endpoints for beats, drums, stems, emotions, and semantic search
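The bundled api_server.py wires these features up internally; the following is only a minimal sketch of how request rate limiting can be configured with Flask-Limiter. The limits and route shown are illustrative assumptions, not the server's actual configuration.

from flask import Flask
from flask_cors import CORS
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
CORS(app)

# Cap per-client request rates; long-running analyses should be queued as
# asynchronous jobs rather than executed inside the request handler.
limiter = Limiter(get_remote_address, app=app, default_limits=["60 per minute"])

@app.route("/api/v1/health")
@limiter.limit("10 per second")
def health():
    return {"status": "ok"}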
# Install API dependencies
pip install flask flask-cors flask-limiter
# Start the API server
python api_server.py --host 0.0.0.0 --port 5000
# Or with production settings
python api_server.py --production --config-path configs/production.yaml
# Analyze audio file with emotion mapping
curl -X POST http://localhost:5000/api/v1/semantic/analyze \
-F "file=@track.wav" \
-F "include_emotions=true" \
-F "index_for_search=true"
# Extract beats from audio
curl -X POST http://localhost:5000/api/v1/beats \
-F "file=@track.mp3"
# Search indexed tracks semantically
curl -X POST http://localhost:5000/api/v1/semantic/search \
-H "Content-Type: application/json" \
-d '{"query": "dark aggressive neurofunk", "top_k": 5}'
import requests

# Extract emotional features
def extract_emotions(file_path):
    url = "http://localhost:5000/api/v1/semantic/emotions"
    with open(file_path, 'rb') as f:
        files = {'file': f}
        response = requests.post(url, files=files)
    return response.json()

# Example usage
emotions = extract_emotions("track.wav")
print(f"Dominant emotion: {emotions['summary']['dominant_emotion']}")
print(f"Energy level: {emotions['emotions']['energy']:.1f}/10")
// Analyze audio file
async function analyzeAudio(file) {
  const formData = new FormData();
  formData.append('file', file);
  formData.append('include_emotions', 'true');

  const response = await fetch('/api/v1/semantic/analyze', {
    method: 'POST',
    body: formData
  });
  return await response.json();
}

// Usage with file input
const fileInput = document.getElementById('audio-file');
fileInput.addEventListener('change', async (e) => {
  const result = await analyzeAudio(e.target.files[0]);
  console.log('Analysis result:', result);
});
The API returns detailed analysis data that can be visualized to understand track characteristics and emotional profiles; these visualizations help developers surface meaningful insights in their applications (a minimal plotting sketch follows the descriptions below).
Shows the proportion of different drum elements, contributing to groove and rhythm characteristics
Correlates drum hit confidence with velocity, indicating playing dynamics and energy levels
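For example, a developer might plot the drum-element proportions returned by an analysis call. The response shape and counts below are placeholders, not the API's exact schema.

import matplotlib.pyplot as plt

# Placeholder response; the real analysis payload may use a different schema.
analysis = {"drum_counts": {"kick": 120, "snare": 95, "hihat": 310, "tom": 40, "cymbal": 65}}

labels = list(analysis["drum_counts"])
counts = list(analysis["drum_counts"].values())

plt.figure(figsize=(6, 6))
plt.pie(counts, labels=labels, autopct="%1.1f%%")
plt.title("Proportion of detected drum elements")
plt.savefig("drum_distribution.png", dpi=150)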
Transform raw audio features into meaningful emotional dimensions and enable intelligent music discovery through semantic search and natural language queries.
Heihachi maps technical audio features to 9 distinct emotional dimensions using scientifically grounded algorithms that correlate spectral, rhythmic, and temporal characteristics with human emotional perception (an illustrative mapping is sketched after the list below).
Loudness, tempo, and drum intensity
Spectral centroid and high-frequency content
Dissonance and rhythmic complexity
Low-mid energy and harmonic richness
Microtiming and syncopation quality
Transient sharpness and distortion
Reverb amount and stereo width
Minor key and sparse arrangement
Major key and uplifting progressions
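A minimal, illustrative mapping sketch, assuming normalized 0-1 input features: the weights and the handful of dimensions shown are placeholders rather than Heihachi's calibrated algorithms.

def map_to_emotions(features: dict) -> dict:
    """Toy mapping from normalized (0-1) audio features to emotional scores on a 0-10 scale.

    The weights below are placeholders, not Heihachi's calibrated coefficients.
    """
    energy = 0.4 * features["loudness"] + 0.3 * features["tempo"] + 0.3 * features["drum_intensity"]
    brightness = 0.6 * features["spectral_centroid"] + 0.4 * features["high_freq_energy"]
    aggression = 0.5 * features["transient_sharpness"] + 0.5 * features["distortion"]
    return {
        "energy": round(10 * energy, 1),
        "brightness": round(10 * brightness, 1),
        "aggression": round(10 * aggression, 1),
    }

print(map_to_emotions({
    "loudness": 0.8, "tempo": 0.9, "drum_intensity": 0.85,
    "spectral_centroid": 0.6, "high_freq_energy": 0.5,
    "transient_sharpness": 0.7, "distortion": 0.4,
}))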
Find tracks using natural language descriptions of emotions, moods, and musical characteristics. The search system understands contextual relationships between audio features and emotional responses.
Semantic analysis builds upon detailed drum pattern analysis and feature extraction to create comprehensive emotional profiles. The system processes complex rhythmic patterns and translates them into meaningful emotional dimensions.
Drum pattern analysis visualization showing the temporal distribution of different drum elements, which feeds into the emotional mapping algorithms to determine groove, energy, and tension characteristics.
Drum density analysis over time - high-density regions contribute to energy and aggression scores, while sparse sections indicate atmospheric or melancholic characteristics.
Semantic analysis capabilities are fully integrated into the Heihachi CLI for streamlined workflows.
python -m src.main semantic emotions track.wav
python -m src.main semantic index audio_folder/ --artist "Artist" --title "Track"
python -m src.main semantic search "atmospheric intro with tension building"
python -m src.main semantic stats
Transform raw audio analysis into structured, emotion-focused data that powers feeling-based music discovery and LLM training.
{
  "overall_mood": ["dark", "energetic", "technical"],
  "intensity_curve": [0.4, 0.5, 0.7, 0.8, 0.75, 0.9, 0.85, 0.7],
  "emotional_segments": [
    {
      "start_time": 0,
      "end_time": 390.0,
      "primary_emotion": "atmospheric",
      "tension_level": 0.4,
      "descriptors": ["spacious", "anticipatory", "mysterious"]
    }
  ],
  "peak_moments": [
    {
      "time": 870.5,
      "intensity": 0.92,
      "description": "Maximum energy with layered percussion and aggressive bassline",
      "key_elements": ["double_drops", "bass_stacking", "drum_fills"]
    }
  ]
}
[
  {
    "segment_id": "s001",
    "start_time": 0,
    "end_time": 198.5,
    "type": "intro",
    "energy_level": 0.45,
    "key_elements": ["atmospheric_pads", "filtered_breaks", "sparse_percussion"],
    "description": "Atmospheric intro with filtered breaks and sparse percussion"
  },
  {
    "segment_id": "s002",
    "start_time": 198.5,
    "end_time": 390.0,
    "type": "build",
    "energy_level": 0.68,
    "key_elements": ["rolling_bassline", "amen_break", "rising_synths"],
    "description": "Energy building section with rolling bassline and classic amen breaks"
  }
]
{"time": 0, "feature_type": "bass", "description": "Sub-heavy reese bass with moderate distortion and 120Hz fundamental", "characteristics": {"distortion": 0.35, "width": 0.7, "sub_weight": 0.8}}
{"time": 0, "feature_type": "drums", "description": "Broken beat pattern with ghost notes and 16th hi-hats", "characteristics": {"complexity": 0.65, "velocity_variation": 0.4, "swing": 0.2}}
{"time": 0, "feature_type": "atmosphere", "description": "Reverb-heavy pads with 6-8kHz air frequencies", "characteristics": {"reverb_size": 0.85, "density": 0.3, "brightness": 0.5}}
{"time": 198.5, "feature_type": "transition", "description": "Filter sweep transition with drum roll buildup", "characteristics": {"length_bars": 8, "smoothness": 0.7, "energy_change": 0.25}}
This 60-minute neurofunk mix features 24 tracks with consistent energy throughout.
The mix begins with atmospheric elements at 174 BPM before transitioning to
heavier sections at 6:30. Notable sections include an extended bass sequence
from 18:20-22:45 featuring time-stretched Amen breaks and layered Reese basses.
The final third introduces more percussive elements with complex drum patterns
and syncopated rhythms. Energy peaks occur at 14:30, 28:15, and 52:40.
Built upon established neuroscientific research on rhythm processing and motor-auditory coupling
Optimized for electronic music analysis with focus on neurofunk and drum & bass
Memory-optimized processing, parallel execution, and GPU acceleration
HuggingFace models for advanced feature extraction and neural processing
The framework is built upon established neuroscientific research demonstrating that humans possess an inherent ability to synchronize motor responses with external rhythmic stimuli. This phenomenon, known as beat-based timing, involves complex interactions between auditory and motor systems in the brain.
Research has shown that low-frequency neural oscillations from motor planning areas guide auditory sampling, which can be expressed through coherence measures such as the magnitude-squared coherence between the motor and auditory signals:

C_{xy}(f) = \frac{|S_{xy}(f)|^2}{S_{xx}(f)\, S_{yy}(f)}

where S_{xy}(f) is the cross-spectral density between the motor and auditory signals, and S_{xx}(f) and S_{yy}(f) are their respective auto-spectral densities; C_{xy}(f) ranges from 0 (no coupling) to 1 (perfect coupling) at frequency f.
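As a rough illustration of how such a coherence measure can be computed in practice, the snippet below uses SciPy's coherence estimator on synthetic placeholder signals, not Heihachi's actual neural-oscillation pipeline.

import numpy as np
from scipy.signal import coherence

fs = 250.0                       # sampling rate (Hz) of the two signals
t = np.arange(0, 10, 1 / fs)
rhythm = np.sin(2 * np.pi * 2.9 * t)                # ~174 BPM beat envelope (2.9 Hz)
motor = rhythm + 0.5 * np.random.randn(t.size)      # noisy stand-in for a motor planning signal
auditory = rhythm + 0.5 * np.random.randn(t.size)   # noisy stand-in for auditory sampling

f, Cxy = coherence(motor, auditory, fs=fs, nperseg=512)
peak = f[np.argmax(Cxy)]
print(f"Peak motor-auditory coherence at {peak:.2f} Hz")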
Comprehensive breakdown of Heihachi's multi-dimensional analysis capabilities
Heihachi integrates specialized AI models from HuggingFace, enabling advanced neural processing of audio using state-of-the-art models carefully selected for electronic music analysis tasks.
Bidirectional ViT-style encoder trained with acoustic tokenizers, providing 768-d latent embeddings at ~20ms hop length
Trained on >5M hours; encoder provides 1280-d features tracking energy, voicing & language
Dilated self-attention encoder with F-measure ~0.86 for beat and downbeat detection
50ms latency, causal attention; ideal for real-time DJ analysis
Returns 4-stem or 6-stem tensors for component-level analysis (drums, bass, vocals, other)
Query with free-text and compute cosine similarity on 512-d embeddings
Score arbitrary tag strings for effect-chain heuristics
from heihachi.huggingface import FeatureExtractor, StemSeparator, BeatDetector
# Extract features
extractor = FeatureExtractor(model="microsoft/BEATs-base")
features = extractor.extract(audio_path="track.mp3")
# Separate stems
separator = StemSeparator()
stems = separator.separate(audio_path="track.mp3")
drums = stems["drums"]
bass = stems["bass"]
# Detect beats
detector = BeatDetector()
beats = detector.detect(audio_path="track.mp3", visualize=True, output_path="beats.png")
print(f"Tempo: {beats['tempo']} BPM")
Extract, process, and structure knowledge from ~100 scientific publications on music perception, emotion, and drum & bass production to build a comprehensive academic knowledge base.
Extract structured text from academic PDFs with layout preservation and section detection
Use LLMs to extract concepts, findings, and relationships from research papers
Build interconnected knowledge base linking concepts across papers
Generate training examples and fine-tune models for music expertise
Key concepts related to music perception, emotion, and production techniques
Research conclusions and evidence about music and emotional responses
Connections between concepts across different research domains
class AcademicKnowledgeProcessor:
    def process_papers(self, papers_directory):
        """Extract and structure knowledge from academic PDFs"""
        processed_papers = []
        for pdf_file in papers_directory:
            # Extract structured text with section preservation
            sections = self.extract_structured_text(pdf_file)
            metadata = self.extract_paper_metadata(pdf_file)

            # Use LLM to extract knowledge
            concepts = self.extract_concepts(sections)
            findings = self.extract_findings(sections)
            relationships = self.extract_relationships(concepts)

            processed_papers.append({
                "metadata": metadata,
                "concepts": concepts,
                "findings": findings,
                "relationships": relationships
            })
        return self.create_knowledge_base(processed_papers)

    def generate_training_examples(self, knowledge_base):
        """Create LLM training examples from extracted knowledge"""
        examples = []

        # Concept explanation examples
        for concept in knowledge_base['concepts']:
            examples.append({
                "input": f"What is {concept['name']} in music perception?",
                "output": concept['explanation']
            })

        # Application examples
        for concept in knowledge_base['concepts']:
            examples.append({
                "input": f"How can I apply {concept['name']} in drum and bass production?",
                "output": self.generate_application_example(concept)
            })

        return examples
Demonstration of Heihachi's capabilities through comprehensive analysis of a 33-minute electronic music mix, showcasing advanced drum pattern recognition and temporal structure analysis.
Advanced multi-stage analysis employing onset detection, neural network classification, confidence scoring, and temporal pattern recognition identified 91,179 percussion events across five primary categories.
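As a simplified illustration of the first stage only, candidate events can be located with an onset detector such as librosa's; Heihachi's neural classification, confidence scoring, and temporal pattern-recognition stages are not reproduced here, and "mix.wav" is a placeholder path.

import librosa

# Stage 1 of the multi-stage pipeline: detect candidate percussion events.
y, sr = librosa.load("mix.wav", sr=None, mono=True)
onset_env = librosa.onset.onset_strength(y=y, sr=sr)
onset_times = librosa.onset.onset_detect(onset_envelope=onset_env, sr=sr, units="time")
print(f"Detected {len(onset_times)} candidate drum events")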
Distribution of 91,179 detected drum hits by type
Comparative analysis of drum type frequencies
Temporal distribution of drum events throughout the mix
Relationship between detection confidence and velocity
Rhythmic density patterns across the entire mix
Heatmap visualization of drum pattern intensity
Subtle deviations from the quantized grid were detected, particularly in hi-hats and snares, contributing to the mix's human feel (a microtiming sketch follows this list)
Clear delineation of musical sections through changes in drum event density and type distribution
Overlapping drum hits at key points create impact moments through stacked percussion events
Recurring patterns serve as stylistic identifiers throughout the mix structure
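A minimal sketch of how microtiming deviation from a quantized grid can be measured, assuming onset times in seconds and a known tempo; the grid resolution and example onsets are illustrative, not taken from the analyzed mix.

import numpy as np

def microtiming_deviation(onset_times, tempo_bpm, subdivision=4):
    """Deviation of each onset from the nearest point on a quantized grid.

    subdivision=4 gives a 16th-note grid. Returns deviations in milliseconds;
    small, consistent deviations are what lend a pattern its human feel.
    """
    grid_step = 60.0 / tempo_bpm / subdivision
    onsets = np.asarray(onset_times)
    nearest = np.round(onsets / grid_step) * grid_step
    return (onsets - nearest) * 1000.0

devs = microtiming_deviation([0.000, 0.088, 0.176, 0.259, 0.345], tempo_bpm=174)
print(f"Mean absolute deviation: {np.mean(np.abs(devs)):.1f} ms")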
# Clone the repository
git clone https://github.com/fullscreen-triangle/heihachi.git
cd heihachi
# Run the setup script
python scripts/setup.py
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install the package
pip install -e .
# Process a single audio file
heihachi process audio.wav --output results/
# Extract emotional features from audio
python -m src.main semantic emotions track.wav
# Start the REST API server
python api_server.py --host 0.0.0.0 --port 5000
# Index tracks for semantic search
python -m src.main semantic index audio_dir/ --artist "Artist"
# Search indexed tracks semantically
python -m src.main semantic search "dark atmospheric neurofunk"
# Extract features using AI models
heihachi hf extract audio.wav --model microsoft/BEATs-base