Fable Flux

Technical Details

How Fable Flux generates personalized children's stories using AI

Model Architecture

Fable Flux is powered by a fine-tuned version of OpenAI's GPT-OSS-20B model, specifically optimized for generating high-quality children's stories with educational value and age-appropriate content.
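As a rough illustration of how a story request might reach the model, the sketch below sends a personalization prompt to an OpenAI-compatible endpoint of the kind vLLM exposes. The endpoint URL, model identifier, and prompt wording are assumptions for illustration, not the production API.

```python
# Hypothetical sketch: requesting a personalized story from an
# OpenAI-compatible inference endpoint (URL and model id are assumed).
from openai import OpenAI

client = OpenAI(
    base_url="https://example-fable-flux-endpoint/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="fable-flux-gpt-oss-20b",  # assumed model identifier
    messages=[
        {"role": "system", "content": "You write age-appropriate children's stories with a clear beginning, middle, and end."},
        {"role": "user", "content": "Write a 600-word story about Mia, age 4, who learns to share her toys."},
    ],
    max_tokens=1024,
    temperature=0.8,
)

print(response.choices[0].message.content)
```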

Model Pipeline

Training Dataset

The model was trained on a carefully curated dataset of children's stories designed to ensure:

  • Age-appropriate language (3+ years)
  • Positive moral lessons and educational value
  • Diverse characters, settings, and themes
  • Engaging storytelling with clear narrative structure
  • Safe and encouraging content

Dataset Highlights:

  • 10,000+ high-quality children's stories
  • 200 diverse character archetypes
  • 100 varied settings and environments
  • Systematic diversity tracking
  • Quality validation and sentiment analysis
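To make the curation criteria above concrete, here is a minimal sketch of how a single training record and its diversity metadata could be represented. The field names, thresholds, and example values are assumptions for illustration, not the actual dataset schema.

```python
# Hypothetical sketch of a curated training record; field names and
# thresholds are illustrative, not the actual dataset schema.
from dataclasses import dataclass, field

@dataclass
class StoryRecord:
    title: str
    text: str
    min_age: int                     # youngest suitable reader, e.g. 3
    moral: str                       # e.g. "sharing", "honesty"
    character_archetypes: list[str] = field(default_factory=list)
    setting: str = ""
    sentiment_score: float = 0.0     # output of a sentiment model, -1..1

    def passes_basic_checks(self) -> bool:
        """Coarse curation filter: age floor, positive tone, non-trivial length."""
        return (
            self.min_age >= 3
            and self.sentiment_score > 0
            and len(self.text.split()) >= 300
        )

record = StoryRecord(
    title="The Lantern in the Forest",
    text="Once upon a time...",
    min_age=3,
    moral="kindness",
    character_archetypes=["curious fox", "wise owl"],
    setting="moonlit forest",
    sentiment_score=0.7,
)
print(record.passes_basic_checks())
```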

Serving Infrastructure

Fable Flux uses a modern, scalable infrastructure to deliver fast and reliable story generation:

Technology Stack

🚀 vLLM Engine

High-performance inference engine for large language models; continuous batching and paged attention over the KV cache deliver fast token generation and efficient GPU memory use.
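For context on what serving a model with vLLM looks like, the following is a minimal sketch of offline generation with vLLM's Python API. The model path and sampling settings are assumptions rather than the production configuration.

```python
# Minimal vLLM sketch (model id and sampling values are assumed,
# not the production configuration).
from vllm import LLM, SamplingParams

llm = LLM(model="org/fable-flux-gpt-oss-20b")  # assumed HF-style model id

params = SamplingParams(
    temperature=0.8,
    top_p=0.95,
    max_tokens=1024,   # roughly enough for a 600-700 word story
)

outputs = llm.generate(
    ["Write a gentle bedtime story about a brave little turtle."],
    params,
)
print(outputs[0].outputs[0].text)
```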

☁️ Modal.com Platform

Serverless compute platform that automatically scales GPU resources up and down with demand, so capacity is available for traffic spikes without paying for idle GPUs.
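As a rough sketch of serverless GPU scaling on Modal, the snippet below defines a GPU-backed function that Modal can spin up on demand. The image contents, app and model names, and the body of generate() are assumptions, not the actual deployment.

```python
# Hypothetical Modal sketch: a GPU-backed function that scales on demand.
# Image contents, names, and the generate() body are illustrative only.
import modal

image = modal.Image.debian_slim().pip_install("vllm")
app = modal.App("fable-flux-demo", image=image)

@app.function(gpu="H100", timeout=600)
def generate(prompt: str) -> str:
    from vllm import LLM, SamplingParams

    llm = LLM(model="org/fable-flux-gpt-oss-20b")  # assumed model id
    params = SamplingParams(temperature=0.8, max_tokens=1024)
    return llm.generate([prompt], params)[0].outputs[0].text

@app.local_entrypoint()
def main():
    print(generate.remote("Write a short story about a kind dragon."))
```

In a real deployment the model would be loaded once per container (for example with Modal's class-based lifecycle hooks) rather than on every call; the sketch keeps everything inline for brevity.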

🎯 H100 GPUs

NVIDIA H100 Tensor Core GPUs provide the computational power for real-time story generation with minimal latency.

⚡ Next.js Frontend

Modern React framework with server-side rendering, optimized for performance and user experience.

Performance & Quality

  • Average generation time: ~10 seconds
  • Story length: 600-700 words per story
  • Target age range: 3+ years

Quality Assurance

Every generated story undergoes automated quality validation:

  • Content Safety: Automated filtering for age-appropriate content
  • Educational Value: Verification of positive moral lessons
  • Language Complexity: Reading level analysis for target age group
  • Narrative Structure: Proper story format with clear beginning, middle, and end
  • Character Consistency: Logical character development and behavior
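To illustrate what automated checks of this kind can look like, here is a minimal sketch of a post-generation validator. The word lists, thresholds, and structural heuristics are assumptions, not the production quality pipeline.

```python
# Hypothetical post-generation validator; word lists, thresholds, and
# heuristics are illustrative, not the production quality pipeline.
import re

BLOCKED_WORDS = {"scary-example-word"}           # placeholder for a safety lexicon
STRUCTURE_MARKERS = ("once", "then", "the end")  # crude beginning/middle/end proxy

def avg_sentence_length(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return len(text.split()) / max(len(sentences), 1)

def validate_story(text: str) -> dict[str, bool]:
    lower = text.lower()
    return {
        "content_safety": not any(w in lower for w in BLOCKED_WORDS),
        "length_ok": 400 <= len(text.split()) <= 900,
        "reading_level_ok": avg_sentence_length(text) <= 15,  # short sentences for young readers
        "structure_ok": all(marker in lower for marker in STRUCTURE_MARKERS),
    }

print(validate_story("Once upon a time, a small fox found a lantern. Then she shared its light. The end."))
```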

Technical Specifications

Model Details

  • Parameters: 20 billion
  • Architecture: Transformer-based
  • Precision: Mixed FP16/FP4
  • Context Length: 8,192 tokens

Infrastructure

  • GPU: NVIDIA H100 (80GB)
  • Memory: Tensor parallelism
  • Serving: vLLM 0.10.1+gptoss
  • Platform: Modal.com serverless
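The specifications above map fairly directly onto vLLM engine arguments. The snippet below shows one plausible configuration; the model id, parallelism degree, and memory fraction are assumptions rather than the deployed settings.

```python
# One plausible mapping of the listed specs onto vLLM engine arguments.
# Model id, parallelism degree, and memory fraction are assumptions.
from vllm import LLM

llm = LLM(
    model="org/fable-flux-gpt-oss-20b",  # assumed model id
    max_model_len=8192,                  # 8,192-token context length
    tensor_parallel_size=1,              # single H100 in this sketch
    gpu_memory_utilization=0.90,         # leave headroom on the 80 GB card
)
```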