- Kairos-HomeWorld is purpose-built for embodied intelligence and represents the first unified framework capable of generating a complete, fully interactive home environment from a single text prompt. Extending indoor scene generation beyond individual rooms, it enables whole-home simulation in which every object is fully manipulable within an integrated simulation engine.
- Kairos-HomeWorld employs a four-stage hierarchical architecture encompassing floorplan generation, 2D-to-3D lifting, recursive refinement, and manipulable object placement. This approach enables the production of globally coherent, physically accurate, and simulation-ready scenes. Each environment contains, on average, more than 15 manipulable objects and achieves a Footprint Object Density of 4.16, the highest among compared methods.
- The accompanying open-source dataset is purpose-built for Chinese households, pairing 300,000 real residential floor plans with 5,000 fully furnished, simulation-ready homes and 50,000 physics-enabled interactive object assets. Already deployed in ACE ROBOTICS' daily robot training, it significantly accelerates the simulation-to-reality transfer cycle.
SHANGHAI, CHINA - Media OutReach Newswire - 5 June 2026 - ACE ROBOTICS, in collaboration with the Multimedia Laboratory at The Chinese University of Hong Kong (CUHK) and Shenzhen Loop Area Institute, today announced the open-source release of Kairos-HomeWorld, the industry's first unified World Model framework capable of generating full home-scale, object-level interactive 3D environments from a single text prompt.
The solution addresses longstanding limitations in indoor scene generation, which has typically been restricted to single-room outputs with weak global consistency and limited interactivity. Kairos-HomeWorld overcomes these constraints by delivering structurally coherent, physically plausible, and functionally complete residential environments. These high-fidelity, large-scale simulations provide a robust foundation for advancing embodied intelligence applications and accelerating real-world robot training.
The long-term vision for embodied intelligence is the home environment. However, residential settings are inherently diverse and highly personalized, requiring robots to be trained across a broad range of realistic and differentiated scenarios before they can reliably operate in even a single household. High-fidelity simulation offers the most practical pathway to achieving this at scale, yet existing approaches typically involve a trade-off: synthetic environments lack realism, while scanned real-world scenes offer limited interactivity. Kairos-HomeWorld, together with its accompanying dataset, is designed to bridge this gap, delivering both realistic and interactive environments within a unified framework.
A four-stage architecture for whole-home, object-level generation
Conventional approaches to indoor scene generation remain constrained to single-room outputs, often exhibiting weak global consistency, frequent physical inaccuracies, and limited or no interactivity. Kairos-HomeWorld takes a fundamentally different approach. It decomposes whole-home generation into a structured, four-stage process, redefining the underlying architectural paradigm from the ground up.
Stage 1: Floor Plan Generation. A K-D tree-based approach translates real-world floor plans into a hierarchical text representation that can be efficiently processed by large language models (LLMs). This method mitigates common issues in conventional layout generation, including room overlaps and fragmented topologies, resulting in more coherent and structurally consistent spatial configurations.
Stage 2: 2D-to-3D Lifting & Furniture Layout Generation. A "top-down global initialization combined with a first-person detail walkthrough" approach anchors the process to the 3D building shell generated in Stage 1. This methodology mitigates the geometric drift commonly associated with conventional 2D-to-3D lifting techniques, enabling more stable and spatially consistent scene generation.
Stage 3: Recursive Refinement. A fine-tuned vision-language model performs iterative validation and correction, automatically identifying and resolving physical inconsistencies, such as obstructed doorways or object collisions. This recursive process materially reduces spatial errors, achieving among the lowest reported furniture-collision rates in the industry.
Stage 4: Manipulable Object Placement. A surface-centric placement algorithm assigns each object detailed physical properties, including material composition, density, friction, and structural support relationships.
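To make the Stage 1 idea more concrete, the sketch below shows one way a floor plan could be turned into indented, hierarchical text with a K-D tree style split. It is an illustration only and not the Kairos-HomeWorld implementation: the `Room` class, the `build_kd` and `to_text` helpers, the rectangular room shapes, the median-split rule, and the output wording are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Room:
    """Hypothetical axis-aligned room footprint, in metres."""
    name: str
    x: float  # left edge
    y: float  # bottom edge
    w: float  # width along x
    h: float  # depth along y

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def build_kd(rooms, depth=0):
    """Recursively split rooms at the median centre, alternating x and y."""
    if not rooms:
        raise ValueError("at least one room is required")
    if len(rooms) == 1:
        return rooms[0]  # leaf: a single room
    axis = depth % 2  # 0 splits on x, 1 splits on y
    ordered = sorted(rooms, key=lambda r: r.center[axis])
    mid = len(ordered) // 2
    # Place the split halfway between the two centres around the median.
    split = (ordered[mid - 1].center[axis] + ordered[mid].center[axis]) / 2
    return {
        "axis": "xy"[axis],
        "split": round(split, 2),
        "low": build_kd(ordered[:mid], depth + 1),
        "high": build_kd(ordered[mid:], depth + 1),
    }


def to_text(node, indent=0):
    """Serialise the tree as indented lines, one split or room per line."""
    pad = "  " * indent
    if isinstance(node, Room):
        return [f"{pad}room {node.name} origin=({node.x:g},{node.y:g}) "
                f"size={node.w:g}x{node.h:g}"]
    lines = [f"{pad}split {node['axis']}={node['split']:g}"]
    lines += to_text(node["low"], indent + 1)
    lines += to_text(node["high"], indent + 1)
    return lines


# Made-up example plan: four rooms tiling an 8 m x 7 m footprint.
rooms = [
    Room("living", 0, 0, 5, 4),
    Room("kitchen", 5, 0, 3, 4),
    Room("bedroom", 0, 4, 4, 3),
    Room("bath", 4, 4, 4, 3),
]
layout_text = "\n".join(to_text(build_kd(rooms)))
print(layout_text)
```

The indentation carries the hierarchy: each split line is followed by its two subtrees, so neighbouring rooms tend to appear close together in the text. That locality is the property that plausibly makes a representation of this kind easier for a language model to process than a flat list of coordinates.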
Each generated scene incorporates an average of more than 15 manipulable objects and achieves a Footprint Object Density (FOD) of 4.16, a metric reflecting the concentration of items across furniture surfaces. All objects are natively compatible with simulation engines, enabling direct interaction for tasks...