Beijing Innovation Center teaches robots to live among us

chiny24.com 3 weeks ago

Embodied AI cannot develop in a vacuum. While large language models are trained on text from the internet, humanoid robots need data about the physical world: gravity, friction, shapes and the unpredictability of human environments. The Beijing Municipal Bureau of Economy and Information Technology has recently published a study summarizing the first four months of operation of the data collection and robot training base at the Beijing Humanoid Robot Innovation Center. The results of this pioneering project shed fresh light on how China is systematically tackling one of the biggest bottlenecks in the robotics industry: the shortage of high-quality training data.

Data mill for humanoid machines

Located in the Shougang Industrial Park in the Shijingshan district, the training base covers nearly 5,000 square meters. It is currently one of the largest and most versatile centres of its kind in China. Almost 10 percent of this area is occupied by a specialised motion capture studio.

The results of the first four months of work are impressive. The centre's team collected over 3 million data records from internal research and development and made more than 300,000 open-source data records available. This translated into tens of thousands of hours of high-quality training material, which has already been handed over to leading companies in the industry and to technology institutions. The centre is estimated to be able to produce more than 6 million data records per year, making it a national leader.

Dynamic scenarios instead of static laboratories

The biggest challenge in robot training is so-called “scenario fragmentation”. The real world is full of variables, and a robot trained in a sterile laboratory is often lost in a natural environment. To remedy this, the Beijing centre has recreated more than 30 typical scenarios from six key areas:

  • home,
  • supermarket,
  • offices,
  • industry,
  • medicine and
  • health care.

Most importantly, these spaces are not static “model rooms”. According to experts from the centre, lighting conditions, the layout of objects and the routes people walk can be adjusted dynamically to suit the needs of the training algorithms. This creates a flexible "data factory". For example, in the "child's room" scenario, a trainer (operator) uses VR equipment and sensors to guide the robot through turning a sock right side out. In another zone, robots practice making beds in a simulated nursing home, and in yet another they stack goods on supermarket shelves.

While these activities are performed, data are collected in real time on the robot's joint angles, motion trajectories, applied force and other physical parameters. Collecting data for one simple movement takes 300 to 1,000 repetitions before the algorithm can generalise the task and handle it later under slightly different conditions.
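The kind of record described here can be pictured with a minimal Python sketch. The schema is an assumption made for illustration: the names `Frame`, `Episode` and `hours_for_task`, the units, and the 30-second duration used in the note below are hypothetical and do not come from the centre.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Frame:
    """One time step of a demonstration (hypothetical schema)."""
    t: float                              # seconds since the episode started
    joint_angles: List[float]             # radians, one value per joint
    position: Tuple[float, float, float]  # end-effector position in metres
    force: float                          # measured force in newtons


@dataclass
class Episode:
    """One repetition of a single task, e.g. making a bed."""
    scenario: str
    task: str
    frames: List[Frame] = field(default_factory=list)

    def duration(self) -> float:
        """Length of the episode in seconds (0.0 if it has no frames)."""
        if not self.frames:
            return 0.0
        return self.frames[-1].t - self.frames[0].t


def hours_for_task(repetitions: int, seconds_per_repetition: float) -> float:
    """Recording time in hours needed to collect one task."""
    return repetitions * seconds_per_repetition / 3600.0
```

If one repetition is assumed to take 30 seconds, `hours_for_task(300, 30.0)` gives 2.5 hours and `hours_for_task(1000, 30.0)` about 8.3 hours of recording for a single simple movement.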

Fight for data quality and standardisation

In the first phase of operations, the centre struggled with poor data quality: the qualification rate was only about 50 percent. This was due to motion capture errors, lighting problems and imprecise synchronisation of multiple sensors. To remedy this, the centre developed and implemented rigorous standard procedures for collecting and labelling data and for monitoring its quality. As a result, the share of compliant, usable data has now stabilised at more than 95 percent.
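How a qualification rate of this kind could be computed is sketched below in Python. This is an illustration under stated assumptions, not the centre's procedure: the function names, the 10-millisecond tolerance and the rule that an episode passes only if all sensor timestamps agree within that tolerance are hypothetical.

```python
from typing import Dict, List

# One episode: sensor name -> list of frame timestamps in seconds.
SensorTimes = Dict[str, List[float]]


def max_sync_offset(episode: SensorTimes) -> float:
    """Largest timestamp spread between sensors at any shared frame index."""
    n = min(len(times) for times in episode.values())
    worst = 0.0
    for i in range(n):
        stamps = [times[i] for times in episode.values()]
        worst = max(worst, max(stamps) - min(stamps))
    return worst


def qualifies(episode: SensorTimes, tolerance_s: float = 0.010) -> bool:
    """Assumed rule: an episode passes if sensors stay within the tolerance."""
    return max_sync_offset(episode) <= tolerance_s


def qualification_rate(episodes: List[SensorTimes],
                       tolerance_s: float = 0.010) -> float:
    """Fraction of episodes that pass the synchronisation check."""
    if not episodes:
        return 0.0
    passed = sum(1 for e in episodes if qualifies(e, tolerance_s))
    return passed / len(episodes)
```

A real pipeline would add further checks, for example for the motion capture and lighting problems mentioned above; the synchronisation test is used here only because it is the easiest to express in a few lines.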

The centre's work also addresses another pressing problem in the industry: the "language barrier" between different robots. Different models have different sensor systems, joint degrees of freedom and control interfaces, so data collected by one robot is often useless to another. The standardised processes at the Beijing centre allow robots of different designs to collect data in parallel (the centre currently has more than 120 machines). For small and medium-sized robotics companies, using the centre's resources cuts the cost of data collection by at least 50 percent, which significantly lowers the barrier to entry.

Third phase of development and an industrial ecosystem

In late March 2026, the third phase of development of the Beijing Humanoid Robot Data Centre (Embodied AI) was officially inaugurated. At the same time, an industry alliance, the "Beijing Shijingshan Embodied AI Data Element Industry Alliance", was announced. It consists of more than 40 entities: government institutions, technology companies, universities and research institutes.

This alliance aims to pool computing, simulation and data processing resources. The centre is no longer limited to supplying "raw material". It is also working on fundamental technologies, such as a general motion control system, which has been released as open source. This gives researchers and companies a solid starting point and spares them the need to "reinvent the wheel". The centre has also completed its first round of market financing, raising over 700 million yuan (about PLN 360 million) from state funds and strategic investors such as Baidu.

Step towards a million hours

Beijing's actions show a clear paradigm shift. The robotics industry has understood that competitive advantage will be decided not only by perfect mechanics (hardware), but above all by the quality of the robot's "brain", trained on the right data (software and data). Real, physical data from machines operating in the real world (so-called first-hand data) cannot be replaced by computer simulations. Such data carry “physical intuition”: information about feedback, friction or unexpected interference.

The Beijing Humanoid Robot Innovation Center is currently pursuing an ambitious goal: to gather a globally unprecedented database of a million hours of high-quality training data. It is precisely such initiatives, combining huge infrastructure investments, process standardisation and cooperation across the entire ecosystem, that are meant to be the foundation on which China plans to build its dominance in the coming era of intelligent home and service machines.

Sources:

  • CCTV News (news.cctv.cn) – "300万条数据哪里来?揭秘人形机器人数据特训"
  • Securities Times (stcn.com) – “实探北京人形机器人创新中心数据基地,对外交付高质量实采数据超数万小时”
  • China Industry News (cinic.org.cn) – "北京人形机器人创新中心具身智能机器人数据采集与训练基地:迈向全球首个百万小时数据里程碑"
  • Pandaroid News (pandaroid-info.com) – 「中国人型ロボット、即戦力化への壁 北京データ訓練拠点が映す進化の現在」
  • Hanjoong Global diary (hanjoongglobal.com) – 「공장-가정-마트서 ‘수업’ 받는...베이징 훈련센터 현장」
  • Record China (recordchina.co.jp) – 「中国最大の人型ロボット訓練基地が北京に誕生」
  • Gasgoo Auto News (autonews.gasgoo.com) – "Beijing: Defining the ‘China Base’ of the Humanoid Robot Industry"

Leszek B. Glass


© www.chiny24.com
