Challenge Details
General Rules
Participants may compete alone or in teams.
Individuals may not be members of multiple teams.
Each team must designate a contact person and provide an email address at which they can be reached.
Cash prizes will be paid out to an account specified by the contact person of each team. It is the responsibility of the team's contact person to distribute the prize money according to their team-internal agreements.
To be eligible to win prizes, participants agree to release their code and models as well as publish a report about their methods so that the research community can reproduce and benefit from the results.
The organizers reserve the right to change the rules if doing so is necessary to resolve unforeseen problems.
The organizers reserve the right to disqualify participants who violate the rules or engage in scientific misconduct.
Challenge Structure
The challenge is divided into three stages: I. Warm-Up, II. Qualifying, and III. Tournament. At each stage, you can develop your agent locally and upload your solution to the cloud server (supported by Huawei Cloud) for evaluation. In the evaluator, the environments are modified in different ways to simulate various real-world problems. These modifications are not directly accessible to participants.
I. Warm-Up
In the warm-up stage, an ideal environment (no disturbance is added) with a 3-degree-of-freedom robot will be released to the public.
This stage aims to familiarize participants with the task, the environments, and the API.
Every interested person or team can download the docker image of the challenge and develop their algorithms. You can also clone the environment directly from GitHub, but we only accept submissions as docker images.
No registration is required at this stage. However, you must complete registration if you want your solution to be evaluated and ranked.
The evaluator on the cloud server is identical to the provided simulator. We recommend submitting your solution at least once to become familiar with the evaluation pipeline: upload your solution, download the dataset collected by the evaluator, and set up your analysis toolbox (a sketch of such a toolbox follows this list).
No report is required.
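For illustration, a minimal analysis toolbox might start like the sketch below. It assumes the downloaded dataset is a `.npz` archive with per-episode `success` and `penalty_points` fields; these names are our placeholders, not the challenge's actual schema.

```python
import numpy as np

# A minimal sketch, assuming the evaluator's dataset is downloaded as a .npz
# archive with per-episode fields; the field names ("success",
# "penalty_points") are illustrative, not the challenge's actual schema.
def summarize(dataset_path="dataset.npz"):
    data = np.load(dataset_path, allow_pickle=True)
    successes = np.asarray(data["success"])         # assumed: one bool per episode
    penalties = np.asarray(data["penalty_points"])  # assumed: penalty points per episode
    print(f"episodes:     {len(successes)}")
    print(f"success rate: {successes.mean():.3f}")
    print(f"penalty sum:  {penalties.sum():.1f}")

if __name__ == "__main__":
    summarize()
```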
In this stage, you will handle two simplified tasks:
Hit
The puck is randomly initialized on the left side of the table. The initial velocity is zero. The objective is to hit the puck to score a goal as fast as possible.
Defend
The puck is randomly initialized on the right side of the table with a random velocity heading left. The objective is to stop the puck on the right side of the table and prevent it from being scored.
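To make the interface concrete, here is a minimal agent sketch. The class structure, method names, and the 2 × n_joints action format are assumptions about the challenge framework; consult the official API documentation for the actual interface.

```python
import numpy as np

class ZeroAgent:
    """Minimal illustrative agent for the warm-up tasks. The challenge
    framework defines its own agent base class and observation layout;
    the method names and the 2 x n_joints action format used here are
    assumptions to be checked against the official API."""

    def __init__(self, n_joints=3):  # the warm-up stage uses a 3-DoF robot
        self.n_joints = n_joints

    def reset(self):
        # Called at the beginning of each episode.
        pass

    def draw_action(self, observation):
        # Called once per control step (50 Hz). Returns desired joint
        # positions (row 0) and velocities (row 1); here: hold still at zero.
        return np.zeros((2, self.n_joints))
```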
II. Qualifying
Registered teams will work with a general-purpose robot, KUKA iiwa14 LBR.
Solutions must be submitted as docker images. The collected dataset can be downloaded once the evaluation is complete.
The evaluator in the cloud server is modified to simulate different types of real-world problems. These modifications include but are not limited to disturbances, observation noise, loss of tracking, model mismatch, and imperfect tracking controller.
Each team can evaluate its solution once per task per day. Each evaluation consists of 1000 episodes, equivalent to roughly 2.8 hours of real-world experiments (1000 episodes × 500 steps × 0.02 s per step = 10,000 s).
Based on the evaluation metric, the agents will be categorized into three levels: Deployable, Improvable, and Nondeployable. Only Deployable and Improvable agents are qualified for the tournament stage. Further details about the evaluation metric can be found in Evaluation Metric.
A one-page report summarizing the applied approach is required at the end of this stage.
Tasks:
Hit
In this task, the opponent moves in a predictable pattern. The puck is randomly initialized with a small velocity. The objective is to score as many goals as possible.
Defend
Same as above.
Prepare
The puck is initialized close to the table's boundary, in a position unsuitable for hitting. The objective is to move the puck into a good position for hitting. The puck is not allowed to cross the middle line.
III. Tournament
Teams at the Deployable or Improvable level qualify for the tournament stage. At most 16 teams will participate, determined by the success-rate ranking of the qualifying stage. Each team's best-performing submission (based on the average success rate across all tasks) will be considered.
Each team will develop a single agent that is able to play the whole game.
The game is played based on the Game Rules.
A hard-coded baseline agent will be provided to test and validate your agents.
During the preparation stage, participants can upload their agents and compete against the baseline agent. The best result will be used for the ranking and the match pairings.
Each team should submit a single agent as well as a two-page report by the deadline before the tournament starts.
A double round-robin schedule will apply for the final tournament. After the first round, participants will have two weeks to adjust and improve their agent.
The match follows the regular air hockey rules with slight modifications.
Each match lasts 15 minutes (45,000 steps).
The agent will lose the game if it is classified as non-deployable during the match.
Each agent has 15 seconds (750 steps) to execute a shot that crosses the centerline. The 15 seconds begin when the puck enters and remains on that player’s side of the centerline. A violation of this rule is a foul.
For every three accumulated fouls, one point is deducted from the agent's game score.
A match win earns 3 points, a draw 1 point, and a loss 0 points (see the scoring sketch after this list).
The ranking of the tournament will be determined by the total score of the two rounds.
In case of a software crash, participants will have another opportunity to fix the issue and rerun the failed match.
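As a worked illustration of the foul and point rules above, the following sketch tallies match points and applies the three-foul deduction (the function names are ours, not part of the challenge API):

```python
def match_points(result):
    """Tournament points for a single match: win 3, draw 1, loss 0."""
    return {"win": 3, "draw": 1, "loss": 0}[result]

def game_score(goals, fouls):
    """Every three accumulated fouls deduct one point from the game score."""
    return goals - fouls // 3

# Example: 4 goals with 7 fouls yields an effective game score of 4 - 2 = 2.
assert game_score(4, 7) == 2
assert match_points("draw") == 1
```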
Evaluation Metrics
Success Rate:
Hit: The puck enters the scoring zone with a speed above the threshold.
Defend: The final speed of the puck (at the end of the episode) is below the threshold, and the puck does not bounce back to the other side of the table.
Prepare: The final position of the puck is within a predefined area and its final speed is below the threshold. (A sketch of these checks follows this list.)
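These criteria can be checked offline roughly as in the sketch below; the threshold values are placeholders, since the official ones are defined by the evaluator.

```python
# Illustrative thresholds only; the official values are set by the evaluator.
HIT_SPEED_MIN = 0.5      # [m/s] minimum puck speed when entering the scoring zone
DEFEND_SPEED_MAX = 0.1   # [m/s] maximum final puck speed for a successful defend
PREPARE_SPEED_MAX = 0.1  # [m/s] maximum final puck speed for a successful prepare

def hit_success(speed_at_goal):
    return speed_at_goal > HIT_SPEED_MIN

def defend_success(final_speed, bounced_back):
    return final_speed < DEFEND_SPEED_MAX and not bounced_back

def prepare_success(in_target_area, final_speed):
    return in_target_area and final_speed < PREPARE_SPEED_MAX
```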
Deployability: Each deployability metric is assigned one or more penalty points based on the level of risk. Penalty points are counted when the corresponding constraint of the evaluation metric is violated. Each metric is counted at most once per episode (an episode lasts at most 500 steps).
Violations of the End-Effector's Position Constraints [3]: The desired x-y position of the end-effector should remain within the boundaries of the table, and its z-position should remain within the specified range.
Violations of the Joint Position Limit Constraints [2]: The desired position command should not exceed the position limits.
Violations of the Joint Velocity Limit Constraints [1]: The desired velocity command should not exceed the velocity limits.
Computation Time [0.5-2]: The computation time at each step should be shorter than 0.02s.
Jerk [1]: A smooth joint trajectory is desired to prevent damage to the actuator. (Jerk will not be evaluated for the rest of the competition.)
Each evaluation consists of 1000 episodes. The success rates are used to rank the leaderboards. The deployability score is the sum of the penalty points from all episodes. The ranking is divided into three categories based on the deployability score (see the sketch after this list):
Deployable: [ 0 <= deployability score <= 500 ]
Improvable: [ 500 < deployability score <= 1500 ]
Non-deployable: [ 1500 < deployability score ]
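In code, the penalty weights and the classification reduce to a simple lookup and threshold check, as in this sketch:

```python
# Penalty points per violated constraint, as listed above (computation time
# is weighted 0.5-2 depending on how severely the budget is exceeded).
PENALTY_POINTS = {
    "ee_position": 3,
    "joint_position": 2,
    "joint_velocity": 1,
}

def deployability_level(score):
    """Classify the summed penalty score of a 1000-episode evaluation."""
    if score <= 500:
        return "Deployable"
    if score <= 1500:
        return "Improvable"
    return "Non-deployable"

assert deployability_level(500) == "Deployable"
assert deployability_level(1200) == "Improvable"
```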
Simulator
The simulation setup of the Robot Air Hockey Challenge is summarized as follows:
Simulator: MuJoCo
Simulation frequency: 1000 Hz
Control Frequency: 50 Hz
Observation:
Robot:
Joint Positions [radians]
Joint Velocities [radians/s]: Computed by finite difference
Puck (rel. to the robot base frame):
X-Y Position [m]
Yaw Angle [radians]
X-Y Velocity [m/s]
Yaw Velocity [radians/s]: Computed by finite difference (see the sketch after this list)
Opponent (if applicable):
End-Effector's X-Y Position [m]
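Because velocities are obtained by finite differences, noise in the position readings carries over into the velocity estimates. For intuition, the computation looks roughly like the sketch below; the difference step used by the actual simulator is an assumption here.

```python
import numpy as np

CTRL_DT = 1.0 / 50.0  # control period; the actual difference step is an assumption

def finite_difference_velocity(q_prev, q_curr, dt=CTRL_DT):
    """Backward finite difference, as used for joint and puck velocities."""
    return (np.asarray(q_curr) - np.asarray(q_prev)) / dt

def yaw_velocity(yaw_prev, yaw_curr, dt=CTRL_DT):
    # Wrap the angle difference into [-pi, pi) before differencing, so a jump
    # across the +/- pi boundary does not produce a spurious velocity spike.
    dyaw = (yaw_curr - yaw_prev + np.pi) % (2 * np.pi) - np.pi
    return dyaw / dt
```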
Control Action:
We configure the simulation to be as close to the real-world setup as possible. The robot is torque-controlled at 1000 Hz. A feed-forward plus PD controller computes the joint torques that track the desired joint positions and velocities, as sketched below.
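A minimal per-joint version of such a tracking law is sketched here; the gains and the feed-forward term are placeholders, not the simulator's actual parameters.

```python
def tracking_torque(tau_ff, q_des, qd_des, q, qd, kp, kd):
    """Feed-forward + PD law, applied per joint at 1000 Hz:
    tau = tau_ff + Kp * (q_des - q) + Kd * (qd_des - qd)."""
    return tau_ff + kp * (q_des - q) + kd * (qd_des - qd)
```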
Because the agent's control frequency is 50 Hz while the simulation runs at 1000 Hz, adjacent control commands from the agent are interpolated over 20 simulation steps (a linear version is sketched after this paragraph). We provide different types of polynomial interpolation depending on your requirements. Details about the action interface can be found here.
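For intuition, a linear variant of this interpolation could look like the following sketch (the challenge provides its own polynomial interpolators; this is not their implementation):

```python
import numpy as np

SIM_STEPS_PER_CTRL = 20  # 1000 Hz simulation / 50 Hz control

def interpolate_command(q_from, q_to, n=SIM_STEPS_PER_CTRL):
    """Linearly interpolate between two consecutive 50 Hz joint commands,
    yielding one setpoint per 1000 Hz simulation step (the last one lands
    exactly on the new command)."""
    q_from, q_to = np.asarray(q_from), np.asarray(q_to)
    alphas = np.linspace(0.0, 1.0, n, endpoint=False) + 1.0 / n
    return q_from + alphas[:, None] * (q_to - q_from)
```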
Real Robot Experiments
We will invite the top 3 winners to our lab to test their solutions. These teams will compete against each other and against our baseline. We will also live-stream and record the competition.
Please note that the awards are not based on the results of the real-robot competition.