Increasing robustness of OpenCV ArUco Pose Estimation

I’m currently working on a toolset for ArUco marker pose estimation for an AR application.
The main issue I have found so far is that the signal can be very shaky at times. I get a lot of flickering, especially on the Z axis and in the rotation.

The initial simple solution is to filter the positions using a Filter or Lag CHOP, but this of course has several drawbacks. For a multi-sample channel it is not possible to use the quaternion-blend method in the Lag CHOP. The other issue is the quite heavy lag that gets introduced.
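To illustrate what I mean by blending rotations instead of filtering each channel separately, here is a minimal sketch in plain Python. It assumes the pose is already available as a position tuple and a unit quaternion (w, x, y, z). The function names and the `alpha` parameter are made up for this example and are not part of OpenCV or TouchDesigner.

```python
import math

def normalize(q):
    """Scale a quaternion to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def smooth_quat(prev, new, alpha):
    """Move prev toward new by alpha (0..1) using a normalized lerp."""
    # q and -q describe the same rotation, so flip the new sample
    # onto the same hemisphere as the previous one before blending.
    dot = sum(a * b for a, b in zip(prev, new))
    if dot < 0.0:
        new = tuple(-c for c in new)
    blended = tuple((1.0 - alpha) * a + alpha * b for a, b in zip(prev, new))
    return normalize(blended)

def smooth_pos(prev, new, alpha):
    """Exponential smoothing of a position, component by component."""
    return tuple((1.0 - alpha) * a + alpha * b for a, b in zip(prev, new))
```

A small `alpha` gives a smoother but laggier result, so this does not remove the lag problem, it only makes the rotation blend well behaved.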

Sadly, the pose estimation does not provide a confidence value that could be used to discard unreliable poses.
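One possible stand-in for a confidence value would be the reprojection error: project the marker corners back into the image with the estimated pose and compare them with the detected corners. The sketch below is plain Python and rests on several assumptions: the rotation is given as a 3x3 matrix, lens distortion is ignored, and the corner order is top-left, top-right, bottom-right, bottom-left with the marker origin at its center. The function name and parameters are hypothetical.

```python
import math

def reprojection_rms(R, t, side, corners_px, fx, fy, cx, cy):
    """RMS pixel distance between detected and reprojected marker corners."""
    h = side / 2.0
    # Marker corners in the marker's own coordinate system (assumed order).
    obj = [(-h, h, 0.0), (h, h, 0.0), (h, -h, 0.0), (-h, -h, 0.0)]
    sq = 0.0
    for (X, Y, Z), (u, v) in zip(obj, corners_px):
        # Transform the corner into camera space: p_cam = R * p_obj + t
        xc = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
        yc = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
        zc = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
        # Pinhole projection without distortion.
        pu = fx * xc / zc + cx
        pv = fy * yc / zc + cy
        sq += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(sq / len(obj))
```

Poses whose error exceeds some pixel threshold could then be dropped. A low error does not guarantee a good pose though, since an ambiguous flipped pose can reproject almost as well as the correct one.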

So my ideas are the following:

  • Use more than one camera for the tracking and use the mean of the values.
  • Filter the values beforehand and omit all value changes that are smaller than a certain threshold.
  • Do the pose estimation more than once per frame and take the mean.
  • Find a more robust solution using something other than OpenCV.
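The averaging and threshold ideas above could look roughly like this in plain Python. This is only a sketch: it assumes each pose is a position tuple plus a unit quaternion (w, x, y, z), and that poses from several cameras have already been transformed into one common coordinate system. All names are made up for illustration.

```python
import math

def average_positions(positions):
    """Component-wise mean of several 3D positions."""
    count = len(positions)
    return tuple(sum(p[i] for p in positions) / count for i in range(3))

def average_quats(quats):
    """Approximate mean rotation; only reasonable if the rotations are close."""
    ref = quats[0]
    acc = [0.0, 0.0, 0.0, 0.0]
    for q in quats:
        # q and -q are the same rotation, so align signs with the first sample.
        sign = -1.0 if sum(a * b for a, b in zip(ref, q)) < 0.0 else 1.0
        for i in range(4):
            acc[i] += sign * q[i]
    n = math.sqrt(sum(c * c for c in acc))
    return tuple(c / n for c in acc)

def deadband(prev, new, threshold):
    """Keep prev unless new moved at least threshold away from it."""
    if math.dist(prev, new) < threshold:
        return prev
    return new
```

The deadband removes small jitter without adding lag, but slow real movement shows up as small jumps once the threshold is crossed.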

I’m open to hearing about your experiences and any ideas on how to make it smoother.