The Contact Problem

State-of-the-art RGB-D perception systems achieve 2-5mm localization accuracy on objects in a scene. This sounds impressive until you consider what precision assembly requires: peg-in-hole insertion tolerances are typically +/-0.5mm, USB-A connector insertion is +/-0.3mm, and SIM card slots require +/-0.1mm. The gap between what vision can resolve and what contact tasks require is roughly an order of magnitude, and better camera resolution alone does not close it.

The core issue is that contact itself changes the system state in ways that are invisible to a camera. When a gripper finger makes contact with a surface, forces and deformations occur at a sub-millimeter scale. The camera sees a gripper approaching an object; it cannot see whether the contact is stable, slipping, or about to cause the object to rotate out of the grasp.

Compliant vs. Stiff Control: The Fundamental Choice

Before discussing force sensing hardware, it is essential to understand the two fundamental control philosophies for contact tasks:

Stiff (position) control: The robot tracks a commanded position trajectory regardless of external forces. If the end-effector encounters an obstacle, the controller drives high forces to try to maintain the commanded position. This is the default control mode on most industrial robots and is appropriate for free-space motion where contact is not expected. During contact, stiff control can damage the robot, the object, or both — a 1mm position error during a 0.5mm-clearance insertion produces contact forces that can reach hundreds of Newtons.

Compliant (impedance/admittance) control: The robot behaves like a virtual spring-damper system at the end-effector. When it encounters an external force, it yields — moving in the direction of the force proportionally to its magnitude. The stiffness and damping parameters determine how much the robot "gives" under load. Low stiffness (50-200 N/m) makes the robot very compliant, useful for contact exploration. High stiffness (2000-5000 N/m) makes it nearly rigid while still protecting against force spikes.
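To make the contrast between the two philosophies concrete, the sketch below treats each controller as an ideal spring and computes the steady-state contact force for a 1mm position error. The stiff-loop value of 200,000 N/m is an assumed illustrative figure, not a number from this article or from any specific robot; the two impedance values are taken from the ranges above.

```python
def contact_force_n(stiffness_n_per_m: float, position_error_m: float) -> float:
    """Steady-state force of an ideal spring-like controller: F = K * dx."""
    return stiffness_n_per_m * position_error_m


POSITION_ERROR_M = 0.001  # the 1 mm position error used in the text

# (label, stiffness in N/m)
CASES = [
    ("stiff position loop (ASSUMED 200,000 N/m)", 200_000.0),
    ("high impedance stiffness (5000 N/m, from the text)", 5000.0),
    ("low impedance stiffness (200 N/m, from the text)", 200.0),
]

for label, stiffness in CASES:
    force = contact_force_n(stiffness, POSITION_ERROR_M)
    print(f"{label}: {force:.1f} N")
```

Under these assumptions the same 1mm error maps to roughly 200 N, 5 N, and 0.2 N respectively, which is why the choice of stiffness matters more than the size of the position error itself.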

Impedance Control Theory

Impedance control is the most widely used force control framework in manipulation research. The controller defines a desired mechanical impedance (relationship between position error and applied force) at the end-effector:

F = K * (x_desired - x_actual) + D * (v_desired - v_actual)

Where:
  K = stiffness matrix (6x6, diagonal for Cartesian)
  D = damping matrix (6x6)
  x = end-effector pose (position + orientation)
  v = end-effector velocity

Typical values for manipulation:
  K_translation = 200-1000 N/m (compliant approach)
  K_rotation = 10-50 Nm/rad (allow wrist compliance)
  D_translation = 2*sqrt(K*m) (critical damping)
  D_rotation = 2*sqrt(K*I) (critical damping)
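The control law above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a controller for any particular arm: K and D are stored as diagonals, the effective mass (2 kg) and rotational inertia (0.05 kg*m^2) are assumed values chosen only to evaluate the critical-damping formulas, and the orientation error is handled by direct subtraction, which is only reasonable for small angles.

```python
import numpy as np


def critical_damping(stiffness: np.ndarray, inertia: np.ndarray) -> np.ndarray:
    """Per-axis critical damping, D = 2 * sqrt(K * m), for diagonal K."""
    return 2.0 * np.sqrt(stiffness * inertia)


def impedance_wrench(k_diag, d_diag, x_des, x_act, v_des, v_act) -> np.ndarray:
    """F = K * (x_des - x_act) + D * (v_des - v_act) with diagonal K and D.

    Poses are 6-vectors [x, y, z, rx, ry, rz]; subtracting the rotational
    part directly is only valid for small orientation errors.
    """
    return k_diag * (x_des - x_act) + d_diag * (v_des - v_act)


# Stiffness picked from the typical ranges listed in the text.
k_diag = np.array([500.0, 500.0, 500.0, 30.0, 30.0, 30.0])
# ASSUMPTION: effective mass (kg) and rotational inertia (kg*m^2) at the tool.
inertia = np.array([2.0, 2.0, 2.0, 0.05, 0.05, 0.05])
d_diag = critical_damping(k_diag, inertia)

x_des = np.zeros(6)
x_act = np.array([0.0, 0.0, -0.002, 0.0, 0.0, 0.0])  # tool sits 2 mm below the target in z
v_des = np.zeros(6)
v_act = np.zeros(6)

wrench = impedance_wrench(k_diag, d_diag, x_des, x_act, v_des, v_act)
print("commanded wrench [N, N, N, Nm, Nm, Nm]:", wrench)
```

On a real arm this wrench would typically be mapped to joint torques through the transpose of the manipulator Jacobian; that step is omitted here to keep the sketch self-contained.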

The key insight: impedance control does not require a force sensor. The control law needs only the commanded pose and the measured pose (from joint encoders), and external contact forces can be estimated from motor torques and a dynamics model. However, adding a wrist F/T sensor enables explicit force measurement, which is more accurate than dynamics-model-based estimation (especially for arms with significant joint friction, which includes most arms under $50K).
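One common way to do sensorless estimation is to take the residual joint torque (measured torque minus what the dynamics model predicts) and solve tau = J^T * F for the external wrench. The sketch below assumes that residual is already available; the random 6x7 Jacobian, the 5 N test load, and the 0.3 Nm friction error are placeholders chosen for illustration, not values from any real arm.

```python
import numpy as np


def estimate_external_wrench(jacobian: np.ndarray, tau_residual: np.ndarray) -> np.ndarray:
    """Least-squares solution of tau_residual = J^T * F_ext for the wrench F_ext."""
    wrench, *_ = np.linalg.lstsq(jacobian.T, tau_residual, rcond=None)
    return wrench


rng = np.random.default_rng(0)
jacobian = rng.standard_normal((6, 7))  # placeholder 6x7 Jacobian for a 7-joint arm
true_wrench = np.array([0.0, 0.0, -5.0, 0.0, 0.0, 0.0])  # 5 N pushing along -z

tau_ideal = jacobian.T @ true_wrench
# ASSUMPTION: unmodeled joint friction shows up as a torque error of up to 0.3 Nm per joint.
tau_with_friction = tau_ideal + rng.uniform(-0.3, 0.3, size=7)

print("ideal estimate:      ", estimate_external_wrench(jacobian, tau_ideal))
print("with friction error: ", estimate_external_wrench(jacobian, tau_with_friction))
```

The second estimate drifts away from the true wrench even though the only change is a small per-joint torque error, which illustrates why a wrist F/T sensor tends to be more accurate on arms with noticeable joint friction.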

Force Control Modes

Three operational modes cover the majority of contact manipulation tasks:

1. Pure impedance control: Set a desired stiffness in all 6 Cartesian axes. The robot acts like a spring — compliant when encountering unexpected contact, stiff enough to maintain trajectory in free space. Use this as the default mode for any task involving potential contact. No force sensor required (but one improves accuracy).

2. Force/position hybrid control: Control some Cartesian axes in position mode and others in force mode. The classic example is surface polishing: control X and Y in position mode (follow the surface trajectory) while controlling Z in force mode (maintain 5N contact force against the surface). This requires a force sensor to measure the contact force in the force-controlled axes.

3. Force-guided search: Use force feedback to guide the robot toward a target configuration. For peg-in-hole insertion: move the peg down until a Z-force spike is detected (contact with the surface), then apply a spiral search pattern in X-Y while maintaining low Z-force until the peg slips into the hole (detected by a sudden decrease in Z-force and increase in insertion depth). This is the standard industrial approach for precision insertion without precise vision.
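The spiral phase of force-guided search can be sketched as follows. This is a simulation-only illustration: `simulated_z_force` is a hypothetical stand-in for moving the robot to an X-Y offset and reading the F/T sensor, and the spiral pitch, step size, hole offset, and capture radius are assumed values. Hole entry is detected as the Z-force falling below a fraction of the held force.

```python
import math
from typing import Callable, Iterator, Optional, Tuple


def spiral_waypoints(pitch_m: float, max_radius_m: float,
                     step_m: float) -> Iterator[Tuple[float, float]]:
    """Archimedean spiral in X-Y, with points spaced roughly step_m apart."""
    theta = 0.0
    while True:
        radius = pitch_m * theta / (2.0 * math.pi)
        if radius > max_radius_m:
            return
        yield (radius * math.cos(theta), radius * math.sin(theta))
        # Advance the angle so the arc length between points is about step_m.
        theta += step_m / max(radius, step_m)


def spiral_search(read_z_force: Callable[[float, float], float],
                  hold_force_n: float = 3.0, drop_fraction: float = 0.5,
                  pitch_m: float = 0.0002, max_radius_m: float = 0.003,
                  step_m: float = 0.0001) -> Optional[Tuple[float, float]]:
    """Return the first X-Y offset where Z-force drops, or None if never found."""
    for x, y in spiral_waypoints(pitch_m, max_radius_m, step_m):
        if read_z_force(x, y) < drop_fraction * hold_force_n:
            return (x, y)
    return None


def simulated_z_force(x: float, y: float) -> float:
    """HYPOTHETICAL sensor model: 3 N on the surface, 0 N once over the hole."""
    hole_x, hole_y, capture_radius_m = 0.0008, -0.0005, 0.0002
    over_hole = math.hypot(x - hole_x, y - hole_y) < capture_radius_m
    return 0.0 if over_hole else 3.0


found = spiral_search(simulated_z_force)
print("hole detected at offset (m):", found)
```

A production version would also confirm the increase in insertion depth mentioned above before declaring success, since a force drop alone can be caused by the peg sliding off an edge.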

| Control Mode | Sensor Required | Typical Tasks | Precision | Implementation Complexity |
| --- | --- | --- | --- | --- |
| Pure impedance | None (optional F/T) | Safe grasping, contact exploration | +/-2-5mm | Low |
| Force/position hybrid | Wrist 6-axis F/T | Polishing, deburring, surface following | +/-0.5N force | Medium |
| Force-guided search | Wrist 6-axis F/T | Peg-in-hole, connector insertion | +/-0.1mm | Medium-high |
| Tactile-guided manipulation | Tactile array (GelSight/XELA) | In-hand rotation, slip recovery | 3mm spatial | High |

Contact Mechanics Fundamentals

Three concepts from contact mechanics are essential for understanding why force sensing matters:

  • Friction Cone: A contact force must lie within the friction cone to avoid slip. The cone half-angle is arctan(mu), where mu is the coefficient of friction. For rubber gripper pads on steel (common in industrial settings), mu = 0.3. For silicone on glass, mu = 0.5. A measured contact force that approaches the edge of the friction cone predicts imminent slip, information that no camera can provide.
  • Force Closure vs. Form Closure: A form closure grasp constrains the object by geometry alone (e.g., gripping a cylinder in a matched V-groove). A force closure grasp constrains the object by applying contact forces from multiple directions such that their friction cones together prevent any rigid body motion. Force sensors let you verify you have achieved force closure; vision cannot.
  • Coulomb Friction Model: The standard model for predicting slip: |F_tangential| <= mu * F_normal. Measuring F_normal at the contact point lets you compute the maximum tangential force before slip, which is the fundamental constraint in dexterous manipulation.
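The friction cone and Coulomb conditions above reduce to two short formulas. The sketch below evaluates them for the friction coefficients quoted in the list; the 10 N normal force and 2 N tangential load are assumed example values.

```python
import math


def friction_cone_half_angle_deg(mu: float) -> float:
    """Half-angle of the friction cone, arctan(mu), in degrees."""
    return math.degrees(math.atan(mu))


def slip_margin_n(f_normal_n: float, f_tangential_n: float, mu: float) -> float:
    """Tangential force left before slip: mu * F_normal - |F_tangential|.

    A value at or below zero means the Coulomb model predicts slip.
    """
    return mu * f_normal_n - abs(f_tangential_n)


for label, mu in [("rubber on steel", 0.3), ("silicone on glass", 0.5)]:
    angle = friction_cone_half_angle_deg(mu)
    # ASSUMPTION: 10 N measured normal force, 2 N measured tangential load.
    margin = slip_margin_n(10.0, 2.0, mu)
    print(f"{label}: half-angle {angle:.1f} deg, slip margin {margin:.1f} N")
```

In a controller, the slip margin would usually be recomputed every cycle from the measured normal and tangential forces, with grip force raised whenever the margin falls below a chosen safety threshold.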

Where Vision Fails in Practice

Three specific failure modes dominate in real manipulation systems:

  • Occlusion During Approach: As the gripper approaches an object for grasping, the fingers occlude the contact surface. The camera's view of exactly where contact will occur disappears in the last 2-5cm of approach — the most critical phase for precise placement.
  • Pixel Noise at Close Range: Depth sensors lose accuracy below 10-15cm working distance due to structured light interference and IR reflections from the gripper itself. Wrist-mounted cameras at typical manipulation distances therefore work at the edge of their reliable range.
  • Depth Shadow on Grasp Surface: The gripper structure casts depth shadows on the grasp target surface, creating missing data in the point cloud precisely where you need the best localization.

Force Sensor Types: Detailed Comparison

| Sensor Type | What It Measures | Precision | Bandwidth | Cost | Best For |
| --- | --- | --- | --- | --- | --- |
| Wrist 6-axis F/T (ATI Mini45) | Full 6-DOF contact wrench | +/-0.05N / +/-0.5Nmm | 7 kHz | $3K-8K | Insertion, assembly, grinding |
| Wrist 6-axis F/T (Robotiq FT300) | Full 6-DOF contact wrench | +/-0.1N / +/-1Nmm | 100 Hz | $1.5K-3K | General manipulation, cobots |
| Joint torque estimation | Estimated external torques | +/-0.5Nm | 1 kHz | $0 (software) | Collision detection, rough contact |
| Fingertip F/T (Bota SensONE) | 6-axis force at fingertip | +/-0.01N | 800 Hz | $2K-5K/finger | Delicate grasping, surface following |
| Tactile array (XELA uSkin) | 3-axis force per taxel | 3mm spatial | 100-500 Hz | $800-3K | Slip detection, contact localization |
| Optical tactile (GelSight Mini) | High-res contact geometry | 25 micron depth | 30-60 Hz | $300-600 | Texture recognition, object ID |
| Paxini tactile sensor | Distributed normal + shear | 1mm spatial | 200 Hz | $500-1.5K | Slip prediction, grasping |

Real Examples Where Contact Sensing Enables Tasks

The following are concrete examples from published work and SVRC internal projects where adding force/tactile sensing enabled tasks that were impossible with vision alone:

Peg-in-hole insertion (0.5mm clearance): A Franka Research 3 arm with ATI Mini45 wrist F/T sensor. The controller uses a spiral search strategy: descend until Z-force exceeds 5N (contact with surface), then spiral outward in X-Y while maintaining 3N Z-force until Z-force drops (peg entering hole). Success rate: 89% with F/T vs. 62% vision-only. The F/T sensor detects the hole entry within 0.1mm, which is below the resolution of any practical depth camera setup.

PCB connector assembly: USB-C connectors require +/-0.3mm alignment and 10-15N insertion force applied in the correct direction. A vision-only policy can localize the connector to within 1-2mm but cannot control insertion force. Adding a wrist F/T sensor enables force-limited insertion with compliance — the controller applies 12N in the insertion direction while allowing +/-5mm compliance in the perpendicular axes. Success rate jumps from 45% (vision-only) to 92% (vision + F/T).

Egg grasping and transfer: Eggs break at approximately 30-50N force depending on orientation. A standard parallel-jaw gripper commanded to close with constant force will either fail to grasp (too little force) or break the egg (too much). Adding a fingertip force sensor enables a force-limited grasp that closes until 8N is detected, holds at that force, and increases to 12N only during acceleration phases to prevent slip. Zero egg breakage in 200 trials vs. 15% breakage without force sensing.
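The force-limited grasp logic can be sketched as two small functions: one that closes the gripper until the target force is sensed, and one that selects the grip-force setpoint based on arm acceleration. The 8N and 12N values come from the example above; the simulated gripper, its contact stiffness, and the 1 m/s^2 acceleration threshold are assumptions introduced only so the sketch runs on its own.

```python
from typing import Callable


def close_until_force(read_force_n: Callable[[], float],
                      step_closed: Callable[[], None],
                      target_n: float = 8.0,
                      max_steps: int = 2000) -> bool:
    """Close in small steps until the fingertip force reaches target_n."""
    for _ in range(max_steps):
        if read_force_n() >= target_n:
            return True
        step_closed()
    return False


def grip_force_setpoint_n(accel_m_s2: float,
                          accel_threshold_m_s2: float = 1.0,
                          hold_n: float = 8.0,
                          boosted_n: float = 12.0) -> float:
    """Hold force normally, boosted force while the arm is accelerating."""
    return boosted_n if abs(accel_m_s2) > accel_threshold_m_s2 else hold_n


class SimulatedGripper:
    """ASSUMPTION: toy contact model, force = stiffness * squeeze distance."""

    def __init__(self, gap_m: float = 0.060, object_width_m: float = 0.045,
                 stiffness_n_per_m: float = 2000.0, step_m: float = 0.0001):
        self.gap_m = gap_m
        self.object_width_m = object_width_m
        self.stiffness_n_per_m = stiffness_n_per_m
        self.step_m = step_m

    def step_closed(self) -> None:
        self.gap_m -= self.step_m

    def read_force_n(self) -> float:
        squeeze_m = max(0.0, self.object_width_m - self.gap_m)
        return self.stiffness_n_per_m * squeeze_m


gripper = SimulatedGripper()
grasped = close_until_force(gripper.read_force_n, gripper.step_closed)
print("grasped:", grasped, "force at stop [N]:", round(gripper.read_force_n(), 2))
print("setpoint at rest [N]:", grip_force_setpoint_n(0.0))
print("setpoint while accelerating [N]:", grip_force_setpoint_n(3.0))
```

Both setpoints stay well below the 30-50N breakage range quoted above, which is the margin that makes the grasp safe regardless of small overshoot in the closing loop.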

Surface following (polishing, wiping): Maintaining consistent 5N contact force while following an irregular surface shape. Vision can estimate the surface geometry before contact but cannot measure the actual contact force during the wipe. A force/position hybrid controller holds 5N in the surface-normal direction while following a trajectory in the tangential plane. Force variation: +/-0.3N with F/T control vs. +/-3N with position control (estimated from dynamics model).
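A force/position hybrid controller of this kind can be sketched with a selection vector that marks each Cartesian axis as position-controlled (1) or force-controlled (0). The version below is a simplified, velocity-level illustration and not the torque-level formulation of the original hybrid control literature; the proportional gains are assumed values, and the Z axis is assumed to point into the surface so that a positive Z velocity increases contact force.

```python
import numpy as np


def hybrid_velocity_command(selection, x_des, x_act, f_des, f_meas,
                            kp_pos: float = 2.0, kp_force: float = 0.002) -> np.ndarray:
    """Cartesian velocity command for hybrid force/position control.

    selection[i] == 1 -> axis i tracks position (gain kp_pos, in 1/s)
    selection[i] == 0 -> axis i tracks force (gain kp_force, in m/(s*N))
    """
    s = np.asarray(selection, dtype=float)
    v_position = kp_pos * (np.asarray(x_des, dtype=float) - np.asarray(x_act, dtype=float))
    v_force = kp_force * (np.asarray(f_des, dtype=float) - np.asarray(f_meas, dtype=float))
    return s * v_position + (1.0 - s) * v_force


# X and Y follow the wipe trajectory; Z holds the 5 N contact force from the text.
selection = [1, 1, 0]
x_des = [0.10, 0.20, 0.00]
x_act = [0.09, 0.20, 0.05]   # the Z position is ignored on the force-controlled axis
f_des = [0.0, 0.0, 5.0]
f_meas = [0.0, 0.0, 3.0]     # currently pressing with only 3 N

command = hybrid_velocity_command(selection, x_des, x_act, f_des, f_meas)
print("velocity command [m/s]:", command)
```

Here the X axis moves toward its position target while the Z axis advances into the surface to raise the contact force from 3N toward 5N; the measured Z position has no influence on the command.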

Key Papers and Systems

  • Hogan, 1985 — "Impedance Control": The foundational paper establishing impedance control theory. Still the primary reference for understanding compliant manipulation.
  • Raibert & Craig, 1981 — "Hybrid Position/Force Control": Introduced the concept of independently controlling position and force in orthogonal task-space directions. The basis for all modern force/position hybrid controllers.
  • Calandra et al., 2018 — "More Than a Feeling": Demonstrated that GelSight tactile sensing enables grasping of objects that vision cannot perceive (transparent, reflective, thin). The paper that established tactile sensing as a practical tool rather than a research curiosity.
  • Qi et al., 2023 — "General In-Hand Object Rotation": Used tactile sensing with RL to achieve dexterous in-hand rotation of arbitrary objects. Showed that tactile feedback is necessary for in-hand manipulation — vision-only policies plateau at 30% success on rotation tasks while tactile-enabled policies reach 85%.
  • Suresh et al., 2024 — "Neural Contact Fields": Learned contact prediction from tactile sensor readings, enabling model-based planning for contact-rich manipulation. Represents the frontier of combining learned models with force/tactile data.

Performance Impact: The Numbers

The performance difference between force-enabled and vision-only manipulation is large and well-documented in robotics literature. For peg-in-hole insertion with 0.5mm clearance: force/torque-guided approaches achieve 89% success rate vs. 62% for vision-only, based on ATI-sponsored benchmarks on a Franka Research 3 arm. The delta is larger for tighter tolerances.

Cable insertion — arguably the most economically important manipulation task for electronics manufacturing — shows a similar gap: 94% success with wrist F/T control vs. 71% vision-only. The difference compounds in production: a 71% success rate requires 1.4 attempts per insertion on average; at 94%, it is 1.06. At 10,000 insertions per day, this translates directly to throughput capacity.
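The attempts-per-insertion figures follow from treating each attempt as an independent trial, so the expected number of attempts is 1/p. The short calculation below reproduces them and scales to the 10,000 insertions per day mentioned above; independence between retries is an assumption, since real failures are often correlated.

```python
def expected_attempts(success_rate: float) -> float:
    """Mean attempts per success for independent trials (geometric distribution)."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


INSERTIONS_PER_DAY = 10_000

for label, rate in [("vision-only", 0.71), ("wrist F/T", 0.94)]:
    attempts = expected_attempts(rate)
    daily = attempts * INSERTIONS_PER_DAY
    print(f"{label}: {attempts:.2f} attempts per insertion, "
          f"about {daily:,.0f} attempts per day")
```

Under that assumption the vision-only cell performs roughly 3,400 more attempts per day than the F/T-equipped cell to deliver the same number of completed insertions.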

For manipulation tasks where tolerances exceed 5mm and compliance is not required, vision-only approaches are often sufficient. But for anything involving constrained fit, sliding contact, or delicate force limits, force sensing is not optional; it is the enabling technology.

SVRC stocks a range of force-enabled manipulation hardware including wrist F/T sensors, Paxini tactile sensors, GelSight Mini, and XELA uSkin arrays. Our data collection infrastructure records synchronized force/tactile data alongside vision for training contact-rich policies. Pilot data collection starts at $2,500. Browse the hardware catalog for current availability and integration support.
