Who, what & why?

Few people who use or follow wearable tracking tech will have failed to hear that Fitbit is facing multiple lawsuits in the US alleging a lack of accuracy in the heart rate (HR) monitoring function of devices such as the Fitbit Surge™ and Charge HR™. Whilst most people would agree that activity trackers can help quantify and motivate more physical activity in the general population, you would expect that paying extra for a heart rate sensor gets you something that is reasonably accurate and adds value.

However, according to researchers Edward Jo, PhD and Brett A. Dolezal, PhD of California State Polytechnic University, Pomona, this is apparently not the case. They were commissioned as expert witnesses to test Fitbit devices against a gold standard ECG criterion measure and report on their findings.

What did they do?

They conducted studies with 43 healthy young adults, each of whom wore a Fitbit Surge™ on one wrist and a Fitbit Charge HR™ on the other, as well as a Zephyr Bioharness™, which provided ECG-accurate R-R (individual heartbeat) intervals.

The subjects performed a range of resting, low and higher intensity activities, including running, stair climbing and plyometrics, in both outdoor and lab conditions whilst being monitored simultaneously by all 3 devices.

In total they took over 250,000 readings, which they analysed using four different techniques:

  1. Correlation.

This is a very commonly used statistic that indicates whether two variables track in the same direction i.e. if they go up or down together. However, it doesn’t tell you how close together they are.

  2. Paired sample T-Test.

This tells you how well the averages of two sets of data compare. Again it needs the caveat that this doesn’t tell you how far apart the actual scores are, only how well their averages compare.

  3. Bland-Altman plot.

This technique tells you about the degree of bias between two sets of data, both from an average and an individual data point of view. In this respect it is more revealing than either correlation or the sample T-Test.

  4. Absolute differences.

For example, if one device recorded 125 bpm whilst the other recorded 130 bpm, the absolute difference would be 5 bpm.
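To make the difference between these four statistics concrete, here is a minimal Python sketch. The readings are invented for illustration and are not data from the study, and the function names are our own. It assumes the usual textbook definitions: Pearson correlation, a paired t statistic on the differences, and Bland-Altman limits of agreement taken as the mean bias ± 1.96 standard deviations of the differences.

```python
import math
from statistics import mean, stdev

# Hypothetical, time-aligned readings in bpm (illustrative only, not study data).
ecg = [60, 75, 90, 110, 125, 140, 155, 170]
tracker = [66, 70, 97, 104, 125, 128, 140, 150]


def differences(reference, device):
    """Per-reading difference, device minus reference (negative = under-reporting)."""
    return [d - r for r, d in zip(reference, device)]


def pearson_r(x, y):
    """Pearson correlation: do the two series rise and fall together?"""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def paired_t(reference, device):
    """Paired-sample t statistic: is the average difference distinguishable from zero?"""
    d = differences(reference, device)
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def bland_altman(reference, device):
    """Mean bias plus lower and upper 95% limits of agreement."""
    d = differences(reference, device)
    bias, sd = mean(d), stdev(d)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def mean_abs_diff(reference, device):
    """Average size of the error, ignoring its direction."""
    return mean(abs(v) for v in differences(reference, device))


bias, lower, upper = bland_altman(ecg, tracker)
print("correlation:", round(pearson_r(ecg, tracker), 3))
print("paired t statistic:", round(paired_t(ecg, tracker), 2))
print("bias and limits of agreement:", round(bias, 2), round(lower, 2), round(upper, 2))
print("mean absolute difference:", round(mean_abs_diff(ecg, tracker), 2))
```

With these made-up numbers the correlation comes out high even though individual readings are off by several bpm, which is exactly why correlation on its own is not enough to validate a device.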

Agreement between two heart rate monitoring devices is considered acceptable when the standard error of estimate is no more than 5 bpm, the correlation is greater than 0.9 and the mean bias is within ±3 bpm.

In spite of a reasonable 0.85 overall correlation between the ECG reference and the Charge HR, the chart below shows how much the readings could differ at different exercise intensities (an ideal match between the two would be a straight line with no cloud of data points above and below):

Pulse monitors 1

At a moderate exercise intensity of 140 bpm as measured by the ECG, the Fitbit indicated anywhere between 80 and 180 bpm. Correlation at this intensity and higher was only 0.48, with a mean absolute difference of 15.5 bpm and a mean bias of -12.5 bpm, i.e. the Charge HR™ was, on average, under-reporting the actual heart rate.

At rest and at low intensity levels, the mean bias was reduced to almost zero, but the mean absolute difference was still 8.9 bpm. If we look at the chart again, a healthy resting HR of 60 bpm could be reported as anywhere between 55 and 80 bpm.

Results for the Fitbit Surge™ were worse, with a mean absolute difference of 22.8 bpm and a bias of -20.8 bpm at exercise intensities, and 8.2 bpm and -1.9 bpm respectively at rest and low intensity (below 132 bpm).

The mean bias and wide limits of agreement (LoA) for the combined results are illustrated in the Bland-Altman plot below:

Bland-Altman plot Fitbit HR

Since both devices use Fitbit’s PurePulse technology, it’s surprising to see how much they differed from each other on time-synchronized data. For example, when one device read 150 bpm, the other model, worn on the opposite wrist of the same user at the same time, could read anywhere from 80 to 180 bpm.

Charge HR chart

Overall, both devices failed to meet the validation criteria:

  1. Standard error of estimate was 17.2 bpm vs requirement of 5 bpm
  2. Correlation of 0.88 vs requirement of 0.9
  3. Mean bias of -8.9 bpm vs requirement of ±3 bpm
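As a quick illustration of how the reported combined figures stack up against the three criteria, here is a small Python sketch. The thresholds and results are the ones quoted above; the function name and the choice to treat the 5 bpm and ±3 bpm limits as inclusive are our own assumptions.

```python
# Acceptance thresholds as quoted in this article.
MAX_SEE_BPM = 5.0        # standard error of estimate
MIN_CORRELATION = 0.9    # correlation must exceed this value
MAX_ABS_BIAS_BPM = 3.0   # mean bias must lie within +/- this value


def check_agreement(see_bpm, correlation, mean_bias_bpm):
    """Return a pass/fail flag for each criterion (hypothetical helper)."""
    return {
        "standard_error": see_bpm <= MAX_SEE_BPM,
        "correlation": correlation > MIN_CORRELATION,
        "mean_bias": abs(mean_bias_bpm) <= MAX_ABS_BIAS_BPM,
    }


# Combined results reported for the two Fitbit devices.
reported = check_agreement(see_bpm=17.2, correlation=0.88, mean_bias_bpm=-8.9)
for criterion, passed in reported.items():
    print(criterion, "pass" if passed else "fail")
```

All three checks fail for the reported values, with the standard error of estimate missing its limit by the widest margin.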

What does it mean?

The authors observed that the bias towards under-reporting exercise heart rates was not systematic; rather, the data were extensively dispersed, and the wide limits of agreement created this effect. They also noted that the Fitbit devices failed to produce a reading on a number of occasions.

They concluded that the PurePulse™ technology used in these devices does not accurately record or report heart rate, and is more unreliable at higher heart rates.

ithlete’s perspective

With so many people now using this kind of activity tracker, and Fitbit being the leading brand, it’s certainly disappointing that pulse rate measurement is not a good deal more accurate. If you are exercising to heart rate zones, it is important to know your actual current heart rate to an accuracy of better than ±5 bpm, and all the more so if you are on a program set by a physician or cardiologist. Under-reporting is especially concerning, as you may be exercising at an intensity substantially higher than that prescribed, which is potentially dangerous.
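To put rough numbers on that concern, the sketch below applies the mean exercise-intensity biases quoted earlier to a displayed reading of 140 bpm. It assumes bias means tracker reading minus ECG reading, and it treats the average bias as if it applied to a single reading, which is a simplification given how widely individual readings were scattered. The function name is our own.

```python
# Mean bias (tracker minus ECG) at exercise intensities, as quoted in this article.
MEAN_BIAS_BPM = {"Charge HR": -12.5, "Surge": -20.8}


def estimated_actual_hr(displayed_bpm, mean_bias_bpm):
    """Rough average estimate of the true heart rate behind a displayed value."""
    # A negative bias means the tracker reads low, so the true rate is higher.
    return displayed_bpm - mean_bias_bpm


for device, bias in MEAN_BIAS_BPM.items():
    print(device, "shows 140 bpm, estimated actual:", estimated_actual_hr(140, bias))
```

On these averages, someone holding a displayed 140 bpm could really be working at more than 150 bpm, well outside a ±5 bpm tolerance.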

These devices also seem to be a very long way from the better than 1% precision needed for heart rate variability (HRV) measurement. By way of comparison, the validation study performed by the University of Sydney on the ithlete Finger Sensor found an almost perfect correlation of better than 0.99 with the ECG reference. They also found a mean error of 0.05% at the resting heart rates for which this sensor was intended.

We hope that this report will stimulate Fitbit and others to renew efforts to bring accurate devices to consumers which can be used during exercise, at rest and eventually for HRV.

References:

  1. Validation of the Fitbit® Surge™ and Charge HR™ Fitness Trackers. Authors: Edward Jo, PhD and Brett A. Dolezal, PhD
  2. Heathers, J.A.J., Smartphone-enabled pulse rate variability: An alternative methodology for the collection of heart rate variability in psychophysio…, International Journal of Psychophysiology (2013)