Bias is found wherever measurements are made and wherever people apply judgments. It’s an intrinsic part of the data landscape.
Bias can be informed by experiences, but it’s also a reflection of the imperfect way in which data—all data—are gathered and summarized. You might want to know the temperature, but all you can get is a number from one thermometer in one particular place. The measurement it provides might not be representative of the actual temperature due to a miscalibrated sensor, say, or where the thermometer is placed.
Bias is everywhere. Yet, in many cases, data can still provide valuable insights. So how to avoid throwing out the data with the bias? What’s needed is a systematic way to estimate and remove that bias.
U.S. employment estimates
ADP gathers payroll data for millions of U.S. workers, which ADP Research analyzes every month to estimate how U.S. employment is changing.
In the United States, there are three major employment estimates. Each month, the Bureau of Labor Statistics conducts its Current Employment Statistics survey of approximately 121,000 employers, and ADP leverages employment data from more than 500,000 clients to create the National Employment Report.
Every three months, the Bureau of Labor Statistics also produces the Quarterly Census of Employment and Wages, a measure of employment and wages reported by employers responsible for more than 95 percent of U.S. jobs.
The QCEW is primarily sourced from the administrative data that private sector companies and state and local governments submit quarterly to state unemployment insurance programs. Historically, QCEW has received roughly 40 percent of its data from payroll providers like ADP, who collect this data on behalf of their clients.
The QCEW is considered the gold standard of U.S. employment measurement and has the smallest bias of the three estimates, but it's less timely than the BLS and ADP monthly reports.
An ideal employment estimate would combine the timeliness and granularity of the ADP National Employment Report with the comprehensiveness of the federal government's QCEW. We can't achieve that lofty goal, a week-by-week census of every U.S. worker, but we can get closer by combining the strengths of these two data sources. The question is how to do that.
National Employment Report bias
Bias is an effect found in many data sources where the measured value is consistently higher or lower than the actual value; that is, the measurements for whatever reason are not perfectly representative of the process being measured.
Bias is a significant consideration for the ADP National Employment Report. The BLS Current Employment Statistics survey can target particular respondents to maximize how well they represent important slices of U.S. employment. This can minimize bias when survey participation is high.
ADP data comes from businesses that have chosen to outsource payroll and human resources activities. As such, these employers might not perfectly represent overall employment for their industry or geography.
The QCEW is comprehensive enough, at 95 percent employment coverage, that the impact from bias is minimized.
To see the difference between ADP client employment growth and overall U.S. employment growth, let’s look at an industry included in the National Employment Report: manufacturing.
Manufacturing employment
Here’s a peek behind the curtain. This comparison plot shows raw data tracking the change in manufacturing employment from 2010 to the end of 2023. It includes only employers that are ADP clients and ignores the effects of any change in ADP market share. That is, if a manufacturer with 100 employees becomes an ADP client, that event doesn’t count as manufacturing employment growth. While the company is with ADP, we track their hiring and firing week to week. If the company leaves, that also doesn’t affect the change reported in manufacturing employment.
While the shapes of the QCEW and ADP measures are remarkably similar, the employment growth experienced by ADP clients far exceeds the overall market growth reported by the QCEW.
If we reported this unadjusted ADP data, there would be consistent disagreement between our National Employment Report and other measures of U.S. hiring.
Bias
The situation described above is an example of sample bias.
Imagine the goal is to estimate the average height of students at a college. An approach like the QCEW's would obtain measurements from at least 95 percent of all students. This would result in a good estimate of student height, but it would take a lot of time and effort, which is why the QCEW employment estimates are reported with a six-month lag.
Another approach would be to select a random fraction of the student population, making sure it's a balanced mix of demographics. The BLS Current Employment Statistics survey takes this approach. In the height analogy, instead of taking standardized measurements, it would rely on students voluntarily completing a self-reported height survey.
A third approach would be to rigorously measure actual student height, but include only students who, for example, visit the campus gym. Instead of relying on surveys, this approach would go out and obtain actual measurements. But not all students choose to go to the gym. Gym-goers might be representative of the overall student body, but they might not be. Maybe members of the basketball team make up a disproportionate share of campus gym-goers. That would result in bias. Our gym example broadly illustrates the ADP National Employment Report estimate. It's a valuable estimate based on actual payroll data, not surveys. But it might exhibit a bias.
In fact, there is a bias in the ADP sample, one that we correct for.
Additive bias
Additive bias occurs when the same systematic error, or offset, appears at every measurement. The correction is to add or subtract the average gap between the two data series across all measurements; it's appropriate when, visually, the gap between the actual and biased measurements stays roughly constant.
This clearly isn’t the case in the ADP data, but let’s try it anyway.
Even though the additive bias adjustment applied in the plot below failed to align the two series, it did improve their level of agreement. The maximum distance between the QCEW and ADP series has shrunk, because the disagreement is now split between the beginning and end of the series. The unadjusted series had zero error at the beginning and significant error accumulated by the end. After applying the additive bias adjustment, we see half that error at the beginning, zero error in the middle, and half the error at the end.
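To make the mechanics concrete, here's a minimal sketch of an additive correction in Python. The array names and numbers are made up for illustration; the only assumption is that the two series are already aligned on the same dates.

```python
import numpy as np

# Hypothetical, aligned series of cumulative employment change.
# qcew[i] and adp_raw[i] refer to the same date.
qcew = np.array([0.0, 1.2, 2.5, 3.1, 4.0])     # reference series
adp_raw = np.array([0.0, 1.8, 3.6, 4.9, 6.2])  # biased series

# Additive bias: the average gap between the two series.
additive_bias = np.mean(adp_raw - qcew)

# Correct by subtracting that constant from every measurement.
adp_adjusted = adp_raw - additive_bias

print(f"estimated additive bias: {additive_bias:.2f}")
print("adjusted series:", np.round(adp_adjusted, 2))
```

Because the whole series shifts by the same constant, the remaining disagreement ends up split the way described above: roughly half the error at the beginning, none in the middle, and half at the end.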
Multiplicative bias
A multiplicative bias correction is used when the ratio between corresponding measurements in the two series is constant. Multiplying all measurements by a constant corrects for this bias, where the constant is the average ratio between the two series. This bias correction is typically appropriate when the gap between the actual and biased measurements widens as measured values go up and narrows as measured values go down.
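As a sketch of how that constant might be computed, again with hypothetical array names and made-up numbers:

```python
import numpy as np

qcew = np.array([1.0, 1.2, 1.5, 1.9, 2.4])     # reference series
adp_raw = np.array([1.6, 1.9, 2.4, 3.0, 3.8])  # biased series

# The correction constant: the average ratio of the reference series
# to the biased series. (Points where either series is near zero
# would need to be excluded or handled separately.)
scale = np.mean(qcew / adp_raw)

# Correct by multiplying every biased measurement by that constant.
adp_adjusted = adp_raw * scale
```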
While our multiplicative bias correction didn’t narrow the gap between the ADP and QCEW manufacturing series, it did align the growth between the two. Now it looks like the gap between the QCEW and the bias-adjusted ADP employment report is constant, or close to it.
This exercise suggests that an additive bias adjustment applied at this point could create a well-adjusted series.
Estimating both multiplicative and additive bias adjustments is the same as estimating a linear model, which brings us to the linear model bias adjustment.
Linear model bias
A linear model is a model of the form y = mx + b. In this case, x is our unadjusted ADP series and y will be the resulting adjusted series. The parameters m and b are the multiplicative and additive bias adjustments: m scales the unadjusted series, and b shifts it.
If we can estimate a good m and b that removes the bias from our x series, then the adjusted y series we produce will have a much smaller bias.
In practice, it’s almost always best to use a linear model to perform bias adjustments. It delivers the advantages of both multiplicative and additive adjustments, along with easy access to procedures for fitting the parameters. If the slope parameter m is near 1, the bias is nearly purely additive. If the offset parameter b is near 0, the bias is nearly purely multiplicative.
Fitting for ADP employment growth, we get the following:
While this adjustment does a good job of removing the bias, challenges remain. The linear fit between these two series will depend on the start and end times of the series. For example, if the period between 2010 and 2019 is used to estimate bias, the plot looks different:
Note how the bias adjustment for the 2010-2019 period looks a little better, but the bias adjustment for the 2020-2024 period looks substantially worse.
One way to address this problem is to build the bias correction from multiple linear models. For example, we could fit one bias model for each calendar year. This approach allows a tighter fit to the data, and because every model except the most recent has fixed start and end dates, only the last model's bias parameters would change with time as new data are collected for the latest year.
Piecewise linear bias
At the cost of increased complexity, a piecewise linear bias adjustment will have a better fit and better stability for all piecewise sections except the last one.
Remember, a linear bias fits a single line from the start of a series to the end. For any series with active data collection, the end of the series keeps moving forward as new data are recorded, slowly changing the estimates for the parameters m and b.
To account for this parameter drift, we can split a time series into two parts and estimate two linear models, one for, say, the first year and another for the rest of the years. The linear model for the first year now has fixed start and end points. Whether we estimate the parameters of that linear model today or a year from now, they'll be exactly the same. The rest of the series still has a moving endpoint, so let's keep breaking it up. If we break the time series into some number n of smaller series, we get n linear models. All but the last will have fixed start and end points; only the last one's bias parameter estimates will drift over time. Together, these n linear models are a piecewise linear model.
For the National Employment Report, we use a piecewise linear bias adjustment, where each section is one year long. Compared with a single linear model, this increases the bias correction from two parameters (the additive and multiplicative bias) to two parameters per year: currently 32 parameters covering the period from 2010 to the present.
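Our production pipeline is more involved than this, but a simplified sketch of a per-calendar-year piecewise fit might look like the following, assuming two pandas Series aligned on the same dates (the names are hypothetical):

```python
import numpy as np
import pandas as pd

def piecewise_linear_bias(adp_raw: pd.Series, reference: pd.Series) -> pd.Series:
    """Fit one linear bias model (m, b) per calendar year and return the
    bias-adjusted series. Both inputs are assumed to share a DatetimeIndex
    and be aligned on the same dates, with multiple observations per year."""
    adjusted = []
    for _, x in adp_raw.groupby(adp_raw.index.year):
        y = reference.loc[x.index]                    # reference values for that year
        m, b = np.polyfit(x.values, y.values, deg=1)  # per-year least-squares fit
        adjusted.append(m * x + b)                    # apply that year's correction
    return pd.concat(adjusted).sort_index()

# Usage with hypothetical weekly series:
# adjusted = piecewise_linear_bias(adp_weekly, qcew_weekly)
```

Only the model for the current, still-growing year sees its parameters move as new data arrive; every earlier year's m and b are locked in.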
There’s more to bias than just linear adjustments of the measurements. Data transformations can make bias more stationary across time. In systems with multiple measurements, the choices of what granularity to correct for can affect the quality of the resulting corrected data.
For example, in the National Employment Report, we perform bias corrections on weekly instead of monthly employment data, and do so at a finer granularity than the reporting granularity.
When it comes to manufacturing employment, our piecewise, weekly bias correction yields a tight fit between our employment report and QCEW data.
The takeaway, or how I learned to stop worrying and love the bias
Bias is everywhere, it’s how that bias is measured and mitigated that matters when estimating the state of the world.
Bias is a natural consequence of limited resources and of a limited ability to characterize exactly what needs to be measured. It always has been, and will continue to be, an important consideration in all data.
Appropriate and responsible methods exist to estimate and remove these biases. The linear model bias correction outlined here is one such responsible method, at least for cases where a largely bias-free reference series, such as the federal government’s QCEW, is available.
Tim Decker is the lead architect of the ADP National Employment Report.