I am trying to fit data (muon decays) to find the decay lifetime. The detection technique introduces a background rate of false decays distributed uniformly across the data range (0 to 20 µsec). In other words, the data should theoretically fit the function
\begin{align} N(\Delta T) = N_0 \exp(-\Delta T/\tau) + N_b, \end{align}
where $N_b$ is the background noise in the data and $N_0$ and $\tau$ are fitting parameters (with the best value for $\tau$ being the only thing I really care about).
I have noticed that fitting with scipy.optimize.curve_fit, for example, consistently underestimates $\tau$, even when I generate simulated data with the values of $\tau$, $N_0$, and $N_b$ completely under my control.
For example, this Python script generates one simulation:
import numpy as np

# generate random muon decay times (true lifetime 2.2 µs)
muon_decay_times = np.random.exponential(2.2, 1000)
# generate uniformly distributed background "decay" times
background_decay_times = np.random.uniform(0.0, 20.0, 100)
# combine the two arrays
decay_times = np.concatenate((muon_decay_times, background_decay_times))
# bin data for fitting
data, edges = np.histogram(decay_times, range=(0, 20), bins=FITTING_BINS)
centers = (edges[:-1] + edges[1:]) / 2
which is then fitted using scipy.optimize.curve_fit:
from scipy.optimize import curve_fit

# fitting function: exponential decay plus a flat background
def decayWithBackground(ΔT, N_0, τ, N_b):
    return N_0 * np.exp(-ΔT / τ) + N_b

# do fit (default unweighted least squares)
popt, pcov = curve_fit(decayWithBackground, centers, data)
print(popt)
The problem is that running many simulations and fitting as shown consistently underestimates $\tau$, giving ~2.18 instead of 2.2.
I would like to understand why this happens. My signal-to-background ratio in the example above is 10. I can work to improve that ratio, and presumably get a better estimate for $\tau$. But what I want to know is how to "unbias" my fit for $\tau$, so that over a large number of simulations my mean estimate of $\tau$ is unbiased.
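Concretely, the repeated-simulation check I describe looks roughly like this (this is a sketch, not my exact script; the bin count, number of runs, and initial guess are illustrative choices):

```python
import numpy as np
from scipy.optimize import curve_fit

def decayWithBackground(ΔT, N_0, τ, N_b):
    return N_0 * np.exp(-ΔT / τ) + N_b

rng = np.random.default_rng(0)
tau_estimates = []
for _ in range(500):
    # one simulated run: 1000 true decays (τ = 2.2) plus 100 uniform background events
    times = np.concatenate((rng.exponential(2.2, 1000),
                            rng.uniform(0.0, 20.0, 100)))
    counts, edges = np.histogram(times, range=(0, 20), bins=200)
    centers = (edges[:-1] + edges[1:]) / 2
    try:
        popt, _ = curve_fit(decayWithBackground, centers, counts,
                            p0=(100.0, 2.2, 1.0))
        tau_estimates.append(popt[1])
    except RuntimeError:
        pass  # skip the rare run where the fit does not converge

print(np.mean(tau_estimates))  # mean τ estimate across simulations
```

It is the mean printed at the end that comes out below the true value of 2.2.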
I suspect that what may be happening is that the residuals being minimized are the vertical distances to the curve rather than the distances to the nearest points on the curve. I've considered estimating and removing the background first, so that I can take the log of the counts and fit a straight line, but then the question becomes how to estimate $N_b$. So far, I've tried getting a rough estimate of $\tau$ and then fitting $N_b$ using only the points where $\Delta T > 5\tau$. The choice of 5 was arbitrary, and this method actually made things worse.
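For reference, a minimal sketch of that two-step idea as I tried it (tail estimate of $N_b$, then a log-linear fit on the background-subtracted counts; the 100-bin histogram, the fixed rough $\tau$, and the fit window below 10 µs are all illustrative choices, not my exact code):

```python
import numpy as np

rng = np.random.default_rng(1)
times = np.concatenate((rng.exponential(2.2, 1000),
                        rng.uniform(0.0, 20.0, 100)))
counts, edges = np.histogram(times, range=(0, 20), bins=100)
centers = (edges[:-1] + edges[1:]) / 2

tau_rough = 2.2  # pretend this came from a first-pass curve_fit
# estimate the flat background from the bins beyond 5 * tau_rough
nb_est = counts[centers > 5 * tau_rough].mean()

signal = counts - nb_est
# keep only bins where subtraction leaves a positive count, in the early region
mask = (signal > 0) & (centers < 10)
# straight-line fit to log(signal): the slope is -1/τ
slope, intercept = np.polyfit(centers[mask], np.log(signal[mask]), 1)
tau_est = -1.0 / slope
print(tau_est)
```

Note that dropping bins where the subtracted count is non-positive already discards information, which may be part of why this made things worse for me.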
If this type of problem is common, then perhaps just some references and proper terminology would be enough for me to find what I'm looking for, which is both a specific solution to this problem as well as a more general approach to similar problems.