If you have a set of data points that look like they’re increasing rapidly, it might be useful to fit them with a smooth, exponentially increasing line in order to describe the general shape of the data. The line you need to fit to achieve this shape is described by an exponential function, that is, any function of the form \(y = AB^x + C\) or \(y = ae^{bx} + c\) (these two are mathematically equivalent because \(AB^x = Ae^{x\ln(B)}\)). The important thing to realise is that an exponential function can be fully defined with three constants. We will use the second of these formulations, which can be written in Python as `a * np.exp(b * x) + c` where `exp()` is the exponential function \(e^x\) from the Numpy package (renamed `np` in our examples).

For this tutorial, let’s create some fake data to use as an example. This should be a set of points that increase exponentially (or else our attempts to fit an exponential curve to them won’t work well!) with some random noise thrown in to mimic real-world data:
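A minimal sketch of how such data could be generated. The constants (`a = 4.5`, `b = 0.5`, `c = 0`), the noise level and the seed are assumptions chosen for illustration, not values taken from the original example.

```python
import numpy as np

# Fix the seed so the same "random" noise is produced on every run (arbitrary value)
np.random.seed(0)

# Assumed constants for the underlying curve y = a * exp(b * x) + c
a, b, c = 4.5, 0.5, 0.0

# x-values 0, 1, ..., 9
x = np.arange(10)
# Noise-free exponential values
y_true = a * np.exp(b * x) + c
# Add Gaussian noise (a standard deviation of 1 is an assumption)
y = y_true + np.random.normal(scale=1.0, size=len(x))
```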
The random noise is being added with the `random.normal()` function from Numpy, which draws random samples from a normal (Gaussian) distribution. Let’s take a look at what this example data looks like on a scatter plot:
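One way such a scatter plot might be drawn with Matplotlib is sketched below. The data constants, labels and title are illustrative assumptions, and the non-interactive `Agg` backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Recreate illustrative example data (constants, noise level and seed are assumptions)
np.random.seed(0)
x = np.arange(10)
y = 4.5 * np.exp(0.5 * x) + np.random.normal(scale=1.0, size=len(x))

# Scatter plot of the raw data points
fig, ax = plt.subplots()
ax.scatter(x, y, c="k", label="Raw data")
ax.set_title("Example Data")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
```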
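The `polyfit()` command from Numpy is used to fit a polynomial function to data. This might seem a little strange: why are we trying to fit a polynomial function to the data when we want to fit an exponential function? The answer is that we can convert an exponential function into a polynomial one using the fact that \(y = ae^{bx} \implies \ln(y) = \ln(a) + bx\), which follows from taking the natural logarithm of both sides (note that this only works when the constant offset is zero, i.e. for \(y = ae^{bx}\) rather than \(y = ae^{bx} + c\)). This creates a linear equation \(f(x) = mx + c\) where \(f(x) = \ln(y)\), the gradient is \(m = b\) and the intercept is \(c = \ln(a)\) (this intercept is not the same thing as the offset constant in the exponential function).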
So `polyfit()` can be used to fit \(\ln(y)\) against \(x\):
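A minimal sketch of this step, using illustrative data with \(c = 0\) (the constants, noise level and seed are assumptions). Note that every y-value must be positive for the logarithm to be defined.

```python
import numpy as np

# Illustrative example data with c = 0 (assumed constants and noise level)
np.random.seed(0)
x = np.arange(10)
y = 4.5 * np.exp(0.5 * x) + np.random.normal(scale=1.0, size=len(x))

# Fit a degree-1 polynomial (a straight line) to ln(y) against x.
# polyfit returns coefficients from highest degree to lowest: [gradient, intercept]
p = np.polyfit(x, np.log(y), 1)
gradient, intercept = p
print(gradient, intercept)
```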
This polynomial can now be converted back into an exponential:
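The conversion follows from \(\ln(y) = \ln(a) + bx\): the intercept of the fitted line is \(\ln(a)\) and its gradient is \(b\). A sketch under the same assumed example data:

```python
import numpy as np

# Illustrative example data with c = 0 (assumed constants and noise level)
np.random.seed(0)
x = np.arange(10)
y = 4.5 * np.exp(0.5 * x) + np.random.normal(scale=1.0, size=len(x))

# Straight-line fit of ln(y) against x
gradient, intercept = np.polyfit(x, np.log(y), 1)

# Convert the line back into an exponential: a = e^intercept, b = gradient
a_fit = np.exp(intercept)
b_fit = gradient

# Evaluate the fitted curve on a fine grid, ready for plotting
x_fitted = np.linspace(np.min(x), np.max(x), 100)
y_fitted = a_fit * np.exp(b_fit * x_fitted)
```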
Let’s take a look at the fit:
This method has the disadvantage of over-emphasising small values: points with large values that lie relatively close to the straight line of best fit created by `polyfit()` end up much further from the curve once that line is converted back into an exponential. This is because the transformation stretches large values much more than it does small ones, so the distance to the fitted curve grows more for large values than for small ones. This can be mitigated by adding a ‘weight’ proportional to \(y\), which tells `polyfit()` to lend more importance to data points with a large y-value:
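A sketch of the weighted fit using the `w` argument of `polyfit()`. Because `polyfit()` applies its weights to the unsquared residuals, passing `w=np.sqrt(y)` makes each squared residual count in proportion to \(y\); treating this as the intended weighting is an assumption, as are the example data.

```python
import numpy as np

# Illustrative example data with c = 0 (assumed constants and noise level)
np.random.seed(0)
x = np.arange(10)
y = 4.5 * np.exp(0.5 * x) + np.random.normal(scale=1.0, size=len(x))

# Weighted straight-line fit of ln(y) against x.
# w multiplies the unsquared residuals, so sqrt(y) weights the squared residuals by y
gradient, intercept = np.polyfit(x, np.log(y), 1, w=np.sqrt(y))

# Convert back into an exponential y = a * exp(b * x)
a_fit = np.exp(intercept)
b_fit = gradient
y_fitted = a_fit * np.exp(b_fit * x)
```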
Using a weight has improved the fit.

From the Scipy package we can get the `curve_fit()` function. This is more general than `polyfit()` (we can fit any type of function we like, exponential or not) but it’s more complicated in that we sometimes need to provide an initial guess as to what the constants could be in order for it to work.

Let’s use our original example data (with \(c \neq 0\)):
Now let’s fit the function \(y = ae^{bx} + c\). This is done by defining it as a lambda function (i.e. as an object rather than as a command) of a dummy variable \(t\) and using the `curve_fit()` function to fit this object to the x- and y-data. Note that the `curve_fit()` function needs to be imported from the `scipy.optimize` sub-package:
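A minimal sketch of this call, with no initial guess supplied. The example data (including the offset of 50, the noise level and the seed) are assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative example data with a non-zero offset c (assumed constants and noise level)
np.random.seed(0)
x = np.arange(10)
y = 4.5 * np.exp(0.5 * x) + 50 + np.random.normal(scale=1.0, size=len(x))

# Fit y = a * exp(b * t) + c; the lambda's first argument is the independent variable
popt, pcov = curve_fit(lambda t, a, b, c: a * np.exp(b * t) + c, x, y)

# popt holds the optimised a, b and c; pcov is their estimated covariance matrix
print(popt)
```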
The first output, conventionally named `popt`, is a list of the optimised values for the parameters which, in our case, are the constants \(a\), \(b\) and \(c\):
Let’s see what this looks like:
This looks really good, and we didn’t need to provide an initial guess! This is because the example data we are using is close enough to exponential in nature that the optimisation algorithm behind `curve_fit()` could fit a curve without accidentally choosing the wrong local minimum. This won’t always be the case; when it isn’t, an initial guess for the constants can be provided through the `p0` argument of `curve_fit()`.

Plotting all three methods against one another using the same example data (\(c = 0\)) for each shows that the `curve_fit()` method gives the best approximation of the true underlying exponential behaviour.

We can use the fitted curve to estimate what our data would be for other values of \(x\) that are not in our raw dataset: what would the value be at \(x = 11\) (which is outside our domain and thus requires us to forecast into the future) or \(x = 8.5\) (which is inside our domain and thus requires us to ‘fill in a gap’ in our data)? To answer these questions, we simply plug these x-values as numbers into the equation of the fitted curve.
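Both ideas can be sketched together: supplying an initial guess via `p0`, then evaluating the fitted curve at new x-values. The helper name `exponential`, the example data and the guess `(4, 0.4, 40)` are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def exponential(t, a, b, c):
    """Hypothetical helper: the model y = a * exp(b * t) + c."""
    return a * np.exp(b * t) + c


# Illustrative example data (assumed constants, noise level and seed)
np.random.seed(0)
x = np.arange(10)
y = exponential(x, 4.5, 0.5, 50) + np.random.normal(scale=1.0, size=len(x))

# Rough initial guess for (a, b, c), passed via the p0 argument
p0 = (4, 0.4, 40)
popt, pcov = curve_fit(exponential, x, y, p0=p0)
a_fit, b_fit, c_fit = popt

# Plug new x-values into the fitted equation
y_at_11 = exponential(11, *popt)    # extrapolation beyond the data
y_at_8_5 = exponential(8.5, *popt)  # interpolation within the data
print(y_at_11, y_at_8_5)
```

Extrapolated values such as the one at \(x = 11\) are more sensitive to small errors in the fitted constants than interpolated ones, so they deserve more caution.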