1. INTRODUCTION
Smoothing “bumpy” or “volatile” data is not required in many actuarial analyses. However, it is often needed when rating factors vary by policy limits, amount of insurance, or other characteristics that fall into “buckets” along a line, or when a random sample is used to construct a severity distribution. Typically, the average frequency, severity, pure premium, percentage of sampled values, etc. of all the risks that fall into each bucket is the value assigned to the bucket, and for smoothing purposes the index assigned to each bucket is the midpoint of the range of the amount of insurance, etc. that the bucket covers.
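As a brief illustration of that bucketing step, the sketch below (in Python, with entirely hypothetical amounts, bucket widths, and distributions, since the paper itself works in spreadsheets) assigns each bucket the average of the records that fall into it and uses the bucket midpoint as its index.

```python
import numpy as np

rng = np.random.default_rng(0)
amount_of_insurance = rng.uniform(0, 100_000, size=5_000)   # hypothetical rating characteristic
severity = rng.lognormal(mean=8.0, sigma=1.2, size=5_000)   # hypothetical claim sizes

edges = np.arange(0, 110_000, 10_000)                       # $10,000-wide buckets (illustrative)
midpoints = (edges[:-1] + edges[1:]) / 2                    # index assigned to each bucket

bucket_id = np.digitize(amount_of_insurance, edges) - 1
bucket_mean = np.array([severity[bucket_id == b].mean() for b in range(len(midpoints))])
bucket_count = np.array([(bucket_id == b).sum() for b in range(len(midpoints))])
# bucket_mean plotted against midpoints is the "bumpy" curve to be smoothed;
# bucket_count drives how much process variance (and credibility) each bucket has.
```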
Smoothing is especially relevant when the data in some or all of the buckets do not have adequate credibility. This paper begins with a credibility-based approach to smoothing that reflects the credibility of individual buckets while still recognizing the tendency of a curve to move in a continuous way and at a continuous rate. Then, it expands the method to provide a broader tool kit for more challenging smoothing situations.
2. THE MODEL
This approach begins with a model that reflects certain assumptions. The situation, in more precise terms, is:
- The data in each of the buckets has what might be termed process variance around the true, but unknown, values along an underlying curve. The data creates a statistical approximation to that curve which is more accurate at the points/buckets[1] where there is more data (less process variance) and less accurate elsewhere.
- The underlying curve is assumed to be fairly smooth, so the point-to-point changes along the underlying curve are encouraged, but not forced, to follow the same slope as one moves along the curve.
- On the other hand, most curves do not fall perfectly along a line, so the point-to-point changes along the curve should have random aspects but still retain a continuous-looking shape.
- Further, few actual curves seen in practice are generated from straight lines or linear relationships, so the trend must be allowed to change as one moves from point to point along the curve.
- However, one would logically assume that the process errors, the general trend, and the point-to-point trends between adjacent points are all random.
Then, the next step is to develop a model that may be used to estimate the curve by smoothing the data.
3. THE SIMPLER MODEL: CONSTANT UNDERLYING TREND
In this model, the process error variances vary from point to point but are either known or estimated with reasonable accuracy. Under best estimate credibility (see Boor 1992) that process error would be part of a multiplicative inverse of credibility. So, points with high process error should receive less weight in deriving the smoothed curve. Then, to illustrate the situation, there would be observed data points $x_1, x_2, \ldots, x_n$ that differ from the unknown true values $y_1, y_2, \ldots, y_n$ by process errors with variances of $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$, respectively. If one defines each “change” $\Delta_i = y_{i+1} - y_i$ as the change from point to point, one might view the changes as driven by trend. That is especially so in this case, where the data is evaluated at the points 1, 2, …, n, or in other situations where the indices (the $i$’s) are equally spaced. The expected trend is assumed to be constant, at some slope $\mu$. However, since one would not expect the underlying values to lie perfectly on a line, one must allow the actual (and also unknown) changes $\Delta_i$ to vary from period to period around $\mu$ with some variance $\delta^2$.

Table 1 shows an example of the consequent smoothing process. The assumed constant overall trend (not yet the ghost trend) of, in this case, 0.75 is specified in column (7). The process variances $\sigma_i^2$ are specified, along with the observed values $x_i$, $\mu = 0.75$, and $\delta^2 = 0.80$. The actual data values input to the process and estimates of their variances around the “true” underlying expected values are included in columns (2) and (3). The fitted curve, the $y_i$’s that the process finds, is in column (4). The normalized fit error (the squared difference between each raw data value and the corresponding $y_i$, divided by the variance associated with that raw data point) is computed in column (5). The estimate of the “local” trend (the change between adjacent fitted points) is computed in column (6). The constant overall governing trend (in this case, 0.75) is posted in column (7). Lastly, how far the “local” trend has drifted away from that constant trend is computed by taking the squared difference between the local trend and the selected overall trend value, then dividing the result by the common preselected $\delta^2$. That result is shown in column (8).
Then the goal is to find the $y_i$’s that simultaneously fit the data well and yet provide a smooth curve. However, the quantities “does not fit the data well” and “is not smooth” are easier to compute numerically than the original targets. Specifically, the sum of all the entries in column (5), the normalized fit error, represents “does not fit the data well.” The sum of the entries for the drift in the local trend in column (8) represents “is not smooth,” albeit indirectly. Then one would seek to reduce (minimize, speaking in numerical terms) those values.

Essentially, using a computer minimization routine, the process finds the points $y_i$ that minimize the standard squared differences (squared difference divided by variance) between those points and the observed data, the $x_i$’s. It simultaneously minimizes the standard squared differences between the $\Delta_i$’s and $\mu$ as well. First, a cell or variable adding together the subtotals of columns (5) and (8) is included in the chart and highlighted in yellow. Then the value in that target cell is minimized by finding the $y_i$’s that create the lowest possible value of the target. For reference, the solver routine in standard spreadsheet software was used to compute the $y_i$’s in all of the tables in this article.

In this case, the fitted curve looks like it could reasonably be a smoothed version of the data. However, in this special case, the data does show a steady uptick, mirroring the assumption of a constant governing trend. Hence, something similar to a straight line can be an effective smoothing of this particular data.
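To make the target concrete, here is a minimal sketch in Python (rather than the spreadsheet used for the tables in this article) of the constant-trend objective just described: the normalized fit error of column (5) plus the normalized drift of the local trend away from $\mu$ in column (8). The data values and variances below are illustrative assumptions, not the Table 1 inputs.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.1, 1.9, 3.2, 3.6, 4.9, 5.4, 6.8, 7.1])   # observed bucket values (illustrative)
sigma2 = np.full_like(x, 0.25)                            # process variances of the observations
mu, delta2 = 0.75, 0.80                                   # constant expected trend and drift variance

def target(y):
    fit_error = np.sum((x - y) ** 2 / sigma2)             # "does not fit the data well" (column (5))
    local_trend = np.diff(y)                              # point-to-point changes, the Delta_i (column (6))
    drift = np.sum((local_trend - mu) ** 2 / delta2)      # "is not smooth" (column (8))
    return fit_error + drift

result = minimize(target, x0=x, method="BFGS")            # analogue of the spreadsheet solver routine
y_smoothed = result.x
```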
On the other hand, when the raw data has a “hump,” or other curvature, the results of this smoothing method do not fit the data as well. In Figure 2 all the parameters are exactly the same as they were in Table 1 and Figure 1 (except that the target “C” is again minimized by selecting new $y_i$’s), but the raw data values are more U-shaped.
Of course, some of the fit problems may be mitigated by changing the values of the drift parameter $\delta^2$ and the expected trend $\mu$. One may vary both of those, as well as the various $\sigma_i^2$’s, to produce a better match to the data. In fact, varying those values produces the graph in Figure 3 (reusing the Figure 2 raw data). However, one may readily see that this eliminates all smoothing. Considering the alternatives, the use of a constant expected trend in the simple model limits its ability to appropriately smooth data with “humps.”
4. INCLUDING THE GHOST TREND
Since the constant underlying trend $\mu$ limits the ability of the smoothing process in Section 3 to mimic curves, it would be logical to enhance the model by allowing $\mu$ to change as one moves among the data points. Therefore, one would no longer specify a constant expected trend (as Table 1 did), but rather find the expected local trends that best match the data. Of course, there should be some control on the point-to-point changes, or something like Figure 3 will reoccur. The result is that one will have a set of “nearly invisible” $\mu_i$’s, governed by a requirement that the difference between each $\mu_i$ and $\mu_{i-1}$ follow a probability distribution (in this article they are specified to vary with a mean of zero and a constant variance of some $\tau^2$). The $\Delta_i$’s would continue to vary around an expected trend, except that now each will vary around its own individual $\mu_i$ with mean zero and a variance of some prespecified $\delta^2$.

The unobserved, indirectly estimated, and “nearly invisible” $\mu_i$’s affect the $\Delta_i$’s, the $\Delta_i$’s help to determine the $y_i$’s, and those smooth the $x_i$’s, which in turn are the only hard data in the process. So, in a sense, the $\mu_i$’s are shadows of shadows of the data. It is then logical to describe them as “ghost trend.”

In this case, the total standard squared error to be minimized still includes that of the differences between the $x_i$’s and the $y_i$’s and that between the $\Delta_i$’s and (now) the $\mu_i$’s. However, it also includes the squared differences (divided by $\tau^2$) arising between each $\mu_i$ and the $\mu_{i-1}$ that preceded it. Table 2 illustrates this process.

To implement the ghost trend process, the chart in Table 2 adds two columns to the chart from Table 1. The constant trend in the previous example is replaced with a column (7) of ghost trend. To keep the ghost trend smooth, a “penalty” column containing the squares of the differences between successive values of the ghost trend is included as column (8). To control the balance between that column, the fit error penalty column (5), and column (9) penalizing the drift of the actual trends (the $\Delta_i$’s) from the “expected” ghost trends (the $\mu_i$’s), each term in column (8) is divided by the $\tau^2$ discussed above.

Further, in this case, the sum of the fit error, the drift penalty, and the ghost trend smoothing penalty must be computed in the target cell/variable to be minimized. So, the total, C. below (in yellow), sums all three. The spreadsheet minimization routine was directed to minimize that value by choosing the values of the $y_i$’s and $\mu_i$’s in columns (4) and (7). The results are shown in Table 2. Of course, the key values are the smoothed values, the $y_i$’s, in column (4).

As one may see in Figure 4, this provides a better fit (conforms better) to the last half of the data from Figure 3.
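The same kind of sketch extends to the ghost trend target, now minimized over both the fitted values and the ghost trends; again, the data and parameter values below are illustrative assumptions rather than the Table 2 inputs.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 1.6, 2.7, 4.1, 4.8, 4.6, 3.9, 3.1])   # observed values with a "hump" (illustrative)
sigma2 = np.full_like(x, 0.25)                            # process variances around the true values
delta2, tau2 = 0.80, 0.0625                               # short- and long-term flexibility parameters
n = len(x)

def target(params):
    y, mu = params[:n], params[n:]                        # fitted values y_i and ghost trends mu_i
    fit_error = np.sum((x - y) ** 2 / sigma2)             # normalized fit error (column (5))
    drift = np.sum((np.diff(y) - mu) ** 2 / delta2)       # local trend Delta_i vs. its ghost trend mu_i (column (9))
    ghost = np.sum(np.diff(mu) ** 2 / tau2)               # penalty on changes in the ghost trend itself (column (8))
    return fit_error + drift + ghost

start = np.concatenate([x, np.diff(x)])                   # initialize at the raw data and its slopes
result = minimize(target, start, method="BFGS")
y_smoothed, ghost_trend = result.x[:n], result.x[n:]
```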
Now, in this case, the hump is fairly modest. However, when a more pronounced hump (such as a normal distribution with a low variance) is involved, the difference may be more significant. Figure 5 looks at data with a steeper hump but continues using the existing $\tau^2$ of 0.0625. In this case, the fit is improved, but still not that desirable. However, this approach also allows one to vary the “long-term flexibility” $\tau^2$. When that parameter is increased to five in Figure 6, the fit is demonstrably superior to that of the simpler model. Increasing $\tau^2$ results in smoothing with a fairly good fit.

This illustrates a key point. Both approaches require selecting two parameters. For the fixed expected trend approach, the trend $\mu$ and the short-term flexibility (or inverse of smoothness) parameter $\delta^2$ must be selected. For the ghost trend, $\delta^2$ must still be chosen, but one also chooses the long-term flexibility parameter $\tau^2$. Therefore, actual implementation may involve judgments of how much smoothness is desired and how much replication of the data, or “fit,” is required. It would be desirable if some proper optimum set of parameters for smoothing could be identified, but, considering Figure 3, that appears to be impossible. Nevertheless, given proper judgment-based selections of the flexibility parameters, this appears to be a very good, structured tool for smoothing data.
5. AN EXAMPLE: USING AN ENHANCED GHOST TREND ANALYSIS ON VERY CHALLENGING DATA
Sometimes fitting a curve can be difficult even when the ghost trend process is used. For example, in the process of preparing a separate article related to assessing transfer of risk (Boor 2021), it was necessary to use an aggregate loss distribution with the number of claims generated by a Poisson random variable with a mean of five hundred and the severity distribution of each claim following a Pareto distribution with an alpha value of 1.5 and truncation point of 100,000. The simulation[2] was not overly hard to generate. However, the results of the simulation, graphed in Figure 7, and supported by the data in Table 3, are fairly “bumpy,” even after combining thirty thousand trials into one hundred bands. Due to the large number of bands, only the first and last ten rows of the table are shown.
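A rough sketch of that simulation (using numpy in place of the NTRAND spreadsheet functions, and interpreting the Pareto as a single-parameter Pareto with lower truncation point 100,000) might look like the following; the band edges are an assumption intended to mirror the $2 million-wide buckets described below.

```python
import numpy as np

rng = np.random.default_rng(2021)                         # seed chosen arbitrarily
trials, bands, band_width = 30_000, 100, 2_000_000
alpha, theta = 1.5, 100_000                               # Pareto shape and lower truncation point

counts = rng.poisson(lam=500, size=trials)                # Poisson claim counts, mean 500
aggregates = np.array([
    (theta * (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha)).sum()   # inverse-CDF Pareto severities
    for n in counts
])

edges = np.arange(0, (bands + 1) * band_width, band_width)
tally, _ = np.histogram(aggregates, bins=edges)           # the small tail beyond the last band is omitted
pct_of_trials = tally / trials                            # the "bumpy" raw curve to be smoothed
band_tops = edges[1:]                                     # labels correspond to the bucket top ends
```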
For reference, the labels correspond to the top ends of the buckets, which each have a width of $2 million. However, as one may see, a shift of $1 million to the left would not meaningfully change the appearance of the curve.
(Also, less than 1% of the curve lies in the tail beyond this range. However, due to the skewness of the distribution, graphing that would place most of the attention where the fewest losses are.)
The next step is, of course, to use the ghost trend approach. The approach is identical to that of Table 2, but with more data and different values for the constants. Note that, as the calculations unfold, the rationale behind the specific constants used will become clear. The resulting graph (including the raw data it began with) is in Figure 8.

As one may see, in this case and with the given assumptions, the ghost trend approach alone does not match this numerous and volatile raw data very well. The associated calculations are shown, again for the first and last ten rows, in Table 4.
Table 4 partially explains the poor fit in this case. Although the trend values are modified by the values of $\delta^2$ and $\tau^2$, which may be titrated up and down for the desired degree of “stiffness,” the impact of the fit error column (5) is greatly affected by the variances in column (3). Further, the sum of column (5) is much larger than that of column (8), but since the effect of column (8) is increased by an “adjustment factor” (discussed below) of 2000, the trend controls in column (8) have roughly the same impact as the fit (or accuracy) controls. So, this may be thought of as a fifty/fifty balance between smoothness and fit accuracy. Note also, for reference, that the variance-based divisors $\delta^2$ and $\tau^2$ are now applied at the bottom of each column rather than used in the calculation of the individual column entries.
As noted above, some additional adjustment factors are also used. Certainly, one is needed for the fit error column. As it turns out, though, two additional columns are both helpful in controlling the accuracy and smoothness of the fitted curve. First, a review of column (5) will show that the fit errors that enforce accuracy are much lower in the upper end of the range. Therefore, since small changes in the smoothed values would not generate many changes in the total error value at the bottom of the chart, one might argue that there is less emphasis on accuracy in the upper end of the range. However, readers might desire a smooth curve that works in different contexts with different requirements. So, now there is an additional column, along with its own adjustment factor. In its calculation, the difference between each raw value and the corresponding fitted value is divided by the raw value before the result is squared and divided by the variance. That gives more weight to the smaller values, for a more consistent fit.

Another adjustment is included in this version. Noticing the sort of “granular bumpiness” (a high degree of small, high-slope oscillations near the tail), a direct control against abrupt changes in the slope is now included in the new column (8). It is simply the squared difference between each slope $\Delta_i$ and the previous slope $\Delta_{i-1}$. This penalizes abrupt changes in the trend/slope. The adjustment factor applied to the total of those values titrates its influence on the fitted curve.

Those adjustments result in the curve in Figure 9. Note that the general shape is somewhat close to acceptable, but there is still so much bumpiness that, within the scale of the graph, it cannot be distinguished from the raw data.
That curve is generated by the process in Table 5. Again, in this case, all the normalizing divisions (by $\delta^2$, $\tau^2$, etc.) take place at the bottom of the table.
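A sketch of the resulting enhanced (“ghost trend+”) target, in the same notation as before, appears below. The relative fit error term and the slope-change penalty correspond to the two additional columns just described; the adjustment factors and their default values are illustrative stand-ins for the judgmentally titrated factors (such as the 2000 noted above), and the exact column layout of Table 5 is not reproduced.

```python
import numpy as np
from scipy.optimize import minimize

def enhanced_target(params, x, sigma2, delta2, tau2,
                    a_rel=1.0, a_ghost=2000.0, a_slope=1.0):
    """Fit error + relative fit error + trend drift + ghost smoothness + slope-change control."""
    n = len(x)
    y, mu = params[:n], params[n:]                           # fitted values and ghost trends
    slopes = np.diff(y)                                      # the local trends Delta_i
    safe_x = np.where(x != 0, x, 1.0)                        # guard against empty bands
    fit = np.sum((x - y) ** 2 / sigma2)                      # absolute (normalized) fit error
    rel = a_rel * np.sum(((x - y) / safe_x) ** 2 / sigma2)   # relative fit error: more weight to small values
    drift = np.sum((slopes - mu) ** 2) / delta2              # local trend vs. ghost trend; divisor applied to the total
    ghost = a_ghost * np.sum(np.diff(mu) ** 2) / tau2        # ghost trend smoothness, boosted by an adjustment factor
    slope_chg = a_slope * np.sum(np.diff(slopes) ** 2)       # direct control on abrupt slope changes
    return fit + rel + drift + ghost + slope_chg

# Illustrative usage on banded simulation output x with estimated variances sigma2:
# start = np.concatenate([x, np.diff(x)])
# res = minimize(enhanced_target, start, args=(x, sigma2, delta2, tau2), method="BFGS")
# y_plus = res.x[:len(x)]
```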
To finish the curve, a different smoothing process, centered five-point averaging, was used. That produced the quite acceptable curve in Figure 10.
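A minimal sketch of that final pass, assuming equal weights and repeated endpoints at the edges (the edge treatment is not specified in the text):

```python
import numpy as np

def centered_five_point(y):
    """Average each value with its two neighbors on each side (endpoints repeated at the edges)."""
    padded = np.pad(y, 2, mode="edge")
    return np.convolve(padded, np.ones(5) / 5.0, mode="valid")

# e.g. final_curve = centered_five_point(y_plus)   # applied to the ghost trend+ output
```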
For reference, a graph comparing the fit, on the scale above, of straight five-point averaging to this process is provided in Figure 11.
As one may see, due to the relatively small changes from point to point and the large volume of points, straight five-point averaging is almost as good as, or as good as, the ghost trend+ process. However, the example works well as an illustration of the process, if not as a demonstration of its advantages.
For an example where the ghost trend+ process is clearly superior, one need only look at the same situation, only with 2,000 points in the sample rather than 30,000.
There are several curves in the graph, but a few things are visible:
- The final ghost trend+ curve is smooth and fits the data well;
- It also is fairly close to the presumably more accurate curve resulting from 30,000 samples of the underlying distribution; and
- Five-point averaging on this raw data does not result in a smooth curve.
So, one may conclude that this enhanced ghost trend process can be quite useful in the right circumstances.
6. WHAT IF THE DISTANCES BETWEEN THE POINTS VARY FROM POINT TO POINT?
It is fairly common to break down data into categories such as “under $5,000,” “$5,000-$9,999,” “$10,000-$24,999,” “$25,000-$50,000,” and “over $50,000.” In that example, one could attempt to fit a smooth curve to values corresponding to the points 2,500, 7,500, 17,500, 37,500, and 100,000 (making judgmental selections for the points at the bottom and top). Then the spacing between the points is 5,000, 10,000, 20,000, and 62,500. In other words, they are very unequally spaced. One might expect the ghost trend to change a lot more between 37,500 and 100,000 than between 2,500 and 7,500, depending on the appearance of the data that is involved. However, the more important question is how it changes between adjacent intervals. For example, how does it change between the interval from 17,500 to 37,500 and the interval from 37,500 to 100,000?
Since Brownian motion would say that the variance between values is proportional to the distance between the points they correspond to, it seems logical that the value of $\tau^2$ be multiplied by the distance between the midpoints of the intervals: (100,000 + 37,500)/2 − (37,500 + 17,500)/2 = 68,750 − 27,500 = 41,250. Thus, the variance between those two adjacent $\mu_i$’s would be proportional to 41,250, in effect 41,250 times $\tau^2$. Then, that revised variance would be used in the denominator of the computations in the “Normalized Ghost Trend Squared Diff” column instead of just the overall $\tau^2$ associated with this smoothing process. In fact, one of those “scaling” multipliers must be used for each change in the ghost trend. It could also be logical to apply the same scaling within the $\delta^2$ terms as well. As one may imagine, the large 41,250 multiplier suggests that a much lower value of $\tau^2$ should be used. It also may be useful to visually compare these scaled $\mu_i$’s to those based on equal variances for the changes in ghost trend. When appropriate, this scaling process can be a useful tool.
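A brief sketch of that scaling, using the example bucket points above; only the ghost trend term is shown, and the function name is an illustrative assumption:

```python
import numpy as np

points = np.array([2_500.0, 7_500.0, 17_500.0, 37_500.0, 100_000.0])   # bucket index points from the text
gaps = np.diff(points)                          # 5,000, 10,000, 20,000, 62,500
midpoint_gaps = (gaps[:-1] + gaps[1:]) / 2      # e.g. (20,000 + 62,500)/2 = 41,250 for the last pair

def scaled_ghost_penalty(mu, tau2):
    """Normalized ghost trend squared differences with distance-scaled variances.

    mu holds one ghost trend value per interval between adjacent points.
    """
    return np.sum(np.diff(mu) ** 2 / (tau2 * midpoint_gaps))
```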
7. SUMMARY

Even the simpler approach is of value in smoothing data with a great deal of variance. However, by carefully choosing the parameters in the ghost trend model, one may convert data with a lot of process error into a smooth curve that does a quality job of reflecting the data. The enhanced process expands that to cover a wider variety of scenarios.
[1] For reference, data is typically provided in buckets to expedite processing, e.g., “claims on policies with amounts of insurance between $45,000 and $55,000,” but curves are usually fit to points like “$50,000.”

[2] The simulation was done using the NTRAND implementation of the Mersenne Twister and standard spreadsheet functions. The author notes that real-world situations would often add parameter variance, but this approach is suitable in context.