Method Article
We describe a multistage method for evaluating a cohort effect in age-period contingency table data, which removes the influence of age and period without sacrificing data quality. The protocol demonstrates the strategy and provides a weighted regression model for analyzing hepatocellular carcinoma data.
A multistage method was adopted to evaluate the cohort effect after eliminating the influence of age and period in age-period contingency table data. The most common primary malignant tumor of the liver is hepatocellular carcinoma (HCC), which is associated with liver cirrhosis of alcoholic and viral etiologies. In epidemiology, long-term trends in HCC mortality are described (or forecasted) by using an age-period-cohort (APC) model. HCC deaths were determined for each cohort and used to weight its influence. The confidence interval (CI) of the weighted mean is fairly narrow compared with that of the equally weighted estimates. Because a narrower CI implies less uncertainty, the weighted mean estimate was used for forecasting. With the multistage method, we recommend using weighted mean estimation based on a regression model to evaluate the cohort effect in age-period contingency table data.
The most common primary malignant tumor of the liver is hepatocellular carcinoma (HCC). Its mortality rate ranks fifth in men and eighth in women (6% of men and 3% of women) 1 among all malignant tumors worldwide. In Taiwan, it is the most common cancer in men and the second most common cancer in women (21.8% of men and 14.2% of women) 2. It is estimated that since 2000, the annual number of HCCs diagnosed worldwide is 564,000, among which 398,000 are men and 166,000 are women 3. In epidemiology, the most common way to explain the relationship between age, period, and cohort (APC) variables is that age and period influence each other to create a unique generational experience for the disease trend investigated.
Even though this conceptualization still retains the exact linear relationship age + cohort = period, exposure (predictor) is not an inherent factor in a birth cohort. Instead, we propose that a cohort effect exists when changes cause different distributions of disease. Nevertheless, because age + cohort = period, the three variables are linearly dependent; unless additional restrictions are imposed, it is impossible to estimate an age-period-cohort (APC) model using the linear effects of age, period, and cohort. In this study, we clarify this problem and the restrictions we imposed in our previous publications 4,5,6,7.
With minimal assumptions about the contingency table data, the multistage method 8 provides three stages to evaluate the cohort effect. Median polish is the primary technique used in the multistage method. In addition, because median polish does not depend on a specific distribution or framework, it can be used for various types of data, such as ratios, logarithmic ratios, and counts.
Data from a two-way contingency table 9 were used to illustrate the median polish procedure. Median polish removes the additive effects of age (i.e., row) and period (i.e., column) by iteratively subtracting the median from each row and each column. This procedure is often used in epidemiological data analysis 10. One advantage of this technique is that it requires no assumptions about the distribution or structure of the data in the two-way contingency table. Therefore, it can be applied to any type of data contained in such a table, such as suicide data 11. The APC model has also been used to describe long-term trends in disease incidence or mortality 5. APC models often assume that age, period, and cohort have additive effects on the log-transformed disease or mortality rate. To evaluate cohort effects, the protocol described here fits an APC model with weighted regression for a complete analysis of hepatocellular carcinoma (HCC) mortality, thereby supporting reliable predictions and assessment of treatment effects.
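The iterative row and column sweeps described above can be sketched as follows. This is a minimal illustration of a standard median polish, not the authors' SPSS or Excel workflow; the function name `median_polish`, its parameters, and the small demonstration table are assumptions introduced here.

```python
import numpy as np

def median_polish(table, max_iter=10, tol=1e-8):
    """Median polish of a 2-D table (rows = age groups, columns = periods).

    Returns (overall, row_effects, col_effects, residuals) such that
    table == overall + row_effects[:, None] + col_effects[None, :] + residuals.
    """
    resid = np.asarray(table, dtype=float).copy()
    overall = 0.0
    row_eff = np.zeros(resid.shape[0])
    col_eff = np.zeros(resid.shape[1])
    for _ in range(max_iter):
        # Sweep the row medians out of the residuals into the row effects.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row_eff += row_med
        # Re-center the column effects; move their median to the overall term.
        shift = np.median(col_eff)
        col_eff -= shift
        overall += shift
        # Sweep the column medians out of the residuals into the column effects.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col_eff += col_med
        # Re-center the row effects; move their median to the overall term.
        shift = np.median(row_eff)
        row_eff -= shift
        overall += shift
        # Stop once a full sweep no longer changes the residuals.
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, row_eff, col_eff, resid

# Hypothetical demonstration table (not study data): purely additive, so the
# residuals left after polishing should all be zero.
demo = 2.0 + np.array([-1.0, 0.0, 1.0])[:, None] + np.array([-0.5, 0.0, 0.5])[None, :]
overall, row_eff, col_eff, resid = median_polish(demo)
```

In the protocol's setting, the input would be the table of log-transformed mortality rates, and the returned residuals are what the later regression stage uses.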
1. Data sources
To demonstrate the calculations, we used annual data on HCC mortality from 1976 to 2015 for men and women in Taiwan. Statistical Package for the Social Sciences (SPSS) version 24.0 for Windows and Microsoft Excel were used to carry out the protocol in this study.
2. Model setting
NOTE: The multistage method was proposed by Keyes and Li 8. It begins with a graphical investigation of the data. A median polish analysis is then performed to eliminate the additive effects of age and period. Finally, the residuals from the median polish phase are regressed on cohort category in a linear regression model to evaluate the cohort effects in the contingency table data.
The mortality data were tabulated for 10 five-year age groups (40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, 80-84, and 85+) and 8 five-year time periods (1976-1980, 1981-1985, 1986-1990, 1991-1995, 1996-2000, 2001-2005, 2006-2010, and 2011-2015). The number of birth cohorts equals the number of age groups plus the number of time periods minus one: 10 (five-year age groups) + 8 (five-year time periods) - 1 = 17 birth cohorts, denoted by their mid-cohort years as 1891, 1896, 1901, 1906, 1911, 1916, 1921, 1926, 1931, 1936, 1941, 1946, 1951, 1956, 1961, 1966, and 1971. We provide the format of the age-period contingency table for men and women with HCC in Supplementary Table 1. Figure 1 and Figure 2 show the HCC mortality rates by age group and period. The fluctuations were more substantial among men than among women. The age-specific rates show that, at the lower end of the age range (the 40-44 age group), HCC mortality rates are increasing (Figure 1). In contrast, Figure 2 clearly shows that HCC mortality rates gradually increased in the ≥60 age groups. However, the age-specific HCC mortality rates have changed substantially over time, which means that a remarkable cohort effect hidden in the usual age-period cross-classified vital statistics table will not become apparent until some point in the future.
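The cohort count and mid-cohort years above follow from subtracting the mid-age from the mid-period year of each cell. The sketch below reproduces that arithmetic; the helper name `mid_cohort_year` is introduced here for illustration, and treating the open-ended 85+ group as a five-year group with midpoint 87 is an assumption made so that it matches the 1891 cohort label.

```python
# Lower bounds of the 10 five-year age groups (85 stands for 85+) and of the
# 8 five-year periods used in the text.
age_starts = list(range(40, 90, 5))          # 40, 45, ..., 85
period_starts = list(range(1976, 2016, 5))   # 1976, 1981, ..., 2011

def mid_cohort_year(age_start, period_start):
    """Mid-period year minus mid-age, e.g. 1978 - 42 = 1936."""
    mid_period = period_start + 2
    mid_age = age_start + 2   # assumption: 85+ is treated as 85-89
    return mid_period - mid_age

# Cohort label for every age-by-period cell; cells on the same diagonal of the
# contingency table share a birth cohort.
cohort_table = [[mid_cohort_year(a, p) for p in period_starts] for a in age_starts]
cohorts = sorted({c for row in cohort_table for c in row})

print(len(cohorts))   # number of distinct birth cohorts
print(cohorts)        # mid-cohort years
```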
We applied the median polish procedure to log-transformed HCC mortality rates. The estimated cohort effects from the APC model of HCC mortality rates for men and women are shown in Tables 1 and 2, respectively. For each sex, Tables 1 and 2 also report the equally weighted (unweighted) estimates obtained before the weighted estimates. The weighted estimates conform to the data better than the equally weighted cohort effects, as indicated by the narrower confidence intervals (CIs) of the weighted estimates.
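The regression stage can be sketched as a weighted least-squares fit of the median polish residuals on cohort indicator variables. This is a hedged illustration, not the SPSS procedure used in the study: the function name `cohort_effects_wls`, its arguments, and the toy numbers are assumptions, and the sketch returns point estimates only (no confidence intervals). Residuals are assumed to be on the log scale, so exponentiated coefficients are read as rate ratios relative to the reference cohort.

```python
import numpy as np

def cohort_effects_wls(residuals, cohorts, weights, reference):
    """Weighted least squares of residuals on cohort dummies.

    residuals: median polish residuals (log scale), one per table cell
    cohorts:   mid-cohort year of each cell
    weights:   weight of each cell (for example, the number of deaths)
    reference: cohort year used as the reference category
    Returns {cohort_year: exp(coefficient)}, with the reference fixed at 1.0.
    """
    y = np.asarray(residuals, dtype=float)
    cohorts = np.asarray(cohorts)
    w = np.asarray(weights, dtype=float)
    levels = [c for c in np.unique(cohorts) if c != reference]
    # Design matrix: intercept plus one indicator column per non-reference cohort.
    columns = [np.ones(len(y))] + [(cohorts == c).astype(float) for c in levels]
    X = np.column_stack(columns)
    # Scaling rows by sqrt(weight) turns ordinary least squares into WLS.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    effects = {int(reference): 1.0}
    for c, b in zip(levels, beta[1:]):
        effects[int(c)] = float(np.exp(b))
    return effects

# Toy input (not study data): two cells per cohort, with unequal weights.
toy = cohort_effects_wls(
    residuals=[0.0, 0.0, 0.0, 0.2],
    cohorts=[1921, 1921, 1926, 1926],
    weights=[5, 5, 1, 3],
    reference=1921,
)
```

Setting all weights to 1 gives the equally weighted estimates, which makes it easy to compare the two sets of effects on the same residuals.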
For men, the left panel of Table 1 shows the unweighted cohort effects of the birth cohorts. The cohort effect increases from 0.73 (the earliest cohort, 1891) to 1.20 (the greatest cohort effect, in 1936). For women (left panel of Table 2), the cohort effect increases from 0.68 (the earliest cohort, 1891) to 1.35 (the greatest cohort effect, in 1936). Compared with the 1891 cohort, the cohort effect for men and women therefore increased by approximately 64% and 98%, respectively. A similar increase appears in the weighted estimates in the right panel of Table 1, where the cohort effect for men increased from 0.71 (1891) to 1.11 (the greatest cohort effect, in 1936). For women, a similar increase is shown in the right panel of Table 2, where the cohort effect increased from 0.64 (1891) to 1.11 (the greatest cohort effect, in 1926). Therefore, compared with the earliest cohort, we observed an increase in mortality rates of approximately 57% and 73% for men and women, respectively.
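The percentage increases quoted above are the ratio of the peak cohort effect to the earliest cohort effect, minus one. A short sketch of that arithmetic follows; the helper name `percent_increase` is introduced here, and because the inputs are the rounded values printed in Tables 1 and 2, the results can differ by about one percentage point from figures computed on unrounded estimates.

```python
def percent_increase(earliest, peak):
    """Relative increase of the peak effect over the earliest effect, in percent."""
    return (peak / earliest - 1.0) * 100.0

# (earliest 1891 effect, peak effect) as printed in Tables 1 and 2.
table_values = {
    "men, unweighted": (0.73, 1.20),
    "women, unweighted": (0.68, 1.35),
    "men, weighted": (0.71, 1.11),
    "women, weighted": (0.64, 1.11),
}

increases = {label: percent_increase(lo, hi) for label, (lo, hi) in table_values.items()}
for label, value in increases.items():
    print(f"{label}: {value:.1f}%")
```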
Among the birth cohorts, men born around 1936 showed the highest risk of HCC mortality (Supplementary Table 1). For the weighted estimates in Table 1, the effect of the 1936 birth cohort relative to the reference birth cohort of 1921 was 1.11 (95% CI: 1.08-1.14). The cohort effects increased sharply from the earliest cohort of 1891 onward and reversed after the 1936 cohort. For women (Table 2), compared with the reference birth cohort of 1916, the greatest weighted effect was 1.11 (95% CI: 1.07-1.16). In addition, for both men and women, we modeled equally weighted and weighted cohort effects with 95% confidence intervals. For both sexes, the 95% CIs of the equally weighted cohort effects are wider than those of the weighted effects for almost all cohorts.
Figure 1. HCC mortality rates per 100,000 by age and period, men, Taiwan, 1976-2015.
Figure 2. HCC mortality rates per 100,000 by age and period, women, Taiwan, 1976-2015.
Figure 3. Age-adjusted mortality rate of death from hepatocellular carcinoma for men and women in Taiwan.
Cohort (1891~1971) | Unweighted effect | Unweighted 95% CI | Weighted effect | Weighted 95% CI |
1891 | 0.73 | 0.59-0.90 | 0.71 | 0.57-0.88 |
1896 | 0.88 | 0.79-0.99 | 0.87 | 0.78-0.97 |
1901 | 0.89 | 0.83-0.96 | 0.81 | 0.71-0.92 |
1906 | 0.91 | 0.86-0.97 | 0.85 | 0.78-0.94 |
1911 | 0.95 | 0.90-1.00 | 0.89 | 0.83-0.96 |
1916 | 1.01 | 0.97-1.06 | 0.99 | 0.95-1.03 |
1921 | 1.00 | REF | 1.00 | REF |
1926 | 1.04 | 1.00-1.08 | 1.03 | 1.01-1.06 |
1931 | 1.10 | 1.06-1.14 | 1.08 | 1.06-1.11 |
1936 | 1.20 | 1.15-1.24 | 1.11 | 1.08-1.14 |
1941 | 1.14 | 1.09-1.19 | 1.10 | 1.07-1.13 |
1946 | 1.04 | 1.00-1.09 | 1.06 | 1.04-1.09 |
1951 | 0.91 | 0.87-0.96 | 1.00 | 0.98-1.03 |
1956 | 0.87 | 0.82-0.92 | 0.96 | 0.93-0.98 |
1961 | 0.82 | 0.76-0.88 | 0.88 | 0.85-0.92 |
1966 | 0.76 | 0.68-0.85 | 0.79 | 0.74-0.83 |
1971 | 0.71 | 0.57-0.87 | 0.83 | 0.80-0.87 |
Note: REF = reference; CI = confidence interval. |
Table 1. Estimated rate ratios and 95% confidence intervals for the effect of birth cohort on hepatocellular carcinoma mortality of men in Taiwan, birth cohorts 1891-1971.
Cohort (1891~1971) | Unweighted effect | Unweighted 95% CI | Weighted effect | Weighted 95% CI |
1891 | 0.68 | 0.42-1.10 | 0.64 | 0.38-1.09 |
1896 | 0.81 | 0.63-1.04 | 0.75 | 0.56-1.00 |
1901 | 0.80 | 0.67-0.95 | 0.70 | 0.52-0.94 |
1906 | 0.83 | 0.72-0.95 | 0.76 | 0.65-0.88 |
1911 | 0.88 | 0.78-0.99 | 0.85 | 0.78-0.93 |
1916 | 1.00 | REF | 1.00 | REF |
1921 | 1.12 | 1.01-1.24 | 1.08 | 1.03-1.13 |
1926 | 1.29 | 1.17-1.42 | 1.11 | 1.07-1.12 |
1931 | 1.30 | 1.18-1.43 | 1.10 | 1.05-1.15 |
1936 | 1.35 | 1.22-1.49 | 1.10 | 1.04-1.14 |
1941 | 1.19 | 1.07-1.32 | 1.09 | 1.03-1.13 |
1946 | 1.05 | 0.94-1.17 | 1.06 | 1.02-1.11 |
1951 | 0.83 | 0.73-0.94 | 1.00 | 0.96-1.05 |
1956 | 0.67 | 0.58-0.77 | 0.93 | 0.89-0.98 |
1961 | 0.58 | 0.49-0.70 | 0.79 | 0.74-0.84 |
1966 | 0.59 | 0.46-0.75 | 0.58 | 0.49-0.69 |
1971 | 0.63 | 0.40-1.02 | 0.64 | 0.58-0.72 |
Note: REF = reference; CI = confidence interval. |
Table 2. Estimated rate ratios and 95% confidence intervals for the effect of birth cohort on hepatocellular carcinoma mortality of women in Taiwan, birth cohorts 1891-1971.
Supplementary Table 1. Please click here to download this table.
Because of the time trend of HCC mortality, conventional models underestimate some important features hidden in the data (such as cohort effects), and conventional analyses that use simple linear extrapolation of the observed log age-adjusted rates show significantly reduced accuracy in their predictions. If we directly observe the long-term trend of HCC mortality in Taiwan from 1976 to 2015 (Figure 3), it appears that the upward trend has continued for 35 years and will continue upward in the next few years. In fact, the recent trend of HCC mortality in Taiwan is declining, driven by the cohort effect (determined by APC analysis), which, as mentioned earlier, declined after the 1936 cohort. This study shows that applying the APC model provides earlier and more accurate warnings about trend changes.
From a clinical perspective, approximately two billion people have been infected by hepatitis B virus (HBV) 12, and approximately 350 million people suffer as a result. Consequently, HBV is a significant health problem with high morbidity worldwide. HBV infection causes a wide range of clinical problems, from inactive carrier status to fulminant hepatitis, cirrhosis, or hepatocellular carcinoma. The most effective prevention method is inoculation with the hepatitis B vaccine. In 1984, Taiwan implemented the world's first hepatitis B mass vaccination program 13. In this program, pregnant women were screened for hepatitis B surface antigen (HBsAg) and hepatitis B e antigen (HBeAg) 14. For the first two years, the immunization program covered only babies of mothers with HBsAg; beginning in the third year, all babies were covered. The coverage rate of the hepatitis B vaccine has reached 99% in recent years 15. Nearly 90% to 95% of people acquire lifelong immunity after receiving the three doses of the vaccine. We emphasize that the decline in pediatric HCC in Taiwan is largely attributed to this universal vaccination program.
The APC modeling described in this article provides advance warning of these trend changes (an increasing trend that will decrease in the near future). When comparing the trend of cohort effects (Tables 1 and 2) with age-adjusted mortality (Figure 3), note that the direct age-adjusted mortality rate (or age-standardized mortality rate) is itself a weighted average: it weights the age-specific mortality rates by the proportion of each age group in a standard population, which in this study is the 2000 World Standard Population 16. To validate that cohort effects dominated the latest pattern of HCC mortality, we calculated the age-adjusted HCC mortality for the most recent data (up to 2011-2015). We interpret this to mean that the weighted mean estimate of the cohort effects provides reliable information for forecasting future HCC mortality. Details on forecasting HCC mortality are available in our previous study 5.
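Direct age standardization, as described above, is a weighted average of age-specific rates. The sketch below shows the calculation under stated assumptions: the function name `age_adjusted_rate` is introduced here, and the rates and standard-population counts are made-up numbers for illustration, not the 2000 World Standard Population or the study data.

```python
def age_adjusted_rate(age_specific_rates, standard_population):
    """Directly standardized rate: age-specific rates weighted by the share
    of each age group in the standard population."""
    if len(age_specific_rates) != len(standard_population):
        raise ValueError("need one standard-population count per age group")
    total = sum(standard_population)
    weighted_sum = sum(r * n for r, n in zip(age_specific_rates, standard_population))
    return weighted_sum / total

# Hypothetical illustration only: three age groups with invented rates per
# 100,000 and invented standard-population counts.
rates = [10.0, 20.0, 30.0]
standard = [2000, 1000, 1000]
adjusted = age_adjusted_rate(rates, standard)
print(adjusted)
```

Because the same standard weights are applied to every period, changes in the adjusted rate over time reflect changes in the age-specific rates rather than in the age structure of the population.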
A common assumption is that each value in the data provides equal information for estimating the parameters of a model. This assumption underlies most modeling methods (such as linear or nonlinear regression models) and means that the standard deviation of the error term is constant across all values of the predictor variables. However, according to our literature review, this assumption is often unsuitable for empirical parameter estimation. Weighted regression estimates the unknown parameters by assigning smaller weights to less precise data points and larger weights to more precise data points. The weighting process reduces the standard deviation of the estimator. Nonetheless, a shortcoming of the weighted regression method is that the exact weights are almost never known in empirical practice, so estimated weights are used to estimate the parameters. Previous experience has demonstrated that using estimated weights usually does not substantially change the regression analysis or its interpretation 17. In principle, the APC model can be fitted to any disease whose incidence is affected by age, period, and cohort. In addition, weighted mean estimates can be used for prediction 18,19,20. If the CI is comparatively narrow, the uncertainty is small. Because the CI depicts the uncertainty inherent in this type of evaluation, we conclude that reporting the CI has a substantial impact on how the estimates are interpreted.
Transcatheter arterial chemoembolization (TACE) is one of the most effective methods for the clinical control of HCC. It does not require open surgery, although it can be difficult to decide whether to use it as the primary or an auxiliary therapy. The liver typically receives about 75% of its blood and nutrients through the hepatic portal vein, while the hepatic artery provides the remaining 25%. In contrast, most HCCs grow rapidly and draw their blood supply from the hepatic artery, rarely from the hepatic portal vein. Moreover, HCC is well suited to TACE because primary liver cancer rarely metastasizes to other parts of the body. Even though hepatocellular malignant tumors are unlikely to metastasize, they are difficult to eradicate. In clinical practice, follow-up for HCC patients is conducted every two to three months. Once an abnormal elevation of alpha-fetoprotein (AFP) or an abnormal ultrasound finding is detected, computed tomography and magnetic resonance imaging are performed. If a new tumor is discovered, TACE is considered. New biomarkers have also been developed to detect the recurrence of HBV-related HCC, such as the HBV DNA quantitative-time index (HDQTI) 21. The HDQTI is the summation of the products of the follow-up results and the logarithm of the ratio of detected to normal HBV DNA load. It is used as an independent prognostic indicator of HBV-related HCC recurrence 21.
Our study has several limitations. First, we could only hypothesize about the etiologies of the observed changes. Using the APC model, HCC mortality was re-examined in terms of age, period, and cohort effects; nevertheless, in this study, we relied on the median polish setting as an assumption. Second, APC analysis has been widely used in the field of epidemiology in developing or recently developed countries for long-term cohort studies. Third, the aggregated data set did not contain the information needed to adjust for confounding factors in the APC model, such as comorbidities or lifestyle. Individual-level data are needed in future research to resolve this limitation. Fourth, to modify the regression procedure in the multistage method, we used the number of deaths due to HCC as the weight. Since the exact weight is not known, the use of different weights causes slight inflation in the estimated cohort effects. Finally, there are various APC estimation methods that address the identifiability problem (for example, Holford uses linear and curvature trends to address it 22). By comparison, the median polish approach represents a conceptual shift among APC models: it evaluates the cohort effect with the fewest assumptions and is easily applied to the common contingency table format.
Overall, modifying the regression model with weighted estimation yielded a weighted mean effect with a comparatively narrow CI for each cohort. In brief, for the multistage method, we advise using weighted estimates from the regression model to evaluate cohort effects in age-period contingency table data.
The authors have nothing to disclose.
This work was supported by Taipei Tzu Chi Hospital TCRD-TPE-109-RT-8 (2/3) and TCRD-TPE-109-39 (2/2).
Name | Company | Catalog Number | Comments |
not applicable | not applicable | not applicable | not applicable |
Copyright © 2025 MyJoVE Corporation. All rights reserved