Our calculations explained

Age-period-cohort models for projecting cancer incidence and mortality

Age-period-cohort modelling

Age-period-cohort models were used to calculate these projections. This approach assumes that the probability that someone will get cancer, or die of cancer, varies by their age, the calendar year in which they are diagnosed or die (period) and the birth cohort (e.g., generation) they belong to.

Age effects relate mainly to the biological process of ageing – older people are more likely to get cancer as their cells have had more time to accumulate damage. Period effects refer to changes at a specific time period which affect (albeit to varying degrees) all age groups alive at that time, for example the advent of smoking, or the introduction of tamoxifen. Cohort effects relate to changes which may happen at a specific time but affect each generation alive at that time differently, for example some generations may have started smoking younger than others. Age, period and cohort effects interact with one another, and age-period-cohort models are able to account for all those effects when projecting future cancer incidence and mortality. These models were fitted to observed cancer incidence and mortality rates, with parameters for each of these age, period and cohort effects. The model fitted to the data calculates a trend which is then used to extrapolate the data into the future. The results are projected age-specific incidence rates per 100,000 people, split by sex, five-year age band, cancer site and UK nation. These are weighted to the European standard population to create age-standardised rates that are appropriate for comparison over time and between nations, as they account for differences in population structure.

Models were fitted separately for each UK nation and projected cases or deaths were combined to create UK totals. For some cancer sites, a low number of observed cases or deaths in individual nations may have led to unreliable projections. To avoid this, models were fitted to UK-wide data for these sites, and projected cases or deaths for individual nations were calculated in proportion to national populations. Cancer sites with low numbers of cases even at UK level were grouped together as ‘Other cancers’ and were projected by applying 2014-2018 age- and sex-specific incidence and mortality rates for this group to each UK nation’s projected population; as these sites individually have quite variable observed trends, it was deemed inappropriate to project them using the age-period-cohort approach. The Northern Ireland Cancer Registry recently published projections for cancer incidence in Northern Ireland [1] and kindly provided Cancer Research UK (CRUK) with their data. Therefore, these projections were used where possible and were supplemented with CRUK’s own, using the above method, for cancer sites not included in the provided data. Projections for cancer mortality in Northern Ireland were also provided by the Northern Ireland Cancer Registry, using the same method as their incidence projections.

The numbers of cancer cases and deaths were calculated from the projected age-specific rates by applying those rates to projected populations from the Office for National Statistics.
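As a rough illustration of applying age-specific rates to projected populations, the sketch below multiplies a rate per 100,000 by the population in each age band and sums the results. The function name and all figures are hypothetical and are not taken from the published projections.

```python
def projected_cases(rates_per_100k, populations):
    """Apply age-specific rates (per 100,000) to projected populations.

    Both lists hold one entry per age band, in the same order.
    """
    if len(rates_per_100k) != len(populations):
        raise ValueError("rates and populations must cover the same age bands")
    return sum(rate * pop / 100_000 for rate, pop in zip(rates_per_100k, populations))


# Hypothetical values for three age bands (illustrative only)
example_rates = [10.0, 50.0, 200.0]
example_populations = [1_000_000, 800_000, 500_000]
total_cases = projected_cases(example_rates, example_populations)
```

With these made-up inputs the three age bands contribute 100, 400 and 1,000 cases respectively.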

Modifications to the model

The gradient of the projected trend is reduced over time, as it is unrealistic to assume that the same trends will continue forever – otherwise it would be possible to project that rates could exceed 100%. The gradient was reduced by combining observed (mean average of the last five years of observed data) and projected data in a weighted average with progressively more weight given to the observed data. The first projected datapoint was 95% projected and 5% observed data, with the proportions changing incrementally each year until the last projected datapoint was 60% projected and 40% observed data.
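The weighting described above could be sketched as follows. This assumes the projected share falls in equal steps from 95% to 60% across the projection years (the text says only that it changes incrementally), and the rates used are hypothetical.

```python
def projection_weights(n_years, first=0.95, last=0.60):
    """Weight given to the projected value in each projection year.

    Assumption: the weight changes by an equal step each year.
    """
    if n_years < 1:
        raise ValueError("need at least one projection year")
    if n_years == 1:
        return [first]
    step = (last - first) / (n_years - 1)
    return [first + i * step for i in range(n_years)]


def damp_projection(projected, observed_last_five):
    """Blend each projected rate with the mean of the last five observed rates."""
    baseline = sum(observed_last_five) / len(observed_last_five)
    weights = projection_weights(len(projected))
    return [w * p + (1 - w) * baseline for w, p in zip(weights, projected)]


# Hypothetical rates per 100,000 (illustrative only)
observed = [100.0, 100.0, 100.0, 100.0, 100.0]
raw_projection = [110.0, 120.0, 130.0]
weights = projection_weights(len(raw_projection))
damped = damp_projection(raw_projection, observed)
```

In this example the damped values rise more slowly than the raw projection, because each later year leans more heavily on the observed baseline.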

The last data point in the historical cancer incidence and mortality trends can have a large impact on the projected trend. This can lead to unreliable projections if there is some variability in recent incidence and mortality rates. To minimise the impact of this, we fitted five models for each cancer site, with each model omitting an additional year of historical data, and took the median of the five projected rates.
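Taking the median across the five models might look like the sketch below. The model fitting itself is not shown; the five lists stand in for the output of models that each omit one more year of historical data, and the values are hypothetical.

```python
from statistics import median


def median_projection(projections_by_model):
    """Return the year-by-year median across several models' projected rates.

    projections_by_model: one list of projected rates per model,
    all covering the same projection years in the same order.
    """
    return [median(year_values) for year_values in zip(*projections_by_model)]


# Hypothetical projected rates for two future years from five models
five_models = [
    [10.0, 11.0],
    [12.0, 13.0],
    [11.0, 12.0],
    [9.0, 10.0],
    [13.0, 14.0],
]
median_rates = median_projection(five_models)
```

Using the median rather than the mean limits the influence of any single model whose final data point happened to be unusually high or low.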

Impact of risk factors

Risk factors have been modelled implicitly in this analysis. This means that rather than directly adjusting for, say, overweight and obesity rates changing over time, this approach uses the trends seen in the rates of cases and deaths (which are affected by trends in risk factors) to make its projections. The same is true for the effects of new and improved treatments over time on mortality rates, and other variables such as changes to screening and early diagnosis.

The only exceptions to this are smoking rates and HPV vaccination and cervical screening uptake. Smoking rates have been included as an additional parameter for projections of lung cancer incidence and mortality rates due to the close association between smoking and lung cancer. Smoking rates have only been implicitly modelled in projections for other cancer sites for which smoking is a known risk factor. Projected cervical cancer incidence rates are taken from a paper that modelled these rates under the assumption that the nine-valent HPV vaccination would be introduced in 2019, that cervical screening uptake would be 86% and coverage would be at current rates.[2] Projected cervical cancer mortality rates were estimated from projected incidence rates by assuming that the current incidence-to-mortality ratio would continue into the future.

Statistical significance

Confidence intervals are not calculated for the projected figures. Projections are by their nature uncertain because unexpected events in future could change the trend. It is not sensible to calculate a boundary of uncertainty around these already uncertain point estimates. Changes are described as ‘increase’ or ‘decrease’ if there is any difference between the point estimates.

References

Donnelly DW, Anderson LA, Gavin A; Northern Ireland Cancer Registry Group. Cancer Incidence Projections in Northern Ireland to 2040. Cancer Epidemiol Biomarkers Prev 2020.
Castanon A, Landy R, Pesola F, Windridge P, Sasieni P. Prediction of Cervical Cancer Incidence in England, UK, up to 2040, Under Four Scenarios: A Modelling Study. Lancet Public Health 2018.

Last reviewed: 3 February 2023

Attributable risk calculations

Attributable risk (also called the population attributable fraction, PAF) is calculated by combining the proportion of the population exposed to the risk factor in question (often based on population surveys) with the relative risk (RR) associated with that risk factor (often based on meta-analyses).[1]

Calculating attributable risk/Population attributable fraction

[Infographic showing the formula to calculate attributable risk]
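The infographic's formula is not reproduced in this text. One widely used form is Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the proportion of the population exposed. The sketch below assumes that form; the prevalence and relative risk are hypothetical.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula for a single exposed/unexposed risk factor.

    prevalence: proportion of the population exposed (0 to 1).
    relative_risk: RR for exposed versus unexposed people.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: 20% of people exposed, RR of 3
paf_example = population_attributable_fraction(0.20, 3.0)
```

With these inputs the result is 0.4 / 1.4, or roughly 29% of cases attributable to the exposure.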

Usually the calculation takes into account a delay (lag) between exposure to the risk factor and cancer diagnosis, e.g. using exposure 10 years ago to calculate PAF for current cancer cases; this is often based on the lag between exposure and cancer diagnosis seen in the study from which the RR is taken.

‘Exposure’ may be defined as any exposure (versus none), or as exposure above/below an optimum level (that level is sometimes defined using Government guidelines).

If a risk factor is known to account for almost all cases of a particular cancer, but prevalence of exposure to that risk factor in the population is not known, then a ‘notional prevalence’ can be calculated. This is done by comparing observed cancer incidence rates in the population overall, with expected cancer incidence rates in an unexposed population.[2]

Each cancer type may have multiple risk factors, but summing the PAFs for all those risk factors would overestimate the total attributable proportion for that cancer type, because there is overlap between exposure to different factors. PAFs for a cancer type can be combined by applying the ‘risk factor B’ PAF only to the proportion of cases not attributable to ‘risk factor A’, and then applying the ‘risk factor C’ PAF only to the proportion of cases not attributable to ‘risk factor A’ or ‘risk factor B’, and so on until all the risk factors have been combined (risk factors can be added in any order).

Combining Population attributable fractions

Worked example, where Risk factor A PAF (RFA) = 10%, Risk factor B PAF (RFB) = 5% and Risk factor C PAF (RFC) = 3%:

1.	Calculate ‘% not attributable to RFA’ (‘not RFA’): 100% – 10% = 90%
2.	Apply RFB to ‘not RFA’, to get % of ‘not RFA’ which is attributable to RFB (‘not RFA but RFB’): 5% × 90% = 4.5%
3.	Subtract this from ‘not RFA’ to get % not attributable to RFA or RFB (‘not RFA or RFB’): 90% – 4.5% = 85.5%
4.	Apply RFC to ‘not RFA or RFB’, to get % of ‘not RFA or RFB’ which is attributable to RFC (‘not RFA or RFB but RFC’): 3% × 85.5% = 2.565%
5.	Subtract this from ‘not RFA or RFB’ to get % not attributable to RFA or RFB or RFC (‘not RFA or RFB or RFC’): 85.5% – 2.565% = 82.935%
6.	Subtract ‘not RFA or RFB or RFC’ from 100% to get % attributable to RFA or RFB or RFC (‘RFA or RFB or RFC’): 100% – 82.935% = 17.065%

Simply summing would give 18%.
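The sequential combination in the worked example can be sketched in a few lines. The function name is illustrative; the inputs are the three PAFs from the example (10%, 5% and 3%).

```python
def combine_pafs(pafs):
    """Combine PAFs by applying each one only to cases not yet attributed.

    pafs: iterable of proportions (e.g. 0.10 for 10%).
    """
    remaining = 1.0  # proportion of cases not attributed to any factor so far
    for paf in pafs:
        remaining -= paf * remaining
    return 1.0 - remaining


combined = combine_pafs([0.10, 0.05, 0.03])
reordered = combine_pafs([0.03, 0.10, 0.05])
```

Because each step multiplies the remaining proportion by (1 - PAF), the result is the same whatever order the risk factors are added in.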

Theoretically, all cancer cases attributable to a risk factor could be prevented by removing exposure to that risk factor. However, we acknowledge that it is very difficult to completely remove a risk factor at population level, and so the total number of ‘preventable cancer cases’ based on PAFs is a very ambitious target.

PAFs can be expressed as a percentage, a proportion, or an absolute number of cases or deaths.

See also

Want to generate bespoke preventable cancers stats statements? Download our interactive statement generator.

Risk terminology explained

References

Rockhill B, Newman B, Weinberg C. Use and misuse of population attributable fractions. Am J Public Health 1998;88:15-9.
Peto R, Lopez AD, Boreham J, et al. Mortality from tobacco in developed countries: indirect estimation from national vital statistics. Lancet 1992;339(8804):1268-78

Last reviewed: 10 October 2014

Calculating the impact of improved survival

The impact of improved survival is calculated in three parts:

First: multiply the number of patients diagnosed in a particular time period by the survival from that same time period (what has happened?)
Second: multiply the same number of patients diagnosed in the time period by the survival from a previous time period (what would have happened?)
Third: subtract the second number from the first to identify how many more people have survived.

(N of patients diagnosed with cancer in most recent time period (T1) x survival in T1 period)

minus

(N of patients diagnosed with cancer in T1 period x survival in a previous time period (T2))
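The three-part calculation could be written as the short sketch below. The patient count and survival proportions are hypothetical.

```python
def additional_survivors(n_diagnosed_t1, survival_t1, survival_t2):
    """Extra survivors among T1 patients due to improved survival.

    n_diagnosed_t1: patients diagnosed in the most recent period (T1).
    survival_t1: survival proportion in T1 (what has happened).
    survival_t2: survival proportion in an earlier period, T2
                 (what would have happened).
    """
    actual = n_diagnosed_t1 * survival_t1
    counterfactual = n_diagnosed_t1 * survival_t2
    return actual - counterfactual


# Hypothetical inputs: 10,000 patients, survival up from 50% to 55%
extra = additional_survivors(10_000, 0.55, 0.50)
```

Here 5,500 people survive under T1 survival against 5,000 under T2 survival, a difference of about 500.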

See also

Survival terminology explained

Last reviewed: 27 January 2015

Cancer risk factors evidence

We focus mainly on exposures classified by the International Agency for Research on Cancer (IARC) and/or the World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR) as being causally linked with the cancer type. IARC and WCRF/AICR evaluations are internationally-recognised, and are considered the gold standard in cancer epidemiology.

IARC and WCRF/AICR base their classifications on reviews of all the available evidence, taking into account the amount, quality and consistency of evidence. Exposures with the strongest evidence are classified as sufficient/Group 1 (IARC) or convincing (WCRF/AICR), and we use these factors in our key statistics. Exposures classified as having weaker evidence are also covered in the in-depth risk factors content.

IARC evaluates evidence on the carcinogenic risk to humans of a number of exposures including tobacco, alcohol, infections, radiation (ionising and ultraviolet), occupational exposures, and medications (including exogenous hormones). WCRF evaluates evidence for other exposures including diet, overweight and obesity, and physical exercise.

Where available, meta-analyses and systematic reviews are cited, as they provide the best overview of all available research and most take study quality into account. Individual case-control and cohort studies are reported where such aggregated data are lacking.

World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR)

Last reviewed: 16 December 2014

Children smokers calculations

The number of new child smokers was calculated by comparing smoking rates of 'current smokers' at each age from the Smoking, Drinking and Drug Use among Young People in England reports with the smoking rates of the same cohort in the year before.[1] 'Current smokers' include both regular smokers (one or more cigarettes per week) and occasional smokers (less than one cigarette per week).

For example, suppose that of a thousand children aged 12 in 2011, 10 smoked regularly, 20 smoked occasionally and 20 used to smoke, and that of a thousand children aged 13 in 2012, 30 smoked regularly, 20 smoked occasionally and 40 used to smoke. There were then 20 more current smokers in 2012 than in 2011 (50 compared with 30), and 20 more children who ‘used to smoke’ (40 compared with 20), so around 20 of the 12-year-old smokers in 2011 have given up. Because the children who gave up must have been replaced by an equivalent number of children who started smoking, we add these 20 to the net increase and conclude that there are actually 40 new child smokers.
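The worked example can be reproduced with the sketch below, which follows one cohort from one survey year to the next. The dictionary keys and function name are illustrative, and the counts are those from the example (per thousand children).

```python
def new_child_smokers(previous_year, current_year):
    """Estimate new smokers in a cohort between two consecutive survey years.

    Each argument is a dict with counts for 'regular', 'occasional'
    and 'used_to_smoke' for the same cohort, one year apart.
    """
    current_before = previous_year["regular"] + previous_year["occasional"]
    current_after = current_year["regular"] + current_year["occasional"]
    net_increase = current_after - current_before
    # Rise in ex-smokers approximates how many current smokers gave up
    gave_up = current_year["used_to_smoke"] - previous_year["used_to_smoke"]
    return net_increase + gave_up


aged_12_in_2011 = {"regular": 10, "occasional": 20, "used_to_smoke": 20}
aged_13_in_2012 = {"regular": 30, "occasional": 20, "used_to_smoke": 40}
new_smokers = new_child_smokers(aged_12_in_2011, aged_13_in_2012)
```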

References

Health & Social Care Information Centre. Smoking, Drinking and Drug Use among Young People in England.

Last reviewed: 10 October 2014

Deprivation gradient

A deprivation gradient shows whether there is a real difference between socio-economic groups on a measure, such as cancer incidence or cancer mortality. The difference between socio-economic groups is typically calculated using age-standardised rates along with measures of statistical significance (e.g. confidence intervals), to assess whether there is a statistically significant difference between the most deprived group and the least deprived, once differing age profiles across groups are taken into account.

Last reviewed: 31 December 2020

Excess cases or deaths calculations

Excess cases or deaths can be calculated to show the difference between what is observed for a category of groups with differing incidence or mortality rates (e.g. deprivation groups), and what could be observed if there were no differences in rates between these groups. Excess cases or deaths are calculated by multiplying the age-specific crude rates for the reference group (e.g. least deprived) by the underlying populations in each of the remaining groups within the category (e.g. less deprived, more deprived, most deprived), to produce the expected number of cases/deaths per group. The expected cases/deaths for each group are then subtracted from the number of observed cases/deaths in the corresponding group, resulting in either excess (if rates are higher than in the reference group) or fewer (if rates are lower than in the reference group) cases/deaths.
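A minimal sketch of the excess calculation for one comparison group is shown below. It assumes rates are expressed per 100,000 and uses two hypothetical age bands; none of the figures come from real data.

```python
def expected_cases(reference_rates_per_100k, group_populations):
    """Cases expected in a group if it had the reference group's age-specific rates."""
    return sum(
        rate * pop / 100_000
        for rate, pop in zip(reference_rates_per_100k, group_populations)
    )


def excess_cases(observed, reference_rates_per_100k, group_populations):
    """Observed minus expected; negative values mean fewer cases than expected."""
    return observed - expected_cases(reference_rates_per_100k, group_populations)


# Hypothetical inputs for two age bands
least_deprived_rates = [20.0, 100.0]         # reference group, per 100,000
most_deprived_population = [500_000, 300_000]
most_deprived_observed = 450

expected = expected_cases(least_deprived_rates, most_deprived_population)
excess = excess_cases(most_deprived_observed, least_deprived_rates, most_deprived_population)
```

Here 400 cases would be expected at the reference group's rates, so the 450 observed cases represent an excess of 50.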

Last reviewed: 31 December 2020

Gaps calculations

Gaps are calculated to show the difference between two points. A gap is simply the arithmetic difference between two specified values and is commonly used to show the difference between categories. For example, a deprivation gap is the difference between the least and most deprived groups.

Last reviewed: 10 October 2014

Lifetime risk calculations

Methods of calculating lifetime risk

Various methods exist to estimate the lifetime risk of developing cancer.
The “cumulative risk” method uses the number of cases of cancer (incidence) and the population estimates for each age.[1] Whilst simple to calculate, this method overestimates the risk of developing cancer during one’s lifetime, as it does not take into account that people die from other causes at different ages.
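The cumulative risk method is commonly computed by summing age-specific rates over the age range to give a cumulative rate, then converting that to a risk as 1 - exp(-cumulative rate). The sketch below assumes that standard form, five-year age bands and rates per 100,000; the rates are hypothetical.

```python
import math


def cumulative_risk(rates_per_100k, band_width_years=5):
    """Cumulative risk from age-specific incidence rates.

    rates_per_100k: annual incidence rate per 100,000 in each age band.
    band_width_years: number of years covered by each band.
    """
    cumulative_rate = sum(rate / 100_000 * band_width_years for rate in rates_per_100k)
    return 1.0 - math.exp(-cumulative_rate)


# Hypothetical rates for three five-year age bands
risk = cumulative_risk([100.0, 500.0, 1000.0])
```

This calculation ignores deaths from other causes, which is why the method overestimates lifetime risk.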

The “current probability” method (Esteve et al[2]) takes into account that people die from other causes. In addition to the number of cases of cancer (incidence) and population estimates for each age, it uses data on deaths from all causes (from life tables). This method gives a better estimate of the lifetime risk of cancer than the cumulative risk method, but it still overestimates the lifetime risk if the data include multiple primaries, so it is best used when these have been excluded from the data.

The “adjusted for multiple primaries (AMP)” method or “Sasieni” method (Sasieni et al[1]) addresses the issue of multiple primary tumours by assuming that the risk of developing a new primary diagnosis of cancer is the same for an individual who has not had a previous diagnosis as it is for an individual who has had a previous diagnosis of cancer, and adjusts the data accordingly.[1] Where possible, we use the AMP method for calculating lifetime risk where subsequent primary tumours are likely (e.g. breast cancer), as this avoids overestimating the lifetime risk of developing cancer.

Data available	Cancer sites	Recommended method
Population and cancer incidence	Any	"Cumulative risk"
Population, all cause death, cancer death, cancer incidence (excluding multiple primary tumours)	Any	"Current probability"
Population, all cause death, cancer death, cancer incidence (including multiple primary tumours)	Sites where multiple primary tumours are not likely	"Current probability"
Population, all cause death, cancer death, cancer incidence (including multiple primary tumours)	All cancers combined, sites where multiple primary tumours are likely	"Adjusted for multiple primaries"

Cohort approach and period approach

The methods above are often calculated using a period approach, that is they are based on current incidence and mortality rates and therefore they assume that the current rates (at all ages) will remain constant.

Our lifetime risk calculations use a cohort approach.[3] The cohort approach uses either known or projected incidence and mortality rates for each age group as it ages. For example, for a cohort born in 1960 the incidence and mortality rate for 25 year olds will be based on data from 1985, rates for 50 year olds will be based on data from 2010 and rates for 75 year olds will be based on projections for 2035.
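The cohort approach amounts to looking up, for each age, the rates from the calendar year in which the cohort reaches that age. The sketch below shows that lookup only; the function name is illustrative and the cut-off of 2018 for observed data is an assumption made for this example.

```python
LAST_OBSERVED_YEAR = 2018  # assumed final year of observed data for this sketch


def cohort_rate_sources(birth_year, ages, last_observed_year=LAST_OBSERVED_YEAR):
    """Map each age to the calendar year its rates come from, and whether
    that year is covered by observed data or needs projections."""
    sources = {}
    for age in ages:
        year = birth_year + age
        kind = "observed" if year <= last_observed_year else "projected"
        sources[age] = (year, kind)
    return sources


sources_1960 = cohort_rate_sources(1960, [25, 50, 75])
```

For the 1960 cohort this gives 1985 and 2010 (observed) for ages 25 and 50, and 2035 (projected) for age 75, matching the example above.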

Our calculations use projected cancer incidence (using data up to 2018) calculated by the Cancer Intelligence Team at Cancer Research UK and projected all-cause mortality (using data up to 2020, with adjustment for COVID impact) calculated by Office for National Statistics. Differences from previous analyses are attributable mainly to the slowing pace of improvement in life expectancy, and also to slowing/stabilising increases in cancer incidence.

References

Sasieni P, Shelton J, Ormiston-Smith N, et al. What is the lifetime risk of developing cancer?: The effect of adjusting for multiple primaries. Brit J Cancer 2011;105(3):460-5
Esteve J, Benhamou E, Raymond L. Descriptive Epidemiology (IARC Scientific Publications No.128), Lyon, International Agency for Research on Cancer, 1994:67-68
Ahmad AS, Ormiston-Smith N, Sasieni PD. Trends in the lifetime risk of developing cancer in Great Britain: Comparison of risk for those born in 1930 to 1960. Br J Cancer 2015; doi:10.1038/bjc.2014.606.

Last reviewed: 27 November 2023

Odds ratio calculations

Odds ratios (ORs) are calculated by dividing the odds of developing cancer among people exposed to a particular risk factor by the odds of developing cancer among people not exposed to this risk factor; or, equivalently, by dividing the odds of exposure to a particular risk factor among people with cancer by the odds of exposure to this risk factor among people without cancer.

Calculating odds ratios

[Infographic showing the formula to calculate odds ratios]

For example: if the odds of developing cancer in people exposed to ‘risk factor B’ are around 5 to 2 and the odds of developing cancer in people not exposed to ‘risk factor B’ are less than 1 to 2, then the odds in the group exposed to ‘risk factor B’ are 4.3 times the odds in the group not exposed to ‘risk factor B’.
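An odds ratio can be computed from a two-by-two table of counts, as in the sketch below. The counts are hypothetical and are not the figures behind the example above.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of cancer in the exposed group divided by odds in the unexposed group."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical counts: exposed 40 cases / 20 non-cases, unexposed 10 cases / 20 non-cases
or_example = odds_ratio(40, 20, 10, 20)

# The same value from the cross-product form (a*d) / (b*c)
cross_product = (40 * 20) / (20 * 10)
```

Here the odds are 2 to 1 in the exposed group and 1 to 2 in the unexposed group, giving an OR of 4.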

ORs are less intuitive to understand than RRs; they are similar to RRs in that both RRs and ORs measure associations between cancer and risk factors; however the two should not be confused because ORs do not provide a direct measure of risk.

ORs are usually expressed as a number:

OR = 1: no difference in odds of having cancer between people exposed to the risk factor and people not exposed to it
OR less than 1: odds of having cancer are lower in people exposed to the risk factor compared with people not exposed to it; exposure to the risk factor may decrease the odds of developing cancer;
OR more than 1: odds of having cancer are higher in people exposed to the risk factor compared with people not exposed to it; exposure to the risk factor may increase the odds of developing cancer.

The best-quality studies also take into account (‘adjust’ or ‘control’ for) exposure to other potential risk factors (‘confounders’); for example adjusting for alcohol use when comparing smokers with non-smokers. Failure to adjust for confounders can result in overestimation or underestimation of the effect of the risk factor being studied.

See also

Risk terminology explained

Last reviewed: 10 October 2014

Rate calculations

We use the direct method to calculate rates, as we have age profiles of the populations, and this gives better results than the estimates used by the indirect method (although the indirect method is often used in cancer statistics when age profiles are not known).
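Direct age-standardisation weights each age-specific rate by the share of a standard population in that age band. The sketch below shows the arithmetic; the weights are hypothetical and are not the European standard population.

```python
def age_standardised_rate(age_specific_rates, standard_population):
    """Directly standardised rate: weighted mean of age-specific rates.

    age_specific_rates: rate per 100,000 in each age band.
    standard_population: standard population count in each age band.
    """
    if len(age_specific_rates) != len(standard_population):
        raise ValueError("rates and standard population must cover the same age bands")
    weighted = sum(r * w for r, w in zip(age_specific_rates, standard_population))
    return weighted / sum(standard_population)


# Hypothetical three-band example
band_rates = [10.0, 100.0, 500.0]
band_weights = [50_000, 30_000, 20_000]
asr = age_standardised_rate(band_rates, band_weights)
```

Because the same weights are applied to every population being compared, differences in the resulting rates are not driven by differences in age structure.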

See also

Rates explained

Last reviewed: 31 October 2013

Relative risk calculations

Relative risks (RRs) are calculated by dividing the likelihood of developing cancer for people exposed to a particular risk factor by the likelihood of developing cancer for people not exposed to this risk factor.

Calculating relative risks

[Infographic showing the formula to calculate relative risk]

RRs are usually expressed as a number or percentage:

RR = 1, or RR = 100%: no difference in cancer risk between people exposed to the risk factor and people not exposed to it
RR less than 1, or RR = 0-100%: cancer risk is lower in people exposed to the risk factor compared with people not exposed to it; exposed people are less likely to develop cancer
RR greater than 1, or RR = 100%+: cancer risk is higher in people exposed to the risk factor compared with people not exposed to it; exposed people are more likely to develop cancer
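The relative risk calculation described above can be sketched as follows, using hypothetical counts from two groups of 1,000 people.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of cancer in exposed people divided by risk in unexposed people."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical counts: 30 cases per 1,000 exposed, 10 cases per 1,000 unexposed
rr_example = relative_risk(30, 1000, 10, 1000)
rr_as_percentage = rr_example * 100
```

An RR of about 3 (or 300%) here means exposed people are around three times as likely to develop cancer as unexposed people.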

See also

Risk terminology explained

Last reviewed: 10 October 2014

Rounding calculator

If numbers are very large or the exact value is too much detail for the context, frequencies, fractions and percentages can be rounded to make them simpler, so long as this does not change the meaning of the underlying data and you add words to say that it’s not exact. This means we need to take special care with phrases like “around”, “more than”, “almost”, “nearly” and “less than”.

Choosing which number is simpler to use and which words to add depends on the size of the exact number and what type of simpler number you want to use (number, fraction, 1 in X frequency, or 1 in 10 frequency).

For help with rounding numbers and converting percentages to fractions and frequencies, use the rounding calculator prepared by the CRUK Statistical Information Team.

Download the rounding calculator

Email stats.team@cancer.org.uk if you have any questions.

Last reviewed: 5 February 2021

Survival calculations

We mainly use net survival estimates for our survival data, as net survival estimates the number of people who survive their cancer, taking background mortality into account. It uses life tables and measures the impact on cancer-specific survival of improvements such as early diagnosis and treatment. It excludes deaths from other causes, so it cannot be directly used to calculate the number of people alive after a cancer diagnosis.

See also

Survival terminology explained

Last reviewed: 28 April 2014

Local Cancer Statistics

Go to local cancer statistics - search profiles by area, constituency or health board in the UK.

Go to devolved nations overviews for an overview of Wales, Scotland or Northern Ireland.

Citation

You are welcome to reuse this Cancer Research UK content for your own work.
Credit us as authors by referencing Cancer Research UK as the primary source. Suggested styles are:

Web content: Cancer Research UK, full URL of the page, Accessed [month] [year].
Publications: Cancer Research UK ([year of publication]), Name of publication, Cancer Research UK.
Graphics (when reused unaltered): Credit: Cancer Research UK.
Graphics (when recreated with differences): Based on a graphic created by Cancer Research UK.

When Cancer Research UK material is used for commercial reasons, we encourage a donation to our life-saving research.
Send a cheque payable to Cancer Research UK to: Cancer Research UK, 2 Redman Place, London, E20 1JQ or

Donate online
