Forget the Data, it’s Time to Focus on Information

Photo: pexels.com (Anna Nekrashevich)

Original text: Aapeli Nevala

English translation: Sanna Heikkinen

This blog post was first published on the website of the Cancer Society of Finland on February 10, 2021 at https://www.syopajarjestot.fi/ajankohtaista/blogit/unohdetaan-data-on-aika-keskittya-informaatioon/


The following two things are hard statistical facts about Finland.

Cancer deaths have increased by 30 percent over the past 30 years.

Cancer mortality has also decreased by 25 percent during the same period.

Despite the apparent paradox, Erwin Schrödinger has not become a cancer epidemiologist; the explanation lies in the method of reporting. The absolute number of cancer cases and deaths continues to rise for a simple reason: the age pyramid in Finland is, and will continue to be, increasingly weighted towards older age groups, where cancer is more common.

In contrast, age-standardized cancer mortality is decreasing. Age standardization artificially adjusts the population of Finland in different years to resemble each other in terms of age structure. This allows us to observe a decline in cancer mortality per capita, assuming the age structure of the population remained the same. Factors influencing this include, for example, the reduction of poor-prognosis cancers caused by smoking as smoking decreases, and advancements in cancer treatments.
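The difference between the two figures can be illustrated with a minimal sketch of direct age standardization. All numbers below are invented for illustration and are not Finnish statistics; the three age groups, the `standard_weights`, and the function names are assumptions made only for this example.

```python
# Hypothetical illustration of direct age standardization.
# All figures are invented; they are NOT real Finnish cancer statistics.

def crude_rate(deaths, population):
    """Deaths per 100,000 people, ignoring age structure."""
    return sum(deaths) / sum(population) * 100_000

def age_standardized_rate(deaths, population, standard_weights):
    """Weighted average of age-specific rates, using a fixed standard population."""
    total_weight = sum(standard_weights)
    rate = 0.0
    for d, p, w in zip(deaths, population, standard_weights):
        rate += (d / p) * (w / total_weight)
    return rate * 100_000

# Three assumed age groups: under 50, 50-74, 75 and over.
standard_weights = [60, 30, 10]  # assumed standard population shares

# Earlier year: a younger population.
pop_then = [3_000_000, 1_500_000, 500_000]
deaths_then = [600, 4_500, 5_000]

# Later year: every age-specific rate is lower, but the population is older.
pop_now = [2_800_000, 1_700_000, 1_000_000]
deaths_now = [420, 3_825, 8_000]

asr_then = age_standardized_rate(deaths_then, pop_then, standard_weights)
asr_now = age_standardized_rate(deaths_now, pop_now, standard_weights)

print("Total deaths:", sum(deaths_then), "->", sum(deaths_now))
print("Crude rate per 100,000:",
      round(crude_rate(deaths_then, pop_then), 1), "->",
      round(crude_rate(deaths_now, pop_now), 1))
print("Age-standardized rate per 100,000:",
      round(asr_then, 1), "->", round(asr_now, 1))
```

In this invented example the total number of deaths rises while the age-standardized rate falls, because the rate within every age group has dropped but the oldest group has doubled in size.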

Numbers can be misused

Despite its simplicity, the cancer mortality example hints at the problems encountered when focusing solely on data or numbers. Neither figure is incorrect, and both have their purposes. Absolute numbers are needed, for instance, when estimating future healthcare needs: with the current demographic shift, and everything else remaining the same, deaths will increase. Age-standardized mortality, on the other hand, can reveal changes in the causes or treatments of cancer or in other unknown factors.

Similarly, both figures can be misused if they are employed to answer the wrong question.

Rather than merely looking at data, the crucial aspect of using data is to describe and understand the process that generates it. An essential part of this is understanding what goes unrecorded, and knowing whether the data has been recorded and gathered in a similar fashion over time.

Two things must happen before a cancer appears in cancer statistics: the cancer must exist, and it must be diagnosed. The relative number of cancers may change if either of these processes changes. The observed increase in cancer statistics may partly be explained by the fact that cancers previously went undetected more often.
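A minimal sketch of this two-step process, with invented numbers: the observed count is treated as the number of existing cancers multiplied by the probability that a cancer is diagnosed. The function name and both probabilities are assumptions for illustration only.

```python
# Hypothetical illustration: observed cases depend on both occurrence and detection.
# The counts and detection probabilities are invented, not real registry figures.

def observed_cases(true_cases, detection_probability):
    """Expected number of cancers that end up in the statistics."""
    return true_cases * detection_probability

true_cases = 1_000  # assumed identical in both periods

earlier = observed_cases(true_cases, detection_probability=0.70)
later = observed_cases(true_cases, detection_probability=0.85)

print("Observed earlier:", round(earlier))
print("Observed later:", round(later))
print("Apparent change in percent:", round((later / earlier - 1) * 100, 1))
```

Here the statistics show an increase even though the number of cancers that exist has not changed at all; only the detection step has.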

Drawing conclusions from such processes is challenging and often relies on statistical or mathematical models. Data always sets some boundaries for modelling, but the more undetected data there is, the more flexibility – and thus room for error – the models have. Decision-making is often required to be more firmly based on evidence and data. While this is a good starting point, it creates another problem. It is essential to consider the process of forming evidence.


What is evidence based on data?

If political decision-making is based solely on incontrovertible evidence, we give more weight to measures for which evidence is easier to produce with current research practices.

The effectiveness of drug treatments can be reliably demonstrated in randomized trials, but the effects of lifestyle interventions are harder to demonstrate. For example, there is a wide range of research evidence on the connection between lifestyles and cancer, but organizing a randomized trial on years of continuous consumption of charred sausage is impossible.

In this case, it is useful to consider the potential, cost, and possible harm of a successful intervention. This may sound trivial, but implementing a low-risk, high-potential, and low-side-effect intervention can be worthwhile – even if the evidence is not watertight.

Instead of just data, we could begin talking about information.

The writer is a statistician who thinks that the best part of the job is outlining and imagining the processes that generate data. He works as a development manager of colorectal cancer screening at the Finnish Cancer Registry.
