Scientists use big data to develop an algorithm that can give cancer patients better survival estimates


A SURVIV analysis of breast cancer isoforms developed at UCLA. Blue lines are associated with longer survival times, and magenta lines with shorter survival times. Courtesy: Yi Xing

Survival time is the length of time a cancer patient is expected to live after the initial diagnosis. A medical oncologist usually estimates it from the patient's age and medical history and from the stage and size of the detected cancer. These estimates are sometimes inaccurate, which can lead to wrong decisions about the line of therapy.

What if there were a better way for doctors to predict survival time accurately and create a clear, personalized therapeutic plan?

Scientists at UCLA have used the massive amounts of available patient information, known as biomedical big data, to glean patterns and trends that will let doctors better tailor care to each individual patient, the hallmark of precision medicine.

They have developed an algorithm called “SURVIV”, which can provide a more reliable estimate of survival time. It is based on the presence and quantity of the patient’s gene transcripts (RNA), which are identified by a process called RNA sequencing, or “RNA-seq”.

Unlike current methods, SURVIV takes into consideration the alternative isoforms into which a gene's RNA can be spliced. As the following figure shows, a gene contains a number of exons and introns. During splicing, the introns are cut out and the exons are joined together to create an isoform. The same gene can be spliced in several alternative ways, producing different isoforms, which in turn produce different proteins. According to senior author Yi Xing, a typical human gene produces 7 to 10 isoforms.


The same gene can give different protein products through alternative splicing. These products are called ‘isoforms’. Here alternative splicing produces three protein isoforms. Credit: Wikipedia
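To make the idea of alternative splicing concrete, here is a minimal Python sketch that lists the isoforms a gene could produce if some of its exons may be skipped. The exon names and the choice of which exons are optional are made up for illustration; real genes also use other splicing patterns, and this is not part of the SURVIV method itself.

```python
from itertools import combinations


def enumerate_isoforms(exons, optional):
    """List every exon chain obtained by skipping any subset of the optional exons.

    `exons` is the ordered list of exons in the gene; `optional` names the
    exons that may be left out. Both are hypothetical inputs for this toy.
    """
    isoforms = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            # Keep the exons in their original order, minus the skipped ones.
            isoforms.append(tuple(e for e in exons if e not in skipped))
    return isoforms


# A made-up gene with five exons, two of which can be skipped.
exons = ["E1", "E2", "E3", "E4", "E5"]
optional = ["E2", "E4"]

for isoform in enumerate_isoforms(exons, optional):
    print("-".join(isoform))
```

With two optional exons this toy gene yields four isoforms, from the full chain E1-E2-E3-E4-E5 down to E1-E3-E5.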

Current survival prediction methods don’t treat the alternative isoforms of a gene as different proteins. This reduces their ability to detect subtle changes in gene expression, which could lead to false predictions.
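The point can be illustrated with a small Python sketch. The read counts below are invented for illustration and do not come from the study. The two hypothetical patients have the same total expression for a gene, so a gene-level measure cannot tell them apart, while the isoform proportions differ sharply.

```python
def gene_level(isoform_counts):
    """Total expression of the gene: the sum over all of its isoforms."""
    return sum(isoform_counts.values())


def isoform_fractions(isoform_counts):
    """Share of the gene's expression contributed by each isoform."""
    total = sum(isoform_counts.values())
    return {name: count / total for name, count in isoform_counts.items()}


# Invented read counts for one gene in two hypothetical patients.
patient_a = {"isoform_1": 90, "isoform_2": 10}
patient_b = {"isoform_1": 10, "isoform_2": 90}

# Gene-level view: both patients look identical.
print(gene_level(patient_a), gene_level(patient_b))

# Isoform-level view: the dominant isoform is different.
print(isoform_fractions(patient_a))
print(isoform_fractions(patient_b))
```

If the two isoforms had opposite effects on the disease, only the isoform-level view would carry that information.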

SURVIV is the first statistical method that takes into account isoform information from RNA-seq data, according to Xing. The analysis was conducted on 2,684 cancer tissue samples of various types, including breast, brain, lung, ovary and kidney cancer. The samples are part of the National Institutes of Health’s Cancer Genome Atlas. Comparisons with conventional methods showed that the isoform-based predictions were consistently better. In breast cancer alone, SURVIV detected around 200 new isoforms that are associated with varying patient survival rates.

“In cancer, sometimes a single gene produces two isoforms, one of which promotes metastasis and one of which represses metastasis,” Xing said, highlighting that understanding this difference and incorporating it into a model can have a big impact on cancer treatment.

Taken together, the results indicate that SURVIV is a new computational method that can predict cancer survival rates more accurately, and that it is particularly useful in cases where a patient has many types of cancer.

The research was funded by the National Institutes of Health (grants R01GM088342 and R01GM105431) and the National Science Foundation (grant DMS1310391). The original article was published last week in the journal Nature Communications.

Source: UCLA
