みーのぺーじ

Notes on the PC and software projects みー works on as a hobby: Python, JavaScript, Processing, Unity, and more.

Trying out an analysis with lifelines

I plotted Kaplan-Meier curves in Jupyter.

The data comes from the German Breast Cancer Study Group 2 (GBSG2) dataset.

datasets — lifelines 0.24.1 documentation

Fields

GBSG2 function | R Documentation

  • horTh : hormonal therapy, a factor at two levels no and yes.

  • age : of the patients in years.

  • menostat: menopausal status, a factor at two levels pre (premenopausal) and post (postmenopausal).

  • tsize : tumor size (in mm).

  • tgrade : tumor grade, an ordered factor at levels I < II < III.

  • pnodes : number of positive nodes.

  • progrec : progesterone receptor (in fmol).

  • estrec : estrogen receptor (in fmol).

  • time : recurrence free survival time (in days).

  • cens : censoring indicator (0- censored, 1- event).

All patients

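from lifelines.datasets import load_gbsg2
from lifelines import KaplanMeierFitter

# Load the GBSG2 dataset as a pandas DataFrame
df = load_gbsg2()

# Fit a single Kaplan-Meier curve to all patients
# (time: days to recurrence or censoring, cens: 1 = event, 0 = censored)
kmf = KaplanMeierFitter()
kmf.fit(df['time'], event_observed=df['cens'])
kmf.plot()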

[Figure: Kaplan-Meier curve for all patients]
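To make clear what the fitted curve represents, here is a rough pure-Python sketch of the product-limit estimate, S(t) = Π (1 - d_i / n_i) over event times t_i ≤ t, where d_i is the number of events at t_i and n_i the number still at risk. The function name `kaplan_meier` and the toy data are made up for illustration. This is not how lifelines is implemented, and it omits confidence intervals.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    # Number of observed events at each time
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    # Everyone (event or censored) leaves the risk set at their own time
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Subjects censored at t still count as at risk at t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Made-up toy data (not GBSG2): 1 = event, 0 = censored
durations = [2, 3, 3, 5, 8, 8, 9]
events = [1, 1, 0, 1, 1, 0, 0]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 3))
```

The curve only drops at times where an event occurs. Censored patients do not cause a drop, but they shrink the risk set for later steps.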

With and without hormonal therapy

from lifelines.datasets import load_gbsg2
from lifelines import KaplanMeierFitter

df = load_gbsg2()
# Fit and plot one Kaplan-Meier curve per hormonal-therapy group (no / yes)
for name, group in df.groupby('horTh'):
    kmf = KaplanMeierFitter()
    kmf.fit(group['time'], event_observed=group['cens'], label=name)
    kmf.plot()

[Figure: Kaplan-Meier curves by hormonal therapy]

Menopausal status

from lifelines.datasets import load_gbsg2
from lifelines import KaplanMeierFitter

df = load_gbsg2()
# Fit and plot one Kaplan-Meier curve per menopausal-status group
for name, group in df.groupby('menostat'):
    kmf = KaplanMeierFitter()
    kmf.fit(group['time'], event_observed=group['cens'], label=name)
    kmf.plot()

[Figure: Kaplan-Meier curves by menopausal status]

Tumor grade

from lifelines.datasets import load_gbsg2
from lifelines import KaplanMeierFitter

df = load_gbsg2()
# Fit and plot one Kaplan-Meier curve per tumor grade (I, II, III)
for name, group in df.groupby('tgrade'):
    kmf = KaplanMeierFitter()
    kmf.fit(group['time'], event_observed=group['cens'], label=name)
    kmf.plot()

[Figure: Kaplan-Meier curves by tumor grade]

Judging by eye, the curves seem to stratify the patients reasonably well.