A clinical decision support console for Systemic Lupus Erythematosus.

A three-step machine-learning pipeline that turns blood RNA into a per-patient SLE readout.

Biomed09 · Team
Manna Berry  ·  Kiwi Lin  ·  Udit Samant  ·  Hadi Shafat  ·  Minh Hieu Tran  ·  Jillian Zhao

DATA3888 · 2026
The University of Sydney

Disease context

Lupus is unpredictable and impacts millions.

\( \approx 5\mathrm{M} \)

People living with SLE globally, >90% women.

\(9\)

Years (avg.) from first symptom to confirmed diagnosis.

\(40\%\)

Non-response to first-line therapy at 12 months.

Where workflows fall short

Clinicians fly blind between visits.

Manual: Slow, expensive diagnostic workflows.
Inconsistent: Variable accuracy across clinics and clinicians.
Reactive: Therapy adjustments lag behind flares rather than anticipating them.
Unscalable: Molecular tests are rarely deployed at the point of care.

Our approach

A three-step machine-learning pipeline.

From raw expression data to a clinician-ready interface.

STEP 01

Data & features

  • Inputs: GEO cohorts
  • Harmonise: labels
  • Engineer: features
  • Protect: safe splits
STEP 02

ML prediction

  • Model 1: diagnosis
  • Model 2: flare risk
  • Model 3: response
  • Select: RF winner
STEP 03

Interpret & deploy

  • Evidence: metrics
  • Package: models
  • Serve: API
  • Console: handoff

Method · how each model stage is done

Three model stages, one Random Forest framework.

Model 1

Diagnosis

"Does this patient have lupus?"

Data: GSE72509 · 117 whole-blood RNA-seq samples (99 SLE, 18 controls).
Preprocess: log(RPKM + 1) · low-expression & variance filtering · stratified 5-fold CV.
Features: RF Gini importance · Boruta · biological curation against the IFN module.
Model: Random Forest (500 trees) vs. limma signature and LASSO; RF wins on macro-F1.
Result: AUROC 0.972 · macro-F1 0.95 · bal-acc 0.94.
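
The preprocessing line above can be illustrated with a minimal Python sketch of the log(RPKM + 1) transform and a stratified 5-fold assignment. It is not the team's R code: the function names are hypothetical, the natural log is an assumption (the slide does not state a base), and the round-robin dealing is just one simple way to stratify.

```python
import math
import random
from collections import defaultdict


def log_rpkm(rpkm):
    """log(RPKM + 1); natural log is assumed because the slide gives no base."""
    return math.log1p(rpkm)


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            fold_of[idx] = position % k
    return fold_of


# Class sizes taken from the slide: 99 SLE and 18 controls.
labels = ["SLE"] * 99 + ["control"] * 18
fold_of = stratified_folds(labels)
controls_per_fold = [
    sum(1 for i, f in enumerate(fold_of) if f == fold and labels[i] == "control")
    for fold in range(5)
]
print(controls_per_fold)  # [4, 4, 4, 3, 3]
```

With only 18 controls, stratifying guarantees every fold still contains 3 or 4 of them, which an unstratified shuffle would not.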
Model 2

Flare risk

"Will they be active at the next visit?"

Data: GSE65391 (train) → GSE49454 (external) · longitudinal cohorts, SLEDAI ≥ 6.
Preprocess: t → t+1 labelling · leakage-safe patient-level split · gene + clinical features.
Features: Training-only probe selection · RF wrapper · clinical / treatment / temporal blocks.
Model: RF, elastic net, GBM, weighted ensemble; gene + clinical RF wins externally.
Result: AUC 0.823 · bal-acc 0.73 · accuracy 0.82 (external GSE49454).
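
As a rough illustration of the t → t+1 labelling and the patient-level split, the Python sketch below pairs each visit with the activity status at the following visit and then splits by patient rather than by row. The visit tuples, function names and test fraction are hypothetical, and reading "SLEDAI ≥ 6" as the definition of an active visit is an assumption.

```python
import random

ACTIVE_THRESHOLD = 6  # assumption: SLEDAI >= 6 marks an active visit


def next_visit_pairs(visits):
    """visits: (patient_id, visit_number, sledai) tuples.

    Returns (patient_id, visit_number, label) rows, where label is 1 if the
    patient's NEXT visit is active. A patient's last visit has no label.
    """
    by_patient = {}
    for patient_id, visit_number, sledai in visits:
        by_patient.setdefault(patient_id, []).append((visit_number, sledai))
    pairs = []
    for patient_id, rows in by_patient.items():
        rows.sort()
        for (visit_t, _), (_, sledai_next) in zip(rows, rows[1:]):
            pairs.append((patient_id, visit_t, int(sledai_next >= ACTIVE_THRESHOLD)))
    return pairs


def patient_level_split(pairs, test_fraction=0.25, seed=0):
    """Split by patient so no patient contributes rows to both sides."""
    patients = sorted({patient_id for patient_id, _, _ in pairs})
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    train = [row for row in pairs if row[0] not in test_patients]
    test = [row for row in pairs if row[0] in test_patients]
    return train, test


# Invented toy visits, not data from GSE65391.
visits = [
    ("P1", 1, 2), ("P1", 2, 8), ("P1", 3, 4),
    ("P2", 1, 10), ("P2", 2, 6),
    ("P3", 1, 0), ("P3", 2, 0), ("P3", 3, 12),
    ("P4", 1, 5),
]
pairs = next_visit_pairs(visits)
train, test = patient_level_split(pairs)
print(pairs)
```

Splitting rows at random would let visits from the same patient land in both training and test data, which is the leakage the patient-level split avoids.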
Model 3

Treatment response

"Will they respond to first-line therapy?"

Data: GSE224705 · whole-blood microarray, lupus nephritis · SRI-4 endpoint.
Preprocess: Probe filtering · collapse probes to gene symbols · balanced bootstrap on minority class.
Features: RF Gini + Boruta + curation → 50-gene panel held out for evaluation.
Model: limma, RF, LASSO, elastic net, linear SVM; Random Forest tops macro-F1.
Result: AUROC 0.865 · macro-F1 0.79 · bal-acc 0.79.
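
Two of the preprocessing steps above, collapsing probes to gene symbols and the balanced bootstrap, can be sketched in Python as follows. This is an assumption-laden illustration: the slide does not say how probes are aggregated (the mean is used here) or how the bootstrap is drawn (the minority class is resampled with replacement up to the majority size). The probe IDs, values and function names are invented, and the gene symbols are only examples.

```python
import random
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Average all probes mapping to the same gene; drop unannotated probes."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


def balanced_bootstrap(samples, labels, seed=0):
    """Resample smaller classes with replacement until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


collapsed = collapse_probes(
    {"probe_1": 8.0, "probe_2": 10.0, "probe_3": 5.0, "probe_4": 7.0},
    {"probe_1": "IFI27", "probe_2": "IFI27", "probe_3": "MX1"},
)
print(collapsed)  # {'IFI27': 9.0, 'MX1': 5.0}

samples = ["s1", "s2", "s3", "s4", "s5", "s6"]
labels = ["responder", "responder", "non", "non", "non", "non"]
boot_samples, boot_labels = balanced_bootstrap(samples, labels)
print(boot_labels.count("responder"), boot_labels.count("non"))  # 4 4
```

To stay leakage-safe, resampling of this kind would be applied inside each training fold only, never to the held-out fold.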

Evaluation metrics

How we picked the winner.

One selection rule across every stage — accuracy plus the metrics that matter in a clinic.

01

Accuracy

Share of patients classified correctly on held-out cohorts.

Honest top-line, but hides class imbalance.

02

\(F_1\)-score

Harmonic mean of precision and recall on the minority class.

Penalises missed flares - the costly clinical error.

03

Generalisation

Patient-level CV plus a one-time external cohort (GSE49454).

Confirms the panel transfers across labs and platforms.

04

Inference speed

End-to-end latency from upload to risk band on a single patient.

Sub-second matters for a real bedside tool.

Selection rule: Winner = best generalisation at clinic-grade speed, breaking ties by \(F_1\).
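
To make the point about accuracy and class imbalance concrete, the Python sketch below scores a trivial classifier that labels every sample as SLE on a cohort with the diagnosis class sizes (99 SLE, 18 controls). The helper names are hypothetical; the formulas are the standard definitions of accuracy, balanced accuracy (mean per-class recall) and macro-F1 (mean per-class F1).

```python
def class_counts(y_true, y_pred, positive):
    """True positives, false positives and false negatives for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, fn


def f1(y_true, y_pred, positive):
    """F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    tp, fp, fn = class_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def recall(y_true, y_pred, positive):
    tp, _, fn = class_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def summarise(y_true, y_pred):
    classes = sorted(set(y_true))
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "balanced_accuracy": sum(recall(y_true, y_pred, c) for c in classes) / len(classes),
        "macro_f1": sum(f1(y_true, y_pred, c) for c in classes) / len(classes),
    }


y_true = ["SLE"] * 99 + ["control"] * 18
y_pred = ["SLE"] * 117  # always predict the majority class
scores = summarise(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
# {'accuracy': 0.846, 'balanced_accuracy': 0.5, 'macro_f1': 0.458}
```

The do-nothing classifier reaches 0.846 accuracy but only 0.5 balanced accuracy and 0.458 macro-F1, which is why accuracy alone is not used to pick a winner.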

Results · external validation

Generalises to data it has never seen.

01 · Diagnosis · Random Forest
AUROC 0.972
Bal-Acc 0.939
Macro F1 0.950
Accuracy 0.974

0.97 AUROC

Random Forest beat limma signature and LASSO on macro-F1. 5-fold stratified CV · GSE72509 · 117 samples.

02 · Flare risk · gene + clinical RF
AUC 0.823
Accuracy 0.821
Bal-Acc 0.734
Sensitivity 0.903

0.82 AUC

External hold-out — never seen during training. Trained on GSE65391, tested on GSE49454.

03 · Treatment · Random Forest
AUROC 0.865
Accuracy 0.834
Macro F1 0.791
Bal-Acc 0.789

0.87 AUROC

Random Forest top on macro-F1 across five candidates. 5-fold CV with balanced bootstrap · GSE224705 · SRI-4 endpoint.

Key finding: Random Forest led the baselines on every stage's primary metric, held up on the external flare cohort (GSE49454), and scores a patient in under a second.

Final architecture

Random Forest, end-to-end.

01 · INPUT

Patient CSV

Whole-blood expression upload, validated against the gene panel.

02 · PREPROCESS

Clean & normalise

z-score against the reference distribution, missing-value imputation.

03 · FEATURES

Immune panel

~50 transcripts selected on the training cohort only.

04 · PREDICT

Random Forest

Three calibrated outputs - diagnosis, flare risk, treatment response.

05 · OUTPUT

Risk + drivers

Banded risk score with the top features behind every prediction.

No retraining at inference - the same trained model serves every patient request.
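
The input and preprocessing stages above could look roughly like the Python sketch below: check an uploaded CSV against the panel, impute missing genes, and z-score against training-cohort statistics. Everything specific here is an assumption for illustration: the two-column CSV layout, the three-gene panel, the reference means and standard deviations, the mean-imputation rule and the missing-gene limit are all invented, and the deployed service runs in R.

```python
import csv
import io

# Hypothetical reference statistics (mean, sd) per panel gene; values are invented.
PANEL_REFERENCE = {
    "IFI27": (5.0, 2.0),
    "MX1": (6.0, 1.5),
    "RSAD2": (4.0, 1.0),
}


def preprocess(csv_text, reference=PANEL_REFERENCE, max_missing=1):
    """Validate an upload against the panel, impute, and z-score.

    Returns (z_scores, missing_genes). Genes outside the panel are ignored.
    """
    values = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        gene = row.get("gene")
        raw = (row.get("expression") or "").strip()
        if gene in reference and raw != "":
            values[gene] = float(raw)
    missing = [gene for gene in reference if gene not in values]
    if len(missing) > max_missing:
        raise ValueError(f"too many panel genes missing: {missing}")
    z_scores = {}
    for gene, (mu, sd) in reference.items():
        # Mean imputation: a missing gene takes the reference mean, so z = 0.
        x = values.get(gene, mu)
        z_scores[gene] = (x - mu) / sd
    return z_scores, missing


upload = "gene,expression\nIFI27,9.0\nMX1,6.0\nEXTRA,1.0\n"
z_scores, missing = preprocess(upload)
print(z_scores)  # {'IFI27': 2.0, 'MX1': 0.0, 'RSAD2': 0.0}
print(missing)   # ['RSAD2']
```

Because the reference statistics are fixed at training time, the same transform applies to every request and nothing is refit at inference.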

Live demonstration

From a blood sample to a recommendation in one click.

01 · Upload a patient expression CSV
02 · Auto-preprocess and score against the panel
03 · See diagnosis, flare and response risk with confidence
04 · Explore drivers & export a clinical summary

In summary.

The problem

Lupus care is reactive - molecular signal exists in blood but rarely reaches the clinic.

Our contribution

A three-step Random-Forest pipeline that diagnoses SLE, predicts flares, and forecasts treatment response.

Why it matters

External validation holds up, inference is sub-second, and the console is usable by a non-ML clinician.

Where we go next.

Larger, prospective data

Validate on a contemporary, ancestry-diverse cohort to confirm calibration in the real world.

Multimodal learning

Layer autoantibody, cytokine and methylation features alongside the transcript module.

Cloud + explainability

Host as a clinician-facing web service with per-prediction SHAP-style attribution.

Thank you.

Candidate models

Top performers across all three stages

[Diagram: sequential trees T₁ to T₄, residual decreasing each round]
Boosting

XGBoost

Sequential trees correcting residuals - strong, but harder to tune.

runner-up
[Diagram label: p = 0.5]
Linear

Logistic Regression

Sigmoid fit on the features — interpretable, but lower AUC.

baseline
Kernel

SVM

Maximum-margin separator — solid on diagnosis, less stable on flares.

control
[Diagram: tree votes 1, 1, 0; majority → 1]
Selected

Random Forest

Many trees vote — robust to noise, interpretable, fast.

used in all stages
[Diagram: Samples A, B and C each train a tree; votes 1, 1, 0; majority vote, 2 of 3 = prediction 1. Random feature subsets · bagged trees · stable final call]

Appendix · 01 / 06 · Selected model · Random Forest

Many trees, one robust answer.

  • How it works

    Each tree trains on a bootstrapped sample of patients and a random subset of features - predictions are aggregated by majority vote.

  • Why it won

    Best balance of accuracy and stability across our three outcomes, with sub-second inference and built-in feature importance.

  • Built-in interpretability

    Per-prediction feature importances become the "drivers" surfaced in the clinician console.

  • Robust by design

    Averaging across trees damps noise from small cohorts - important for a domain with hundreds of samples, not millions.
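
The bootstrap, random-feature and majority-vote ideas above can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the randomForest package: each "tree" is a one-split stump on a single randomly chosen feature, whereas a real random forest grows full trees and samples candidate features at every split. The toy data and all function names are invented.

```python
import random
from collections import Counter


def majority_vote(votes):
    """Return the most common vote (use an odd number of binary voters)."""
    return Counter(votes).most_common(1)[0][0]


def fit_stump(rows, labels, feature):
    """One-split tree: threshold halfway between the two class means."""
    pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
    neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
    if not pos or not neg:  # the bootstrap drew a single class
        constant = 1 if pos else 0
        return lambda row: constant
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    high_is_positive = mean_pos >= mean_neg
    return lambda row: int((row[feature] > threshold) == high_is_positive)


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    trees = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # bootstrap of patients
        feature = rng.randrange(n_features)            # random feature choice
        trees.append(
            fit_stump([rows[i] for i in sample], [labels[i] for i in sample], feature)
        )
    return trees


def predict(trees, row):
    return majority_vote([tree(row) for tree in trees])


print(majority_vote([1, 1, 0]))  # 1, i.e. 2 of 3 votes

# Invented toy data: class 1 has high values on both features.
rows = [[8, 7], [9, 8], [7, 9], [1, 2], [2, 1], [0, 2]]
labels = [1, 1, 1, 0, 0, 0]
forest = fit_forest(rows, labels)
print(predict(forest, [8, 8]), predict(forest, [1, 1]))
```

An individual stump trained on an unlucky bootstrap can be wrong, but the vote across many resampled trees is what makes the final call stable.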

Appendix · 02 / 06 · Deployment

From model to clinician in one click.

Frontend

Risk console

Static HTML + JS dashboard - upload, results, drivers, simulator, hand-off note.

HTML · CSS · vanilla JS
Backend

Plumber API

R server that loads the trained model and returns predictions plus per-feature drivers.

R · Plumber · randomForest
Workflow

Real-time loop

Clinician uploads expression data and inspects the patient-level risk band on the spot.

Sub-second round-trip
Upload Preview Predict Explore drivers Export summary
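
As a hedged illustration of what "predictions plus per-feature drivers" might look like on the wire, the Python sketch below assembles a JSON payload with a banded risk per outcome and the top-weighted drivers. It does not describe the actual Plumber API: the field names, band cut-offs, driver weights and function names are all hypothetical.

```python
import json

# Hypothetical cut-offs: probability <= upper bound falls in that band.
RISK_BANDS = [(0.33, "low"), (0.66, "moderate"), (1.0, "high")]


def risk_band(probability):
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    for upper, name in RISK_BANDS:
        if probability <= upper:
            return name


def build_response(probabilities, drivers, top_k=3):
    """Bundle per-outcome risk bands with the top_k drivers by absolute weight."""
    top = sorted(drivers.items(), key=lambda item: abs(item[1]), reverse=True)[:top_k]
    return {
        "predictions": {
            outcome: {"probability": round(p, 3), "band": risk_band(p)}
            for outcome, p in probabilities.items()
        },
        "drivers": [{"feature": name, "weight": weight} for name, weight in top],
    }


response = build_response(
    {"diagnosis": 0.91, "flare": 0.42, "response": 0.20},
    {"IFI27": 0.31, "MX1": 0.12, "RSAD2": 0.25, "age": 0.05},
)
print(json.dumps(response, indent=2))
```

Keeping the payload this small is one way a static HTML + JS console could render bands and drivers without any client-side modelling code.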

Appendix · 03 / 06

Dataset sources.

  • Hung, T., Pratt, G. A., Sundararaman, B., Townsend, M. J., Chaivorapol, C., Bhangale, T., Graham, R. R., Ortmann, W., Bhangale, T. R., Behrens, T. W., Yeo, G. W., & Chaussabel, D. (2015). The Ro60 autoantigen binds endogenous retroelements and regulates inflammatory gene expression in systemic lupus erythematosus [Data set; GSE72509]. NCBI Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE72509
  • Banchereau, R., Hong, S., Cantarel, B., Baldwin, N., Baisch, J., Edens, M., Cepika, A.-M., Acs, P., Turner, J., Anguiano, E., Vinod, P., Kahn, S., Obermoser, G., Blankenship, D., Wakeland, E., Nassi, L., Gotte, A., Punaro, M., Liu, Y.-J., … Pascual, V. (2016). Personalized immunomonitoring uncovers molecular networks that stratify lupus patients. Cell, 165(3), 551–565. https://doi.org/10.1016/j.cell.2016.03.008
  • Chiche, L., Jourde-Chiche, N., Whalen, E., Presnell, S., Gersuk, V., Dang, K., Anguiano, E., Quinn, C., Burtey, S., Berland, Y., Kaplanski, G., Harlé, J.-R., Pascual, V., & Chaussabel, D. (2014). Modular transcriptional repertoire analyses of adults with systemic lupus erythematosus reveal distinct type I and type II interferon signatures. Arthritis & Rheumatology, 66(6), 1583–1595. https://doi.org/10.1002/art.38628
  • NCBI Gene Expression Omnibus. (2023). Whole-blood microarray expression in lupus nephritis: Treatment response by SRI-4 [Data set; GSE224705]. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE224705

Appendix · 04 / 06

Libraries & frameworks.

  • R Core Team. (2024). R: A language and environment for statistical computing (Version 4.x) [Computer software]. R Foundation for Statistical Computing. https://www.R-project.org/
  • Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22.
  • Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). Association for Computing Machinery. https://doi.org/10.1145/2939672.2939785
  • Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–22. https://doi.org/10.18637/jss.v033.i01
  • Kursa, M. B., & Rudnicki, W. R. (2010). Feature selection with the Boruta package. Journal of Statistical Software, 36(11), 1–13. https://doi.org/10.18637/jss.v036.i11
  • Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., & Smyth, G. K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research, 43(7), e47. https://doi.org/10.1093/nar/gkv007
  • Siriseriwan, W. (2019). smotefamily: A collection of oversampling techniques for class imbalance problem based on SMOTE (Version 1.3.1) [R package]. https://CRAN.R-project.org/package=smotefamily
  • Schloerke, B., & Allen, J. (2024). plumber: An API generator for R [R package]. https://www.rplumber.io/

Appendix · 05 / 06

Research papers & models.

Methods, prior art, and clinical scoring referenced in the pipeline.

  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
  • Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). Association for Computing Machinery. https://doi.org/10.1145/2939672.2939785
  • Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297. https://doi.org/10.1007/BF00994018
  • Kursa, M. B., & Rudnicki, W. R. (2010). Feature selection with the Boruta package. Journal of Statistical Software, 36(11), 1–13. https://doi.org/10.18637/jss.v036.i11
  • Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357. https://doi.org/10.1613/jair.953
  • Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (Vol. 30, pp. 4765–4774). Curran Associates.
  • Gladman, D. D., Ibañez, D., & Urowitz, M. B. (2002). Systemic lupus erythematosus disease activity index 2000. The Journal of Rheumatology, 29(2), 288–291.
  • Furie, R., Petri, M. A., Wallace, D. J., Ginzler, E. M., Merrill, J. T., Stohl, W., Chatham, W. W., Strand, V., Weinstein, A., & Chevrier, M. (2009). Novel evidence-based systemic lupus erythematosus responder index. Arthritis & Rheumatism, 61(9), 1143–1151. https://doi.org/10.1002/art.24698

Appendix · 06 / 06

Team & acknowledgements.

Team members · Biomed09
Manna Berry · email pending · The University of Sydney, NSW 2006
Kiwi Lin · llin0935@uni.sydney.edu.au · School of Mathematics and Statistics F07, The University of Sydney, NSW 2006 Australia
Udit Samant · email pending · School of Computer Science J12, The University of Sydney, NSW 2006 Australia
Hadi Shafat · hsha0153@uni.sydney.edu.au · School of Computer Science J12, The University of Sydney, NSW 2006 Australia
Minh Hieu Tran · email pending · School of Computer Science J12, The University of Sydney, NSW 2006 Australia
Jillian Zhao · yzha0369@uni.sydney.edu.au · School of Computer Science J12, The University of Sydney, NSW 2006 Australia
Supervisors

Andy Tran · Elyna Lin

Thank you for the weekly guidance, thoughtful feedback, and steady support throughout the project.

Also thanks to
  • Original data contributors and study participants behind the public GEO cohorts.
  • Open-source maintainers across R, Bioconductor, and the modelling libraries used here.
  • The DATA3888 teaching team for project structure, feedback, and course support.