Synthetic Breast Cancer Data

The data was generated with the Synthetic Data Generator, which produces process-based breast cancer treatment data that follows the distribution observed in a real population of breast cancer patients. The collection comprises 18 data sets: nine for relational databases and nine for RDF-based knowledge graphs. For each data format, the data sets come in three sizes:

  • Small: 1,000 patients
  • Medium: 10,000 patients
  • Large: 100,000 patients

There are three data sets of each size, which differ in the mutation probability parameter passed to the data generator. The lower this value, the more closely the data follows the treatment guideline for breast cancer patients with an amplified HER2 gene.
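
The structure of the collection can be sketched as a simple enumeration: two formats, three sizes, and three mutation probability settings give the 18 data sets. This is a minimal illustration only. The labels `mutation_a`, `mutation_b`, and `mutation_c` are hypothetical placeholders, since the actual probability values are not stated here, and the function and field names are assumptions rather than anything taken from the generator.

```python
from itertools import product

# Facts from the description: two formats, three sizes, three variants per size.
FORMATS = ["relational database", "RDF knowledge graph"]
SIZES = {"small": 1_000, "medium": 10_000, "large": 100_000}

# Hypothetical placeholder labels: the real mutation probability
# values are not given in the dataset description.
MUTATION_LEVELS = ["mutation_a", "mutation_b", "mutation_c"]


def enumerate_data_sets():
    """Return one record per (format, size, mutation level) combination."""
    return [
        {"format": fmt, "size": size, "patients": SIZES[size], "mutation": level}
        for fmt, size, level in product(FORMATS, SIZES, MUTATION_LEVELS)
    ]


data_sets = enumerate_data_sets()
print(len(data_sets))  # 2 formats * 3 sizes * 3 mutation levels = 18
```

Filtering the list by `format` yields the nine data sets for one data format.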

Cite this as

Philipp D. Rohde, Antonio Jesus Diaz-Honrubia, Emetis Niazmand, Maria-Esther Vidal (2024). Dataset: Synthetic Breast Cancer Data. https://doi.org/10.57702/4qixxfy9

DOI retrieved: September 12, 2024

Additional Info

Created: September 12, 2024
Last update: September 16, 2024
License: cc-by-sa (Creative Commons Attribution Share-Alike)
Source: https://github.com/SDM-TIB/Synthetic-Data-Generator
Defined In: https://doi.org/10.1016/j.inffus.2024.102557
Link to ORKG: http://orkg.org/orkg/resource/R705607
Author: Philipp D. Rohde
More Authors: Antonio Jesus Diaz-Honrubia, Emetis Niazmand, Maria-Esther Vidal