VISE-KG

doi:10.57702/w3ghyku2

VISE: Validated and Invalidated Symbolic Explanations for Knowledge Graph Integrity

This collection includes all the data necessary to reproduce the results from the experimental evaluation of VISE at EXPLIMED @ ECAI'24. The data is an anonymized synthetic lung cancer benchmark comprising clinical data extracted from heterogeneous sources, such as publications, clinical trials, and clinical records, representing patients diagnosed with lung cancer. We evaluate the VISE approach on three anonymized lung cancer KGs: LC-KG1, LC-KG2, and LC-KG3.

The collection comprises nine data sets of three different sizes:

LC Knowledge Graph 1 (LC-KG1) models 29 lung cancer patients
LC Knowledge Graph 2 (LC-KG2) models 203 lung cancer patients
LC Knowledge Graph 3 (LC-KG3) models 319 lung cancer patients

Each KG is available in three variants, each with its own characteristics.

"Original KG": Comprises anonymized lung cancer patients with different medical characteristics.
"Enriched KG": Obtained by applying KG completion, an inductive learning technique based on self-supervised symbolic learning, to the original KG.
"Transformed KG": A transformation of the KG based on SHACL shapes evaluated over the enriched KGs. This procedure is used to determine the validity of the data.

VISE is also evaluated on KGs comprising 1242 lung cancer patients (LungCancer-OriginalKG, LungCancer-EnrichedKG, and LungCancer-TransformedKG).

Our experimental results demonstrate the effectiveness of this hybrid strategy, which combines the strengths of symbolic, numerical, and constraint validation paradigms.
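
As a rough illustration of the validation idea behind the "Transformed KG", the sketch below splits entities into valid and invalid sets using a single minimum-count constraint, then records the outcome as extra triples. It is a toy stand-in for SHACL validation, not the VISE pipeline. The triples, the `hasStage` predicate, the `validationStatus` predicate, and both function names are hypothetical and are not taken from the data sets.

```python
# Illustrative only: a toy stand-in for SHACL validation, not the VISE pipeline.
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); all names are invented.
TRIPLES = [
    ("patient1", "rdf:type", "Patient"),
    ("patient1", "hasStage", "IIIA"),
    ("patient2", "rdf:type", "Patient"),
]


def validate_min_count(triples, target_class, path, min_count=1):
    """Split instances of target_class by whether they have at least
    min_count values for the predicate `path`."""
    counts = defaultdict(int)
    targets = []
    for s, p, o in triples:
        if p == "rdf:type" and o == target_class:
            targets.append(s)
        elif p == path:
            counts[s] += 1
    valid = [s for s in targets if counts[s] >= min_count]
    invalid = [s for s in targets if counts[s] < min_count]
    return valid, invalid


def transform(triples, valid, invalid):
    """Return the input triples plus one hypothetical status triple per target."""
    status = [(s, "validationStatus", "valid") for s in valid]
    status += [(s, "validationStatus", "invalid") for s in invalid]
    return list(triples) + status


valid, invalid = validate_min_count(TRIPLES, "Patient", "hasStage")
transformed = transform(TRIPLES, valid, invalid)
```

With real data, the constraint would be expressed as SHACL shapes and checked by a SHACL engine; the hand-written check here only shows the shape of the result, where every target entity ends up marked as either validated or invalidated.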

BibTeX:

@dataset{Disha_Purohit_and_Yashrajsinh_Chudasama_and_Maria_Torrente_and_Maria-Esther_Vidal_2024,
    abstract = {VISE: Validated and Invalidated Symbolic Explanations for Knowledge Graph Integrity

This collection includes all the data necessary to reproduce the results from the experimental evaluation of VISE at EXPLIMED @ ECAI'24.
The data is an anonymized synthetic lung cancer benchmark comprising clinical data extracted from heterogeneous sources, such as publications, clinical trials, and clinical records, representing patients diagnosed with lung cancer. We evaluate the VISE approach on three anonymized lung cancer KGs: LC-KG1, LC-KG2, and LC-KG3.

The collection comprises nine data sets of three different sizes:

- LC Knowledge Graph 1 (LC-KG1) models 29 lung cancer patients
- LC Knowledge Graph 2 (LC-KG2) models 203 lung cancer patients
- LC Knowledge Graph 3 (LC-KG3) models 319 lung cancer patients

Each KG is available in three variants, each with its own characteristics.

- "Original KG": Comprises anonymized lung cancer patients with different medical characteristics.
- "Enriched KG": Obtained by applying KG completion, an inductive learning technique based on self-supervised symbolic learning, to the original KG.
- "Transformed KG": A transformation of the KG based on SHACL shapes evaluated over the enriched KGs. This procedure is used to determine the validity of the data.

VISE is also evaluated on KGs comprising 1242 lung cancer patients (LungCancer-OriginalKG, LungCancer-EnrichedKG, and LungCancer-TransformedKG).

Our experimental results demonstrate the effectiveness of this hybrid strategy, which combines the strengths of symbolic, numerical, and constraint validation paradigms.},
    author = {Disha Purohit and Yashrajsinh Chudasama and Maria Torrente and Maria-Esther Vidal},
    doi = {10.57702/w3ghyku2},
    institution = {TIB},
    keyword = {Health-care, Knowledge Graph, Symbolic Learning},
    month = {sep},
    publisher = {TIB},
    title = {VISE-KG},
    url = {https://ldm.kisski.de/dataset/vise-kg},
    year = {2024}
}