Description:
This dataset presents the results obtained with the Attention-based Topics (ABT) method, which aims to identify the themes covered by a collection of sentences. In each run, the method produces a set of topics, and each topic is associated with its constituent sentences and its most representative words. ABT can be parameterized to use different language models as the basis for producing the topics. Accordingly, this output dataset contains the results obtained with the following language models: Envoy, BERT, BioBERT, BART and all-mpnet. Across the experiments, the number of topics produced ranged from 1 to 200.
In this case study, the input was a collection of 10,538 clinical case reports extracted from the CliCR corpus, available at http://github.com/clips/clicr. A clinical case is a detailed assessment focused on patients and written for various purposes, such as epidemiology, clinical studies, or rare diseases.
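
The structure described above (one run per language model, each run holding topics, each topic holding its sentences and representative words) could be modeled in memory roughly as follows. This is only a minimal sketch: the class names, field names, and the toy values are assumptions made for illustration, and they do not reflect the actual file layout of this dataset or any ABT API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """One topic of a run (hypothetical layout, not the dataset's real schema)."""
    topic_id: int
    sentences: List[str] = field(default_factory=list)   # constituent sentences
    top_words: List[str] = field(default_factory=list)   # most representative words


@dataclass
class Run:
    """All topics produced in one run with a given language model."""
    model_name: str
    topics: List[Topic] = field(default_factory=list)

    @property
    def num_topics(self) -> int:
        return len(self.topics)


def topics_containing(run: Run, word: str) -> List[int]:
    """Return the ids of topics whose representative words include `word`."""
    needle = word.lower()
    return [
        t.topic_id
        for t in run.topics
        if needle in (w.lower() for w in t.top_words)
    ]


# Toy values invented for illustration; they are not taken from the dataset.
example_run = Run(
    model_name="BioBERT",
    topics=[
        Topic(0, ["The patient presented with fever."], ["fever", "patient"]),
        Topic(1, ["An MRI scan was performed."], ["mri", "scan"]),
    ],
)
```

With a representation like this, a call such as `topics_containing(example_run, "fever")` looks up the topics in which a given word is representative, and `example_run.num_topics` gives the number of topics in the run.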