| dc.contributor.author |
Pantoja, Fagner Leal |
|
| dc.contributor.author |
Medeiros, Claudia Bauzer |
|
| dc.date |
2025-07-28 |
|
| dc.date.accessioned |
2025-07-03 |
|
| dc.date.accessioned |
2025-07-30T01:11:50Z |
|
| dc.date.available |
2025-07-30T01:11:50Z |
|
| dc.identifier.uri |
https://doi.org/10.25824/redu/R15PFJ |
|
| dc.identifier.uri |
https://redu.unicamp.br/dataset.xhtml?persistentId=doi:10.25824/redu/R15PFJ |
|
| dc.description |
This dataset presents the results obtained with the Attention-based Topics (ABT) method, which identifies the themes covered by a collection of sentences. In each run, the method produces a set of topics; each topic is associated with its constituent sentences and its most representative words. ABT can be parameterized to use different language models as the basis for producing the topics. This dataset therefore contains the results obtained with the following language models: Envoy, BERT, BioBERT, BART and all-mpnet. The experiments produced between 1 and 200 topics.
In this case study, the input was a collection of 10538 clinical case reports extracted from the CliCR corpus, available at http://github.com/clips/clicr. A clinical case report is a detailed assessment focused on a patient, written for purposes such as epidemiology, clinical studies, or the study of rare diseases, among others. |
|
| dc.description.sponsorship |
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior |
|
| dc.format |
text/plain |
|
| dc.format |
text/plain |
|
| dc.format |
text/plain |
|
| dc.format |
text/plain |
|
| dc.format |
text/plain |
|
| dc.format |
text/markdown |
|
| dc.publisher |
Pantoja, Fagner Leal |
|
| dc.subject |
Computer and Information Science |
|
| dc.title |
Attention-based topics |
|
| dc.description.sponsorshipId |
CAPES: 1649850 |
|