
dc.contributor.author Pantoja, Fagner Leal
dc.contributor.author Medeiros, Claudia Bauzer
dc.date 2025-07-28
dc.date.accessioned 2025-07-03
dc.date.accessioned 2025-07-30T01:11:50Z
dc.date.available 2025-07-30T01:11:50Z
dc.identifier.uri https://doi.org/10.25824/redu/R15PFJ
dc.identifier.uri https://redu.unicamp.br/dataset.xhtml?persistentId=doi:10.25824/redu/R15PFJ
dc.description This dataset presents the results obtained with the Attention-based Topics (ABT) method, which aims to identify the themes covered by a collection of sentences. In each run, the method produces a set of topics, each associated with its constituent sentences and its most representative words. ABT is parameterized to run with different language models as the basis for producing the topics. Hence, this output dataset contains the results obtained with the following language models: Envoy, BERT, BioBERT, BART and all-mpnet. The experiments produced from 1 to 200 topics. In this case study, the input was a collection of 10538 clinical case reports extracted from the CliCR corpus, available at http://github.com/clips/clicr. A clinical case is a detailed assessment that focuses on patients for different reasons, e.g., epidemiology, clinical studies, rare diseases or others.
dc.description.sponsorship Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
dc.format text/plain
dc.format text/plain
dc.format text/plain
dc.format text/plain
dc.format text/plain
dc.format text/markdown
dc.publisher Pantoja, Fagner Leal
dc.subject Computer and Information Science
dc.title Attention-based topics
dc.description.sponsorshipId CAPES: 1649850


Files in this item

Files Size Format View

There are no files associated with this item.

This item appears in the following collection(s)
