
dc.coverage.spatial Brazil
dc.coverage.temporal 1970-2022
dc.date.accessioned 2022-04-04T13:19:53Z
dc.date.accessioned 2025-05-24T01:10:55Z
dc.date.available 2022-04-04T13:19:53Z
dc.date.available 2025-05-24T01:10:55Z
dc.date.issued 2022-04-04T10:19:53Z
dc.identifier.uri www.usp.br
dc.identifier.uri http://repositorio.uspdigital.usp.br/handle/item/354
dc.description Carolina is a general corpus of contemporary Brazilian Portuguese with information on origin and typology. It is an open corpus for Linguistics and Artificial Intelligence with a robust volume of texts of varied typology in contemporary Brazilian Portuguese (1970-2021). The first version of the corpus, 1.0 Ada, totals 653,354,884 tokens and has been available in open access, free to download for research purposes, since March 8, 2022. Licensing information may vary from text to text; please check the TEI XML header of each text/file. This version of the corpus contains seven typologies: (1) datasets and other corpora; (2) legislative branch; (3) social media; (4) wikis; (5) judicial branch; (6) public domain works; (7) university domains. This collection: datasets and other corpora.
dc.format zip file
dc.publisher Center for Artificial Intelligence (C4AI) http://c4ai.inova.usp.br
dc.title Corpus Carolina v1.0 Ada


Files in this item

Files Size Format View

There are no files associated with this item.

This item appears in the following collection(s)
