Use this identifier to cite or access this item: https://doi.org/10.25824/redu/REJCTD
DOI: https://doi.org/10.25824/redu/REJCTD
Title: Financial news about brazilian companies listed on B3 and source-codes to perform sentiment analysis
Subject: Computer and Information Science
Description: This package contains a dataset of financial news articles (written in Portuguese) and the Python source code to perform sentiment analysis on them, following two approaches: (i) lexicon-based, using three Portuguese lexicons, two of which were proposed by the authors and developed specifically for the financial market; and (ii) machine-learning-based, specifically with Naive Bayes and Multilayer Perceptrons. The dataset (file "NewsDatabase.zip") contains 828 news articles, downloaded from Brazilian newspapers with a web scraper and manually labeled as positive or negative, according to an investor's sentiment. The dataset comes in two sets of files, with and without stemming applied. All documents were preprocessed through tokenization, normalization, and removal of special characters and stop words. In the source code (file "Source-Codes.zip"), the two proposed dictionaries can be found in the file "financial_dictionary.py".
Author(s): Januário, Brenda Alexsandra
Carosia, Arthur Emanuel de Oliveira
Silva, Ana Estela Antunes da
Coelho, Guilherme Palermo
URI: https://doi.org/10.25824/redu/REJCTD
https://redu.unicamp.br/dataset.xhtml?persistentId=doi:10.25824/redu/REJCTD
Other identifiers:  
Funding: Fundação de Amparo à Pesquisa do Estado de São Paulo
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Project number: FAPESP: 2018/24371-1
CAPES: 001
Terms of use:  
Date: 31-Aug-2021
Date available: 1-Sep-2021
Format: application/zip
application/zip
Type:  
Publisher / Event / Institution: Guilherme Palermo Coelho
Language:  
Appears in collections: Repositório de Dados de Pesquisa da UNICAMP
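
The description above mentions preprocessing (tokenization, normalization, removal of special characters and stop words) followed by lexicon-based sentiment scoring. A minimal Python sketch of that general idea is given below. It is not the authors' code: the stop word list, the lexicon entries, the function names, and the "neutral" fallback for ties are all hypothetical, chosen only for illustration. The real dictionaries are in "financial_dictionary.py" and are not reproduced here. Stemming is omitted.

```python
import re

# Hypothetical stop words and lexicon, for illustration only.
# They are NOT the dictionaries shipped in "financial_dictionary.py".
STOP_WORDS = {"a", "o", "as", "os", "de", "da", "do", "e", "em", "que"}
LEXICON = {
    "lucro": 1, "alta": 1, "crescimento": 1,
    "prejuízo": -1, "queda": -1, "dívida": -1,
}


def preprocess(text):
    """Lowercase, keep only runs of letters, and drop stop words."""
    # [^\W\d_]+ matches Unicode letters only, so digits, punctuation,
    # and other special characters are discarded during tokenization.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def lexicon_sentiment(text):
    """Sum the lexicon polarity of each token and map the sign to a label."""
    score = sum(LEXICON.get(t, 0) for t in preprocess(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # Assumption: the dataset only uses positive/negative labels, so a
    # tie-breaking rule would be needed in practice.
    return "neutral"


if __name__ == "__main__":
    print(lexicon_sentiment("O lucro da empresa subiu 10%!"))
    print(lexicon_sentiment("Ações em queda após prejuízo"))
```

Under these assumptions, a headline is scored by counting polarity words after cleaning. A machine learning approach such as Naive Bayes would instead learn word weights from the 828 labeled articles.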



Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.