Preprints (not yet peer-reviewed)

Background. The Office of Academic Affairs (OAA), the Office of Student Life (OSL) and the Information Technology Helpdesk (ITD) are support functions within a university that receive hundreds of email messages on a daily basis. A large percentage of emails received by...

["Artificial Intelligence","Data Mining and Machine Learning","Natural Language and Speech"]
doi:10.7287/peerj.preprints.26531v1
203 downloads
1,284 views

Human speech is the most important part of General Artificial Intelligence and the subject of much research. The hypothesis proposed in this article provides an explanation of the difficulties that modern science tackles in the field of human brain simulation. The hypothesis...

["Artificial Intelligence","Computational Linguistics","Natural Language and Speech"]
doi:10.7287/peerj.preprints.1576v4
155 downloads
637 views

Severe weather impact identification and monitoring through social media data is a good challenge for data science. In recent years we have witnessed an increase in natural disasters, partly due to climate change. Many works showed that during such events people tend...

["Data Science","Emerging Technologies","Natural Language and Speech","Network Science and Online Social Networks"]
doi:10.7287/peerj.preprints.2241v2
51 downloads
262 views

Scaling up the analysis of sensitive or confidential documents frequently stumbles on the limited number of individuals with the necessary clearance to access the documents. The availability of cryptographic protocols compatible with text processing methods can...

["Cryptography","Data Science","Natural Language and Speech"]
doi:10.7287/peerj.preprints.2994v1
333 downloads
692 views

Building an effective team of developers is a complex task faced by both software companies and open source communities. The problem of forming a “dream” team involves many variables, including consideration of human factors, and it is not a dilemma solvable in...

["Data Mining and Machine Learning","Data Science","Natural Language and Speech","Social Computing","Software Engineering"]
doi:10.7287/peerj.preprints.2285v1
126 downloads
365 views

Kamus Dewan is the authoritative dictionary for Bahasa Malaysia, containing a wealth of linguistic and cultural information about Bahasa Malaysia. It is currently available in print, as well as a searchable online dictionary. However, the online dictionary lacks...

["Computational Linguistics","Natural Language and Speech"]
doi:10.7287/peerj.preprints.2205v1
320 downloads
829 views

Due to the rapid development of information technology, the Internet has gradually become part of everyday life. People like to communicate with friends and share their opinions on social networks. The diverse social network behavior is an ideal users' personality...

["Artificial Intelligence","Natural Language and Speech","Social Computing"]
doi:10.7287/peerj.preprints.1906v1
170 downloads
705 views

Developers summarize their changes to code in commit messages. When a message seems “unusual,” however, this casts doubt on the quality of the code contained in the commit. We trained n-gram language models and used cross-entropy as an indicator of commit...

["Data Mining and Machine Learning","Natural Language and Speech","Software Engineering"]
doi:10.7287/peerj.preprints.1771v1
674 downloads
1,111 views

Automated test generation tools have been widely investigated with the goal of reducing the cost of testing activities. However, generated tests have been shown not to help developers in detecting and finding more bugs even though they reach higher structural coverage...

["Natural Language and Speech","Software Engineering"]
doi:10.7287/peerj.preprints.1467v3
131 downloads
250 views

We present an application of the naturalness of software to provide multi-token code suggestions in GitHub’s Atom text editor. We extended the results of a simple n-gram prediction model using the "mean surprise" metric—the arithmetic mean of the surprisal of several...

["Data Mining and Machine Learning","Natural Language and Speech","Software Engineering"]
doi:10.7287/peerj.preprints.1597v1
148 downloads
421 views

The problem of designing an effective methodology to summarize and analyze the amount of textual information produced by developers remains particularly challenging, especially when the goal is to help developers in making better development/maintenance decisions....

["Natural Language and Speech","Software Engineering"]
doi:10.7287/peerj.preprints.1534v1
220 downloads
422 views

Product Data Management (PDM) has produced desktop and web-based systems to maintain organizational technical and managerial data and to increase the quality of products by improving the processes of development, business process flows, change management, product structure...

["Human-Computer Interaction","Artificial Intelligence","Databases","Natural Language and Speech","Software Engineering"]
doi:10.7287/peerj.preprints.1518v1
320 downloads
446 views

In this paper, we use statistical machine translation to convert Python 2 code to Python 3 code. We use data from two projects and achieve a high BLEU score. We also investigate cross-project training and testing to analyze the errors...

["Natural Language and Speech","Software Engineering"]
doi:10.7287/peerj.preprints.1459v1
250 downloads
353 views

We describe and experimentally validate a question-asking framework for machine-learned linguistic knowledge about human emotions. Using the Socratic method as a theoretical inspiration, we develop an experimental method and computational model for computers to...

["Agents and Multi-Agent Systems","Artificial Intelligence","Computational Linguistics","Data Mining and Machine Learning","Natural Language and Speech"]
doi:10.7287/peerj.preprints.1292v1
333 downloads
262 views

Despite being a relatively new discipline, Chinese Interpreting Studies (CIS) has witnessed tremendous growth in the number of publications and diversity of topics investigated over the past two decades. The number of doctoral dissertations produced has also increased...

["Data Mining and Machine Learning","Data Science","Databases","Digital Libraries","Natural Language and Speech"]
doi:10.7287/peerj.preprints.1277v1

What is a PeerJ Preprint?

A PeerJ Preprint is a draft of an article, abstract, or poster that has not yet been peer-reviewed for formal publication. Submit a draft, incomplete, or final version of your work for free.
