Note
Author(s) affiliated with: Concordia Institute for Information Systems Engineering, Concordia University.
"March 10, 2011".
"Preprint submitted to Information Sciences".
Issued in Information Sciences, 231; pages 98-112. ISSN 0020-0255.
Résumé
"The cyber world provides an anonymous environment for criminals to conduct malicious activities such as spamming, sending ransom e-mails, and spreading botnet malware. Often, these activities involve textual communication between a criminal and a victim, or between criminals themselves. The forensic analysis of online textual documents for addressing the anonymity problem called authorship analysis is the focus of most cybercrime investigations. Authorship analysis is the statistical study of linguistic and computational characteristics of the written documents of individuals. This paper is the first work that presents a unified data mining solution to address authorship analysis problems based on the concept of frequent pattern-based writeprint. Extensive experiments on real-life data suggest that our proposed solution can precisely capture the writing styles of individuals. Furthermore, the writeprint is effective to identify the author of an anonymous text from a group of suspects and to infer sociolinguistic characteristics of the author."--Pages 1-2.