This thesis addresses the challenge of processing and analyzing large-scale, unstructured log data, a task of growing importance in today’s data centers. It focuses on practical methods for extracting meaningful features from large, unlabeled log datasets, with an emphasis on unsupervised learning techniques suited to the scale and nature of the data. Before the designed model is applied to a real-world dataset, its accuracy is tested on a publicly available dataset. The findings contribute to the field of log data analysis, offering insights into handling large datasets and highlighting areas for further research in anomaly detection and data processing techniques.