TEXT DOCUMENT SEGMENTATION USING THE TEXTTILING METHOD
DOI:
https://doi.org/10.33884/jif.v10i01.4509
Keywords:
Segmentation, Text Documents, TextTiling
Abstract
In this paper, we report our work on text segmentation of Indonesian speech documents. Because the speech documents are transcribed by Automatic Speech Recognition (ASR), the resulting text contains no boundaries between documents, so the transcripts need to be segmented by topic. We apply the TextTiling method with several term-weighting techniques, namely TF-IDF, TF-IDF-Mutual Information, TF-IDF-Mutual Information-Word Similarity, and TF-IDF-Word Frequency, to measure the similarity between segments. The results show that TF-IDF-Mutual Information performed better on most of the collections.
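As a rough illustration of the approach the abstract describes, the sketch below runs a simplified TextTiling pass with plain TF-IDF weights: adjacent blocks of tokens are compared by cosine similarity, and a boundary is placed where the similarity curve dips deeply. This is not the authors' implementation. The function names, the block representation (pre-tokenized lists), and the cutoff of mean minus half a standard deviation of the depth scores are assumptions made for the example, and the Mutual Information, Word Similarity, and Word Frequency variants are not reproduced.

```python
import math
from collections import Counter


def tfidf_vectors(blocks):
    """Return one {term: weight} dict per block (each block is a token list)."""
    n = len(blocks)
    # Document frequency: in how many blocks each term occurs.
    df = Counter(term for block in blocks for term in set(block))
    vectors = []
    for block in blocks:
        tf = Counter(block)
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def depth_scores(sims):
    """Depth of each gap: how far the similarity drops below its neighboring peaks."""
    depths = []
    for i, s in enumerate(sims):
        left = s
        for j in range(i - 1, -1, -1):  # climb left while the curve keeps rising
            if sims[j] >= left:
                left = sims[j]
            else:
                break
        right = s
        for j in range(i + 1, len(sims)):  # climb right while the curve keeps rising
            if sims[j] >= right:
                right = sims[j]
            else:
                break
        depths.append((left - s) + (right - s))
    return depths


def segment(blocks):
    """Return indices of blocks that start a new segment (assumed cutoff rule)."""
    vectors = tfidf_vectors(blocks)
    sims = [cosine(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    depths = depth_scores(sims)
    if not depths:
        return []
    mean = sum(depths) / len(depths)
    std = math.sqrt(sum((d - mean) ** 2 for d in depths) / len(depths))
    cutoff = max(mean - std / 2, 0.0)  # assumption: mean minus half a std deviation
    return [i + 1 for i, d in enumerate(depths) if d > cutoff]


# Invented toy transcript: two blocks about rice prices, then two about football.
blocks = [
    ["harga", "beras", "naik", "pasar"],
    ["harga", "beras", "pasar", "turun"],
    ["gol", "bola", "liga", "tim"],
    ["tim", "liga", "bola", "menang"],
]
print(segment(blocks))  # prints [2]: a new segment starts at block index 2
```

In a real ASR transcript the blocks would be fixed-size windows of tokens rather than hand-made lists, and swapping in a different term-weighting scheme would only require replacing `tfidf_vectors`.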
License
Copyright (c) 2022 JURNAL ILMIAH INFORMATIKA
This work is licensed under a Creative Commons Attribution 4.0 International License.