Text classification [2,5] is gaining attention due to the availability of large amounts of text data, such as blog articles and news data. Traditional text classification methods [1] use all the words present in a given text to represent a document. However, the large number of words in documents can greatly increase the complexity of the classification task and thus make it very costly. Moreover, long natural-language documents usually include a wide variety of information related to the topic of a document. For example, encyclopedic articles, such as one on the life of a scientist, contain detailed biographical information in addition to topic-related content. Often, in such articles, words or entities that are not related to the main topic (or category) of the article appear after the first paragraph (or the first few sentences). We assume that the most informative part of such articles is limited to the first few sentences. In other words, instead o...
Papers by Rima Türker