Introduction

Time is one of the five key criteria for judging a document's credibility, alongside coverage, objectivity, accuracy and relevance. A survey of approaches to temporal annotation emphasizes the growing importance of time in Natural Language Processing (NLP) and Information Retrieval (IR). The TempEval task at the SemEval 2007 workshop demonstrated the value of time in various NLP tasks.

Time plays an important role in application domains such as event sequencing, question answering systems, temporal summarization, temporal clustering and temporal querying. Detecting the presence of time in context has therefore become an important research problem in recent years.

However, according to the survey and the TempEval task of SemEval 2007, various solutions to the problem of identifying the presence of time rely on linguistic constructs, such as the words "after", "before" and "currently", and link them to the document creation time. Context need not always contain such constructs. For example, the query "Did the Enron merger with Dynegy take place?" requires an answer given from a temporal viewpoint, but the question contains no language construct from which the implicit presence of time could be identified. To address this, a resource was created for the English language that removes the need for such constructs to be present in the context.

Inspired by TempoWordNet for English, we propose to build Tempo-IndoWordnet: a temporal resource for the Hindi language. The ultimate aim of this research is to provide a resource for Hindi through which temporal information can be accessed effectively. To achieve this aim, temporal classifiers are trained in two steps on an initial set of seed words representing five classes (past, present, future, neutral and atemporal) and then used to classify the entire Hindi wordnet into these classes. In the first step, the entire Hindi wordnet is classified into two classes: temporal and atemporal. In the second step, the instances predicted as temporal in the first step are classified into four classes: past, present, future and neutral. The classes and their meanings are as follows:

Problem Statement

Given the Hindi wordnet, annotate each synset in it with the temporal information it denotes, i.e. classify each synset into one of five classes: past, present, future, neutral and atemporal. For example:

Word : कल
Synset : भविष्य, भविष्य काल, आगामी समय, उत्तरकाल, उत्तर-काल, उत्तर काल, भावी समय, आने वाला समय, अगत, अप्राप्तकाल, अवर्त्तमान, अवर्तमान, आगम, आगाह
Gloss : आने वाला काल या समय
Example sentence : "भविष्य में क्या होगा कोई नहीं जानता । / कल किसने देखा है ।"
Temporal annotation : Future
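
The two-step scheme described in the Introduction can be sketched in code. The following is a minimal illustration only, not the classifiers built in this work: the `Synset` record, its field names, and the two `toy_*` functions are assumptions introduced here, and the toy functions merely look for a few cue strings in the gloss instead of using learned models.

```python
from dataclasses import dataclass
from typing import Callable, List

# The four classes assigned in the second step.
TEMPORAL_CLASSES = ("past", "present", "future", "neutral")


@dataclass
class Synset:
    """Minimal synset record; field names are illustrative assumptions."""
    words: List[str]
    gloss: str


def classify_two_step(
    synset: Synset,
    is_temporal: Callable[[Synset], bool],
    temporal_class: Callable[[Synset], str],
) -> str:
    """Step 1: temporal vs. atemporal. Step 2: past/present/future/neutral."""
    if not is_temporal(synset):
        return "atemporal"
    label = temporal_class(synset)
    if label not in TEMPORAL_CLASSES:
        raise ValueError(f"unexpected temporal class: {label!r}")
    return label


# Toy stand-ins for the two learned classifiers (NOT the trained models):
# they only check for a few cue strings in the gloss.
def toy_is_temporal(synset: Synset) -> bool:
    return any(cue in synset.gloss for cue in ("समय", "काल"))


def toy_temporal_class(synset: Synset) -> str:
    if "आने वाला" in synset.gloss:
        return "future"
    if "बीता" in synset.gloss:
        return "past"
    return "neutral"


# The synset from the example above (word list shortened).
future_synset = Synset(words=["भविष्य", "कल"], gloss="आने वाला काल या समय")
print(classify_two_step(future_synset, toy_is_temporal, toy_temporal_class))
```

Passing the two classifiers in as callables keeps the cascade independent of how each step is learned: only synsets that pass the first step ever reach the second.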

Challenges

Polysemous words can have a different temporal annotation for each sense. Note the different annotations for the two senses of the same word "कल" in the example below.

  1. Word : कल
    Synset : अतीत, अतीतकाल, भूतकाल, अतीत काल, कल, भूत काल, गत काल, पिछला ज़माना, पूर्वकाल
    Gloss : बीता हुआ समय या काल
    Example sentence : "यह उपन्यास अतीत की घटनाओं पर आधारित है । / कल की बातों को याद करके दुखी होना अच्छा नहीं ।"
    Temporal annotation : Past

  2. Word : कल
    Synset : कल
    Gloss : आज के बाद आनेवाले पहले दिन को
    Example sentence : "मैं कल घर जाऊँगा ।"
    Temporal annotation : Future
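
One practical consequence is that annotations have to be stored per sense, not per word. The sketch below is a hypothetical illustration: the `(word, sense number)` key, the dictionary layout and the helper names are assumptions, with sense numbers following the numbering of the example above.

```python
from typing import Dict, List, Tuple

# (word, sense number) -> temporal label; layout is an assumption.
SenseKey = Tuple[str, int]

annotations: Dict[SenseKey, str] = {
    ("कल", 1): "past",    # sense 1: time that has passed
    ("कल", 2): "future",  # sense 2: the day after today
}


def labels_for_word(word: str, table: Dict[SenseKey, str]) -> List[str]:
    """Return the sorted set of labels found across all senses of a word."""
    return sorted({label for (w, _), label in table.items() if w == word})


def is_temporally_ambiguous(word: str, table: Dict[SenseKey, str]) -> bool:
    """True when the senses of a word carry more than one temporal label."""
    return len(labels_for_word(word, table)) > 1


print(labels_for_word("कल", annotations))          # ['future', 'past']
print(is_temporally_ambiguous("कल", annotations))  # True
```

A lookup keyed by the word alone would have to pick one of the two labels, which is why classification is done at the synset level.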