TAGARELA: An Intelligent Tutoring System

TAGARELA: A web-based intelligent workbook for Portuguese
Detmar Meurers and Ramon Ziai
based on joint research with Luiz Amaral (UMass Amherst)
Berlin. October 15, 2009

Outline
• Introduction
• Feedback
• System Architecture
• The three models
• Expert model: NLP
• Annotation-based setup
• Activity model
• Relevance for processing
• Analyzing learner language
• On Tokenization
• Interpretation
• Portuguese Properties
• Mismatches in the identification of tokens
• Solution
• On interpreting accented characters
• Interpretation
• Portuguese Properties
• Mismatches in the interpretation of tokens
• Solution
• Wrapping up
• Conclusion
• Appendix
• Screenshots

Introduction
• TAGARELA: Teaching Aid for Grammatical Awareness, Recognition and Enhancement of Linguistic Abilities
  (Amaral & Meurers 2005, 2006, 2007a,b, 2008, 2009; Amaral 2007; Ziai 2009)
• an intelligent web-based workbook for beginning learners of Portuguese
• designed to satisfy real-life FLT needs identified at OSU:
    • provide opportunities for students to practice their listening, reading, and writing skills
    • offer individual feedback on learner input to the system
    • foster learner awareness of language forms and categories
      (Long 1991, 1996; Ellis 1994; Schmidt 1995; Lyster 1998; Lightbown & Spada 1999; Norris & Ortega 2000; Schulz 2002)

System role, Activity types, Interface
• What role does the system play in teaching?
  → Self-guided activities accompanying teaching
• What type of activities are appropriate and useful for fostering awareness (and fit into the FLT approach)?
  → Activities ideally involve both form and meaning, such as listening/reading comprehension questions.
• TAGARELA offers six types of activities:
    • listening comprehension
    • reading comprehension
    • picture description
    • fill-in-the-blank
    • rephrasing
    • vocabulary
  Similar to traditional workbook exercises, plus audio.
• What should the system interfaces look like?
  → Use L2 as far as possible (needs careful interface design).

Providing Feedback
• TAGARELA provides on-the-spot feedback on
    • orthographic errors (non-words, spacing, capitalization, punctuation)
    • syntactic errors (nominal and verbal agreement)
    • semantic errors (missing or extra concepts, word choice)
• Providing feedback on meaning becomes crucial for activities such as reading and listening comprehension.

Feedback on Agreement
[screenshot]

Feedback on Word Choice
[screenshot]

Feedback on Wrong Word
[screenshot]

Feedback on Missing Verb
[screenshot]

General Architecture of TAGARELA
[architecture diagram; recoverable box labels:]
• Web Interface: Student Input, Feedback Message
• Expert Module
    • Linguistic Analysis sub-modules
        • Form Analysis: tokenizer, spell-checker, lexical look-up, disambiguator, parser
        • Content Analysis: difflib, correct answer, token matcher, canonic matcher, pos matcher
    • Strategic Analysis sub-modules: task strategies, task appropriateness, transfer
• Analysis Manager: Linguistic Analysis (form and content), Strategic Analysis
• Feedback Manager (pedagogical modules): Feedback selection, Error Filtering, Ranking
• Instruction Model: Activity Model, Error Taxonomy
• Student Model: Personal information, Interaction Preferences, Language Competence
• Further labels: Student analysis, Feedback Generation

The three models
• The TAGARELA architecture includes
    • model of domain knowledge (linguistic knowledge)
    • learner model
    • instruction/activity model
• What is the point of learner and activity models?
  ⇒ Providing feedback involves
    • identifying linguistic properties of the learner input and
    • interpreting them in terms of likely (mis)conceptions of the learner
• This interpretation goes beyond linguistic form as such.
• It needs to model the learner’s use of language for a specific task in a specific context.
  → (Amaral & Meurers 2007a)

NLP analysis modules in TAGARELA
• Form Analysis:
    • tokenizer: takes into account specifics of Portuguese (cliticization, contractions, abbreviations)
    • lexical/morphological lookup: returns multiple analyses based on CURUPIRA lexicon (Martins et al. 2006)
    • disambiguator: finite state disambiguation rules narrow down lexical information, in the spirit of Constraint Grammar (Karlsson et al. 1995; Bick 2000, 2004)
    • parser: bottom-up chart parser establishes relations to check agreement, case and global well-formedness
• Content Analysis:
    • shallow semantic matching strategies between student answer and target, cf. Content Assessment Module (Bailey & Meurers 2006, 2008)

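
The shallow matching between student answer and target mentioned under Content Analysis can be illustrated with a minimal sketch. It is not TAGARELA's implementation: the function names `words`, `token_overlap`, `missing_and_extra` and `sequence_similarity` are hypothetical, and the only library used is Python's standard `difflib`.

```python
import difflib
import re


def words(text):
    """Lower-cased word tokens; punctuation is ignored (illustrative helper)."""
    return re.findall(r"\w+", text.lower())


def token_overlap(answer, target):
    """Fraction of target tokens that also occur in the student answer."""
    answer_tokens = set(words(answer))
    target_tokens = words(target)
    if not target_tokens:
        return 0.0
    hits = sum(1 for tok in target_tokens if tok in answer_tokens)
    return hits / len(target_tokens)


def missing_and_extra(answer, target):
    """Target tokens absent from the answer, and answer tokens absent from the target."""
    answer_tokens = words(answer)
    target_tokens = words(target)
    missing = [tok for tok in target_tokens if tok not in answer_tokens]
    extra = [tok for tok in answer_tokens if tok not in target_tokens]
    return missing, extra


def sequence_similarity(answer, target):
    """Order-sensitive similarity of the two token sequences, between 0 and 1."""
    matcher = difflib.SequenceMatcher(None, words(answer), words(target))
    return matcher.ratio()
```

For example, `missing_and_extra("O vaso em cima da mesa.", "O vaso está em cima da mesa.")` reports `está` as missing, which corresponds to the "missing concept" kind of semantic feedback listed earlier.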
Annotation-based processing
• Allow the analysis manager to flexibly employ NLP modules relevant to a particular activity.
• To support a flexible control structure, the data structures serving as input and as output for the analysis modules need to be uniform and explicit.
• NLP analysis = a process of enriching the learner input with annotations (parallel to XML-based corpus annotation)
• The same data structure, the learner input annotated with information, is accessed throughout.
• Closely related idea: Common Analysis System (CAS, Götz & Suhre 2004) of the Unstructured Information Management Architecture (UIMA).
• UIMA-based reimplementation of TAGARELA’s NLP (Ziai 2009)
• In addition to the information obtained by analyzing the input, we need information about the activity.

General Characteristics of Activities
Activities can be characterized and differ in:
• task specification
    • e.g.: listen, read, write, comment, complete
• level
    • e.g.: basic, intermediate, advanced
• expected input
    • e.g.: word, phrase, sentence
• nature and availability of target responses and type of variation from target that is permitted
• required skills and abilities, e.g.:
    • strategies needed (e.g., scanning, summarizing, grouping)
    • amount of content manipulation required
    • required awareness of linguistic categories and rules
• pedagogical goals behind activity and feedback provided:
    • generally: improve the required skills and abilities

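
The annotation-based processing idea (one uniform structure holding the learner input, which every analysis module reads and enriches) can be sketched as follows. This is an illustration under stated assumptions, not the UIMA CAS API and not TAGARELA's code: `Annotation`, `AnnotatedInput`, `whitespace_tokenizer`, `toy_spell_checker` and `run_modules` are hypothetical names, and the lexicon is a toy word list.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    layer: str                      # e.g. "token" or "spelling"
    begin: int                      # character offset, inclusive
    end: int                        # character offset, exclusive
    features: dict = field(default_factory=dict)


@dataclass
class AnnotatedInput:
    """The learner input plus all annotations added so far."""
    text: str
    annotations: list = field(default_factory=list)

    def add(self, layer, begin, end, **features):
        self.annotations.append(Annotation(layer, begin, end, features))

    def select(self, layer):
        return [a for a in self.annotations if a.layer == layer]

    def covered_text(self, annotation):
        return self.text[annotation.begin:annotation.end]


def whitespace_tokenizer(doc):
    """Adds one 'token' annotation per whitespace-separated chunk."""
    position = 0
    for chunk in doc.text.split():
        begin = doc.text.index(chunk, position)
        end = begin + len(chunk)
        doc.add("token", begin, end)
        position = end


TOY_LEXICON = frozenset({"o", "vaso", "está", "na", "mesa"})  # assumption: toy data


def toy_spell_checker(doc):
    """Reads the 'token' layer and adds a 'spelling' layer for unknown words."""
    for token in doc.select("token"):
        word = doc.covered_text(token).lower().strip(".,!?")
        if word not in TOY_LEXICON:
            doc.add("spelling", token.begin, token.end, status="unknown")


def run_modules(text, modules):
    """A stand-in for the analysis manager: runs the requested modules in order."""
    doc = AnnotatedInput(text)
    for module in modules:
        module(doc)
    return doc
```

Because every module takes and returns the same structure, the list passed to `run_modules` can vary from activity to activity without changing any module.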
Where it matters for processing
• General claim: The NLP analysis and feedback generation depend on the specific activity (type).
• The information from the activity model has an impact on
    • Property Identification: Which linguistic properties (incl. errors) of the learner input can actually be observed in a given activity?
    • Property Selection: Which of the observed properties to select as likely error cause (or other relevant aspect)?
        • Which of the identified properties is most likely to provide a reliable assessment?
        • Which of the identified errors should be the focus of the feedback, given the activity and its specific pedagogical goals?
    • Feedback Strategy: Which strategy does it choose? E.g.:
        • explicit feedback on form for FIBs
        • scaffolding for reading comprehension (i.e., encouraging the use of required strategies)

Property identification in TAGARELA
• In TAGARELA, different activity types require different linguistic information to analyze the student’s input:
    • FIB: spell-checking, lexical information
    • Rephrasing: as above + syntactic processing and basic content assessment (correct answer, token matcher)
    • Reading: as above + all content analysis modules
• Why not always run everything?
    • “Don’t guess what you know.”
    • The more we know about the linguistic properties, the types of variation, and the potential errors NLP needs to detect,
        • the more specific information we can diagnose
        • with higher reliability

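
The cumulative relation between activity types and analysis modules described above (FIB, then Rephrasing, then Reading) can be written down as a small lookup. This is a sketch, not the system's configuration: the module names are taken from the architecture slide, but mapping "syntactic processing" to the disambiguator and parser, the activity keys, and the function `modules_for` are assumptions made for illustration.

```python
# FIB: spell-checking and lexical information only.
FIB_MODULES = ["spell-checker", "lexical look-up"]

# Rephrasing: as above, plus syntactic processing (assumed here to mean
# disambiguator + parser) and basic content assessment.
REPHRASING_MODULES = FIB_MODULES + [
    "disambiguator",
    "parser",
    "correct answer",
    "token matcher",
]

# Reading: as above, plus the remaining content analysis modules.
READING_MODULES = REPHRASING_MODULES + [
    "difflib",
    "canonic matcher",
    "pos matcher",
]

MODULES_BY_ACTIVITY = {
    "fib": FIB_MODULES,
    "rephrasing": REPHRASING_MODULES,
    "reading": READING_MODULES,
}


def modules_for(activity_type):
    """Returns a copy of the module list for one activity type."""
    try:
        return list(MODULES_BY_ACTIVITY[activity_type])
    except KeyError:
        raise ValueError(f"unknown activity type: {activity_type!r}") from None
```

Keeping the selection in data rather than in control flow reflects the "Don't guess what you know" point: a fill-in-the-blank activity never triggers parsing or content matching, so it cannot produce feedback based on analyses the activity does not need.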
TAGARELA meets real-life language learners
• The system was used by beginning Portuguese students at The Ohio State University.
• Studying the system logs, we identified two aspects where feedback based on the linguistically correct analysis did not seem to be helpful for learners:
    • interpretation of tokens with accented characters
    • tokenization of compounds

Identifying tokens (I)
[screenshot]

Identifying tokens (II)
[screenshot]

Properties of Portuguese: Tokenization
• Certain Portuguese words are syntactically complex.
• Contraction: preposition + determiner/pronoun
    • no = em (in) + o (the)
    • nela = em (in) + ela (it)
    • destes = de (of) + estes (these)
    • às = a (to) + as (the)
• Encliticization:
    • comprá-lo = comprar (to buy) + o (it)
    • compram-nas = compram (buy) + as (them)
    • comprei-a = comprei (bought) + a (it)

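
The contractions and enclitic forms listed above can be split with a simple lookup, shown here as a minimal sketch. The tables contain only the examples from the slide; a real tokenizer would need rules or a lexicon, and the names `CONTRACTIONS`, `ENCLITICS`, `deep_tokens` and `tokenize` are hypothetical.

```python
import re

# Only the examples given on the slide; not a complete inventory.
CONTRACTIONS = {
    "no": ("em", "o"),
    "nela": ("em", "ela"),
    "destes": ("de", "estes"),
    "às": ("a", "as"),
}

ENCLITICS = {
    "comprá-lo": ("comprar", "o"),
    "compram-nas": ("compram", "as"),
    "comprei-a": ("comprei", "a"),
}


def deep_tokens(surface):
    """Returns the underlying tokens of one surface word (itself if not complex)."""
    key = surface.lower()
    if key in CONTRACTIONS:
        return CONTRACTIONS[key]
    if key in ENCLITICS:
        return ENCLITICS[key]
    return (surface,)


def tokenize(sentence):
    """Pairs every surface word with its deep tokens; punctuation is dropped."""
    surface_words = re.findall(r"[\w-]+", sentence)
    return [(word, deep_tokens(word)) for word in surface_words]
```

Keeping the surface word next to its deep tokens, instead of returning a flat list, preserves the link that later feedback generation needs.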
Mismatches in the identification of tokens
• Learner input: O Amazonas fica no região norte.
• System’s interpretation: no = em + o
    • tokenized input: [em, o, região, norte]
    • syntactically analyzed: [PP em [NP o(masc), região(fem), norte]]
  ⇒ Agreement error between o and região.
• Student’s interpretation:
    • There is no o região norte in the sentence I wrote.
    • I used the ‘preposition’ no.
  ⇒ So no seems to be the wrong preposition?

Addressing the Identification of Tokens
• The system needs to connect the surface form provided by the student with the system analysis of this input.
• An annotation-based NLP architecture (→ UIMA) readily supports this with multiple parallel layers of annotation for the learner input.
• The tokenization mismatch can be addressed by representing both surface and deep tokenizations of the learner input, and the mapping between the two.
• Refer to surface form when generating the feedback.

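
Representing both tokenizations and mapping back to the surface form before wording the feedback can be sketched as below. This is an assumption-laden illustration rather than the system's code: the `Token` class is a simplified stand-in, the helper names and the feedback wording are hypothetical, and the offsets are character positions in the learner sentence "O Amazonas fica no região norte."

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    begin: int
    end: int
    token_string: str
    category: str
    # Deep tokens share the character span of the surface token they come from.
    deep_form: List["Token"] = field(default_factory=list)


def surface_token_of(surface_tokens, token):
    """Finds the surface token that is, or that contains, the given token."""
    for surface in surface_tokens:
        if surface is token or any(deep is token for deep in surface.deep_form):
            return surface
    return None


def agreement_feedback(surface_tokens, first, second):
    """Words an agreement message using surface forms only."""
    shown = []
    for tok in (first, second):
        surface = surface_token_of(surface_tokens, tok)
        shown.append(surface.token_string if surface else tok.token_string)
    return f"Check the agreement between '{shown[0]}' and '{shown[1]}'."


# The parser works on the deep tokens 'em' + 'o' ...
em = Token(16, 18, "em", "prep")
o = Token(16, 18, "o", "det")
# ... but the learner typed the single surface token 'no'.
no = Token(16, 18, "no", "prep", deep_form=[em, o])
regiao = Token(19, 25, "região", "noun")

message = agreement_feedback([no, regiao], o, regiao)
```

The agreement error is detected between the deep token `o` and `região`, yet the message mentions `no`, the word the learner actually wrote.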
Example Token Representation

Token
  begin        0
  end          2
  tokenString  ‘no’
  category     ‘prep’
  lexDef       < LexiconInfo
                   pos        ‘prep’
                   canonic    ‘no’
                   frequency  31
                   source     ‘lexicon’ >
  deepForm     < Token
                   begin        0
                   end          2
                   tokenString  ‘em’
                   category     ‘prep’
                   lexDef       < LexiconInfo
                                    pos     ‘prep’
                                    source  ‘tokn’ >,
                 Token
                   begin        0
                   end          2
                   tokenString  ‘o’
                   category     ‘det’
                   lexDef       < LexiconInfo
                                    pos     ‘det’
                                    source  ‘tokn’
                                    gender  ‘m’
                                    number  ‘s’ > >

Interpreting tokens: Accents (I)
[screenshot]

Interpreting tokens: Accents (II)
[screenshot]

Properties of Portuguese: Accents and their importance for lexical distinctions
• Accents in Portuguese encode important linguistic distinctions.
• Part-of-speech differences:
    • pronoun vs. verb
        • esta (this) – está (is)
    • conjunction vs. verb
        • e (and) – é (is)
    • verb vs. noun
        • para (stop) – Pará (state’s name)
• Other differences:
    • gender
        • avô (grandfather) – avó (grandmother)
    • meaning
        • coco (coconut) – cocô (poop)

Mismatches in the interpretation of accents
• Learner input: O vaso esta em cima de mesa.
• System’s interpretation:
    • The word esta in the learner input is a determiner.
    • There is no form of the verb (estar) in the answer.
  ⇒ The student did not include the main verb.
• Student’s interpretation:
    • I included esta as a form of the verb estar. (The correct spelling is está.)
    • There is a verb in the sentence.
  ⇒ The lack of an accent is a spelling error.

Addressing the Interpretation of Accents
• Learners perceive the unaccented and accented versions of a character as orthographically similar and in consequence confuse linguistically unrelated forms.
• The system needs to capture the confusability of accented with unaccented characters.
• Treat accented and unaccented characters parallel to common L1-transfer phonological confusions.
    • está and esta are confused just like
    • liver and river are by Japanese learners of English
  ⇒ Develop a module that checks whether different (un)accented variants of input words are more likely.
• Where this is the case, provide dedicated feedback alerting the learner to this confusion.

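
A module that looks up (un)accented variants of an input word, as proposed above, might start from the following sketch. It is an assumption, not the module the authors built: the toy lexicon, its part-of-speech labels and the function names are hypothetical, and "more likely" is reduced here to "the variant has the part of speech the analysis is missing". Only Python's standard `unicodedata` module is used.

```python
import unicodedata

# Toy lexicon (word -> part of speech); the entries echo the slide examples.
TOY_LEXICON = {
    "esta": "det",
    "está": "verb",
    "e": "conj",
    "é": "verb",
    "avô": "noun",
    "avó": "noun",
    "coco": "noun",
    "cocô": "noun",
}


def strip_accents(word):
    """Removes all combining marks, so 'está' and 'esta' get the same key.

    Note: this also removes the cedilla and the tilde, not only accents.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_variants(word, lexicon):
    """Lexicon entries that differ from the word only in diacritics."""
    lowered = word.lower()
    key = strip_accents(lowered)
    return sorted(
        entry for entry in lexicon
        if entry != lowered and strip_accents(entry) == key
    )


def confusion_candidates(word, lexicon, needed_pos):
    """Variants whose part of speech is the one the analysis found missing."""
    return [v for v in accent_variants(word, lexicon) if lexicon[v] == needed_pos]
```

For the learner input "O vaso esta em cima de mesa.", the analysis finds no verb; `confusion_candidates("esta", TOY_LEXICON, "verb")` then offers `está`, which allows dedicated feedback about the missing accent instead of a "missing verb" message.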
Wrapping up: Token Identification & Interpretation
• Problems for an ITS can arise from mismatches between
    • the system’s interpretation of the learner input
    • how the learner perceives and conceptualizes the input
• Where such mismatches arise, the feedback produced by the system is inadequate.
• We discussed two such mismatches for Portuguese tokens in TAGARELA:
    • identification of tokens: contraction, encliticization
    • interpretation of tokens: accented characters
• We argued that these problems can be addressed
    • by treating accented and unaccented characters parallel to common L1-transfer phonological confusions.
    • using an annotation-based NLP processing architecture supporting a rich representation of the learner input, including surface and deep tokenizations.

Conclusion
• Integration of computational, linguistic, and FLT/SLA expertise opens up opportunities for ICALL research
• An ITS such as TAGARELA can address specific needs in real-life FLT:
    • provide opportunities for students to practice their listening, reading, and writing skills
    • provide individualized feedback to the learner
    • foster learner awareness of language forms and categories
    • provide contextualized activities integrating meaning and form
• The explicit activity design in ITS opens up unique opportunities for the collection of learner language produced in a range of controlled but meaningful activities.
    • Explicit activity design (constraining the potential learner input) makes it possible to include target answers (i.e., a premeditated set of potential target hypotheses!)

References

Amaral, L. (2007). Designing Intelligent Language Tutoring Systems: integrating Natural Language Processing technology into foreign language teaching. Ph.D. thesis, The Ohio State University.

Amaral, L. & D. Meurers (2005). Towards Bridging the Gap between the Needs of Foreign Language Teaching and NLP in ICALL. In A. Pedros-Gascon (ed.), Proceedings of the 8th Annual Symposium on Hispanic and Luso-Brazilian Literatures, Linguistics, and Cultures.

Amaral, L. & D. Meurers (2006). Where does ICALL Fit into Foreign Language Teaching? URL http://purl.org/net/icall/handouts/calico06-amaral-meurers.pdf. 23rd Annual Conference of the Computer Assisted Language Instruction Consortium (CALICO), May 19, 2006. University of Hawaii.

Amaral, L. & D. Meurers (2007a). Conceptualizing Student Models for ICALL. In C. Conati & K. F. McCoy (eds.), User Modeling 2007: Proceedings of the Eleventh International Conference. Wien, New York, Berlin: Springer, Lecture Notes in Computer Science. URL http://purl.org/dm/papers/amaral-meurers-um07.html.

Amaral, L. & D. Meurers (2007b). Putting activity models in the driver’s seat: Towards a demand-driven NLP architecture for ICALL. URL http://www.ling.ohio-state.edu/icall/handouts/eurocall07-amaral-meurers.pdf. EUROCALL. September 7, 2007. University of Ulster, Coleraine Campus.

Amaral, L. & D. Meurers (2008). From Recording Linguistic Competence to Supporting Inferences about Language Acquisition in Context: Extending the Conceptualization of Student Models for Intelligent Computer-Assisted Language Learning. Computer-Assisted Language Learning 21(4), 323–338. URL http://purl.org/dm/papers/amaral-meurers-call08.html.

Amaral, L. & D. Meurers (2009). Little Things With Big Effects: On the Identification and Interpretation of Tokens for Error Diagnosis in ICALL. CALICO Journal 27(1).

Bailey, S. & D. Meurers (2006). Exercise-driven selection of content matching methodologies. Peer reviewed conference presentation. EUROCALL’06. September 6, 2006. University of Granada.

Bailey, S. & D. Meurers (2008). Diagnosing meaning errors in short answers to reading comprehension questions. In J. Tetreault, J. Burstein & R. D. Felice (eds.), Proceedings of the 3rd Workshop on Innovative Use of NLP for Building Educational Applications, held at ACL 2008. Columbus, Ohio: Association for Computational Linguistics, pp. 107–115. URL http://aclweb.org/anthology-new/W/W08/W08-0913.pdf.

Bick, E. (2000). The Parsing System “Palavras”: Automatic Grammatical Analysis of Portuguese in a Constraint Grammar Framework. Aarhus University Press.

Bick, E. (2004). PaNoLa: Integrating Constraint Grammar and CALL. In H. Holmboe (ed.), Nordic Language Technology, Arbog for Nordisk Sprogteknologisk Forskningsprogram 2000-2004 (Yearbook 2003), Copenhagen: Museum Tusculanum, pp. 183–190.

Ellis, N. (1994). Implicit and Explicit Language Learning - An Overview. In Implicit and Explicit Learning of Languages, San Diego, CA: Academic Press, pp. 1–31.

Götz, T. & O. Suhre (2004). Design and implementation of the UIMA Common Analysis System. IBM Systems Journal 43(3), 476–489.

Karlsson, F., A. Voutilainen, J. Heikkilä & A. Anttila (eds.) (1995). Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text. No. 4 in Natural Language Processing. Berlin and New York: Mouton de Gruyter.

Lightbown, P. M. & N. Spada (1999). How languages are learned. Oxford: Oxford University Press.

Long, M. H. (1991). Focus on form: A design feature in language teaching methodology. In K. D. Bot, C. Kramsch & R. Ginsberg (eds.), Foreign language research in cross-cultural perspective, Amsterdam: John Benjamins, pp. 39–52.

Long, M. H. (1996). The role of linguistic environment in second language acquisition. In W. C. Ritchie & T. K. Bhatia (eds.), Handbook of second language acquisition, New York: Academic Press, pp. 413–468.

Lyster, R. (1998). Negotiation of form, recasts, and explicit correction in relation to error types and learner repair in immersion classroom. Language Learning 48, 183–218.

Martins, R., R. Hasegawa & M. das Graças Nunes (2006). Curupira: a functional parser for Brazilian Portuguese. In Computational Processing of the Portuguese Language, 6th International Workshop, PROPOR. Lecture Notes in Computer Science 2721. Faro, Portugal: Springer. URL http://www.springerlink.com/content/b48vjft1l88yvrj0/fulltext.pdf.

Norris, J. & L. Ortega (2000). Effectiveness of L2 Instruction: A Research Synthesis and Quantitative Meta-Analysis. Language Learning 50(3), 417–528.

Schmidt, R. (1995). Consciousness and foreign language: A tutorial on the role of attention and awareness in learning. In R. Schmidt (ed.), Attention and awareness in foreign language learning, Honolulu: University of Hawaii Press, pp. 1–63.

Schulz, R. A. (2002). Hilft es die Regel zu wissen um sie anzuwenden? Das Verhältnis von metalinguistischem Bewusstsein und grammatischer Kompetenz in DaF. Die Unterrichtspraxis/Teaching German 35(1), 15–24. URL http://www.jstor.org/stable/pdfplus/3531951.pdf.

Ziai, R. (2009). A Flexible Annotation-Based Architecture for Intelligent Language Tutoring Systems. Master’s thesis, Universität Tübingen, Seminar für Sprachwissenschaft.

Appendix: Screenshots
[screenshots]
