LinguaContext is a language learning tool, still in development. It lets someone reading web content in a language he is studying record an unfamiliar word together with the context in which he found it.
The LinguaContext tool has two elements:
- A web application written with ASP.NET Core and SQL Server, and
- A browser extension written in JavaScript.
How it works:
A user browsing a page will have the LinguaContext extension installed. This will enable him to:
- Log in to the LinguaContext extension from the website,
- Analyse the page to determine the language,
- Search for any stored words on the page.
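The last step, searching the page for stored words, could be sketched roughly as below. This is only an illustration: the function name `findStoredWords` and the shape of its result are assumptions, not the extension's actual code. It splits the page text into runs of letters and compares each one, case-insensitively, against the stored words.

```javascript
// Hypothetical helper (not from the LinguaContext code base):
// find occurrences of stored words in the visible text of a page.
function findStoredWords(pageText, storedWords) {
  // Compare case-insensitively by lower-casing both sides.
  const wanted = new Set(storedWords.map((w) => w.toLocaleLowerCase()));
  const hits = [];
  // \p{L}+ matches runs of letters in any script (needs the "u" flag),
  // so accented and non-Latin words are treated as whole words too.
  for (const match of pageText.matchAll(/\p{L}+/gu)) {
    if (wanted.has(match[0].toLocaleLowerCase())) {
      hits.push({ word: match[0], index: match.index });
    }
  }
  return hits;
}
```

In a content script, `pageText` would typically come from `document.body.innerText`. Only exact forms match here, so a stored word will not match its inflected forms.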
A user browses a website in the language he wishes to study and comes across an unfamiliar word. After looking the word up using another application (LinguaContext does not provide a translation facility), he highlights it, right-clicks to bring up the context menu, and selects the option to save the word. The word, and the web page on which it occurred, are then written back to the server.
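The save flow described above might look something like the following in the extension's background script. This is a minimal sketch under stated assumptions: the endpoint `SAVE_URL`, the menu id `save-word`, and the helper `buildSavePayload` are all hypothetical, and authentication is left out. The `chrome.contextMenus` calls are the standard extension API; the wiring is skipped when the script runs outside an extension.

```javascript
// Hypothetical endpoint; the real server route is not described in this document.
const SAVE_URL = "https://example.invalid/api/words";

// Build the record sent to the server from the context-menu click data:
// the selected word and the page on which it occurred.
function buildSavePayload(info) {
  return {
    word: (info.selectionText || "").trim(),
    pageUrl: info.pageUrl || "",
  };
}

// Wire up the context menu only when running inside a browser extension.
if (typeof chrome !== "undefined" && chrome.contextMenus) {
  chrome.runtime.onInstalled.addListener(() => {
    chrome.contextMenus.create({
      id: "save-word", // assumed id
      title: "Save word to LinguaContext",
      contexts: ["selection"], // show the entry only when text is selected
    });
  });

  chrome.contextMenus.onClicked.addListener((info) => {
    if (info.menuItemId !== "save-word") return;
    const payload = buildSavePayload(info);
    if (!payload.word) return; // nothing selected, nothing to save
    fetch(SAVE_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    }).catch((err) => console.error("Saving the word failed:", err));
  });
}
```

Keeping the payload construction in a plain function makes it testable without a browser.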
The user can view the list of saved words on the LinguaContext website.
Planned Enhancements:
Highlighting words on pages. Currently, a saved word is not highlighted when the user returns to the page.
Introduce Lists. Allow the user to allocate words to lists. The database structure already allows for this.
Background Crawling. A background thread can crawl the sites favoured by a given user to find other instances of words he has looked up on the sites he has browsed. So when he wants to remind himself of the meaning of a word, he is given multiple contexts drawn from the sort of web content that is likely to engage his interest.
Profiling the User. Build a profile based on the user's history and browse websites on his behalf for instances of words.
Applying Stemming or Lemmatization. When searching for multiple instances of a word, the system could be made to identify its different forms. This could be achieved either by:
- Stemming, a crude method of identifying the root of a word, or
- Lemmatization, which uses vocabulary and grammar to derive context-aware base forms.
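The difference between the two approaches can be illustrated with a toy example. Everything here is hypothetical: `crudeStem` strips a handful of English suffixes, and `lemmatize` is only a stand-in that looks words up in a tiny hand-written table. A real lemmatizer would use a full vocabulary and grammatical context for the language being studied.

```javascript
// Crude stemming: strip a few common English suffixes (illustrative only).
function crudeStem(word) {
  const lower = word.toLowerCase();
  for (const suffix of ["ing", "ed", "es", "s"]) {
    // Keep at least three characters so short words are left alone.
    if (lower.endsWith(suffix) && lower.length - suffix.length >= 3) {
      return lower.slice(0, -suffix.length);
    }
  }
  return lower;
}

// Lemmatization stand-in: a tiny, hand-written lookup table (assumed data).
const LEMMAS = new Map([
  ["went", "go"],
  ["mice", "mouse"],
  ["walked", "walk"],
]);

function lemmatize(word) {
  const lower = word.toLowerCase();
  // Fall back to the word itself when the table has no entry.
  return LEMMAS.get(lower) ?? lower;
}
```

Irregular forms show the trade-off: `crudeStem("went")` leaves the word unchanged, while `lemmatize("went")` maps it to `go`, at the cost of needing dictionary data for every language supported.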