Use the internet as a linguistic corpus:
Provide tools and infrastructure for acquiring, visually annotating, merging, and storing web pages as parts of larger corpora.
Develop a classification engine that learns to annotate pages automatically, and provide visual tools for inspecting the results.