Advanced Resource Creation, Archiving and Usage

Report on the CLARA 2010 Summer School on Advanced Resource Creation, Archiving and Usage, organized at Nijmegen, 5-16 July 2010 by the Max Planck Institute for Psycholinguistics.

The goal was to train early-stage researchers in three areas: using modern technology to create language resources, in particular when the source material consists of multimedia streams; archiving the resulting complex resource types; and accessing, analysing and enriching those resources with state-of-the-art (web) applications. The course also showed how virtual collections can be built and how to carry out operations on such collections. As a result, participants gained a thorough understanding of the modern methodologies and technologies used to create, archive and use sharable resources.

The training focused on (1) using state-of-the-art tools to create resources that adhere to open standards such as XML and MPEG, (2) making optimal use of open archives, including using converters for various data types and defining the necessary access permissions, (3) using existing frameworks that give access to the archived resources in various ways (from metadata up to web applications for complex objects), (4) applying existing methods to create and use virtual collections and (5) using existing frameworks to add comments and draw relations between resources and resource fragments.

The participants were early-stage PhD students and postdocs who will work with language resources and who needed training in the use of state-of-the-art tools. Some knowledge of computational aspects was expected, but not deep skills in XML Schema or software programming. The course was therefore aimed at those who see themselves as users of modern technology and methodologies.

The LAT technology from MPI was used during the course, along with some other major tools for efficiently handling specific tasks (speech analysis, image analysis, conversion, etc.).

The teachers were mainly members of MPI. They were joined by external specialists with deep knowledge of relevant standards in our domain, of audio/video codecs and the associated software, of speech analysis and the associated software, and of semantic web techniques and representation standards such as RDF. The MPI experts have eight years of experience in giving such courses twice a year to this target group.

There was no participation fee for the summer school.
