Type of session: Talk
Approximate duration: 1.5 hr
Skill level: All
We are living in an exciting moment: a deluge of documents, texts, images, videos, tweets, etc. is accessible via the Web and can be used for conducting social studies. In this sense, the Web is a data platform that non-specialists can fully exploit if they have the right tools. Yet the majority of tools are limited to specific domains or very specific tasks.
The objective of this talk is to introduce the basic concepts on which the Web is built (HTML pages, the HTTP protocol, URIs/URLs, client-server architectures, etc.) and how they work together. This knowledge can help non-experts understand how to use more basic tools for retrieving data from the Web and communicate more easily with computer specialists. As an example, I will describe how to build a corpus using articles from The Guardian web service.
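To give a flavour of how these concepts fit together, here is a minimal sketch in Python of what a client does when it asks a web service for articles: it assembles a URL, and that URL can be taken apart again into scheme, host, path, and query. The endpoint address, the parameter names (`q`, `page`, `api-key`), and the helper names `build_search_url` and `fetch_page` are assumptions made for illustration, not necessarily what will be used in the talk; check the service's current documentation before relying on them.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import urlopen

# Assumed address of the article search service (verify against its documentation).
BASE_URL = "https://content.guardianapis.com/search"


def build_search_url(query, api_key, page=1):
    """Hypothetical helper: assemble a search URL from its parts."""
    # Parameter names are assumptions; urlencode escapes spaces and symbols.
    params = {"q": query, "page": page, "api-key": api_key}
    return BASE_URL + "?" + urlencode(params)


def fetch_page(url):
    """Hypothetical helper: send an HTTP GET request and decode the JSON reply.

    Not called here, because it needs network access and a valid key.
    """
    with urlopen(url) as response:
        return json.load(response)


# Build a URL, then split it back into the pieces a URL is made of.
url = build_search_url("climate change", "YOUR-API-KEY")
parts = urlsplit(url)

print(url)
print("scheme:", parts.scheme)          # the protocol the client speaks
print("host:  ", parts.netloc)          # the server being contacted
print("path:  ", parts.path)            # the resource requested on that server
print("query: ", parse_qs(parts.query)) # the parameters sent with the request
```

Building a corpus would then amount to calling `fetch_page` for successive values of `page` and saving the text of each article returned; the exact layout of the reply depends on the service and is not shown here.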