THATCamp UGA in Valence 2018 http://ugainvalence2018.thatcamp.org Thu, 21 Jun 2018 08:38:55 +0000

The Web as a Data Platform for Everyone http://ugainvalence2018.thatcamp.org/2018/06/12/the-web-as-a-data-platform-for-everyone/ Tue, 12 Jun 2018 13:58:17 +0000 http://ugainvalence2018.thatcamp.org/?p=259

Type of session: Talk

Approximate duration: 1.5 hr

Skill level: All

We are living in an exciting moment: a deluge of documents, texts, images, videos, tweets, etc. is accessible via the Web and can be used for conducting social studies. In this sense, the Web is a data platform that can be fully exploited by non-specialists if they have the right tools. Yet the majority of tools are limited to specific domains or very specific tasks.

The objective of this talk is to introduce the basic concepts that underpin the Web (HTML pages, the HTTP protocol, URIs/URLs, client-server architectures, etc.) and how they work together. This knowledge can help non-experts understand how to use more basic tools for retrieving data from the Web and communicate with computer specialists. As an example, I will describe how to build a corpus using articles from The Guardian web service.
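To give a flavour of how these concepts fit together, here is a minimal Python sketch, not the material used in the talk. It assumes The Guardian's content service answers HTTP GET requests at the endpoint below and returns JSON with `response`, `results`, `webTitle`, `webUrl`, `webPublicationDate` and `fields.bodyText`; these names are recalled from the public documentation and should be checked against it. The helper names (`build_search_url`, `parse_articles`, `fetch_articles`) are made up for this illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of The Guardian's content web service; check the current docs.
SEARCH_ENDPOINT = "https://content.guardianapis.com/search"


def build_search_url(query, api_key, page=1, page_size=10):
    """Build the URL that the client sends to the server in an HTTP GET request."""
    params = {
        "q": query,
        "api-key": api_key,
        "show-fields": "bodyText",  # assumed field name for the plain-text body
        "page": page,
        "page-size": page_size,
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def parse_articles(payload):
    """Turn the JSON text returned by the server into a list of corpus entries."""
    results = json.loads(payload)["response"]["results"]
    return [
        {
            "title": item.get("webTitle", ""),
            "url": item.get("webUrl", ""),
            "date": item.get("webPublicationDate", ""),
            "text": item.get("fields", {}).get("bodyText", ""),
        }
        for item in results
    ]


def fetch_articles(query, api_key):
    """Send the request over HTTP and parse the response (needs network access)."""
    with urlopen(build_search_url(query, api_key)) as response:
        return parse_articles(response.read().decode("utf-8"))
```

With a personal API key, a call such as `fetch_articles("digital humanities", "YOUR-API-KEY")` would return a list of dictionaries that can be saved as a small corpus.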

Trust & Reputation http://ugainvalence2018.thatcamp.org/2018/06/04/trust-reputation/ Mon, 04 Jun 2018 16:08:31 +0000 http://ugainvalence2018.thatcamp.org/?p=247
How to establish & evaluate them.
What open source data and algorithms to use.
Why is social media hard to analyse? http://ugainvalence2018.thatcamp.org/2018/05/30/why-is-social-media-hard-to-analyse/ Wed, 30 May 2018 15:45:23 +0000 http://ugainvalence2018.thatcamp.org/?p=241

Type of session : Play

Title: Why is social media hard to analyse?

Name of session facilitator: Diana Maynard

Approximate duration: 1-2 hours

Skill level: beginner

Proposal: 

Tools for analysing tweets and other kinds of social media are everywhere these days, allowing us to understand what kinds of opinions are being expressed and who is talking about what. However, the reality might not be what you think! Most tools are actually pretty rubbish at “understanding” language, and especially the kinds of language used on social media. What happens when you run these analysers over sarcasm, irony, slang, mixed languages, and so on? In this session you can play with some of the GATE tools on social media datasets of different types. We’ll look together at what kind of problems might occur and discuss how these could be resolved. Most likely, we’ll spend our time laughing at funny examples of AI gone wrong.
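To make the problem concrete, here is a toy word-counting sentiment scorer in Python. It is not GATE and not how any particular tool works; the lexicon and the example posts are invented for illustration. It only shows why counting "positive" and "negative" words breaks down on sarcasm and slang.

```python
import re

# Tiny, made-up sentiment lexicon for illustration only.
POSITIVE = {"great", "love", "brilliant", "good"}
NEGATIVE = {"bad", "hate", "awful", "rubbish"}


def naive_sentiment(text):
    """Count positive and negative words; ignores context, tone and slang."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented example posts, not taken from any real dataset.
posts = [
    "I love this phone",
    "Oh great, my train is cancelled again. Brilliant.",  # sarcasm
    "That gig was sick!!",  # slang the lexicon does not know
]
for post in posts:
    print(naive_sentiment(post), "|", post)
```

The sarcastic complaint is scored as positive and the enthusiastic slang as neutral, which is the kind of failure the session sets out to explore.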

Prerequisite: No experience required. Ideally, bring a laptop with GATE installed. gate.ac.uk/download

From Hack & Yacks to Reading Groups: Keeping skills apace Digital Scholarship http://ugainvalence2018.thatcamp.org/2018/05/30/from-hack-yacks-to-reading-groups-keeping-skills-apace-digital-scholarship/ Wed, 30 May 2018 14:22:55 +0000 http://ugainvalence2018.thatcamp.org/?p=237

Type of session: Talk

Title: From Hack & Yacks to Reading Groups: Keeping skills apace Digital Scholarship

Name of session facilitator(s): Nora

Approximate duration: 1 hr

Skill level: All

Proposal:

Research libraries and cultural heritage institutions must adapt to a changing research landscape and invest in developing staff skills and core competencies to match if they are to continue to support and engage with modern scholars effectively. The Digital Curator team at the British Library creates a variety of opportunities all year round for library staff to develop the skills necessary to support emerging areas of modern scholarship, particularly the Digital Humanities (DH).

In this talk I can share a bit about how we approach this through our Digital Scholarship Training Programme, and in turn would love to hear from other campers where and how they keep their skills up to date!

Prerequisite: Just an interest 🙂

 

A simple tutorial using Open Refine to prepare messy historical data to be mapped in Google Fusion tables. http://ugainvalence2018.thatcamp.org/2018/05/30/a-simple-tutorial-using-open-refine-to-prepare-messy-historical-data-to-be-mapped-in-google-fusion-tables/ Wed, 30 May 2018 14:01:52 +0000 http://ugainvalence2018.thatcamp.org/?p=234

Type of session: Play

Title: A simple tutorial using Open Refine to prepare messy historical data to be mapped in Google Fusion tables.

Name of session facilitator(s): Nora

Approximate duration: 1-2 hr

Skill level: All

Proposal:

I can walk through a short exercise we did for colleagues at the British Library as part of our staff Digital Scholarship Training Programme, showing how to use OpenRefine to prepare messy historical data to be mapped. The dataset relates to our Canadian Photographs Collection. We’ll use OpenRefine to extract location names referenced in the image captions, and then Google Fusion Tables to find latitude/longitude and map the results. Participants are more than welcome to recommend, suggest or try other mapping tools with the data provided, at their own pace, and report back to the group!
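For readers who want to see the idea behind the extraction step outside OpenRefine, here is a rough Python sketch. It is not the exercise itself: the captions and the list of place names are invented, and matching against a fixed list is only one possible approach, chosen here for simplicity.

```python
import re

# Hypothetical gazetteer and captions, invented for illustration.
GAZETTEER = ["Niagara Falls", "Vancouver", "Montreal", "Quebec"]


def tidy(caption):
    """Collapse repeated whitespace and strip stray separators at the ends."""
    return re.sub(r"\s+", " ", caption).strip(" ;,")


def find_places(caption, gazetteer=GAZETTEER):
    """Return gazetteer names that occur in the caption, ignoring case."""
    text = tidy(caption).lower()
    return [name for name in gazetteer
            if re.search(r"\b" + re.escape(name.lower()) + r"\b", text)]


captions = [
    "  View of the harbour,   VANCOUVER, B.C. ",
    "Ice bridge at Niagara Falls;",
    "Group portrait of lumbermen",
]
for caption in captions:
    print(tidy(caption), "->", find_places(caption))
```

Captions with no recognised place come back with an empty list, which is a useful signal that they need checking by hand before geocoding.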

Dataset: Picturing Canada Messy Data

Prerequisite: A laptop with OpenRefine installed. A Google account. A printout of Preparing your Data to be MappedRevised and a saved copy of GoogleSheetsGeocodeScript to cut and paste into Google Sheets.

 

Exploring large annotated datasets for interesting information http://ugainvalence2018.thatcamp.org/2018/05/30/exploring-large-annotated-datasets-for-interesting-information/ Wed, 30 May 2018 10:56:50 +0000 http://ugainvalence2018.thatcamp.org/?p=229

Type of session : Play

Title: Exploring large annotated datasets for interesting information

Name of session facilitator: Diana Maynard

Approximate duration: 1 hour

Skill level: beginner

Proposal: 

Come and play with MIMIR, our tools for semantic search and visualisation of annotated data. Ask complex queries over huge amounts of data. For example: which newspapers talked most positively about Europe before the UK referendum? Were regional issues talked about more than national ones by those who wanted to leave? Which male actors born in France have talked about gay marriage in the BBC news? In this session we will use MIMIR to explore several annotated datasets and see what we can find out. We might set some challenges, such as finding the most interesting facts about a topic or answering certain questions the fastest.

Prerequisite: no experience of anything required. Bring a laptop connected to the Internet!

Slides and sample queries for MIMIR: https://gate.ac.uk/tutorials/THATcamp2018.html

Introduction to GATE for text analysis http://ugainvalence2018.thatcamp.org/2018/05/30/introduction-to-gate-for-text-analysis/ Wed, 30 May 2018 10:17:20 +0000 http://ugainvalence2018.thatcamp.org/?p=223

Type of session : Teach/Play

Title: Introduction to GATE for text analysis

Name of session facilitator: Diana Maynard

Approximate duration: 2 hours

Skill level: all

Proposal: 

This session will demonstrate the basics of text analysis tasks such as named entity recognition and sentiment analysis with the open source GATE tools. Participants will be able to use the toolkit to try simple tasks as we go along, such as using some of the existing applications and tools for different languages, and try annotating their own texts. There are dozens of different tools and plugins to try out. Adventurous users can even try building their own simple applications, tinkering with existing ones, or comparing different tools for the same task and evaluating and visualising the results. Demonstration and discussion of the successes and failures will be encouraged!

Prerequisite: no experience of anything required. Bring a laptop with GATE installed (gate.ac.uk/download)

Materials for download: slides and hands-on material (corpora etc) gate.ac.uk/tutorials/THATcamp2018.html

What can you do with this data ?? http://ugainvalence2018.thatcamp.org/2018/05/25/what-can-you-do-with-this-data/ Fri, 25 May 2018 13:53:50 +0000 http://ugainvalence2018.thatcamp.org/?p=220

Type of session : PLAY

Title : What can you do with this data ??

Name of session facilitator(s) : Geraldine + ?

Approximate duration : 2h

Skill level : all

Proposal

One corpus of open data (Text ? CSV ?), five teams.

Each uses open source software (limited choice or free ?) to work on the data set and try to achieve results.

Each team presents these to the rest of the group at the end of the session.

Suggestions for the corpus to use or the tasks to complete ??

Prerequisite : Laptop with internet connection

Open source toolkit for working with social media/networks http://ugainvalence2018.thatcamp.org/2018/05/25/open-source-toolkit-for-working-with-social-media-networks/ Fri, 25 May 2018 13:49:24 +0000 http://ugainvalence2018.thatcamp.org/?p=218

Type of session : MAKE

Title : Open source toolkit for working with social media/networks

Name of session facilitator(s) : Geraldine

Approximate duration : 45 min ?

Skill level : all

Proposal

Social media and networks have in recent years emerged as an El Dorado for research, be it academic or commercial. The promises are attractive, yet dealing with the data they make available can seem daunting, as illustrated by Monroe in his article entitled ‘The Five Vs of Big Data Political Science’. To the challenges related to volume, Monroe adds velocity and variety, but also vinculation and validity. Indeed, the data available on social networks is heterogeneous, constantly evolving and interrelated, which raises issues pertaining to collection, storage and usage.

The session could provide an opportunity to draw a list of open source tools currently available to address those various challenges.

Prerequisite : none

How to limit black box effects in collaborative projects http://ugainvalence2018.thatcamp.org/2018/05/25/how-to-limit-black-box-effects-in-collaborative-projects/ Fri, 25 May 2018 13:47:22 +0000 http://ugainvalence2018.thatcamp.org/?p=216

Type of session : TALK

Title : How to limit black box effects in collaborative projects

Name of session facilitator(s) : Geraldine + Javier ?

Approximate duration : 30 min ?

Skill level : all

Proposal

“In science, computing, and engineering, a black box is a device, system or object which can be viewed in terms of its inputs and outputs without any knowledge of its internal workings” (Wikipedia) and “a complicated electronic device whose internal mechanism is usually hidden from or mysterious to the user; broadly : anything that has mysterious or unknown internal functions or mechanisms” (Merriam-Webster).

In research labs as well as in many companies today, computer scientists work with non-specialists in computing who delegate to them tasks they are unable to perform on their own. Such a collaboration can lead to a productive partnership on both sides, but dialogue between individuals and teams with different backgrounds, skills and methodologies can be challenging. One of those challenges is the black box effect. For a non-specialist in computing, how much does one need to understand of the mechanisms involved in the automated processes to guarantee the scientific validity of the results ? What level of training is necessary, and in what ? For a computer scientist working with non-specialists, how can those processes be made understandable ? Is drawing up a step-by-step summary of those tasks realistic ? Which tools could make it more manageable ?
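One possible way of producing such a step-by-step summary, offered here only as a starting point for discussion, is to have the processing code describe itself as it runs. The Python sketch below is an assumption-laden toy: the function `run_pipeline`, the three steps and the input texts are all invented for illustration.

```python
def run_pipeline(data, steps):
    """Apply each (description, function) step in order and log what happened."""
    log = []
    for number, (description, func) in enumerate(steps, start=1):
        before = len(data)
        data = func(data)
        log.append(f"Step {number}: {description} ({before} -> {len(data)} items)")
    return data, log


# Hypothetical processing steps on an invented list of short texts.
steps = [
    ("remove empty texts", lambda texts: [t for t in texts if t.strip()]),
    ("lowercase everything", lambda texts: [t.lower() for t in texts]),
    ("drop exact duplicates", lambda texts: list(dict.fromkeys(texts))),
]

result, summary = run_pipeline(["Hello", "", "hello", "World"], steps)
print("\n".join(summary))
```

The printed summary is written in plain language and records how many items each step kept, so a non-specialist can see where data was changed or discarded without reading the code.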

Prerequisite : none
