Text Mining – THATCamp UGA in Valence 2018 http://ugainvalence2018.thatcamp.org Thu, 21 Jun 2018 08:38:55 +0000

Why is social media hard to analyse? http://ugainvalence2018.thatcamp.org/2018/05/30/why-is-social-media-hard-to-analyse/ Wed, 30 May 2018 15:45:23 +0000 http://ugainvalence2018.thatcamp.org/?p=241

Type of session: Play

Title: Why is social media hard to analyse?

Name of session facilitator: Diana Maynard

Approximate duration: 1-2 hours

Skill level: beginner

Proposal: 

Tools for analysing tweets and other kinds of social media are everywhere these days, allowing us to understand what kinds of opinions are being expressed and who is talking about what. However, the reality might not be what you think! Most tools are actually pretty rubbish at “understanding” language, and especially the kinds of language used on social media. What happens when you run these analysers over sarcasm, irony, slang, mixed languages, and so on? In this session you can play with some of the GATE tools on social media datasets of different types. We’ll look together at what kind of problems might occur and discuss how these could be resolved. Most likely, we’ll spend our time laughing at funny examples of AI gone wrong.

Prerequisite: No experience required. Ideally, bring a laptop with GATE installed (gate.ac.uk/download).
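
To make the sarcasm problem in the proposal concrete, here is a minimal sketch of a word-counting sentiment scorer. It is not GATE and not any tool from the session: the word lists, the function name `naive_sentiment`, and the example tweet are all invented for illustration. It only shows why an approach that ignores context can misread a sarcastic complaint.

```python
# Toy word lists, invented for this illustration; not taken from GATE.
POSITIVE = {"great", "love", "brilliant", "thanks"}
NEGATIVE = {"hate", "awful", "delayed", "broken"}


def naive_sentiment(text):
    """Count positive and negative words, ignoring context entirely."""
    # Lowercase each token and strip surrounding punctuation.
    words = [w.strip(".,!?#@\"'").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A sarcastic complaint that a human reader would take as negative.
sarcastic = "Oh great, another cancelled train. Love it, thanks a lot!"
print(naive_sentiment(sarcastic))
```

The scorer sees "great", "love" and "thanks" and labels the tweet positive, because nothing in a word count captures the speaker's intent. Slang and mixed-language posts fail in a similar way when their words are simply missing from the lists.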

Exploring large annotated datasets for interesting information http://ugainvalence2018.thatcamp.org/2018/05/30/exploring-large-annotated-datasets-for-interesting-information/ Wed, 30 May 2018 10:56:50 +0000 http://ugainvalence2018.thatcamp.org/?p=229

Type of session: Play

Title: Exploring large annotated datasets for interesting information

Name of session facilitator: Diana Maynard

Approximate duration: 1 hour

Skill level: beginner

Proposal: 

Come and play with MIMIR, our tool for semantic search and visualisation of annotated data. Ask complex queries over huge amounts of data. For example: which newspapers talked most positively about Europe before the UK referendum? Were regional issues talked about more than national ones by those who wanted to leave? Which male actors born in France have talked about gay marriage in the BBC news? In this session we will use MIMIR to explore several annotated datasets and see what we can find out. We might issue some challenges, such as finding the most interesting facts about a topic or answering certain questions the fastest.

Prerequisite: no experience of anything required. Bring a laptop connected to the Internet!

Slides and sample queries for MIMIR: https://gate.ac.uk/tutorials/THATcamp2018.html

Introduction to GATE for text analysis http://ugainvalence2018.thatcamp.org/2018/05/30/introduction-to-gate-for-text-analysis/ Wed, 30 May 2018 10:17:20 +0000 http://ugainvalence2018.thatcamp.org/?p=223

Type of session: Teach/Play

Title: Introduction to GATE for text analysis

Name of session facilitator: Diana Maynard

Approximate duration: 2 hours

Skill level: all

Proposal: 

This session will demonstrate the basics of text analysis tasks such as named entity recognition and sentiment analysis with the open source GATE tools. Participants will be able to use the toolkit to try simple tasks as we go along, such as running some of the existing applications and tools for different languages and annotating their own texts. There are dozens of different tools and plugins to try out. Adventurous users can even try building their own simple applications, tinkering with existing ones, or comparing different tools for the same task and evaluating and visualising the results. Demonstration and discussion of the successes and failures will be encouraged!

Prerequisite: no experience of anything required. Bring a laptop with GATE installed (gate.ac.uk/download).

Materials for download: slides and hands-on material (corpora etc.): https://gate.ac.uk/tutorials/THATcamp2018.html
