People are biased. And computers learn from people.

That means our data is biased, and in a big data world, that can cause big problems.

But researchers are finding ways to turn down the bias in a dataset. We’re talking to two of them on this episode of Talk the Talk.


You can listen to all the episodes of Talk the Talk by pasting this URL into your podlistener.

http://danielmidgley.com/talkthetalk/talk_classic.xml

Promo

Promo with Danae Gibson, 2018-09-10: Bias

Full interview

Interview with Robyn Speer and Kai-Wei Chang at ACL2018 (complete)

A lot of attention has been focused lately on bias in big data. The short version: People are biased, data comes from people, so the data is biased. And that means that our computational tools may come up with answers that exclude people, marginalise people, or might be just plain wrong. So can we fix this?

Daniel caught up with Robyn Speer (ConceptNet) and Kai-Wei Chang (UCLA) at the 2018 conference of the Association for Computational Linguistics in Melbourne. They work on reducing bias in data, and they explain how it all works.

Thanks to Kai-Wei and Robyn for the chat, and thanks to the ACL for making this interview possible.

Also at https://www.patreon.com/posts/21301041


Cutting Room Floor

Cutting Room Floor 337: Bias

Hedvig gives us a brain teaser: In Swedish, adding “your” to anything makes it an insult, as in “your idiot” or even “your linguist”. But why?

There’s a bonus quiz for Kylie and Hedvig about which way certain words are biased. Who will prevail?

And Hedvig reveals a secret technique she uses for ferreting out her bias.

Also at https://www.patreon.com/posts/21463476


Patreon supporters

Our Patreon patrons are helping us make the show better — and keeping it ad-free and on the airwaves. They include:

  • Jerry
  • Nicki
  • Termy
  • Ann
  • Helen
  • Jack
  • Matt
  • Sabrina

Thanks to all our patrons! Your support means a lot.

We’re Because Language now, and you can become a Patreon supporter!
Depending on your level, you can get bonus episodes, mailouts, and shoutouts, come to live episodes, and of course join our Discord community.

Show notes

AI sucks at stopping online trolls spewing toxic comments
https://www.theregister.co.uk/2018/08/31/ai_toxic_comments/

Gröndahl et al.: All You Need is “Love”: Evading Hate Speech Detection (PDF)
https://arxiv.org/pdf/1808.09115.pdf

Lee, Cho, and Hofmann: Fully Character-Level Neural Machine Translation without Explicit Segmentation
https://arxiv.org/abs/1610.03017

Mehl et al.: Are Women Really More Talkative Than Men?
http://science.sciencemag.org/content/317/5834/82

Computational linguistics reveals pervasive gender bias in modern English novels
https://www.technologyreview.com/s/611820/computational-linguistics-reveals-pervasive-gender-bias-in-modern-english-novels/

ConceptNet Numberbatch 17.04: better, less-stereotyped word vectors
http://blog.conceptnet.io/posts/2017/conceptnet-numberbatch-17-04-better-less-stereotyped-word-vectors/

I Am Part of the Resistance Inside the Trump Administration
https://www.nytimes.com/2018/09/05/opinion/trump-white-house-anonymous-resistance.html

What is a lodestar, the word from The New York Times Op-Ed people can’t stop talking about?
https://www.usatoday.com/story/news/nation-now/2018/09/06/new-york-times-editorial-lodestar-defined/1210402002/

Language Log: Lodestar
http://languagelog.ldc.upenn.edu/nll/?p=39910

Etymonline: lodestar (n.)
https://www.etymonline.com/word/lodestar

Counsellors dismissed as ‘gender whisperers’ deny teachers have been trained to spot transgender children
https://www.smh.com.au/politics/federal/counsellors-dismissed-as-gender-whisperers-deny-teachers-have-been-trained-to-spot-transgender-children-20180905-p501zd.html

We Fact-Checked The Daily Telegraph’s Rubbish About “Gender Whisperers” And Trans Kids
http://junkee.com/scott-morrison-gender-whisperer/174136

Prime Minister’s ‘gender whisperer’ comments deeply offensive and divisive
https://www.news.com.au/lifestyle/parenting/school-life/prime-ministers-gender-whisperer-comments-deeply-offensive-and-divisive/news-story/ca6cfafce5a713d0e3deac56897b922a

Scott Morrison confronted by transgender child on The Project
https://www.news.com.au/entertainment/tv/current-affairs/scott-morrison-confronted-by-transgender-child-on-the-project/news-story/b896352bda24f8147934d5ecc906e3f0

The oleaginous Mike Pence, with his talent for toadyism and appetite for obsequiousness, could, Trump knew, become America’s most repulsive public figure.

George Will, Trump is no longer the worst person in government, Washington Post

George Will really doesn’t like ‘oleaginous’ Mike Pence, but he loves big words
https://www.marketwatch.com/story/george-will-really-doesnt-like-oleaginous-mike-pence-but-he-loves-big-words-2018-05-10

Fire Devastates Brazil’s Oldest Science Museum
https://www.nationalgeographic.com/science/2018/09/news-museu-nacional-fire-rio-de-janeiro-natural-history/

The irreplaceable scientific treasures lost in Brazil’s National Museum blaze
https://elpais.com/elpais/2018/09/07/inenglish/1536314750_865530.html

Brazil’s Museum Fire Proves Cultural Memory Needs A Digital Backup
https://www.wired.com/story/brazil-museum-fire-digital-archives/

Think the museum fire in Brazil can’t happen here? Think again
http://www.latimes.com/opinion/op-ed/la-oe-mccormack-brazil-museum-fire-funding-20180909-story.html


Transcript

We’re working our way back through the archives. If you think we should prioritise a transcript of this episode, let us know!