Filling the AI gap: languages and the humanities are central to the future of AI — this Berkeley group is broadening that conversation

There has never been a better time for the human centered disciplines and especially academics who study language to contribute to the understanding and development of generative AI systems.
William Allison, Chief Technology Officer, UC Berkeley
January 16, 2024

On February 28, the Berkeley Language Center and the Language and AI working group at the Townsend Center for the Humanities will host a half-day conference, Language and AI: Generating Interdisciplinary Connections and Possibilities. The conference, conceived and launched by Emily Hellmich and Kimberly Vinall of the Berkeley Language Center, as well as Kayla van Kooten, a graduate student in the Department of German, intends to bring together perspectives from research, industry, and theory in order to explore how language/culture learning intersects with AI.

For Kayla, the inspiration for the conference developed out of frustration that there wasn't broader collaboration happening between language/humanities students and STEM-based students. She’s hoping that this first event will be a calling card for collaboration since it's hard to know what research is happening across such a large campus.

“I started grad school at Berkeley wanting to explore the connection between technology and language. So many people I knew from my undergrad who studied languages eventually went on to work in tech, doing things like programming Alexa or training Google Translate,” Kayla said. “At Berkeley, I see all these cool events happening on campus, but they're all just STEM people talking about STEM things. If there is an inclusion of humanities, so far, it’s only been in conversations about AI and ethics, but not around language, which is proving in all fields to be at the heart of AI. I want to see more discussions about the role of language in large language models, and connect with folks across Berkeley so we can collaborate and utilize each other's strengths.”

The conference features a number of perspectives. Featured speakers include David Bamman, Associate Professor, School of Information; Timothy Tangherlini, Professor, Department of Scandinavian; and Brock Imel, French PhD, NLP Manager and Prompt Engineer at Writer, among other panelists. The introduction will be given by UC Berkeley Chief Technology Officer William Allison, who just launched the UC Berkeley AI Community.

Allison, who has been exploring AI research across many UC campuses, feels that the humanities and languages should be a central part of AI research. "There has never been a better time for the human centered disciplines and especially academics who study language to contribute to the understanding and development of generative AI systems," said Allison. "Bringing in deep expertise from outside of STEM disciplines at universities like UC Berkeley, to collaborate in engaging, interrogating and developing AI may afford just the kinds of fresh perspectives that will be needed to evolve the technology and society's collective response to it."

To learn more about the inspirations behind the conference and the research happening around language and AI at Berkeley, we sat down with Kimberly Vinall, Executive Director, Berkeley Language Center; Emily Hellmich, Associate Director, Berkeley Language Center; and Kayla van Kooten, Ph.D. student in the Department of German. Read their interview below.

We’ve already seen that these AI tools have the ability to learn how to program code, because it is a skill set that can be replicated. Something you really cannot replicate, yet at least, is the ability to think critically, to see things from different perspectives, and that's so valuable, to be able to engage on these different levels. — Emily Hellmich


What role do you think the humanities have in the development and use of tools like ChatGPT? Why is it essential for the humanities to be a central contributor to conversations around these tools?

Kimberly: The humanities ask fundamental questions about meaning making through and in language, in addition to other semiotic resources, and I think that we need to be asking some of these same questions of these technologies. So for me, that's just one example of why the humanities and specifically the study of and about language(s) are essential to this conversation.

Emily: I think the humanities and languages teach you how to look at things from multiple perspectives, whether looking at a grammatical problem or looking at a complex issue. We’ve already seen that these AI tools have the ability to learn how to program code, because it is a skill set that can be replicated. Something you really cannot replicate, yet at least, is the ability to think critically, to see things from different perspectives, and that's so valuable, to be able to engage on these different levels.

Kayla: When I was an undergrad, I studied Arabic and majored in Middle Eastern Languages and Cultures and was involved in a digital humanities project that was a collaboration between students of Arabic and informatics students. We were digitizing travel diaries from Iraq, and the informatics students were handling the majority of the technical side of things. However, during the digitization process, most of the informatics students didn’t know that Arabic was written right to left, which caused a lot of problems in transcription. People who are exclusively looking at these technologies through a purely STEM perspective are going to potentially miss really essential information that has a much larger impact. 

To engage these broader questions requires developing these digital literacies to be able to understand the affordances and limitations of these technologies and the ideological layers that are involved. — Kimberly Vinall

You each talk a lot about digital literacy. Can you explain what that means and how the humanities and languages, specifically, can help address digital literacy? How does this show up in the classroom? 

Kimberly: I think about digital literacies as a continuum from the functional to the critical, the functional being about the tool itself, its affordances and limitations, and the critical looking at questions of the ideologies embedded in the technologies that one needs to understand in any instance of their use. In many ways, these technologies alter how we engage with texts and language, what meanings are expressed and how, and what decisions are being made based on these understandings. To engage these broader questions requires developing these digital literacies to be able to understand the affordances and limitations of these technologies and the ideological layers that are involved.

Students are already being asked to make ethical decisions here at Berkeley in their classrooms. They’re thinking, “Do I use ChatGPT? Do I not? And what’s at stake?” After having conversations with undergraduate students who are struggling with ChatGPT, one of the big issues for them is the realization that they will be using this technology in their jobs and in their personal lives in the future. In other words, these same ethical decisions are going to come up again and again. In fact, that's one of the debates that's happening right now in education: is it the responsibility of the university to teach students how and when to use these tools? I think it is. They need to be prepared for these ethical and ideological decisions that they will be encountering on a daily basis in ways that I, when I was an undergraduate, could not have imagined.

Kayla: These tools are bringing up concerns in the classroom from students and instructors alike. I think this is where digital literacy comes in and will be a way forward academically. It is unrealistic to expect students to not engage with them in any way, so we need to be actively teaching digital literacy in the classroom.

My philosophy is that these tools are not going to go away. They're always going to be around and you can ban them in the classroom all you want, but students are still going to use them. That being said, we cannot expect that every student who comes into the classroom today is going to critically engage with these tools. So, how can we teach students to use these tools in meaningful ways rather than just telling them they cannot use them? 

I think at Berkeley there are a lot of professors who are engaging with digital literacy in the classroom, such as Alex Saum Pascual in Spanish & Portuguese. In a class I took with her in the fall, she brought in Emma Fraser from Media Studies for a guest lecture and workshop where we explored the capabilities of ChatGPT through strategic language use and prompt generation, even attempting to get it to write fiction. We analyzed how the language we used in the prompt can inform the results and generate particular pieces of information. 

Emily: When I think about digital literacies, because they're plural, it's different tools, different skills, different mindsets, different competencies that you need in order to accomplish particular goals. What helps a lot of instructors, I think, who are maybe nervous about what foundational canons might be disrupted by these tools is the fact that you can use digital tools to support the skills that students need to build. Some of the same skill sets that you use in, say, analytical thinking or a close reading, can be cultivated in a digital world. 

If someone is studying the inclusion of ASL in natural language processing with Daniel Klein (EECS, statistical natural language processing) and isn’t hanging out with the people in the humanities studying and researching ASL, we’re all missing out on that connection and collaboration. — Kayla van Kooten

Kayla, turning to you, you’re a PhD student in the German Department, co-founder of the Townsend Center working group Language and AI, and one of the organizers of this upcoming conference. Can you tell me about your background? 

Kayla: I got my undergraduate degree from the University of Washington, where I majored in International Studies and Middle Eastern Languages and Cultures, and graduated in 2020 into a terrible job market. During a Fulbright year in Germany after graduation, I was exploring alternative career options in the fields of translation and tech. I was connected with a number of people who were language graduates working in tech in various positions. When I would ask how they learned the technical skills that got them their job, it was always self-taught.

So, I kind of felt like there was this gap between what I could do and what I currently can do, with no specific path to get me there. Not everyone can or wants to teach themselves to code, and not everyone has the self-discipline or resources to do that. Hoping to bridge the gap between what I could do and what I currently can do, I matriculated at Berkeley. Here, I met Kimberly and Emily, which was so great because I finally met people who were thinking about the same problems and questions regarding language and technology that I was, particularly in academic contexts.

This will be the first ever language-focused AI panel at Berkeley. How did this conference come about and what do you hope comes from this event?

Kayla: This conference came about in collaboration between Kimberly, Emily and me. We identified a need for more engagement with languages in AI, and with this event we aim to develop a broader community of practice and research at Berkeley across all fields. For example, if someone is studying the inclusion of ASL in natural language processing with Daniel Klein (EECS, statistical natural language processing) and isn’t hanging out with the people in the humanities studying and researching ASL, we’re all missing out on that connection and collaboration. I am constantly learning about so many different people and projects that I would never know existed if I wasn't actively searching for them, but I’m only one person. I’m hoping this conference is the first of many, and will act as a calling card to bring together people engaging with language and AI.

Who is speaking and how did you plan the theme? 

Kayla: The theme emerged quite naturally from our impression that there was a lack of cross-disciplinary collaboration between the languages and the more technical fields. We identified research, industry, and theory as pillars to explore the intersection of artificial intelligence and language. Through these pillars, we hope to showcase the work being done by different people in machine learning and language/culture study in the UC Berkeley community. We intentionally chose people from various fields and diverse backgrounds in order to enrich the conversation and push our thinking in new directions.

The research panel will consist of David Bamman, Associate Professor, School of Information; Rick Kern, Professor, French Department; Claudia von Vacano, Executive Director, D-Lab and the Digital Humanities; and Emily Hellmich, Associate Director, Berkeley Language Center.

Theory: Timothy Tangherlini, Professor, Dept of Scandinavian, Assoc. Director, Berkeley Institute for Data Science; Kent Chang, PhD student, Berkeley School of Information; Catherine Flynn, Associate Professor, Department of English; Ben Spanbock, PhD, College Writing Programs; and Kimberly Vinall, Executive Director, Berkeley Language Center.

Industry: Brock Imel, UCB French PhD, Director, Customer Language Engineering at Writer; Margaret Kolb, PhD, College of Engineering; Cristina Farronato, PhD, Department of Italian Studies; Kayo Yin, PhD student, Computer Science; and Kayla van Kooten, PhD student, Department of German.

We’re really excited that the UC Berkeley Chief Technology Officer William Allison, who just launched a community of practice around AI, is going to introduce the panel.

Stay tuned this week for more information on the time, location, and RSVP.