  {"id":40204,"date":"2023-04-26T10:59:14","date_gmt":"2023-04-26T15:59:14","guid":{"rendered":"https:\/\/uwm.edu\/libraries\/?page_id=40204"},"modified":"2026-02-25T13:17:40","modified_gmt":"2026-02-25T19:17:40","slug":"lgbtq-audio-archive-mining-project","status":"publish","type":"page","link":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/","title":{"rendered":"LGBTQ+ Audio Archive Mining Project"},"content":{"rendered":"\n<div class=\"uwm-p-tabs\" data-tabs-prefix-class=\"uwm-p-tabs\"><ul class=\"uwm-p-tablist\" data-hx=\"h2\"><li class=\"uwm-p-tablist--item\"><a href=\"#tab-4cqh-home\" class=\"uwm-p-tablist--link\">Home<\/a><\/li><li class=\"uwm-p-tablist--item\"><a href=\"#tab-4cqh-project-team\" class=\"uwm-p-tablist--link\">Project Team<\/a><\/li><li class=\"uwm-p-tablist--item\"><a href=\"#tab-4cqh-proposal\" class=\"uwm-p-tablist--link\">Proposal<\/a><\/li><li class=\"uwm-p-tablist--item\"><a href=\"#tab-4cqh-collections\" class=\"uwm-p-tablist--link\">Collections<\/a><\/li><li class=\"uwm-p-tablist--item\"><a href=\"#tab-4cqh-updates\" class=\"uwm-p-tablist--link\">Updates<\/a><\/li><\/ul><div class=\"uwm-p-tabcontent\">\n<div id=\"tab-4cqh-home\" class=\"uwm-p-tabcontent--pane\">\n<div class=\"uwm-p-slider uwm-p-slider--dots-outside\"><div class=\"uwm-p-slider--base\"><figure class=\"uwm-c-img--caption-gray\"><img decoding=\"async\" src=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg\" alt=\"\" title=\"\" loading=\"lazy\" width=\"750\" height=\"500\" \/><figcaption>Tri Cable Tonight Lesbians of Color Support Group Speaker<\/figcaption><\/figure><figure class=\"uwm-c-img--caption-gray\"><img decoding=\"async\" src=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-Tonight-31.jpg\" alt=\"\" title=\"\" loading=\"lazy\" width=\"750\" height=\"500\" \/><figcaption>Tri Cable Tonight 1989 
Stonewall 20th Anniversary Segment<\/figcaption><\/figure><figure class=\"uwm-c-img--caption-gray\"><img decoding=\"async\" src=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-Tonight-33.jpg\" alt=\"\" title=\"\" loading=\"lazy\" width=\"750\" height=\"500\" \/><figcaption>Tri Cable Tonight Gay Liberation Parade Representatives<\/figcaption><\/figure><figure class=\"uwm-c-img--caption-gray\"><img decoding=\"async\" src=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-Tonight-32.jpg\" alt=\"\" title=\"\" loading=\"lazy\" width=\"750\" height=\"500\" \/><figcaption>Tri Cable Tonight&#8211;Historic Madison Pride March<\/figcaption><\/figure><figure class=\"uwm-c-img--caption-gray\"><img decoding=\"async\" src=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/TRI-Cable-News-1.jpg\" alt=\"\" title=\"\" loading=\"lazy\" width=\"750\" height=\"500\" \/><figcaption>TRI Cable Tonight 1987 Community Cable News Report<\/figcaption><\/figure><figure class=\"uwm-c-img--caption-gray\"><img decoding=\"async\" src=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-Tongiht-16.jpg\" alt=\"\" title=\"\" loading=\"lazy\" width=\"750\" height=\"500\" \/><figcaption>Tri Cable Tonight&#8211;Lesbians of Color News Segment<\/figcaption><\/figure><\/div><\/div>\n\n\n\n<p>LGBTQ+ history has often been hidden away. But we can bring that history out into the open\u2013and you can play a part!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a0\">The Issue<\/h2>\n\n\n\n<p>The UWM Libraries house important archives holding historical and contemporary LGBTQ+ materials. Included are rich records of LGBTQ+ communities in Milwaukee, Wisconsin, and the Midwest generally. Not only do these archives contain textual documents such as community newsletters, advocacy group records, and personal letters\u2013they also contain audiovisual materials. 
Examples include local television news and radio broadcasts, early LGBTQ+ community cable programming, and video-recorded oral histories.<\/p>\n\n\n\n<p>Working with archives is fascinating, because these are primary sources that can be full of surprises. Reading and listening, the researcher makes new discoveries: a handwritten note on a news clipping might yield new insights. A recording of an old news broadcast on high schools opening might make a surprising quick reference to a new student Gay-Straight Alliance having formed.<\/p>\n\n\n\n<p>But what makes archival research fascinating is also what makes it frustrating, because often a user will have little idea of what is in the archive. Some textual sources like newspapers may have been digitized and processed using \u201cOCR\u201d\u2013Optical Character Recognition\u2013so you can use search terms to find materials you are interested in. But what about audiovisual materials? The UWM Library archives contain many digitized videos and audio recordings that users can access, but obviously they can\u2019t just be run through a text-recognition package to make their content searchable. So those using the audiovisual archives have had to rely on the terse descriptions each item was given when it was added to the collection, which necessarily only provide a broad outline of the actual contents of the item.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a1\">The Solution<\/h2>\n\n\n\n<p>This is where the&nbsp;<em>LGBTQ+ Audio Archive Mining Project<\/em>&nbsp;comes in. Members of this project team are developing ways to automatically generate searchable text transcripts of the audiovisual materials in the archive. 
More than this: the team is using open-source software to create tools that will allow users like you to visualize patterns across the texts, such as how often various words appear, and relationships between terms that get used.<\/p>\n\n\n\n<p>The end goal is that users will be able to search for terms that interest them\u2013like \u201cbisexual\u201d or \u201cdomestic partner\u201d\u2013in novel places, like recordings of local news broadcasts not labeled as containing LGBTQ+ content. Users can find out how selected terms may have been used differently in the same timeframe in different contexts\u2013were mainstream local news broadcasts mostly using the term \u201chomosexual\u201d at a time when community news sources were using \u201clesbian and gay\u201d? A researcher could trace the uneven fading from usage of terms like \u201ctransvestite\u201d alongside the growing usage of terms like \u201ctransgender\u201d. And a user can trace the relationships between words. Are terms for LGBTQ+ people appearing in close proximity to terms identifying people of color? How does this change over time?<\/p>\n\n\n\n<p>The&nbsp;<em>LGBTQ+ Audio Archive Mining Project<\/em>&nbsp;aims to allow academic researchers and community members to discover new information about LGBTQ+ history. 
It also aims to make the tools it is creating\u2013and the processes for developing them\u2013available to all, so that they can be used to mine information about other communities, in other archives.<\/p>\n<\/div>\n\n\n\n<div id=\"tab-4cqh-project-team\" class=\"uwm-p-tabcontent--pane\">\n<p><strong>Project Leads<\/strong>:&nbsp;<em><strong>Ann Hanlon<\/strong><\/em>, Head, Digital Collections &amp; Initiatives and Digital Humanities Lab, UWM Libraries;&nbsp;<em><strong>Dan Siercks<\/strong><\/em>, Interim Director, Web &amp; Data Services, UWM College of Letters &amp; Science<\/p>\n\n\n\n<p><strong>Disciplinary Scholar<\/strong>:&nbsp;<em><strong>Cary Costello<\/strong><\/em>, Associate Professor, Department of Sociology and Director, LGBT Studies Program, UWM College of Letters &amp; Science<\/p>\n\n\n\n<p><strong>Senior Administrator<\/strong>:&nbsp;<em><strong>Marcy Bidney<\/strong><\/em>, Assistant Director of Libraries for Distinctive Collections and Curator of the American Geographical Society Library, UWM Libraries<\/p>\n\n\n\n<p><strong>Team<\/strong>:&nbsp;<em><strong>Shiraz Bhathena<\/strong><\/em>, Digital Archivist, UWM Libraries;&nbsp;<em><strong>Jie Chen<\/strong><\/em>, Application Specialist, UWM Libraries;&nbsp;<em><strong>Karl Holten<\/strong><\/em>, Information Systems Specialist, UWM Libraries and College of Letters &amp; Science;&nbsp;<em><strong>Ling Meng<\/strong><\/em>, Digital Collections Librarian, UWM Libraries<\/p>\n<\/div>\n\n\n\n<div id=\"tab-4cqh-proposal\" class=\"uwm-p-tabcontent--pane\">\n<div data-wp-interactive=\"core\/file\" class=\"wp-block-file\"><object data-wp-bind--hidden=\"!state.hasPdfPreview\" hidden class=\"wp-block-file__embed\" data=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/uwm_collsdata-proposal_public_2019.pdf\" type=\"application\/pdf\" style=\"width:100%;height:600px\" aria-label=\"Embed of uwm_collsdata-proposal_public_2019.\"><\/object><a 
id=\"wp-block-file--media-12c09dce-8e35-4444-b90a-151f044a4602\" href=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/uwm_collsdata-proposal_public_2019.pdf\">uwm_collsdata-proposal_public_2019<\/a><a href=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/uwm_collsdata-proposal_public_2019.pdf\" class=\"wp-block-file__button wp-element-button\" download aria-describedby=\"wp-block-file--media-12c09dce-8e35-4444-b90a-151f044a4602\">Download<\/a><\/div>\n<\/div>\n\n\n\n<div id=\"tab-4cqh-collections\" class=\"uwm-p-tabcontent--pane\">\n<p><strong>ACT UP Milwaukee Records<\/strong>:&nbsp;<a href=\"https:\/\/uwm.edu\/lib-collections\/actup\/\">Digital Collection<\/a>&nbsp;|&nbsp;<a href=\"https:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0203\">Finding Aid<\/a><br>Collection documents the history and activism of the Milwaukee chapter of ACT UP (AIDS Coalition to Unleash Power). It includes meeting agendas and minutes, newsletters, press releases, flyers and other ephemera, grant applications, financial reports, subject files, and videos relating to ACT UP Milwaukee demonstrations. The collection also contains documentation regarding the ACT UP Network, an information clearinghouse for ACT UP chapters across the United States and Europe. ACT UP Milwaukee administered the Network.<\/p>\n\n\n\n<p><strong>AIDS Resource Center of Wisconsin Records:&nbsp;<\/strong><a href=\"https:\/\/uwm.edu\/lib-collections\/arcw\/\">Digital Collection<\/a>&nbsp;|&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0335\">Finding Aid<\/a><br>Founded in 1985, the AIDS Resource Center of Wisconsin (ARCW) is a national model for providing comprehensive, integrated health and social services to HIV patients. 
The collection documents the development of ARCW from a small provider of limited social services to the largest provider of HIV-health care services in Wisconsin and the most comprehensive AIDS service organization in the United States. It includes administrative records relating to governmental advocacy, annual reports, financial statements, strategic plans, board meeting minutes, and originating documents. It also contains photographs, posters, and audiovisual material related to the fight against HIV\/AIDS in Wisconsin.<\/p>\n\n\n\n<p><strong>Cream City Foundation Records<\/strong>:&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0205\">Finding Aid<\/a><br>Founded in 1982, Cream City Foundation (CCF) is the only non-profit, grant making, community-based foundation serving the entire State of Wisconsin whose sole purpose is to support the changing needs of the lesbian, gay, bisexual and transgender (LGBT) communities. The collection contains correspondence, board meeting minutes, newsletters, financial information, publicity materials, articles of incorporation, and annual reports. It also contains grant applications from over eighty LGBT organizations, many of which are now inactive. These applications are likely the only surviving documentation of many of these groups. Examples of educational materials produced with CCF support are included in the collection.<\/p>\n\n\n\n<p><strong>Gay People\u2019s Union Records<\/strong>:&nbsp;<a href=\"https:\/\/search.library.wisc.edu\/digital\/AGPU\">Digital Collection<\/a>&nbsp;|&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0240\">Finding Aid<\/a><br>Collection consists of the records of the Gay Peoples Union (GPU), the first gay rights organization in Milwaukee, Wisconsin and one of the earliest such groups in the state. 
Collection documents the history of&nbsp;GPU&nbsp;from its beginnings as a University of Wisconsin-Milwaukee student organization to its development as the most important gay and lesbian rights organization in Milwaukee in the 1970s. The collection also includes audio recordings of&nbsp;Gay Perspective, a radio program produced by&nbsp;GPU&nbsp;and broadcast on local radio stations from 1971 to 1972, and other public presentations given by&nbsp;GPU&nbsp;members as part of the organization\u2019s educational mission.<\/p>\n\n\n\n<p><strong>James Liddy Papers:&nbsp;<\/strong><a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0300\">Finding Aid<\/a><br>The collection documents James Liddy\u2019s life as a poet and professor. Liddy was born and raised in Ireland, and after briefly practicing law, he turned to a life of poetry. He moved to San Francisco in 1967 and began teaching poetry and English at various institutions across the U.S. before finally settling down at the University of Wisconsin-Milwaukee in 1976, where he taught for over 30 years. The collection contains primarily correspondence, literary papers, and general files. Some of his works include:&nbsp;<em>In a Blue Smoke<\/em>&nbsp;(1964),&nbsp;<em>Baudelaire\u2019s Bar Flowers<\/em>&nbsp;(1975),&nbsp;<em>A White Thought in a White Shade<\/em>&nbsp;(1987),&nbsp;<em>Collected Poems<\/em>&nbsp;(1994), and&nbsp;<em>The Doctor\u2019s House<\/em>&nbsp;(2004).<\/p>\n\n\n\n<p><strong>Milwaukee Gay\/Lesbian Network Records<\/strong>:&nbsp;<a href=\"https:\/\/uwm.edu\/lib-collections\/mglcn\/\">Digital Collection<\/a>&nbsp;|&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0206\">Finding Aid<\/a><br>Collection consists of regular and special programming produced by the Milwaukee Gay\/Lesbian Cable Network (MGLCN) from 1987 to 1994. 
MGLCN was established by a group of individuals who wanted to produce regular programming on local gay and lesbian issues using the newly available facilities of MATA (Milwaukee Access Telecommunications Authority) Community Media. MGLCN produced Tri-Cable Tonight, a monthly news and entertainment program; the New Tri-Cable Tonight, a panel discussion program; and Yellow on Thursday, a comedy show featuring shorts, skits, and parodies.<\/p>\n\n\n\n<p><strong>Milwaukee PrideFest Records<\/strong>:&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0315\">Finding Aid<\/a><br>Milwaukee\u2019s first Pride event occurred when the Gay Liberation Front (GLF) organized a \u201cGay Pride Week\u201d in January 1971. The event was repeated in 1972 and 1973. While there were sporadic \u201cgay days\u201d and picnics in the late 1970s and 1980s, Milwaukeeans typically traveled to Chicago to observe or march in that city\u2019s celebrations. However, Pride events were held in Milwaukee in both 1980 and 1981. The Milwaukee Lesbian\/Gay Pride Committee (MLGPC) held its first annual pride event in 1988. In 1994 MLGPC was dissolved and PrideFest, Inc. was created. PrideFest has been held at various locations during its history, including Mitchell Park (1988), Cathedral Square Park (1989-1990), Juneau Park (1991-1993), Veterans Park (1994-1995), and the Henry Maier Festival Park (since 1996). 
The festival features a diverse range of performers and numerous activities such as a volleyball tournament, parade, religious ceremony, mass wedding\/commitment ceremony, and fireworks.<\/p>\n\n\n\n<p><strong>Miriam Ben-Shalom Papers<\/strong>:&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0237\">Finding Aid<\/a><br>The collection contains personal papers and other documentation collected by Miriam Ben-Shalom, the first gay or lesbian member of the United States military service to be reinstated after being discharged for her sexual orientation. The collection documents Ben-Shalom\u2019s legal battles with the military, as well as the general topic of homosexuals and the military. Other materials pertain to the gay and lesbian veterans movement, feminism, and related social justice issues.<\/p>\n\n\n\n<p><strong>Milwaukee LGBT Oral History Project:<\/strong>&nbsp;<a href=\"https:\/\/collections.lib.uwm.edu\/digital\/collection\/lgbt\/search\">Digital Collection<\/a>&nbsp;|&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0200\">Finding Aid<\/a><br>Collection consists of oral history interviews conducted by the Milwaukee LGBT History Project with members of Milwaukee\u2019s LGBT (lesbian, gay, bisexual, and transgender) community. The collection includes audio records and interview transcripts. Interviewees describe their coming out experiences, the Gay Liberation Movement in Milwaukee, early LGBT organizations, the impact of feminism on LGBT politics, and LGBT social activities.<\/p>\n\n\n\n<p><strong>Milwaukee Transgender Oral History Project<\/strong>:&nbsp;<a href=\"https:\/\/collections.lib.uwm.edu\/digital\/collection\/transhist\/search\">Digital Collection<\/a>&nbsp;|&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0302\">Finding Aid<\/a><br>Interviews with eight individuals concerning Milwaukee\u2019s transgender community and its history. 
Among them are social activists, organizational leaders, healthcare workers, service providers, and performers. Individuals self-identify across a broad spectrum of gender identities, and some resist gender identification entirely. Topics covered include transgender people and the feminist movement, the intersection of transgender identity and sexual orientation, transgender healthcare, coming out, and community organizations.<\/p>\n\n\n\n<p><strong>Ray Vahey Papers<\/strong>:&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0271\">Finding Aid<\/a><br>The Ray Vahey papers document Vahey\u2019s life with his partner, Richard Taylor, and their political activism on behalf of gay, lesbian, bisexual, and transgender civil rights in Wisconsin.<\/p>\n\n\n\n<p><strong>Shall Not Be Recognized Exhibition Records<\/strong>:&nbsp;<a href=\"https:\/\/uwm.edu\/lib-collections\/shall-not-be-recognized\/\">Digital Collection<\/a>&nbsp;|&nbsp;<a href=\"http:\/\/digital.library.wisc.edu\/1711.dl\/wiarchives.uw-mil-uwmmss0263\">Finding Aid<\/a><br>Materials regarding the&nbsp;<em>Shall Not Be Recognized: Portraits of Same-Sex Couples<\/em>&nbsp;exhibit, which documents the experiences of thirty same-sex couples in long-term, committed relationships in the area of Milwaukee, Wisconsin. The traveling exhibit was a collaboration between author Will Fellows and photographer Jeff Pearcy.<\/p>\n<\/div>\n\n\n\n<div id=\"tab-4cqh-updates\" class=\"uwm-p-tabcontent--pane\">\n<p><strong>Final Deliverables<\/strong><\/p>\n\n\n\n<p>The LGBTQ+ AV Archive Mining Project closed on October 26, 2021. 
The final deliverables, including the R dashboard and transcripts, can be found here:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Dashboard: <a href=\"http:\/\/ls-shiny-prod.uwm.edu\/collections_as_data\/code\/rshiny_dashboard\/\">http:\/\/ls-shiny-prod.uwm.edu\/collections_as_data\/code\/rshiny_dashboard\/<\/a><\/li>\n\n\n\n<li>Transcripts: <a href=\"https:\/\/dc.uwm.edu\/lgbtq\/\">https:\/\/dc.uwm.edu\/lgbtq\/<\/a><\/li>\n\n\n\n<li>Github repository: <a href=\"https:\/\/github.com\/51ÁÔÆæ\">https:\/\/github.com\/51ÁÔÆæ<\/a><\/li>\n\n\n\n<li>Final report: <a href=\"https:\/\/osf.io\/pjes7\">https:\/\/osf.io\/pjes7<\/a><\/li>\n<\/ul>\n\n\n\n<p><strong>Building a dashboard<\/strong><\/p>\n\n\n\n<p>Work on our speech-to-text workflow progressed through fall 2020, with a shift from the open source&nbsp;<a href=\"https:\/\/github.com\/mozilla\/DeepSpeech\">Mozilla DeepSpeech<\/a>, to experimentation with&nbsp;<a href=\"https:\/\/azure.microsoft.com\/en-us\/services\/cognitive-services\/speech-to-text\/#features\">Microsoft Azure\u2019s Speech-to-Text services<\/a>. Given the immediate improvements in accuracy \u2013 including a lower error rate for community-identified terms \u2013 we have been using the Azure workflow as our primary method for text extraction since December 2020. More on continued \u201cerror identification\u201d in a later post!<\/p>\n\n\n\n<p>With the creation of an initial corpus of text from our LGBTQ+ AV materials, we began work on a dashboard to provide a path for future users to search and access the transcripts, as well as do some cursory analysis of the corpus itself. And because we want to create an overall model that is easily replicated in a variety of settings, we began by using an existing dashboard package for R called&nbsp;<a href=\"https:\/\/github.com\/kgjerde\/corporaexplorer\">corporaexplorer<\/a>. 
The corporaexplorer package enabled us to quickly set up a working dashboard that provided search term filtering, multiple-term comparisons, a document map, a timeline, and heat maps to show the concentration of a term in a single object and across collections.<\/p>\n\n\n\n<figure class=\"alignleft uwm-c-img--left\"><a href=\"http:\/\/10.60.50.144\/corpus\/\"><img decoding=\"async\" src=\"https:\/\/uwm.edu\/libraries-backup\/wp-content\/uploads\/sites\/59\/2021\/05\/corpusexplorer-docmap-300x160.png\" alt=\"\" class=\"wp-image-30277\" \/><\/a><\/figure>\n\n\n\n<p>Additionally, we wanted to ensure the data search and visualization tools were sufficiently straightforward for students and community members to use, and that they accurately represented the occurrences of pertinent LGBTQ+-related terms. To test the tools, we pulled together a small group of researchers with knowledge of LGBTQ+ history who we expect will also derive some benefit from the data sets and tools themselves, for both teaching and research. Their feedback prompted us to review the accessibility of our dashboards, improve item-level metadata, and investigate how we can enable researchers to incorporate related data sets from other collections alongside our own. More on an improved dashboard in a later post!<\/p>\n\n\n\n<p><em><strong>Note<\/strong><\/em>: The pilot dashboard is available here:&nbsp;<a href=\"http:\/\/10.60.50.144\/corpus\/\">http:\/\/10.60.50.144\/corpus\/<\/a>. However, the link will work only if you are at UWM or connected via UWM VPN (as of May 2021).<\/p>\n\n\n\n<hr class=\"has-alpha-channel-opacity\" \/>\n\n\n\n<p><strong>Speech-to-Text Workflows and DeepSpeech<\/strong><\/p>\n\n\n\n<p>There are several parts to our project \u2013 all important and all intertwined. 
To help make sense of our updates, I\u2019ll try to tag them according to their most prominent feature in terms of speech-to-text progress, community engagement, professional development, and LGBT collections. To kick things off&nbsp;\u2013 an update on how things have progressed and where things stand in our speech-to-text processes. The good news is we have made progress! And here\u2019s how:&nbsp;<\/p>\n\n\n\n<p>Our goal is to extract meaningful text transcripts from the hundreds of hours of archival AV materials that are part of our LGBT collections in the UWM Archives. This includes oral history interviews, cable access shows dedicated to LGBT life in Milwaukee that debuted in the 1980s and 1990s, local radio programs from the same era, and news segments from Milwaukee\u2019s local mainstream news media.&nbsp;It\u2019s important that the text we extract is accurate enough to enable meaningful research. That&nbsp;means that language used by and about the LGBT community is especially important to capture accurately.&nbsp;In a later post we\u2019ll talk more about the issues raised&nbsp;by use of&nbsp;terms that are considered outdated and even offensive \u2013 an extremely important topic when we are considering how to make these data sets and the archival materials themselves publicly accessible. For now, we\u2019ll concentrate on getting the text out in the first place:&nbsp;<\/p>\n\n\n\n<p>The project is using a&nbsp;publicly licensed&nbsp;speech-to-text engine,&nbsp;<a href=\"https:\/\/github.com\/mozilla\/DeepSpeech\">Mozilla\u2019s DeepSpeech<\/a>.&nbsp;It is \u201cpre-trained\u201d using the Mozilla&nbsp;<a href=\"https:\/\/commonvoice.mozilla.org\/en\">Common Voice<\/a>&nbsp;dataset.&nbsp;The code&nbsp;is open, so the model can be further trained locally, to make up for deficits in the Common Voice dataset or to fine-tune (within reason) to a&nbsp;particular collection&nbsp;of AV. 
We\u2019re using&nbsp;DeepSpeech&nbsp;for&nbsp;all of&nbsp;the above reasons, and in order to develop a model that will work especially effectively with the archival AV materials that exist across the LGBTQ+ collections in the UWM Archives.&nbsp;&nbsp;<\/p>\n\n\n\n<p>Throughout the summer, Dan&nbsp;Siercks&nbsp;worked with&nbsp;DeepSpeech&nbsp;to augment the model in order to raise our confidence that the transcripts would recognize expected language that the Milwaukee LGBTQ community might use to describe themselves and their experiences.&nbsp;Cary Costello, our Disciplinary Scholar and UWM\u2019s Director of LGBT Studies,&nbsp;created a \u201cterminology list\u201d&nbsp;split into contemporary and historic LGBTQ+ terms, with each divided into&nbsp;tiers&nbsp;from most generic\/mainstream to the&nbsp;most narrow&nbsp;used by specific subcommunities.&nbsp;<\/p>\n\n\n\n<figure class=\"alignleft uwm-c-img--left default\"><img loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"500\" src=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/collsdata_spreadsheet-screenshot-750x500-1.jpg\" alt=\"\" class=\"wp-image-40240\" srcset=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/collsdata_spreadsheet-screenshot-750x500-1.jpg 750w, https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/collsdata_spreadsheet-screenshot-750x500-1-300x200.jpg 300w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/figure>\n\n\n\n<p>Using a&nbsp;<a href=\"https:\/\/collections.lib.uwm.edu\/digital\/collection\/transhist\/search\">local test collection that included oral histories with human-created transcripts<\/a>, Dan&nbsp;developed a script to identify terms from Cary\u2019s list in our test collection transcripts. 
We then manually identified start-and-stop times (made even easier because&nbsp;these transcripts have been coded using&nbsp;<a href=\"https:\/\/www.oralhistoryonline.org\/\">OHMS<\/a>) and dropped those criteria into a spreadsheet. Using Audacity, Dan was able to create a simple process to locate and \u201csnip\u201d those audio snippets from the recordings to create a test data set that emphasized the LGBTQ+ terms we want our model to more reliably identify.&nbsp;While this approach needs to be carefully calibrated in order to avoid overcorrecting the model, we have found that our data sets have improved accuracy. We believe this is due to the term-specific&nbsp;augmentation, having run multiple iterations of the model, and a significant improvement in the model itself following an upgrade that we recently implemented locally.&nbsp;&nbsp;<\/p>\n<\/div>\n<\/div><\/div>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":2929,"featured_media":0,"parent":32295,"menu_order":302,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":"","uwm_wg_additional_authors":[]},"class_list":["post-40204","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.3 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>51ÁÔÆæ Libraries<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"LGBTQ+ Audio Archive Mining Project\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/\" \/>\n<meta property=\"og:site_name\" content=\"51ÁÔÆæ Libraries\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-25T19:17:40+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"14 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/\",\"url\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/\",\"name\":\"LGBTQ+ Audio Archive Mining Project - 51ÁÔÆæ 
Libraries\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/wp-content\\\/uploads\\\/sites\\\/572\\\/2023\\\/04\\\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg\",\"datePublished\":\"2023-04-26T15:59:14+00:00\",\"dateModified\":\"2026-02-25T19:17:40+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/#primaryimage\",\"url\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/wp-content\\\/uploads\\\/sites\\\/572\\\/2023\\\/04\\\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg\",\"contentUrl\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/wp-content\\\/uploads\\\/sites\\\/572\\\/2023\\\/04\\\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg\",\"width\":750,\"height\":500,\"caption\":\"Tri Cable Tonight Lesbians of Color Support Group 
Speaker\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/lgbtq-audio-archive-mining-project\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Digital Humanities Services\",\"item\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Resources\",\"item\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/digital-humanities\\\/dh-lab-resources\\\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"LGBTQ+ Audio Archive Mining Project\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/#website\",\"url\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/\",\"name\":\"51ÁÔÆæ Libraries\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/uwm.edu\\\/libraries\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"51ÁÔÆæ Libraries","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/","og_locale":"en_US","og_type":"article","og_title":"LGBTQ+ Audio Archive Mining Project","og_url":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/","og_site_name":"51ÁÔÆæ Libraries","article_modified_time":"2026-02-25T19:17:40+00:00","og_image":[{"url":"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/","url":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/","name":"LGBTQ+ Audio Archive Mining Project - 51ÁÔÆæ 
Libraries","isPartOf":{"@id":"https:\/\/uwm.edu\/libraries\/#website"},"primaryImageOfPage":{"@id":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/#primaryimage"},"image":{"@id":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/#primaryimage"},"thumbnailUrl":"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg","datePublished":"2023-04-26T15:59:14+00:00","dateModified":"2026-02-25T19:17:40+00:00","breadcrumb":{"@id":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/#primaryimage","url":"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg","contentUrl":"https:\/\/uwm.edu\/libraries\/wp-content\/uploads\/sites\/572\/2023\/04\/Tri-Cable-News-9-lesbians-of-color-support-group.jpg","width":750,"height":500,"caption":"Tri Cable Tonight Lesbians of Color Support Group Speaker"},{"@type":"BreadcrumbList","@id":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/lgbtq-audio-archive-mining-project\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/uwm.edu\/libraries\/"},{"@type":"ListItem","position":2,"name":"Digital Humanities 
Services","item":"https:\/\/uwm.edu\/libraries\/digital-humanities\/"},{"@type":"ListItem","position":3,"name":"Resources","item":"https:\/\/uwm.edu\/libraries\/digital-humanities\/dh-lab-resources\/"},{"@type":"ListItem","position":4,"name":"LGBTQ+ Audio Archive Mining Project"}]},{"@type":"WebSite","@id":"https:\/\/uwm.edu\/libraries\/#website","url":"https:\/\/uwm.edu\/libraries\/","name":"51ÁÔÆæ Libraries","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/uwm.edu\/libraries\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"acf":[],"_links":{"self":[{"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/pages\/40204","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/users\/2929"}],"replies":[{"embeddable":true,"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/comments?post=40204"}],"version-history":[{"count":7,"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/pages\/40204\/revisions"}],"predecessor-version":[{"id":47642,"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/pages\/40204\/revisions\/47642"}],"up":[{"embeddable":true,"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/pages\/32295"}],"wp:attachment":[{"href":"https:\/\/uwm.edu\/libraries\/wp-json\/wp\/v2\/media?parent=40204"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}