Say My Name:

Visualizing the Evolving Visibility of LGBT+ Identities in U.S. Online Media (2014–2024)


Over the past decade, U.S. media attention to the LGBTQ+ community has seen dramatic peaks and valleys: some moments draw collective focus, while others spotlight only specific identities. This project tracks weekly mention volumes of the keywords "gay," "lesbian," "bisexual," and "transgender" and their related terms across online news, and visualizes how each identity's visibility rose and fell between 2014 and 2024, revealing which communities are amplified or sidelined during shared historical moments.

What stories are still missing?


Visualization 1: Spikes View


When Did Attention Peak?


Weekly mention counts were aggregated from daily totals. For each identity in each year, the week with the highest volume was identified, and the corresponding news and policy coverage were examined to pinpoint the events driving each surge. The streamgraph maps time horizontally; the thickness of each colored band indicates that identity’s relative media attention in a given week.

*From the streamgraph it becomes clear that gay (blue) sustains the highest baseline and most frequent spikes—dominating peaks around the Obergefell v. Hodges ruling (June 2015) and the Pulse nightclub shooting (June 2016). In contrast, transgender (green) remains nearly invisible most weeks but surges sharply during policy battles. Lesbian (orange) and bisexual (pink) mentions stay comparatively low, rising only for community-specific events and quickly eclipsed when “gay” or “trans” issues dominate.



Visualization 2: Keywords View


What Was the Media Talking About?

For each identity, the top keywords were extracted from the entirety of that year’s mention data, and term size reflects annual frequency.  
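The keyword sizing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the stopword list, the tokenizer, and the function name `top_keywords` are all assumptions introduced here, and the original extraction may have used a different tool or additional filtering.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "for",
             "is", "was", "that", "with", "as", "by", "at"}


def top_keywords(texts, identity_terms, n=10):
    """Return the n most frequent words across one identity's texts for one year.

    `texts` is an iterable of strings (e.g. headlines or snippets).
    `identity_terms` is a set of lowercase words to leave out, so the search
    term itself does not dominate its own word cloud.
    """
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic runs only (a simplifying assumption).
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOPWORDS and token not in identity_terms:
                counts[token] += 1
    # Each (word, count) pair can then drive the term size in the cloud.
    return counts.most_common(n)
```

Because the counts are raw frequencies, near-synonyms such as "court" and "courts" are tallied separately in this sketch.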

*The gay word clouds are dominated by legal and marriage-related terminology—“marriage,” “couples,” “court,” “decision”—reflecting the central role of marriage-equality battles and judicial rulings in the discourse. Transgender clouds emphasize policy and rights language—“policy,” “rights,” “military/services,” “restroom”—underscoring that “trans” visibility is largely tied to legislative and access debates.

*By contrast, lesbian and bisexual clouds display a more dispersed vocabulary—“community,” “women,” “identity,” “film”—indicating that coverage of these identities often arises through cultural and artistic channels, such as film festivals, television narratives, and grassroots events, rather than through courtroom or policy dramas.





Methodology

Approximately 3,800,000 U.S. online news articles published between January 1, 2014 and December 31, 2024 were collected.

A layered Boolean and regex‐based filtering strategy isolated mentions of “gay,” “lesbian,” “bisexual,” and “transgender,” including synonyms, singular/plural variants, and topic-relevant terms (e.g. “pride,” “marriage equality”) while excluding unrelated uses of ambiguous terms. Raw daily counts were then aggregated into weekly totals to smooth short-term fluctuations and highlight longer-term trends. For each identity in each calendar year, the week with the highest mention volume was identified and its contemporaneous news and policy coverage examined to pinpoint the events driving that peak; these findings were recorded in a comprehensive spike table.
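The aggregation and peak-finding steps above can be sketched as below. This is an illustrative outline under stated assumptions, not the code used for the project: weeks are assumed to start on Monday, each week is assigned to the calendar year of its Monday, and the function names and the shape of the input dictionary are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Monday of the week containing `day` (assumed week boundary)."""
    return day - timedelta(days=day.weekday())


def weekly_totals(daily_counts):
    """Sum {(identity, date): count} records into {(identity, week_start): total}."""
    totals = defaultdict(int)
    for (identity, day), count in daily_counts.items():
        totals[(identity, week_start(day))] += count
    return dict(totals)


def peak_weeks(weekly):
    """For each (identity, year), keep the (week_start, total) with the highest total.

    A week is assigned to the year of its Monday, which is an assumption;
    ties keep whichever week is encountered first.
    """
    peaks = {}
    for (identity, start), total in weekly.items():
        key = (identity, start.year)
        if key not in peaks or total > peaks[key][1]:
            peaks[key] = (start, total)
    return peaks
```

Calling `peak_weeks(weekly_totals(daily_counts))` yields one candidate week per identity per year, which is the starting point for looking up the news and policy coverage recorded in the spike table.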


Data Overview

  • Timeframe: Jan 1, 2014 – Dec 31, 2024 (≈ 550 weeks)
  • Articles Collected: ~ 3,800,000 from major U.S. news outlets
  • Weekly Records: 4 identities × ~550 weeks ≈ 2,200 data points

Limitations

Search relied on Boolean + regex to filter mentions and remove homographs (e.g. “gay” = happy), but “gay” remains a catch-all term that may include broader LGBT+ discourse. Results should be interpreted as trend indicators, not precise counts. Keywords were sized by raw annual frequency without semantic clustering, so synonyms may appear separately.
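To make the homograph problem concrete, here is a small sketch of the kind of regex filtering described above. The patterns and the exclusion list are hypothetical examples chosen for illustration; the project's actual Boolean queries and regexes are not reproduced here and were likely more extensive.

```python
import re

# Hypothetical inclusion patterns covering singular/plural variants.
IDENTITY_PATTERNS = {
    "gay": re.compile(r"\bgays?\b", re.IGNORECASE),
    "lesbian": re.compile(r"\blesbians?\b", re.IGNORECASE),
    "bisexual": re.compile(r"\bbisexuals?\b", re.IGNORECASE),
    "transgender": re.compile(r"\b(?:transgender|trans)\b", re.IGNORECASE),
}

# Hypothetical exclusion phrases for unrelated uses of ambiguous terms.
EXCLUDE = re.compile(
    r"\b(?:enola\s+gay|gay\s+nineties|trans\s+fats?)\b",
    re.IGNORECASE,
)


def identities_mentioned(text):
    """Return the set of identities mentioned in `text` after removing excluded phrases."""
    # Blank out known unrelated phrases first, then test each identity pattern.
    cleaned = EXCLUDE.sub(" ", text)
    return {name for name, pattern in IDENTITY_PATTERNS.items()
            if pattern.search(cleaned)}
```

A phrase list like this only catches the homographs someone thought to enumerate, which is one reason the counts are best read as trend indicators.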

Annual Total Mentions



As a supplement, this is an overview of each year’s combined mention volume across all four identities.

*This initial trend map allowed us to observe how media attention shifted across different identities over the decade.

*It also confirmed my initial expectations: "gay" has consistently been the most discussed identity; "transgender" mentions began increasing after Trump took office, rose rapidly from 2019, and peaked in 2022–2023; and "lesbian" and "bisexual" mentions have remained at a low level throughout.


Final Thoughts



Gallup’s 2024 national survey found that bisexual individuals account for 56.3% of LGBTQ+ adults in the United States, far surpassing those identifying as gay (21.1%), lesbian (14.6%), or transgender (13.9%). However, in the media dataset analyzed here, certain identities, particularly gay, have been granted narrative centrality, often standing in for the entire LGBTQ+ spectrum. Meanwhile, transgender visibility is crisis-dependent, rising mainly when legislation threatens trans people's existence. Lesbian and bisexual identities are left to flicker in cultural shadows, rarely driving the national conversation.

This disparity reveals a systemic imbalance and raises difficult questions:
Whose stories get told, and why?
Who decides what counts as news?
Why are only certain identities allowed to be seen as political, urgent, or worthy of empathy?

This project doesn’t attempt to answer everything.
But if it leaves you unsettled—if it makes you question whose name gets said and whose is left out—then it’s done its job.

Thank you! 🌈



Acknowledgements

Course: Data Storytelling, ITP @ Tisch School of the Arts, NYU
Instructors: Shindy Melanie Johnson & John Henry Thompson
Special thanks to Shindy and John Henry for their guidance and support throughout this project.