A Call for Intersectional Collaboration

Photo by Shane Rounce / Unsplash

I'm about to conduct a research survey into the AI training knowledge gaps surrounding the scholarship and history of marginalized groups.

This will consist of identifying primary sources of scholarship and historical accounts or documentation (photos, art, song, etc.), then probing different LLMs using adversarial and interpretability methodologies I've pioneered in my security research.

I need help to do this. Simply put, I need people who can tell and show me what to look for. Is there an Indigenous philosopher you're intimately familiar with but few others know? A body of scholarship by Black Americans that is often ignored? A musician who sings about the history of trans folks from a particular town? I want to know about anything like this so I can test whether, and to what extent, the LLMs have been trained on that knowledge.

Why does this matter? Because these machines are being fronted as the space where all human knowledge will be absorbed, yet there is evidence of serious knowledge gaps pertaining to cultural and ethnic minorities and their accomplishments or points of view throughout history. At the same time, synthetic data generation is being used to train LLMs because the industry believes it is running out of knowledge to scrape... But twenty years after earning my minor in Africana Studies, I had more casual knowledge of Henry Highland Garnet than Claude 3.7 did.

If you want to contribute, your contributions will be recognized publicly. I will make my findings available for free via Emergent Problems.

You can find me on Bluesky: hilvadvising.bsky.social