Independent projects
I support independent research that you lead, and I am glad to give feedback and guidance on it by email, at lab meetings, and one-on-one when you need it. This matters especially if you are preparing for or already in a PhD: experience running a study on a topic you chose yourself, and not only work in a supporting role, is particularly valuable. For a project you lead within the lab, here is how I suggest you find a topic, where to get data, and how we work through it together.
Choosing a topic: three building blocks
As a social work researcher, I usually suggest building a topic from three pieces: the population, the problem, and the methodology.
1. The population
Who are you most interested in? If immigrants, which immigrants? It might be the foreign-born broadly, or something more specific: people with a particular legal status, a particular ethnic group, an age group, and so on. One honest caveat: I have had little training in child welfare or aging, so if you are interested in general, non-immigrant child or older-adult populations, I may not be the right mentor.
2. The problem
Once you have a population, define a problem. Problems come from many places, including your lived experience and the news, and an important problem does not have to be a grand challenge. Now that large language models have made data analysis far easier, I believe the most important things you bring are your insight and genuine care for the problem. What matters is working on a problem worth solving, and the strongest questions often come from challenging an assumption the field takes for granted rather than filling a small gap (Alvesson & Sandberg, 2011). On my side, I am more comfortable with quantitative than qualitative work and have a lot of experience in survey design, administrative data analysis, and NLP and LLMs, so that is where I can help most. Two short guides on framing a question: asking quantitative research questions and asking research questions in social work.
3. The methodology
Which methods do you want to use? The lab works mostly with natural language processing and large language models, geospatial analysis, decomposition, network analysis, and survey methods. Browsing my publications will give you a feel for how I structure a study. It is fine to start with a descriptive or exploratory paper. Careful description is a contribution in its own right and can stand on its own as research (Gerring, 2012; Loeb et al., 2017), and I believe a strong exploratory study can be of especially direct use to the communities we work with. Good exploration is also harder than it looks.
Your question is a conversation with prior work
A research question is never asked in a vacuum. It is a way of entering an ongoing scholarly conversation: you position your study against what others have already found, and you show what it confirms, complicates, or extends (Booth et al., 2016). The best questions usually push on something the field has taken for granted.
The literature at the intersection of immigrant well-being and technology is especially scattered. Relevant work has been done, often in isolation, across social work, public health, health informatics, human-computer interaction, communication, and migration studies. The same phenomenon, say a language barrier to telehealth, can be framed as a health-disparities question in public health, a design problem in human-computer interaction, and a policy question in social work. Part of building a strong question is recognizing which of these conversations you are joining and what that conversation currently assumes. We will work this out together, but it helps to hold it in mind as you shape the question.
Here is one of my own studies broken into those three building blocks, plus the data source:
- Population: members of a large online fan community (the BTS fandom).
- Problem: how do people express emotion and find mental health support in a space that was never built for care? This came from noticing how much informal support happens in fan communities.
- Data: comments on fandom videos on YouTube.
- Method: NLP to measure emotional expression and supportive talk.
It was published in 2026 (Yoo, Rodwin, et al., 2026). The problem was specific and a bit unexpected, not a grand challenge.
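To make the Method line above more concrete, here is a toy sketch of one simple way to measure emotional expression and supportive talk: counting lexicon matches per comment. It is not the pipeline used in the published study. The word lists, the `score_comment` and `summarize` helpers, and the sample comments are all hypothetical, and a real project would rely on a validated lexicon or a trained model.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, for illustration only (not validated instruments).
LEXICONS = {
    "distress": {"sad", "lonely", "anxious", "depressed", "hopeless"},
    "support": {"support", "proud", "together", "love", "strong"},
}

TOKEN_PATTERN = re.compile(r"[a-z']+")


def score_comment(text):
    """Return the share of tokens that fall in each lexicon category."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    if not tokens:
        return {category: 0.0 for category in LEXICONS}
    counts = Counter()
    for token in tokens:
        for category, words in LEXICONS.items():
            if token in words:
                counts[category] += 1
    return {category: counts[category] / len(tokens) for category in LEXICONS}


def summarize(comments):
    """Average the per-comment category shares across a list of comments."""
    totals = Counter()
    for comment in comments:
        for category, share in score_comment(comment).items():
            totals[category] += share
    n = len(comments)
    return {category: (totals[category] / n if n else 0.0) for category in LEXICONS}


# Made-up example comments, not real data.
sample_comments = [
    "I felt so lonely and anxious this year",
    "We are here together and we love you",
]
print(summarize(sample_comments))
```

Word counting like this misses negation, sarcasm, and context, which is one reason a sketch of this kind is only a starting point for discussion.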
Example questions: immigrant well-being and technology
Here are a few open questions at the intersection of immigrant well-being and technology. Each grew out of one of my own studies and still has plenty left to do, and each uses data the lab already has.
- Do immigrants with limited English proficiency use telehealth at lower rates, and what explains the gap? (Yoo, Hong, & Choi, 2025; telemental health among adolescents, under review)
- How do AI literacy and ethics awareness relate to the subjective well-being of immigrants and natives? (AI competence and well-being, under review)
- How do immigrant and minority communities give and seek support in online communities when formal services are limited? (Yoo, Rodwin, et al., 2026; DACA peer support, under review)
- Who is willing to use AI mental health chatbots, and do they serve people who do not speak English well? (Yoo & Jang, 2026; AI mental health app language audit, in progress)
- How is emotional distress expressed in the online language of immigrant and minority communities, and can it be measured at scale? (Youm et al., 2026)
- Do new AI tools reach communities with limited English proficiency, or widen existing divides? (generative-AI divide work, under review; AI mental health app language audit, in progress)
A few directions: data you can use
The lab uses three kinds of data. For each, here is what it holds and a few examples.
Clinical and case notes (health informatics)
Clinicians and social workers write a great deal of free text: intake assessments, progress notes, case notes. These notes are rich but messy, and learning to work with them is a valuable, transferable skill. A large body of work uses NLP to pull social and clinical information out of notes, though social factors such as social support and housing are still relatively understudied (Patra et al., 2021).
- MIMIC-III is a de-identified critical-care dataset that includes nursing and social work notes, and is the main publicly accessible clinical-notes dataset. Access needs PhysioNet credentialing (human-subjects training plus a data use agreement); here is a tutorial on getting MIMIC-III notes.
- U-M EPIC clinical data may be available to students at the master's level with a mentor's approval; contact me if you are interested.
My work with clinical notes asks how a construct like person-centered care shows up in text, how complete notes are, and how human-written and AI-generated notes differ (Stanhope et al., 2024; Yoo et al., 2024; Who Writes Better Notes? under review).
Survey data
If you prefer structured data, several national surveys carry technology, health, and immigrant-related items, and most are free public-use files with codebooks.
- CHIS (California Health Interview Survey): a large state survey with rich health, access, and language items.
- NHIS (National Health Interview Survey, CDC): the core national health survey, annual, with nativity and citizenship items.
- HINTS (Health Information National Trends Survey, NCI): focused on health information seeking and technology use.
- MEPS (Medical Expenditure Panel Survey, AHRQ): detailed health care use, access, and cost.
- All of Us Research Program (NIH): a large, diverse cohort combining surveys, electronic health records, and biological data, analyzed in a secure workbench (registration and training required).
Survey work on immigrants and technology is active: studies document a digital divide in immigrant households and disparities in telehealth use by limited English proficiency (Migration Policy Institute, 2021; Rodriguez et al., 2021).
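As a rough illustration of a first descriptive step with survey data, the sketch below compares a weighted rate of telehealth use across two groups. Public-use survey files generally ship with sampling weights, and estimates are usually computed with them. Everything here is an assumption for illustration: the field names (`lep`, `used_telehealth`, `weight`), the helper functions, and the toy rows, which are made up and not drawn from any survey. It also ignores variance estimation, which needs each survey's design variables and dedicated survey software.

```python
# Made-up toy records, not drawn from any real survey.
records = [
    {"lep": True, "used_telehealth": True, "weight": 1200.0},
    {"lep": True, "used_telehealth": False, "weight": 1800.0},
    {"lep": False, "used_telehealth": True, "weight": 2500.0},
    {"lep": False, "used_telehealth": True, "weight": 1500.0},
    {"lep": False, "used_telehealth": False, "weight": 1000.0},
]


def weighted_rate(rows, outcome, weight="weight"):
    """Weighted share of rows where `outcome` is true; None if no weight."""
    total = sum(row[weight] for row in rows)
    if total == 0:
        return None
    hits = sum(row[weight] for row in rows if row[outcome])
    return hits / total


def rate_by_group(rows, group, outcome):
    """Split rows by the value of `group`, then compute each weighted rate."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group], []).append(row)
    return {key: weighted_rate(members, outcome) for key, members in groups.items()}


rates = rate_by_group(records, "lep", "used_telehealth")
print(rates)
```

A gap between the two rates is only a description; explaining it is where methods such as decomposition come in.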
Social media data
For practicing NLP or LLMs, social media is a rich and accessible source. Mind each platform's terms of use, and talk with me about ethics and IRB before you collect anything.
- Reddit is anonymous, text-heavy, and organized into subreddits, which makes it strong for studying help-seeking and peer support. Because so much of the text behind today's LLMs is from Reddit, it is also a great place to practice. Find a subreddit you care about and bring it to me. Others have used it to surface mental health markers in immigration communities (Mittal et al., 2023); my Reddit work is on peer support and migration discourse (Youm et al., 2026; DACA peer support, under review; r/Amerexit, under review).
- YouTube comments, through the Data API, capture reactions and discourse around a topic, as in my study of fandom communities (Yoo, Rodwin, et al., 2026).
- The TikTok Research API opens up short-video and comment data, which is newer and less explored.
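For a sense of what working with the YouTube Data API involves, here is a minimal sketch that builds a request URL for the v3 `commentThreads` endpoint and parses one page of results. It makes no network call: `sample_page` is a hand-written stand-in for a response, and `VIDEO_ID`, `YOUR_API_KEY`, and the helper names are placeholders, not lab code. Treat it as a sketch to adapt only after the ethics and IRB conversation above.

```python
import json
from urllib.parse import urlencode

# Endpoint of the YouTube Data API v3 for top-level comment threads.
COMMENT_THREADS_ENDPOINT = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_comment_threads_url(video_id, api_key, page_token=None, max_results=100):
    """Build the request URL for one page of comment threads on a video."""
    params = {
        "part": "snippet",
        "videoId": video_id,
        "maxResults": max_results,  # the API allows at most 100 per page
        "textFormat": "plainText",
        "key": api_key,
    }
    if page_token:
        # Pass the previous page's nextPageToken to fetch the following page.
        params["pageToken"] = page_token
    return f"{COMMENT_THREADS_ENDPOINT}?{urlencode(params)}"


def extract_comment_texts(payload):
    """Pull the top-level comment text out of one parsed response page."""
    texts = []
    for item in payload.get("items", []):
        snippet = item["snippet"]["topLevelComment"]["snippet"]
        texts.append(snippet["textDisplay"])
    return texts


# Hand-written stand-in for one response page (not real data), so the
# parsing logic can be checked without an API key or a network call.
sample_page = json.loads("""
{
  "nextPageToken": "HYPOTHETICAL_TOKEN",
  "items": [
    {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "This song got me through a hard year."}}}},
    {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "Sending love to everyone here."}}}}
  ]
}
""")

print(build_comment_threads_url("VIDEO_ID", "YOUR_API_KEY"))
print(extract_comment_texts(sample_page))
print(sample_page.get("nextPageToken"))
```

In a real collection loop, each response's `nextPageToken` would be passed back as `page_token` until no token is returned.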
How an independent project works
Here is roughly how a project goes. Expect it to be iterative; the topic, methods, and framing usually shift as you go.
- Scope the topic with me. Email me a few options, not just one, using the template below, and we refine them together; expect the question to change.
- Get any training you need. If a method is new to you, learn it first through a tutorial, a short course, or a small practice analysis. Do what you can with what you have.
- Run the analysis yourself. Hands-on work is how you learn the most, and I will help when you get stuck.
- Share and get feedback. Present what you found at lab meeting, and we iterate from there; we can also meet one-on-one if you need it.
- Write it up. Turn it into a paper, revising as the analysis and feedback develop.
Email template
- Working title
- Population
- Problem, in two or three sentences
- Research question or questions
- Dataset or datasets
- Methodology
- Why it matters and who it helps
Questions to answer as you fill in the template:
- What is the problem, the gap or unmet need?
- Who does it affect, and how?
- What do we already know, and what is missing?
- What will your study add?
Next step
None of this has to be settled before you start. Bring me a population you care about, a problem, or just a subreddit, and we will shape it into a project together. Email me at nariyoo@umich.edu to get started, and we can meet one-on-one if you need to (book at calendly.com/nari-yoo). Once it has taken some shape, we share it at lab meeting and carry it forward from there.