
First Principles



Can personalized / precision / individual medicine be applied to psychiatric conditions?

Based on genetics or blood biomarkers? Probably not. Maybe there’s some slight association between the concentration of a few cytokines in the blood and the severity of PTSD, or some marker that differentiates one “type” of depression from another. But you will need enormous investment just to establish this, and in the end it’s not going to be a strong or specific enough signal to actually make a difference.

Based on brain-imaging data? Maybe. If you used fMRI or EEG to record people’s brain activity (at rest, and/or while performing various tasks…), then once you had collected enough data, there’s a good likelihood that modern machine learning methods could be leveraged to come up with shockingly good predictions for “what type of anxiety disorder is this and how can it be treated?”

So should we try to do this?

Yes. (See: COVID and TikTok and Donald Trump and the state of the world. You think our brains evolved to handle this? Go read this book).

Ok so where to start?

Harder to say. We don’t really even know what modality of data to go after. Functional MRI? Connectomics via something like DTI? EEG? The expensive, 256-channel headset? With saline? Or a cheaper, consumer-grade headset? Or some completely different modality?

That’s not clear, and it may not be clear for a while yet. And yet, unfortunately, to actually try to leverage machine learning to inform the diagnosis and treatment of psychiatric disorders, we are going to need a lot of data. I worked in this field for 7 years and I always loved getting this question from higher-ups: “Ok, so what N do we actually need to do this?” Just asking the question that way means you don’t really understand the problem.

Machine Learning / Deep Learning / Neural Networks / AI / Gen AI / SNI

Why have you heard of these things? (Have you even heard of SNI? If not - don’t worry - that one's made up). You’ve heard of these things because of a series of kind of mind-blowing breakthroughs over the last decade. I am by no means an expert or a historian here, but here’s a quick overview.

Machine learning has been around for many decades. It basically just means using math (equations / algorithms / models) to predict things based on patterns or trends in a dataset. It can be as simple as using a cost function to optimize coefficients in a linear regression. If you want to predict how expensive a house is, based on square footage and number of bedrooms (house price = coef1*sq ft + coef2*bedrooms), and you have the price and details for 100 houses, you can figure out what coef1 and coef2 should be to make the best prediction. There are many other algorithms you can use in this way - e.g. support vector machines, random forests, or neural networks.
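Here's a minimal sketch of that house example in Python. Everything in it is an assumption for illustration: the 100 houses are randomly generated, the "true" coefficients (150 per square foot, 10,000 per bedroom) are made up, and the cost function is plain mean squared error. For a model this small, the coefficients that minimize that cost can be solved for directly, so no fancy optimizer is needed.

```python
import random

def cost(coef1, coef2, sqft, bedrooms, prices):
    """Mean squared error of price = coef1*sqft + coef2*bedrooms."""
    errors = [(coef1 * x + coef2 * b - y) ** 2
              for x, b, y in zip(sqft, bedrooms, prices)]
    return sum(errors) / len(errors)

def fit_two_coefficients(sqft, bedrooms, prices):
    """Coefficients that minimize the squared-error cost (no intercept term).

    Solves the 2x2 normal equations with Cramer's rule.
    """
    s_xx = sum(x * x for x in sqft)
    s_bb = sum(b * b for b in bedrooms)
    s_xb = sum(x * b for x, b in zip(sqft, bedrooms))
    s_xy = sum(x * y for x, y in zip(sqft, prices))
    s_by = sum(b * y for b, y in zip(bedrooms, prices))
    det = s_xx * s_bb - s_xb * s_xb
    coef1 = (s_xy * s_bb - s_xb * s_by) / det
    coef2 = (s_xx * s_by - s_xb * s_xy) / det
    return coef1, coef2

# Made-up data: 100 houses generated from assumed "true" coefficients plus noise.
rng = random.Random(0)
TRUE_COEF1, TRUE_COEF2 = 150.0, 10_000.0
sqft = [rng.uniform(800, 3000) for _ in range(100)]
bedrooms = [rng.randint(1, 5) for _ in range(100)]
prices = [TRUE_COEF1 * x + TRUE_COEF2 * b + rng.gauss(0, 5000)
          for x, b in zip(sqft, bedrooms)]

coef1, coef2 = fit_two_coefficients(sqft, bedrooms, prices)
print(f"coef1 ~ {coef1:.1f} per sq ft, coef2 ~ {coef2:.0f} per bedroom")
```

The fitted coefficients should land close to the made-up "true" ones, and they will always give a cost at least as low as any other pair of coefficients on these 100 houses. That is all "learning" means here.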

Neural Networks / Deep Learning / What most people probably mean when they say "AI"

Neural networks have also been around for many decades. But they have come a long way in that time. And in the early 2010s, they started making waves. One of the early examples of this was when a new architecture for a neural network, specialized for interpreting pictures (called a convolutional neural network, or CNN), suddenly started working unbelievably well. You could take a picture of practically anything, digitize it into an RGB value for every pixel, and a mathematical model could tell you exactly what it was. (Although dogs and fried chicken sometimes gave it a hard time). If you want to read more about the science here, this is a great review written around this time, or you can even read the paper on AlexNet - the specific CNN that shocked everyone at the ImageNet image classification competition in 2012.

But the thing is, it wasn't so much the academic advances in how to build or design neural networks that led to that moment - it was the data. It was the fact that because of the internet, people could now access an enormous number of pictures (millions and millions) to train the algorithms with. You can't train a neural network with millions of connections (each with a specific weight that is "learned" by looking at examples) if you only have a thousand pictures to learn from.
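To put rough numbers on "millions of connections", here's a back-of-the-envelope sketch. The network is hypothetical and deliberately tiny: a plain fully connected net (not a CNN like AlexNet) that takes a 64x64 RGB image, has one hidden layer of 256 units, and outputs 10 classes. All of those sizes are assumptions picked for illustration.

```python
def count_weights(layer_sizes):
    """Weights plus biases in a fully connected network with these layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical sizes: 64x64 RGB input, one hidden layer, 10 output classes.
layer_sizes = [64 * 64 * 3, 256, 10]
n_weights = count_weights(layer_sizes)

n_pictures = 1_000
weights_per_picture = n_weights / n_pictures

print(f"{n_weights:,} weights to learn from {n_pictures:,} pictures")
print(f"about {weights_per_picture:,.0f} weights per picture")
```

Even this toy network has over three million weights, so with a thousand pictures there are thousands of numbers to tune for every example the model gets to see.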

Then there was AlphaGo, there were deepfakes, there were ...., and now, there's ChatGPT. But the common thread, the real key to each of these watershed moments, was access to enormous quantities of data.

Weren't we talking about treating mental health?

Yeah, so the answer to the question from above (what N do we need?) is: more than you were thinking. Is it prohibitive? Not necessarily. As technology advances, something like EEG will get better and cheaper. Portable, consumer-friendly devices will likely get to the point where they can collect quality data outside of a lab or clinical setting. It's possible they are already at that point. If you're reading this and you have a million dollars to give me - yes, absolutely, they are already at that point, and everything I wrote above about needing such a huge N, I didn't mean it. Please let's schedule a lunch. But even if we had funding, and even if we knew exactly which EEG device to use, and exactly what type of activity to record - do we know what clinical (symptom / mental functioning) data to collect to go along with it? Not really. And finally here's the good news: we can start to figure that out now.

Psychometrics

...is a field where people test and validate instruments (surveys) to ask people about themselves. And this field has produced a lot of surveys. Frankly, if we could just ask people ALL of these surveys, the AI would have plenty to work with on the symptom / personality / phenotype side of things. But that would be a lot of questions. And it would be boring.

HWAI metrics

Instead, we could make answering questions more engaging / interesting / rewarding. We could make it fun. We can write new, interesting, entertaining questions, and simultaneously whittle down to the most informative, most important questions to ask someone, if they have depression, or anxiety, etc.
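How would the whittling-down part work? I haven't committed to a method, but one common psychometric heuristic is the item-rest correlation: for each question, check how well its answers track the total of everyone's answers to all the other questions. The sketch below assumes that heuristic and runs it on simulated data (500 made-up respondents, three questions that reflect an underlying trait and three that are pure noise, with continuous scores instead of real survey scales).

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def item_rest_correlations(responses):
    """For each question, correlate it with the sum of all the other questions.

    responses: one row per respondent, one score per question.
    """
    n_items = len(responses[0])
    result = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]
        result.append(pearson(item, rest))
    return result

# Simulated respondents: questions 0-2 reflect a hidden trait, 3-5 are noise.
rng = random.Random(1)
responses = []
for _ in range(500):
    trait = rng.gauss(0, 1)
    informative = [trait + rng.gauss(0, 1) for _ in range(3)]
    noise_only = [rng.gauss(0, 1) for _ in range(3)]
    responses.append(informative + noise_only)

scores = item_rest_correlations(responses)
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
keep = sorted(ranked[:3])  # indices of the three most informative questions
print("item-rest correlations:", [round(s, 2) for s in scores])
print("questions to keep:", keep)
```

On this simulated data the three trait-driven questions should rise to the top and the noise questions should sit near zero. Real survey answers would be messier, and a real pipeline would probably use something richer (factor analysis, item response theory), but the basic idea of scoring each question by how much it tells you is the same.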

People can participate and contribute, even if they consider themselves to represent a picture of mental health. Especially if they represent a picture of mental health. And for the record, it's wrong to think of these as "conditions", "disorders", or "diseases." (Not in a politically correct way, in a data-driven way). All of these things should be considered as distributions on a spectrum.

So what exactly is the goal here?

If I could somehow recruit 5,000-10,000 people to participate (even with the average person just spending 10-15 minutes answering ~20-50 questions), I believe that dataset in itself could be valuable. To help better understand mental health, to improve psychiatric treatment standards, or to support drug development. Or, to support recommendation engines, and/or social networking, where the goal is to effect positive outcomes, rather than maximize click-throughs (via morally dubious or even flat-out sinister methods).

And I am certain that if I could recruit even 10% of those users to wear some consumer-grade EEG headset for a half-hour -- or walk around with a wearable for a week -- or sleep with some mattress-monitoring device, the objective data combined with the phenotypic data (which will be very high quality, because it's informed/improved by the full dataset) would be valuable. Worth millions? Probably not. (But what if 100,000 people answered questions and 10,000 did the EEG thing?)

What dataset? (from other page)

For now, just the answers to these questions. (And users are free to remain completely anonymous). Just the phenotypic data (personality/symptom scores), if collected right and built up to a few thousand people, could be extremely valuable for guiding future research (e.g. "Don't develop drugs for MDD - you have to target MDD typeA, typeB, and typeC separately"). Say two thousand people get on board and contribute, and someday I can license it to a pharma company for $40k. Everybody gets $20 each - and the satisfaction of knowing that they helped a lot of people with depression.

But eventually, the true value of this dataset would be in the growth opportunity. If we could convince users to give more data - via a consumer-grade EEG headset, for instance - we could start developing objective biomarkers to aid in understanding and treating mental health, and I believe that dataset would be worth much more.

---------

Or I don't know. Maybe I'm just weird.