Well, it's clear you know a lot about the heart and the ECG! To be clear, this was a project I started and got very far with, but put aside after realizing I could not get more data without spending a lot of money on an Investigational Device Exemption (IDE) study.
The first goal was to convert a lead I ECG (dry electrodes) into real-time, beat-by-beat invasive arterial pressure, cranial pressure, and left/right ventricular and atrial pressures. I started off by learning every stage of the cardiac cycle and how each maps onto the ECG. Then I learned what affects blood pressure throughout the body, then fluid dynamics, breathing, and so on.
Most people use a method called pulse transit time (PTT) to calculate a single BP measurement, but it never works well enough to qualify as an FDA Class II medical device. The first step was to understand what kind of features to extract and where to get the data. I needed invasive arterial pressure recorded simultaneously with a multi-lead, wet-electrode ECG. I ended up getting data for about 100 patients from an old study with that kind of high-quality data, but only about 45 of those patients' records were usable. The datasets included breathing, SpO2, internal heart chamber pressures, and, in some cases, cranial pressure for each patient. You can imagine this kind of data is hard to get, as they were all ICU patients in critical condition.
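For anyone unfamiliar with PTT, here is a minimal sketch of what that classic approach looks like. It assumes simultaneous ECG and a peripheral pulse signal such as PPG, and the sampling rate, peak-detection settings, and calibration formula are all illustrative placeholders, not anything from my pipeline:

```python
# Hypothetical sketch of the classic pulse-transit-time approach.
# Assumes simultaneous ECG and a peripheral pulse signal (e.g. PPG)
# sampled at the same rate; all settings here are illustrative.
import numpy as np
from scipy.signal import find_peaks

def mean_ptt(ecg, ppg, fs=500):
    """Mean delay (seconds) from each ECG R-peak to the next pulse peak."""
    r_peaks, _ = find_peaks(ecg, distance=int(0.4 * fs), prominence=np.std(ecg))
    pulse_peaks, _ = find_peaks(ppg, distance=int(0.4 * fs))
    delays = []
    for r in r_peaks:
        later = pulse_peaks[pulse_peaks > r]
        if later.size:
            delays.append((later[0] - r) / fs)
    return float(np.mean(delays))

# A typical calibration then maps PTT to a single systolic value,
# e.g. SBP ~ a / PTT + b, with a and b fitted per patient against cuff
# readings, which is why it only yields spot estimates.
```

That per-patient calibration against cuff readings is exactly why PTT gives you spot estimates rather than a continuous waveform.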
Professionals who have used ML for BP measurement almost always train against blood pressure cuff readings and ECG data. I always thought cuffs would never work as ground truth because they are not actually that accurate, just accurate enough for spot-check care by FDA standards. I had learned earlier that in machine learning you need accurate labels for whatever you are trying to predict to make it feasible, especially in the medical field. I wanted real-time direct measurements of everything.
Signal processing was one of the most valuable steps in making sure the data was clean. The ECG actually contains respiration information, but it is hard to extract because it lives in what is normally treated as noise. I learned that professionals in the field use a filter called baseline wander removal to flatten the ECG signal before analysis. But for me, noise always carries importance in any problem: the baseline wander is caused in part by breathing and patient movement. So I figured out a way to keep the important part while removing the rest of the noise, along the lines of the sketch below.
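This is just a minimal illustration of the idea; the two-stage median filter is one standard way to estimate baseline wander, and the 500 Hz sampling rate and window sizes are placeholders rather than my exact pipeline:

```python
# Rough sketch: estimate the baseline wander and keep it as a
# respiration/movement surrogate instead of throwing it away.
import numpy as np
from scipy.signal import medfilt

def split_ecg_and_baseline(ecg, fs=500):
    """Return (clean_ecg, baseline). A ~200 ms median window suppresses QRS
    complexes, then a ~600 ms window suppresses P/T waves, leaving only the
    slow drift from breathing and patient movement."""
    ecg = np.asarray(ecg, dtype=float)
    w1 = int(0.2 * fs) | 1  # medfilt kernel sizes must be odd
    w2 = int(0.6 * fs) | 1
    baseline = medfilt(medfilt(ecg, w1), w2)
    return ecg - baseline, baseline

# usage: clean, respiration_proxy = split_ecg_and_baseline(raw_lead_i)
```

The point is that the subtracted baseline gets kept as a breathing/movement signal instead of being discarded with the noise.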
After some basic filtering and cleaning of the signals, the next step was to extract features representing what was needed. ECG feature extraction involved the P wave, QRS complex, and T wave. I needed the time/location of the start, peak, and end of each wave, then the peak amplitudes, followed by the area and angle of each wave. A lot of hands-on analysis of the datasets led to the understanding that the past 2 to 3 heart cycles correlated strongly with beat-by-beat arterial pressure. A heavily simplified sketch of the extraction step follows.
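Real P/QRS/T delineation needs a proper delineation algorithm; in this sketch, fixed windows around each detected R-peak stand in for it, so every window, threshold, and the slope stand-in for "angle" are illustrative assumptions:

```python
# Heavily simplified per-beat feature extraction. Fixed windows around each
# R-peak stand in for real P/QRS/T delineation.
import numpy as np
from scipy.signal import find_peaks
from scipy.integrate import trapezoid

def beat_features(ecg, fs=500):
    ecg = np.asarray(ecg, dtype=float)
    r_peaks, _ = find_peaks(ecg, distance=int(0.4 * fs), prominence=np.std(ecg))
    feats = []
    for r in r_peaks:
        if r < int(0.25 * fs) or r + int(0.45 * fs) >= len(ecg):
            continue  # skip beats too close to the record edges
        p_win = ecg[r - int(0.20 * fs): r - int(0.06 * fs)]  # crude P-wave window
        t_win = ecg[r + int(0.08 * fs): r + int(0.40 * fs)]  # crude T-wave window
        feats.append({
            "r_time": r / fs,                 # peak location (s)
            "r_amp": float(ecg[r]),           # peak amplitudes
            "p_amp": float(p_win.max()),
            "t_amp": float(t_win.max()),
            "p_area": trapezoid(p_win) / fs,  # areas under each wave
            "t_area": trapezoid(t_win) / fs,
            # slope into the R-peak as one simple proxy for "angle"
            "r_slope": float(ecg[r] - ecg[r - int(0.02 * fs)]) / 0.02,
        })
    return feats
```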
So I started using past-cycle features within the same training instance to build models that predict the real-time BP waveform (a minimal sketch of that shifting step is below). That's when I learned that shifting time-series data applies not just to this problem but to every time-series ML problem I have played with. I only focused on real-time arterial BP at first. Long story short, I got it to where it could predict the entire BP wave within 2 to 14 mmHg per heart cycle. I had limited training data, and the signal processing fell short of what I had imagined due to limited help, so there were a few cases where it could be even a little further off. I wanted to learn how to do the invasive procedure on myself to collect more BP data, but found out this would have gotten me into serious trouble, and the data would have been thrown out anyway. I could not get any VC to believe me or take the risk. The medical field can be tricky for a self-taught person with no prior experience. Doctors were impressed, but I needed more data to get them to vouch for me.
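The shifting itself is simple once per-beat features exist. Here is a minimal pandas sketch; the column suffixes and the 3-cycle depth are just for illustration:

```python
# Minimal sketch of pulling past heart cycles into the current row, so one
# training instance carries the features of beats i, i-1, i-2, and i-3.
import pandas as pd

def add_lagged_cycles(df, n_lags=3):
    parts = [df]
    for k in range(1, n_lags + 1):
        parts.append(df.add_suffix(f"_lag{k}").shift(k))
    return pd.concat(parts, axis=1).dropna()  # first n_lags rows lack history

# usage: X = add_lagged_cycles(pd.DataFrame(beat_features(ecg)))
```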
Overall, neural networks and rule-based models both did very well. Training became a problem because the datasets ran to millions of instances with many features, which takes more CPU/GPU power than I had. So I simplified the models drastically to get results. For me it was a success at the very least, but medical professionals needed to see more patient data, which I could not provide. Plus, given my skill set, I did not know how to automate the pipeline: real-time ECG streaming, filtering, extraction, and prediction. So I stopped, because I had spent way too much time and money to get that far. I heard NO so many times from VCs that I stopped counting. This whole project was how I first started learning machine learning, with no background at all. It took a lot of time to fully understand how it works and where it should be applied.
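As a rough illustration of what "drastically simplified" can mean, here is a small multi-output regressor in the same spirit; the model type, layer sizes, and the synthetic stand-in shapes are my assumptions, not the original setup:

```python
# Illustration only: a small multi-output regressor mapping lagged beat
# features to a fixed-length per-beat BP waveform. Synthetic stand-in data
# is used so the snippet runs; real inputs would be the lagged features above.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 28))   # stand-in for ~7 features x 4 cycles
Y = rng.normal(size=(2000, 50))   # stand-in for BP waveforms, 50 samples/beat

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=200)
model.fit(X_tr, Y_tr)             # MLPRegressor handles multi-output natively
print(np.abs(model.predict(X_te) - Y_te).mean())  # mean error (mmHg on real data)
```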
If you want, I can share some of the bigger models I made for stock market price prediction. They only work on a daily scale, so they are not what I intended, but they produce decent predictions. It's an entirely different problem from the above. I wanted to use 15-minute and tick data, but unfortunately the models got so big my computer would crash when opening them, so for now I will never know if they would work as I imagined. Overall, I build these models in my head first before trying them out, as it helps me save time. Now I want to learn about convolutional networks to convert 2D video into 3D by predicting the second perspective. The first step will need depth prediction before the second perspective can be predicted accurately. Upscaling and sharpening of the predicted images will be required as well. I see how I want to do this, but I don't know how yet; I don't even understand how to build these types of models yet. This is mostly why I'm here, and I love Perceptilabs' setup and simplicity a lot.