Published by the Students of Johns Hopkins since 1896

Synthetic data could be the key for AI applications in medicine

By AKUL KESARWANI | April 6, 2023


CC BY-SA 4.0 / فاطمه درستی

A recent Hopkins paper seeks to bridge the domain gap between synthetic data and real data.

Advances in artificial intelligence (AI) have been revolutionizing many fields of science, including medicine. However, this technology raises the issue of acquiring data. AI needs annotated data to learn and ultimately perform at a high enough level of accuracy, but in many cases, such as complex and novel surgical scenarios, high-quality data is not easily accessible.

Researchers at the Laboratory for Computational Sensing and Robotics published a paper in Nature Machine Intelligence where they discussed a new model that can create realistic synthetic X-ray image data to train AI algorithms.

Computer-assisted surgery is a broad class of technology that helps surgeons plan out various surgical interventions. The researchers of this paper were particularly interested in improving surgical navigation techniques that allow surgeons to track where surgical devices are on an image of the patient taken before surgery. 

Cong Gao, who graduated from the University’s PhD program in the Department of Computer Science in 2022 and is the paper’s lead author, discussed this research in an interview with The News-Letter.

“If you look at self-driving cars, navigation means you want to know where the car is on the map,” he said. “In the surgical scenario, the map is your body. Here, navigation mostly refers to how you give the surgeon enough information so they can locate the surgical tool to the planned position.”

AI algorithms are used to help determine where the surgical tool is in relation to different landmarks on 2D X-ray images. Data on both the surgical tool and patient X-rays are used to train the AI models, but it is difficult to acquire enough of these highly specific images. Instead, the researchers tried to create synthetic data in simulations.

The main obstacle to creating synthetic data is that its characteristics can differ from those in real patient data. AI trained on synthetic data may only work on other synthetic images, posing issues when it is required to make predictions on real data. This problem is called the domain gap.

However, the researchers were able to work around this by using a concept known as domain randomization, where they take synthetic data and make large changes to the images. Adding these altered images to the training set can force the network to recognize new patterns and enhance performance. With this concept, they developed SyntheX, a system for creating AI X-ray image algorithms trained solely on synthetic data that can perform similarly to or even better than AI trained on real data.
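As a rough illustration of the idea, the Python sketch below applies large random changes to a synthetic grayscale image and adds the altered copies to a training set. It is only a sketch under stated assumptions: the function names `randomize_domain` and `augment_training_set`, the specific alterations (contrast shift, inversion, noise, a blanked-out patch) and their ranges are hypothetical choices made for this example, not the transformations used in SyntheX.

```python
import numpy as np


def randomize_domain(image, rng):
    """Return a heavily altered copy of a grayscale image with values in [0, 1].

    The alterations below are illustrative assumptions, not the SyntheX recipe.
    """
    out = image.astype(np.float64).copy()

    # Random gamma curve: shifts overall contrast.
    out = out ** rng.uniform(0.5, 2.0)

    # Invert intensities about half of the time.
    if rng.random() < 0.5:
        out = 1.0 - out

    # Additive Gaussian noise with a randomly chosen strength.
    out = out + rng.normal(0.0, rng.uniform(0.0, 0.1), size=out.shape)

    # Overwrite a random rectangle with a single random intensity.
    height, width = out.shape
    box_h = rng.integers(1, height // 2 + 1)
    box_w = rng.integers(1, width // 2 + 1)
    top = rng.integers(0, height - box_h + 1)
    left = rng.integers(0, width - box_w + 1)
    out[top:top + box_h, left:left + box_w] = rng.uniform(0.0, 1.0)

    return np.clip(out, 0.0, 1.0)


def augment_training_set(images, copies_per_image, seed=0):
    """Return the original images followed by randomized copies of each one."""
    rng = np.random.default_rng(seed)
    augmented = list(images)
    for image in images:
        for _ in range(copies_per_image):
            augmented.append(randomize_domain(image, rng))
    return augmented


if __name__ == "__main__":
    # A stand-in for one synthetic X-ray: a smooth 64 x 64 intensity ramp.
    synthetic = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
    training_set = augment_training_set([synthetic], copies_per_image=3)
    print(len(training_set), "images in the augmented training set")
```

Because every copy looks different, a network trained on the enlarged set cannot rely on the exact appearance of the simulated images and has to pick up on features that survive the changes.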

The researchers compared models trained on synthetic data with models trained on real data in three example tasks: hip imaging, surgical tool detection and COVID-19 chest X-ray analysis. Controlling for all other variables, they took real X-ray images and computerized tomography (CT) scans of 366 cadavers and generated synthetic X-ray images in their simulation that replicated the real data.

The AI model trained on synthetic data detected specific structures and landmarks in the patient X-rays with similar rates of error and, in some cases, performed even better than models trained on real data. The possibility of using synthetic data for AI in surgery opens opportunities for surgeons and researchers.

Gao described how one application is creating AI systems for new surgical tools that have not yet been tested in patients.

“You can also use this tool to generate scenarios that you have not performed yet. Let’s say you have a novel surgical tool you want to test, but you don’t have an actual data set yet,” he said. “You can plug the patient’s CT scan and the model for the surgical tool into this algorithm, and it will generate a very realistic X-ray image that includes both of the objects.”

Another important application is SyntheX’s ability to create large datasets for AI training. With a larger dataset to train on, AI tends to be more accurate. 

In the hip-imaging task, SyntheX was able to create 10,000 synthetic images from only 20 hip CT scans. This would allow researchers to avoid a major bottleneck: acquiring large datasets from real patient populations.

Additionally, the creation of synthetic data eliminates the ethical and privacy concerns that arise when acquiring patient data for AI. This can help researchers shorten the development time for many innovative surgical applications.

Gao plans to research new uses of AI with soft tissue applications and work on more advanced systems of human modeling in the future. He hopes this technology can help surgeons and improve patient outcomes.

“If the whole surgical procedure can be envisioned in a purely simulation environment, a lot of roadblocks can be anticipated before you walk into the operating room,” he said.
