<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>James Nagy | Emory REU Computational Mathematics for Data Science</title><link>http://www.math.emory.edu/site/cmds-reuret/author/james-nagy/</link><atom:link href="http://www.math.emory.edu/site/cmds-reuret/author/james-nagy/index.xml" rel="self" type="application/rss+xml"/><description>James Nagy</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Mon, 27 Jun 2022 11:36:49 -0400</lastBuildDate><image><url>http://www.math.emory.edu/site/cmds-reuret/author/james-nagy/avatar_hu_76a68c1223ab11fd.png</url><title>James Nagy</title><link>http://www.math.emory.edu/site/cmds-reuret/author/james-nagy/</link></image><item><title>New algorithm helps make portable CT machine images clearer</title><link>http://www.math.emory.edu/site/cmds-reuret/projects/2021-tomography/</link><pubDate>Tue, 14 Dec 2021 00:00:00 +0000</pubDate><guid>http://www.math.emory.edu/site/cmds-reuret/projects/2021-tomography/</guid><description>&lt;p>This blog post was written by Mai Phuong Pham Huynh, Manuel Santana, and Ana Castillo and published with minor edits. The team was advised by Dr. James Nagy.
In addition to this post, the team has also given a &lt;a href="https://github.com/EmoryMLIP/emory-reu-ret-website/files/6874766/_REU2021__Tomo_Presentation.pdf" target="_blank" rel="noopener">midterm presentation&lt;/a> , created a &lt;a href="./img/REU_RETPoster.pptx.png">poster&lt;/a>, made a &lt;a href="https://youtu.be/qdcGe9MKCoI" target="_blank" rel="noopener">poster blitz video&lt;/a>, published &lt;a href="https://github.com/manuelarturosantana/TomoREU2021" target="_blank" rel="noopener">code&lt;/a>, written a &lt;a href="https://arxiv.org/abs/2109.01481" target="_blank" rel="noopener">paper&lt;/a>, and developed a &lt;a href="Tomography.Lesson.pdf">lesson plan&lt;/a>.&lt;/p>
&lt;p>Research done at the 2021 REU/RET summer program at Emory University has produced an algorithm that can help make medical diagnoses more effective. The work behind this new algorithm relies heavily on mathematical techniques that connect numerical linear algebra with optimization methods.&lt;/p>
&lt;p>Some of the most important aspects of this work link branches of mathematics such as calculus, statistics, algebra, and geometry; complementary contributions from mathematicians across these branches have accelerated research in image reconstruction. The goal of the Point-of-Care Tomographic Imaging research group is to develop numerical methods that estimate the geometry parameters of a portable CT scanning device and then reconstruct the image, making images from portable devices noticeably clearer. The results of this work offer other student mathematicians a new way to study these problems and can help radiologists diagnose illnesses accurately and keep saving patients&amp;rsquo; lives.&lt;/p>
&lt;h1 id="covid-19-and-imaging">COVID-19 and Imaging&lt;/h1>
&lt;p>The mathematical ideas involved in the Point-of-Care Tomographic Imaging project matter because they make imaging possible for patients in all parts of the world. To get better reconstructed images, the group considers a regularized linear least squares problem, whose solution is approximated using an alternating descent scheme known as block coordinate descent (BCD). Each iteration of BCD has two steps: one step solves a linear least squares problem, and the other solves a nonlinear least squares problem.&lt;/p>
&lt;p style="font-size: 0.9rem;font-style: italic;"> &lt;img style="display: block;" src="https://live.staticflickr.com/65535/49679288857_200a679135_b.jpg" alt="Coronavirus">&lt;a href="https://www.flickr.com/photos/110751683@N02/49679288857">"Coronavirus"&lt;/a>&lt;span> by &lt;a href="https://www.flickr.com/photos/110751683@N02">Yu. Samoilov&lt;/a>&lt;/span> is licensed under &lt;a href="https://creativecommons.org/licenses/by/2.0/?ref=ccsearch&amp;atype=html" style="margin-right: 5px;">CC BY 2.0&lt;/a>&lt;a href="https://creativecommons.org/licenses/by/2.0/?ref=ccsearch&amp;atype=html" target="_blank" rel="noopener noreferrer" style="display: inline-block;white-space: none;margin-top: 2px;margin-left: 3px;height: 22px !important;">&lt;img style="height: inherit;margin-right: 3px;display: inline-block;" src="https://search.creativecommons.org/static/img/cc_icon.svg?image_id=6f5e7d78-bf4e-4b46-8ba9-c469af85860f" />&lt;img style="height: inherit;margin-right: 3px;display: inline-block;" src="https://search.creativecommons.org/static/img/cc-by_icon.svg" />&lt;/a>&lt;/p>
&lt;p>By the early 20th century, mathematicians had begun laying the foundations for these reconstruction techniques. In 1917, Johann Radon introduced the Radon transform, and in 1937 Stefan Kaczmarz introduced an iterative method for solving systems of linear equations. Later, Allan McLeod Cormack and Godfrey Newbold Hounsfield developed the first CT scanning devices. These innovations each contributed to modern medical scanning technology.&lt;/p>
&lt;p>The medical field has come to depend on these iterative reconstruction methods. In medical imaging, computed tomography (CT) techniques are becoming more and more popular for their ability to produce high quality images of the human body. CT methods use a combination of computer processing and mathematics to reconstruct images. The COVID-19 pandemic brought many challenges and setbacks to doctors around the world, and imaging played a vital role in helping diagnose the virus. Doctors have been able to use CT scans to diagnose and treat COVID-19 by examining the lungs of patients who are potentially infected or in recovery.&lt;/p>
&lt;h1 id="point-of-care-tomographic-imaging">Point-of-Care Tomographic Imaging&lt;/h1>
&lt;p>Since the first CT scanner was developed, CT scanning technology has evolved significantly. There are now portable CT scanners that can be transported anywhere without trouble, whereas transporting huge, heavy CT scanners to remote locations was always difficult. Point-of-care tomographic imaging has allowed radiologists to add portable CT scanners to their departments, increasing patient satisfaction and improving medical outcomes. Portable CT scanners can be used to scan a patient without moving the patient out of bed.&lt;/p>
&lt;p>A CT scanner is a device composed of a scanning gantry, an x-ray generator, a computer system, a console panel, and a physician’s viewing console. The scanning gantry is the part that produces and detects x-rays. In a typical CT scan, a patient lies on a bed that moves through the gantry. An x-ray source rotates around the patient and shoots x-ray beams through the body at different angles. These x-ray measurements are then processed on a computer using mathematical algorithms to create tomographic (cross-sectional) images of the tissues inside the body.&lt;/p>
&lt;p>Limitations arise when using these portable CT scanners for medical procedures since these devices require extensive care, such as regular calibration for effective performance. This is where the point-of-care tomographic imaging problem begins.&lt;/p>
&lt;h1 id="the-beginning-of-the-problem">The Beginning of the Problem&lt;/h1>
&lt;p align="center">
&lt;img width="300" height="300" src="https://user-images.githubusercontent.com/84742324/126915305-c8b40ba1-d37f-4317-9bf6-66e5155a8cd5.png">
&lt;/p>
&lt;p>Two parameters that relate to the geometry of portable CT scanning devices are:&lt;/p>
&lt;ol>
&lt;li>R - distance between source and detector and&lt;/li>
&lt;li>θ - orientation of source to detector.&lt;/li>
&lt;/ol>
&lt;p>During the CT acquisition process, the x-ray source rotates around the patient, and each time it moves to a new position these parameters change. In point-of-care imaging, these parameters are essential to the imaging process: imprecise data (unknown parameters) result in poor quality reconstructions, while precise data (known parameters) yield a well reconstructed image.&lt;/p>
&lt;p>&lt;em>How were these geometry parameters estimated to obtain better reconstructed images?&lt;/em>&lt;/p>
&lt;p align="center">
&lt;img width="700" height="300" src="https://user-images.githubusercontent.com/84742324/126884434-a47ee15e-2df5-45f1-8b5f-f1af63e894ad.png">
&lt;/p>
&lt;p>The group considered solving a regularized linear least squares problem. The solution is approximated using an alternating descent scheme known as block coordinate descent (BCD), in which each iteration alternately solves a linear least squares problem (for the image) and a nonlinear least squares problem (for the geometry parameters). An alternate form of this algorithm was also studied as a fixed point iteration, with acceleration techniques used to improve convergence. Implementation and numerical experiments were done in MATLAB. To understand the background behind this problem, the connection between Beer’s Law and linear algebra needs to be studied.&lt;/p>
&lt;h1 id="beers-law--linear-algebra">Beer’s Law &amp;amp; Linear Algebra&lt;/h1>
&lt;p>The Beer-Lambert Law, or Beer’s Law, was first developed in 1729 by Pierre Bouguer and extended by August Beer in 1852. Beer’s Law states that as an x-ray beam emitted by a source passes through an object, the beam loses energy exponentially. To create a 2-D image, an imaginary grid of pixels is placed atop the object, and x-ray beams are then shot through the object. Beer’s Law is used to turn the measurements into a system of linear equations.&lt;/p>
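&lt;p>In standard form (a sketch of the discretization in generic notation, not necessarily the group’s exact symbols), Beer’s Law for ray $i$ with incident intensity $I_0$ and measured intensity $I_i$ gives
$$I_i = I_0 \, e^{-\int_{\mathrm{ray}\ i} \mu \, ds} \quad\Longrightarrow\quad \sum_{j} a_{ij} x_j = \ln\frac{I_0}{I_i},$$
where $x_j$ is the attenuation in pixel $j$ and $a_{ij}$ is the length of ray $i$ inside pixel $j$. Collecting one such equation per ray produces the system of linear equations discussed next.&lt;/p>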
&lt;p align="center">
&lt;img width="400" height="300" src="https://user-images.githubusercontent.com/84742324/126884449-423a5fc2-3aa9-4d37-84ca-a8d2c73e3f5a.png">
&lt;/p>
&lt;p>From algebra, it is known that a system of linear equations has one of three types of solution sets: infinitely many solutions, exactly one solution, or no solution. When there is no solution, the system is called inconsistent, as opposed to consistent when a solution exists. When there are more equations (x-rays) than unknowns (pixels), the system is overdetermined; this is what happens in computed tomography. When there are more unknowns (pixels) than equations (x-rays), the system is underdetermined.
In each equation, every unknown represents a physical property, called attenuation, for each pixel. Pixels that contain the same material have the same attenuation. An image is then created from the pixels, and the attenuation values allow doctors to make more accurate diagnoses. For example, bones have a different attenuation than lungs.&lt;/p>
&lt;p>A small system of linear equations can be solved by hand with the Gauss-Jordan elimination method. In linear algebra, a matrix equation of the form Ax=b is solved to find x. Ideally, this can easily be done. Unfortunately, for this project it might not be possible, for several reasons: 1) b has noise in it, and 2) the exact matrix A may not be known, because the parameters R and θ may be perturbed (for example, the CT machine may not be calibrated correctly). The matrix A may also be huge, containing millions of rows and columns. Typical approaches learned in a first linear algebra class cannot be used to solve such problems, which are referred to as ill-posed.&lt;/p>
&lt;p>An ill-posed problem is one that is not well-posed. In the 20th century, French mathematician Jacques Hadamard defined a well-posed problem as one having three properties: 1) a solution exists, 2) the solution is unique, and 3) the solution depends continuously on the initial values. An ill-posed problem needs to be approached differently, e.g., with regularization. Tikhonov regularization is a common form of regularization for ill-posed problems.&lt;/p>
&lt;h1 id="the-linear--nonlinear-least-squares-problem">The Linear &amp;amp; NonLinear Least Squares Problem&lt;/h1>
&lt;p>Throughout the 1800s, mathematicians such as Legendre, Gauss, and Laplace contributed ideas to the method of least squares. For example, the method of least squares can be used to find the line that minimizes the sum of squared distances between the line and the given data points.
In computed tomography, a regularized linear least squares problem arises. It is composed of x, the image; A(p), a matrix constructed from the geometry parameters R and θ; b, the measured data; and a parameter α, called the regularization parameter. The nonlinear least squares problem, for the geometry parameters, does not include the regularization parameter.&lt;/p>
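&lt;p>In symbols (a sketch using the notation of the preceding paragraph; the team’s paper may scale the terms slightly differently), the regularized problem is
$$\min_{x,\,R,\,\theta} \; \|A(R,\theta)\,x - b\|_2^2 + \alpha^2 \|x\|_2^2,$$
and the nonlinear least squares subproblem for the geometry parameters is $\min_{R,\theta} \|A(R,\theta)\,x - b\|_2^2$ with the image $x$ held fixed.&lt;/p>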
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./BCDgif.gif" alt="Example of BCD" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The solution to this problem is found by block coordinate descent (BCD), an optimization algorithm that solves the problem iteratively by minimizing over the parameters R and θ and the image x one at a time. The linear least squares problem is considered first: with the initial parameters R and θ fixed, the matrix A is generated and x, the image, is approximated. Then the nonlinear least squares problem is considered: with x fixed, x is used to approximate the parameters R and θ. The pattern continues until a good image is produced.&lt;/p>
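&lt;p>The loop below is a minimal MATLAB sketch of this alternation, not the team’s actual code: &lt;code>buildMatrix&lt;/code> is a hypothetical helper that discretizes the CT operator for given parameters, and the linear step is written as a dense stacked solve that is only sensible for small problems (in practice, iterative solvers such as hybrid LSQR are used, as described below).&lt;/p>
&lt;pre>&lt;code>% Minimal BCD sketch for CT with uncertain geometry
% (buildMatrix is a hypothetical helper; p = [R; theta])
p = p0;                                   % initial guess for the geometry
for k = 1:maxIter
    A = buildMatrix(p);
    % Linear step: fix p, solve the Tikhonov-regularized problem for x
    n2 = size(A, 2);
    x = [A; alpha * speye(n2)] \ [b; zeros(n2, 1)];
    % Nonlinear step: fix x, fit the geometry parameters
    p = lsqnonlin(@(q) buildMatrix(q) * x - b, p);
end
&lt;/code>&lt;/pre>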
&lt;h1 id="filter---based-regularization">Filter - Based Regularization&lt;/h1>
&lt;p>Since it is impossible to get exact data from the detector, for numerous reasons, regularization is needed to lower the error in the solution caused by noise in the data. Tikhonov regularization is one of the most common ways to handle this noise. The method is named after mathematician Andrey Tikhonov, who worked on numerous topics in different fields of mathematics; his best-known contributions are in topology.&lt;/p>
&lt;p>The singular value decomposition (SVD) plays an important role in understanding the basic idea behind Tikhonov regularization, because the singular values of the matrix A are significant. To achieve well reconstructed images with small error, the large and small singular values must be balanced: the large singular values carry good information about the solution, but the small singular values cause errors to blow up.&lt;/p>
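&lt;p>In standard form (assuming the SVD $A = U \Sigma V^T$ with singular values $\sigma_i$ and singular vectors $u_i$, $v_i$), the Tikhonov solution can be written as
$$x_\alpha = \sum_i \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2} \cdot \frac{u_i^T b}{\sigma_i} \, v_i ,$$
where the filter factors $\sigma_i^2/(\sigma_i^2 + \alpha^2)$ are close to 1 for large singular values and close to 0 for small ones, damping exactly the components where noise would blow up.&lt;/p>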
&lt;h1 id="matlab-toolboxes--implementation">MATLAB Toolboxes &amp;amp; Implementation&lt;/h1>
&lt;p>MATLAB was used in the project, including the optimization, signal processing, and image processing toolboxes. The IR Tools, AIR Tools II, and Imfil packages were also used.&lt;/p>
&lt;p>To simulate a problem, the IR Tools package is used. IR Tools also provides regularized linear least squares solvers; three of them are used here: 1) the hybrid LSQR algorithm, 2) IRN, and 3) FISTA. To approximate the solution of the nonlinear least squares problem, two methods are used: 1) lsqnonlin and 2) imfil.&lt;/p>
&lt;p>The implementation of BCD follows the IR Tools package and its naming conventions. In the base IR Tools package, the function PRset is used to set up options for computed tomography problems; here PRset is updated to accept values related to BCD (such as the initial guess for the parameters). To simulate a CT problem with unknown geometry parameters, PRtomo_var is used, with the image size being n×n; PRtomo_var generates all the data necessary to simulate the inverse problem. IRset is updated to include parameters for BCD, such as which acceleration technique to use, and IRbcd implements the BCD algorithm.&lt;/p>
&lt;p>The steps above permit the simulation and solving of a CT reconstruction experiment. With n=64 and perturbations added to the parameters R and θ, the test problem is generated. The image used is the Shepp-Logan phantom.&lt;/p>
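&lt;p>In outline, the workflow looks like the sketch below; the option names and output arguments are illustrative assumptions on our part, so see the team’s code repository for the actual interface.&lt;/p>
&lt;pre>&lt;code>% Sketch of the experiment setup (option names are assumptions)
n = 64;
PRoptions = PRset('phantomImage', 'shepp-logan');    % test image
[A, b, x_true, ProbInfo] = PRtomo_var(n, PRoptions); % CT data with perturbed R, theta
IRoptions = IRset('acceleration', 'anderson');       % BCD options
[x, p, Info] = IRbcd(A, b, IRoptions);               % run the BCD algorithm
&lt;/code>&lt;/pre>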
&lt;p align="center">
&lt;img width="800" height="300" src="https://user-images.githubusercontent.com/84742324/126915614-ddf37d04-2973-4b8a-bb23-e086522b99bc.png">
&lt;/p>
&lt;p>After solving the problem with the BCD algorithm, it can be seen that the BCD solution resembles the reconstruction computed with the true parameters.&lt;/p>
&lt;p align="center">
&lt;img width="600" height="300" src="https://user-images.githubusercontent.com/84742324/126915629-d2eb8ef4-77d6-4375-a404-6e3d1876e1bf.png">
&lt;/p>
&lt;h1 id="acceleration-techniques-tests--results">Acceleration Techniques: Tests &amp;amp; Results&lt;/h1>
&lt;p>Mathematicians continue to work in these fields, and some have introduced acceleration techniques to speed up the convergence of fixed point iterations. One of them is Donald G. Anderson, famous for Anderson Acceleration, also called Anderson mixing. In the tomographic imaging project, estimation of the geometry parameters can be viewed as a fixed point iteration. The research group used three fixed point acceleration schemes in their numerical experiments: 1) the Irons-Tuck method, 2) the crossed secant method, and 3) Anderson Acceleration. All of these acceleration techniques were tested and appeared to improve convergence and image quality.&lt;/p>
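&lt;p>As an illustration of the fixed point viewpoint, the sketch below shows Anderson Acceleration with memory one, a common minimal variant and not necessarily the exact scheme the group implemented; &lt;code>g&lt;/code> stands for the fixed point map (for example, one BCD sweep applied to the geometry parameters).&lt;/p>
&lt;pre>&lt;code>% Anderson acceleration with memory one for a fixed point map g
p_old = p0;
g_old = g(p_old);
p = g_old;                           % plain fixed point step to start
for k = 1:maxIter
    gp = g(p);
    f  = gp - p;                     % current residual
    df = f - (g_old - p_old);        % difference of residuals
    gamma = (f' * df) / (df' * df);  % least squares mixing weight
    p_new = gp - gamma * (gp - g_old);   % extrapolated iterate
    p_old = p;  g_old = gp;  p = p_new;
end
&lt;/code>&lt;/pre>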
&lt;p align="center">
&lt;img width="1000" height="500" src="https://user-images.githubusercontent.com/84742324/126915667-fbe60be2-9fa5-4fa1-a7fd-45e910d19ad2.png">
&lt;/p>
&lt;p>From the tests performed, the acceleration techniques provided slightly better convergence, with the crossed secant method and Anderson Acceleration performing best. The Irons-Tuck method converged much better in the angle parameters, but took much longer; the Irons-Tuck image also seems to have the least background noise.&lt;/p>
&lt;p align="center">
&lt;img width="1000" height="300" src="https://user-images.githubusercontent.com/84742324/126915702-f55dcd15-4a71-4f4c-bbbd-480734a73394.png">
&lt;/p>
&lt;h2 id="poster-blitz-video">Poster Blitz Video&lt;/h2>
&lt;iframe width="560" height="315" src="https://www.youtube.com/embed/qdcGe9MKCoI" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen>&lt;/iframe>
&lt;h2 id="more-about-the-team">More About the Team&lt;/h2>
&lt;h3 id="mai-phuong-pham-huynh">Mai Phuong Pham Huynh&lt;/h3>
&lt;p>Mai is a rising junior at Emory University, majoring in Applied Mathematics and Statistics. Her research focus is on numerical analysis and scientific computing. In her free time, she enjoys listening to violin concertos, doing giant jigsaw puzzles, and searching for vintage vinyl records from the 50s-90s.&lt;/p>
&lt;p align='center'>
&lt;img width=500px, height=auto src="img/IMG_6452.jpeg" class='center'>
&lt;/p>
&lt;h3 id="manuel-santana">Manuel Santana&lt;/h3>
&lt;p>Manuel is a rising senior at Utah State University. His research interests are in all areas of computational and applied math. When not doing math he enjoys European handball, pickleball, racquetball, and cooking with his wife Emily.&lt;/p>
&lt;p align='center'>
&lt;img width=500px, height=auto src="img/linkedInProfile.jpg" class='center'>
&lt;/p>
&lt;h3 id="ana-castillo">Ana Castillo&lt;/h3>
&lt;p>Ana Castillo is in her 10th year teaching. She is certified to teach Spanish and Mathematics grades 6-12 in the states of Texas, Tennessee and Illinois. She is working for Proximity Learning, a company that has allowed her to teach virtually in different school districts in the United States. She has been a participant of the Park City Mathematics Institute Teacher Leadership Program (PCMI TLP) and part of a team that runs a Math Teachers’ Circle in the Rio Grande Valley Area. In her spare time, she enjoys taking long walks, listening to classical music (Vivaldi), spending time with her pets (dogs &amp;amp; donkeys), cooking, reading, writing and traveling.&lt;/p>
&lt;p align="center">
&lt;img width="300" height="400" src="https://user-images.githubusercontent.com/84742324/127031174-4f5717a8-00ee-40df-b342-73f16f2f30c1.jpg">
&lt;/p></description></item><item><title>Low-Precision Algorithms for Image Processing</title><link>http://www.math.emory.edu/site/cmds-reuret/projects/2022-mixed-precision/</link><pubDate>Mon, 27 Jun 2022 11:36:49 -0400</pubDate><guid>http://www.math.emory.edu/site/cmds-reuret/projects/2022-mixed-precision/</guid><description>&lt;p>This post was written by &lt;a href="https://github.com/kristinagxy" target="_blank" rel="noopener">Xiaoyun Gong&lt;/a>, &lt;a href="https://github.com/RileyCYZ" target="_blank" rel="noopener">Yizhou Chen&lt;/a>, and &lt;a href="https://github.com/zoejix" target="_blank" rel="noopener">Xiang Ji&lt;/a> and published with minor edits. The team was advised by Dr. &lt;a href="https://github.com/jnagy1" target="_blank" rel="noopener">James Nagy&lt;/a>. In addition to this post, the team has also created slides for a &lt;a href="REUmidterm_presentation.pdf">midterm presentation&lt;/a>, a &lt;a href="https://www.dropbox.com/s/139l0u7zi6eloao/mixed-precision.mp4?dl=0" target="_blank" rel="noopener">poster blitz&lt;/a> video, &lt;a href="https://github.com/kristinagxy/REU_code" target="_blank" rel="noopener">code&lt;/a>, and a &lt;a href="">paper&lt;/a>.&lt;/p>
&lt;h2 id="in-one-sentence">In One Sentence:&lt;/h2>
&lt;p>Our group experiments with iterative methods for solving inverse problems at different precision levels.&lt;/p>
&lt;h2 id="background-why-low-precision">Background: Why Low Precision?&lt;/h2>
&lt;p>What is the most important aspect of an excellent gaming experience? A lot of people would answer: real-time performance! Everyone wants their games to be fast, and it is always a bummer when the screen freezes during a critical moment in combat. This is why we are investigating low precision arithmetic: to decrease computation time and speed things up.&lt;/p>
&lt;p>Nowadays, most computer systems operate on double precision (64-bit) arithmetic. However, if we decrease the number of bits for each number to 16 bits or even lower, the processing time can be significantly reduced, although the benefit comes at the cost of a loss of accuracy.&lt;/p>
&lt;p align="center">
&lt;img src="img/IEEE.png" alt="drawing" width="500"/>
&lt;/p>
&lt;p>&lt;em>Using tensor cores for mixed-precision scientific computing. Oct. 2021. url: https://developer.nvidia.com/blog/tensor-cores-mixed-precision-scientific-computing/&lt;/em>&lt;/p>
&lt;h2 id="simulating-low-precision">Simulating Low Precision&lt;/h2>
&lt;h3 id="matlab-function-chop">Matlab function chop&lt;/h3>
&lt;p>To simulate low precision arithmetic on our 64-bit computers, we have imported a MATLAB package called &lt;a href="https://www.mathworks.com/matlabcentral/fileexchange/70651-chop?s_tid=mwa_osa_a" target="_blank" rel="noopener">chop&lt;/a>. The toolbox allows us to explore single precision, half precision, and other customized formats. Each input needs to be transformed, but the real work comes from chopping each operation. The code below is a toy example of how to calculate $x + y \times z$ in half precision with chop.&lt;/p>
&lt;p align="center">
&lt;img src="img/chop_overview.png" alt="drawing" width="300"/>
&lt;/p>
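&lt;p>For reference, the same computation can be written out as follows; this is a minimal sketch based on the toolbox’s documented usage, with variable names of our own choosing.&lt;/p>
&lt;pre>&lt;code>% Compute x + y*z in simulated half precision with chop
options.format = 'h';          % fp16
chop([], options)              % store the rounding options persistently
xh = chop(x);  yh = chop(y);  zh = chop(z);   % round the inputs
t = chop(yh * zh);             % round the product
s = chop(xh + t)               % round the sum
&lt;/code>&lt;/pre>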
&lt;h3 id="blocking">Blocking&lt;/h3>
&lt;p>When a number is chopped from double precision to half precision, a lot of bits are dropped (from 64 bits down to 16). This inevitably causes some inaccuracy, so to reduce the errors a method called blocking is used. Blocking breaks a large operation into smaller chunks, where each chunk is computed independently and the partial results are then summed.&lt;/p>
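&lt;p>A minimal sketch of a blocked inner product in simulated low precision (our own illustration, assuming chop’s rounding format has already been set as above):&lt;/p>
&lt;pre>&lt;code>function s = chopdot_blocked(x, y, bs)
% Inner product computed block by block in simulated low precision:
% each block of size bs is accumulated separately, and the partial
% sums are then combined, which limits error growth in long sums.
n = numel(x);
s = chop(0);
for k = 1:bs:n
    p = chop(0);
    for i = k:min(k + bs - 1, n)
        p = chop(p + chop(x(i) * y(i)));   % rounded multiply and add
    end
    s = chop(s + p);                       % rounded block sum
end
end
&lt;/code>&lt;/pre>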
&lt;p>We compute the inner product for each precision and block size 20 times and calculate the average. The errors are calculated as the differences between the result of the chopped version of the inner product and MATLAB&amp;rsquo;s built-in function.&lt;/p>
&lt;p align="center">
&lt;img src="img/blocksize.png" alt="draw" width="700"/>
&lt;/p>
On the left-hand side of the graph, where the size of the vector is 1000, the errors of half precision are the largest because it has the fewest bits. If we look at only half precision, we get the graph on the right with different vector sizes. The errors decrease sharply when the blocking method is introduced. However, larger block sizes do not necessarily mean lower errors; as the graph suggests, the errors increase again as the block size keeps growing. That is because when the block size is very large, it is the same as doing no blocking at all. For example, for a size-500 vector, once the block size reaches 500, the whole vector is put into the first block, the same as when blocking is not used. Therefore, the line becomes flat beyond a block size of 500. We use 256 as our default block size in our codes because the matrix dimension is rather large in our problem.
&lt;h2 id="inverse-problems-and-iterative-methods">Inverse Problems and Iterative Methods&lt;/h2>
&lt;p>Inverse problems are problems where the goal is to recover internal or hidden information (inputs) from outside measurements (results). The hidden data can be approximated by iterative methods, which repeatedly apply the same update rule, refining the variables each round in the hope that each iterate moves closer to the true value.&lt;/p>
&lt;h3 id="conjugate-gradient-method">Conjugate Gradient Method&lt;/h3>
&lt;p>The &lt;a href="https://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf" target="_blank" rel="noopener">Conjugate Gradient algorithm&lt;/a> (CG) aims to solve the linear system Ax = b where A is SPD (symmetric and positive definite),transforming the problem of finding solution to an optimization problem where we want to minimize $\phi(x)=\frac{1}{2}x^{T}Ax-x^{T}b$. This can be easily seen from $\nabla \phi (x) = 0$ -&amp;gt; $Ax-b=0$.&lt;/p>
&lt;p>In each step, the method provides a search direction and a step length such that the error of the current iteration is A-orthogonal to the search direction of the previous iteration; eventually, it converges to the minimizer. The CGLS algorithm is the least-squares version of the CG method, applied to the normal equations A&lt;sup>T&lt;/sup>Ax = A&lt;sup>T&lt;/sup>b. However, CGLS requires computing inner products, which can overflow for large-scale problems in low precision.&lt;/p>
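&lt;p>For reference, a standard CGLS iteration looks like the sketch below, written here in plain double precision; in our experiments each operation is additionally wrapped in chop. The inner products are the quantities that can overflow in fp16.&lt;/p>
&lt;pre>&lt;code>% Textbook CGLS sketch for min ||A*x - b||_2
x = zeros(size(A, 2), 1);
r = b;                    % residual of A*x = b
s = A' * r;               % residual of the normal equations
p = s;
gamma = s' * s;           % inner product (overflow risk in fp16)
for k = 1:maxIter
    q = A * p;
    alpha = gamma / (q' * q);   % another inner product
    x = x + alpha * p;
    r = r - alpha * q;
    s = A' * r;
    gamma_new = s' * s;
    beta = gamma_new / gamma;
    gamma = gamma_new;
    p = s + beta * p;
end
&lt;/code>&lt;/pre>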
&lt;h3 id="chebyshev-semi-iterative-method">Chebyshev Semi-Iterative Method&lt;/h3>
&lt;p>The Chebyshev Semi-Iterative (CS) Method requires no inner product computations, which is great because inner products easily cause overflow in low precision. But there is always a trade-off! The CS method requires the user to know a range containing the eigenvalues of the matrix A. The result given by CS is a linear combination of the solutions from all previous iterations, with weights obtained from Chebyshev polynomials, whose minimax property ensures that the error at each iteration of CS stays below an upper bound.&lt;/p>
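&lt;p>This favorable property can be stated a bit more precisely (a standard result, sketched in our own notation): the error of such a polynomial method satisfies $e_k = p_k(A)\,e_0$ for a residual polynomial with $p_k(0) = 1$, and the Chebyshev weights minimize the worst case over the eigenvalue interval,
$$\max_{\lambda \in [\lambda_{\min},\, \lambda_{\max}]} |p_k(\lambda)| \;=\; \min_{q_k(0)=1,\ \deg q_k \le k} \; \max_{\lambda \in [\lambda_{\min},\, \lambda_{\max}]} |q_k(\lambda)|,$$
which is also why the method needs an estimate of the eigenvalue range in the first place.&lt;/p>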
&lt;h2 id="experiment">Experiment&lt;/h2>
&lt;h3 id="ir-tools">IR Tools&lt;/h3>
&lt;p>We modify the CGLS method in the &lt;a href="https://github.com/jnagy1/IRtools.git" target="_blank" rel="noopener">IRtools&lt;/a> package in MATLAB so that it can operate in lower precision, and we use two test problems from the same package to investigate how the method performs at lower precision, mainly half precision.&lt;/p>
&lt;h3 id="image-deblurring-using-cgls">Image Deblurring Using CGLS&lt;/h3>
&lt;p>First, we use our modified version of CGLS without regularization to solve the image deblurring problem. In this application, we solve Ax = b, where b is an observed blurred image, A is a matrix that models the blurring operation, and x is the desired clean image. At the beginning we did not add any noise to b, and the results are shown below.&lt;/p>
&lt;p align="center">
&lt;img src="img/blur no noise.png" alt="draw" width="600"/>
&lt;/p>
&lt;p>We use our modified version of CGLS for single and half precision, and the graph in single precision is similar to the graph in double precision. However, for half precision, the background is not the same as that in the double-precision or single-precision graph; it contains more artifacts.&lt;/p>
&lt;p>We also plot the error norms of the solution at each iteration using different precision levels.&lt;/p>
&lt;p align="center">
&lt;img src="img/enrm blur no noise.png" alt="draw" width="600"/>
&lt;/p>
&lt;p>From the graph, all three error norms overlap from the beginning until around the 20th iteration, where the half-precision errors begin to deviate from those in single and double precision. The difference is due to the round-off errors of half precision, which add up and take over. Moreover, the error norms for half precision terminate at the 28th iteration because overflow in the inner products causes NaNs (Not a Number) to be computed during the iteration.&lt;/p>
&lt;p>After investigating the idealized situation where there is no noise in the observed image, we then apply our code to problems that contain additive random noise, to see how it is likely to perform in real life. That is, we try to compute x from the observed image b = Ax + noise.&lt;/p>
&lt;p>For half precision, with 0.1% noise, the picture looks almost the same as the one with no noise. However, if the noise level is increased to 1%, the background has substantially more artifacts, while the object in the middle is still identifiable: noise has taken over the black background but not the satellite yet. Eventually, with 10% noise, the whole image is flooded with noisy artifacts, and the picture no longer contains any meaningful information. Notice that the results below are generated using x from the best iteration, that is, the iteration with the smallest error norm, not from the last iteration.&lt;/p>
&lt;p align="center">
&lt;img src="img/Best cgls fp16 64 m noise.png" alt="draw" width="700"/>
&lt;/p>
&lt;p>Now we turn our attention to the error norm, the difference between the original image and the one our algorithm generates at each iteration. When 0.1% noise is added, as the number of iterations goes up, the error norm reduces significantly across all three formats. Intriguingly, for images with 1% or 10% noise, the best reconstruction is not the last iteration but somewhere in the middle (around the 50th iteration for 1% and the 10th for 10%). The reason behind this phenomenon, often called semiconvergence, is that while the iterations invert the blurring in the output image b, they also invert and amplify the noise blended into it; eventually the random data accumulate and dominate the solution. We show the results where the error norm is smallest to see the best possible solution we can compute. However, in reality the true x is not known, meaning we don&amp;rsquo;t know the error norms, so we can only show results from the last iteration, not from the best iteration.&lt;/p>
&lt;p align="center">
&lt;img src="img/3.png" alt="draw" width="700"/>
&lt;/p>
&lt;!--
### Tomography Reconstruction Using CGLS
Below is the result of CGLS on the tomography reconstruction test problem at different precision levels. For double and single precision, the reconstruction is doing well, yet for fp16, we start to get this completely blue picture from the first iteration caused by overflow of Inf/-Inf.
&lt;p align="center">
&lt;img src="img/tomo_plot.png" alt="draw" width="800"/>
&lt;/p>
Our solution to this issue is to rescale A and b by dividing both by 100. And we get the result below:
&lt;p align="center">
&lt;img src="img/tomo_rescale.png" alt="draw" width="300"/>
&lt;/p>
which is still blurry but at least the shape is visible :)
Then we add noise to the right-hand side b, and plot the error norms below:
&lt;p align="center">
&lt;img src="img/error_tomo.png" alt="draw" width="800"/>
&lt;/p>
As in the image deblurring problem, the error norms first decrease and then increase. In the cases with noise, this is mainly because noise starts to take over in the later part of the iteration. However, we still see the same behavior in the noise-free test problems at half precision, which is because the truncation errors accumulate as the iteration goes on.
-->
&lt;h3 id="image-deblurring-using-cs">Image Deblurring Using CS&lt;/h3>
&lt;p>In order to prevent overflow, we experiment with the CS algorithm (where no inner products are needed) and use chop for lower precision. Tikhonov regularization is applied to CS after finding that the algorithm performs poorly when A is ill-conditioned, due to its close-to-zero singular values. Now we are solving:
$$\min_{x} {||Ax-b||_2^2+\lambda^2||x||_2^2}$$ where $\lambda$ is a parameter that needs to be chosen. Here we show experiments for the case with 10% noise, and we use $\lambda$ = 0.199.&lt;/p>
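&lt;p>One way to see why regularization helps CS (an explanatory note in our own notation): the minimization above is equivalent to the normal equations
$$(A^T A + \lambda^2 I)\, x = A^T b,$$
whose eigenvalues lie in $[\lambda^2,\ \sigma_{\max}^2 + \lambda^2]$, where $\sigma_{\max}$ is the largest singular value of $A$. Regularization thus pushes the spectrum away from zero and supplies exactly the kind of eigenvalue interval that CS requires.&lt;/p>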
&lt;p>From the graph below, it is clear that even with 10% noise, the half-precision image looks very similar to that in double precision, better than what we have using CGLS (results from the last iteration).&lt;/p>
&lt;p align="center">
&lt;img src="img/cs_reg_0.1_blur.png" alt="draw" width="500"/>
&lt;/p>
For the image deblurring problem, we further confirm the similarity by plotting the error norms.
&lt;p align="center">
&lt;img src="img/cs_reg_0.1_blur_Enrm.png" alt="draw" width="300"/>
&lt;/p>
We can see that the error norms of the three precision levels overlap, illustrating that the result in half precision is close to that in double precision.
&lt;!--
### Tomography Reconstruction Using CS
The result for the tomography reconstruction problem using CS is showed below.
&lt;p align="center">
&lt;img src="img/cs_reg_0.1_tomo.png" alt="draw" width="500"/>
&lt;/p>
Although there is no inner product in CS, we still have overflow in half precision, which is because the matrix A is too large. Therefore, we rescale A and b by diving both of by 100 again, and the algorithm successfully runs to the end without generating NaNs. The image in half precision is as good as its counterpart in double precision, displaying clear boundaries and backgrounds. The error norms also overlap among the three precision levels.
-->
&lt;h3 id="image-deblurring-using-cgls-with-regularization">Image Deblurring Using CGLS with regularization&lt;/h3>
&lt;p>To fairly compare CGLS and CS, we add Tikhonov regularization to CGLS and run the test problem again. The diagrams are listed below.&lt;/p>
&lt;p align="center">
&lt;img src="img/cg_reg_blur_64_0.1_m.png" alt="draw" width="500"/>
&lt;/p>
&lt;p align="center">
&lt;img src="img/cg_reg_m_0.1_blur_Enrm.png" alt="draw" width="300"/>
&lt;/p>
The diagram for half precision looks much better than the one produced by CGLS without regularization. However, differences between half and double precision are still present in the background, and at the end of the half-precision iterations the error norms increase again. If we zoom in on the graphs of the error norms for CS and for CGLS with regularization, we can see that at half precision the error norms converge for CS but increase rapidly for CGLS, suggesting that for half precision CS is the better choice, especially when the noise level is high. When the noise level is close to zero, CS becomes less reliable because of the accumulation of round-off errors. For double precision, however, the CGLS method with regularization is clearly more stable, and therefore more suitable.
&lt;p align="center">
&lt;img src="img/Zoom in of Enrm for cs and cg.png" alt="draw" width="500"/>
&lt;/p>
We performed similar experiments with an image reconstruction problem from tomography; see our paper for further details!
&lt;h2 id="more-about-us">More about us&lt;/h2>
&lt;h3 id="yizhou-chen">Yizhou Chen&lt;/h3>
&lt;p align="center">
&lt;img src="img/bioRiley.png" alt="draw" width="300"/>
&lt;/p>
Hi! My name is Yizhou, but I go by Riley as well. I'm a rising junior at Emory University, double majoring in Applied Math and Physics. My interests are in computational math and computational physics. Outside work I enjoy watching sitcoms and my favourite one is Frasier! I am an animal person and I have a toy poodle who's nine years old. I also love cycling and hiking.
&lt;h3 id="xiaoyun-gong">Xiaoyun Gong&lt;/h3>
&lt;p align="center">
&lt;img src="img/gxy.jpeg" alt="drawing" width="300"/>
&lt;/p>
Hello, I am Xiaoyun Gong. I am a rising senior majoring in Applied Mathematics and Statistics. I am interested in math and I also enjoy coding!! In my free time I like watching anime (my most recent favorite is Made in Abyss) and drawing. I like sweet food and I am a cat person. 🐱
&lt;h3 id="xiang-ji">Xiang Ji&lt;/h3>
&lt;p align="center">
&lt;img src="img/Zoe_bio.png" alt="drawing" width="300"/>
&lt;/p>
Hi, I am Xiang Ji, but you can also call me Zoe. I am a rising junior at Emory University who is double majoring in applied mathematics and statistics and art history. My research interest is in computational mathematics and image processing. I enjoy going to art museums and watching movies. It's quite fun doing research this summer!</description></item><item><title>James Nagy</title><link>http://www.math.emory.edu/site/cmds-reuret/author/james-nagy/</link><pubDate>Mon, 27 Jun 2022 11:36:49 -0400</pubDate><guid>http://www.math.emory.edu/site/cmds-reuret/author/james-nagy/</guid><description>&lt;p>James G Nagy is a Samuel Candler Dobbs Professor and Chair of the Department of Mathematics at Emory University.&lt;/p></description></item></channel></rss>