Overview
I am a researcher in Georgia Tech's Agile Communications Architecture group. I work on machine learning for wireless networking. The goal is to send images over wireless links using less bandwidth, while keeping the important meaning of the scene.
I built a generative compression pipeline in PyTorch using Stable Diffusion. It lets us transmit a much smaller representation of an image and then reconstruct it on the other side. I use OpenAI CLIP to check that the reconstructed image still matches the original image in meaning. I also prototype software-defined radio systems in GNU Radio. This helps me study how interference and bit errors affect real transmissions, and how ML can make them more resilient.
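As a rough illustration of the semantic check described above, the sketch below compares two embedding vectors with cosine similarity. It is a minimal sketch, not the pipeline's actual code. The short lists stand in for real CLIP image embeddings. The names `cosine_similarity` and `is_semantically_consistent` and the 0.9 threshold are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def is_semantically_consistent(orig_emb, recon_emb, threshold=0.9):
    """Return True when the two embeddings point in nearly the same direction.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return cosine_similarity(orig_emb, recon_emb) >= threshold


# Toy 4-dimensional vectors standing in for real image embeddings (hypothetical).
original = [0.12, -0.48, 0.31, 0.80]
reconstructed = [0.10, -0.45, 0.35, 0.78]
unrelated = [-0.70, 0.20, 0.65, -0.10]

print(is_semantically_consistent(original, reconstructed))
print(is_semantically_consistent(original, unrelated))
```

In a real setup the vectors would come from the CLIP image encoder applied to the original and the reconstructed image, and the threshold would be chosen empirically.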
Right now, I am focused on DiT-JSCC (Diffusion Transformer Joint Source-Channel Coding). We moved away from converting images into short text captions, since captions can miss details. Instead, we send two streams of information. A semantic stream uses DINOv2 to capture the main objects and layout. A detail stream uses SwinJSCC to preserve textures and edges. The diffusion decoder uses both to rebuild a clear and faithful image at low bandwidth.
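To make the two-stream idea concrete, here is a minimal sketch of passing two feature streams through a simulated noisy channel. It rests on several assumptions that the description above does not state: an additive white Gaussian noise (AWGN) channel, real-valued symbols, and joint power normalization across both streams. The functions `power_normalize`, `awgn_channel`, and `transmit_two_streams` are hypothetical helpers, and the short lists stand in for the DINOv2 and SwinJSCC outputs.

```python
import math
import random


def power_normalize(symbols):
    """Scale symbols so their average power is 1."""
    power = sum(s * s for s in symbols) / len(symbols)
    if power == 0.0:
        raise ValueError("cannot normalize an all-zero signal")
    scale = 1.0 / math.sqrt(power)
    return [s * scale for s in symbols]


def awgn_channel(symbols, snr_db, rng):
    """Add Gaussian noise; for a unit-power input this yields the given SNR."""
    noise_std = math.sqrt(10 ** (-snr_db / 10))
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


def transmit_two_streams(semantic, detail, snr_db, seed=0):
    """Send both streams over one simulated channel and split them again.

    Concatenating the streams before normalization is an assumption of
    this sketch, not a documented property of DiT-JSCC.
    """
    rng = random.Random(seed)
    tx = power_normalize(list(semantic) + list(detail))
    rx = awgn_channel(tx, snr_db, rng)
    return rx[:len(semantic)], rx[len(semantic):]


# Toy feature vectors (hypothetical values).
semantic_features = [0.9, -0.3, 0.5]
detail_features = [0.1, 0.4, -0.7, 0.2, -0.6]

rx_semantic, rx_detail = transmit_two_streams(
    semantic_features, detail_features, snr_db=10.0
)
print(len(rx_semantic), len(rx_detail))
```

A decoder would then consume the two received streams together. Lowering `snr_db` increases the noise and is one simple way to probe how robust a reconstruction is to channel quality.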
Technologies
Python, PyTorch, Stable Diffusion, Diffusion Transformers, OpenAI CLIP, DINOv2, SwinJSCC, GNU Radio, Software-Defined Radio, Wireless Communications, Joint Source-Channel Coding, Semantic Communication, Image Compression, Generative AI, Computer Vision, Deep Learning