# Fred's PhD Thesis Project: VQ-VAE-2 Image Compression
This repository repurposes the generative architecture of Razavi et al.'s multi-level Vector Quantized Variational AutoEncoder (VQ-VAE-2) to compress medical images in PyTorch. The compressed latent vectors and reconstructed images are also used to train the CheXNet algorithm (a DenseNet-121 pre-trained on ImageNet).
## Usage

### Architecture
This repository supports a two-level VQ-VAE (top and bottom hierarchical layers). The vector quantization workflow is outlined in the figure below.

*(figure: vector quantization workflow)*
Two levels of convolution-based encoding capture both local (first-layer) and global (second-layer) features.
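The quantization step at the heart of each level can be sketched in a few lines: every encoder output vector is replaced by its nearest codebook entry, and the decoder works from the looked-up codebook vectors. A minimal pure-Python illustration (the repository's actual implementation lives in networks.py and operates on PyTorch tensors; the codebook and latent values below are made up):

```python
# Minimal sketch of vector quantization: map each latent vector to the
# index of its nearest codebook entry, then reconstruct from the codebook.

def quantize(vectors, codebook):
    """Return, for each vector, the index of its nearest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
            for v in vectors]

codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]   # learned embeddings (toy values)
latents = [[0.1, -0.2], [0.9, 1.1]]               # encoder outputs (toy values)

indices = quantize(latents, codebook)             # → [0, 1]
reconstructed = [codebook[i] for i in indices]    # decoder input
```

Only the integer indices need to be stored, which is what makes the latent representation compressive.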
We converted our datasets into HDF5 format as input for faster training. We used the MIMIC-CXR and CheXpert datasets for training and external validation.
## Prerequisites
- python3
- PyTorch (torch)
- torchvision
- HDF5 (h5py)
- numpy
- tqdm
- matplotlib
- scikit-learn (sklearn)
## Getting Started
### Create HDF5 Dataset
We used HDF5 datasets to create and save padded images so that training does not require pre-processing the images each time.
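The padding step can be sketched as follows. This is an assumed behavior (zero-padding the shorter dimension to make the image square, padding on the bottom/right); create_hdf5.py may center or fill differently:

```python
# Sketch of padding an image to a square before saving it to the dataset.
# The image is represented as a list of pixel rows.

def pad_to_square(img, fill=0):
    """Pad a 2-D image to a square, adding fill values on the right/bottom."""
    h, w = len(img), len(img[0])
    side = max(h, w)
    padded = [row + [fill] * (side - w) for row in img]   # widen each row
    padded += [[fill] * side for _ in range(side - h)]    # add missing rows
    return padded

img = [[1, 2, 3],
       [4, 5, 6]]            # 2 x 3 image
square = pad_to_square(img)  # → 3 x 3, new cells filled with 0
```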
### Training
- To run VQ-VAE training with the default hyperparameters, run train_vqvae.py.
- To increase the compression ratio, change the stride at each hierarchy with the first_stride or second_stride flags. Note: strides increase in powers of 2: 2, 4, 8, 16.
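The relation between the stride flags and the size of the latent maps is simple arithmetic, assuming each level downsamples spatially by its stride factor (typical for strided-convolution encoders; the exact factors in this repository may differ):

```python
# Illustrative arithmetic: how the two stride flags determine latent-map sizes.

def latent_sizes(image_side, first_stride, second_stride):
    """Spatial side length of the bottom (local) and top (global) latent maps."""
    bottom = image_side // first_stride    # downsampled once
    top = bottom // second_stride          # downsampled again
    return bottom, top

bottom, top = latent_sizes(1024, first_stride=4, second_stride=2)
# → bottom latent map 256x256, top latent map 128x128
```

Larger strides shrink the latent maps and so raise the compression ratio, at the cost of reconstruction fidelity.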
- To train the DenseNet-121 classifier, run train_densenet.py. Training can be conducted with original images, latent vectors, or reconstructed images as input.
Note: checkpoint files can be found in the [save_path]/checkpoints directory after training.
### Testing
- Use the test_model.ipynb Jupyter notebook to:
  - create a padded image from any image file to fit a square aspect ratio
  - save reconstructed images from trained models
  - calculate PSNR from saved images (from create_images.py)
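The PSNR computation on a saved original/reconstruction pair follows the standard definition; create_images.py's exact implementation may differ in details such as the peak value used:

```python
# Peak signal-to-noise ratio between two equal-length pixel sequences,
# using the standard definition PSNR = 10 * log10(MAX^2 / MSE).
import math

def psnr(original, reconstruction, max_val=255.0):
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstruction)) / len(original)
    if mse == 0:
        return float('inf')   # identical images
    return 10 * math.log10(max_val ** 2 / mse)

value = psnr([50, 100, 150, 200], [52, 98, 149, 201])  # ≈ 44.15 dB
```

Higher PSNR means the reconstruction is closer to the original; identical images give infinite PSNR.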
Note: loading the saved models requires a CUDA-enabled device. If the device does not have CUDA, load the file with `torch.load('checkpoint.pt', map_location='cpu')`.

- To run the profiling code on DenseNet-121 training, comment out the @profile decorator on line 266 of networks.py. Once training is complete, the pytorch_memlab library will output the profiling info directly to the terminal.
## Results
- Loss curves are automatically generated in the [save_path] directory during training.
- Reconstruction performance was satisfactory when evaluated with external datasets. In the example below, the algorithm was trained with the CheXpert dataset (frontal view) and externally validated with the MIMIC-CXR dataset (both frontal and lateral views).
*(figure: example reconstructions on MIMIC-CXR)*
The trained model is robust to various input manipulations. In the figure, input images are shown on top and reconstructed images below.

*(figure: manipulated inputs, top; reconstructions, bottom)*
- Classification performance of DenseNet-121, as measured by AUROC, was satisfactory with original images as input and actually increased with reconstructed images and compressed latent vectors as input. We suspect that the VQ-VAE-2 acts as a denoising autoencoder.
Download links are available for the saved models and for the original and reconstructed images from the MIMIC-CXR validation dataset.
## Authors
- Young Joon (Fred) Kwon MS, MD PhD Student; Icahn School of Medicine at Mount Sinai
- G Anthony (Tony) Reina MD, Chief AI Architect for Health & Life Sciences; Intel Corporation
- Ping Tak Peter Tang PhD, Research Scientist; Facebook
- Eric K Oermann MD, Instructor, Department of Neurosurgery; Director, AISINAI; Icahn School of Medicine at Mount Sinai
- Anthony B Costa PhD, Assistant Professor, Department of Neurosurgery; Director, Sinai BioDesign; Icahn School of Medicine at Mount Sinai
## License
This project is licensed under the Apache License, Version 2.0 - see the LICENSE.txt file for details.
## Acknowledgments
- MSTP T32 NIH T32 GM007280
- RSNA Medical Student Research Grant
- Intel Software and Services Group Research Grant