杰瑞科技汇

python dicom jpeg

Working with DICOM images that contain JPEG (or JPEG-LS, JPEG 2000) pixel data is a very common task in medical imaging. The standard Python library for this is pydicom.

The key challenge is that pydicom on its own can read the DICOM metadata (patient information, image dimensions, and so on) from a JPEG-compressed file, but it needs an additional library to decompress the pixel data into a usable NumPy array.

Here's a complete guide covering the essentials.


Prerequisites: The Right Tools

You need two main libraries:

  1. pydicom: The core library for reading, writing, and manipulating DICOM files.
  2. gdcm or pyjpegls: A decompressor library that pydicom can use.
    • gdcm (GDCM - Grassroots DICOM): A powerful, C++-based library. It's the most robust choice and supports JPEG, JPEG-LS, and JPEG 2000. This is the recommended option.
    • pyjpegls: A Python wrapper around the CharLS C++ library. It's lighter but only supports JPEG-LS compression.

Installation

First, install pydicom. Then, install one of the decompressors.

Option A: Recommended (using gdcm)

This is the most reliable method. The python-gdcm package on PyPI normally ships prebuilt wheels, so a plain pip install is usually enough. If no wheel is available for your platform, fall back to your system's package manager (e.g., apt, brew) or conda.

# Install pydicom
pip install pydicom
# Install the Python bindings for gdcm (usually all you need)
pip install python-gdcm
# Fallbacks if the pip install fails on your platform:
# On Ubuntu/Debian (the package name may vary by release):
sudo apt-get update
sudo apt-get install python3-gdcm
# On macOS (using Homebrew; check that the Python bindings are included):
brew install gdcm
# On Windows, it's often easiest to use conda:
conda install -c conda-forge gdcm

Option B: Simpler (using pyjpegls)

Use this if you are certain you only need to handle JPEG-LS compressed files and want to avoid system-level dependencies.

pip install pydicom pyjpegls
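
After installing, it can help to confirm which of these packages Python can actually see. The following is a minimal sketch using only the standard library. The import names are assumptions based on how each project is usually packaged (python-gdcm imported as gdcm, pyjpegls imported as jpeg_ls), and the helper name installed_packages is made up for this illustration.

```python
from importlib.util import find_spec

# Assumed import names: python-gdcm -> "gdcm", pyjpegls -> "jpeg_ls".
CANDIDATE_MODULES = {
    "pydicom": "pydicom",
    "python-gdcm": "gdcm",
    "pyjpegls": "jpeg_ls",
}


def installed_packages(candidates=CANDIDATE_MODULES):
    """Return {package name: bool} saying whether each top-level module can be found."""
    return {pkg: find_spec(mod) is not None for pkg, mod in candidates.items()}


if __name__ == "__main__":
    for pkg, ok in installed_packages().items():
        print(f"{pkg:12s} {'installed' if ok else 'missing'}")
```

This only checks that the module can be located, not that it imports cleanly or that pydicom will pick it up, so treat a "missing" result as the useful signal.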

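The Core Problem: Compressed Transfer Syntaxes

Whether the pixel data is compressed is recorded in the Transfer Syntax UID in the file meta information (ds.file_meta.TransferSyntaxUID), not in the PhotometricInterpretation tag. PhotometricInterpretation describes the color model of the pixels (for example MONOCHROME2, RGB, or YBR_FULL_422). Values such as YBR_FULL_422 are common in JPEG-compressed files, but RGB appears in uncompressed files as well, so it is not a reliable indicator. When the transfer syntax is a compressed one, the pixel data needs special handling.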

If you try to access pixel_array on a compressed file without a decompressor, pydicom will raise an error:

# This will FAIL if no decompressor is found
import pydicom

# Replace this placeholder with the path to a JPEG-compressed DICOM file
ds = pydicom.dcmread("path/to/your_compressed_image.dcm")
print(f"Transfer Syntax: {ds.file_meta.TransferSyntaxUID.name}")
print(f"Photometric Interpretation: {ds.PhotometricInterpretation}")
print(f"Rows: {ds.Rows}, Columns: {ds.Columns}")
# Without a suitable decompressor, this line raises an error
# (typically RuntimeError or NotImplementedError, depending on the pydicom version)
pixel_array = ds.pixel_array

The Solution: Reading and Decompressing

With gdcm installed, pydicom will automatically detect it and use it to decompress the pixel data when you access the pixel_array attribute. The process is seamless.

Here is a complete, working example.

Example Code

import sys
import pydicom
from pydicom.data import get_testdata_file
# --- 1. Make sure you have a DICOM file with JPEG compression ---
# This example uses "JPEG2000.dcm", a JPEG 2000-compressed sample from pydicom's
# own test data, which gdcm handles well.
# Replace `filename` with the actual path to your DICOM file.
# You can find sample DICOM files online (e.g., from The Cancer Imaging Archive - TCIA).
filename = get_testdata_file("JPEG2000.dcm")
if filename is None:
    print("Error: the sample file 'JPEG2000.dcm' was not found.")
    print("Please set 'filename' to the path of your own DICOM file.")
    sys.exit(1)
ds = pydicom.dcmread(filename)
# --- 2. Check the metadata to confirm compression ---
print(f"File: {filename}")
print(f"Transfer Syntax UID: {ds.file_meta.TransferSyntaxUID.name}")
print(f"Compressed: {ds.file_meta.TransferSyntaxUID.is_compressed}")
print(f"Photometric Interpretation: {ds.PhotometricInterpretation}")
print(f"Rows: {ds.Rows}, Columns: {ds.Columns}")
print(f"Bits Allocated: {ds.BitsAllocated}")
print("-" * 30)
# --- 3. The Magic: Access pixel_array ---
# With gdcm installed, this line will automatically decompress the pixel data.
# It might take a moment for large or highly compressed images.
try:
    pixel_array = ds.pixel_array
    print("Successfully read and decompressed pixel data!")
    print(f"Pixel array shape: {pixel_array.shape}")
    print(f"Pixel array dtype: {pixel_array.dtype}")
    print(f"Pixel value range: [{pixel_array.min()}, {pixel_array.max()}]")
except Exception as e:
    print("Failed to read pixel data. This usually means no compatible decompressor (like gdcm) was found.")
    print(f"Error: {e}")
    sys.exit(1)
# --- 4. Visualize the image (optional, requires matplotlib) ---
# matplotlib is imported here so that a missing install only skips the plot.
try:
    import matplotlib.pyplot as plt
except ImportError:
    print("\nMatplotlib is not installed. Skipping image visualization.")
    print("To install it, run: pip install matplotlib")
else:
    plt.figure(figsize=(8, 8))
    # For grayscale, use a colormap
    plt.imshow(pixel_array, cmap="gray")
    plt.title("Decompressed DICOM Image")
    plt.colorbar()
    plt.axis('off')
    plt.show()

Writing a DICOM File with JPEG Compression

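Sometimes, you might want to take an uncompressed image (e.g., from a NumPy array) and save it in a DICOM file with JPEG compression to save space.

Setting the TransferSyntaxUID to a JPEG-family value is necessary but not sufficient: pydicom does not compress PixelData automatically when writing, so the pixel data itself must be encoded and encapsulated first. Recent pydicom releases provide Dataset.compress(), which encodes the pixel data, encapsulates it, and updates the TransferSyntaxUID in one step, provided an encoder plugin for the chosen transfer syntax is installed (for example, pyjpegls for JPEG-LS in pydicom 3.0 and later).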
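
Compressed pixel data is not stored as one flat byte string. DICOM wraps each compressed fragment in an item, preceded by a Basic Offset Table item and followed by a sequence delimiter. The sketch below illustrates that layout for the simplest case (one fragment per frame, empty offset table) using only the standard library. The function name encapsulate_frames is hypothetical and the code is for illustration only. In practice pydicom.encaps.encapsulate() does this for you.

```python
import struct

ITEM_TAG = b"\xfe\xff\x00\xe0"       # (FFFE,E000) Item, little endian
SEQ_DELIM_TAG = b"\xfe\xff\xdd\xe0"  # (FFFE,E0DD) Sequence Delimitation Item


def encapsulate_frames(frames):
    """Wrap already-compressed frames as encapsulated pixel data.

    Uses one fragment per frame and an empty Basic Offset Table.
    """
    out = bytearray()
    out += ITEM_TAG + struct.pack("<L", 0)  # empty Basic Offset Table
    for frame in frames:
        if len(frame) % 2:
            frame = frame + b"\x00"  # item values must have even length
        out += ITEM_TAG + struct.pack("<L", len(frame)) + frame
    out += SEQ_DELIM_TAG + struct.pack("<L", 0)
    return bytes(out)


if __name__ == "__main__":
    fake_jpeg = b"\xff\xd8\xff"  # not a real image, just three bytes
    print(encapsulate_frames([fake_jpeg]).hex(" "))
```

This is why writing the raw bytes of a NumPy array under a compressed transfer syntax produces an invalid file: the bytes are neither encoded by a JPEG codec nor wrapped in this item structure.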

Common JPEG Transfer Syntax UIDs

  • JPEG Baseline (Process 1): pydicom.uid.JPEGBaseline8Bit (1.2.840.10008.1.2.4.50)
  • JPEG Lossless (Process 14, Selection Value 1): pydicom.uid.JPEGLosslessSV1 (1.2.840.10008.1.2.4.70)
  • JPEG-LS Lossless: pydicom.uid.JPEGLSLossless (1.2.840.10008.1.2.4.80)
  • JPEG-LS Near-lossless (lossy): pydicom.uid.JPEGLSNearLossless (1.2.840.10008.1.2.4.81)
  • JPEG 2000 Image Compression: pydicom.uid.JPEG2000Lossless (1.2.840.10008.1.2.4.90, lossless only) or pydicom.uid.JPEG2000 (1.2.840.10008.1.2.4.91)
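
Because the three JPEG families need different codecs, it is often useful to map a Transfer Syntax UID to its family before choosing a decompressor. Here is a minimal standard-library sketch of such a lookup. The table covers only the UIDs discussed here plus the two common uncompressed syntaxes, and the names TRANSFER_SYNTAXES and codec_family are made up for this example. pydicom's own UID objects expose similar information through their name and is_compressed properties.

```python
# UID string -> (human-readable name, codec family)
TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2": ("Implicit VR Little Endian", "uncompressed"),
    "1.2.840.10008.1.2.1": ("Explicit VR Little Endian", "uncompressed"),
    "1.2.840.10008.1.2.4.50": ("JPEG Baseline (Process 1)", "JPEG"),
    "1.2.840.10008.1.2.4.70": ("JPEG Lossless (Process 14, SV1)", "JPEG"),
    "1.2.840.10008.1.2.4.80": ("JPEG-LS Lossless", "JPEG-LS"),
    "1.2.840.10008.1.2.4.81": ("JPEG-LS Near-lossless", "JPEG-LS"),
    "1.2.840.10008.1.2.4.90": ("JPEG 2000 (Lossless Only)", "JPEG 2000"),
    "1.2.840.10008.1.2.4.91": ("JPEG 2000", "JPEG 2000"),
}


def codec_family(uid):
    """Return the codec family for a Transfer Syntax UID, or "unknown"."""
    _name, family = TRANSFER_SYNTAXES.get(str(uid), ("", "unknown"))
    return family


if __name__ == "__main__":
    for uid in ("1.2.840.10008.1.2.4.80", "1.2.840.10008.1.2.1", "1.2.3"):
        print(uid, "->", codec_family(uid))
```

Since pyjpegls only handles the JPEG-LS family, a check like codec_family(uid) == "JPEG-LS" tells you whether it is enough or whether you need gdcm.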

Example: Writing a JPEG-LS Compressed DICOM

import pydicom
import numpy as np
from pydicom.dataset import Dataset, FileMetaDataset
from pydicom.uid import ExplicitVRLittleEndian, JPEGLSLossless
# NOTE: JPEG-LS encoding through Dataset.compress() assumes pydicom 3.0 or later
# with pyjpegls installed as the encoder plugin.
# 1. Create a new DICOM dataset
# This will be our header information
new_ds = Dataset()
# 2. Add some required standard tags (you can customize these)
new_ds.SOPClassUID = '1.2.840.10008.5.1.4.1.1.2'  # CT Image Storage
new_ds.SOPInstanceUID = pydicom.uid.generate_uid()
new_ds.PatientName = "Test^Patient"
new_ds.PatientID = "12345"
new_ds.StudyInstanceUID = pydicom.uid.generate_uid()
new_ds.SeriesInstanceUID = pydicom.uid.generate_uid()
# 3. Add pixel data related tags
# Let's create a sample 512x512 grayscale image
pixel_data = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)
new_ds.Rows = 512
new_ds.Columns = 512
new_ds.BitsAllocated = 8
new_ds.BitsStored = 8
new_ds.HighBit = 7
new_ds.PixelRepresentation = 0  # 0 for unsigned
new_ds.SamplesPerPixel = 1
new_ds.PhotometricInterpretation = "MONOCHROME2"
# 4. Start from an uncompressed transfer syntax
# Setting a JPEG-LS transfer syntax here would NOT compress anything by itself
new_ds.file_meta = FileMetaDataset()
new_ds.file_meta.TransferSyntaxUID = ExplicitVRLittleEndian
# 5. Compress the pixel data
# compress() encodes the array, encapsulates the result into PixelData,
# and sets the TransferSyntaxUID to JPEG-LS Lossless
new_ds.compress(JPEGLSLossless, pixel_data)
# 6. Save the file
output_filename = "my_jpeg_ls_image.dcm"
pydicom.dcmwrite(output_filename, new_ds, enforce_file_format=True)
print(f"Successfully created a JPEG-LS compressed DICOM file: {output_filename}")
# --- Verification: Let's read it back ---
# This should now work if gdcm or pyjpegls is installed
read_ds = pydicom.dcmread(output_filename)
print("\nVerification of written file:")
print(f"Photometric Interpretation: {read_ds.PhotometricInterpretation}")
print(f"Transfer Syntax UID: {read_ds.file_meta.TransferSyntaxUID.name}")
reconstructed_pixels = read_ds.pixel_array
print(f"Read back pixel array shape: {reconstructed_pixels.shape}")
print(f"Original and reconstructed arrays are equal: {np.array_equal(pixel_data, reconstructed_pixels)}")

Summary

Task | Key Code/Concept | Required Libraries
--- | --- | ---
Read DICOM with JPEG | ds = pydicom.dcmread("file.dcm"); pixel_array = ds.pixel_array | pydicom + gdcm (or pyjpegls)
Check if Compressed | print(ds.file_meta.TransferSyntaxUID.name); print(ds.file_meta.TransferSyntaxUID.is_compressed) | pydicom
Write DICOM with JPEG | new_ds.compress(pydicom.uid.JPEGLSLossless, pixel_data); pydicom.dcmwrite("out.dcm", new_ds, enforce_file_format=True) | pydicom + an encoder plugin (e.g., pyjpegls)
Why it Works | pydicom's pixel_array attribute acts as a property that triggers decompression if a compatible library is found. | pydicom's internal logic + external decompressor