Python OpenSlide: Automatically Tiling Pathology Images



Overview

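Whole slide images (WSI) are extremely large and awkward to process. This post introduces OpenSlide, a Python library with a convenient interface for working with large pathology images.

With the growth of deep learning in medicine, pathology images are becoming increasingly important. Most of them, however, are around 100,000 x 100,000 pixels, which ordinary image-processing libraries cannot read. The open-source OpenSlide library provides a clean and convenient way to read them.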


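OpenSlide itself is a C library; OpenSlide Python is its Python interface. It is mainly used to process whole-slide images, the high-resolution images typically used in digital pathology. These images have the following characteristics:

  • A single image may contain tens of thousands of megapixels, which ordinary libraries may be unable to handle.
  • A whole-slide image may store several resolution levels.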



OpenSlide Python is a Python interface to the OpenSlide library.


OpenSlide is a C library that provides a simple interface for reading whole-slide images, also known as virtual slides, which are high-resolution images used in digital pathology. These images can occupy tens of gigabytes when uncompressed, and so cannot be easily read using standard tools or libraries, which are designed for images that can be comfortably uncompressed into RAM. Whole-slide images are typically multi-resolution; OpenSlide allows reading a small amount of image data at the resolution closest to a desired zoom level.


OpenSlide can read virtual slides in several vendor formats; the examples below use an Aperio (.svs) file.



Requirements


  • Python 2 >= 2.6 or Python 3 >= 3.3
  • OpenSlide >= 3.4.0
  • Python Imaging Library or Pillow


Installation


  1. Install OpenSlide.
  2. pip install openslide-python


Setting up OpenSlide on macOS


brew install opencv
brew install openslide
pip install openslide-python




Using PIL


setup.py assumes that you want to use the Pillow fork of PIL. If you already have classic PIL installed, you can use it instead. Install OpenSlide Python with:

pip install --no-deps openslide-python

or, if you are installing by hand:

python setup.py install --single-version-externally-managed --record /dev/null


Documentation


Basic usage


OpenSlide objects

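An OpenSlide object represents one whole-slide image. Instantiating it is simple: pass the file path as the argument, which is the only parameter.

It behaves much like a Python file object: it opens a connection to the file and reads data on demand while you work with it, rather than loading everything into memory. Accordingly, the class can also be used as a context manager.

It therefore has a close() method for closing the object.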

import openslide
from openslide import deepzoom
import matplotlib.pyplot as plt

# Open and close explicitly
slide = openslide.OpenSlide('./example.svs')
print(type(slide))  # <class 'openslide.OpenSlide'>
slide.close()

# Or let the context manager close the slide
with openslide.OpenSlide('./example.svs') as slide:
    print(type(slide))  # <class 'openslide.OpenSlide'>
 

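Attributes

  • level_count: the number of resolution levels in the slide. Level 0 is the highest resolution and level (level_count - 1) is the lowest.
  • dimensions: the (width, height) tuple at level 0, i.e. the width and height of the slide at its highest resolution.
  • level_dimensions: the (width, height) of every level.
  • level_downsamples: the downsample factor of each level relative to level 0, so that level_dimensions[k] ≈ dimensions / level_downsamples[k].
  • properties: the metadata of the whole-slide image, a dict-like object whose values are all strings.
  • associated_images: also metadata, but the values of this dict-like object are PIL images.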


with openslide.OpenSlide('./example.svs') as slide:
    print(slide.level_count)
    print(slide.dimensions)
    print(slide.level_dimensions)
    print(slide.level_downsamples)
    print(slide.associated_images)
4
(148683, 41693)
((148683, 41693), (37170, 10423), (9292, 2605), (2323, 651))
(1.0, 4.000088325958833, 16.003087108552297, 64.02464105356638)
<_AssociatedImageMap {'label': <PIL.Image.Image image mode=RGBA size=401x321 at 0x13CCAB14898>, 'macro': <PIL.Image.Image image mode=RGBA size=1280x431 at 0x13CCAB47080>, 'thumbnail': <PIL.Image.Image image mode=RGBA size=1024x287 at 0x13CCAB019B0>}>
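
The relationship between level_dimensions and level_downsamples can be cross-checked by hand. The sketch below is plain Python (no OpenSlide needed) and uses the numbers printed above. `estimate_downsamples` is a hypothetical helper, and averaging the width and height ratios is an assumption that happens to reproduce the printed factors for this slide; it is not a claim about every slide format.

```python
def estimate_downsamples(level_dimensions):
    """Estimate each level's downsample factor relative to level 0.

    Assumption: the factor is the mean of the width ratio and the height ratio.
    """
    base_w, base_h = level_dimensions[0]
    return tuple((base_w / w + base_h / h) / 2 for w, h in level_dimensions)


# Values copied from the example output above.
level_dimensions = ((148683, 41693), (37170, 10423), (9292, 2605), (2323, 651))
for level, factor in enumerate(estimate_downsamples(level_dimensions)):
    print(level, factor)
```

The factors are not exact powers of four because each level's width and height are rounded to whole pixels.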

with openslide.OpenSlide('./example.svs') as slide:

    proper = slide.properties
    print(proper['aperio.AppMag'])
    print(proper['openslide.level[0].tile-height'])
40
256


with openslide.OpenSlide('./example.svs') as slide:
    fig, axs = plt.subplots(nrows=1, ncols=3, figsize=(20, 6))
    for i, (k, v) in enumerate(slide.associated_images.items()):
        axs[i].imshow(v)
        axs[i].set_title(k)
plt.show()


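Methods

  • read_region(location, level, size): reads the specified region
    • location is the coordinate, in level 0, of the top-left corner of the region; level is the level to read from; size is (width, height). The return value is a PIL.Image.
    • Note: whatever the level, location is always expressed in level 0 coordinates, whereas size is measured in pixels of the chosen level.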

with openslide.OpenSlide('./example.svs') as slide:
    region = slide.read_region((47712, 24343), 1, (256, 256))
    print(type(region))
    plt.imshow(region)
plt.show()
<class 'PIL.Image.Image'>
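
Because location and size live in different coordinate systems, a small conversion step is often needed before calling read_region. The following is a minimal sketch in plain Python; `to_level0` and `level0_extent` are hypothetical helper names, and truncating to int is one possible rounding choice rather than anything OpenSlide prescribes.

```python
def to_level0(point, downsample):
    """Convert an (x, y) pixel position at some level to level 0 coordinates."""
    x, y = point
    return int(x * downsample), int(y * downsample)


def level0_extent(size, downsample):
    """Approximate level 0 footprint of a (width, height) region read at some level."""
    width, height = size
    return int(width * downsample), int(height * downsample)


# Level 1 of the example slide has a downsample factor of about 4.
downsample = 4.000088325958833
location = to_level0((11928, 6085), downsample)    # position to pass as `location`
footprint = level0_extent((256, 256), downsample)  # level 0 area covered by a 256 x 256 read
print(location, footprint)
```

With a real slide, the result would be used as slide.read_region(location, 1, (256, 256)): the location is in level 0 pixels while the size stays in level 1 pixels.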

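  • get_best_level_for_downsample(downsample): returns the best level for a given downsample factor
    • Different requested factors map to different levels: the larger the requested downsample, the lower-resolution the level that can serve as the starting point. In the output below, each result is the deepest level whose own downsample factor does not exceed the requested one.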


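with openslide.OpenSlide('./example.svs') as slide:
    for i in range(10, 100, 10):
        print("For a downsample of %d, the best level is %d" % (i, slide.get_best_level_for_downsample(i)))
For a downsample of 10, the best level is 1
For a downsample of 20, the best level is 2
For a downsample of 30, the best level is 2
For a downsample of 40, the best level is 2
For a downsample of 50, the best level is 2
For a downsample of 60, the best level is 2
For a downsample of 70, the best level is 3
For a downsample of 80, the best level is 3
For a downsample of 90, the best level is 3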
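
The pattern in this output can be reproduced without OpenSlide. The sketch below is a hypothetical re-implementation (`best_level_for_downsample` is not the library function): it assumes the rule is "pick the deepest level whose factor is not larger than the request", which is consistent with the printed results but is not taken from the library source.

```python
def best_level_for_downsample(level_downsamples, downsample):
    """Return the deepest level whose downsample factor is <= the requested one."""
    best = 0  # fall back to level 0 when the request is smaller than every factor
    for level, factor in enumerate(level_downsamples):
        if factor <= downsample:
            best = level
    return best


# Factors copied from the level_downsamples output of the example slide.
level_downsamples = (1.0, 4.000088325958833, 16.003087108552297, 64.02464105356638)
for requested in range(10, 100, 10):
    print(requested, best_level_for_downsample(level_downsamples, requested))
```

Choosing a level that is never coarser than the request means the image only ever has to be scaled down further, not up.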

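  • get_thumbnail(size): returns a thumbnail
    • size is the maximum (width, height) of the thumbnail. The aspect ratio is preserved, so the thumbnail is scaled to fit inside that box: the more constraining side reaches its specified value and the other side is scaled proportionally.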
with openslide.OpenSlide('./example.svs') as slide:
    thumbnail = slide.get_thumbnail((256, 256))
    plt.imshow(thumbnail)
plt.show()
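
To see what size a thumbnail request will roughly produce, the fit-inside-a-box arithmetic can be written out directly. This is a sketch in plain Python; `fit_within` is a hypothetical helper, and Pillow's own rounding may differ from it by a pixel.

```python
def fit_within(dimensions, box):
    """Scale (width, height) to fit inside box while preserving the aspect ratio."""
    width, height = dimensions
    max_w, max_h = box
    scale = min(max_w / width, max_h / height)  # the more constraining side wins
    return max(1, round(width * scale)), max(1, round(height * scale))


# The example slide is 148683 x 41693 pixels at level 0.
print(fit_within((148683, 41693), (256, 256)))
```

For this very wide slide the width hits the 256-pixel limit and the height ends up far below it.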


Class methods

  • detect_format(filepath): returns the vendor of the file's format

openslide.OpenSlide.detect_format('./example.svs')
'aperio'

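Interestingly, OpenSlide does not provide a method that returns the full pixel matrix of a given level of a whole-slide image, because such matrices are too large to fit comfortably in memory. It is possible to obtain one indirectly through read_region, but that is rarely something you would want to do.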




class openslide.OpenSlide(filename)

An open whole-slide image.

If any operation on the object fails, OpenSlideError is raised. OpenSlide has latching error semantics: once OpenSlideError is raised, all future operations on the OpenSlide, other than close(), will also raise OpenSlideError.

close() is called automatically when the object is deleted. The object may be used as a context manager, in which case it will be closed upon exiting the context.

Parameters:

filename (str) – the file to open

Raises:
  • OpenSlideUnsupportedFormatError – if the file is not recognized by OpenSlide
  • OpenSlideError – if the file is recognized but an error occurred
classmethod detect_format(filename)

Return a string describing the format vendor of the specified file. This string is also accessible via the PROPERTY_NAME_VENDOR property.

If the file is not recognized, return None.

Parameters: filename (str) – the file to examine

level_count

The number of levels in the slide. Levels are numbered from 0 (highest resolution) to level_count - 1 (lowest resolution).

dimensions

(width, height) tuple for level 0 of the slide.

level_dimensions

A list of (width, height) tuples, one for each level of the slide. level_dimensions[k] are the dimensions of level k.

level_downsamples

A list of downsample factors for each level of the slide. level_downsamples[k] is the downsample factor of level k.

properties

Metadata about the slide, in the form of a Mapping from OpenSlide property name to property value. Property values are always strings. OpenSlide provides some standard properties, plus additional properties that vary by slide format.

associated_images

Images, such as label or macro images, which are associated with this slide. This is a Mapping from image name to RGBA Image.

Unlike in the C interface, these images are not premultiplied.

read_region(location, level, size)

Return an RGBA Image containing the contents of the specified region.

Unlike in the C interface, the image data is not premultiplied.

Parameters:
  • location (tuple) – (x, y) tuple giving the top left pixel in the level 0 reference frame
  • level (int) – the level number
  • size (tuple) – (width, height) tuple giving the region size
get_best_level_for_downsample(downsample)

Return the best level for displaying the given downsample.

Parameters: downsample (float) – the desired downsample factor

get_thumbnail(size)

Return an Image containing an RGB thumbnail of the slide.

Parameters: size (tuple) – the maximum size of the thumbnail as a (width, height) tuple

close()

Close the OpenSlide object.


Standard properties

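Here are some constants whose values are the keys of specific properties in OpenSlide.properties. These are the standard properties defined by OpenSlide, so the constants can be used directly to look up the corresponding values of a whole-slide image, although those described as "if any", "if known", or "if available" below may be absent for a given slide.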

with openslide.OpenSlide('./example.svs') as slide:
    propers = [
        openslide.PROPERTY_NAME_COMMENT, openslide.PROPERTY_NAME_VENDOR, openslide.PROPERTY_NAME_QUICKHASH1, 
    ]
    for s in propers:
        print("%s: %s" % (s, slide.properties[s]))
        print('')
openslide.comment: Aperio Image Library v10.2.41
155000x41793 [0,100 148683x41693] (256x256) J2K/YUV16 Q=70|AppMag = 40|StripeWidth = 1000|ScanScope ID = SS1302|Filename = 26947|Date = 08/05/11|Time = 10:08:59|Time Zone = GMT-07:00|User = 38208fea-3f7a-4ce7-8c4e-e069c0e0d8f4|Parmset = EPC|MPP = 0.2520|Left = 15.669220|Top = 12.329573|LineCameraSkew = -0.000389|LineAreaXOffset = 0.000000|LineAreaYOffset = 0.000000|Focus Offset = 0.000000|DSR ID = aperio01|ImageID = 26947|Exposure Time = 109|Exposure Scale = 0.000001|DisplayColor = 0|OriginalWidth = 155000|OriginalHeight = 41793|ICC Profile = ScanScope v1

openslide.vendor: aperio

openslide.quickhash-1: 8502e939a67f8829dcd8a475e455f769bdecf7220e6024b26ecbb1cdcb6729ac


The openslide module provides attributes containing the names of some commonly-used OpenSlide properties.

openslide.PROPERTY_NAME_COMMENT

The name of the property containing a slide’s comment, if any.

openslide.PROPERTY_NAME_VENDOR

The name of the property containing an identification of the vendor.

openslide.PROPERTY_NAME_QUICKHASH1

The name of the property containing the “quickhash-1” sum.

openslide.PROPERTY_NAME_BACKGROUND_COLOR

The name of the property containing a slide’s background color, if any. It is represented as an RGB hex triplet.

openslide.PROPERTY_NAME_OBJECTIVE_POWER

The name of the property containing a slide’s objective power, if known.

openslide.PROPERTY_NAME_MPP_X

The name of the property containing the number of microns per pixel in the X dimension of level 0, if known.

openslide.PROPERTY_NAME_MPP_Y

The name of the property containing the number of microns per pixel in the Y dimension of level 0, if known.

openslide.PROPERTY_NAME_BOUNDS_X

The name of the property containing the X coordinate of the rectangle bounding the non-empty region of the slide, if available.

openslide.PROPERTY_NAME_BOUNDS_Y

The name of the property containing the Y coordinate of the rectangle bounding the non-empty region of the slide, if available.

openslide.PROPERTY_NAME_BOUNDS_WIDTH

The name of the property containing the width of the rectangle bounding the non-empty region of the slide, if available.

openslide.PROPERTY_NAME_BOUNDS_HEIGHT

The name of the property containing the height of the rectangle bounding the non-empty region of the slide, if available.
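
Since property values are always strings, numeric properties such as the microns-per-pixel values need converting before use. The sketch below computes the physical size of the example slide from its MPP of 0.2520. It uses a plain dict as a stand-in for slide.properties, and the key strings 'openslide.mpp-x' and 'openslide.mpp-y' are what PROPERTY_NAME_MPP_X and PROPERTY_NAME_MPP_Y are assumed to resolve to; with a real slide, use the constants. `physical_size_mm` is a hypothetical helper.

```python
def physical_size_mm(dimensions, properties):
    """Return the (width, height) of level 0 in millimetres, or None if MPP is unknown."""
    mpp_x = properties.get('openslide.mpp-x')  # assumed value of PROPERTY_NAME_MPP_X
    mpp_y = properties.get('openslide.mpp-y')  # assumed value of PROPERTY_NAME_MPP_Y
    if mpp_x is None or mpp_y is None:
        return None
    width, height = dimensions
    # Property values are strings; 1000 microns = 1 mm.
    return width * float(mpp_x) / 1000.0, height * float(mpp_y) / 1000.0


# Stand-in for slide.properties, using the MPP reported in the example comment.
properties = {'openslide.mpp-x': '0.2520', 'openslide.mpp-y': '0.2520'}
print(physical_size_mm((148683, 41693), properties))
```

The microns per pixel at a lower-resolution level k can be estimated the same way by multiplying the level 0 value by level_downsamples[k].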


Exceptions


exception openslide.OpenSlideError

An error produced by the OpenSlide library.

Once OpenSlideError has been raised by a particular OpenSlide, all future operations on that OpenSlide (other than close()) will also raise OpenSlideError.

exception openslide.OpenSlideUnsupportedFormatError

OpenSlide does not support the requested file. Subclass of OpenSlideError.


Wrapping a PIL Image


ImageSlide accepts a filename or an Image object and provides an OpenSlide-like API for ordinary images.

img = openslide.ImageSlide('./example2.jpg')
print(type(img))
print(dir(img))
<class 'openslide.ImageSlide'>
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_close', '_file_arg', '_image', 'associated_images', 'close', 'detect_format', 'dimensions', 'get_best_level_for_downsample', 'get_thumbnail', 'level_count', 'level_dimensions', 'level_downsamples', 'properties', 'read_region']
img.level_count
1

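  • openslide.open_slide(filename)
    • This is a function that takes a filename, which may be either an ordinary image or a whole-slide image. It returns an ImageSlide object for an ordinary image and an OpenSlide object for a whole-slide image.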


class openslide.ImageSlide(file)

A wrapper around an Image object that provides an OpenSlide-compatible API.

Parameters: file – a filename or Image object
Raises IOError: if the file cannot be opened

openslide.open_slide(filename)

Return an OpenSlide for whole-slide images and an ImageSlide for other types of images.

Parameters:

filename (str) – the file to open

Raises:
  • OpenSlideError – if the file is recognized by OpenSlide but an error occurred
  • IOError – if the file is not recognized at all


Deep Zoom support

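Deep Zoom is a technique for quickly zooming into and viewing parts of a high-resolution image, implemented with images at multiple resolution levels. The API provided here is useful for displaying whole-slide images in a web browser.

The DeepZoomGenerator class wraps an OpenSlide or ImageSlide object. tile_size is the width and height of a tile, and overlap is the number of extra pixels added to each interior edge of a tile; for best viewer performance, tile_size + 2 * overlap should be a power of two.

If limit_bounds is True, only the non-empty region of the slide is rendered.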


Attributes

  • level_count: the number of Deep Zoom levels;
  • tile_count: the total number of tiles generated;
  • level_tiles: the tile layout (columns, rows) of each level;
  • level_dimensions: the pixel dimensions of each level;

with openslide.OpenSlide('./example.svs') as slide:
    dzg = deepzoom.DeepZoomGenerator(slide)
    print(dzg.level_count)
    print(dzg.tile_count)
    print(dzg.level_tiles)
    print(dzg.level_dimensions)
19
129312
((1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (2, 1), (3, 1), (5, 2), (10, 3), (19, 6), (37, 11), (74, 21), (147, 42), (293, 83), (586, 165))
((1, 1), (2, 1), (3, 1), (5, 2), (10, 3), (19, 6), (37, 11), (73, 21), (146, 41), (291, 82), (581, 163), (1162, 326), (2324, 652), (4647, 1303), (9293, 2606), (18586, 5212), (37171, 10424), (74342, 20847), (148683, 41693))
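
The numbers above follow from simple pyramid arithmetic, which can be checked without OpenSlide. The sketch below is a hypothetical re-derivation (`deepzoom_pyramid` is not a library function): it assumes each Deep Zoom level halves the one below it, rounding up, until a 1 x 1 level is reached, and that tiles per side are the level size divided by tile_size, rounded up.

```python
import math


def deepzoom_pyramid(dimensions, tile_size=254):
    """Return (level_dimensions, level_tiles, tile_count) for a Deep Zoom pyramid."""
    size = tuple(dimensions)
    level_dimensions = [size]
    # Halve (rounding up) until the smallest level is a single pixel.
    while size[0] > 1 or size[1] > 1:
        size = tuple(max(1, math.ceil(side / 2)) for side in size)
        level_dimensions.append(size)
    level_dimensions.reverse()  # level 0 is the smallest, the last one is full resolution
    level_tiles = [
        tuple(math.ceil(side / tile_size) for side in dims) for dims in level_dimensions
    ]
    tile_count = sum(cols * rows for cols, rows in level_tiles)
    return level_dimensions, level_tiles, tile_count


dims, tiles, total = deepzoom_pyramid((148683, 41693))
print(len(dims), total)
print(tiles[-1], dims[-2])
```

Note that Deep Zoom numbers its levels in the opposite direction to OpenSlide: here the highest level index is the full-resolution image.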

Methods

  • get_dzi(format):
    • returns the XML metadata of the Deep Zoom .dzi file
    • format is the delivery format of each tile (png or jpeg)

with openslide.OpenSlide('./example.svs') as slide:
    dzg = deepzoom.DeepZoomGenerator(slide)
    print(dzg.get_dzi('png'))
    print('')
    print(dzg.get_dzi('jpeg'))
<Image Format="png" Overlap="1" TileSize="254" xmlns="http://schemas.microsoft.com/deepzoom/2008"><Size Height="41693" Width="148683" /></Image>

<Image Format="jpeg" Overlap="1" TileSize="254" xmlns="http://schemas.microsoft.com/deepzoom/2008"><Size Height="41693" Width="148683" /></Image>


  • get_tile(level, address)
    • returns one tile
    • the return value is a PIL.Image; level is the Deep Zoom level and address is (column, row), i.e. the tile in the given column and row;

with openslide.OpenSlide('./example.svs') as slide:
    dzg = deepzoom.DeepZoomGenerator(slide)
    tile = dzg.get_tile(10, (0, 0))
    plt.imshow(tile)
plt.show()


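  • get_tile_coordinates(level, address)
    • returns the location of a tile
    • it takes the same arguments as get_tile and returns the arguments for read_region. Passing them to read_region yields the same content as get_tile, but because Deep Zoom rescales the tile, the resolution may differ.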

with openslide.OpenSlide('./example.svs') as slide:
    dzg = deepzoom.DeepZoomGenerator(slide)
    tile_coor = dzg.get_tile_coordinates(10, (0, 0))
    print(tile_coor)
    tile = slide.read_region(*tile_coor)
    plt.imshow(tile)
plt.show()
((0, 0), 3, (1020, 651))
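
The slide level 3 in this result can be explained with a little arithmetic. The sketch below is a hypothetical reconstruction, not library code: it assumes each Deep Zoom level below the top halves the resolution, and that the slide level is then chosen as the deepest one whose downsample factor does not exceed the resulting factor. `slide_level_for_deepzoom_level` is an illustrative name.

```python
def slide_level_for_deepzoom_level(dz_level, dz_level_count, level_downsamples):
    """Map a Deep Zoom level to (dz_downsample, slide_level, residual_scale)."""
    # The top Deep Zoom level is full resolution; each level below halves it.
    dz_downsample = 2 ** (dz_level_count - 1 - dz_level)
    slide_level = 0
    for level, factor in enumerate(level_downsamples):
        if factor <= dz_downsample:
            slide_level = level
    # Remaining scale between the chosen slide level and the Deep Zoom level.
    residual = dz_downsample / level_downsamples[slide_level]
    return dz_downsample, slide_level, residual


level_downsamples = (1.0, 4.000088325958833, 16.003087108552297, 64.02464105356638)
print(slide_level_for_deepzoom_level(10, 19, level_downsamples))
```

Under these assumptions Deep Zoom level 10 of 19 corresponds to a downsample of 256, slide level 3, and a residual scale of roughly 4, which matches a 255-pixel-wide tile being read as a 1020-pixel-wide region.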

  • get_tile_dimensions(level, address)
    • returns the dimensions of the specified tile
with openslide.OpenSlide('./example.svs') as slide:
    dzg = deepzoom.DeepZoomGenerator(slide)
    tile_dim = dzg.get_tile_dimensions(10, (0, 0))
    print(tile_dim)
(255, 163)
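
The 255 in this output is tile_size plus one pixel of overlap on the single interior edge of a corner tile. The sketch below re-derives tile sizes in plain Python under that assumption (overlap is added only on edges shared with a neighbouring tile); `tile_dimensions` is a hypothetical helper, not the library method.

```python
def tile_dimensions(level_dimensions, level_tiles, address, tile_size=254, overlap=1):
    """Return (width, height) of one Deep Zoom tile, including its overlap pixels."""
    dims = []
    for side, count, index in zip(level_dimensions, level_tiles, address):
        before = overlap if index != 0 else 0         # no overlap on the outer top/left edge
        after = overlap if index != count - 1 else 0  # no overlap on the outer bottom/right edge
        core = min(tile_size, side - tile_size * index)  # last tile may be clipped
        dims.append(before + core + after)
    return tuple(dims)


# Deep Zoom level 10 of the example slide is 581 x 163 pixels, split into 3 x 1 tiles.
for column in range(3):
    print(column, tile_dimensions((581, 163), (3, 1), (column, 0)))
```

An interior tile comes out 256 pixels wide (1 + 254 + 1), which is why tile_size + 2 * overlap is the quantity that should be a power of two.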




OpenSlide Python provides functionality for generating individual Deep Zoom tiles from slide objects. This is useful for displaying whole-slide images in a web browser without converting the entire slide to Deep Zoom or a similar format.

class openslide.deepzoom.DeepZoomGenerator(osr, tile_size=254, overlap=1, limit_bounds=False)

A Deep Zoom generator that wraps an OpenSlide or ImageSlide object.

Parameters:
  • osr – the slide object
  • tile_size (int) – the width and height of a single tile. For best viewer performance, tile_size + 2 * overlap should be a power of two.
  • overlap (int) – the number of extra pixels to add to each interior edge of a tile
  • limit_bounds (bool) – True to render only the non-empty slide region
level_count

The number of Deep Zoom levels in the image.

tile_count

The total number of Deep Zoom tiles in the image.

level_tiles

A list of (tiles_x, tiles_y) tuples for each Deep Zoom level. level_tiles[k] are the tile counts of level k.

level_dimensions

A list of (pixels_x, pixels_y) tuples for each Deep Zoom level. level_dimensions[k] are the dimensions of level k.

get_dzi(format)

Return a string containing the XML metadata for the Deep Zoom .dzi file.

Parameters: format (str) – the delivery format of the individual tiles (png or jpeg)

get_tile(level, address)

Return an RGB Image for a tile.

Parameters:
  • level (int) – the Deep Zoom level
  • address (tuple) – the address of the tile within the level as a (column, row) tuple
get_tile_coordinates(level, address)

Return the OpenSlide.read_region() arguments corresponding to the specified tile.

Most applications should use get_tile() instead.

Parameters:
  • level (int) – the Deep Zoom level
  • address (tuple) – the address of the tile within the level as a (column, row) tuple
get_tile_dimensions(level, address)

Return a (pixels_x, pixels_y) tuple for the specified tile.

Parameters:
  • level (int) – the Deep Zoom level
  • address (tuple) – the address of the tile within the level as a (column, row) tuple


Example programs


Several Deep Zoom examples are included with OpenSlide Python:

deepzoom_server.py
A basic server for a single slide. It serves a web page with a zoomable slide viewer, a list of slide properties, and the ability to view associated images.
deepzoom_multiserver.py
A basic server for a directory tree of slides. It serves an index page which links to zoomable slide viewers for all slides in the tree.
deepzoom_tile.py

A program to generate and store a complete Deep Zoom directory tree for a slide. It can optionally store an HTML page with a zoomable slide viewer, a list of slide properties, and the ability to view associated images.

This program is intended as an example. If you need to generate Deep Zoom trees for production applications, consider using VIPS instead.


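Usage demo


First install OpenSlide. On Ubuntu you can simply run sudo apt install python3-openslide (python-openslide for Python 2). Then:

Import the library

import openslide

Then import DeepZoomGenerator, which wraps the slide as a multi-level pyramid

from openslide.deepzoom import DeepZoomGenerator

Open the file you want to read by giving its filename

slide = openslide.open_slide('02.svs')

Instantiate the class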

highth = 2000
data_gen = DeepZoomGenerator(slide, tile_size=highth, overlap=0, limit_bounds=False)

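  • tile_size can be set to whatever tile size you want to cut.
  • limit_bounds controls whether only the non-empty region of the slide is rendered (True) or the whole slide (False). Tiles along the right and bottom edges may be smaller than tile_size in either case.
  • overlap is the number of extra pixels added to each interior edge of a tile, bringing in context from the neighbouring tiles.

Print the total number of tiles and the number of pyramid levels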

print(data_gen.tile_count)
print(data_gen.level_count)
1036
17

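  • num_w: the number of tiles across the width (columns)
  • num_h: the number of tiles down the height (rows)
  • data_gen.get_tile(level, (col, row)):
    • level: ranges from 0 to data_gen.level_count - 1; the largest value is the bottom of the pyramid (full resolution) and the smallest is 0.
    • col: in the range [0, num_w).
    • row: in the range [0, num_h).

There are other tiling APIs as well; see the official documentation.

You can also write the tiling loop yourself: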

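import numpy as np
from os.path import join
from skimage import io  # assumption: the original `io` is skimage.io

result_path = './tiles'  # output directory (placeholder; it must already exist)
level = data_gen.level_count - 1  # bottom of the pyramid, i.e. full resolution (16 in this example)
num_w, num_h = data_gen.level_tiles[level]  # number of tile columns and rows at this level

for i in range(num_w):  # column index
    for j in range(num_h):  # row index
        img = np.array(data_gen.get_tile(level, (i, j)))  # cut one tile
        io.imsave(join(result_path, "02"+str(i)+'_'+str(j)+".png"), img)  # save the tile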
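
If you prefer not to use DeepZoomGenerator at all, the tile grid can be computed by hand and fed to read_region. The following is a minimal sketch in plain Python under the assumption of non-overlapping tiles at level 0; `iter_tile_regions` is a hypothetical helper name.

```python
def iter_tile_regions(dimensions, tile_size):
    """Yield ((x, y), (width, height)) for every tile of a level 0 grid, row by row."""
    width, height = dimensions
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            # Tiles on the right and bottom edges are clipped to the slide bounds.
            yield (x, y), (min(tile_size, width - x), min(tile_size, height - y))


# A made-up 5000 x 3000 slide cut into 2000-pixel tiles.
regions = list(iter_tile_regions((5000, 3000), 2000))
print(len(regions), regions[-1])
```

With a real slide, each pair would be passed on as slide.read_region(location, 0, size), and slide.dimensions would supply the first argument.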


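Simple usage

These are some of the basic functions of OpenSlide. Once it is installed, you can run a few simple tests such as the following:

import openslide
import matplotlib.pyplot as plt
import numpy as np

# Open the slide
slide = openslide.OpenSlide('xxx.tiff')
# Downsample factor of each level k; each factor corresponds to a magnification
downsamples = slide.level_downsamples

# Width and height at the highest magnification (level 0)
[w, h] = slide.level_dimensions[0]
# Total width at level k (here k = 2)
size1 = int(w*(downsamples[0]/downsamples[2]))
# Total height at level k (here k = 2)
size2 = int(h*(downsamples[0]/downsamples[2]))

# Read the region first, then convert it to an array
region = np.array(slide.read_region((0, 0), 2, (size1, size2)))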
# [m,n] = slide.dimensions
print(w, h)
# print(downsamples[0])
print(size1, size2)
plt.figure()
plt.imshow(region)
plt.show()


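Extracting and saving a region


After testing, I found that the saved regions have region.shape (, , 4), i.e. they carry an alpha channel, which is inconvenient for later processing with OpenCV. (In practice it makes no difference: by default cv2.imread loads an image as three-channel BGR and discards the alpha channel, so the following step is not strictly necessary.) I nevertheless chose to convert the four channels to three; the demo code is as follows: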

# -*- coding: utf-8 -*-
import time

import cv2
import numpy
import openslide
from PIL import Image

start = time.time()
source = openslide.open_slide("2018-03-20 18_15_35.kfb.tiff")  # load the whole-slide image
downsamples = source.level_downsamples  # downsample factor of each level k; each factor corresponds to a magnification
[w, h] = source.level_dimensions[0]  # width and height of level 0, the highest resolution
print(w, h)

# To compute the width and height at level 2:
#size1 = int(w*(downsamples[0]/downsamples[2]))
#size2 = int(h*(downsamples[0]/downsamples[2]))

rgba = source.read_region((18000, 18000), 0, (2048, 2048))  # returns an RGBA image
region = numpy.array(rgba)  # convert to an array

# Convert 4 channels to 3. The region will be opened and annotated in Photoshop, so it must stay in RGB order
r, g, b, alpha = cv2.split(region)
merged = cv2.merge([r, g, b])
print(merged)

# scipy.misc.imsave has been removed from recent SciPy releases, so save with Pillow instead
Image.fromarray(merged).save("test2.tiff")
end = time.time()
print(end - start)

# Show the extracted region
#plt.figure()
#plt.imshow(region)
#plt.show()


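Marking the ROI


Converting the annotated ROI image into a mask


The previous step produced an RGB image to annotate. To use the annotated image as a mask for operating on the original, two steps are needed:

  • 1. Detect the regions marked in the annotated image.
  • 2. Fill the detected regions with black or white to produce the mask.


Now build the mask from the RGB image annotated in Photoshop: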

import cv2

img = cv2.imread('test2kaobei.tif')
# Image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Grayscale to binary
ret, binary = cv2.threshold(gray, 254, 255, cv2.THRESH_BINARY)

# Contour detection. OpenCV 3.x returns three values and OpenCV 4.x returns two,
# so take the last two to work with either version.
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]  # cv2.RETR_TREE etc. change how regions are filled
# Draw the contour. If there are many regions, this call needs to go inside a loop over the contours.
aa = binary.copy()
cv2.drawContours(aa, contours, 1, (0, 0, 255), -1)  # (0, 0, 255) is the draw color; -1 fills the contour
print(aa)
cv2.imshow("img", aa)
#cv2.imwrite("test_gray.tif", aa)

cv2.waitKey(0)


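Obtaining a binary image of the ROI


Note that cv2.findContours() expects a binary image, i.e. pure black and white rather than grayscale, so the image must first be converted to grayscale and then to binary.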
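
To make the grayscale-to-binary step concrete, here is a tiny pure-Python sketch of what a binary threshold does to each pixel. `threshold_binary` is a hypothetical stand-in that mirrors the rule of cv2.THRESH_BINARY (values strictly above the threshold become maxval, everything else becomes 0); it operates on a nested list rather than a real image.

```python
def threshold_binary(gray, thresh=254, maxval=255):
    """Map every gray value above thresh to maxval and all others to 0."""
    return [[maxval if value > thresh else 0 for value in row] for row in gray]


# A 2 x 3 grayscale "image" as nested lists.
gray = [[12, 200, 255],
        [254, 255, 90]]
print(threshold_binary(gray))
```

With a threshold of 254, only pure white pixels (255) survive, so the choice of annotation color determines what ends up in the mask.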


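Using openslide-python to read and display whole slide images (WSI), build pyramids, and generate tiles


How to read H&E-stained pathology slides

Their defining feature is size: each slide is 600 MB to 10 GB, and ordinary software cannot open them.

Working in Python, there are three ways to open a slide: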

import openslide
import matplotlib.pyplot as plt

# image file
img_path = 'path/to/img/1.tif'

# method 1
slide1 = openslide.OpenSlide(img_path)
# method 2
slide2 = openslide.open_slide(img_path)
# method 3 (loads the whole image through PIL; ran out of memory on this slide)
slide3 = openslide.ImageSlide(img_path)

# size of the image
slide = slide1
print(slide.level_dimensions[0])

Output:

(68046, 80933)

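This image is 68046 x 80933 pixels. Opening it with OpenSlide or open_slide works fine, but ImageSlide runs out of memory.

Once the slide is open, you can inspect the image information OpenSlide is able to parse and perform operations such as tiling.

Below are some operations you may need (Python 3.6):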

from openslide.deepzoom import DeepZoomGenerator

# Vendor of the slide format
print(slide.detect_format(img_path))

# Properties of the slide
print(slide.properties)

# Downsample factors
downsamples = slide.level_downsamples

# Image size (width, height)
[w, h] = slide.level_dimensions[0]
print(w,h)

# Get a thumbnail of the original image (fits within 206x400, aspect ratio preserved)
simg = slide.get_thumbnail((206,400))
# Show the thumbnail
plt.imshow(simg)
plt.show()

# Use DeepZoomGenerator
data_gen = DeepZoomGenerator(slide2, tile_size=1023, overlap=1, limit_bounds=False)

# The number of Deep Zoom levels in the image
print(data_gen.level_count)

# The total number of Deep Zoom tiles in the image
print(data_gen.tile_count)

# A list of (tiles_x, tiles_y) tuples for each Deep Zoom level. level_tiles[k] are the tile counts of level k
print(data_gen.level_tiles)

# A list of (pixels_x, pixels_y) tuples for each Deep Zoom level. level_dimensions[k] are the dimensions of level k
print(data_gen.level_dimensions)

# Return a string containing the XML metadata for the Deep Zoom .dzi file
# Parameters:format (str)  the delivery format of the individual tiles (png or jpeg)
print(data_gen.get_dzi('png'))

Displaying tiles

# Return an RGB Image for a tile.
# level (int): the Deep Zoom level
# address (tuple): the address of the tile within the level as a (column, row) tuple

tile_img1 = data_gen.get_tile(11,(0,0))
tile_img2 = data_gen.get_tile(11,(0,1))
plt.imshow(tile_img1)
plt.show()
plt.imshow(tile_img2)
plt.show()

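Note: the tile_size here was chosen so that tile_size + overlap = 2^n: 1023 + 1 = 1024 (2^10). That is the size of a corner tile such as (0, 0), which has overlap on only one side in each dimension. Interior tiles get overlap on both sides, which is why the official documentation recommends that tile_size + 2 * overlap be a power of two (for example, tile_size=1022 with overlap=1).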

# Return the OpenSlide.read_region() arguments corresponding to the specified tile.
# Most applications should use get_tile() instead.
# level (int)  the Deep Zoom level
# address (tuple)  the address of the tile within the level as a (column, row) tuple
read_region = data_gen.get_tile_coordinates(11, (0,0))
print(read_region)

# Return a (pixels_x, pixels_y) tuple for the specified tile.
print(data_gen.get_tile_dimensions(12, (0,0)))

Output:

((0, 0), 2, (4092, 4092))
(1024, 1024)



References

https://www.dazhuanlan.com/2020/04/30/5eaa3836c7c98/

https://github.com/openslide/openslide